Blog Archive

Check out all blog posts in my blog archive. Click on a headline to read the teaser.

Build workflows collaboratively using reusable and shareable packages
Build workflows together without duplicating work. Read More ›

Helm for Kubernetes, Lessons Learned
Team experience with Helm and the lessons we learned. Read More ›

Short overview of CSS-in-JS using Emotion Read More ›

Rust, React, and Raspberry Pi
A Parental Leave Project. Read More ›

Gateway, origin
The history of ARGO Gateway Read More ›

Don't use AssertJ
When practicality trumps ergonomics. Read More ›

The Right Tool For The Job
Choosing the right tool for the job is always important. The JetBrains toolset is a powerful tool that should be part of any programmer's toolbox. Read More ›

Tell Me About Yourself, GraphQL
Using GraphQL introspective queries to explore your schema Read More ›

Accelerate data visualization with Stardust, a WebGL Platform
Visualizing massive datasets can work well in your browser. Read More ›

Introducing ZenCrepes - Agile analytics across GitHub orgs & repos
Have you ever tried managing a scrum team across GitHub organizations and repositories? ZenCrepes has been created to facilitate project management for teams operating solely over GitHub issues, and across multiple organizations & repositories. Read More ›

Migrating 8 PB of data from Filestore to Bluestore
How we migrated an 8 PB cluster from Filestore to Bluestore Read More ›

Agile in research - Another take on metrics and estimates
Can data help tell the story of a team? And how, by understanding its story, can a team become better at foreseeing its future? Read More ›

Transitioning away from Jira - Lessons learnt
Over the past year, we’ve been moving most of our projects away from Jira, in favor of GitHub issues. This blog post will go over the reasons why and lessons we learnt along the way. Read More ›

Fizz buzz in TensorFlow?
I tried teaching a computer some math and learned a couple things along the way Read More ›

Testcontainers FTW (a really big win)
Never configure another test database again, and no, we're not talking about H2. Read More ›

Jupyter in ICGC: A Data Science Sandbox for Genomics
The ICGC Data Portal gets a new analysis tool! Read More ›

Programmable Pipeline in Kids-First ETL
Refactored Kids-First ETL to make the Pipeline programmable Read More ›

Let's Encrypt the Collaboratory
SSL all your sites with Let's Encrypt and Ansible Read More ›

Drops of Jupyter...
or, How I Learned To Stop Worrying and Dockerized JupyterHub Read More ›

Using JWTs with Spring Security's @PreAuthorize annotation for method specific security
We explore how to implement a Spring Security strategy that statelessly authorizes a user via JWT, allowing for method-level permissions using the @PreAuthorize annotation. Read More ›

Scripted Kubernetes installation with dashboard and monitoring.
A quick how-to on deploying Kubernetes on top of Collaboratory VMs with Kubernetes dashboard and Grafana monitoring enabled. Read More ›

Introducing the Kids-First ETL
An introduction to applying Scala's functional programming features to the Kids-First ETL. Read More ›

Integrating Clinical Annotation Features into the ICGC Data Portal
In February 2018, we integrated new features focusing on clinical annotation of variants into the ICGC data portal. Here, we will explore some of those features and how they can be used by researchers! Read More ›

Introducing Overture
Today we are introducing Overture, our collection of open, composable and extendable components for data science in the cloud. Read More ›

Day 1 as a Collaboratory User
A tutorial - from signup to downloads of Cancer data at the Cancer Genome Collaboratory! Read More ›

Proof of Concept: Implement Directed Acyclic Graph DB with Version Control using Elasticsearch
A step-by-step illustration on how Elasticsearch could be used to implement a version controlled directed acyclic graph database. Read More ›

How to run Kubernetes in Collaboratory.
A quick how-to on deploying Kubernetes on top of Collaboratory VMs. Read More ›

Infrastructure procurement in large-scale genomic projects.
More than 3 years into the project, it's a good time to reflect on our procurement strategy for the Cancer Genome Collaboratory, with a focus on the Storage infrastructure. Read More ›

A performance-tuning practice: from 20 hours to 8 minutes
An introduction to tuning the DCC-Release ID job and greatly reducing its execution time. Read More ›

Why Siri Can't Code (Yet)
Read this article to learn why Siri can't code (yet), but how she might soon learn! Read More ›

A Detailed Example of Writing CSV Files using Different CSV Processing Libraries
A Detailed Example of Writing CSV Files using Different CSV Processing Libraries Read More ›

It's the Little Things...
It's often the little things in life, and in coding, that trip us up. Read More ›

Understand velocity and forecast with JIRA Agile
Agile is at the heart of project management methodologies within our software engineering team. While trying our best to follow the Agile principles, we frequently adjust tools, workflows, and methodologies in an effort to deliver better software, more efficiently. Understanding the team’s velocity in this evolving context is key to assessing remaining effort and guesstimating completion dates. Read More ›

The Major Keys of SSH: Using Jump Servers and Port Forwarding
An introduction and tutorial for key based authentication, jump servers, and port forwarding. Read More ›

Tea Chat with a Business Analyst (BA)
What does a business analyst in the cancer genomics and bioinformatics world do? Read More ›

Migrating to Relay Modern
A guide to migrating a legacy Relay code base to Relay Modern Read More ›

An introduction to life-enhancing JetBrains Tools
An introduction to life-enhancing JetBrains Tools Read More ›

OICR presented Collaboratory at the OpenStack Summit in Boston
George and Jared gave a presentation at the OpenStack summit in Boston in May 2017. Read More ›

Data visualization using R and Shiny
Data visualization plays an essential role in interpreting complex biological datasets. It allows researchers to explore, understand and communicate data in a way descriptive statistics cannot compete with. Read More ›

Using AngularJS components & directives inside React
Using AngularJS components & directives inside React Read More ›

Migrating instances within the cloud
How migrating instances within the cloud can improve user experience and make your life easier Read More ›

Create a mobile app with Laravel PHP Framework
An easy way to create a mobile app from an existing Laravel application Read More ›

Distributed Ledgers, Blockchains and Smart Contracts
An introduction to the ideas, technologies, and use cases of distributed ledgers, blockchains and smart contracts. Read More ›

JS + Physical World
JavaScript has been around for a while, but in the past couple of years it has gained a lot of attention. It's about time we start using it for purposes other than web development. This article aims to demonstrate what can be achieved in the physical world by using JavaScript. It will also point you in the direction of various tools, libraries, frameworks, devices and tutorials that can help you get started. Read More ›

JSON: Like a Boss - Introduction to ./jq
Bob Tiernay explores the fascinating world of jq, "the JSON Processor". Starting with a motivation, he then covers the language, provides helpful tips, showcases a real-world example, cautions against some things to avoid, and finishes with a discussion of the ecosystem. Read More ›

Azure Blob Storage - Java Examples
Examples in Java for working with Azure Blob Storage Read More ›

Using ELK and Ntopng to monitor data downloads.
How we use ELK (Elasticsearch, Logstash, Kibana) and Ntopng to track and visualize data downloads. Read More ›

Supporting D3 v3 and v4
D3 is a popular JavaScript library for data visualizations. Last year they released version 4, which included a significant rewrite of the API. This introduced a problem for us because we had one project using D3 v3 and another project using D3 v4. These two projects shared some code, and therefore we needed to be able to support both versions of D3 depending on which one was available. Read More ›

Migrating a legacy frontend build system to Webpack
Migrating a legacy frontend build system to Webpack Read More ›

GitHub repository as job scheduling system to orchestrate large data transfer
The ICGC Data Coordination Centre was tasked with transferring a dataset of over 700 TB into cloud storage systems. We developed a simple and reliable job scheduling system based on a GitHub repository, and successfully employed it to orchestrate and track the execution of over 45,000 transfer jobs to complete the task. Read More ›

Adding support for multiple authors in Jekyll
OICR being a research institute, a portion of the team, in particular those of us more on the science side, is used to academic publishing and its related conventions, such as academic authorship. It was not long before we were asked to support multiple authors in a blog post. Read More ›

tmux your local dev environment
Web developers spend a lot of time in the terminal. Add in a tab for a text editor, another one for running tests and yet another for git, and the number of terminal tabs balloons. Tmux is used a lot on servers to share sessions between users, but it can also be used locally in tandem with tmuxinator and tmux-resurrect to manage this headache. Read More ›

Shading Elasticsearch
Shading, also known as package renaming or class relocation, is the process of creating an uber-jar that bundles its dependencies and renames the packages of some of them. In this blog post I will provide instructions on how to create an Elasticsearch jar file with shaded dependencies to save you from the perils of Jar Hell. Read More ›

Staying up to date with Glance Images
How we keep our public cloud images up to date, improving the user experience and security! Read More ›

New OpenStack Whitepaper
The team has been contributing to a Whitepaper on OpenStack for Scientific Research. It went live today. Read More ›

Hello World!
Welcome to our blog, welcome to our world. We are a team of software engineers, infrastructure specialists and bioinformaticians building infrastructures and tools used by cancer researchers around the world. Read More ›

Another cloud in the sky: Azure in a Unix shop
A guide for Unix shops on how to set up the Azure Storage Emulator running on your team's OpenStack cloud. Read More ›

OpenStack and Ceph used in large scale cancer research
George was at the OpenStack summit and presented the Cancer Genome Collaboratory infrastructure during a BrownBag talk. Read More ›

Simple is Good: A Look at Vue and mobx+react
Presentation given by Chang during an OICR Software Engineering Club meeting. Read More ›

Extreme Streams: The What, How and Why of Observables
Observables are great for building UIs and RxJS is an amazing implementation of them. Despite the library's awesome power, it’s relatively underutilized mostly due to it being “hard”. This talk gives a high level overview of "what" observables are, "how" you use them, and "why" they are useful, through a basic implementation and a real world example (searching reddit for cute animals). Read More ›