Inspiration for machine learning cell cycle sorting method found in an unlikely place
The lifespan of cells in the human body varies greatly, from just a few days for the cells in our stomach lining to 25-30 years for the cells in our bones. And for each healthy cell, no matter how long it lives, there is a tightly coordinated series of events that leads to its division and the creation of two identical daughter cells. This is called the cell cycle.
Sorting and identifying cells according to where they are in this cycle is a daunting task. If achieved, however, it could help scientists identify cell subpopulations that share a similar transcript, protein expression level, or functional marker, and so reveal differences between healthy and diseased tissues.
These subpopulations combined with single-cell clinical data can be used to predict important clinical characteristics, such as a response to a particular treatment or the likelihood of relapse, for example in specific cancer types.
Thanks to a Swiss National Foundation grant, IBM researchers Marianna Rapsomaniki and Maria Rodriguez Martinez have studied the challenge of sorting cells by cell cycle with Xiaokang Lun and Prof. Bernd Bodenmiller at the University of Zurich, but the answer came from an unlikely field – logistics.
It's a rough analogy, but supply chain management uses analytics to optimize and sort inventory and logistics, which got Rapsomaniki thinking.
She explains, "In the field of biology, we aren't sorting packages based on zip codes. Biologists want to sort cells based on where they are at in their lifecycle. The algorithms are agnostic to whether it's a package or a cell, so I thought it was worth a try."
As luck would have it, Rapsomaniki had access to two supply chain optimization experts within walking distance of her office in Zurich, Switzerland. The duo, IBM scientists Stefan Woerner and Marco Laumanns, typically work with clients to make sure there is enough inventory, such as shovels, for an upcoming snow storm. They initially saw it as a major stretch to apply their methods at the cellular level.
But Rapsomaniki wasn't deterred. She brought together the unusual team to develop an innovative cloud-based technology for scientists to use to study and sort single cells according to their cell cycles, opening new possibilities for drug testing and cancer research. The results are appearing this week in the peer-reviewed journal Nature Communications.
"An entire tumor could be composed of thousands of cells and to study its composition through bulk approaches greatly masks its variability," said Rapsomaniki. "With single-cell approaches such as mass cytometry and our computational method, called CellCycleTRACER, we can pinpoint the proteins at a single-cell resolution, within the data for the first time."
Currently in beta, CellCycleTRACER is a supervised machine-learning algorithm that classifies and sorts single-cell mass cytometry data according to their cell cycle, and allows scientists to correct for cell-cycle-state and cell-volume heterogeneity. Essentially, it's a tool to find the proverbial needle in the haystack.
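To make the idea of supervised classification and sorting concrete, here is a minimal sketch in Python. It is not the CellCycleTRACER method: it swaps in a simple nearest-centroid classifier, and the phase names, the two marker intensities, and all numeric values are hypothetical, generated synthetically for illustration only.

```python
import random
from statistics import mean

# Hypothetical cell-cycle phases, in the order used for sorting.
PHASES = ["G1", "S", "G2", "M"]

# Made-up mean intensities of two illustrative markers per phase (assumption).
PHASE_MEANS = {"G1": (1.0, 1.0), "S": (3.0, 1.5), "G2": (4.0, 4.0), "M": (1.5, 5.0)}


def simulate_cells(n_per_phase, rng):
    """Generate synthetic (marker_vector, phase_label) pairs."""
    cells = []
    for phase, (m1, m2) in PHASE_MEANS.items():
        for _ in range(n_per_phase):
            cells.append(((rng.gauss(m1, 0.3), rng.gauss(m2, 0.3)), phase))
    return cells


def fit_centroids(labelled_cells):
    """Supervised step: average the marker vectors of each labelled phase."""
    centroids = {}
    for phase in PHASES:
        points = [x for x, label in labelled_cells if label == phase]
        centroids[phase] = tuple(mean(coord) for coord in zip(*points))
    return centroids


def predict(centroids, x):
    """Assign a cell to the phase whose centroid is closest (squared distance)."""
    return min(
        centroids,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(x, centroids[p])),
    )


rng = random.Random(0)
train = simulate_cells(50, rng)   # labelled cells used for training
test = simulate_cells(20, rng)    # held-out cells to classify

centroids = fit_centroids(train)
predicted = [predict(centroids, x) for x, _ in test]
accuracy = mean(1.0 if p == label else 0.0 for p, (_, label) in zip(predicted, test))

# Sort the held-out cells by their predicted phase.
ordered = sorted(zip(predicted, test), key=lambda item: PHASES.index(item[0]))
print(f"accuracy on synthetic held-out cells: {accuracy:.2f}")
```

Real mass cytometry data involves many more markers and continuous progression through the cycle rather than four clean clusters, so this only illustrates the general train, classify, and sort pattern.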
The algorithm is implemented as a simple and intuitive web application that can be applied to any mass cytometry dataset. Rapsomaniki and her co-authors are currently bringing it to the cloud, where scientists throughout the world will be able to upload and analyze their datasets for free.
"I'm confident that researchers in life sciences, particularly those studying tumor heterogeneity, will be able to explore cell cycle behavior better than ever before. Experimentalists will also find applications in examining the trajectories of the different proteins and understanding how they fluctuate across their cell cycle," adds Rodriguez Martinez.
Who knew there were such similarities between delivering packages and sorting cells?