'People are thirsting for quantitative information': The COVID-19 virus, by the numbers
When Rob Phillips decided to leave the field of condensed matter physics and become a biologist, it was because of a fascination with viruses. For the last 20 years, Phillips, Caltech's Fred and Nancy Morris Professor of Biophysics, Biology, and Physics, has sought to get a numerical grasp on the study of viruses.
"People are thirsting for quantitative information," he says. "When you have a numerical grasp on the facts, you can make more precise predictions about how processes unfold. It's complicated and hard, but we've made headway."
In 2018, Phillips and his longtime collaborators Ron Milo and Yinon Bar-On made sweeping quantitative estimates of how much biological matter there is on the planet, illustrating how humans have a disproportionate impact on our environment.
Now, with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causing a global pandemic, Phillips and the team, once again led by Ron Milo of the Weizmann Institute of Science in Israel, have turned their "by-the-numbers" approach toward studying the novel virus. He and his collaborators have now published a paper in the journal eLife about the key numbers underlying COVID-19: the virus's average concentrations throughout the body, its stability on various surfaces, the infection rate, and more. We sat down with Phillips to discuss the new paper.
Let's start with the basics: What is a virus?
Viruses occupy the shadowy world between the living and non-living. A virus is essentially a little container, inside of which is its genetic information. Viruses have very few genes. The novel coronavirus has about 25 genes. For comparison, a bacterium has on the order of 4,000 genes and a human has on the order of 20,000 genes. It's fascinating that viruses have such a huge biological impact with so few genes.
During an infection, viruses act like Trojan horses. They carry their genetic material into a host cell and trick the host cell into making hundreds or thousands of new copies of the virus. Then the new viruses burst out of the cell, destroying it, and go do the same to more cells. At that point, it's basically warfare between the virus and the host's immune system, which tries to marshal a defense and eliminate cells that harbor the virus.
What are the numbers showing us? What are some key takeaways?
The genetic data are quite interesting: SARS-CoV-2 is 50 percent genetically similar to the common cold coronavirus, and 96 percent similar to coronaviruses in bats. These zoonotic viruses—ones that originate in animals and transfer to humans—are here to stay. They are a part of the human story.
The data on the evolution of the virus and its mutation rate are going to give insights into the timescale over which the virus will change. Will there be a second wave of infections? Will it be seasonal? For example, the influenza virus evolves basically every year, which is why we have to get a new flu shot every year.
One thing that makes it hard to get an overall handle on infection rates is the wide statistical variation in the data. Some people are much more infectious than others, and the viral loads harbored by different infected people vary enormously. It's a dynamic situation, so we're still getting a grasp on the numbers.
What questions are still open? What are you most interested in answering?
There is a wide variety of fascinating and very diverse questions that remain unanswered, ranging from what fraction of the viruses that emerge from an infected cell are infectious, to the number of viruses that are shed from such an infected cell, to the fraction of the population that is asymptomatic. One of my particular passions is the way we can use mathematics to make predictions about the future path of dynamic systems. Of course, one of the most famous examples in the history of science took place in the early days of physics and astronomy, when scientists such as Tycho Brahe, Johannes Kepler, and Isaac Newton came to terms with how to predict the motion of planets. Similar ideas about so-called "dynamical systems" are relevant to the spread in space and time of a pandemic like the one we are living through now, and the vast compendium of data now available with high spatial, temporal, and demographic resolution provides an opportunity to really drill down and figure out how this pandemic unfolded.
This raises the hope that the next time such a pandemic arrives, and possibly even for this pandemic itself, if people are able to work fast enough, we will know much better how to think about how it will evolve and perhaps better how to react to it.
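To make the idea of a "dynamical system" for an epidemic concrete, here is a minimal sketch. It does not come from the eLife paper or from Phillips's group. It is the textbook SIR (susceptible, infected, recovered) compartment model, stepped forward in time with a simple forward-Euler scheme. The function name `simulate_sir` and every parameter value below are illustrative assumptions, not estimates for SARS-CoV-2.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Integrate the textbook SIR model with forward-Euler steps.

    beta  : transmission rate per day (hypothetical, supplied by the caller)
    gamma : recovery rate per day (1 / average infectious period)
    Returns a list of (day, S, I, R) tuples sampled once per day.
    """
    s = float(population - initial_infected)  # susceptible
    i = float(initial_infected)               # currently infected
    r = 0.0                                   # recovered / removed
    steps_per_day = int(round(1 / dt))
    history = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            # People move S -> I in proportion to contacts between S and I,
            # and I -> R at a constant per-person rate.
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((day, s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative assumptions only: these are NOT fitted to COVID-19 data.
    run = simulate_sir(population=1_000_000, initial_infected=10,
                       beta=0.3, gamma=0.1, days=200)
    peak_day, _, peak_infected, _ = max(run, key=lambda row: row[2])
    print(f"peak on day {peak_day} with about {peak_infected:,.0f} infected")
```

In this toy model, when beta exceeds gamma the number of infected people rises to a peak and then falls; when beta is below gamma, it only decays. Forecasts for a real pandemic rest on far richer models and on the kind of high-resolution data Phillips describes.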
What was the impetus to turn your research toward the novel coronavirus?
Six or seven weeks ago, just as the coronavirus was beginning to make its way toward the U.S., one of my friends, a physicist who studies viruses, told me that by April all of us would be confined to our houses and unable to fly anywhere. To be honest, it scared me. I realized that we needed to pay attention, and I wanted to figure out where my team could make a difference.
One of the difficulties with research into SARS-CoV-2 is that there are so many individual vignettes. People are all looking at different pieces of the puzzle; it's a bit like the story of the blind men feeling different parts of an elephant and each coming up with a different picture of the creature. In the case of the coronavirus, some of these vignettes are: the survival of viruses on various surfaces, the structural biology of the spike protein on the virus's surface, electron microscopy of the structural elements of the virus, the cell biology of how the virus interacts with its host, and so on.
What we wanted to do was to create a one-stop curated resource where people could find key numbers about the biology and infection process of SARS-CoV-2. My collaborators—Ron Milo and two amazing graduate students, Yinon Bar-On and Avi Flamholz—and I wanted to take stock of the scientific literature and bring it all together in one place. To that end, all of our references are completely transparent, directly quoting sentences from each paper and indicating precisely how we know the things we know. The paper will also be live, meaning that every time the literature updates some number, we can update this paper.
The paper is titled "SARS-CoV-2 (COVID-19) by the numbers."
More information: Yinon M Bar-On et al. SARS-CoV-2 (COVID-19) by the numbers, eLife (2020). DOI: 10.7554/eLife.57309