A national health data infrastructure could manage pandemics with less disruption
If we did not know it before, we know it now: pandemics present dire threats to our lives, similar to climate change and nuclear proliferation. Confronting these threats requires social and technical innovation and the willingness to view potential solutions in entirely new ways.
As Canada struggles with calibrating its response to COVID-19, the limits of our existing crisis strategies are plain to see.
Political leaders are stuck between controlling the spread of the pandemic and resuming commercial and economic activity. How quickly should restrictions on confinement and social distancing be relaxed? And for whom? Their responses rely largely on the extensive use of personal protective equipment (notably masks), deployment of immunity tests and test-and-tracing technologies.
There are two problems with this approach: first, it is based on after-the-fact views of COVID-19's spread. And second, it treats the pandemic as a medical problem.
Managing the unknowns
The facts of this virus are becoming clear. While it is hard to know who is infected given that many may be asymptomatic, we do know that the vast majority of those who become infected will not experience severe symptoms. Data from France show that if everyone gets infected, only approximately one percent of the population will experience symptoms severe enough to require admission to an intensive care unit.
Instead of using the blunt instrument of designing public health policy for an entire population, would it make more sense to predict who would fall into that highly vulnerable one percent group and then devote the state's resources to protecting them? That way, those who are less vulnerable can continue about their lives, while those who are more vulnerable would be better protected.
Governments are not following this path. They see COVID-19 as primarily a medical problem when it is really an information problem. Seen as an information problem, it opens the door to potential solutions. These solutions use advanced information technologies that have proven successful in other contexts.
Consider personalized prediction. Machine-learning models fed with vast quantities of health data, for example, could be trained to make clinical risk predictions. Public health leaders could use these prediction models to identify those who are vulnerable and who would need to be quarantined and prioritized for access to scarce medical resources, such as personal protective equipment, dedicated health support, free delivery of groceries and other necessities.
Personalized prediction, based on machine learning and artificial intelligence, has transformed businesses over the last 20 years. Netflix evaluates consumers' characteristics and past choices to make personalized recommendations about what they might watch next. Amazon uses the same approach to recommend future purchases based on past spending behavior.
A similar approach could be taken to measure individuals' clinical risk of suffering severe outcomes if infected during a pandemic such as COVID-19. What would this look like if rolled out on a country-wide scale?
Each person would receive an electronic message with their clinical risk score, which would be derived automatically from their medical records and reflect how vulnerable they are to a particular virus. Those with predicted scores above a certain threshold would be classified as "severe" or "high risk." They would be temporarily isolated and supported. Those with scores below a threshold would be able to return to a more-or-less normal life.
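The threshold rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Person` record, the `classify` and `triage` helpers and the cutoff value of 0.5 are hypothetical, since the article does not specify how scores would be scaled or where the threshold would sit.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the article does not give a value.
HIGH_RISK_THRESHOLD = 0.5


@dataclass
class Person:
    person_id: str
    risk_score: float  # assumed to be a predicted probability in [0, 1]


def classify(person, threshold=HIGH_RISK_THRESHOLD):
    """Label a person 'high risk' if their score is above the threshold."""
    if not 0.0 <= person.risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    return "high risk" if person.risk_score > threshold else "lower risk"


def triage(people, threshold=HIGH_RISK_THRESHOLD):
    """Group person IDs by risk label so each group can be handled differently."""
    groups = {"high risk": [], "lower risk": []}
    for person in people:
        groups[classify(person, threshold)].append(person.person_id)
    return groups
```

In such a scheme, the people in the "high risk" group would be the ones offered temporary isolation and support, and the choice of threshold would be a policy decision rather than a technical one.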
A personalized approach to clinical risk during a pandemic outbreak has multiple benefits. It could protect medical systems from being overwhelmed and communities from the economic pain of indiscriminate lock-downs. It could help build herd immunity with lower mortality—and fast. It could also allow a more targeted and fairer allocation of resources, from test kits to hospital beds. Unlike medical tests that are scarce, expensive and slow to deploy, a data-driven digital personalization approach could be applied quickly and is relatively easy to scale.
An approach based on data science and machine learning could also enable safer de-confinement at a much faster rate than current best practices. In one study, my co-authors and I used COVID-19 data from France as of early May 2020 to understand the public health policies regarding the enacting and lifting of restrictions intended to control the spread of disease.
Our simulations show that isolation entry and exit policies could be substantially faster and safer using personalized prediction models. They indicated that the complete lifting of COVID-19 restrictions could be undertaken in six months, with only 30 percent of the population being under strict isolation for longer than three months, all without overwhelming the medical system. In contrast, using conventional methods, simulations indicated that the complete exit would take 17 months, and 40 percent of the population would be subject to strict isolation for more than one year.
This ideal scenario may seem like a moonshot, but a simple version could be designed and rolled out fairly quickly. Governments can focus on the data and models that can be deployed for COVID-19. For example, data on each person's age, body mass index, hypertension and diabetes (all of which can be assessed at a community pharmacy for everyone within weeks and applied to an individual's health card) can be used to train models. Even with just this information, public policy can be much more targeted.
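To make the idea of a simple model concrete, here is a minimal sketch of a logistic regression trained on the four inputs named above. It is not the model used in the study. The feature scaling, the learning rate, the number of passes and especially the six training records are invented purely for illustration. A real model would need actual patient outcomes and proper validation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Return a risk score between 0 and 1 for one person's features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = predict(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Invented toy records, NOT real data.
# Columns: age / 100, BMI / 50, hypertension (0 or 1), diabetes (0 or 1).
toy_rows = [
    [0.25, 0.44, 0, 0],
    [0.35, 0.50, 0, 0],
    [0.45, 0.60, 1, 0],
    [0.60, 0.56, 0, 1],
    [0.75, 0.64, 1, 1],
    [0.85, 0.70, 1, 0],
]
# 1 = severe outcome, 0 = no severe outcome (also invented).
toy_labels = [0, 0, 0, 1, 1, 1]

weights, bias = train(toy_rows, toy_labels)
score_older = predict(weights, bias, toy_rows[4])    # 75 years old, both conditions
score_younger = predict(weights, bias, toy_rows[0])  # 25 years old, neither condition
```

The output of `predict` is the kind of individual risk score that could then be compared against a policy threshold.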
National data infrastructure
What would need to happen to implement this new model on a province- or country-wide basis? For one thing, a deep data pool. Training a machine learning model for a pandemic such as COVID-19 would require data on thousands of people who tested positive and were hospitalized for the virus. It would also require medical data for everyone else in the population, akin to the information dossiers that big tech firms such as Facebook or Netflix have on consumers.
This is why government commitment to building a robust health data infrastructure is so important. Unfortunately, in Canada as elsewhere, the state of electronic health records varies widely. Depending on the jurisdiction, records may be incomplete or difficult to access, and information may not be standardized. A commitment to address these shortcomings is paramount. Privacy protections and cybersecurity provisions would need to be developed and well communicated.
As COVID-19 shows, the upside of applying advanced analytical tools used successfully elsewhere vastly outweighs the downside of staying the course. The question is not whether countries can apply artificial intelligence at a health-system scale. It is already being used at scale for commercial purposes that hardly involve life-or-death issues. The question for policy makers is: Can we afford not to go down this path?