Did you know that filling out your census form will help computer scientists model how diseases spread in the United States?
Over the last four years, researchers at RTI International in North Carolina have been transforming data from the 2000 census—which described the country's 281 million people and 116 million households—into a virtual U.S. population. They finished the "synthetic population" last year, and they plan to update it when the 2010 census results come out.
The scientists developed the synthetic population as part of the National Institute of General Medical Sciences' Models of Infectious Disease Agent Study (MIDAS) at the National Institutes of Health. By integrating the population into their computer models, MIDAS researchers can better simulate the spread of an infectious outbreak through a community and identify the best ways to intervene.
The synthetic population doesn't exactly reproduce your hometown in silico, but it comes pretty close. The census protects citizens' privacy, so the RTI researchers couldn't duplicate John Smith from Manhattan or Jane Doe from Iowa City. Nor could they take each neighborhood home, apartment building, college dorm, family farm and sprawling ranch and plop it down at its exact address on the computer.
But the census data did give them the population, household sizes, family incomes and residents' ages and ethnicities for every town, county and state. Plugging all this information into their computers allowed the researchers to create a mirror-country that has the same overall demographics as our actual one.
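To make the idea concrete, here is a minimal Python sketch of how aggregate counts might be turned into individual synthetic households. It is an illustration only, not RTI's actual method: the county, its household counts, the age-group shares and the function name build_synthetic_county are all made up for this example.

```python
import random
from collections import Counter

# Hypothetical aggregate figures for one made-up county (not real census data).
HOUSEHOLDS_BY_SIZE = {1: 300, 2: 400, 3: 200, 4: 100}
AGE_GROUP_SHARES = {"0-17": 0.25, "18-64": 0.60, "65+": 0.15}


def build_synthetic_county(households_by_size, age_group_shares, seed=0):
    """Return a list of households; each household is a list of age-group labels."""
    rng = random.Random(seed)
    groups = list(age_group_shares)
    weights = list(age_group_shares.values())
    households = []
    for size, count in households_by_size.items():
        for _ in range(count):
            # Draw each member's age group from the county-wide shares.
            households.append(rng.choices(groups, weights=weights, k=size))
    return households


county = build_synthetic_county(HOUSEHOLDS_BY_SIZE, AGE_GROUP_SHARES)
size_counts = Counter(len(household) for household in county)
population = sum(len(household) for household in county)
```

In this sketch the household-size counts match the input table exactly, while the age mix only approximates the target shares because it is drawn at random. No synthetic person corresponds to any real individual, which is the privacy point made above.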
"The synthetic population looks statistically exactly like the real population," says Irene Eckstrand, who directs the MIDAS program. "It has all the characteristics of real communities but doesn't invade anyone's privacy."
The number and types of houses in your county match those in the corresponding synthetic county. And each home is in an appropriate place—on a residential patch of land, not in a lake or the middle of an airport.
Disease modelers can manipulate all or selected parts of the new, ready-made synthetic population. They can model the entire country or just one town.
They can program the virtual citizens—or agents, as modelers call them—to behave in certain ways. For instance, in an outbreak simulation, one agent may get vaccinated while another refuses.
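A toy version of that kind of agent behavior might look like the Python sketch below. The Agent class, the accepts_vaccine flag and the 70 percent acceptance rate are assumptions invented for illustration; the MIDAS models themselves are far more detailed.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    agent_id: int
    accepts_vaccine: bool  # hypothetical behavior flag set by the modeler
    vaccinated: bool = False


def run_vaccination_campaign(agents):
    """Vaccinate every willing agent and return how many were vaccinated."""
    for agent in agents:
        if agent.accepts_vaccine:
            agent.vaccinated = True
    return sum(agent.vaccinated for agent in agents)


rng = random.Random(42)
ACCEPTANCE_RATE = 0.7  # made-up share of agents willing to be vaccinated
agents = [Agent(i, rng.random() < ACCEPTANCE_RATE) for i in range(1000)]
vaccinated = run_vaccination_campaign(agents)
```

Changing ACCEPTANCE_RATE and rerunning is the sort of "what if" experiment a modeler could use to compare intervention strategies.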
Having populations already on hand can help speed up disease-spread simulation and allow modelers and policymakers to keep pace with real outbreaks, including the H1N1 pandemic.
Plus, modelers no longer need to wrangle raw census data for each model and can focus instead on refining their simulations, says Bill Wheaton, a research geographer who oversees the project at RTI.
The synthetic population will also help modelers study the impact of social networks on disease spread. Researchers can track where agents work or go to school, who they live with and who they're likely to meet running errands. Since people get sick when they come into contact with others who've been infected, studying these social patterns in models should be helpful in understanding them in the real world.
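The Python sketch below shows one simple way such contact patterns could drive an outbreak simulation. It rests on loud assumptions: agents meet only people who share their household or workplace, every infected agent stays infectious for the whole run, and the names build_contacts, simulate and p_transmit are hypothetical, not taken from any MIDAS model.

```python
import random
from collections import defaultdict


def build_contacts(agents):
    """Map each agent id to the set of agents sharing its household or workplace."""
    groups = defaultdict(set)
    for agent_id, (household, workplace) in agents.items():
        groups[("home", household)].add(agent_id)
        groups[("work", workplace)].add(agent_id)
    contacts = defaultdict(set)
    for members in groups.values():
        for agent_id in members:
            contacts[agent_id] |= members - {agent_id}
    return contacts


def simulate(agents, seeds, p_transmit, days, rng):
    """Return the set of agents ever infected after `days` daily steps."""
    contacts = build_contacts(agents)
    infected = set(seeds)
    for _ in range(days):
        newly_infected = set()
        for source in infected:
            for other in contacts[source]:
                # Each contact with an infected agent transmits with probability p_transmit.
                if other not in infected and rng.random() < p_transmit:
                    newly_infected.add(other)
        infected |= newly_infected
    return infected


# Six made-up agents: id -> (household, workplace). Agent 5 shares nothing with the rest.
agents = {
    0: ("h1", "w1"), 1: ("h1", "w2"), 2: ("h2", "w2"),
    3: ("h2", "w3"), 4: ("h3", "w3"), 5: ("h4", "w4"),
}
outbreak = simulate(agents, seeds={0}, p_transmit=1.0, days=10, rng=random.Random(0))
```

With certain transmission, the infection travels along the chain of shared homes and workplaces from agent 0 to agent 4 but never reaches agent 5, who has no contacts in common with the others.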
Next, the researchers want to create international synthetic populations. They've already finished one for the 110 million people in Mexico, and they're currently working on another one for India.
Modeling many countries is important, says Wheaton, because "infectious disease is not a one-country problem—it spreads around the globe."