September 23, 2020

The geographic bias in medical AI tools

Just a few decades ago, scientists didn't think much about diversity when studying new medications. Most clinical trials enrolled mainly white men living near urban research institutes, on the assumption that any findings would apply equally to the rest of the country. Later research showed that assumption to be false: examples accumulated of medications that proved less effective, or caused more side effects, in populations that were underrepresented in the initial study.

To address these inequities, federal requirements for participation in medical research were broadened in the 1990s, and clinical trials now attempt to enroll diverse populations from the outset of the study.

But we are now at risk of repeating these same mistakes as we develop new technologies, such as AI. Researchers from Stanford University examined clinical applications of machine learning and found that most algorithms are trained on datasets from patients in only three geographic areas, and that the majority of states have no represented patients whatsoever.

"AI algorithms should mirror the community," says Amit Kaushal, an attending physician at VA Palo Alto Hospital and Stanford adjunct professor of bioengineering. "If we're building AI-based tools for patients across the United States, as a field, we can't have the data to train these tools all coming from the same handful of places."

Kaushal, along with Russ Altman, a Stanford professor of bioengineering, genetics, medicine, and biomedical data science, and Curt Langlotz, a professor of radiology and biomedical informatics research, examined five years of peer-reviewed articles in which a deep-learning algorithm was trained for a diagnostic task intended to assist with patient care. Among U.S. studies where geographic origin could be characterized, they found the majority (71%) used patient data from California, Massachusetts, or New York to train the algorithms. Some 60% relied solely on these three locales. Thirty-four states were not represented at all, while the other 13 states contributed limited data.
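To make those figures concrete, here is a minimal sketch of how such a tally could be computed. It is not the authors' code: the function name `summarize_geography`, the two-letter state codes, and the `TOY_STUDIES` list are all assumptions made for illustration, and the toy data is invented rather than taken from the JAMA paper.

```python
# Illustrative sketch only; names and data are hypothetical, not from the study.

# The three states the article names as dominant sources of training data.
TOP_STATES = frozenset({"CA", "MA", "NY"})

# Postal codes for the 50 U.S. states.
ALL_STATES = frozenset(
    "AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD "
    "MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC "
    "SD TN TX UT VT VA WA WV WI WY".split()
)


def summarize_geography(studies):
    """Tally geographic coverage for a list of studies.

    Each study is a collection of state codes its training cohort came from.
    Returns the share of studies using any top state, the share relying
    solely on top states, and the states no study drew patients from.
    """
    cohorts = [frozenset(s) for s in studies]
    if not cohorts:
        raise ValueError("need at least one study")

    n = len(cohorts)
    # A study counts as "any top" if its cohort overlaps the top states.
    any_top = sum(1 for c in cohorts if c & TOP_STATES)
    # A study counts as "only top" if every state it used is a top state.
    only_top = sum(1 for c in cohorts if c and c <= TOP_STATES)
    represented = frozenset().union(*cohorts)

    return {
        "share_any_top": any_top / n,
        "share_only_top": only_top / n,
        "unrepresented": ALL_STATES - represented,
    }


# Invented toy input: five hypothetical studies and the states they drew on.
TOY_STUDIES = [{"CA"}, {"CA", "NY"}, {"MA", "TX"}, {"MN"}, {"NY"}]

if __name__ == "__main__":
    summary = summarize_geography(TOY_STUDIES)
    print(summary["share_any_top"], summary["share_only_top"])
    print(len(summary["unrepresented"]), "states unrepresented")
```

The distinction between the two shares mirrors the article's: a study that mixes Massachusetts and Texas data counts toward the first figure but not the second.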

The research didn't expose bad outcomes from AI trained on these geographies, but it raised questions about the validity of the algorithms for patients in other areas. "We need to understand the impact of these biases and whether considerable investments should be made to remove them," says Altman, associate director of the Stanford Institute for Human-Centered Artificial Intelligence.

"Geography correlates to a zillion things relative to health," Altman says. "It correlates to lifestyle and what you eat and the diet you are exposed to; it can correlate to weather exposure and other exposures depending on if you live in an area with fracking or high EPA levels of toxic chemicals—all of that is correlated with geography."

If these datasets were used for an algorithm to diagnose patients across the United States, "you could be doing actual harm to the people not included in the sample."

Limited data also means limited vision. "The data you have available impacts the problems you can study in the first place," Kaushal says. "If I only have access to data from California, Massachusetts, and New York, I can build algorithms to help people in those places. But problems that are more common in other geographies won't even be on my radar."

The takeaways from this study: Larger and more diverse datasets are needed for the development of innovative AI algorithms. "Stanford has led the way in making diagnostic datasets freely available for science—more than any other center by far," says Langlotz, director of the Stanford Center for Artificial Intelligence in Medicine and Imaging. "But it's expensive and it's not enough. Resources are needed to help centers across the country contribute to more diverse training datasets."

The public also should be skeptical when medical AI systems are developed from narrow training datasets. And regulators must scrutinize the training methods for these new machine learning systems.

"Medicine has been down this road before—early clinical trials didn't think much about gender, racial, or geographic diversity and we are still working to address that oversight," Kaushal says. "As AI is set to enter clinical medicine, we shouldn't have to wait 30, 40 years to make all the same mistakes and fix them again. We should see where this is headed and address it upfront."

More information: Amit Kaushal et al. Geographic Distribution of US Cohorts Used to Train Deep Learning Algorithms, JAMA (2020). DOI: 10.1001/jama.2020.12067

Journal information: Journal of the American Medical Association

Provided by Stanford University

Citation: The geographic bias in medical AI tools (2020, September 23) retrieved 5 July 2024 from https://medicalxpress.com/news/2020-09-geographic-bias-medical-ai-tools.html

