Researchers will use AI to predict who may develop certain rare diseases
A team of researchers from University of Florida Health and Penn Medicine is using a set of artificial intelligence-powered algorithms called PANDA to find rare "zebras" in patient medical records and help patients affected by certain rare diseases get diagnosed and treated more quickly.
In health care circles, rare diseases are sometimes referred to as "zebras" because they are so unusual and unexpected. In the United States, any disease that affects fewer than 200,000 people nationwide is considered a rare disease. Worldwide, there are about 7,000 known rare diseases. Together, these conditions affect about 10% of the U.S. population.
Because the symptoms of rare diseases are often vague and perplexing and because so few people are affected, diagnosing them can be difficult, according to Jiang Bian, Ph.D., a professor in the College of Medicine at the University of Florida and chief data scientist for University of Florida Health.
For this reason, Bian said, "Some patients with rare diseases may go undiagnosed and untreated for years." Bian is part of a team of researchers from UF Health and the Perelman School of Medicine at the University of Pennsylvania that is using artificial intelligence and electronic health records to develop an alert system that will sound the alarm for doctors whose patients appear likely to develop certain rare diseases.
The researchers will develop a set of algorithms powered by machine learning, a form of artificial intelligence, to identify which patients are at risk of five types of vasculitis and two types of spondyloarthritis: psoriatic arthritis and ankylosing spondylitis. These predictions, derived from information already available in patients' electronic health records, could greatly increase the chance of patients being diagnosed sooner.
Efforts to develop this prediction method, called "PANDA: Predictive Analytics via Networked Distributed Algorithms for multi-system diseases," will be led by Bian at UF and, at Penn, by Yong Chen, Ph.D., a professor of biostatistics, and Peter A. Merkel, M.D., M.P.H., chief of rheumatology and a professor of medicine and epidemiology.
"This is an exciting step forward, building on our current PDA framework, from clinical evidence generation toward AI-informed interventions in clinical decision-making," Chen said. "Despite the clear need to reduce the dangerous and costly delays in diagnosis, individual clinicians, especially in primary care, face important challenges."
Chen used one of the forms of vasculitis under study, granulomatosis with polyangiitis, as an example of the promise the PANDA system holds. This condition involves inflammation of many organs and can be extremely severe or even fatal. Mortality rates for patients remain high in the first year after diagnosis, and the correct diagnosis of this type of vasculitis, and all the other types, can be delayed by months or even years.
"An earlier diagnosis of any of the types of vasculitis and spondyloarthritis we're working on leads to a much better prognosis and better clinical outcomes," Merkel said. "Even if we determine that a patient has just a 10% likelihood of developing one of these diseases, that is a much higher chance of a rare problem, and clinicians can keep that in mind and make better decisions for their patients."
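To put Merkel's 10% figure in perspective, a rough back-of-the-envelope comparison may help. The sketch below uses the 200,000-person threshold that defines a rare disease in the United States and compares it with a 10% predicted risk. The U.S. population figure of 330 million is an assumed round number, not a figure from the researchers, and the function name is purely illustrative.

```python
# Illustrative arithmetic only; this is not part of the PANDA system.
RARE_DISEASE_THRESHOLD = 200_000   # U.S. cutoff for a rare disease
US_POPULATION = 330_000_000        # ASSUMPTION: round figure for illustration


def relative_risk(predicted_risk: float, baseline_prevalence: float) -> float:
    """Return how many times larger the predicted risk is than the baseline."""
    return predicted_risk / baseline_prevalence


# Highest prevalence a disease can have and still count as rare.
max_prevalence = RARE_DISEASE_THRESHOLD / US_POPULATION

# Compare a 10% predicted likelihood with that upper-bound baseline.
ratio = relative_risk(0.10, max_prevalence)

print(f"Upper-bound prevalence of a rare disease: {max_prevalence:.4%}")
print(f"A 10% predicted risk is about {ratio:.0f} times that baseline")
```

Under these assumptions, a patient flagged at 10% would be roughly 165 times more likely to have the disease than the highest background rate a rare disease can have, which is why even a seemingly modest prediction could change how a clinician weighs the possibility.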
Among the challenges clinicians and their patients face in diagnosis is the way rare diseases can camouflage themselves as more common ones. Clinicians also may be stymied by a lack of access to data or to the patient's other clinicians and, simply, by a lack of familiarity with such uncommon conditions. An algorithm that automatically scans known information to identify the possibility of a disease like granulomatosis with polyangiitis (GPA) could be lifesaving.
"The increasing availability of real-world data, such as electronic health records collected through routine care, provides a golden opportunity to generate real-world evidence to inform clinical decision-making," Bian said. "Nevertheless, to leverage these large collections of real-world data, which are often distributed across multiple sites, novel distributed algorithms like PANDA are much needed."
The researchers plan to pull data through PCORnet, the National Patient-Centered Clinical Research Network. This integrated partnership of large clinical research networks contains health data from more than 27 million patients nationwide. De-identified data from these patients, including lab test results, comorbid conditions, past treatments and other commonly available information, will be used to create the algorithms. Once the algorithms are built, the researchers will test each one's predictive power across more than 10 health systems. The methods the team develops will be shared and available to apply to other diseases.
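The general idea behind a distributed algorithm of this kind can be sketched in a few lines of code. The example below is not PANDA's actual method, which the researchers have not detailed here. It is a minimal illustration, assuming a simple logistic regression risk model, of how several health systems could jointly fit one model while sharing only summary numbers rather than patient records. The toy data, the single made-up "marker" feature and all function names are hypothetical.

```python
import math


def local_gradient(weights, rows):
    """Run at one site: return the summed gradient and the row count.

    Only these aggregates leave the site; patient rows stay local.
    """
    grad = [0.0] * len(weights)
    for features, label in rows:
        z = sum(w * x for w, x in zip(weights, features))
        pred = 1.0 / (1.0 + math.exp(-z))
        for j, x in enumerate(features):
            grad[j] += (pred - label) * x
    return grad, len(rows)


def federated_fit(sites, n_features, steps=500, lr=0.5):
    """Coordinator: combine per-site gradients into one shared model."""
    weights = [0.0] * n_features
    for _ in range(steps):
        total = [0.0] * n_features
        count = 0
        for rows in sites:
            grad, n = local_gradient(weights, rows)
            total = [t + g for t, g in zip(total, grad)]
            count += n
        # Gradient-descent step on the average gradient across all sites.
        weights = [w - lr * t / count for w, t in zip(weights, total)]
    return weights


def predict(weights, features):
    """Predicted probability of disease for one patient."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# HYPOTHETICAL toy data: features are [intercept, marker present?], label 1 = disease.
site_a = [([1.0, 0.0], 0), ([1.0, 0.0], 0), ([1.0, 1.0], 1), ([1.0, 1.0], 1)]
site_b = [([1.0, 0.0], 0), ([1.0, 0.0], 1), ([1.0, 1.0], 1), ([1.0, 1.0], 0)]
sites = [site_a, site_b]

weights = federated_fit(sites, n_features=2)
print("Risk with marker:   ", round(predict(weights, [1.0, 1.0]), 3))
print("Risk without marker:", round(predict(weights, [1.0, 0.0]), 3))
```

Because the per-site gradients simply add up, the model fitted this way matches, up to rounding, what would be obtained if all records were pooled in one place. That property is what makes approaches of this type attractive when data cannot leave each institution.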
As their name implies, machine learning algorithms are designed to "learn" and refine themselves as they are used and fed more data. For this reason, it's possible that PANDA will become more helpful as time passes.
"Ultimately, we hope to build on the algorithms developed for rare diseases and apply them to other diseases," Bian said.