November 8, 2022

A skewed model for imbalanced health data

by King Abdullah University of Science and Technology

An asymmetric statistical model provides a better fit for imbalanced data with rare "positives," such as longitudinal health datasets.

Sometimes a more complex but more accurate model is needed when the standard off-the-shelf models just do not cut it. That is the message from researchers in KAUST's Statistics Program.

One example is large health datasets that record the occurrence of rare diseases. Particularly in longitudinal studies that track many patients over many years, finding the few instances of a disease in a large dataset poses challenges for standard statistical approaches.

"In longitudinal studies, we might want to find the relationship between a certain disease and several potentially influential factors," says Zhongwei Zhang, a Ph.D. student with Raphael Huser. "To do so, we might collect data over time from hundreds of subjects. The resulting response data would be binary—either disease or no disease—and the responses for the same subject are correlated because they are collected from the same person."

For such correlated binary response data, the state-of-the-art model is the multivariate probit model. However, this model might not be suitable when the data are distributed asymmetrically or are imbalanced, that is, when positives and negatives do not occur in roughly equal numbers.
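To make the setup concrete, here is a minimal Python sketch of how a probit-type model can generate correlated binary responses: each subject has latent Gaussian values that share a subject-level effect, and a response is positive when its latent value exceeds zero. The exchangeable correlation `rho`, the intercept `mu`, the 5% target prevalence and the function name `simulate_probit_panel` are illustrative assumptions, not details taken from the study.

```python
import random
from statistics import NormalDist


def simulate_probit_panel(n_subjects, n_visits, mu, rho, seed=0):
    """Simulate correlated binary responses from a latent Gaussian model.

    Assumed (hypothetical) structure:
        Z_ij = mu + sqrt(rho) * b_i + sqrt(1 - rho) * e_ij,
    with b_i and e_ij independent standard normals, and Y_ij = 1 if Z_ij > 0.
    Each Z_ij has variance 1, and two visits of the same subject have
    latent correlation rho.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_subjects):
        b = rng.gauss(0.0, 1.0)  # shared subject effect induces the correlation
        row = []
        for _ in range(n_visits):
            e = rng.gauss(0.0, 1.0)  # visit-specific noise
            z = mu + rho ** 0.5 * b + (1.0 - rho) ** 0.5 * e
            row.append(1 if z > 0 else 0)
        panel.append(row)
    return panel


# Rare positives: under the probit link, P(Y = 1) = Phi(mu).
mu = NormalDist().inv_cdf(0.05)  # intercept chosen for about 5% positives
panel = simulate_probit_panel(n_subjects=2000, n_visits=10, mu=mu, rho=0.5)
prevalence = sum(map(sum, panel)) / (2000 * 10)
print(f"target prevalence 0.05, simulated prevalence {prevalence:.3f}")
```

In this sketch the imbalance comes only from the intercept; the link itself, the standard normal CDF, stays symmetric around zero.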

"The multivariate probit model might not always provide the best fit for highly imbalanced data because of this symmetric link model, possibly resulting in substantial bias in the estimation of the mean response," explains Zhang. "There is a need to develop flexible asymmetric link models for this type of data. In this study, we developed a novel multivariate skew-elliptical link model that can explain the data better."

The skew-elliptical link model is flexible enough to capture imbalance in the data, such as cases where the majority of the results are zero but a small yet significant portion is equal to one. Because it embeds the multivariate probit model as a special case, it can be used for both balanced and imbalanced data.
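The idea of an asymmetric link that contains the probit link as a special case can be illustrated in one dimension. The sketch below uses the univariate skew-normal distribution as a simple stand-in for the skew-elliptical family; it is not the authors' multivariate model. The function names `owens_t` and `skew_normal_link` and the shape value `alpha = 3.0` are assumptions made for illustration. With `alpha = 0` the link reduces to the standard normal CDF.

```python
import math
from statistics import NormalDist


def owens_t(h, a, n=400):
    """Owen's T function, T(h, a), by Simpson's rule on [0, a] (n must be even)."""
    if a == 0.0:
        return 0.0
    step = a / n

    def f(t):
        return math.exp(-0.5 * h * h * (1.0 + t * t)) / (1.0 + t * t)

    total = f(0.0) + f(a)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * step)
    return total * step / 3.0 / (2.0 * math.pi)


def skew_normal_link(x, alpha):
    """CDF of the standard skew-normal distribution with shape alpha.

    Uses the identity F(x; alpha) = Phi(x) - 2 * T(x, alpha).
    alpha = 0 gives back the symmetric probit link Phi(x).
    """
    return NormalDist().cdf(x) - 2.0 * owens_t(x, alpha)


# A symmetric link satisfies F(x) + F(-x) = 1; a skewed link does not.
for alpha in (0.0, 3.0):
    total = skew_normal_link(1.0, alpha) + skew_normal_link(-1.0, alpha)
    print(f"alpha = {alpha}: F(1) + F(-1) = {total:.4f}")
```

The shape parameter gives the link a degree of freedom the probit link lacks, which is what lets an asymmetric model adapt to data where positives and negatives behave differently.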

The new model, developed by Zhang with KAUST professors Marc Genton and Huser, was shown to provide a better fit to a highly imbalanced COVID-19 dataset from a region of California in the United States.

"There is often a tradeoff between flexibility and parsimony," Zhang says. "If you are looking for easily interpretable models with efficient inference, then go for the parsimonious models at hand. But if you are looking for models with the best performance according to certain criterion, there might exist more complicated models that are more suitable."

The research was published in Biometrics.

More information: Zhongwei Zhang et al, Tractable Bayes of skew‐elliptical link models for correlated binary data, Biometrics (2022). DOI: 10.1111/biom.13731

Provided by King Abdullah University of Science and Technology

Citation: A skewed model for imbalanced health data (2022, November 8) retrieved 17 July 2024 from https://medicalxpress.com/news/2022-11-skewed-imbalanced-health.html

