A team of researchers from Universidad Nacional del Litoral–Consejo Nacional de Investigaciones Científicas and Universidad Nacional de Entre Ríos, both in Argentina, has found evidence that gender-imbalanced datasets affect the performance of AI-based diagnostic systems used for pathology classification. In their paper published in Proceedings of the National Academy of Sciences, the group describes testing three open-source machine-learning algorithms used for analyzing X-ray images to detect various medical conditions, and what they found.

Though it may not be widely known, AI systems are currently being used in a wide variety of commercial applications, including selecting articles on news sites, deciding which movies get made, and generating the maps that appear on our phones. AI systems have become trusted tools for big business. But their use has not always been without controversy. In recent years, researchers have found that some AI apps used to approve mortgage and other loan applications are biased in favor of white males. This, researchers found, was because the dataset used to train the system mostly comprised white male profiles. In this new effort, the researchers wondered if the same might be true for AI systems used to assist doctors in diagnosing patients.

The work involved evaluating three open-source AI systems that are still in the experimental stage. Each was trained on chest X-rays obtained from NIH and Stanford University databases, both of which contained slightly more male profiles. To find out if the systems would produce biased results, the researchers skewed the training data in various ways. In some cases, they used primarily male profiles; in others, primarily female.
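For readers who want a concrete picture of what skewing a dataset can look like, here is a minimal Python sketch. It is not the researchers' code: the `skewed_sample` function, the record layout, and the 25 percent figure are all hypothetical, chosen only to illustrate drawing a training set with a fixed share of female profiles.

```python
import random

def skewed_sample(records, female_fraction, size, seed=0):
    """Draw `size` records with the requested share of female profiles.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    females = [r for r in records if r["gender"] == "F"]
    males = [r for r in records if r["gender"] == "M"]

    n_female = round(size * female_fraction)
    n_male = size - n_female
    if n_female > len(females) or n_male > len(males):
        raise ValueError("not enough records for the requested ratio")

    # Sample each group without replacement, then mix them together.
    sample = rng.sample(females, n_female) + rng.sample(males, n_male)
    rng.shuffle(sample)
    return sample

# Made-up stand-in for an X-ray metadata table: 500 female, 500 male.
records = [{"id": i, "gender": "F" if i % 2 else "M"} for i in range(1000)]

# A mostly male training set: 25 percent female, 75 percent male.
mostly_male = skewed_sample(records, female_fraction=0.25, size=400)
```

Repeating the draw with different values of `female_fraction` would give a series of training sets ranging from mostly male to mostly female, which is the general kind of comparison described above.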

In looking at their results, the researchers found that there was a definite bias: when the training data was mostly male, the error rates for processing female profiles rose. The reverse was also true; when the data was mostly female, the error rates for male profiles rose. They also found that over-representing one gender or the other did not confer an advantage, as the error rates for the over-represented group remained relatively stable.
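Comparing error rates by gender, as described above, amounts to scoring a classifier separately on each group. The sketch below shows one simple way to do that in Python. The function name and the labels, predictions, and groups are invented for illustration, and the metric used in the paper itself may differ from a plain error rate.

```python
from collections import defaultdict

def error_rate_by_group(labels, predictions, groups):
    """Return the fraction of wrong predictions for each group."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for y, p, g in zip(labels, predictions, groups):
        totals[g] += 1
        if y != p:
            errors[g] += 1
    return {g: errors[g] / totals[g] for g in totals}

# Made-up example: 1 = condition present, 0 = condition absent.
labels      = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 1, 1, 1, 0, 0, 0]
groups      = ["M", "M", "M", "M", "F", "F", "F", "F"]

rates = error_rate_by_group(labels, predictions, groups)
```

A large gap between the per-group rates is the sort of signal that would point to the bias the researchers report.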

The researchers were not able to provide a reason for the differences other than that male and female torsos have obvious physical differences. They suggest taking a serious look at how AI systems are trained for real-world medical applications.

More information: Agostina J. Larrazabal et al., "Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis," PNAS (2020). www.pnas.org/cgi/doi/10.1073/pnas.1919012117

Journal information: Proceedings of the National Academy of Sciences