A team of researchers from Universidad Nacional del Litoral–Consejo Nacional de Investigaciones Científicas and Universidad Nacional de Entre Ríos, both in Argentina, has found evidence that gender-imbalanced datasets affect the performance of pathology classification in AI-based diagnostic systems. In their paper published in Proceedings of the National Academy of Sciences, the group describes testing three open-source machine-learning algorithms used for analyzing X-ray images to detect various medical conditions, and reports what it found.
Though it may not be common knowledge, AI systems are currently being used in a wide variety of commercial applications, including selecting articles on news and social media sites, deciding which movies get made, and generating the maps that appear on our phones. AI systems have become trusted tools for big business. But their use has not always been without controversy. In recent years, researchers have found, for example, that AI apps used to approve mortgage and other loan applications are biased in favor of white males. This, researchers found, was because the dataset used to train the system mostly comprised white male profiles. In this new effort, the researchers wondered if the same might be true for AI systems used to assist doctors in diagnosing patients.
The work involved evaluating three open-source AI systems that are still in the experimental stage. Each was trained on chest X-rays obtained from NIH and Stanford University databases, both of which contained slightly more male than female profiles. To find out if the systems would produce biased results, the researchers skewed the training data in various ways. In some cases, they used primarily male profiles, in others primarily female.
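To make the idea of skewing a dataset concrete, here is a minimal Python sketch of how a training subset with a chosen gender ratio might be drawn. It is not the researchers' code: the function name `sample_with_gender_ratio`, the `sex` field, and the toy records are all hypothetical and serve only as illustration.

```python
import random


def sample_with_gender_ratio(records, female_fraction, size, seed=0):
    """Draw `size` records so that `female_fraction` of them are labelled 'F'.

    `records` is assumed to be a list of dicts with a hypothetical "sex" key
    holding either "F" or "M".
    """
    if not 0.0 <= female_fraction <= 1.0:
        raise ValueError("female_fraction must be between 0 and 1")

    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    females = [r for r in records if r["sex"] == "F"]
    males = [r for r in records if r["sex"] == "M"]

    n_female = round(size * female_fraction)
    n_male = size - n_female
    if n_female > len(females) or n_male > len(males):
        raise ValueError("not enough records of one sex for the requested ratio")

    # Sample without replacement from each group, then mix the two groups.
    subset = rng.sample(females, n_female) + rng.sample(males, n_male)
    rng.shuffle(subset)
    return subset


# Made-up stand-in for an X-ray index: 100 female and 100 male entries.
records = [{"id": i, "sex": "F" if i % 2 == 0 else "M"} for i in range(200)]

# A mostly male training subset: 25% female, 75% male.
mostly_male = sample_with_gender_ratio(records, female_fraction=0.25, size=80)
```

Repeating the draw with different values of `female_fraction` would yield a series of training sets that differ only in their gender balance.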
In looking at their results, the researchers found that there was a definite bias: when the training data was mostly male, the error rates for processing female profiles rose. The same held in reverse when the data was mostly female. They also found that over-representing one gender or the other did not confer an advantage; the error rates remained relatively stable.
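A comparison of this kind rests on measuring errors separately for each group. The short Python sketch below shows one simple way to do that, using a plain misclassification rate. This is an assumption for illustration: the article does not say which metric the paper used, and the labels, predictions, and the `error_rate_by_group` helper are all made up.

```python
from collections import defaultdict


def error_rate_by_group(y_true, y_pred, groups):
    """Return the fraction of wrong predictions for each group label."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        if truth != pred:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Made-up example: 1 = condition present, 0 = condition absent.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
sex = ["M", "M", "M", "M", "F", "F", "F", "F"]

rates = error_rate_by_group(y_true, y_pred, sex)
print(rates)  # {'M': 0.25, 'F': 0.5}
```

In this toy case the classifier errs twice as often on female profiles as on male ones, which is the kind of gap a per-group breakdown is meant to expose.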
The researchers were not able to provide a reason for the differences other than that male and female torsos have obvious physical differences. They suggest the medical community take a serious look at how AI systems are trained in real-world medical applications.
Bob Yirka, Medical Xpress