An old adage in the computing world has become a principle in the development of machine learning models: “garbage in, garbage out.” In other words, the quality of your output depends on the quality of your input.
A new Yale study finds that this same principle applies to medical artificial intelligence, where the threat of bias can compromise clinical decision making.
“Bias in, bias out,” said John Onofrey, an assistant professor of radiology & biomedical imaging and of urology at Yale School of Medicine. “The same idea absolutely applies.”
Writing in the journal PLOS Digital Health, Onofrey and his team identified medical AI bias in the scientific literature by looking for factors and prejudices that affected the outcomes of AI algorithms. They also provided practical steps to combat it.
The study is among the first to focus on the implications and consequences of biased AI in health care.
“Having worked in the machine learning/AI field for many years now, the idea that bias exists in algorithms is not surprising,” said Onofrey, who was the senior author of the study. “However, listing all the potential ways bias can enter the AI learning process is incredible. This makes bias mitigation seem like a daunting task.”
But doing nothing is not an option, said co-author Michael Choma, an associate professor adjunct of radiology & biomedical imaging.
“One danger is that the bias becomes digitally embedded into medical decision-making,” he said. “It is also important that education keeps up with technology, which includes teaching about bias — both human cognitive biases and bias in AI.”
The foundation for the study grew out of a Yale School of Medicine class co-taught by Onofrey and Choma on “Data and Clinical Decision-Making.” James L. Cross, a first-year medical student and the study’s first author, used the topic as a basis for one of his papers. In the research, Cross and his teachers identified at least four ways that bias can be infused into medical AI: within training data, model development, model implementation, and publication.
For instance, one source of bias in training data, the researchers said, is the insufficient evidence that race and ethnicity can or should be used as predictive factors in clinical algorithms. Research shows that when race is used to estimate kidney function, for example, it can lead to longer wait times for Black patients to get on transplant lists. This is often because race-adjusted equations overestimate their kidney function, making their disease appear less advanced than it is.
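The mechanism is visible in the estimating equation itself. The sketch below implements the published 2009 CKD-EPI creatinine equation, whose race multiplier was removed in the 2021 refit, and shows how that multiplier can lift a hypothetical patient's estimate above the eGFR ≤ 20 mL/min/1.73 m² level at which kidney-transplant waitlisting typically begins. The patient values are illustrative only and are not drawn from the study.

```python
# Illustrative sketch of the 2009 CKD-EPI creatinine equation, which included
# a race multiplier (removed in the 2021 refit). Coefficients follow the
# published 2009 equation; the patient below is hypothetical.

def egfr_ckd_epi_2009(scr_mg_dl: float, age: int, female: bool, black: bool) -> float:
    """Estimated GFR (mL/min/1.73 m^2) from serum creatinine (mg/dL)."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    egfr = (141
            * min(scr_mg_dl / kappa, 1.0) ** alpha
            * max(scr_mg_dl / kappa, 1.0) ** -1.209
            * 0.993 ** age)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159  # the race coefficient: inflates the estimate by ~16%
    return egfr

# Same patient, same creatinine -- the only change is the race flag.
with_race = egfr_ckd_epi_2009(scr_mg_dl=3.4, age=55, female=False, black=True)
without_race = egfr_ckd_epi_2009(scr_mg_dl=3.4, age=55, female=False, black=False)

print(f"eGFR with race coefficient:    {with_race:.1f}")     # ~22
print(f"eGFR without race coefficient: {without_race:.1f}")  # ~19
# Transplant waitlisting typically begins at eGFR <= 20; the inflated
# estimate can keep a patient above that threshold and delay referral.
print("Waitlist-eligible (<=20)?", with_race <= 20, "vs", without_race <= 20)
```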
In the new study, the Yale team described efforts to revisit the use of race and ethnicity in decision-support algorithms. Instead of race, newer algorithms can incorporate more precise measures, such as social class and genetic variation. Some groups, they noted, are already doing this. For instance, the Organ Procurement and Transplantation Network, a nonprofit organization that administers the only organ procurement and transplantation network in the United States, recently required kidney programs to assess their waitlists and reduce wait times for Black kidney candidates disadvantaged by race-inclusive calculations.
“Greater capture and use of social determinants of health in medical AI models for clinical risk prediction will be paramount,” said Cross.
Despite AI’s rapid growth and the biases that can come with it, the team is optimistic about what is next for the technology. AI itself, the researchers say, can be used to uncover and even help correct bias, but doing so will require more research and greater awareness among scientists.
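What "using AI to uncover bias" can look like in practice is often quite simple: automatically comparing a model's error rates across demographic subgroups. The sketch below is a generic illustration of such an audit; the model outputs, labels, and group names are hypothetical placeholders, not the study's methods or data.

```python
# Minimal sketch of a subgroup bias audit: compare false-negative rates
# across demographic groups. All data here are toy placeholders.
from collections import defaultdict

def false_negative_rate_by_group(y_true, y_pred, groups):
    """Return the false-negative rate for each subgroup label."""
    misses, positives = defaultdict(int), defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 0:
                misses[group] += 1
    return {g: misses[g] / positives[g] for g in positives}

# Toy example: a screening model that misses more true cases in group "B".
y_true = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

print(false_negative_rate_by_group(y_true, y_pred, groups))
# {'A': 0.25, 'B': 0.75} -- a gap large enough to warrant investigation
```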
“Bias is a human problem,” said Choma. “When we talk about ‘bias in AI,’ we must remember that computers learn from us.”