At a glance:
A new study reveals that pathology AI models for cancer diagnosis perform unequally across demographic groups.
The researchers identified three explanations for the bias and developed a tool that reduced it.
The findings highlight the need to systematically check for bias in pathology AI to ensure equitable care for patients.
Pathology has long been the cornerstone of cancer diagnosis and treatment. A pathologist carefully examines an ultrathin slice of human tissue under a microscope for clues that indicate the presence, type, and stage of cancer.
To a human expert, looking at a swirly pink tissue sample studded with purple cells is akin to grading an exam without a name on it — the slide reveals essential information about the disease without providing other details about the patient.
Yet the same isn’t necessarily true of pathology artificial intelligence models that have emerged in recent years. A new study led by a team at Harvard Medical School shows that these models can somehow infer demographic information from pathology slides, leading to bias in cancer diagnosis among different populations.
Analyzing several major pathology AI models designed to diagnose cancer, the researchers found unequal performance in detecting and differentiating cancers across patient groups defined by self-reported gender, race, and age. They identified several possible explanations for this demographic bias.
The team then developed a framework called FAIR-Path that helped reduce bias in the models.
“Reading demographics from a pathology slide is thought of as a ‘mission impossible’ for a human pathologist, so the bias in pathology AI was a surprise to us,” said senior author Kun-Hsing Yu, associate professor of biomedical informatics in the Blavatnik Institute at HMS and HMS assistant professor of pathology at Brigham and Women’s Hospital.
Identifying and counteracting AI bias in medicine is critical because it can affect diagnostic accuracy and, ultimately, patient outcomes, Yu said. FAIR-Path’s success indicates that researchers can improve the fairness of AI models for cancer pathology, and perhaps other AI models in medicine, with minimal effort.
The work, which was supported in part by federal funding, is described Dec. 16 in Cell Reports Medicine.
Testing for bias
Yu and his team investigated bias in four standard AI pathology models being developed for cancer evaluation. These deep-learning models were trained on sets of annotated pathology slides, from which they “learned” biological patterns that enable them to analyze new slides and offer diagnoses.
The researchers fed the AI models a large, multi-institutional repository of pathology slides spanning 20 cancer types.
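The core of such an audit is comparing a model's performance across demographic subgroups and measuring the disparity. As a minimal illustration only (this is not the study's actual method, metrics, or data; the function names and toy values below are hypothetical), per-group accuracy and the worst-case gap between groups can be computed like this:

```python
# Minimal sketch of a fairness audit: score a model separately for each
# demographic group and report the largest between-group gap.
# All data here are illustrative, not from the study.

def accuracy(pairs):
    """Fraction of (prediction, label) pairs that agree."""
    return sum(p == y for p, y in pairs) / len(pairs)

def group_gap(preds, labels, groups):
    """Return per-group accuracy and the max-minus-min disparity."""
    by_group = {}
    for p, y, g in zip(preds, labels, groups):
        by_group.setdefault(g, []).append((p, y))
    scores = {g: accuracy(pairs) for g, pairs in by_group.items()}
    gap = max(scores.values()) - min(scores.values())
    return scores, gap

# Toy predictions for two demographic groups, "A" and "B"
preds  = [1, 0, 1, 1, 0, 1, 0, 0]
labels = [1, 0, 0, 1, 0, 1, 1, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

scores, gap = group_gap(preds, labels, groups)
# scores -> {"A": 0.75, "B": 0.5}; gap -> 0.25
```

In practice a pathology audit would use metrics suited to diagnosis (such as per-group AUROC on held-out slides), but the principle is the same: a nonzero gap flags unequal performance that a single aggregate score would hide.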
Authorship, funding, disclosures
Additional authors on the study include Shih-Yen Lin, Pei-Chen Tsai, Fang-Yi Su, Chun-Yen Chen, Fuchen Li, Junhan Zhao, Yuk Yeung Ho, Tsung-Lu Michael Lee, Elizabeth Healey, Po-Jen Lin, Ting-Wan Kao, Dmytro Vremenko, Thomas Roetzer-Pejrimovsky, Lynette Sholl, Deborah Dillon, Nancy U. Lin, David Meredith, Keith L. Ligon, Ying-Chun Lo, Nipon Chaisuriya, David J. Cook, Adelheid Woehrer, Jeffrey Meyerhardt, Shuji Ogino, MacLean P. Nasrallah, Jeffrey A. Golden, Sabina Signoretti, and Jung-Hsien Chiang.
Funding was provided by the National Institute of General Medical Sciences and the National Heart, Lung, and Blood Institute at the National Institutes of Health (grants R35GM142879, R01HL174679), the Department of Defense (Peer Reviewed Cancer Research Program Career Development Award HT9425-231-0523), the American Cancer Society (Research Scholar Grant RSG-24-1253761-01-ESED), a Google Research Scholar Award, a Harvard Medical School Dean’s Innovation Award, the National Science and Technology Council of Taiwan (grants NSTC 113-2917-I-006-009, 112-2634-F-006-003, 113-2321-B-006-023, 114-2917-I-006-016), and a doctoral student scholarship from the Xin Miao Education Foundation.
Ligon was a consultant of Travera, Bristol Myers Squibb, Servier, IntegraGen, L.E.K. Consulting, and Blaze Bioscience; received equity from Travera; and has research funding from Bristol Myers Squibb and Lilly. Vremenko is a cofounder and shareholder of Vectorly.
The authors prepared the initial manuscript and used ChatGPT to edit selected sections to improve readability. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the published article.