While healthcare providers and researchers are eager for AI to help in the battle against breast cancer, a recent review of test accuracy studies reveals a dearth of scientific evidence to support its use in screening.
“Current evidence for AI does not yet allow judgement of its accuracy in breast cancer screening programmes,” the researchers concluded following the review, “and it is unclear where on the clinical pathway AI might be of most benefit. AI systems are not sufficiently specific to replace radiologist double reading in screening programmes. Promising results in smaller studies are not replicated in larger studies.”
Moreover, noted the reviewers, whose report was published recently in The BMJ, “evidence is needed on the direct comparison of different AI systems; the effect of different mammogram machines on the accuracy of AI systems; the effect of differences in screening programmes on cancer detection with AI, or on how the AI system might work within specific breast screening IT systems; and the effect of making available additional information to AI systems for decision making.”
In response to the review, Dr. Philip Scott, a member of the UK-based BCS, The Chartered Institute for IT, said, “If AI were adopted now in the screening of breast cancer, there is significant risk of over-diagnosis with all the anxiety that would cause. We need to educate and inform the public to maintain trust, and that includes being honest about the immaturity of most AI tools. AI has the potential to be of huge benefit or of huge harm to society, and standards for the design, development, and adoption of AI systems must be regulated to ensure we get the very best out of them.”
Specifically, the analysis of 12 studies into the use of AI for diagnosing breast cancer concluded it was “unclear” where the technology might be beneficial on the clinical pathway. Some 94% of the 36 AI systems examined in the studies were “less accurate than a single radiologist,” and all were less accurate than two radiologists, experts from Warwick Medical School found. In three of the studies, AI used for triage screened out 10%, 4% and 0% of the cancers that radiologists detected, meaning that in the worst case it missed one in ten cancers spotted by radiologists.
The review also concluded that “evidence is needed on the types of cancer detected by AI to allow an assessment of potential changes to the balance of benefits and harms, including potential over-diagnosis. We need evidence for specific subgroups according to age, breast density, prior breast cancer, and breast implants. Evidence is also needed on radiologist views and understanding and on how radiologist arbitrators behave in combination with AI.”