It can take years to train a doctor, and years more for that doctor to develop the experience that enables him or her to provide consistent, effective care. So how much of that time can AI cut?
That’s the question a recent Chinese study asked by examining EHRs from nearly 600,000 patients over an 18-month period at the Guangzhou Women and Children’s Medical Center and then comparing AI-generated diagnoses against new assessments from physicians with a range of experience.
The verdict? “On average, the AI was noticeably more accurate than junior physicians and nearly as reliable as the more senior ones.”
The research, published February 11th as a letter in Nature Medicine, is being touted as demonstrating a natural-language-processing AI capable of outperforming rookie pediatricians in diagnosing common childhood ailments.
According to reports, “the AI is a machine learning classifier (MLC), capable of placing the information learned from the EHRs into categories to improve performance. Like traditionally trained pediatricians, the AI broke cases down into major organ groups and infection areas (upper/lower respiratory, gastrointestinal, etc.) before breaking them down even further into subcategories. It could then develop associations between various symptoms and organ groups and use those associations to improve its diagnoses. This hierarchical approach mimics the deductive reasoning human doctors employ.”
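To make that hierarchy concrete, here is a minimal sketch of a two-stage text classifier that first predicts a major organ group and then hands the case to a finer-grained model trained only on that group. It is purely illustrative: the toy notes, labels, and scikit-learn models below are assumptions for demonstration, not the pipeline the study actually used.

```python
# Minimal sketch of a two-stage ("organ group first, then diagnosis") text classifier.
# The data, features, and models are illustrative, not the study's actual pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy EHR-style free-text notes, each with an organ-group label and a finer diagnosis label.
notes = [
    "cough runny nose mild fever",     # respiratory / upper respiratory infection
    "wheezing shortness of breath",    # respiratory / asthma
    "vomiting watery diarrhea",        # gastrointestinal / acute gastroenteritis
    "abdominal pain constipation",     # gastrointestinal / constipation
]
organ_labels = ["respiratory", "respiratory", "gastrointestinal", "gastrointestinal"]
dx_labels = ["upper respiratory infection", "asthma", "acute gastroenteritis", "constipation"]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(notes)

# Stage 1: coarse classifier over major organ groups.
organ_clf = LogisticRegression(max_iter=1000).fit(X, organ_labels)

# Stage 2: one fine-grained classifier per organ group, trained only on that group's cases.
dx_clf = {}
for organ in set(organ_labels):
    idx = [i for i, o in enumerate(organ_labels) if o == organ]
    dx_clf[organ] = LogisticRegression(max_iter=1000).fit(X[idx], [dx_labels[i] for i in idx])

def diagnose(note: str) -> str:
    """Route a note through the organ-group classifier, then the matching subclassifier."""
    x = vectorizer.transform([note])
    organ = organ_clf.predict(x)[0]
    return dx_clf[organ].predict(x)[0]

print(diagnose("fever and persistent cough"))  # e.g. "upper respiratory infection"
```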
Another key strength of the AI developed for this study was the enormous size of the dataset collected to teach it: 1,362,559 outpatient visits from 567,498 patients yielded some 101.6 million data points for the MLC to devour on its quest for pediatric dominance. That volume gave the AI the depth of learning it needed to distinguish among, and accurately select from, the 55 diagnosis codes spread across the various organ groups and subcategories.
“When comparing against the human doctors, the study used 11,926 records from an unrelated group of children, giving both the MLC and the 20 humans it was compared against a level playing field. The results were clear: while cohorts of senior pediatricians performed better than the AI, junior pediatricians (those with 3–15 years of experience) were outclassed.”
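For readers curious how such a head-to-head comparison might be scored, the short sketch below computes a macro-averaged F1 score for the model and for two hypothetical physician cohorts on the same set of reference diagnoses. Every record and prediction in it is invented for illustration; the study's own metrics and cohort groupings may differ.

```python
# Sketch of scoring a head-to-head comparison on a shared held-out set.
# The records and predictions are made up; the study's own metrics may differ.
from sklearn.metrics import f1_score

# Reference diagnoses for the held-out records (ground truth from chart review).
truth = ["asthma", "bronchiolitis", "gastroenteritis", "sinusitis", "asthma"]

# Predictions on the same records from the model and from two physician cohorts.
predictions = {
    "MLC":               ["asthma", "bronchiolitis", "gastroenteritis", "sinusitis", "bronchiolitis"],
    "junior physicians": ["asthma", "asthma", "gastroenteritis", "sinusitis", "bronchiolitis"],
    "senior physicians": ["asthma", "bronchiolitis", "gastroenteritis", "sinusitis", "asthma"],
}

for name, preds in predictions.items():
    # Macro-averaged F1 weights every diagnosis equally, so rare conditions count as much as common ones.
    print(f"{name}: F1 = {f1_score(truth, preds, average='macro'):.2f}")
```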
According to observers, while the research used a competitive analysis to measure the success of the AI, the results should be seen as anything but hostile to human doctors. “The near future of artificial intelligence in medicine will see these machine learning programs augment, not replace, human physicians.”
Indeed, the authors of the study specifically call out augmentation as the key short-term application of their work. “Triaging incoming patients via intake forms, performing massive metastudies using EHRs, providing rapid ‘second opinions’—the applications for an AI doctor that is better-but-not-the-best are as varied as the healthcare industry itself.”