UK study taps machine learning for disease risk prediction

According to the researchers, the study illustrates the value of machine learning for risk prediction within a traditional epidemiological study design, and shows how this approach might be reported to assist scientific verification.
Jeff Rowe

While machine learning has been repeatedly tested for its capacity to predict the risk of a single disease, researchers have also begun examining its potential to predict outcomes of even greater complexity, such as premature death.

Recently, for example, researchers at the U.K.’s University of Nottingham found that machine learning approaches, including deep learning, greatly improved their ability to predict premature death in a study of half a million U.K. Biobank participants, according to research published in PLOS One.

The study, spearheaded by assistant professor and research scientist Stephen F. Weng, PhD, sought to integrate machine learning into traditional epidemiological work by developing and reporting novel prognostic models to supplement existing techniques.

“In the era of big data, there is great optimism that machine learning can potentially revolutionize care, offer approaches for diagnostic assessment and personalize therapeutic decisions on par with, or superior to, clinicians,” Weng and co-authors wrote. “The challenge for applications and algorithms developed using machine learning is to not only enhance what can be achieved with traditional methods, but to also develop and report them in a similarly transparent and replicable way.”

For their current work, the researchers considered 502,628 adults aged 40 to 69 years whose health information was logged in the U.K. Biobank between 2006 and 2010. Using demographic data and taking into account biometric, clinical and lifestyle factors, they developed predictive mortality models.

They found that machine learning algorithms were better than standard approaches at identifying individuals who died prematurely, showing higher discrimination, better calibration and better classification accuracy.
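For readers unfamiliar with those three terms, the short Python sketch below shows one common way each is measured. It is an illustration only: the outcomes and predicted risks are invented, the function names are hypothetical, and the article does not say which exact metrics the Nottingham team used. Here discrimination is taken to be the area under the ROC curve, calibration is taken to be the gap between average predicted risk and the observed death rate, and classification accuracy is the share of correct calls at a 0.5 risk cutoff.

```python
# Toy illustration only: outcomes and predicted risks are invented,
# not taken from the U.K. Biobank study.

def auc(y_true, y_prob):
    """Discrimination: chance that a randomly chosen case (1) gets a higher
    predicted risk than a randomly chosen non-case (0). Ties count as half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def calibration_gap(y_true, y_prob):
    """Calibration (in the large): mean predicted risk minus observed rate.
    A value near zero means predicted and observed risk agree on average."""
    return sum(y_prob) / len(y_prob) - sum(y_true) / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Classification accuracy: share of correct calls at a fixed cutoff
    (the 0.5 default is an assumption made for this example)."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(a == b) for a, b in zip(preds, y_true)) / len(y_true)

# 1 = died prematurely, 0 = did not (hypothetical participants)
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
# Predicted risks from two hypothetical models
model_a = [0.1, 0.2, 0.1, 0.3, 0.7, 0.4, 0.6, 0.9]
model_b = [0.3, 0.6, 0.2, 0.5, 0.4, 0.7, 0.6, 0.8]

for name, probs in [("model_a", model_a), ("model_b", model_b)]:
    print(name,
          "AUC:", round(auc(y_true, probs), 3),
          "calibration gap:", round(calibration_gap(y_true, probs), 3),
          "accuracy:", round(accuracy(y_true, probs), 3))
```

In this invented data, the first model ranks every case above every non-case and its average predicted risk sits closer to the observed rate, which is the pattern the phrase "higher discrimination, better calibration" describes.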

“The study shows the value of using machine learning to explore a wide array of individual clinical, demographic, lifestyle and environmental risk factors to produce a novel and holistic model that was not possible to achieve using standard approaches,” Weng’s team said. “This work suggests that use of machine learning should be more routinely considered when developing models for prognosis or diagnosis.”

The authors said next steps include validating these approaches in broader populations and integrating them into healthcare systems, as well as exploring how other machine learning models could play into risk prediction.

“The intriguing variations in machine learning model composition may enable new hypothesis generation for potentially significant risk factors that would otherwise not have been detected,” they wrote. “Epidemiological studies could then be designed specifically, and powered accordingly, to verify these signals.”