At the heart of biological life – and of the array of so-called “life sciences” dedicated to understanding how that life works – lies the genome, which comprises the entire set of genetic instructions found in a cell.
Not surprisingly, deciphering genomes is a complicated business, and a recent article at Genetic Engineering & Biotechnology News presents a nice overview of how researchers are beginning to recognize the role AI can play in helping them unpack and understand genomes.
The core challenge, the writer notes, is that “genome sequencing and storage are notoriously variable,” with an array of idiosyncrasies that make consistent sequencing and storage exceedingly difficult.
“In genome science, there are a number of obvious idiosyncrasies,” the writer explains. “These include differences in sequencing protocols and technologies, as well as in data storage formats and data sharing practices. Less obvious idiosyncrasies, however, may be just as important. These include all the complications that may arise when genome analysis attempts to relate genome data to other sorts of data, such as phenotypic data, or data from other omics disciplines.”
That’s where AI can help, she explains, given the technology’s capacity to grapple with vast amounts of data while remaining sensitive to potential nuances.
For example, said John Ellithorpe, PhD, president of DNAnexus, which provides a cloud-based data analysis and management platform for DNA sequence data, “To make sure you have the highest accuracy and sensitivity, an AI/ML-based system that trains on data sets is better at detecting nuances and figuring out how to make a variant call than humans. AI democratizes the sequencing field. You do not have to understand exactly what the error profile from the sequencers looks like. Instead, you may expose enough to the algorithm to have a good distribution of the data, and then the algorithm figures it out.”
Another company using AI to drive its research is Verge Genomics, which “integrates human genomics with ML to develop drug candidates for complex neurological diseases such as amyotrophic lateral sclerosis (ALS), Alzheimer’s disease, and Parkinson’s disease.”
Specifically, explained Alice Zhang, co-founder and CEO, Verge’s development platform is founded on a proprietary human genomics database that was built over the last three years by partnering with dozens of brain banks, hospitals, and universities worldwide.
“We have collected thousands of different brain tissues directly from patients and RNA sequenced them internally to measure the expression of all of the genes in their genomes,” Zhang said. “We then mine these data with our internally developed ML algorithm to predict therapeutic targets.”
To understand how genes relate to disease, the article detailed, Verge studies disease signatures, which consist of expression profiles that encompass hundreds of genes. Rather than focus on single genes in isolation, Verge considers how single genes may act as master switches that effectively turn disease signatures on or off.
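To make the idea of a disease signature concrete, here is a minimal sketch of how a multi-gene expression profile might be collapsed into a single score. It is not Verge's method, which the article does not describe. The function name `signature_score`, the dictionary-based data layout, and the choice of an average z-score against control samples are all assumptions made purely for illustration, and the numbers are toy values.

```python
from statistics import mean, stdev


def signature_score(sample, controls, signature_genes):
    """Average z-score of the signature genes in `sample` relative to `controls`.

    Hypothetical illustration only: `sample` maps gene name -> expression level,
    and `controls` is a list of such mappings from unaffected tissue.
    """
    z_scores = []
    for gene in signature_genes:
        control_values = [c[gene] for c in controls]
        mu = mean(control_values)
        sigma = stdev(control_values)
        if sigma == 0:
            continue  # gene shows no variation in controls; skip to avoid dividing by zero
        z_scores.append((sample[gene] - mu) / sigma)
    # A score near 0 means the sample resembles the controls across the signature
    return mean(z_scores) if z_scores else 0.0


# Toy data (invented): two control samples and two patient samples
controls = [{"GENE_A": 1.0, "GENE_B": 2.0}, {"GENE_A": 3.0, "GENE_B": 4.0}]
typical = {"GENE_A": 2.0, "GENE_B": 3.0}
elevated = {"GENE_A": 5.0, "GENE_B": 6.0}

typical_score = signature_score(typical, controls, ["GENE_A", "GENE_B"])
elevated_score = signature_score(elevated, controls, ["GENE_A", "GENE_B"])
```

In this toy setup, a sample whose signature genes sit at the control average scores zero, while a sample with uniformly higher expression scores above zero. A real signature would span hundreds of genes and would need normalization choices that this sketch leaves out.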
According to Zhang, the biggest revolutions in a field occur when that field applies technologies from other fields, and “if the field of drug development applies AI, it will overcome persistent difficulties such as biological complexity and the lack of adequate preclinical models.”
Photo by a-image/Getty Images