How NLP is helping root out heart disease

Without accurate and systematic case identification, say scientists, population management and research on valvular heart conditions and other complex conditions aren’t possible.
Jeff Rowe

The use of so-called “Big Data” and data analytics may be spreading rapidly across the healthcare sector, but providers can still struggle to identify patients with complex conditions such as valvular heart disease.

To address that challenge, researchers at the Kaiser Permanente Division of Research in Oakland, California, recently conducted a study using natural language processing (NLP) to sort through over a million EHRs and echocardiogram reports to identify certain abbreviations, words, and phrases associated with aortic stenosis.

"My colleagues and I used a software application as the architecture to build and validate our NLP tools, but the algorithms were not something we found or borrowed off the shelf," Matthew Solomon, a cardiologist at the Permanente Medical Group and a physician researcher, recently explained to Healthcare IT News. "These algorithms were then applied to our entire dataset within the EHR. This involves organizing the data from our backend EHR systems, and then running the formatted data through the software to create an organized and structured dataset.

"We currently are conducting research on patients with valvular heart disease, and we are moving to incorporate these methods to identify patients, in real time, to establish one of the largest population management programs in the world for this patient population."

Part of the problem currently, Solomon explained, is that attempts to identify highly specific conditions like valvular heart disease rely on diagnosis or procedure codes, which were created for billing purposes and are of comparatively limited use for clinical care.

"For example, a patient with moderate or severe aortic stenosis, which is a narrowing of one of the primary heart valves, is entirely different than a patient with mild valve disease," said Solomon. "Yet some of the codes simply use 'aortic valve disease,' which could be applied to an entirely different clinical problem. Without accurate and systematic case identification, population management and research for valvular heart conditions isn't possible."

Moreover, the data needed to identify patients with valvular heart disease is buried in echocardiography reports, which, like many radiology reports, often are free-text fields that are heterogeneous and unstructured, and cannot be easily queried.
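To give a sense of what querying such free text involves, here is a minimal, purely illustrative sketch in Python. It is not the researchers' method, which the article does not describe; the patterns, the `classify_report` function, and the sample sentences are all hypothetical, and a real system would need far richer vocabulary, negation handling, and clinical validation.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only; not the study's vocabulary.
# The full phrase is matched case-insensitively, while the abbreviation "AS"
# is matched case-sensitively so the ordinary word "as" is not picked up.
MENTION = re.compile(r"(?i:\baortic\s+(?:valve\s+)?stenosis\b)|\bAS\b")
SEVERITY = re.compile(r"\b(mild|moderate|severe)\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(no|not|without)\b", re.IGNORECASE)
RANK = {"mild": 1, "moderate": 2, "severe": 3}


def classify_report(text: str) -> Optional[str]:
    """Return the severity attached to the first non-negated mention.

    Returns 'mild', 'moderate', 'severe', 'unspecified' (mention found but
    no grade in the same sentence), or None (no non-negated mention).
    """
    # Crude sentence split; it will also split decimals such as "0.8".
    for sentence in re.split(r"[.;\n]+", text):
        mention = MENTION.search(sentence)
        if mention is None:
            continue
        # Skip mentions preceded by a negation cue in the same sentence.
        if NEGATION.search(sentence[: mention.start()]):
            continue
        grades = [g.lower() for g in SEVERITY.findall(sentence)]
        if grades:
            # "moderate to severe" resolves to the higher grade.
            return max(grades, key=RANK.__getitem__)
        return "unspecified"
    return None


if __name__ == "__main__":
    print(classify_report("Valve is calcified. Moderate to severe aortic stenosis."))
    print(classify_report("No aortic stenosis."))
```

Even this toy version shows why billing codes fall short: the grade of disease lives in the wording of the report, not in a structured field.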

"The only solution to identify these patients was either to have an army of humans pore over 1,000,000 echocardiography reports, or by developing natural language processing methods and to teach a computer how to do that for us," Solomon said.

With the new NLP software, however, the researchers needed only minutes to identify nearly 54,000 patients with the relevant conditions, a process that would have likely taken years for physicians to perform manually.

"Not only did we identify the patients, but we were able to also extract all the key detailed elements from each echocardiography report," Solomon explained. "Now we are using this data to examine our practice patterns and outcomes for these patients so that we can improve our care and understanding of these patients for ourselves and the broader medical community."

Photo by thodonal/Getty Images