Is it possible to use machine learning to re-identify physical activity data that have had protected health information removed?
That’s the question a team of researchers led by UC Berkeley engineer Anil Aswani recently examined, and as AI spreads across the healthcare sector, their findings suggest current laws and regulations are nowhere near sufficient to keep an individual's health status private in the face of AI development.
Published recently in the JAMA Network Open journal, the findings show that by using artificial intelligence, it is possible to identify individuals by learning daily patterns in step data, such as that collected by activity trackers, smartwatches and smartphones, and correlating it to demographic data.
The mining of two years' worth of data covering more than 15,000 Americans led to the conclusion that the privacy standards associated with 1996's HIPAA (Health Insurance Portability and Accountability Act) legislation need to be revisited and reworked.
"We wanted to use NHANES (the National Health and Nutrition Examination Survey) to look at privacy questions because this data is representative of the diverse population in the U.S.," explained Aswani in a release. "The results point out a major problem. If you strip all the identifying information, it doesn't protect you as much as you'd think. Someone else can come back and put it all back together if they have the right kind of information."
"In principle, you could imagine Facebook gathering step data from the app on your smartphone, then buying health care data from another company and matching the two," he added. "Now they would have health care data that's matched to names, and they could either start selling advertising based on that or they could sell the data to others."
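The kind of matching Aswani describes can be illustrated with a toy sketch. This is not the study's method, which relied on machine learning models trained on NHANES data; it is a simplified nearest-neighbor linkage under stated assumptions. The function `link_records`, the field names, and every record below are hypothetical and invented purely for illustration.

```python
from math import sqrt

def distance(a, b):
    """Euclidean distance between two equal-length step-count sequences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def link_records(named, anonymous):
    """Match each 'anonymous' record to the named record in the same
    demographic group whose step pattern is closest (hypothetical helper)."""
    matches = {}
    for anon in anonymous:
        # Demographics narrow the pool of candidates first.
        candidates = [n for n in named
                      if n["demographics"] == anon["demographics"]]
        if not candidates:
            continue
        # The step pattern then picks the most similar candidate.
        best = min(candidates,
                   key=lambda n: distance(n["steps"], anon["steps"]))
        matches[anon["record_id"]] = best["name"]
    return matches

# Made-up data: step counts a company might hold alongside names.
named = [
    {"name": "Alice", "demographics": ("F", "30-39"), "steps": [8200, 7900, 10400]},
    {"name": "Bob",   "demographics": ("M", "30-39"), "steps": [4100, 3900, 5200]},
    {"name": "Carol", "demographics": ("F", "30-39"), "steps": [12500, 11800, 3000]},
]

# Made-up "de-identified" health records: no names, but steps and demographics remain.
anonymous = [
    {"record_id": "r1", "demographics": ("F", "30-39"),
     "steps": [12400, 11900, 3100], "health": "condition A"},
    {"record_id": "r2", "demographics": ("M", "30-39"),
     "steps": [4000, 4000, 5000], "health": "condition B"},
]

matches = link_records(named, anonymous)
print(matches)  # {'r1': 'Carol', 'r2': 'Bob'}
```

Even in this tiny example, removing names does nothing once a second dataset carries the same step patterns and demographics, which is the gap the researchers highlight.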
According to Aswani, the problem isn't with the devices, but with how the information the devices capture can be misused and potentially sold on the open market.
"I'm not saying we should abandon these devices," he said. "But we need to be very careful about how we are using this data. We need to protect the information. If we can do that, it's a net positive."
Though the study specifically looked at step data, the results suggest a broader threat to the privacy of health data.
"HIPAA regulations make your health care private, but they don't cover as much as you think," Aswani said. "Many groups, like tech companies, are not covered by HIPAA, and only very specific pieces of information are not allowed to be shared by current HIPAA rules. There are companies buying health data. It's supposed to be anonymous data, but their whole business model is to find a way to attach names to this data and sell it."
Aswani said that as advances in AI make it easier for companies to gain access to health data, the temptation to use that data in illegal or unethical ways will increase.