Asymptomatic people who are infected with COVID-19 exhibit, by definition, no discernible physical symptoms of the disease. They are thus less likely to seek out testing for the virus, and could unknowingly spread the infection to others.
But it seems those who are asymptomatic may not be entirely free of changes wrought by the virus. MIT researchers have now found that people who are asymptomatic may differ from healthy individuals in the way that they cough. These differences are not decipherable to the human ear. But it turns out that they can be picked up by artificial intelligence.
In a paper published recently in the IEEE Open Journal of Engineering in Medicine and Biology, the team reports on an AI model that distinguishes asymptomatic people from healthy individuals through forced-cough recordings, which people voluntarily submitted through web browsers and devices such as cellphones and laptops.
The researchers trained the model on tens of thousands of samples of coughs, as well as spoken words. When they fed the model new cough recordings, it accurately identified 98.5 percent of coughs from people who were confirmed to have COVID-19, including 100 percent of coughs from asymptomatic people, who reported no symptoms but had tested positive for the virus.
The team is working on incorporating the model into a user-friendly app, which, if FDA-approved and adopted on a large scale, could be a free, convenient, noninvasive pre-screening tool to identify people who are likely to have COVID-19 without showing symptoms. A user could log in daily, cough into their phone, and instantly get information on whether they might be infected and should therefore confirm with a formal test.
“The effective implementation of this group diagnostic tool could diminish the spread of the pandemic if everyone uses it before going to a classroom, a factory, or a restaurant,” says co-author Brian Subirana, a research scientist in MIT’s Auto-ID Laboratory.
Subirana’s co-authors are Jordi Laguarta and Ferran Hueto, of MIT’s Auto-ID Laboratory.
Vocal sentiments
Prior to the pandemic’s onset, research groups had already been training algorithms on cellphone recordings of coughs to accurately diagnose conditions such as pneumonia and asthma. In similar fashion, the MIT team was developing AI models to analyze forced-cough recordings to see if they could detect signs of Alzheimer’s, a disease associated with not only memory decline but also neuromuscular degradation such as weakened vocal cords.
They first trained a general machine-learning algorithm, or neural network, known as ResNet50, to discriminate sounds associated with different degrees of vocal cord strength. Studies have shown that the quality of the sound “mmmm” can be an indication of how weak or strong a person’s vocal cords are. Subirana trained the neural network on an audiobook dataset with more than 1,000 hours of speech, to pick out the word “them” from other words like “the” and “then.”
The team trained a second neural network to distinguish emotional states evident in speech, because Alzheimer’s patients (and people with neurological decline more generally) have been shown to display certain sentiments, such as frustration or a flat affect, more frequently than they express happiness or calm. The researchers developed a sentiment speech classifier model by training it on a large dataset of actors intonating emotional states, such as neutral, calm, happy, and sad.
The researchers then trained a third neural network on a database of coughs in order to discern changes in lung and respiratory performance.
Finally, the team combined all three models, and overlaid an algorithm to detect muscular degradation. The algorithm does so by essentially simulating an audio mask, or layer of noise, and distinguishing strong coughs (those that can be heard over the noise) from weaker ones.
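The idea of separating strong coughs from weak ones with a layer of simulated noise can be illustrated with a small sketch. This is not the researchers' algorithm: it assumes Gaussian noise, a simple signal-to-noise comparison, and synthetic decaying tone bursts standing in for coughs. The function names and the `noise_level`, `margin_db`, and `seed` parameters are hypothetical.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def heard_over_noise(cough, noise_level=0.1, margin_db=3.0, seed=0):
    """Return True if the cough exceeds a simulated noise layer by margin_db.

    noise_level, margin_db, and seed are illustrative assumptions,
    not values taken from the paper.
    """
    rng = random.Random(seed)
    # Simulated "audio mask": Gaussian noise with the same length as the cough.
    noise = [rng.gauss(0.0, noise_level) for _ in cough]
    cough_rms = rms(cough)
    if cough_rms == 0.0:
        return False  # a silent recording cannot be heard over any noise
    snr_db = 20.0 * math.log10(cough_rms / rms(noise))
    return snr_db >= margin_db


def synthetic_cough(peak, n=800, rate=8000):
    """Stand-in for a cough: a decaying 200 Hz burst sampled at 8 kHz."""
    return [
        peak * math.exp(-5.0 * i / n) * math.sin(2 * math.pi * 200 * i / rate)
        for i in range(n)
    ]


strong = synthetic_cough(peak=1.0)
weak = synthetic_cough(peak=0.05)
print(heard_over_noise(strong), heard_over_noise(weak))
```

In this toy setup the louder burst clears the noise floor and the quieter one does not; a real system would work on recorded audio and learned features rather than a fixed threshold.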
With their new AI framework, the team fed in audio recordings, including recordings of Alzheimer’s patients, and found it could identify the Alzheimer’s samples better than existing models. The results showed that, together, vocal cord strength, sentiment, lung and respiratory performance, and muscular degradation were effective biomarkers for diagnosing the disease.
When the coronavirus pandemic began to unfold, Subirana wondered whether their AI framework for Alzheimer’s might also work for diagnosing COVID-19, as there was growing evidence that infected patients experienced some similar neurological symptoms such as temporary neuromuscular impairment.
“The sounds of talking and coughing are both influenced by the vocal cords and surrounding organs. This means that when you talk, part of your talking is like coughing, and vice versa. It also means that things we easily derive from fluent speech, AI can pick up simply from coughs, including things like the person’s gender, mother tongue, or even emotional state. There’s in fact sentiment embedded in how you cough,” Subirana says. “So we thought, why don’t we try these Alzheimer’s biomarkers [to see if they’re relevant] for COVID.”
“A striking similarity”
In April, the team set out to collect as many recordings of coughs as they could, including those from COVID-19 patients. They established a website where people can record a series of coughs, through a cellphone or other web-enabled device. Participants also fill out a survey of symptoms they are experiencing, whether or not they have COVID-19, and whether they were diagnosed through an official test, by a doctor’s assessment of their symptoms, or by self-diagnosis. They can also note their gender, geographical location, and native language.
To date, the researchers have collected more than 70,000 recordings, each containing several coughs, amounting to some 200,000 forced-cough audio samples, which Subirana says is “the largest research cough dataset that we know of.” Around 2,500 recordings were submitted by people who were confirmed to have COVID-19, including those who were asymptomatic.
The team used the 2,500 COVID-associated recordings, along with 2,500 more recordings that they randomly selected from the collection to balance the dataset. They used 4,000 of these samples to train the AI model. The remaining 1,000 recordings were then fed into the model to see if it could accurately discern coughs from COVID patients versus healthy individuals.
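The balancing and train/test split described above can be sketched as follows. This is an illustration under stated assumptions, not the team's pipeline: the recording IDs are placeholders, the pool size of 70,000 is a round stand-in for "more than 70,000," the randomly drawn recordings are assumed to exclude the COVID-associated ones and are labeled 0, and `balanced_split` is a hypothetical helper.

```python
import random


def balanced_split(positive_ids, other_ids, n_train, seed=0):
    """Pair every positive recording with one randomly drawn other recording,
    shuffle the combined set, and split it into train and test portions."""
    rng = random.Random(seed)
    # Draw as many non-positive recordings as there are positives.
    sampled = rng.sample(other_ids, len(positive_ids))
    labeled = [(rid, 1) for rid in positive_ids] + [(rid, 0) for rid in sampled]
    rng.shuffle(labeled)
    return labeled[:n_train], labeled[n_train:]


# Placeholder IDs: 2,500 COVID-associated recordings, the rest of the pool.
positive_ids = list(range(2500))
other_ids = list(range(2500, 70000))

train, test = balanced_split(positive_ids, other_ids, n_train=4000)
print(len(train), len(test))  # 4000 1000
```

The counts match the article: 2,500 + 2,500 = 5,000 recordings, of which 4,000 go to training and 1,000 are held out for testing.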
Surprisingly, as the researchers write in their paper, their efforts have revealed “a striking similarity between Alzheimer’s and COVID discrimination.”
Without much tweaking within the AI framework originally meant for Alzheimer’s, they found it was able to pick up patterns in the four biomarkers (vocal cord strength, sentiment, lung and respiratory performance, and muscular degradation) that are specific to COVID-19. The model identified 98.5 percent of coughs from people confirmed to have COVID-19, and of those, it accurately detected all of the asymptomatic coughs.
“We think this shows that the way you produce sound changes when you have COVID, even if you’re asymptomatic,” Subirana says.
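The 98.5 percent figure is a detection rate among confirmed-positive coughs, which is commonly called sensitivity: true positives divided by all actual positives. A minimal sketch of that calculation follows. The label lists are made-up counts chosen only to reproduce the arithmetic; they are not the paper's test data.

```python
def sensitivity(y_true, y_pred):
    """Fraction of actual positives (label 1) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no positive samples to evaluate")
    return tp / (tp + fn)


# Hypothetical example: 200 positive coughs, 197 of them flagged by a model,
# plus 200 negative coughs that do not affect sensitivity.
y_true = [1] * 200 + [0] * 200
y_pred = [1] * 197 + [0] * 3 + [0] * 200

print(f"{sensitivity(y_true, y_pred):.1%}")  # 98.5%
```

Sensitivity says nothing about how often healthy coughs are wrongly flagged; that is measured separately, by specificity.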
Asymptomatic symptoms
The AI model, Subirana stresses, is not meant to diagnose symptomatic people, that is, to determine whether their symptoms are due to COVID-19 or other conditions like flu or asthma. The tool’s strength lies in its ability to discern asymptomatic coughs from healthy coughs.
The team is working with a company to develop a free pre-screening app based on their AI model. They are also partnering with several hospitals around the world to collect a larger, more diverse set of cough recordings, which will help to train the model and improve its accuracy.
As they propose in their paper, “Pandemics could be a thing of the past if pre-screening tools are always on in the background and constantly improved.”
Ultimately, they envision that audio AI models like the one they’ve developed may be incorporated into smart speakers and other listening devices so that people can conveniently get an initial assessment of their disease risk, perhaps on a daily basis.
Jennifer Chu, Massachusetts Institute of Technology