Researchers have developed a new method that uses artificial intelligence to foresee the most likely mutations of pathogens like SARS-CoV-2, the virus that causes COVID-19.
The new research has implications for the rapid development of vaccines, treatments and diagnostic tests that would be much less likely to be impacted by new or emerging variants of concern.
Mohammad Kohandel, a professor and head of the Mathematical Medicine Laboratory in the Department of Applied Mathematics at the University of Waterloo, helped pioneer the research in the context of the ongoing pandemic.
“With a highly infectious pathogen like SARS-CoV-2, we want to have a method for extracting the mutational information as quickly as possible,” Kohandel said. “Variants are a huge problem because we don’t know whether the diagnostic tests that are available are going to work or whether the treatments or vaccines will be effective in the long run.”
Kohandel’s research team initially focused on using a single ancestral sequence to identify the parts of the viral genome that are not significantly affected by mutations. These are the so-called “conserved” parts of the virus.
Identifying the conserved parts of a pathogen is valuable because even if mutations occur elsewhere, they will not affect the efficacy of vaccines, treatments or tests that work by targeting those stable pieces.
“Imagine that from the beginning of the pandemic, we knew exactly which parts of the genome were going to be stable and which ones would likely change,” said Amirhossein Darooneh, a member of the research team and a professor in applied mathematics at Waterloo. “Everything would be different right now.
“Now that we have so much data on the sequencing of SARS-CoV-2 and its variants, we are able to use all that information to train a neural network to predict the most likely mutations of the genome. Our AI can predict the mutations that happened with really high accuracy.”
After identifying the conserved parts, the team trained an AI to anticipate the mutations that would occur in a pathogen. The machine learning program assessed millions of genomic sequences as part of its training process. The AI was then tested on the genomic sequence of the original strain of coronavirus.
Based on its analysis of the original virus, the AI identified as most likely to mutate the regions of the genome that went on to change in the variants that came to be known as alpha, beta, gamma, delta and other variants of concern. Had this information been available in the early stages of the pandemic, when vaccines were first being developed, it could have led to more effective tests and vaccines that were much more resilient against current variants.
Beyond its impact on the pandemic, the new technology could also contribute to other medical treatments.
“Even with cancer, we should be able to identify the therapeutic targets for overcoming mutation-driven drug resistance,” said Michelle Przedborski, another of the team members and a professor of applied mathematics at Waterloo. “Lots of drugs are targeting a specific part of the protein in cancer cells. But if there are mutations in those, then drugs wouldn’t be effective anymore. We can apply the same analysis and AI method to other pathogens.”
University of Waterloo