Google DeepMind AI tool assesses DNA mutations for harm potential

9/19/2023

Scientists at Google DeepMind have built an artificial intelligence program that can predict whether millions of genetic mutations are either harmless or likely to cause disease, in an effort to speed up research and the diagnosis of rare disorders.

The program makes predictions about so-called missense mutations, where a single letter is misspelt in the DNA code. Such mutations are often harmless but they can disrupt how proteins work and cause diseases from cystic fibrosis and sickle-cell anaemia to cancer and problems with brain development.

The researchers used AlphaMissense to assess all 71m single-letter mutations that could affect human proteins. When they set the program’s precision to 90%, it predicted that 57% of missense mutations were probably harmless and 32% were probably harmful. It was uncertain about the impact of the rest.

Based on the findings, the scientists have released a free online catalogue of the predictions to help geneticists and clinicians who are either studying how mutations drive diseases or diagnosing patients who have rare disorders.

A typical person has about 9,000 missense mutations throughout their genome. Of more than 4m seen in humans, only 2% have been classified as either benign or pathogenic. Doctors already have computer programs to predict which mutations may drive disease but because the predictions are inaccurate, they can only provide supporting evidence for making a diagnosis.

Writing in Science, Dr Jun Cheng and others describe how AlphaMissense performs better than current “variant effect predictor” programs and should help experts pinpoint more swiftly which mutations are driving diseases. The program may also flag mutations that have not previously been linked to specific disorders and guide doctors to better treatments.

The AI is an adaptation of DeepMind’s AlphaFold program, which predicts the 3D structure of human proteins from their chemical makeup.

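To give a sense of scale, the percentages quoted above can be turned into rough counts. The short Python sketch below is only back-of-envelope arithmetic on the rounded figures in this article ("71m", "more than 4m"), so the results are approximate; the variable and function names are illustrative and do not come from DeepMind's catalogue or code.

```python
# Back-of-envelope arithmetic on the rounded figures quoted in the article.
# All names here are illustrative; none come from AlphaMissense itself.

TOTAL_MISSENSE = 71_000_000      # "all 71m single-letter mutations"
PCT_LIKELY_HARMLESS = 57         # "57% ... probably harmless"
PCT_LIKELY_HARMFUL = 32          # "32% ... probably harmful"
SEEN_IN_HUMANS = 4_000_000       # "more than 4m seen in humans" (lower bound)
PCT_ALREADY_CLASSIFIED = 2       # "only 2% have been classified"


def share(total: int, percent: int) -> int:
    """Return `percent` per cent of `total`, using integer arithmetic."""
    return total * percent // 100


# The article says the program was uncertain about "the rest".
pct_uncertain = 100 - PCT_LIKELY_HARMLESS - PCT_LIKELY_HARMFUL

likely_harmless = share(TOTAL_MISSENSE, PCT_LIKELY_HARMLESS)
likely_harmful = share(TOTAL_MISSENSE, PCT_LIKELY_HARMFUL)
uncertain = share(TOTAL_MISSENSE, pct_uncertain)
already_classified = share(SEEN_IN_HUMANS, PCT_ALREADY_CLASSIFIED)

print(f"Probably harmless:  {likely_harmless:,}")     # 40,470,000
print(f"Probably harmful:   {likely_harmful:,}")      # 22,720,000
print(f"Uncertain ({pct_uncertain}%):    {uncertain:,}")  # 7,810,000
print(f"Already classified: {already_classified:,}")  # 80,000
```

On these figures, roughly 40m mutations were rated probably harmless, about 23m probably harmful and about 8m left uncertain, against something like 80,000 observed variants that experts had already classified.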
AlphaMissense was fed data on DNA from humans and closely related primates to learn which missense mutations are common, and therefore probably benign, and which are rare and potentially harmful. At the same time, the program familiarised itself with the “language” of proteins by studying millions of protein sequences and learning what a “healthy” protein looks like.

When the trained AI is fed a mutation, it generates a score to reflect how risky the genetic change appears to be, though it cannot say how the mutation causes any problems.

“This is very similar to human language,” Cheng said. “If we substitute a word in an English sentence, a person familiar with English can immediately see whether the word substitution will change the meaning of the sentence or not.”

Prof Joe Marsh, a computational biologist at Edinburgh University who was not involved in the work, said AlphaMissense had “great potential”.

“We have this issue with computational predictors where everybody says their new method is the best,” he said. “You can’t really trust people, but [the DeepMind researchers] do seem to have done a pretty good job.” If clinical experts decided that AlphaMissense was reliable, its predictions might carry more weight in future disease diagnosis, he said.

Prof Ben Lehner, senior group leader in human genetics at the Wellcome Sanger Institute, said the AI’s predictions needed to be verified by other scientists but it seemed good at identifying which DNA changes cause disease and which do not.

“One concern about the DeepMind model is that it is extremely complicated,” Lehner said. “A model like this may turn out to be more complicated than the biology it is trying to predict. It’s humbling to realise that we may never be able to understand how these models actually work. Is this a problem? It may not be for some applications, but will doctors be comfortable making decisions about patients that they don’t understand and can’t explain?

“The DeepMind model does a good job of predicting what is broken,” he added. “Knowing what is broken is a good first step. But you also need to know how something is broken if you want to fix it. Many of us are very busy generating the massive data needed to train the next generation of AI models that will tell us not only which changes in DNA are bad but also exactly what the problem is and how we might go about fixing things.”