RNA Viruses and Artificial Intelligence - Frightening Discovery
Posted 5 months ago
According to a study published in Cell and reported in Nature, work driven by artificial intelligence (AI) has revealed an astonishing 70,500 previously unknown RNA viruses, shedding light on a vast and uncharted viral universe. These viruses, many of which bear no resemblance to known species, were identified using metagenomics, a method that lets scientists analyze all the genomes in an environmental sample without culturing individual viruses. The work underscores the potential of AI to explore the viral 'dark matter' that has long eluded scientific discovery.
Viruses are ubiquitous and infect animals, plants, and even bacteria, yet they remain largely unexplored. Only a fraction of the viral world has been identified and characterized, and computational virologists believe the viruses known today are just the tip of the iceberg. Better tools, such as AI models and sophisticated molecular methods, will help characterize these viruses and could offer critical insights into illnesses of unknown viral origin and into emerging diseases, which are often caused by novel or re-emerging viruses.
The recent study, published in Cell, builds on previous efforts that employed machine learning to identify new viruses within sequencing data. What sets this work apart is its use of advanced AI tools to examine predicted viral protein structures. The team used a protein-prediction tool called ESMFold, developed by researchers at Meta, to recognize proteins essential for RNA replication in viruses, specifically RNA-dependent RNA polymerases (RdRps). ESMFold predicted the structures of these proteins, which in turn helped identify the viruses. This is a significant advance over prior approaches, which often missed rapidly evolving RNA viruses with unknown or divergent sequences.
In 2022, scientists combed through 5.7 million genomic samples and identified nearly 132,000 new RNA viruses. However, the rapid evolution of RNA viruses has made it difficult to detect many of them using traditional methods. The standard approach involves searching for RdRp-encoding sections of viral genomes, but if a sequence has diverged too far from its known counterparts, it can go unnoticed. To overcome this limitation, the research team, led by an evolutionary biologist at a university in Shenzhen, China, developed an AI model called LucaProt. Using the "transformer" architecture that powers language models like ChatGPT, they trained LucaProt to recognize viral RdRps in genomic data. Fed with sequencing and ESMFold protein-prediction data, the model identified approximately 160,000 RNA viruses, nearly half of which had never been described before. Remarkably, many of these viruses were found in extreme environments, such as hot springs and salt lakes, expanding the known "virosphere" into previously unexplored territories.
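To make the limitation of similarity-based searching concrete, here is a toy sketch in Python. It is not the method used in the study or in any real search tool: the 12-letter sequences are made up, the 0.5 cutoff is arbitrary, and the function names are hypothetical. Real searches rely on alignment and profile models rather than position-by-position comparison, but the failure mode is the same: a sequence that has drifted far enough from every known reference falls below the cutoff and is missed.

```python
def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def is_detected(query: str, references: list[str], threshold: float = 0.5) -> bool:
    """True if the query is at least `threshold` identical to any reference."""
    return any(identity(query, ref) >= threshold for ref in references)


# Made-up protein fragments, for illustration only (not real RdRp sequences).
KNOWN = ["MKGDDLVVSGDD"]
close_relative = "MKGDDLIVSGDD"  # differs from the reference at 1 of 12 positions
divergent = "QRADDAPTWGDD"       # differs from the reference at 7 of 12 positions

for name, seq in [("close_relative", close_relative), ("divergent", divergent)]:
    print(name, round(identity(seq, KNOWN[0]), 2), is_detected(seq, KNOWN))
```

In this toy setup the close relative passes the cutoff and the divergent sequence does not, even though both could in principle encode the same kind of protein. A model that learns from sequence and predicted structure together, as the article describes for LucaProt, is one way to avoid depending on a single similarity cutoff.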
These findings have profound implications. By cataloging these viruses, researchers gain a deeper understanding of viral evolution, helping to trace their origins and how they adapt to different hosts.
According to virologist Prof. Dr. Muhammad Mukhtar, Vice Chancellor of the National Skills University Islamabad, this pioneering work demonstrates the transformative potential of AI in virology, providing a new lens through which to explore the viral world. As researchers expand the known virosphere, this discovery may hold the key to understanding and combating the viral threats of tomorrow.
Prof. Mukhtar also suggests that academic and research organizations should give more attention to medical virology, a field he considers under-addressed. This means revising curricula on viral infections, their treatments, and the preventive measures needed to stay safe from current and emerging viral threats.
Additional Reading
1. Using artificial intelligence to document the hidden RNA virosphere
2. AI scans RNA ‘dark matter’ and uncovers 70,000 new viruses