On November 30, 2020, CASP announced that DeepMind’s AlphaFold 2 had reached a median score of 92.4 out of 100 on the Global Distance Test (GDT), CASP’s accuracy measure for predicting a protein’s shape from its amino acid sequence. The result was widely described as essentially solving the 50-year-old protein folding problem.
This is a major scientific breakthrough, not only for DeepMind or Alphabet (the parent company of both DeepMind and Google) but for humanity as a whole. Moreover, it is a great example of how artificial intelligence can be used to solve some of the most complex problems in the world today.
So what exactly is the protein-folding problem?
Proteins are built from a set of 20 amino acids that are strung together in a sequence. This chain of amino acids then folds into a unique three-dimensional atomic structure, which determines the function of the protein. The chain has a very large number of degrees of freedom in how it can fold, yet in our bodies the process typically completes within milliseconds to seconds.
In 1969, molecular biologist Cyrus Levinthal pointed out, in the thought experiment now known as Levinthal’s paradox, that a typical protein has an astronomically large number of possible conformations, commonly quoted as roughly 10^300. Compare this to the Shannon number, which estimates that there are roughly 10^120 possible games of chess.
In his acceptance speech for the 1972 Nobel Prize in Chemistry, Christian Anfinsen postulated that a protein’s amino acid sequence should fully determine its final structure, so that the structure could in principle be predicted from the sequence alone. From there, the search for a solution to the protein folding problem began.
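To put these numbers in perspective, here is a rough back-of-the-envelope sketch in Python. The 10^300 and 10^120 figures come from the paragraph above. The sampling rate of 10^13 conformations per second and the approximate age of the universe are illustrative assumptions, not figures from this article.

```python
import math

# Figures quoted in the text above.
LOG10_PROTEIN_CONFORMATIONS = 300   # Levinthal-style estimate, 10^300
LOG10_CHESS_GAMES = 120             # Shannon number, 10^120

# Illustrative assumptions (not from the article).
ASSUMED_SAMPLES_PER_SECOND = 1e13   # one conformation tried every 0.1 picoseconds
AGE_OF_UNIVERSE_YEARS = 1.38e10     # approximate
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def log10_years_to_enumerate(log10_conformations, samples_per_second):
    """Return log10 of the years needed to try every conformation exactly once."""
    log10_seconds = log10_conformations - math.log10(samples_per_second)
    return log10_seconds - math.log10(SECONDS_PER_YEAR)


log10_years = log10_years_to_enumerate(
    LOG10_PROTEIN_CONFORMATIONS, ASSUMED_SAMPLES_PER_SECOND
)
log10_universe_ages = log10_years - math.log10(AGE_OF_UNIVERSE_YEARS)
orders_above_chess = LOG10_PROTEIN_CONFORMATIONS - LOG10_CHESS_GAMES

print(f"Brute-force search time: about 10^{log10_years:.0f} years")
print(f"That is about 10^{log10_universe_ages:.0f} times the age of the universe")
print(f"Search space exceeds the Shannon number by {orders_above_chess} orders of magnitude")
```

Even under this generous sampling rate, an exhaustive search would take vastly longer than the age of the universe, which is why a real protein cannot be finding its shape by trying every option, and why prediction needs something smarter than brute force.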
AlphaFold 2
In 1994, the Critical Assessment of protein Structure Prediction (CASP) competition was created to promote and encourage additional research in this field. In the 2018 competition, DeepMind’s first AlphaFold predicted protein structures from amino acid sequences alone with an accuracy of over 70%. While this was the highest score to date, the problem was still not completely solved.
This year the story was different. By replacing convolutional neural networks (CNNs) with an attention-based architecture, the mechanism at the heart of transformers, AlphaFold 2 reached a median score of 92.4, above the threshold of roughly 90 that is generally regarded as competitive with experimental methods and therefore as a solution to the problem. AlphaFold 2 has also already been used to predict the structures of several proteins of SARS-CoV-2, the virus that causes COVID-19.
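The accuracy figures CASP reports are on the Global Distance Test (GDT) scale, which runs from 0 to 100. The headline variant, GDT_TS, averages the percentage of residues whose predicted position lies within 1, 2, 4, and 8 angstroms of the experimentally determined position. The Python sketch below is a simplified illustration only: it assumes the two structures are already superimposed and uses made-up coordinates, whereas the real CASP evaluation searches for the best superposition at each cutoff. The function name gdt_ts and the sample data are hypothetical.

```python
import math


def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS on a 0-100 scale.

    predicted, experimental: equal-length lists of (x, y, z) positions in
    angstroms, one per residue, assumed to be already superimposed.
    """
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("structures must be non-empty and the same length")

    # Distance between each predicted residue and its experimental counterpart.
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]

    # Fraction of residues within each distance cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]

    # Average the four fractions and express the result as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


# Made-up toy data: four residues, with errors of 0.5, 1.5, 3.0 and 10.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (5.5, 0.0, 0.0), (11.0, 0.0, 0.0), (22.0, 0.0, 0.0)]

print(gdt_ts(predicted, experimental))  # 56.25
```

On this scale, a score in the 90s means that nearly every residue sits within a small distance of where experiments place it, which is why it should not be read as a simple right-or-wrong percentage.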
This is perhaps one of the most important breakthroughs in structural biology and artificial intelligence in the last 20 years. While there aren’t any direct implications of this in the market yet, investors should be paying attention.
Because of AlphaFold 2, researchers can now delve deeper into many of the unanswered questions in science today. Novel proteins can be modeled to support the development of more environmentally friendly and renewable products. The timeline for creating drugs and vaccines could be shortened considerably. Diagnosis of diseases such as cancer can be advanced further. Universal flu shots that protect us from multiple flu strains at once have become a more realistic possibility, and new protein structures can be used to simplify or better understand energy storage and artificial carbon sequestration. AlphaFold might even play a crucial role in discovering treatments for many of the unsolved medical problems in our world. The possibilities are vast.
Most use cases, however, suggest more innovation in the biotechnology, healthcare, and renewable energy space moving forward. While valuations are currently high in all of these areas due to the COVID-19 pandemic and the electric vehicle gold rush, AlphaFold might just have created room for these companies to go even higher. Whether we feel the effects right away or not, AlphaFold remains a massive source of optimism for the future and will continue to impact our lives and the markets for many years to come.