In recent months, Baker’s team has been working with biologists who had previously been trying to work out the shapes of the proteins they study. “There’s a lot of pretty cool biological research that’s really accelerating,” he says. A public database containing hundreds of thousands of ready-made protein shapes should be an even bigger accelerator.
“It looks amazingly impressive,” says Tom Ellis, a synthetic biologist at Imperial College London who studies the yeast genome and is excited to try out the database. But he warns that most of the predicted shapes have not yet been verified in the laboratory.
In the new version of AlphaFold, predictions come with a confidence score that the tool uses to indicate how close it thinks each predicted shape is to the real thing. Using this measure, DeepMind found that AlphaFold predicted shapes for 36% of human proteins with an accuracy that holds down to the level of individual atoms. This is good enough for drug development, Hassabis says.
Previously, after decades of work, only 17% of the proteins in the human body had had their structures determined in the laboratory. If AlphaFold’s predictions are as accurate as DeepMind says, the tool has more than doubled that number in just a few weeks.
Even predictions that are not fully accurate at the atomic level are still useful. For more than half of the proteins in the human body, AlphaFold has predicted a shape that should be good enough for researchers to work out the protein’s function. The rest of AlphaFold’s current predictions are either inaccurate or are for the third of proteins in the human body that have no structure at all until they bind to others. “They’re floppy,” says Hassabis.
“The fact that it can be applied at this level of quality is an impressive thing,” says Mohammed AlQuraishi, a systems biologist at Columbia University who has developed his own software for predicting protein structure. He also points out that having structures for most proteins in the body will make it possible to study how these proteins work as a system, not just in isolation. “That’s what I think is the most exciting,” he says.
DeepMind is releasing its tools and predictions for free and will not say whether it plans to make money from them in the future, though it is not ruling out the possibility. To set up and launch the database, DeepMind is partnering with the European Molecular Biology Laboratory, an international research institution that already hosts a large protein database.
For now, AlQuraishi can’t wait to see what researchers do with the new data. “It’s pretty spectacular,” he says. “I don’t think any of us thought we’d get here this fast. That’s amazing.”