Announcer
The following program features simulated voices generated for educational and philosophical exploration.
Vera Castellanos
Good afternoon. I'm Vera Castellanos.
Ryan Nakamura
And I'm Ryan Nakamura. Welcome to Simulectics Radio.
Vera Castellanos
Today we're examining one of biology's fundamental problems—protein folding—and how artificial intelligence has revolutionized our ability to predict three-dimensional protein structures from amino acid sequences. For decades, determining protein structure required expensive, time-consuming experimental techniques like X-ray crystallography or cryo-electron microscopy. AlphaFold, developed by DeepMind, changed this landscape entirely. It predicts protein structures with near-experimental accuracy in hours rather than years, opening pathways to rational drug design, enzyme engineering, and understanding disease mechanisms at molecular resolution.
Ryan Nakamura
We're talking about solving a problem that stumped biology for half a century. The protein folding problem asks how a linear chain of amino acids spontaneously folds into complex three-dimensional shapes that determine function. Evolution solved this through trial and error over billions of years. Now AI systems trained on existing structural data can predict novel folds without physical experimentation. This isn't just computational convenience—it's a fundamental shift in how we approach molecular biology, drug discovery, and synthetic protein design.
Vera Castellanos
Our guest is Dr. Demis Hassabis, CEO of Google DeepMind and architect of AlphaFold. Dr. Hassabis, welcome.
Dr. Demis Hassabis
Thank you. Delighted to be here.
Ryan Nakamura
Let's start with the basics. Why is protein structure so important?
Dr. Demis Hassabis
Proteins are molecular machines that execute virtually every biological process. Their function derives from their three-dimensional structure: how the amino acid chain folds determines whether a protein catalyzes reactions, binds specific molecules, transmits signals, or provides structural support. If you know the structure, you can understand mechanism. You can identify binding sites for drug molecules. You can engineer variants with desired properties. But determining structure experimentally is difficult. Crystallography requires growing protein crystals, which many proteins won't form. Cryo-EM needs expensive equipment and expertise. Nuclear magnetic resonance works well only for relatively small proteins. These techniques can take months to years per protein. Meanwhile, sequencing genomes gives us millions of protein sequences whose structures remain unknown. This gap between sequence knowledge and structure knowledge has been biology's grand challenge.
Vera Castellanos
How does AlphaFold address this?
Dr. Demis Hassabis
AlphaFold is a deep learning system trained on the Protein Data Bank—a database of experimentally determined structures—plus evolutionary information from related protein sequences across organisms. The key insight is that evolution constrains which amino acids can vary at each position. If two positions must be close in three-dimensional space for the protein to function, mutations at those positions will co-evolve to maintain proper interaction. By analyzing millions of related sequences, AlphaFold infers spatial constraints. It then uses neural networks to predict distances between amino acid pairs and torsion angles of the protein backbone. Finally, it assembles these predictions into a complete three-dimensional model. The system achieves accuracy comparable to experimental methods for many proteins, with confidence scores indicating reliability. We've predicted structures for hundreds of millions of proteins across all known organisms, making this data freely available to researchers worldwide.
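The co-evolution idea described here can be illustrated with a minimal sketch that scores pairs of alignment columns by mutual information. The toy alignment and the function names are hypothetical, chosen only for illustration. AlphaFold learns such signals with neural networks and does not compute this statistic directly.

```python
import math
from collections import Counter
from itertools import combinations


def column(msa, i):
    """Return the residues found at position i across all aligned sequences."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    counts_i = Counter(column(msa, i))
    counts_j = Counter(column(msa, j))
    counts_ij = Counter(zip(column(msa, i), column(msa, j)))
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def rank_pairs(msa):
    """Rank all column pairs from most to least co-varying."""
    length = len(msa[0])
    scores = {
        (i, j): mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy alignment: positions 1 and 3 always change together
# (K pairs with E, R pairs with D), while positions 0 and 2 never vary.
TOY_MSA = ["AKLE", "AKLE", "ARLD", "ARLD", "AKLE", "ARLD"]

top_pair, top_score = rank_pairs(TOY_MSA)[0]
print(top_pair, top_score)
```

In this toy case the highest-ranked pair is the one whose residues always change together, which is the kind of signal that hints two positions may be in contact. Real alignments need corrections for phylogenetic bias and indirect couplings that this sketch omits.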
Ryan Nakamura
What enabled this breakthrough? Protein folding has been studied for decades.
Dr. Demis Hassabis
Three factors converged. First, massive growth in available training data—the Protein Data Bank expanded from thousands to hundreds of thousands of structures, providing sufficient examples for deep learning. Second, genomic sequencing exploded, giving us enormous sequence databases revealing evolutionary relationships. Third, advances in deep learning architectures, particularly attention mechanisms and transformer models, allowed the system to learn complex spatial relationships from data. Earlier computational approaches used physical simulations or template-based modeling, which were limited by computational cost or availability of similar known structures. AlphaFold takes a fundamentally different approach—learning the rules of protein folding from data rather than trying to simulate physics from first principles. This data-driven strategy proved more effective than anyone anticipated.
Vera Castellanos
How accurate are these predictions?
Dr. Demis Hassabis
For proteins with sufficient evolutionary information, AlphaFold achieves a median score of around ninety on the hundred-point GDT scale used in the CASP assessments, which corresponds to backbone atoms predicted within roughly one or two angstroms of experimental structures. This is comparable to experimental uncertainty in some techniques. The system provides confidence scores for each prediction, indicating reliability. High-confidence predictions are typically highly accurate. Lower-confidence regions often correspond to intrinsically disordered segments or areas lacking evolutionary constraints. The limitation is that AlphaFold predicts static structures, while proteins are dynamic molecules that undergo conformational changes. We're working on AlphaFold extensions to predict multiple conformational states and protein complexes, but capturing full dynamic behavior remains challenging.
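To make "within one or two angstroms" concrete, here is a minimal sketch of root-mean-square deviation (RMSD) between predicted and experimental atom positions. It assumes the two coordinate sets are already superimposed and list the same atoms in the same order. The coordinates are made-up values, not data from any real structure.

```python
import math


def rmsd(predicted, experimental):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes the structures are already optimally superimposed; a real
    comparison would first align them (for example with the Kabsch algorithm).
    """
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("coordinate lists must be non-empty and equal length")
    total = 0.0
    for p, q in zip(predicted, experimental):
        total += sum((a - b) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / len(predicted))


# Hypothetical backbone atoms spaced 3.8 angstroms apart; the "experimental"
# copy is shifted by exactly 1 angstrom along the y axis.
predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
experimental = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

print(rmsd(predicted, experimental))
```

Because every atom is displaced by the same 1 angstrom, the RMSD is 1 angstrom. RMSD is sensitive to a few badly placed atoms, which is one reason assessments also report threshold-based scores.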
Ryan Nakamura
How does this accelerate drug discovery?
Dr. Demis Hassabis
Structure-based drug design becomes vastly more accessible. Traditionally, you needed to experimentally determine the structure of a disease-related protein before designing molecules that bind specific sites. This took years. Now you can predict the structure computationally in hours, identify potential binding pockets, and use virtual screening to test millions of candidate molecules for binding affinity. This doesn't eliminate experimental validation, but it narrows the search space enormously. We're seeing applications in antibiotic resistance—predicting structures of bacterial enzymes to design inhibitors. In neglected tropical diseases where funding is limited—structure prediction enables academic researchers to pursue targets that pharmaceutical companies won't. In personalized medicine—predicting how patient-specific mutations alter protein structure to guide treatment selection. The bottleneck shifts from structure determination to validating predictions experimentally and developing delivery systems.
Vera Castellanos
Can AlphaFold design entirely novel proteins?
Dr. Demis Hassabis
Not directly, but it enables design workflows. Researchers use complementary tools, such as RoseTTAFold-based design methods or ProteinMPNN, to generate novel amino acid sequences predicted to fold into desired shapes. AlphaFold then checks whether these designs are predicted to fold as intended. This combination allows creation of proteins with functions not found in nature: novel enzymes, binding proteins, molecular sensors. David Baker's group at the University of Washington has designed proteins that self-assemble into nanomaterials, bind specific viral epitopes, or catalyze new reactions. The cycle is: design a sequence computationally, predict its structure with AlphaFold, synthesize the gene, express the protein, test function experimentally. Success rates have increased dramatically compared to pre-AlphaFold methods because structural prediction guides the design process.
Ryan Nakamura
What about protein-protein interactions and complexes?
Dr. Demis Hassabis
AlphaFold-Multimer extends the approach to predict structures of protein complexes—multiple proteins that interact functionally. This is crucial because most cellular processes involve assemblies, not isolated proteins. The immune system, signaling pathways, metabolic complexes—all depend on specific protein-protein interactions. Predicting these computationally enables understanding mechanism, identifying interaction interfaces for therapeutic disruption, and engineering synthetic complexes. We've predicted structures for millions of interactions across model organisms. The challenge is that many interactions are transient or regulated by cellular conditions—pH, phosphorylation, cofactors—that aren't captured in static structure predictions. We're working on incorporating these contextual factors.
Vera Castellanos
How does this change fundamental biological research?
Dr. Demis Hassabis
It democratizes structural biology. Previously, structure determination was a specialist field requiring expensive equipment and years of training. Now any biologist can access predicted structures for proteins of interest. This accelerates hypothesis generation—if you study a protein with unknown function, examining its predicted structure provides clues about mechanism. You can identify domains similar to known proteins, predict binding sites, generate testable hypotheses about cellular role. For evolutionary biology, comparing structures across species reveals conservation patterns and adaptation mechanisms. For systems biology, structural information enables modeling of entire pathways and networks at molecular resolution. The broader impact is that structure becomes a standard type of information available alongside sequence and expression data, integrated into routine biological inquiry.
Ryan Nakamura
What are the current limitations?
Dr. Demis Hassabis
Several significant ones. AlphaFold struggles with proteins that have few evolutionary homologs—if there's insufficient sequence diversity to infer constraints, predictions are less reliable. Intrinsically disordered regions that don't have stable structures are poorly predicted, though these regions are functionally important. Dynamic conformational changes—the multiple states a protein adopts during function—aren't captured in single static predictions. Post-translational modifications like phosphorylation, glycosylation, or lipidation aren't modeled. Environmental factors—membranes, ions, cofactors—are often absent. And we predict structure, not function directly. A protein's fold suggests possible functions, but confirming activity requires experimental testing. These are active research areas. We're developing methods to predict dynamic ensembles, incorporate modifications, and model membrane proteins more accurately.
Vera Castellanos
How do you validate predictions when experimental structures don't exist?
Dr. Demis Hassabis
This is a critical question. We use several approaches. For well-studied proteins, we compare predictions against subsequently published experimental structures—this provides continuous validation as new structures enter the Protein Data Bank. We test functional predictions—if a predicted structure suggests a protein binds a specific molecule, we test that experimentally. We examine evolutionary conservation of predicted structural features across species. We check physical plausibility—are buried residues hydrophobic, are charges appropriately distributed, are bond angles reasonable. And we use the model's own confidence metrics—regions with low predicted accuracy are flagged as uncertain. But ultimately, experimental validation remains essential. AlphaFold is a hypothesis generator, not a replacement for experimentation. It accelerates science by narrowing possibilities, but biology's complexity demands empirical confirmation.
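The step of flagging uncertain regions by the model's own confidence can be sketched as follows. This assumes one confidence score per residue on a 0 to 100 scale and a cutoff of 70. Both the cutoff and the example scores are assumptions for illustration, not values given in this discussion.

```python
def low_confidence_regions(scores, threshold=70.0, min_length=1):
    """Return (start, end) index pairs, 0-based and inclusive, for runs of
    consecutive residues whose confidence score is below the threshold."""
    regions = []
    start = None
    for i, score in enumerate(scores):
        if score < threshold:
            if start is None:
                start = i  # a new low-confidence run begins here
        elif start is not None:
            if i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the chain.
    if start is not None and len(scores) - start >= min_length:
        regions.append((start, len(scores) - 1))
    return regions


# Hypothetical per-residue confidence scores for a seven-residue peptide.
example_scores = [95, 92, 60, 55, 88, 91, 40]

print(low_confidence_regions(example_scores))
print(low_confidence_regions(example_scores, min_length=2))
```

Residues in the flagged runs would be treated as uncertain and prioritized for experimental follow-up, or excluded from analyses that depend on precise geometry.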
Ryan Nakamura
Could this approach extend beyond proteins to other molecular systems?
Dr. Demis Hassabis
Absolutely. We're working on predicting RNA structures, which fold through different rules than proteins but share the principle that sequence determines structure. Small molecule design—generating drug candidates with desired properties—uses similar machine learning strategies. Predicting how proteins interact with DNA, RNA, or small molecules requires modeling multi-component complexes. Eventually, we aim to predict entire cellular systems—how metabolic networks function, how signaling cascades respond to perturbations. This requires integrating structure prediction with models of dynamics, regulation, and cellular context. The ultimate vision is whole-cell models that predict phenotype from genotype with molecular resolution. We're far from that, but AlphaFold demonstrates that machine learning can solve problems in biology that were thought intractable, suggesting similar approaches might work for increasingly complex systems.
Vera Castellanos
What ethical considerations arise from this technology?
Dr. Demis Hassabis
Open access was our priority—we made all predictions freely available because we believe scientific progress should be shared. But there are dual-use concerns. If you can design proteins with specific functions, you could potentially engineer toxins, allergens, or pathogens. We've consulted with biosecurity experts to identify risks. Current consensus is that protein design capability doesn't dramatically increase bioterrorism risk beyond existing synthetic biology tools, but vigilance is necessary. There's also equity—will benefits accrue primarily to wealthy institutions with resources to synthesize and test predicted proteins, or can global research communities participate? We're working with organizations in lower-income countries to build capacity for computational biology. Intellectual property is another issue—can companies patent naturally occurring proteins whose structures were predicted computationally? These are emerging questions requiring ongoing dialogue between scientists, ethicists, policymakers, and the public.
Ryan Nakamura
Looking forward, what's the next frontier?
Dr. Demis Hassabis
Full integration with drug discovery pipelines—going from predicted structure to clinical candidate more rapidly and reliably. Better modeling of dynamic behavior—proteins as molecular movies, not static snapshots. Predicting how mutations affect function, enabling precision medicine at scale. Designing entirely novel protein functions that nature never evolved—synthetic enzymes for industrial processes, carbon capture catalysts, self-assembling materials. And ultimately, extending these methods to understand cellular systems holistically. Biology's complexity means we'll always need experimentation, but AI can guide experiments more efficiently, test hypotheses in silico before committing resources, and reveal patterns invisible to human analysis. We're at the beginning of a transformation in how biological research is conducted.
Vera Castellanos
Yet each prediction is a hypothesis, not a certainty. The map is not the territory.
Dr. Demis Hassabis
Precisely. AlphaFold provides remarkably accurate maps, but biology is the territory—messy, context-dependent, shaped by evolution rather than engineering principles. We must remain humble about what models can tell us.
Ryan Nakamura
It's the classic tension between computational prediction and biological reality. Models necessarily simplify, but organisms operate with a complexity we haven't fully captured.
Vera Castellanos
Which is why this technology is most powerful when it informs experiments rather than replacing them. It accelerates the cycle of hypothesis and test.
Dr. Demis Hassabis
Exactly. AI and experimentation are complementary, not competitive. Together they advance understanding faster than either alone.
Vera Castellanos
Dr. Hassabis, thank you for this discussion.
Dr. Demis Hassabis
Thank you. It's been a pleasure.
Ryan Nakamura
Tomorrow we examine somatic gene therapy and in vivo editing with Dr. Jennifer Doudna.
Vera Castellanos
Until then. Good afternoon.