Unlocking hidden biomolecular conformational landscapes in diffusion models at inference time
Stanford University | Yale School of Medicine
The 30-Second View
IN SHORT: This paper addresses the core challenge of efficiently and accurately sampling the conformational landscape of biomolecules from diffusion-based structure prediction models, which typically output highly concentrated distributions around a single static structure.
Innovation (TL;DR)
- Methodology: Introduces ConforMix, a novel inference-time algorithm combining twisted sequential Monte Carlo (SMC) with automated exploration of the diffusion landscape, enabling asymptotically exact sampling of conditional distributions without additional model training.
- Methodology: Presents ConforMixRMSD, an instantiation for automated exploration that biases sampling away from the default prediction using RMSD-based potentials on rigid secondary structure elements, recovering diverse conformations without prior knowledge of degrees of freedom.
- Methodology: Applies the multistate Bennett acceptance ratio (MBAR) free energy estimation algorithm to diffusion models for the first time, enabling reconstruction of the unbiased model landscape from conditional samples.
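To make the RMSD-biasing and reweighting ideas above more concrete, here is a minimal sketch, not the authors' implementation. It assumes conformations are pre-aligned lists of (x, y, z) tuples and energies are in units of kT. The harmonic form of `bias_energy` and the parameters `target` and `k` are hypothetical, since the summary does not specify the potential. The reweighting step undoes a single bias by importance weighting, which is a simpler single-window analogue of MBAR rather than MBAR itself.

```python
import math

def rmsd(a, b):
    """RMSD between two pre-aligned conformations (lists of (x, y, z) tuples)."""
    if len(a) != len(b) or not a:
        raise ValueError("conformations must be non-empty and of equal length")
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def bias_energy(conf, default, target, k=1.0):
    """Hypothetical harmonic bias: zero when conf is `target` RMSD from default."""
    return k * (rmsd(conf, default) - target) ** 2

def unbiased_weights(bias_energies):
    """Normalized weights proportional to exp(+U_bias).

    Samples drawn from p(x) * exp(-U_bias(x)) are reweighted back toward p(x).
    Subtracting the maximum keeps the exponentials numerically stable.
    """
    m = max(bias_energies)
    w = [math.exp(u - m) for u in bias_energies]
    total = sum(w)
    return [x / total for x in w]
```

In this sketch, samples that needed a larger bias to be reached receive larger weights when the unbiased landscape is reconstructed. Combining samples from several bias windows would call for MBAR proper.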
Key conclusions
- ConforMixRMSD applied to Boltz-1 (an AlphaFold 3-like model) outperforms the MSA-modification baselines (AFCluster, AFSample2, CF-random) at recovering experimentally observed alternative conformations. Coverage, measured at 50% of the reference-to-reference RMSD, is 0.69 ± 0.15 vs. 0.51 ± 0.17 for the best baseline on the domain motion set, 0.33 ± 0.23 vs. 0.20 ± 0.20 on the membrane transporter set, and 0.45 ± 0.18 vs. 0.39 ± 0.16 on the cryptic pocket set.
- The method captures biologically relevant conformational transitions (domain motion, transporter cycling, cryptic pocket flexibility) while avoiding unphysical states through filtering based on pLDDT values and clash detection, demonstrating its utility for exploring continuous transitions.
- When applied to models such as BioEmu, ConforMix speeds up free energy estimation. Its framework is orthogonal to improvements in model pretraining, so it would benefit even a hypothetical model that perfectly reproduces the Boltzmann distribution.
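The coverage metric quoted above is not defined in this summary, so the sketch below encodes one plausible reading and should be treated as an assumption: a target counts as covered if at least one sampled conformation lies within `fraction` times the reference-to-reference RMSD of the alternative reference, and coverage is the fraction of targets covered. The function names and the input layout (precomputed RMSDs) are hypothetical.

```python
def is_covered(sample_rmsds_to_alt, ref_to_ref_rmsd, fraction=0.5):
    """True if any sample is within fraction * ref-to-ref RMSD of the alternative state."""
    threshold = fraction * ref_to_ref_rmsd
    return any(r <= threshold for r in sample_rmsds_to_alt)

def coverage(targets, fraction=0.5):
    """Fraction of targets covered.

    targets: list of (sample_rmsds_to_alt, ref_to_ref_rmsd) pairs, one per protein.
    """
    if not targets:
        raise ValueError("need at least one target")
    hits = sum(is_covered(s, d, fraction) for s, d in targets)
    return hits / len(targets)

# Two hypothetical proteins, each with references 4.0 apart (threshold 2.0).
example = [([4.0, 1.9], 4.0), ([3.0, 2.5], 4.0)]
print(coverage(example))
```

Scaling the threshold by the reference-to-reference RMSD makes the criterion relative to the size of the conformational change, so small and large motions are judged on a comparable footing.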
Abstract: The function of biomolecules such as proteins depends on their ability to interconvert between a wide range of structures or “conformations.” Researchers have endeavored for decades to develop computational methods to predict the distribution of conformations, which is far harder to determine experimentally than a static folded structure. We present ConforMix, an inference-time algorithm that enhances sampling of conformational distributions using a combination of classifier guidance, filtering, and free energy estimation. Our approach upgrades diffusion models—whether trained for static structure prediction or conformational generation—to enable more efficient discovery of conformational variability without requiring prior knowledge of major degrees of freedom. ConforMix is orthogonal to improvements in model pretraining and would benefit even a hypothetical model that perfectly reproduced the Boltzmann distribution. Remarkably, when applied to a diffusion model trained for static structure prediction, ConforMix captures structural changes including domain motion, cryptic pocket flexibility, and transporter cycling, while avoiding unphysical states. Case studies of biologically critical proteins demonstrate the scalability, accuracy, and utility of this method.
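As an illustration of the filtering step, the following sketch rejects samples with low confidence or steric clashes. It is a simplified stand-in, not the paper's filter: it assumes one representative atom per residue (for example C-alpha), and the thresholds `min_mean_plddt=70.0` and `min_dist=2.0` (in angstroms) are hypothetical placeholders rather than values taken from the source.

```python
import math

def has_clash(coords, min_dist=2.0):
    """True if any two non-adjacent residues are closer than min_dist."""
    n = len(coords)
    for i in range(n):
        for j in range(i + 2, n):  # skip sequence neighbors, which are always close
            if math.dist(coords[i], coords[j]) < min_dist:
                return True
    return False

def keep_sample(coords, plddt, min_mean_plddt=70.0, min_dist=2.0):
    """Keep a sample only if mean pLDDT is high enough and no clash is found."""
    if not plddt:
        raise ValueError("plddt must be non-empty")
    mean_plddt = sum(plddt) / len(plddt)
    return mean_plddt >= min_mean_plddt and not has_clash(coords, min_dist)
```

The pairwise loop is quadratic in the number of residues, which is adequate for a sketch. A spatial grid or k-d tree would be the usual choice for large complexes.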