Paper List

Health Informatics

An AI Implementation Science Study to Improve Trustworthy Data in a Large Healthcare System

2025-12-01

This paper addresses the critical gap between theoretical AI research and real-world clinical implementation by providing a practical framework for as...
Bioinformatics

The BEAT-CF Causal Model: A model for guiding the design of trials and observational analyses of cystic fibrosis exacerbations

2025-12

This paper addresses the critical gap in cystic fibrosis exacerbation management by providing a formal causal framework that integrates expert knowled...
Bioinformatics

Hierarchical Molecular Language Models (HMLMs)

2025-11-30

This paper addresses the core challenge of accurately modeling context-dependent signaling, pathway cross-talk, and temporal dynamics across multiple ...
Computational Neuroscience

Stability analysis of action potential generation using Markov models of voltage‑gated sodium channel isoforms

2025-11-30

This work addresses the challenge of systematically characterizing how the high-dimensional parameter space of Markov models for different sodium chan...
Network Science

Approximate Bayesian Inference on Mechanisms of Network Growth and Evolution

2025-11-30

This paper addresses the core challenge of inferring the relative contributions of multiple, simultaneous generative mechanisms in network formation w...
Bioinformatics

EnzyCLIP: A Cross-Attention Dual Encoder Framework with Contrastive Learning for Predicting Enzyme Kinetic Constants

2025-11-29

This paper addresses the core challenge of jointly predicting enzyme kinetic parameters (Kcat and Km) by modeling dynamic enzyme-substrate interaction...
Biophysics

Tissue stress measurements with Bayesian Inversion Stress Microscopy

2025-11-29

This paper addresses the core challenge of measuring absolute, tissue-scale mechanical stress without making assumptions about tissue rheology, which ...
Bioinformatics

DeepFRI Demystified: Interpretability vs. Accuracy in AI Protein Function Prediction

2025-11-29

This study addresses the critical gap between high predictive accuracy and biological interpretability in DeepFRI, revealing that the model often prio...


Venue: 24th International Workshop on Data Mining in Bioinformatics

Publication date: 2018-06-03

Bioinformatics, Computational Biology

Binary Latent Protein Fitness Landscapes for Quantum Annealing Optimization

University of Alabama at Birmingham

Truong-Son Hy

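30-Second Read

IN SHORT: Bridges protein representation learning and combinatorial optimization by mapping sequences into a binary latent space for QUBO-based fitness optimization.

Key Innovations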

Methodology: First framework to transform protein language model embeddings into binary latent representations for QUBO-based fitness modeling
Methodology: Enables direct compatibility with quantum annealing hardware through native QUBO formulation
Biology: Demonstrates that simple binary representations can capture meaningful structure in protein fitness landscapes

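Main Findings

Q-BioLat achieves Spearman correlations of 0.385-0.413 on the ProteinGym GFP dataset (10,000 samples, latent dimensions 32-64)
Optimized sequences consistently retrieve nearest neighbors in the top fitness percentiles, with simulated annealing achieving an improvement of 1.529± in surrogate score
Genetic algorithms outperform other methods in the higher-dimensional latent space (m=64), while local search better preserves sequence realism

Research gap: Existing protein optimization methods rely on continuous representations and gradient-based methods, which are ill-suited to discrete combinatorial search, while classical discrete methods scale poorly in high-dimensional sequence spaces.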

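Abstract: We present Q-BioLat, a framework for modeling and optimizing protein fitness landscapes in a binary latent space. Starting from protein sequences, we use a pretrained protein language model to obtain continuous embeddings, which are then converted into compact binary latent representations. In this space, protein fitness is approximated with a quadratic unconstrained binary optimization (QUBO) model, enabling efficient combinatorial search with classical heuristics such as simulated annealing and genetic algorithms. On the ProteinGym benchmark, we show that Q-BioLat captures meaningful structure in protein fitness landscapes and can identify high-fitness variants. Despite using a simple binarization scheme, our method consistently retrieves sequences whose nearest neighbors lie at the top of the training fitness distribution, particularly under the strongest configurations. We further show that different optimization strategies behave differently: evolutionary search performs better in higher-dimensional latent spaces, while local search remains competitive at preserving realistic sequences. Beyond its empirical performance, Q-BioLat provides a natural bridge between protein representation learning and combinatorial optimization. By formulating protein fitness as a QUBO problem, our framework is directly compatible with emerging quantum annealing hardware, opening new directions for quantum-assisted protein engineering.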
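The pipeline described in the abstract (continuous embeddings, binary latent codes, a QUBO surrogate, then heuristic search) can be illustrated with a small sketch. This is not the authors' implementation. The binarization scheme (random projection plus per-dimension median threshold), the ridge-regression fit, the annealing schedule, and all function names are assumptions chosen for illustration. Random vectors stand in for protein language model embeddings, and the fitness values are synthetic.

```python
import numpy as np

def binarize(embeddings, m, rng):
    """Assumed scheme: random projection to m dims, then median threshold per dim."""
    proj = rng.standard_normal((embeddings.shape[1], m))
    latent = embeddings @ proj
    return (latent > np.median(latent, axis=0)).astype(np.int8)

def qubo_features(Z):
    """Linear terms z_i followed by pairwise terms z_i * z_j for i < j."""
    Z = np.asarray(Z, dtype=float)
    iu, ju = np.triu_indices(Z.shape[1], k=1)
    return np.hstack([Z, Z[:, iu] * Z[:, ju]])

def fit_qubo(Z, y, ridge=1.0):
    """Ridge regression so that z^T Q z + offset approximates fitness."""
    m = Z.shape[1]
    F = qubo_features(Z)
    F_mean, y_mean = F.mean(axis=0), y.mean()
    Fc = F - F_mean
    w = np.linalg.solve(Fc.T @ Fc + ridge * np.eye(F.shape[1]), Fc.T @ (y - y_mean))
    Q = np.zeros((m, m))
    Q[np.diag_indices(m)] = w[:m]        # z_i^2 == z_i for binary z
    Q[np.triu_indices(m, k=1)] = w[m:]   # same pair order as qubo_features
    return Q, y_mean - F_mean @ w

def qubo_score(Q, z):
    z = np.asarray(z, dtype=float)
    return float(z @ Q @ z)

def simulated_annealing(Q, z0, rng, steps=2000, t_start=1.0, t_end=0.01):
    """Maximize z^T Q z with single-bit flips and a geometric cooling schedule."""
    z = z0.copy()
    score = qubo_score(Q, z)
    best_z, best = z.copy(), score
    for step in range(steps):
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        cand = z.copy()
        cand[rng.integers(len(z))] ^= 1  # flip one random bit
        cand_score = qubo_score(Q, cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cand_score >= score or rng.random() < np.exp((cand_score - score) / t):
            z, score = cand, cand_score
            if score > best:
                best_z, best = z.copy(), score
    return best_z, best

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 32))       # stand-in for language model embeddings
y = X[:, 0] + 0.5 * X[:, 1] * X[:, 2]    # synthetic "fitness", not real data
Z = binarize(X, m=8, rng=rng)
Q, offset = fit_qubo(Z, y)
z0 = Z[0]
best_z, best = simulated_annealing(Q, z0, rng)
print("start:", qubo_score(Q, z0) + offset, "optimized:", best + offset)
```

The optimized binary code is a point in latent space, not a sequence. Mapping it back to a protein (for example through nearest neighbors in the training set, as the findings above describe) is a separate step that this sketch does not cover.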

Code