Paper List

Systems Biology

Ill-Conditioning in Dictionary-Based Dynamic-Equation Learning: A Systems Biology Case Study

2026-03-11

This paper addresses the critical challenge of numerical ill-conditioning and multicollinearity in library-based sparse regression methods (e.g., SIND...
Neuroimaging

Hybrid eTFCE–GRF: Exact Cluster-Size Retrieval with Analytical p-Values for Voxel-Based Morphometry

2026-03-11

This paper addresses the computational bottleneck in voxel-based neuroimaging analysis by providing a method that delivers exact cluster-size retrieva...
Bioinformatics

abx_amr_simulator: A simulation environment for antibiotic prescribing policy optimization under antimicrobial resistance

2026-03-11

This paper addresses the critical challenge of quantitatively evaluating antibiotic prescribing policies under realistic uncertainty and partial obser...
Bioinformatics

PesTwin: a biology-informed Digital Twin for enabling precision farming

2026-03-11

This paper addresses the critical bottleneck in precision agriculture: the inability to accurately forecast pest outbreaks in real-time, leading to su...
Bioinformatics

Equivariant Asynchronous Diffusion: An Adaptive Denoising Schedule for Accelerated Molecular Conformation Generation

2026-03-10

This paper addresses the core challenge of generating physically plausible 3D molecular structures by bridging the gap between autoregressive methods ...
Bioinformatics

Omics Data Discovery Agents

2026-03-10

This paper addresses the core challenge of making published omics data computationally reusable by automating the extraction, quantification, and inte...
Biophysics

Single-cell directional sensing at ultra-low chemoattractant concentrations from extreme first-passage events

2026-03-10

This work addresses the core challenge of how a cell can rapidly and accurately determine the direction of a chemoattractant source when the signal is...
Bioinformatics

SDSR: A Spectral Divide-and-Conquer Approach for Species Tree Reconstruction

2026-03-10

This paper addresses the computational bottleneck in reconstructing species trees from thousands of species and multiple genes by introducing a scalab...


Venue: 24th International Workshop on Data Mining in Bioinformatics

Publication date: 2018-06-03

Bioinformatics, Computational Biology

Binary Latent Protein Fitness Landscapes for Quantum Annealing Optimization

University of Alabama at Birmingham

Truong-Son Hy

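30-Second Read

IN SHORT: Bridges protein representation learning and combinatorial optimization by mapping sequences into a binary latent space for QUBO-based fitness optimization.

Key Innovations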

Methodology: First framework to transform protein language model embeddings into binary latent representations for QUBO-based fitness modeling.
Methodology: Enables direct compatibility with quantum annealing hardware through a native QUBO formulation.
Biology: Demonstrates that simple binary representations can capture meaningful structure in protein fitness landscapes.
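
For readers unfamiliar with the term, a QUBO model scores a binary vector with a quadratic form. The expression below is the generic textbook form, not necessarily the exact parametrization used in the paper; the symbols z (binary latent code), Q (coefficient matrix), and the surrogate fitness \hat{f} are notation assumed here for illustration, with m denoting the latent dimension.

```latex
\hat{f}(z) \;=\; z^{\top} Q z
\;=\; \sum_{i=1}^{m} Q_{ii}\, z_i \;+\; \sum_{1 \le i < j \le m} Q_{ij}\, z_i z_j,
\qquad z \in \{0,1\}^{m}
```

The diagonal terms are linear in z because z_i^2 = z_i for binary variables, and Q is taken to be upper triangular so that each pair (i, j) appears once.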

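Main Findings

Q-BioLat achieves Spearman correlations of 0.385-0.413 on the ProteinGym GFP dataset (10,000 samples, latent dimensions 32-64).
Optimized sequences consistently retrieve nearest neighbors in the top fitness percentiles, with simulated annealing achieving an improvement of 1.529± in surrogate score.
The genetic algorithm outperforms the other methods in the higher-dimensional latent space (m=64), while local search better preserves sequence realism.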

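Research gap: Existing protein optimization methods rely on continuous representations and gradient-based techniques, which are ill-suited to discrete combinatorial search, while classical discrete methods scale poorly in high-dimensional sequence spaces.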

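Abstract: We propose Q-BioLat, a framework for modeling and optimizing protein fitness landscapes in a binary latent space. Starting from protein sequences, we use pretrained protein language models to obtain continuous embeddings, which are then transformed into compact binary latent representations. In this space, protein fitness is approximated with a quadratic unconstrained binary optimization (QUBO) model, enabling efficient combinatorial search via classical heuristics such as simulated annealing and genetic algorithms. On the ProteinGym benchmark, we show that Q-BioLat captures meaningful structure in protein fitness landscapes and can identify high-fitness variants. Despite using a simple binarization scheme, our method consistently retrieves sequences whose nearest neighbors lie at the top of the training fitness distribution, particularly under the strongest configurations. We further show that different optimization strategies exhibit distinct behaviors: evolutionary search performs better in higher-dimensional latent spaces, while local search remains competitive at preserving realistic sequences. Beyond its empirical performance, Q-BioLat provides a natural bridge between protein representation learning and combinatorial optimization. By formulating protein fitness as a QUBO problem, our framework is directly compatible with emerging quantum annealing hardware, opening new directions for quantum-assisted protein engineering.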
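
The pipeline described in the abstract (binarize embeddings, score binary codes with a QUBO surrogate, search with a classical heuristic) can be illustrated with a minimal sketch. This is not the authors' implementation: the median-threshold binarization, the function names, the cooling schedule, and the toy coefficient matrix are all assumptions made for illustration, and the step of fitting Q to measured fitness values is omitted.

```python
import math
import random


def binarize(embeddings):
    """Threshold each dimension at its median (an assumed scheme, not the paper's)."""
    dims = len(embeddings[0])
    medians = []
    for j in range(dims):
        col = sorted(row[j] for row in embeddings)
        mid = len(col) // 2
        medians.append(col[mid] if len(col) % 2 else 0.5 * (col[mid - 1] + col[mid]))
    return [[1 if row[j] > medians[j] else 0 for j in range(dims)] for row in embeddings]


def qubo_score(z, Q):
    """Surrogate fitness z^T Q z for a binary vector z, with Q upper triangular."""
    m = len(z)
    return sum(Q[i][j] * z[i] * z[j] for i in range(m) for j in range(i, m))


def simulated_annealing(Q, z0, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Maximize qubo_score using single-bit flips and geometric cooling."""
    rng = random.Random(seed)
    z = list(z0)
    score = qubo_score(z, Q)
    best_z, best_score = list(z), score
    for step in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i = rng.randrange(len(z))
        z[i] ^= 1  # propose flipping one bit
        new_score = qubo_score(z, Q)
        delta = new_score - score
        if delta >= 0 or rng.random() < math.exp(delta / t):
            score = new_score  # accept the move
            if score > best_score:
                best_z, best_score = list(z), score
        else:
            z[i] ^= 1  # reject: undo the flip
    return best_z, best_score


# Toy example with a hand-written 4-bit coefficient matrix (hypothetical values).
Q_toy = [
    [1.0, 0.0, 0.5, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0],
    [0.0, 0.0, 0.0, -3.0],
]
best_z, best_score = simulated_annealing(Q_toy, [0, 0, 0, 0])
```

In a real setting the optimized binary code would still have to be mapped back to a protein sequence, for example through the nearest-neighbor retrieval mentioned in the findings; that step is outside the scope of this sketch.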

Code