Paper List

Health Informatics

An AI Implementation Science Study to Improve Trustworthy Data in a Large Healthcare System

2025-12-01

This paper addresses the critical gap between theoretical AI research and real-world clinical implementation by providing a practical framework for as...

Bioinformatics

The BEAT-CF Causal Model: A model for guiding the design of trials and observational analyses of cystic fibrosis exacerbations

2025-12

This paper addresses the critical gap in cystic fibrosis exacerbation management by providing a formal causal framework that integrates expert knowled...

Bioinformatics

Hierarchical Molecular Language Models (HMLMs)

2025-11-30

This paper addresses the core challenge of accurately modeling context-dependent signaling, pathway cross-talk, and temporal dynamics across multiple ...

Computational Neuroscience

Stability analysis of action potential generation using Markov models of voltage‑gated sodium channel isoforms

2025-11-30

This work addresses the challenge of systematically characterizing how the high-dimensional parameter space of Markov models for different sodium chan...

Network Science

Approximate Bayesian Inference on Mechanisms of Network Growth and Evolution

2025-11-30

This paper addresses the core challenge of inferring the relative contributions of multiple, simultaneous generative mechanisms in network formation w...

Bioinformatics

EnzyCLIP: A Cross-Attention Dual Encoder Framework with Contrastive Learning for Predicting Enzyme Kinetic Constants

2025-11-29

This paper addresses the core challenge of jointly predicting enzyme kinetic parameters (Kcat and Km) by modeling dynamic enzyme-substrate interaction...

Biophysics

Tissue stress measurements with Bayesian Inversion Stress Microscopy

2025-11-29

This paper addresses the core challenge of measuring absolute, tissue-scale mechanical stress without making assumptions about tissue rheology, which ...

Bioinformatics

DeepFRI Demystified: Interpretability vs. Accuracy in AI Protein Function Prediction

2025-11-29

This study addresses the critical gap between high predictive accuracy and biological interpretability in DeepFRI, revealing that the model often prio...


Journal: arXiv Preprint

Publication date: 2026-03-17

Bioinformatics, Machine Learning

Sample-Efficient Adaptation of Drug-Response Models to Patient Tumors under Strong Biological Domain Shift

Université Grenoble Alpes (UGA)

Camille Jimenez Cortes, Philippe Lalanda, German Vega

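30-Second Read

IN SHORT: Learning transferable representations from unlabeled molecular profiles enables effective prediction of patient drug response with minimal clinical data.

Key Innovations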

Methodology: Proposes STaR-DR, a staged transfer-learning framework that explicitly separates unsupervised representation learning, task-specific alignment, and few-shot clinical adaptation.
Methodology: Demonstrates that unsupervised pretraining yields limited gains for in vitro prediction but substantially improves few-shot adaptation to patient tumors under strong domain shift.
Biology: Links performance patterns to latent-space geometry, providing mechanistic insight into when representation learning is beneficial under biological domain shift.
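The three stages named above (unsupervised representation learning, task-specific alignment, few-shot adaptation) can be illustrated with a minimal sketch. This is not the STaR-DR implementation. It assumes a single feature vector per sample instead of separate cell and drug encoders, swaps the autoencoder for the top principal direction found by power iteration (the optimum of a one-unit linear autoencoder with tied weights), and uses a logistic head. All function names and the synthetic data are hypothetical.

```python
import math
import random


def fit_encoder(unlabeled, iters=200):
    """Stage 1 (unsupervised): top principal direction via power iteration,
    used here as a stand-in for an autoencoder. Labels are never seen."""
    n, d = len(unlabeled), len(unlabeled[0])
    mean = [sum(row[j] for row in unlabeled) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in unlabeled]
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    w = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return mean, w


def encode(encoder, rows):
    """Project rows onto the frozen 1-D latent axis."""
    mean, w = encoder
    return [sum((x - m) * wi for x, m, wi in zip(row, mean, w)) for row in rows]


def log_loss(head, zs, ys):
    """Mean logistic loss of head (a, b) on latent codes zs with 0/1 labels ys."""
    a, b = head
    total = 0.0
    for z, y in zip(zs, ys):
        margin = (a * z + b) if y == 1 else -(a * z + b)
        total += math.log1p(math.exp(-margin))
    return total / len(zs)


def train_head(zs, ys, head=(0.0, 0.0), lr=0.1, steps=200):
    """Gradient descent on the logistic loss, starting from `head`."""
    a, b = head
    n = len(zs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for z, y in zip(zs, ys):
            p = 1.0 / (1.0 + math.exp(-(a * z + b)))
            grad_a += (p - y) * z / n
            grad_b += (p - y) / n
        a, b = a - lr * grad_a, b - lr * grad_b
    return a, b


def make_domain(rng, n, shift):
    """Synthetic 2-feature profiles (assumption); `shift` mimics a domain offset."""
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        t = (1.0 if y else -1.0) + rng.gauss(0.0, 0.3)
        rows.append([t + shift, 0.5 * t + rng.gauss(0.0, 0.1)])
        labels.append(y)
    return rows, labels


rng = random.Random(0)
source_x, source_y = make_domain(rng, 200, shift=0.0)  # stands in for cell lines
target_x, target_y = make_domain(rng, 6, shift=1.5)    # few labeled "patients"

# Stage 1: representation learned from pooled profiles, labels unused.
encoder = fit_encoder(source_x + target_x)
# Stage 2: align the latent code with response labels on the source domain.
source_head = train_head(encode(encoder, source_x), source_y)
# Stage 3: few-shot adaptation of the head only; the encoder stays frozen.
target_z = encode(encoder, target_x)
adapted_head = train_head(target_z, target_y, head=source_head, steps=50)

loss_before = log_loss(source_head, target_z, target_y)
loss_after = log_loss(adapted_head, target_z, target_y)
print(f"target loss before adaptation: {loss_before:.3f}, after: {loss_after:.3f}")
```

Keeping the encoder frozen in stage 3 means only two head parameters are fitted on the six target samples, which is one simple way to keep few-shot adaptation from overfitting. Whether STaR-DR freezes or fine-tunes its encoders is not stated here.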

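Key Findings

Unsupervised pretraining offers limited benefit for in-domain prediction (balanced accuracy of about 0.85) and cross-dataset generalization (ROC-AUC of about 0.75), but yields clear gains when adapting to patient tumors with very limited labeled data.
During few-shot patient-level adaptation, the staged framework improves faster, requiring roughly 30-40% fewer labeled target samples for effective transfer than single-stage baselines.
Performance degrades most sharply under the Leave-Drug-Out protocol (AUPRC drops by about 0.15), highlighting the inherent difficulty of extrapolating drug response prediction to previously unseen compounds.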
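To make the Leave-Drug-Out protocol and the balanced-accuracy metric concrete, here is a small sketch. It assumes records of the form (cell line, drug, binary response label). The function names, the held-out fraction, and the toy records are illustrative assumptions, not the paper's evaluation code.

```python
import random


def leave_drug_out_split(records, held_out_fraction=0.2, seed=0):
    """Split (cell_line, drug, label) records so test drugs never appear in train."""
    drugs = sorted({drug for _, drug, _ in records})
    rng = random.Random(seed)
    rng.shuffle(drugs)
    n_test = max(1, round(held_out_fraction * len(drugs)))
    held_out = set(drugs[:n_test])
    train = [r for r in records if r[1] not in held_out]
    test = [r for r in records if r[1] in held_out]
    return train, test


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall for binary 0/1 labels (classes absent from y_true are skipped)."""
    recalls = []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        if idx:
            recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical toy records: 4 cell lines x 5 drugs with an arbitrary label.
records = [(f"cell_{c}", f"drug_{d}", (c + d) % 2)
           for c in range(4) for d in range(5)]
train, test = leave_drug_out_split(records)
train_drugs = {drug for _, drug, _ in train}
test_drugs = {drug for _, drug, _ in test}
print(sorted(test_drugs), len(train), len(test))
```

Splitting by drug rather than by row is what makes this setting hard: every compound in the test set is unseen during training, so the model cannot rely on memorized per-drug response patterns.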

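Research gap: Current drug response prediction (DRP) models trained on cell-line data fail to generalize to patient tumors because of strong biological domain shift, and existing methods have not systematically studied when representation learning improves adaptation efficiency under such shift.

Abstract: Predicting drug response in patients from preclinical data remains a major challenge in precision oncology because of the substantial biological gap between in vitro cell lines and patient tumors. Rather than aiming to improve absolute in vitro prediction accuracy, this study examines whether explicitly separating representation learning from task supervision enables more sample-efficient adaptation of drug-response models to patient data under strong biological domain shift. We propose a staged transfer-learning framework in which cell and drug representations are first learned independently from large amounts of unlabeled pharmacogenomic data using autoencoder-based representation learning. These representations are then aligned with drug-response labels on cell-line data and finally adapted to patient tumors through few-shot supervision. Through a systematic evaluation spanning in-domain, cross-dataset, and patient-level settings, we find that unsupervised pretraining provides limited benefit when the source and target domains overlap substantially, but yields clear gains when adapting to patient tumors with very limited labeled data. Specifically, the proposed framework achieves faster performance improvement during few-shot patient-level adaptation while maintaining accuracy comparable to single-stage baselines on standard cell-line benchmarks. Overall, these results indicate that learning structured, transferable representations from unlabeled molecular profiles can substantially reduce the amount of clinical supervision needed for effective drug-response prediction, offering a practical route to data-efficient preclinical-to-clinical translation.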