Paper List

Complex Systems

Macroscopic Dominance from Microscopic Extremes: Symmetry Breaking in Spatial Competition

2026-03-11

This paper addresses the fundamental question of how microscopic stochastic advantages in spatial exploration translate into macroscopic resource domi...

Computational Neuroscience

Linear Readout of Neural Manifolds with Continuous Variables

2026-03-11

This paper addresses the core challenge of quantifying how the geometric structure of high-dimensional neural population activity (neural manifolds) d...

Biophysics

Theory of Cell Body Lensing and Phototaxis Sign Reversal in “Eyeless” Mutants of Chlamydomonas

2026-03-11

This paper solves the core puzzle of how eyeless mutants of Chlamydomonas exhibit reversed phototaxis by quantitatively modeling the competition betwe...

Bioinformatics

Cross-Species Transfer Learning for Electrophysiology-to-Transcriptomics Mapping in Cortical GABAergic Interneurons

2026-03-11

This paper addresses the challenge of predicting transcriptomic identity from electrophysiological recordings in human cortical interneurons, where li...

Computational Neuroscience

Uncovering statistical structure in large-scale neural activity with Restricted Boltzmann Machines

2026-03-11

This paper addresses the core challenge of modeling large-scale neural population activity (1500-2000 neurons) with interpretable higher-order interac...

Computational Modeling

Realizing Common Random Numbers: Event-Keyed Hashing for Causally Valid Stochastic Models

2026-03-11

This paper addresses the critical problem that standard stateful PRNG implementations in agent-based models violate causal validity by making random d...

Bioinformatics

A Standardized Framework for Evaluating Gene Expression Generative Models

2026-03-11

This paper addresses the critical lack of standardized evaluation protocols for single-cell gene expression generative models, where inconsistent metr...

Bioinformatics

Single Molecule Localization Microscopy Challenge: A Biologically Inspired Benchmark for Long-Sequence Modeling

2026-03-11

This paper addresses the core challenge of evaluating state-space models on biologically realistic, sparse, and stochastic temporal processes, which a...


Journal: arXiv preprint

Publication date: 2025-12-05

Bioinformatics | Evolutionary Biology

Tree Thinking in the Genomic Era: Unifying Models Across Cells, Populations, and Species

Stanford University | University of Oxford | University of California, Berkeley | Peking University | Guangzhou Medical University

Yun Deng, Shing H. Zhan, Yulin Zhang, Chao Zhang, Bingjie Chen

30-Second Read

IN SHORT: This paper addresses the fragmentation of tree-based inference methods across biological scales by identifying shared algorithmic principles and statistical challenges in phylogenetics, population genetics, and cell lineage tracing.

Key Innovations

Methodology: Identifies deep conceptual parallels between phylogenetic placement algorithms and ARG threading methods, demonstrating how phylogenetic placement generalizes to ARG reconstruction.
Biology: Shows that quartet-based network methods in phylogenetics and ABBA-BABA statistics in population genetics capture the same underlying signal of gene flow through asymmetric genealogical relationships.
Methodology: Demonstrates how ARG-based migration inference methods (e.g., GAIA, spacetrees) extend classical phylogeographic approaches by leveraging the full sequence of locally correlated genealogies along the genome.
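
To make the "asymmetric genealogical relationships" behind ABBA-BABA statistics concrete, here is a minimal sketch of Patterson's D in Python. It assumes the four-taxon tree (((P1, P2), P3), O), one haploid sample per population, and alleles coded 0 (ancestral) and 1 (derived). The function name `patterson_d`, the tuple encoding, and the toy data are illustrative assumptions and do not come from the paper.

```python
from typing import Iterable, Tuple

Site = Tuple[int, int, int, int]  # (P1, P2, P3, O); 0 = ancestral, 1 = derived


def patterson_d(sites: Iterable[Site]) -> float:
    """Return Patterson's D = (ABBA - BABA) / (ABBA + BABA).

    Hypothetical helper for illustration: assumes the species tree
    (((P1, P2), P3), O) and that the outgroup O carries the ancestral allele.
    """
    abba = 0  # P2 and P3 share the derived allele
    baba = 0  # P1 and P3 share the derived allele
    for p1, p2, p3, o in sites:
        if o != 0:
            continue  # skip sites where the outgroup is not ancestral
        if (p1, p2, p3) == (0, 1, 1):
            abba += 1
        elif (p1, p2, p3) == (1, 0, 1):
            baba += 1
    total = abba + baba
    if total == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / total


# Toy data (made up): 6 ABBA sites, 2 BABA sites, 5 uninformative BBAA sites.
toy_sites = [(0, 1, 1, 0)] * 6 + [(1, 0, 1, 0)] * 2 + [(1, 1, 0, 0)] * 5
print(patterson_d(toy_sites))  # (6 - 2) / (6 + 2) = 0.5
```

Without gene flow, incomplete lineage sorting is expected to produce ABBA and BABA patterns equally often, so D stays near 0. A clear excess of one pattern, as in the toy data, is the asymmetry read as evidence of gene flow between P3 and either P2 (positive D) or P1 (negative D). Real analyses work from allele frequencies across many samples and assess significance with a block jackknife, which this sketch omits.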

Main Conclusions

Tree-based models provide a unified framework for ancestry inference across biological scales; an ARG representing ~2.48 million SARS-CoV-2 genomes demonstrates that the approach is feasible at pandemic scale.
Methodological parallels exist across domains: phylogenetic placement algorithms share core logic with ARG threading, and quartet-based methods in phylogenetics mirror ABBA-BABA statistics in population genetics for detecting gene flow.
Current ARG inference algorithms remain constrained by simplifying assumptions (neutrality, panmixia, constant population size) and face challenges in uncertainty quantification, particularly for non-model species or limited sample sizes.

Research gap: Tree-based inference methods have developed in isolation across different biological scales (species, populations, cells), leading to redundant algorithmic development and missed opportunities for cross-disciplinary innovation despite shared statistical and computational challenges.

Abstract: The ongoing explosion of genome sequence data is transforming how we reconstruct and understand the histories of biological systems. Across biological scales, from individual cells to populations and species, tree-based models provide a common framework for representing ancestry. Once limited to species phylogenetics, “tree thinking” now extends deeply to population genomics and cell biology, revealing the genealogical structure of genetic and phenotypic variation within and across organisms. Recently, there have been great methodological and computational advances in tree-based methods, including methods for inferring ancestral recombination graphs in populations, phylogenetic frameworks for comparative genomics, and lineage-tracing techniques in developmental and cancer biology. Despite differences in data types and biological contexts, these approaches share core statistical and algorithmic challenges: efficiently inferring branching histories from genomic information, integrating temporal and spatial signals, and connecting genealogical structures to evolutionary and functional processes. Recognizing these shared foundations opens opportunities for cross-fertilization between fields that are traditionally studied in isolation. By examining how tree-based methods are applied across cellular, population, and species scales, we identify the conceptual parallels that unite them and the distinct challenges that each domain presents. These comparisons offer new perspectives that can inform algorithmic innovations and lead to more powerful inference strategies across the full spectrum of biological systems.