Paper List
-
A Theoretical Framework for the Formation of Large Animal Groups: Topological Coordination, Subgroup Merging, and Velocity Inheritance
This paper addresses the core problem of how large, coordinated animal groups form in nature, challenging the classical view of gradual aggregation by...
-
CONFIDE: Hallucination Assessment for Reliable Biomolecular Structure Prediction and Design
This paper addresses the critical limitation of current protein structure prediction models (like AlphaFold3) where high-confidence scores (pLDDT) can...
-
Generative design and validation of therapeutic peptides for glioblastoma based on a potential target ATP5A
This paper addresses the critical bottleneck in therapeutic peptide design: how to efficiently optimize lead peptides with geometric constraints while...
-
Pharmacophore-based design by learning on voxel grids
This paper addresses the computational bottleneck and limited novelty in conventional pharmacophore-based virtual screening by introducing a voxel cap...
-
Human-Centred Evaluation of Text-to-Image Generation Models for Self-expression of Mental Distress: A Dataset Based on GPT-4o
This paper addresses the critical gap in evaluating how AI-generated images can effectively support cross-cultural mental distress communication, part...
-
ANNE Apnea Paper
This paper addresses the core challenge of achieving accurate, event-level sleep apnea detection and characterization using a non-intrusive, multimoda...
-
DeeDeeExperiment: Building an infrastructure for integrating and managing omics data analysis results in R/Bioconductor
This paper addresses the critical bottleneck of managing and organizing the growing volume of differential expression and functional enrichment analys...
-
Cross-Species Antimicrobial Resistance Prediction from Genomic Foundation Models
This paper addresses the core challenge of predicting antimicrobial resistance across phylogenetically distinct bacterial species, where traditional m...
Probabilistic Joint and Individual Variation Explained (ProJIVE) for Data Integration
Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University | Department of Radiology and Imaging Sciences, Emory University School of Medicine
30-Second Read
IN SHORT: This paper addresses the core challenge of accurately decomposing shared (joint) and dataset-specific (individual) sources of variation in multi-modal datasets, where existing methods often lack a formal statistical model, leading to potential inaccuracies and interpretability issues.
Key Innovations
- Methodology: Introduces ProJIVE, a novel probabilistic model that extends Probabilistic PCA (pPCA) to the JIVE framework, formally modeling joint and individual subject scores as random effects.
- Methodology: Develops a unified Expectation-Maximization (EM) algorithm for maximum likelihood estimation that infers all model parameters (loadings, scores, noise variances) simultaneously, unlike multi-step decomposition approaches.
- Biology: Applies the model to integrate brain morphometry and cognitive data from the ADNI cohort, demonstrating that the extracted joint scores correlate strongly with established but expensive Alzheimer's disease biomarkers (e.g., amyloid PET, FDG-PET, ApoE4 status).
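To make the "pPCA extended to JIVE" idea concrete, one way such a model can be written for two data blocks is sketched below. The notation ($W_{Jk}$, $W_{Ik}$, $z_i$, $b_{ik}$, $\sigma_k^2$) and the isotropic per-block noise are assumptions chosen for illustration; they are not taken from the paper and may differ from its parameterization.

```latex
% Subject i = 1..n, data block k = 1, 2 with p_k features (assumed notation).
\begin{aligned}
x_{ik} &= \mu_k + W_{Jk}\, z_i + W_{Ik}\, b_{ik} + \varepsilon_{ik}, \\
z_i &\sim \mathcal{N}(0, I_{r_J}), \qquad
b_{ik} \sim \mathcal{N}(0, I_{r_k}), \qquad
\varepsilon_{ik} \sim \mathcal{N}(0, \sigma_k^2 I_{p_k}),
\end{aligned}
% with z_i, b_{i1}, b_{i2}, and the noise terms mutually independent, so that
\begin{aligned}
\operatorname{Cov}(x_{ik}) &= W_{Jk} W_{Jk}^{\top} + W_{Ik} W_{Ik}^{\top} + \sigma_k^2 I_{p_k}, \\
\operatorname{Cov}(x_{i1}, x_{i2}) &= W_{J1} W_{J2}^{\top}.
\end{aligned}
```

Under this sketch the joint scores $z_i$ are the only source of covariance between blocks, while each $b_{ik}$ contributes variation to block $k$ alone. That is the sense in which joint and individual scores act as random effects.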
Main Conclusions
- ProJIVE's maximum likelihood estimation via EM achieved greater accuracy in estimating latent scores and variable loadings compared to R.JIVE, AJIVE, and GIPCA across various simulation settings, including non-Gaussian data.
- In the ADNI application, the joint subject scores derived from brain morphometry and cognition data showed strong statistical associations with key Alzheimer's disease variables, validating the biological relevance of the extracted shared variation.
- The model provides a formal statistical framework where quantities like joint subject scores (potential prodromes) and variable loadings (drivers of variation) are directly modeled, enhancing interpretability over algorithmic decompositions.
Abstract: Collecting multiple types of data on the same set of subjects is common in modern scientific applications including genomics, metabolomics, and neuroimaging. Joint and Individual Variation Explained (JIVE) seeks a low-rank approximation of the joint variation between two or more sets of features captured on common subjects and isolates this variation from that unique to each set of features. We develop an expectation-maximization (EM) algorithm to estimate a probabilistic model for the JIVE framework. The model extends probabilistic PCA to multiple datasets. Our maximum likelihood approach simultaneously estimates joint and individual components, which can lead to greater accuracy compared to other methods. We apply ProJIVE to measures of brain morphometry and cognition in Alzheimer’s disease. ProJIVE learns biologically meaningful sources of variation, and the joint morphometry and cognition subject scores are strongly related to more expensive existing biomarkers. Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. Code to reproduce the analysis is available at https://github.com/thebrisklab/ProJIVE. Supplementary materials for this article are available online.
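The following is a minimal sketch of how an EM algorithm can estimate joint and individual components simultaneously. It is not the authors' implementation (their code is in the repository linked above). It assumes two blocks, Gaussian latent scores with identity covariance, and isotropic noise within each block. The function name `joint_individual_em`, its arguments, and the simulated data are all hypothetical. Each block loads on the shared (joint) latent columns plus its own individual columns, and the E-step and M-step update every parameter in one loop.

```python
import numpy as np


def joint_individual_em(X1, X2, r_joint, r1, r2, n_iter=50, seed=0):
    """EM for a two-block joint/individual latent model (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Center each block; the sample mean is the MLE of the block mean.
    blocks = [X1 - X1.mean(axis=0), X2 - X2.mean(axis=0)]
    X = np.hstack(blocks)
    n, p_total = X.shape
    p = [b.shape[1] for b in blocks]
    r = r_joint + r1 + r2
    # Latent columns each block may load on: the joint ones plus its own.
    joint = np.arange(r_joint)
    active = [
        np.concatenate([joint, r_joint + np.arange(r1)]),
        np.concatenate([joint, r_joint + r1 + np.arange(r2)]),
    ]
    W = [np.zeros((pk, r)) for pk in p]
    for k in range(2):
        W[k][:, active[k]] = rng.normal(scale=0.1, size=(p[k], active[k].size))
    sigma2 = [float(b.var()) for b in blocks]
    loglik = []

    for _ in range(n_iter):
        W_full = np.vstack(W)  # stacked loadings, shape (p1 + p2, r)
        d = np.concatenate([np.full(p[k], sigma2[k]) for k in range(2)])

        # Marginal log-likelihood at the current parameters: x ~ N(0, W W^T + D).
        C = W_full @ W_full.T + np.diag(d)
        _, logdet = np.linalg.slogdet(C)
        quad = np.sum(X * np.linalg.solve(C, X.T).T)
        loglik.append(-0.5 * (n * (p_total * np.log(2 * np.pi) + logdet) + quad))

        # E-step: posterior mean and second moment of the latent scores.
        Wt_Dinv = W_full.T / d
        cov_z = np.linalg.inv(np.eye(r) + Wt_Dinv @ W_full)
        Ez = X @ Wt_Dinv.T @ cov_z
        Szz = n * cov_z + Ez.T @ Ez

        # M-step: update loadings on active columns, then the noise variance.
        for k in range(2):
            a = active[k]
            W[k][:, a] = np.linalg.solve(Szz[np.ix_(a, a)], Ez[:, a].T @ blocks[k]).T
            resid = (
                np.sum(blocks[k] ** 2)
                - 2.0 * np.sum((blocks[k] @ W[k]) * Ez)
                + np.sum((W[k].T @ W[k]) * Szz)
            )
            sigma2[k] = resid / (n * p[k])

    # Ez holds the posterior mean scores from the final E-step.
    return W, sigma2, Ez, np.array(loglik)


# Simulated example (made-up data): one joint and one individual component per block.
rng = np.random.default_rng(1)
n, p1, p2 = 300, 12, 8
z_joint = rng.normal(size=(n, 1))
X1 = (z_joint @ rng.normal(size=(1, p1))
      + rng.normal(size=(n, 1)) @ rng.normal(size=(1, p1))
      + 0.5 * rng.normal(size=(n, p1)))
X2 = (z_joint @ rng.normal(size=(1, p2))
      + rng.normal(size=(n, 1)) @ rng.normal(size=(1, p2))
      + 0.5 * rng.normal(size=(n, p2)))

W, sigma2, scores, loglik = joint_individual_em(X1, X2, r_joint=1, r1=1, r2=1)
```

Because the loading and noise updates maximize the expected complete-data log-likelihood exactly, the recorded `loglik` trace should never decrease, which is a useful sanity check. The sketch does not address identifiability constraints (such as rotation of the loadings) or rank selection.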