An AI Implementation Science Study to Improve Trustworthy Data in a Large Healthcare System
Georgia Institute of Technology, Atlanta, GA, USA | Shriners Hospitals for Children, Tampa, FL, USA
The 30-Second View
IN SHORT: This paper addresses the critical gap between theoretical AI research and real-world clinical implementation by providing a practical framework for assessing and improving healthcare data quality using trustworthy AI principles.
Innovation (TL;DR)
- Methodology: Developed a Python-based extension of OHDSI's Data Quality Dashboard (DQD) that integrates the METRIC framework for trustworthy AI assessment, addressing informative missingness, redundancy, timeliness, and distributional consistency.
- Methodology: Implemented a real-world case study modernizing a large pediatric healthcare system's Research Data Warehouse from OMOP CDM v5.1/5.2 to v5.4 within Microsoft Fabric, raising the data quality test success rate by about 4 percentage points (84.78% to 88.88%).
- Biology: Showed, in the Craniofacial Microsomia (CFM) case study, that harmonizing data with OMOP CDM concept codes left AI model performance largely unchanged (mean AUROC: 71.3% with source codes vs. 70.0% with OMOP codes) while increasing interoperability.
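To make "informative missingness" concrete, here is a minimal Python sketch of one way such a check could look: it compares how often a field is missing across outcome groups, since a large gap suggests the missingness itself carries signal. This is an illustration only, not the authors' tool or the DQD API; the function names, field names, and toy records are all hypothetical.

```python
from collections import defaultdict


def missingness_by_outcome(records, field, outcome):
    """Return {outcome value: fraction of records in that group where `field` is missing}.

    A value counts as missing when the key is absent or its value is None.
    """
    totals = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        group = rec[outcome]
        totals[group] += 1
        if rec.get(field) is None:
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


def missingness_gap(records, field, outcome):
    """Largest difference in missingness rate between any two outcome groups."""
    rates = missingness_by_outcome(records, field, outcome)
    return max(rates.values()) - min(rates.values())


# Toy, made-up records (hypothetical field names, not data from the study).
toy_records = [
    {"has_condition": 1, "lab_value": None},
    {"has_condition": 1, "lab_value": None},
    {"has_condition": 1, "lab_value": 4.2},
    {"has_condition": 1, "lab_value": None},
    {"has_condition": 0, "lab_value": 3.9},
    {"has_condition": 0, "lab_value": 4.1},
    {"has_condition": 0, "lab_value": None},
    {"has_condition": 0, "lab_value": 4.0},
]

rates = missingness_by_outcome(toy_records, "lab_value", "has_condition")
gap = missingness_gap(toy_records, "lab_value", "has_condition")
print(rates, gap)
```

In this toy data the field is missing far more often in one group than the other, which is the pattern a missingness-aware quality check would flag for review before model training.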
Key conclusions
- Modernizing the Shriners Children's (SC) OMOP CDM database from v5.1/5.2 to v5.4 raised the overall data quality success rate by about 4 percentage points (84.78% to 88.88%) and the conformance success rate by roughly 7.4 percentage points (80.73% to 88.09%).
- Data harmonization using OMOP CDM concept codes maintained comparable AI model performance (mean AUROC difference: 1.3 percentage points) while enabling better interoperability across healthcare systems.
- Only 50% of ICD-9 codes shared common mappings with ICD-10 codes, revealing significant vocabulary transition challenges that could degrade AI model performance when a model encounters mixed coding systems.
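The reported gains are differences between two percentages, so it helps to separate percentage-point change from relative change. The short Python sketch below recomputes both from the figures quoted in this summary; the helper names are illustrative and not part of any tool described in the paper.

```python
def point_change(before, after):
    """Absolute change between two percentages, in percentage points."""
    return after - before


def relative_change(before, after):
    """Change relative to the starting value, expressed in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs taken from the figures quoted in this summary.
reported = {
    "overall success rate": (84.78, 88.88),
    "conformance success rate": (80.73, 88.09),
    "mean AUROC, source codes vs. OMOP codes": (71.3, 70.0),
}

for name, (before, after) in reported.items():
    print(
        f"{name}: {point_change(before, after):+.2f} points, "
        f"{relative_change(before, after):+.2f}% relative"
    )
```

Reading the results as percentage points keeps the three comparisons on the same scale; the relative figures are larger in magnitude because they are divided by a starting value below 100.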
Abstract: The rapid growth of Artificial Intelligence (AI) in healthcare has sparked interest in Trustworthy AI and AI Implementation Science, both of which are essential for accelerating clinical adoption. Yet, barriers such as strict regulations, gaps between research and clinical settings, and challenges in evaluating AI systems hinder real-world implementation. This study presents an AI implementation case study within Shriners Children’s (SC), a large multisite pediatric system, showcasing the modernization of SC’s Research Data Warehouse (RDW) to OMOP CDM v5.4 within a secure Microsoft Fabric environment. We introduce a Python-based data quality assessment tool compatible with SC’s infrastructure, an extension of OHDSI’s R/Java-based Data Quality Dashboard (DQD) that integrates Trustworthy AI principles using the METRIC framework. This extension enhances data quality evaluation by addressing informative missingness, redundancy, timeliness, and distributional consistency. We also compare systematic and case-specific AI implementation strategies for Craniofacial Microsomia (CFM) using the FHIR standard. Our contributions include a real-world evaluation of AI implementations, integration of Trustworthy AI in data quality assessment, and evidence-based insights into hybrid implementation strategies, highlighting the need to blend systematic infrastructure with use-case-driven approaches to advance AI in healthcare.