Nyxus: A Next Generation Image Feature Extraction Library for the Big Data and AI Era
Axle Research | NovaGen Research Fund | NCATS
30-Second Read
IN SHORT: This paper tackles the difficulty of efficiently extracting standardized, comparable features from massive (terabyte- to petabyte-scale) biomedical imaging datasets, a task hindered by fragmented, non-scalable domain-specific libraries.
Key Innovations
- Methodology: Introduces a unified, scalable out-of-core feature extraction library (Nyxus), designed from the ground up for big 2D/3D image data and supporting both the radiomics and cellular analysis domains.
- Methodology: Enables programmatic tuning of feature hyperparameters for either computational efficiency or feature coverage, supporting novel AI/ML applications.
- Methodology: Provides several access routes for different skill levels and for cloud/HPC workflows: a Python package, a CLI, a Napari plugin, and an OCI-compliant container.
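To make "out-of-core" concrete, the following is a minimal sketch of the general idea, not Nyxus's actual implementation or API: per-region statistics are accumulated tile by tile, so the full image never has to sit in memory at once. The function names (`iter_tiles`, `roi_stats`), the strip-shaped tiles, and the convention that label 0 is background are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

# A tile is a pair of (intensity rows, label rows) with matching shapes.
Tile = Tuple[List[List[float]], List[List[int]]]


def iter_tiles(intensity: List[List[float]],
               labels: List[List[int]],
               tile_rows: int) -> Iterator[Tile]:
    """Yield horizontal strips; a stand-in for reading tiles from disk."""
    for start in range(0, len(intensity), tile_rows):
        stop = start + tile_rows
        yield intensity[start:stop], labels[start:stop]


def roi_stats(tiles: Iterator[Tile]) -> Dict[int, Dict[str, float]]:
    """Accumulate per-ROI area, mean, and variance one tile at a time."""
    count: Dict[int, int] = defaultdict(int)
    total: Dict[int, float] = defaultdict(float)
    total_sq: Dict[int, float] = defaultdict(float)
    for intensity_rows, label_rows in tiles:
        for intensity_row, label_row in zip(intensity_rows, label_rows):
            for value, roi in zip(intensity_row, label_row):
                if roi == 0:  # assumption: label 0 marks background
                    continue
                count[roi] += 1
                total[roi] += value
                total_sq[roi] += value * value
    stats: Dict[int, Dict[str, float]] = {}
    for roi, n in count.items():
        mean = total[roi] / n
        # Population variance from running sums; clamp tiny negative round-off.
        variance = max(total_sq[roi] / n - mean * mean, 0.0)
        stats[roi] = {"area": n, "mean": mean, "variance": variance}
    return stats


# Toy 4x4 image with two labeled regions (1 and 2).
intensity = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 5.0],
    [0.0, 0.0, 0.0, 7.0],
    [0.0, 0.0, 9.0, 11.0],
]
labels = [
    [1, 1, 0, 0],
    [1, 1, 0, 2],
    [0, 0, 0, 2],
    [0, 0, 2, 2],
]

tiled = roi_stats(iter_tiles(intensity, labels, tile_rows=1))
whole = roi_stats(iter_tiles(intensity, labels, tile_rows=4))
print(tiled)
```

Processing one row at a time gives the same statistics as processing the whole image, which is the property that lets a region spanning several tiles be handled correctly. Only features expressible as running sums work this simply; texture and shape features need more bookkeeping, and a production implementation would likely prefer a numerically stabler variance update.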
Main Conclusions
- Nyxus outperforms domain-specific tools in speed while calculating more features: on the TissueNet dataset, it was 3x to 35x faster than CellProfiler in default mode and 58x to 131x faster in optimized ('targeted') mode for intensity and texture features.
- The library scales with hardware: performance gains plateau after about 10 CPU threads, and GPU acceleration provides up to a 3x speedup for suitable ROI sizes (e.g., a small number of large regions, each larger than roughly 5,000 pixels).
- Nyxus implements the broadest feature set among tested libraries (261 features) and includes an IBSI-compliant profile for radiomics, addressing the critical need for standardization and reproducibility in quantitative image analysis.
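The summary does not say why thread scaling levels off at around 10 threads. One common way to reason about this kind of plateau is Amdahl's law, which bounds the speedup by the fraction of work that cannot be parallelized. The sketch below is purely illustrative: the parallel fraction of 0.9 is a hypothetical value, not a measurement from the paper, and other causes such as memory bandwidth or I/O limits are equally plausible.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Ideal speedup on `threads` workers when only part of the work parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


# Hypothetical value chosen for illustration only (not from the paper).
PARALLEL_FRACTION = 0.9

for threads in (1, 2, 4, 8, 10, 16, 32, 64):
    speedup = amdahl_speedup(PARALLEL_FRACTION, threads)
    print(f"{threads:>2} threads: {speedup:.2f}x")
```

Under this assumed fraction, 10 threads already deliver a speedup of about 5.3x, while no thread count can exceed 10x, so each additional thread buys progressively less.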
Abstract: Modern imaging instruments can produce terabytes to petabytes of data for a single experiment. The biggest barrier to processing big image datasets has been computational: image analysis algorithms often lack the efficiency needed to process such large datasets, or they make tradeoffs in robustness and accuracy. Deep learning algorithms have vastly improved the accuracy of the first step in an analysis workflow (region segmentation), but the expansion of domain-specific feature extraction libraries across scientific disciplines has made it difficult to compare the performance and accuracy of extracted features. To address these needs, we developed a novel feature extraction library called Nyxus. Nyxus is designed from the ground up for scalable out-of-core feature extraction for 2D and 3D image data and is rigorously tested against established standards. The comprehensive feature set of Nyxus covers multiple biomedical domains, including radiomics and cellular analysis, and is designed for computational scalability across CPUs and GPUs. Nyxus has been packaged to be accessible to users of various skill sets and needs: as a Python package for code developers, as a command line tool, as a Napari plugin for low- to no-code users or users who want to visualize results, and as an Open Container Initiative (OCI) compliant container that can be used in cloud or supercomputing workflows aimed at processing large datasets. Further, Nyxus enables a new methodological approach to feature extraction, allowing programmatic tuning of many feature sets for optimal computational efficiency or coverage for use in novel machine learning and deep learning applications.