
Poisson Hierarchical Indian Buffet Processes-With Indications for Microbiome Species Sampling Models

Source: arXiv
Abstract

We introduce the Poisson Hierarchical Indian Buffet Process (PHIBP), a new class of species sampling models designed to address the challenges of complex, sparse count data by facilitating information sharing across and within groups. Our theoretical developments enable a tractable Bayesian nonparametric framework with machine learning elements, accommodating a potentially infinite number of species (taxa) whose parameters are learned from data. Focusing on microbiome analysis, we address key gaps by providing a flexible multivariate count model that accounts for overdispersion and robustly handles diverse data types (OTUs, ASVs). We introduce novel parameters reflecting species abundance and diversity. The model borrows strength across groups while explicitly distinguishing between technical and biological zeros to interpret sparse co-occurrence patterns. This results in a framework with tractable posterior inference, exact generative sampling, and a principled solution to the unseen species problem. We describe extensions where domain experts can incorporate knowledge through covariates and structured priors, with potential for strain-level analysis. While motivated by ecology, our work provides a broadly applicable methodology for hierarchical count modeling in genetics, commerce, and text analysis, and has significant implications for the broader theory of species sampling models arising in probability and statistics.

Abhinav Pandey, Juho Lee, Lancelot F. James

Subjects: Microbiology; Biological Science Theory; Biological Science Methods

Abhinav Pandey, Juho Lee, Lancelot F. James. Poisson Hierarchical Indian Buffet Processes-With Indications for Microbiome Species Sampling Models [EB/OL]. (2025-08-25) [2025-09-05]. https://arxiv.org/abs/2502.01919.