microGWAS: a computational pipeline to perform large scale bacterial genome-wide association studies

Source: bioRxiv
Abstract

Identifying genetic variants associated with bacterial phenotypes, such as virulence, host preference, and antimicrobial resistance, has great potential for a better understanding of the mechanisms involved in these traits. The availability of large collections of bacterial genomes has made genome-wide association studies (GWAS) a common approach for this purpose. The need to employ multiple software tools for data pre- and post-processing limits the application of these methods to experienced bioinformaticians. To address this issue, we have developed a pipeline to perform bacterial GWAS from a set of assemblies and annotations, with multiple phenotypes as targets. The associations are run using five sets of genetic variants: unitigs, gene presence/absence, rare variants (i.e. gene burden test), gene-cluster-specific k-mers, and all unitigs jointly. All variants passing the association threshold are further annotated to identify overrepresented biological processes and pathways. The results can be further augmented by generating a phylogenetic tree and by predicting the presence of antimicrobial resistance and virulence-associated genes. We tested the microGWAS pipeline on a previously reported dataset on E. coli virulence, successfully identifying the causal variants and providing further interpretation of the association results. The microGWAS pipeline integrates state-of-the-art tools to perform bacterial GWAS into a single, user-friendly, and reproducible pipeline, allowing for the democratization of these analyses. The pipeline can be accessed, together with its documentation, at: https://github.com/microbial-pangenomes-lab/microGWAS.

Damaris Bamu F, Fiebig Jenny, Galardini Marco, Burgaya Judit

10.1101/2024.07.08.602456

Biological science research methods and techniques; Computing and computer technology; Microbiology

Damaris Bamu F, Fiebig Jenny, Galardini Marco, Burgaya Judit. microGWAS: a computational pipeline to perform large scale bacterial genome-wide association studies [EB/OL]. (2025-03-28) [2025-06-29]. https://www.biorxiv.org/content/10.1101/2024.07.08.602456.