Construction and Evaluation of a Non-Perfect Matching Core Genome Typing Framework for Mycobacterium
王越卓 (Wang Yuezhuo) 1, 周哲敏 (Zhou Zhemin) 1
Author information
- 1. Cancer Institute, Suzhou Medical College, Soochow University, Suzhou 215000
Abstract
Objective: Traditional typing methods lack the resolution needed for cross-species comparison within the genus Mycobacterium. This study constructs a non-perfect matching core genome multilocus sequence typing (cgMLST) framework applicable to the entire genus. Methods: A total of 4,841 high-quality Mycobacterium genomes were collected. After species verification and redundancy removal, 1,663 representative genomes were retained. Pan-genome analysis identified 181 core genes (present in ≥90% of strains). Alleles were clustered and merged at three nucleotide identity thresholds (100%, 99%, and 95%) to establish the non-perfect matching cgMLST scheme. Results: A non-perfect matching cgMLST framework for Mycobacterium comprising 181 core genes was constructed. At the 99% threshold, the number of alleles fell from the original 988,573 to 613,146; at the 95% threshold, it fell to 453,898. Because the framework tolerates moderate sequence differences among alleles, it avoids the loss of distantly related genomes caused by the strict exact-matching requirement of traditional methods. Conclusion: This study establishes, for the first time, a non-perfect matching core genome typing framework for the genus Mycobacterium, providing a quantifiable typing tool for genus-wide population structure analysis, species delineation, and horizontal gene transfer studies.
Keywords
Mycobacterium/non-perfect matching core genome multilocus sequence typing/species delineation/allele clustering/phylogeny
Cite this article
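Wang Yuezhuo, Zhou Zhemin. Construction and Evaluation of a Non-Perfect Matching Core Genome Typing Framework for Mycobacterium [EB/OL]. (2026-06-23) [2026-06-25]. http://www.paper.edu.cn/releasepaper/content/202606-64.
Subject classification
Medical and health theory/Medical research methods/Microbiology/Biological science research methods and techniques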
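
Illustrative sketch
The abstract describes merging alleles at nucleotide identity thresholds of 100%, 99%, and 95%, but does not say which clustering algorithm was used. The Python sketch below is only one way such merging could work, not the authors' method. It assumes greedy representative-based clustering, and it assumes the alleles of a locus are already aligned to equal length so that identity is the fraction of matching positions. The names `identity` and `merge_alleles` and the toy sequences are hypothetical.

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def merge_alleles(alleles: Dict[str, str], threshold: float) -> Dict[str, str]:
    """Greedy clustering (an assumption, not the paper's stated algorithm).

    Returns a mapping from each allele ID to the ID of its cluster representative.
    An allele joins the first representative it matches at >= threshold identity;
    otherwise it becomes a new representative.
    """
    representatives: List[str] = []
    assignment: Dict[str, str] = {}
    # Visit alleles in sorted ID order so the result is reproducible.
    for allele_id in sorted(alleles):
        seq = alleles[allele_id]
        for rep_id in representatives:
            if identity(seq, alleles[rep_id]) >= threshold:
                assignment[allele_id] = rep_id
                break
        else:
            representatives.append(allele_id)
            assignment[allele_id] = allele_id
    return assignment


if __name__ == "__main__":
    # Toy 20-bp alleles of one hypothetical locus.
    toy = {
        "a1": "ACGTACGTACGTACGTACGT",
        "a2": "ACGTACGTACGTACGTACGA",  # 1 mismatch against a1
        "a3": "TTTTACGTACGTACGTAAAA",  # 6 mismatches against a1
    }
    for threshold in (1.0, 0.99, 0.95):
        merged = merge_alleles(toy, threshold)
        print(threshold, len(set(merged.values())), merged)
```

Lowering the threshold can only keep or reduce the number of distinct allele designations for a locus, which matches the direction of the reductions reported in the abstract. Greedy clustering depends on visiting order, so a production scheme would need a fixed, documented ordering rule.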