Genozip Dual-Coordinate VCF format enables efficient genomic analyses and alleviates liftover limitations


Source: bioRxiv
Abstract

We introduce Dual Coordinate VCF (DVCF), a file format that records genomic variants against two different reference genomes simultaneously and is fully compliant with the current VCF specification. As implemented in the Genozip platform, DVCF enables bioinformatics pipelines to seamlessly operate across two coordinate systems by leveraging the system most advantageous to each pipeline step, simplifying bioinformatics workflows and reducing file generation and associated data storage burden. Moreover, our benchmarking of Genozip DVCF shows that it produces more complete, less erroneous, and less biased translations across coordinate systems than two widely used alternative tools (i.e., LiftoverVcf and CrossMap).

Souilmi Yassine, Llamas Bastien, Purnomo Gludhug, Lan Divon Mordechai, Tobler Raymond

10.1101/2022.07.17.500374

Genetics; Bioengineering; Computing Technology, Computer Technology

Souilmi Yassine, Llamas Bastien, Purnomo Gludhug, Lan Divon Mordechai, Tobler Raymond. Genozip Dual-Coordinate VCF format enables efficient genomic analyses and alleviates liftover limitations [EB/OL]. (2025-03-28) [2025-05-14]. https://www.biorxiv.org/content/10.1101/2022.07.17.500374.
