HiCMC: High-Efficiency Contact Matrix Compressor
Chromosome organization plays an important role in biological processes such as replication, regulation, and transcription. One way to study the relationship between chromosome structure and its biological functions is through Hi-C studies, a genome-wide method for capturing chromosome conformations. Such studies generate vast amounts of data. This storage burden is exacerbated by the fact that chromosome organization is dynamic, so snapshots must be taken at different points in time, further increasing the amount of data to be stored. We present a novel approach called the High-Efficiency Contact Matrix Compressor (HiCMC) for efficient compression of Hi-C data. By modeling the underlying structures found in the contact matrix, such as compartments and domains, HiCMC outperforms the state of the art across multiple cell lines and resolutions, improving on CMC by approximately 8% and on cooler, LZMA, and bzip2 by more than 50%. In addition, the domain information embedded in the compressed data can be used to speed up downstream analysis. HiCMC is available at https://github.com/sXperfect/hicmc.
Muntefering Fabian, Ostermann Jorn, Voges Jan, Adhisantoso Yeremia Gunawan, Korner Tim
Biological science research methods and techniques; computing and computer technology; genetics
Muntefering Fabian, Ostermann Jorn, Voges Jan, Adhisantoso Yeremia Gunawan, Korner Tim. HiCMC: High-Efficiency Contact Matrix Compressor [EB/OL]. (2025-03-28) [2025-05-05]. https://www.biorxiv.org/content/10.1101/2023.11.03.565487.