
Data Version Management and Machine-Actionable Reproducibility for HPC based on git and DataLad


Source: arXiv
Abstract

We present the adaptation of an existing data versioning and machine-actionable reproducibility solution for HPC. Both aspects are important for research data management, and the DataLad tool provides both on top of the widely used git version control system. However, DataLad is incompatible with HPC batch processing. The presented extension enables DataLad's versioning and reproducibility in conjunction with the HPC batch scheduling system Slurm. It resolves a fundamental incompatibility as well as inefficient behavior patterns on parallel file systems.

Andreas Knüpfer, Timothy J. Callow

Computing technology, computer technology

Andreas Knüpfer, Timothy J. Callow. Data Version Management and Machine-Actionable Reproducibility for HPC based on git and DataLad [EB/OL]. (2025-05-10) [2025-06-05]. https://arxiv.org/abs/2505.06558.
