HPC-oriented Canonical Workflows for Machine Learning Applications in Climate and Weather Prediction
Machine learning (ML) applications in weather and climate are gaining momentum as big data and the immense increase in high-performance computing (HPC) power pave the way. Ensuring FAIR data and reproducible ML practices is a significant challenge for Earth system researchers. Even though the FAIR principles are well known to many scientists, research communities are slow to adopt them. The Canonical Workflow Framework for Research (CWFR) provides a platform to ensure the FAIRness and reproducibility of these practices without overwhelming researchers. This conceptual paper envisions a holistic CWFR approach towards ML applications in weather and climate, focusing on HPC and big data. Specifically, we discuss FAIR Digital Objects (FDOs) and Research Objects (ROs) in the DeepRain project to achieve granular reproducibility. DeepRain is a project that aims to improve precipitation forecasts in Germany by using ML. Our concept envisages the raster datacube to provide data harmonization and fast, scalable data access. We suggest the Jupyter notebook as a single reproducible experiment. In addition, we envision JupyterHub as a scalable and distributed central platform that connects all these elements and the HPC resources to the researchers via an easy-to-use graphical interface.
Amirpasha Mozaffari, Peter Baumann, Michael Langguth, Martin G. Schultz, Adrian Rojas Campos, Martin Wittenbrink, Pascal Nieters, Otoniel José Campos Escobar, Bing Gong, Jessica Ahring
Atmospheric science (meteorology); computing technology, computer technology
FAIR; Reproducibility; Machine learning; Earth system sciences; Workflow
Amirpasha Mozaffari, Peter Baumann, Michael Langguth, Martin G. Schultz, Adrian Rojas Campos, Martin Wittenbrink, Pascal Nieters, Otoniel José Campos Escobar, Bing Gong, Jessica Ahring. HPC-oriented Canonical Workflows for Machine Learning Applications in Climate and Weather Prediction [EB/OL]. (2022-11-28) [2025-05-04]. https://chinaxiv.org/abs/202211.00440.