Hangar at VIRGO

Hangar is an open-source data versioning tool geared towards reproducibility and collaboration on numerical datasets, with semantics similar to those of git. Its intrinsic adaptability to different data structures makes Hangar a valuable tool for any physics data analysis pipeline. It also provides a flexible framework for machine learning projects, allowing users to choose the training and test sets best suited to the goal at hand.

This work is a collaboration between VIRGO/EGO and OROBIX within the ESCAPE H2020 project. We present the integration of Hangar into a machine learning pipeline for gravitational-wave physics at VIRGO, which gives scientists git-like features to manage and control their experimental data. Within the VIRGO Collaboration, the data versioning provided by Hangar is also envisioned for use in general data processing pipelines, where it would let users browse different versions of scientific data depending on the calibration process.

Theme – Machine Learning, Statistics, and Algorithms; Data Processing Pipelines and Science-Ready Data; Data Interoperability