Storing provenance information in a data processing workflow: a CTA use case

Assessing data quality and reproducing the processing both require a good understanding of the provenance of the data and of the processing steps that produced them. Provenance information is managed at several levels: capture, storage, selection, distribution and visualization.
In this work, we focus on storing the provenance information returned by the reconstruction pipeline, one of the many pipelines of the Cherenkov Telescope Array (CTA), for a prototype of the CTA observatory. This pipeline is launched via the CTADIRAC interware, which is based on DIRAC, in order to optimize the processing on different infrastructures such as the European Grid Infrastructure (EGI). The provenance information is ingested into an independent external database and conforms to the ProvenanceDM data model defined by the International Virtual Observatory Alliance (IVOA).
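
As an illustration only, the storage step can be sketched as a small relational schema built around two core classes of the IVOA provenance model (Activity and Entity) and the relations linking them (used, wasGeneratedBy). The sketch below uses Python's built-in sqlite3 module; the table layout, the function names ingest and generating_activity, and the record contents are assumptions made for this example and do not describe the actual CTA database or the CTADIRAC interface.

```python
import sqlite3

# Hypothetical, simplified schema: one table per class (Activity, Entity)
# and one per relation (used, wasGeneratedBy). Not the actual CTA schema.
SCHEMA = """
CREATE TABLE activity (
    id TEXT PRIMARY KEY, name TEXT, start_time TEXT, end_time TEXT);
CREATE TABLE entity (
    id TEXT PRIMARY KEY, name TEXT, location TEXT);
CREATE TABLE used (
    activity_id TEXT REFERENCES activity(id),
    entity_id TEXT REFERENCES entity(id));
CREATE TABLE was_generated_by (
    entity_id TEXT REFERENCES entity(id),
    activity_id TEXT REFERENCES activity(id));
"""


def ingest(conn, record):
    """Store one activity together with its input and output entities."""
    act = record["activity"]
    conn.execute(
        "INSERT INTO activity (id, name, start_time, end_time) "
        "VALUES (?, ?, ?, ?)",
        (act["id"], act["name"], act["start_time"], act["end_time"]),
    )
    for key, relation in (("inputs", "used"), ("outputs", "was_generated_by")):
        for ent in record[key]:
            # An entity may already be known as the output of an earlier step.
            conn.execute(
                "INSERT OR IGNORE INTO entity (id, name, location) "
                "VALUES (?, ?, ?)",
                (ent["id"], ent["name"], ent["location"]),
            )
            conn.execute(
                f"INSERT INTO {relation} (activity_id, entity_id) "
                "VALUES (?, ?)",
                (act["id"], ent["id"]),
            )
    conn.commit()


def generating_activity(conn, entity_id):
    """Return the name of the activity that produced an entity, or None."""
    row = conn.execute(
        "SELECT a.name FROM was_generated_by w "
        "JOIN activity a ON a.id = w.activity_id WHERE w.entity_id = ?",
        (entity_id,),
    ).fetchone()
    return row[0] if row else None


# Hypothetical record, standing in for what a pipeline job might return.
EXAMPLE_RECORD = {
    "activity": {
        "id": "job-0001", "name": "reconstruction",
        "start_time": "2018-01-01T00:00:00", "end_time": "2018-01-01T00:10:00",
    },
    "inputs": [{"id": "raw-0001", "name": "raw_events", "location": "/in/raw"}],
    "outputs": [{"id": "rec-0001", "name": "reco_events", "location": "/out/rec"}],
}

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    ingest(conn, EXAMPLE_RECORD)
    print(generating_activity(conn, "rec-0001"))
```

Keeping the relations in their own tables lets a query walk from an output file back to the activity that produced it and to the inputs that activity used, which is the kind of lookup needed for quality assessment and reproducibility.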

Theme – Data Interoperability