Michèle Sanguillon

Software engineer. Co-author of the IVOA ProvenanceDM.

Affiliation – LUPM - CNRS


Storing Provenance information in a data processing workflow: one CTA use case

Assessing data quality and reproducing the processing both require a good understanding of the provenance of the data and of the processing steps that produced them. This provenance information has to be managed at several levels: capture, storage, selection, distribution and visualization.
In this work, we focus on storing the provenance information returned by the reconstruction pipeline, one of the many CTA pipelines, for a prototype of the CTA observatory. This pipeline is launched via the CTADIRAC interware, based on DIRAC, in order to optimize the processing on different infrastructures such as the European Grid Infrastructure (EGI). The provenance information is ingested into an independent external database, where it is stored in conformance with the ProvenanceDM data model defined by the International Virtual Observatory Alliance (IVOA).
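
As an illustration only, the ingestion step described above could be sketched as follows. This is not the actual CTA implementation: Python's sqlite3 module stands in for the external database, the table and column names are a simplified assumption loosely modelled on the ProvenanceDM core classes (Activity, Entity) and relations (Used, WasGeneratedBy), and the layout of the record as well as all identifiers in it are hypothetical.

```python
import sqlite3

# Simplified, assumed schema: two core classes and two relations.
SCHEMA = """
CREATE TABLE activity (id TEXT PRIMARY KEY, name TEXT,
                       start_time TEXT, end_time TEXT);
CREATE TABLE entity (id TEXT PRIMARY KEY, name TEXT, location TEXT);
CREATE TABLE used (activity_id TEXT REFERENCES activity(id),
                   entity_id TEXT REFERENCES entity(id), role TEXT);
CREATE TABLE was_generated_by (entity_id TEXT REFERENCES entity(id),
                               activity_id TEXT REFERENCES activity(id),
                               role TEXT);
"""


def ingest(conn, record):
    """Store one provenance record (a plain dict) in a single transaction."""
    act = record["activity"]
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO activity (id, name, start_time, end_time) "
            "VALUES (?, ?, ?, ?)",
            (act["id"], act["name"], act["start_time"], act["end_time"]),
        )
        for relation, key in (("used", "inputs"),
                              ("was_generated_by", "outputs")):
            for ent in record[key]:
                # An entity may already be known from an earlier activity.
                conn.execute(
                    "INSERT OR IGNORE INTO entity (id, name, location) "
                    "VALUES (?, ?, ?)",
                    (ent["id"], ent["name"], ent["location"]),
                )
                conn.execute(
                    f"INSERT INTO {relation} (activity_id, entity_id, role) "
                    "VALUES (?, ?, ?)",
                    (act["id"], ent["id"], ent["role"]),
                )


def inputs_of(conn, entity_id):
    """Return the ids of the entities used to produce the given entity."""
    rows = conn.execute(
        "SELECT u.entity_id FROM was_generated_by AS g "
        "JOIN used AS u ON u.activity_id = g.activity_id "
        "WHERE g.entity_id = ? ORDER BY u.entity_id",
        (entity_id,),
    )
    return [row[0] for row in rows]


# Hypothetical record, as a pipeline step might report it.
record = {
    "activity": {"id": "act-001", "name": "reconstruction",
                 "start_time": "2020-01-01T00:00:00",
                 "end_time": "2020-01-01T00:10:00"},
    "inputs": [
        {"id": "r0_run42", "name": "raw events",
         "location": "/data/r0_run42.fits", "role": "events"},
        {"id": "calib_v1", "name": "calibration",
         "location": "/data/calib_v1.fits", "role": "calibration"},
    ],
    "outputs": [
        {"id": "dl1_run42", "name": "reconstructed events",
         "location": "/data/dl1_run42.fits", "role": "result"},
    ],
}

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
ingest(conn, record)
lineage = inputs_of(conn, "dl1_run42")
```

Keeping the relations in their own tables lets a single join answer the typical provenance question, namely which inputs a given data product was derived from.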

Practical Provenance in Astronomy

Recently the IVOA released a standard to structure provenance metadata, and several implementations are in development to capture, store, access and visualize the provenance of astronomy data products. This BoF will focus on practical needs for provenance in astronomy. A growing number of projects express the requirement to provide FAIR data (Findable, Accessible, Interoperable and Reusable) and must therefore manage provenance information to ensure the quality, reliability and trustworthiness of these data. The concepts are in place, but applied specifications and practical tools are now needed to answer concrete use cases. We propose to discuss which strategies your projects (observatories or data providers) are considering to capture provenance in your context, and how you expect an end user to query the provenance information to improve her/his data selection and retrieval. The objective is to identify the tools and formats that now need to be developed to make provenance more practical.

If you are interested in participating in this BoF on provenance, please fill in this questionnaire: https://frama.link/vqMBEqJg

This BoF is intended to be an open discussion. Apart from a short introduction, there will be no presentations. Detailed presentations were given in September during a provenance workshop within the ESCAPE European project; the summary (with access to all contributions) is available here: https://indico.in2p3.fr/event/21913/page/2641-summary

For more details on the IVOA Provenance Data Model, the recommendation is accessible here: https://www.ivoa.net/documents/ProvenanceDM