CARACal - The Containerized Automated Radio Astronomy Calibration Pipeline

CARACal has been under development since mid-2017 and was publicly released in May 2020. The project started with the goal of allowing the MeerKAT Fornax and MHONGOOSE Large Survey Project (LSP) teams to produce top-quality radio images and data cubes from their raw data in a standardized, easy, and reproducible manner. This goal has now been achieved.

However, CARACal has exceeded expectations in its scope, its acceptance, and its educational value. CARACal uses the Stimela containerized pipeline framework (see contribution by Makathini et al. at this conference), which makes it possible to combine the best available software packages, including third-party ones, into a single pipeline. CARACal is open-source and freely available not only to the LSP teams but to anyone who wants to utilise it. It is highly flexible and tunable, can reduce data from more than one telescope (including uGMRT and JVLA), and includes scientific analysis steps beyond the mere production of images (to date, source finding and characterisation). Because CARACal is open-source and relies on containerization technology, any data reduction is transparent and reproducible. The development team includes computer scientists as well as astronomers at various academic levels, from students to professors. This, together with good review strategies, facilitates knowledge exchange on all sides and provides the feedback required to keep the pipeline user-friendly and to maintain an appropriate standard of programming style.

In this paper we discuss the functionality of CARACal and its development structure, including crucial ingredients as well as pitfalls, and outline future steps towards full support of the FAIR principles, in particular interfacing with archives and standardized metadata.

Theme – Data Processing Pipelines and Science-Ready Data, Data Interoperability, Open Source Software and Community Development in Astronomy