QuartiCal - embarrassingly parallel calibration of radio interferometer data using Numba and Dask

Calibration in the era of MeerKAT and similar instruments is becoming an increasingly daunting task due to the sheer volume of data they produce. With the SKA fast becoming a reality, it is critical to design and implement calibration algorithms which exploit parallel hardware whilst minimising memory footprint. Furthermore, distributed systems are increasingly necessary to process interferometer data in a reasonable amount of time. QuartiCal is a Python package implementing calibration in this context. To this end, it combines Dask and Numba. Dask, through its schedulers, allows appropriately written code to scale from a laptop to a compute cluster. Numba is a just-in-time compiler for Python/NumPy code which can provide C-like speed whilst retaining much of Python’s simplicity and syntax. Preliminary experiments have shown that QuartiCal, built on these technologies, outperforms its predecessor, CubiCal, in both speed and memory footprint. This result is hard-won, as employing these cutting-edge technologies brings a unique set of challenges in debugging and optimisation. In overcoming these challenges, patterns and practical insights have been developed which may be of use to the community at large.

Theme – Data Processing Pipelines and Science-Ready Data