Implementing a NUFFT in Dask for Radio Astronomy Applications

The Common Astronomy Software Applications (CASA) package is widely used for radio astronomy data reduction and analysis. It has a Python user interface and is implemented in C++ (with some Fortran). CASA has been under continual development for more than 14 years and has consequently become complex and difficult to maintain. To demonstrate that the core functionality of CASA can be successfully re-implemented, the non-uniform FFT (NUFFT) was chosen as a representative function.

A pure Python framework was chosen for this test implementation because of the language's near-ubiquitous use in the astronomy community. One disadvantage of pure Python is that it is an interpreted language, which can be slow relative to compiled languages. To overcome this limitation, we adopt a framework built on the following Python packages: Zarr, Numba, and Dask. Zarr stores the visibility data as chunked N-dimensional arrays. Where code cannot easily be vectorized, Numba provides just-in-time compilation. Concurrency is achieved with Dask, a dynamic task scheduling library.
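
The division of labour described above can be illustrated with a minimal sketch. It is not the implementation evaluated here. It assumes nearest-neighbour gridding in place of a proper convolutional gridding kernel, uses in-memory NumPy arrays as stand-ins for Zarr chunks, and falls back to plain serial Python if Numba or Dask is not installed. The names grid_chunk, run_chunks, and dirty_image are hypothetical.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: without Numba, run the same loop as plain (slow) Python.
    def njit(*args, **kwargs):
        return lambda func: func

try:
    import dask

    def run_chunks(func, chunk_args):
        """Schedule one Dask task per chunk and gather the results."""
        return dask.compute(*[dask.delayed(func)(*a) for a in chunk_args])
except ImportError:
    # Assumption: without Dask, process the chunks serially.
    def run_chunks(func, chunk_args):
        return tuple(func(*a) for a in chunk_args)


@njit(nogil=True)
def grid_chunk(u, v, vis, n_pix, cell):
    """Nearest-neighbour gridding of one visibility chunk onto a uv grid.

    The explicit loop is hard to vectorize because several samples can
    land in the same cell, which is where JIT compilation helps.
    """
    grid = np.zeros((n_pix, n_pix), dtype=np.complex128)
    half = n_pix // 2
    for k in range(u.shape[0]):
        iu = int(np.floor(u[k] / cell + 0.5)) + half
        iv = int(np.floor(v[k] / cell + 0.5)) + half
        if 0 <= iu < n_pix and 0 <= iv < n_pix:
            grid[iv, iu] += vis[k]
    return grid


def dirty_image(u, v, vis, n_pix, cell, n_chunks=4):
    """Grid the visibilities chunk by chunk, then inverse-FFT the sum."""
    pieces = zip(
        np.array_split(u, n_chunks),
        np.array_split(v, n_chunks),
        np.array_split(vis, n_chunks),
    )
    args = [(cu, cv, cvis, n_pix, cell) for cu, cv, cvis in pieces]
    grid = sum(run_chunks(grid_chunk, args))  # partial grids add linearly
    image = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(grid)))
    return grid, image.real


# Usage: unit visibilities, i.e. a point source at the phase centre.
rng = np.random.default_rng(0)
u = rng.uniform(-10.0, 10.0, 1000)
v = rng.uniform(-10.0, 10.0, 1000)
vis = np.ones(1000, dtype=np.complex128)
grid, image = dirty_image(u, v, vis, n_pix=64, cell=1.0)
```

Because gridding is linear, each chunk can be gridded independently and the partial grids summed, so only one chunk of visibilities needs to be held per task. In a larger-than-memory setting the chunks would be read lazily from a Zarr store rather than split from an in-memory array.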

For datasets that fit into memory, the Dask NUFFT and the CASA NUFFT have similar performance. When the visibility data are larger than memory, the Dask NUFFT outperforms the CASA NUFFT, and we demonstrate an appreciable speed-up across a set of data sizes and hardware configurations. This may indicate that the open-source framework manages blocked data more efficiently than our tuned custom implementation.

Theme – Cloud Computing at Different Scales, Open Source Software and Community Development in Astronomy