Marcellin Atemkeng is a researcher with more than 7 years' experience in high-bandwidth signal processing and data science, with a strong background in applied mathematics, computer science, artificial intelligence, and radio interferometric data processing. He is currently developing novel dimensionality reduction algorithms and applying deep learning techniques to SKA big data. He is also a lecturer in machine learning/deep learning and the coordinator of the AI research group (AIRG) at Rhodes University, where he supervises a number of students. Visit the link below to learn more about AIRG, ongoing work, and recent publications. https://www.ru.ac.za/mathematics/research/artificialintelligenceresearchgroupairg/
xova - a Baseline-Dependent Time and Channel Averaging Implementation
Traditional radio interferometric correlators sample the true uv-distribution by averaging the signal over constant, discrete time-frequency intervals. This regular sampling and averaging translates into irregularly gridded samples in uv-space, and results in a baseline-length-dependent loss of amplitude and phase coherence that grows with distance from the image phase centre: this effect is known as decorrelation.
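The baseline-length dependence can be illustrated with the standard first-order result: averaging a unit-amplitude visibility whose phase rotates linearly at rate \(\dot{\phi}\) over an interval \(T\) reduces its amplitude by a sinc factor,

```latex
R \;=\; \frac{1}{T}\left|\int_{-T/2}^{T/2} e^{i\dot{\phi}\,t}\,\mathrm{d}t\right|
  \;=\; \operatorname{sinc}\!\left(\frac{\dot{\phi}\,T}{2}\right),
\qquad \operatorname{sinc}(x) = \frac{\sin x}{x}.
```

Since the phase rate \(\dot{\phi}\) increases with both baseline length and angular distance from the phase centre, long baselines decorrelate fastest, which is why averaging intervals can safely be longer on short baselines.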
In this poster, we discuss the theory behind and implementation of xova, a software package that implements baseline-dependent time and channel averaging (BDA) on Measurement Set data. uv-samples along a baseline track are aggregated into a bin until a specified decorrelation tolerance is exceeded. The degree of decorrelation in the bin in turn determines the amount of channel averaging suitable for the samples in that bin. This implies that the number of channels varies per bin, and the output data loses the rectilinear shape of the input data.
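The binning step can be sketched as follows. This is an illustrative simplification, not xova's actual implementation: the function name, the per-sample decorrelation estimate, and the tolerance value are all hypothetical stand-ins for the real calculation, which derives decorrelation from baseline length and averaging interval.

```python
def bda_time_bins(times, decorr_per_sample, tolerance=0.01):
    """Group consecutive samples along one baseline into bins,
    closing a bin once the accumulated decorrelation estimate
    would exceed `tolerance`. Hypothetical sketch only."""
    bins, current, acc = [], [], 0.0
    for t, d in zip(times, decorr_per_sample):
        # Close the current bin if adding this sample would
        # push decorrelation past the tolerance.
        if current and acc + d > tolerance:
            bins.append(current)
            current, acc = [], 0.0
        current.append(t)
        acc += d
    if current:
        bins.append(current)
    return bins

# A short baseline (low decorrelation per sample) yields long bins;
# a long baseline (high decorrelation per sample) yields short bins.
short_bl = bda_time_bins(range(8), [0.002] * 8)   # few, long bins
long_bl = bda_time_bins(range(8), [0.009] * 8)    # many, short bins
```

In xova, an analogous decision also sets the channel-averaging factor per bin, which is what produces the variable number of output channels mentioned above.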
The BDA averaging algorithm used in xova is implemented in the codex-africanus algorithms library via parallel blockwise operations on dask arrays. Each individual operation is accelerated with numba. CASA table columns are exposed as dask arrays by the dask-ms data access library, which also supports writing dask arrays to both existing and new Measurement Sets. As the number of channels varies per bin, xova writes variably shaped data columns to new Measurement Sets. While this is supported by the CASA table system and the MSv2.0 specification, DDFacet is the only imager currently capable of correctly reading these columns. To the best of our knowledge, this is the first implementation that can output Time and Channel BDA data to a Measurement Set.
Additionally, care has been taken to average columns according to their nominal and effective definitions in the MSv2.0 specification. Columns such as TIME and INTERVAL include bad or missing (i.e. flagged) data in a nominal average, while TIME_CENTROID and DATA exclude flagged data in an effective average. Thus, fully flagged bins are not discarded but retained, so that correct TIME and INTERVAL grids remain available to downstream applications.
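The nominal/effective distinction can be sketched as below. The helper function is hypothetical (not xova's API); it only demonstrates that a nominal quantity sums over all samples in a bin, flagged or not, while an effective quantity averages over unflagged samples only.

```python
import numpy as np


def nominal_and_effective(values, intervals, flags):
    """Hypothetical sketch of the MSv2.0 nominal vs effective rule.
    `flags` marks bad/missing samples within one bin."""
    # Nominal (TIME/INTERVAL style): flagged samples are included.
    nominal_interval = intervals.sum()

    # Effective (TIME_CENTROID/DATA style): flagged samples excluded.
    unflagged = ~flags
    if unflagged.any():
        effective_value = values[unflagged].mean()
    else:
        # Fully flagged bin: there is no effective value, but the bin
        # is still emitted so the TIME/INTERVAL grid stays complete.
        effective_value = np.nan
    return nominal_interval, effective_value


nom, eff = nominal_and_effective(
    values=np.array([1.0, 2.0, 3.0, 4.0]),
    intervals=np.full(4, 8.0),
    flags=np.array([False, False, True, True]),
)
```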
xova shows promising compression capability while keeping decorrelation constant across all baselines when BDA is enabled. Applying BDA to data produced by the MeerKAT telescope achieves a compression factor of 40X as well as suppression of sources outside the field of interest.