CANCELLED - Serving power-users and novices. Architecture and lessons learned with science platform MuseWISE.
2020-11-10, 18:15–18:30, Times in UTC

Astronomers are data-centric cowboys: they hunt, sometimes in unprincipled ways, for the best-calibrated data ever.

A key challenge for science platforms is to cater to a community of astronomers that ranges from power-users to novices. Astronomical power-users have advanced IT skills and know how to navigate archives of databases and bulk data servers. They want these archives to be interoperable with their own suite of favorite data-analysis software. Power-users have a bi-directional workflow: they want to share their improved versions of calibrated data, and their subsequent data analysis, through science platforms. They want full data lineage to meet the standard of scientific reproducibility. Novices to data-intensive astronomy know only basic Python/C/SQL and need tutorials, intuitive user interfaces and close guidance in finding, accessing and analyzing calibrated data. They start with a one-directional workflow: downloading final products from static releases in archives.

With the science platform MuseWISE we have seven years of experience and lessons learned in serving power-users and enabling novices to grow their skills on a single science platform. MuseWISE provides a platform for the integral field unit spectroscopic data of MUSE at the VLT. The people, databases, bulk data storage and compute infrastructure for MuseWISE are spread over seven locations in France, Germany, Switzerland and the Netherlands. MuseWISE offers its users a platform for science analysis, data quality assessment, data processing and archive exploration. Power-users contribute and improve both code and data products, drawing on their expert knowledge of the information system and its structure. We have started an experiment with Jupyter notebooks and JupyterHub to support novices who will transition from consumer to contributor.

In this talk we review the data-centric architecture of MuseWISE and its relation to data-centric cowboys. We present the lessons learned in seven years of operations across Europe.

Theme – Science Platforms and Data Lakes