Sergi obtained his PhD in 2014 (Marie Curie ITN) at the Laboratoire d’astrophysique de Bordeaux, France, under the supervision of Caroline Soubiran, working on the determination of stellar chemical abundances and the validation of the chemical tagging technique. He worked for the ESA Gaia mission in the Stellar Variability group at the Geneva Observatory from 2014 to 2017. Since 2017, Sergi has been part of the NASA ADS team as the Software Development & Operations Lead, where he is involved in the Machine Learning/Natural Language Processing efforts. In addition, he continues to do research by developing iSpec (an open framework for spectroscopic analysis of stars) and Posidonius (an N-body code for simulating planetary and/or binary systems).
Exploring the use of Graph and Machine/Deep Learning technologies with the NASA ADS content
The NASA Astrophysics Data System (ADS) manages more than 14 million scientific abstracts, more than 5 million full-text documents, more than 128 million citations and thousands of other relationships (e.g., articles’ keywords and data sources). NASA ADS users regularly explore the data using our website and API, which already rely on modern search technology such as Apache Solr. One of the next steps is to provide an even better service by, for instance, automatically enriching our dataset (e.g., article clustering/classification) or improving the search results (e.g., PageRank computation). To accomplish these goals, we are exploring state-of-the-art Graph and Machine/Deep Learning technologies such as Neo4j (a graph database), BERT (Google’s language model for Natural Language Processing tasks) and Graph Neural Networks. We present our preliminary findings to shed some light on the challenges and opportunities that these technologies can offer.
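As an illustration of the PageRank computation mentioned above, the following is a minimal sketch in plain Python using power iteration over a toy citation graph. It is not the ADS implementation: the function name `pagerank`, the paper identifiers, the damping factor of 0.85 and the convergence tolerance are all assumptions chosen for the example.

```python
def pagerank(citations, damping=0.85, tol=1e-10, max_iter=200):
    """Compute PageRank by power iteration.

    `citations` maps each paper to the list of papers it cites
    (a hypothetical toy structure, not the ADS data model).
    """
    # Collect every paper that appears as a citing or cited node.
    nodes = set(citations)
    for cited_list in citations.values():
        nodes.update(cited_list)
    nodes = sorted(nodes)
    n = len(nodes)

    # Start from a uniform distribution.
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(max_iter):
        # Papers that cite nothing ("dangling" nodes) spread their
        # rank evenly over all papers so the total stays equal to 1.
        dangling = sum(rank[v] for v in nodes if not citations.get(v))
        new_rank = {
            v: (1.0 - damping) / n + damping * dangling / n for v in nodes
        }

        # Each citing paper passes an equal share of its rank
        # to every paper it cites.
        for citing, cited_list in citations.items():
            if cited_list:
                share = damping * rank[citing] / len(cited_list)
                for cited in cited_list:
                    new_rank[cited] += share

        # Stop once the ranks no longer change appreciably.
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break

    return rank


# Hypothetical toy graph: papers A and B both cite C, and C cites D.
toy_citations = {
    "paper_A": ["paper_C"],
    "paper_B": ["paper_C"],
    "paper_C": ["paper_D"],
    "paper_D": [],
}

ranks = pagerank(toy_citations)
for paper, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{paper}: {score:.4f}")
```

In this toy graph the cited papers end up ranked above the papers that only cite others, which is the kind of signal that could feed into search result ordering. At the scale of more than 128 million citations, a graph database or a distributed implementation would likely be needed rather than an in-memory dictionary.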