Classification of Astrophysics Journal Articles with Machine Learning

The NASA/IPAC Extragalactic Database (NED) routinely reviews journal articles to extract fundamental data on extragalactic objects and merge them, across the spectrum, into the database. Manually going through the articles, determining whether each is appropriate for inclusion in NED, and identifying what kinds of data it contains is very labor intensive, especially given the ever-increasing number of publications each year. We present a recently developed machine learning approach that assists with classifying the topics and content of journal articles. We show that this approach reproduces the hand classifications with an accuracy of over 90%.
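
To make the task concrete, the following is a minimal sketch of supervised text classification evaluated against hand-assigned labels. It is not the NED pipeline: the abstract does not name the algorithm or features used, so a simple bag-of-words multinomial naive Bayes classifier is swapped in purely for illustration. The function names, the "include"/"exclude" labels, and the toy sentences are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def train(docs, labels):
    """Collect per-class word counts for a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    class_counts = Counter(labels)      # label -> number of training docs
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior score."""
    word_counts, class_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        total = sum(word_counts[label].values())
        score = math.log(n / n_docs)  # class prior
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothing avoids zero probabilities.
                score += math.log(
                    (word_counts[label][tok] + 1) / (total + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs, hand_labels):
    """Fraction of documents where the prediction matches the hand label."""
    hits = sum(predict(model, d) == y for d, y in zip(docs, hand_labels))
    return hits / len(hand_labels)


# Toy, made-up training data; real input would be article text with
# labels assigned by human reviewers.
train_docs = [
    "redshift measurements of distant galaxies",
    "photometry of galaxy clusters and quasars",
    "stellar rotation in nearby open clusters",
    "orbital dynamics of exoplanets around stars",
]
train_labels = ["include", "include", "exclude", "exclude"]
model = train(train_docs, train_labels)

# Held-out examples with (hypothetical) hand classifications.
test_docs = ["quasar redshift survey", "exoplanet orbital periods"]
test_labels = ["include", "exclude"]
test_accuracy = accuracy(model, test_docs, test_labels)
```

The accuracy computed here is the same kind of figure the abstract quotes: the fraction of held-out articles whose predicted class agrees with the hand classification. A realistic evaluation would use a far larger labeled set and a held-out split that the model never sees during training.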

Theme – Machine Learning, Statistics, and Algorithms