The role of the domain expert in the era of big data
by Emille E. O. Ishida
The era of big data brought astronomy and computer science together through the adoption of machine learning techniques for large and complex astronomical data sets. However, the nature of astronomical data places important constraints on how far traditional learning techniques can go. In this scenario, one of the main challenges is the time-consuming and expensive labeling process required to build training samples. Moreover, the observational requirements for the sample we are able to label (spectroscopic) and the one we wish to classify (photometric) are very different, which makes it impossible for the training sample to be representative of the test sample. In this talk I will discuss how we can optimize the construction of training samples for classification purposes. I will also describe how such strategies have proven effective in the search for anomalies, and how they can be tuned to optimize scientific discovery in large data sets.
Emille E. O. Ishida is a research engineer at CNRS, France. She is a co-founder of the Cosmostatistics Initiative (COIN) and the SNAD collaboration, and is the scientific principal investigator of the Fink broker. She works mainly on machine learning applications in astronomy, with special emphasis on the integration of expert knowledge in the learning cycle. She is also engaged in research on the development of interdisciplinary scientific environments able to foster fruitful collaboration inspired by astronomy.
Coffee will be served at 10:30.
M. Girone, M. Elsing, L. Moneta, M. Pierini