Description
Given the remarkable success of foundation models in language and vision, it is worth exploring whether a similar approach can be applied to scientific domains. Such models have the potential to improve computational efficiency, generalize better in low-data regimes, and significantly amortize training costs. However, many questions remain open regarding architectures, data selection, preprocessing techniques, and evaluation strategies. In this talk, I will focus on two foundation model approaches for astrophysics. The first, AstroCLIP, uses contrastive learning to build a shared latent space by aligning the representations of two models trained on different views of the same phenomenon. The second, AstroOBS (work in progress), uses latent masked modeling to construct a unified multimodal representation capable of integrating diverse observational data. I will also discuss the importance of representation learning and briefly mention our ongoing work on time-series modeling as a starting point for future modality encoders.
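For readers unfamiliar with the CLIP-style objective behind AstroCLIP, below is a minimal sketch of the symmetric contrastive loss that pulls paired embeddings from two modality encoders together in a shared latent space. The function name `clip_style_loss`, the temperature value, and the image/spectrum pairing in the toy usage are illustrative assumptions, not the talk's actual implementation.

```python
import torch
import torch.nn.functional as F

def clip_style_loss(z_a: torch.Tensor, z_b: torch.Tensor,
                    temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    z_a, z_b: (batch, dim) embeddings of the same objects produced by
    two encoders, one per observational view.
    """
    # L2-normalize so the dot product is cosine similarity.
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)

    # Pairwise similarity matrix; diagonal entries are the true pairs.
    logits = z_a @ z_b.t() / temperature
    targets = torch.arange(z_a.size(0), device=z_a.device)

    # Cross-entropy in both directions (a -> b and b -> a), averaged.
    loss_ab = F.cross_entropy(logits, targets)
    loss_ba = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_ab + loss_ba)

if __name__ == "__main__":
    # Toy usage: random tensors stand in for two encoders' outputs,
    # e.g., galaxy image embeddings paired with spectrum embeddings.
    z_images = torch.randn(32, 128)
    z_spectra = torch.randn(32, 128)
    print(clip_style_loss(z_images, z_spectra))
```

Minimizing this loss makes each object's two views nearest neighbors of one another in the shared space, which is what lets either modality be used to query the other at inference time.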
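Likewise, here is a minimal sketch of latent masked modeling in the spirit described for AstroOBS: random token latents are hidden behind a learned mask token and regressed from the visible context. The class name `LatentMaskedModel`, the architecture hyperparameters, and the mask ratio are hypothetical choices for illustration; the actual model may differ.

```python
import torch
import torch.nn as nn

class LatentMaskedModel(nn.Module):
    """Minimal masked-latent predictor: hide random tokens and
    reconstruct their latent vectors from the visible context."""

    def __init__(self, dim: int = 128, depth: int = 4, heads: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # Learned placeholder inserted at masked positions.
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))

    def forward(self, tokens: torch.Tensor, mask_ratio: float = 0.5):
        # tokens: (batch, seq, dim) latents, e.g., outputs of frozen
        # per-modality encoders concatenated along the sequence axis.
        b, s, d = tokens.shape
        mask = torch.rand(b, s, device=tokens.device) < mask_ratio

        # Replace masked positions with the learned mask token.
        inputs = torch.where(
            mask.unsqueeze(-1), self.mask_token.expand(b, s, d), tokens
        )
        preds = self.encoder(inputs)

        # Regress the original latents only at the masked positions.
        return ((preds - tokens) ** 2)[mask].mean()

if __name__ == "__main__":
    model = LatentMaskedModel()
    latents = torch.randn(8, 64, 128)  # toy batch of token latents
    print(model(latents))
```

Because the targets are latents rather than raw pixels or fluxes, the same backbone can in principle integrate tokens from heterogeneous observational modalities, which is the appeal of this approach for a unified multimodal representation.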