9–13 Sept 2024
Dep. of Physics of the University of Coimbra
Europe/Lisbon timezone

Lecturers & Topics

Introduction to astrostatistics, Bayesian and time series analysis

Mario Juric (Univ. of Washington) and Xavier Luri (Univ. of Barcelona)

 

Random numbers and common statistical distributions; methods for sampling from probability density functions; key principles such as the Central Limit Theorem and the Law of Large Numbers; statistical hypothesis tests and Maximum Likelihood Estimation.

Introduction to Bayesian probability, the process of building models within a Bayesian framework, and tools for efficient sampling in Bayesian statistics. Time series analysis: identification of patterns ranging from bursts to periodic and quasi-periodic signals.

 

Supervised and unsupervised learning techniques, methods for model interpretation, and a glimpse into the future of ML/AI

Emily Lauren Hunt (Univ. of Heidelberg), Friedrich Anders (Univ. of Barcelona)

Supervised learning techniques: classification and regression, with a focus on neural networks and tree-based algorithms. Worked examples include source classification, determining stellar parameters from spectra, and estimating photometric redshifts.

Unsupervised learning techniques: clustering algorithms, dimensionality reduction methods (e.g. PCA, t-SNE, UMAP, self-organizing maps), and edge detection algorithms.

Interpreting machine learning models using SHAP (SHapley Additive exPlanations). Symbolic regression for interpretability. Including uncertainties in ML models. The future of ML/AI, including the potential of transformers.

 

Tools for large-scale data analysis and visualization

André Moitinho (Univ. of Lisbon), Mario Juric (Univ. of Washington), Sandro Campos (Carnegie Mellon Univ.)

Introduction to tools and techniques for large-scale data analysis. Techniques for working on large computing clusters with multi-TB-scale catalog datasets, focusing on tools like LSDB/HiPSCat. Worked examples include classification of time series using Gaia and ZTF datasets.

Introduction to visualization as a means of analysis. Challenges in visualizing large datasets, and the available frameworks and services. Effective techniques for representing large datasets. Techniques for visualizing multi-dimensional data (1D, 2D, 3D and beyond), linked views, and lower-dimensional projections.