PHYSTAT-Gamma 2022

Europe/Zurich
Gerrit Spengler, Louis Lyons (Imperial College (GB)), Manuel Meyer (University of Hamburg), Olaf Behnke (Deutsches Elektronen-Synchrotron (DE)), Thomas Lohse (Humboldt University of Berlin (DE)), Ullrich Schwanke (Humboldt University Berlin)
Description

Zoom link

Meeting ID: 622 3933 3803; Passcode: 496169

This link is for the pre-school on statistical methods (Tue, 27 Sep 2022) and PHYSTAT-Gamma (28-30 Sep 2022)

 

This compact workshop deals with statistical issues in the analysis of data from gamma-ray experiments and aims to establish tighter connections to observatories in other wavebands (e.g. X-rays, radio).

 

The workshop is complemented by an afternoon of introductory lectures on Tuesday, 27.9.22. Note that the ml.astro workshop on machine learning in astroparticle physics will take place on Monday, 26.9.22. This workshop is independent of PHYSTAT but will highlight complementary aspects.

 

The summary of the program for the week 26.9.22-30.9.22 is:

Mon, 26.9.22: ml.astro (Machine Learning in astroparticle physics)

Tue, 27.9.22: Introductory talks on Statistics and Machine Learning

Wed-Fri (28.9.22-30.9.22): PHYSTAT-Gamma: Statistical Methods in gamma-ray astronomy

 

The homepage of PHYSTAT with a list of all workshops and seminars is at https://espace.cern.ch/phystat. 

    • 14:00 - 17:30
      Pre-School on Statistical methods
      • 14:00
        Statistics 101: A very fast introduction I 50m

        Probability and Bayes' theorem, Frequentist and Bayesian statistics, likelihood function, parameter estimation and properties of estimators, maximum likelihood estimators (MLE), information inequality, asymptotic properties of MLE, variance of MLE

        Speaker: Glen Cowan
      • 14:50
        Statistics 101: A very fast introduction II 50m

        Frequentist hypothesis tests, significance level and power of a test, Neyman-Pearson lemma/likelihood ratio, goodness of fit, p-values and significances, confidence interval from a test, coverage, confidence intervals and selected problems (e.g. limits near the boundary of the parameter space), Wilks' theorem and confidence regions

        Speaker: Glen Cowan
      • 15:40
        Coffee break 15m
      • 15:55
        Selected applications of 'Statistics 101' in HE/VHE gamma-ray astronomy 35m

        Error propagation, combination of stat+syst errors, profile likelihood, inter-experiment combination of likelihoods, trial factors, binned likelihood and applications in gamma-ray astronomy (Poisson Maximum Likelihood Estimation, On-Off Likelihood statistics)

        Speaker: Ullrich Schwanke
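As an illustration of the On-Off likelihood statistics named in the abstract above, the following is a minimal sketch of the widely used likelihood-ratio significance of Li & Ma (1983, Eq. 17). It is an assumption that the talk covers this particular formula; the function name li_ma_significance and its arguments are hypothetical and chosen only for this example.

```python
import math


def li_ma_significance(n_on: int, n_off: int, alpha: float) -> float:
    """Likelihood-ratio significance for an On-Off counting measurement.

    n_on  : counts observed in the On (source) region
    n_off : counts observed in the Off (background) region
    alpha : ratio of On exposure to Off exposure
    The sign is positive for an excess and negative for a deficit.
    """
    if n_on < 0 or n_off < 0 or alpha <= 0:
        raise ValueError("counts must be non-negative and alpha positive")
    n_tot = n_on + n_off
    if n_tot == 0:
        return 0.0
    # The limit of x * log(c * x) for x -> 0 is 0, so empty regions add nothing.
    term_on = 0.0
    if n_on > 0:
        term_on = n_on * math.log((1.0 + alpha) / alpha * n_on / n_tot)
    term_off = 0.0
    if n_off > 0:
        term_off = n_off * math.log((1.0 + alpha) * n_off / n_tot)
    # Test statistic -2 ln(lambda); clip tiny negative rounding errors.
    ts = max(2.0 * (term_on + term_off), 0.0)
    excess = n_on - alpha * n_off
    return math.copysign(math.sqrt(ts), excess)


if __name__ == "__main__":
    # Hypothetical counts: 20 On events, 50 Off events, alpha = 0.2.
    print(li_ma_significance(20, 50, 0.2))
```

By Wilks' theorem the square root of the test statistic is approximately a Gaussian significance when the counts are large; for small counts this approximation should be checked with simulations.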
      • 16:30
        Introduction to Machine Learning in astroparticle physics 45m

        This very short introduction will summarize basic machine learning concepts and introduce and discuss a few feature selection and learning algorithms. The selected algorithms include: Naive Bayes, Nearest Neighbour Methods, Decision Trees, Ensemble Methods and Neural Networks. Furthermore, the talk will address the selection of appropriate input variables as well as possibilities to exclude badly simulated observables.

        Speaker: Tim Ruhe
    • 14:00 - 18:00
      Statistics Session
      • 14:00
        Session introduction 5m
        Speaker: Ullrich Schwanke
      • 14:05
        Astrostatistics: Overview and highlights 40m
        Speaker: Eric Feigelson
      • 14:45
        Chi-square, K-S, and bootstrap: Fitting astrophysical models to data 40m

        Complicated models from astrophysical theory are often fit to observational data. There are several issues with the classical procedures used in the astronomy literature. First, 'chi-square minimization' is commonly used for fitting, often in disregard of its mathematical assumptions. Second, the Kolmogorov-Smirnov (K-S) test for goodness-of-fit is misused in astronomy when the model parameters are estimated from the dataset under study. Third, the K-S test is inefficient at detecting deviations between the data and the model in the tails of the distribution. Fourth, the K-S test cannot justifiably be applied to multivariate data, as it is no longer distribution-free. Recent developments in bootstrap resampling, a simple Monte Carlo procedure applied to the data, will be described to address these issues.

        Speaker: Jogesh Babu
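The second issue in the abstract above (K-S critical values are invalid once the model parameters are fitted to the same data) can be sketched with a parametric bootstrap. This is only one possible variant and not necessarily the procedure presented in the talk; the exponential model, the function names and the default settings are assumptions made for illustration.

```python
import math
import random


def ks_statistic_exponential(sample, rate):
    """Largest distance between the empirical CDF and the Exp(rate) CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-rate * x)
        # Compare the model CDF with the step function just after and before x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def bootstrap_ks_pvalue(sample, n_boot=500, seed=0):
    """K-S p-value that accounts for the rate being fitted to the data."""
    rng = random.Random(seed)
    n = len(sample)
    rate_hat = n / sum(sample)  # maximum likelihood estimate of the rate
    d_obs = ks_statistic_exponential(sample, rate_hat)
    exceed = 0
    for _ in range(n_boot):
        # Simulate from the fitted model, then refit on every replica so the
        # reference distribution includes the effect of parameter estimation.
        fake = [rng.expovariate(rate_hat) for _ in range(n)]
        rate_fake = n / sum(fake)
        if ks_statistic_exponential(fake, rate_fake) >= d_obs:
            exceed += 1
    return d_obs, (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.expovariate(2.0) for _ in range(100)]  # synthetic test data
    print(bootstrap_ks_pvalue(data))
```

The refit inside the loop is the essential step: comparing the observed statistic with tabulated K-S values instead would give p-values that are too large.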
      • 15:25
        Discussion 15m
      • 15:40
        Coffee break 20m
      • 16:00
        Overview of Bayesian methods for multiwavelength gamma-ray astronomy 40m

        Bayesian data analysis (BDA) gets its name from Bayes's theorem, stating that posterior probabilities for hypotheses are proportional to the product of their prior probabilities and likelihoods (predictive probabilities for the observed data based on each hypothesis). It's tempting to view the Bayesian approach as merely using priors to "modulate" the familiar frequentist maximum likelihood approach. But BDA uses all of probability theory, not just Bayes's theorem. In particular, many Bayesian calculations use the law of total probability to compute probabilities for composite hypotheses (e.g., hypotheses with uncertain parameters). These computations average ("marginalize") the likelihood function, rather than maximize it. Many of the key capabilities of Bayesian methods follow from this key distinction: performing computations that integrate rather than optimize over parameter space. I will highlight the role of marginalization in a variety of BDA methods relevant to multiwavelength gamma-ray astronomy: counterpart searches (cross-identification), accounting for systematics such as uncertain background rates, period searches using time-tagged event data, and population modeling accounting for measurement errors and selection effects in a hierarchical Bayesian framework.

        Speaker: Tom Loredo
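The distinction between maximizing and marginalizing drawn in the abstract above can be made concrete with a toy On-Off measurement with an uncertain background rate. The sketch below is an illustration under stated assumptions, not material from the talk: the Poisson model, the flat prior on the background over a finite range, the grid integration and all names are choices made for this example.

```python
import math


def log_poisson(n, mu):
    """Log of the Poisson probability of n counts for expectation mu > 0."""
    return n * math.log(mu) - mu - math.lgamma(n + 1)


def joint_likelihood(s, b, n_on, n_off, alpha):
    """Likelihood of signal s and Off-region background b.

    The On region expects s + alpha * b counts, the Off region expects b.
    """
    return math.exp(log_poisson(n_on, s + alpha * b) + log_poisson(n_off, b))


def profile_and_marginal(s, n_on, n_off, alpha, n_grid=2000):
    """Return (profile, marginal) likelihood of s on a grid over b.

    profile  : likelihood maximized over the nuisance parameter b
    marginal : likelihood averaged over b with a flat prior on (0, b_max)
    """
    b_max = n_off + 10.0 * math.sqrt(n_off + 1.0) + 10.0  # assumed prior range
    step = b_max / n_grid
    b_grid = [(i + 0.5) * step for i in range(n_grid)]  # midpoint rule
    values = [joint_likelihood(s, b, n_on, n_off, alpha) for b in b_grid]
    profile = max(values)
    marginal = sum(values) * step / b_max  # integral of L times prior 1/b_max
    return profile, marginal


if __name__ == "__main__":
    # Hypothetical counts: 20 On events, 50 Off events, alpha = 0.2.
    for s in (0.0, 5.0, 10.0, 15.0):
        print(s, profile_and_marginal(s, 20, 50, 0.2))
```

The marginal value is an average over the background and therefore never exceeds the profile value. The two curves in s generally differ in shape when the likelihood is asymmetric in b, which is where the two approaches can lead to different intervals.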
      • 16:40
        Discussion 15m
      • 16:55
        Time Series Analysis In the Dynamic Universe 40m
        Speaker: Jeff Scargle
      • 17:35
        Discussion 20m
      • 17:55
        Closing remarks 5m
        Speaker: Ullrich Schwanke
    • 14:30 - 18:00
      Survey Session: Statistical methods for MWL counterpart identification
      • 14:30
        Session introduction 5m
        Speaker: Ullrich Schwanke
      • 14:35
        Counterpart identification: Overview 30m
        Speaker: Tamas Budavari
      • 15:05
        Discussion 15m
      • 15:20
        VHE gamma-ray surveys with CTA 30m

        The Cherenkov Telescope Array (CTA) will be the first astronomical observatory fully covering the gamma-ray sky in an energy range from 20 GeV up to 300 TeV. The observatory will be composed of two arrays of tens of telescopes located in La Palma, Spain, and Paranal, Chile.

        Among the Key Science Projects proposed by the CTA Consortium, Galactic and extragalactic surveys will be conducted during the first years of operation. With unprecedented sensitivity and improved angular resolution, CTA surveys promise the discovery of several hundred new gamma-ray sources, but the challenges that come with the analysis of these data will also scale up. We will focus on the challenges of source variability, extended-source modeling, source confusion, source association with multi-wavelength catalogues, classification into source populations, and source contamination due to systematic errors in the modeling of instrumental and astrophysical backgrounds.

        Speakers: Jean-Philippe Lenain, Quentin Remy
      • 15:50
        Discussion 15m
      • 16:05
        Coffee break 15m
      • 16:20
        Identifying correct counterparts to high-energy sources by "multiwavelength educated guesses" imbibed in a Bayesian statistic environment 30m

        The identification of the counterparts to sources detected by instruments with large positional uncertainties cannot be done by a simple match in coordinates, due to the very high number density of the ancillary source catalogs.
        In addition, given that the entire sky is now covered by a plethora of multiwavelength surveys, searching for counterparts using a single band at a time is outdated. Instead, the entire SED of every single source in the sky can be created and used for discriminating the actual emitter from the field population.
        Finally, at least with respect to X-ray observations, we have more than 20 years of XMM and Chandra detections with secure counterparts that can be used to create a training sample to educate our guess.
        This is the basis of NWAY, a cross-matching code based on Bayesian statistics that works with arbitrarily many catalogs, can handle varying positional errors, can incorporate additional prior information (the educated guesses), and works accurately and fast on both small areas and all-sky catalogues. In my talk, I will present how NWAY is now routinely used in the determination of the counterparts to X-ray sources detected by e.g. ROSAT, XMMSlew, NuSTAR, and eROSITA. In particular, I will show how the prior (based on photometry, colors, parallax, and SNR of the detection) was built for eROSITA using Random Forest and tested on a validation sample providing 96% completeness and purity. The final goal is to discuss with the audience how a similar approach could be built for CTA.

        Speaker: Mara Salvato
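The idea of weighing positional agreement against prior information, as described in the abstract above, can be sketched for the simplest case of two catalogs. The code below uses the two-catalog positional Bayes factor of Budavari & Szalay (2008) in the small-angle limit with circular Gaussian errors. It is not NWAY's interface: the function names, arguments and the numbers in the example are hypothetical.

```python
import math

ARCSEC = math.radians(1.0 / 3600.0)  # one arcsecond in radians


def match_bayes_factor(sep_arcsec, sigma1_arcsec, sigma2_arcsec):
    """Bayes factor for 'same source' versus 'unrelated sources'.

    sep_arcsec    : angular separation of the two detections
    sigma*_arcsec : 1-sigma positional uncertainty per coordinate
    Assumes small angles and circular Gaussian position errors.
    """
    s2 = (sigma1_arcsec * ARCSEC) ** 2 + (sigma2_arcsec * ARCSEC) ** 2
    psi = sep_arcsec * ARCSEC
    return 2.0 / s2 * math.exp(-psi ** 2 / (2.0 * s2))


def match_posterior(bayes_factor, prior):
    """Posterior probability of a true match for a prior in (0, 1)."""
    if bayes_factor <= 0.0:
        return 0.0
    # Posterior odds = Bayes factor times prior odds.
    return 1.0 / (1.0 + (1.0 - prior) / (bayes_factor * prior))


if __name__ == "__main__":
    # Hypothetical case: 5 arcsec and 1 arcsec errors, prior of 1e-9.
    for sep in (1.0, 5.0, 15.0, 30.0):
        bf = match_bayes_factor(sep, 5.0, 1.0)
        print(sep, bf, match_posterior(bf, 1e-9))
```

In practice the prior would come from source densities and from additional information such as magnitudes or colors; the single number used here only stands in for that step.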
      • 16:50
        Discussion 15m
      • 17:05
        Radio surveys 30m
        Speaker: Beatriz Mingo
      • 17:35
        Discussion 15m
      • 17:50
        Closing remarks 10m
        Speaker: Ullrich Schwanke