Astrostatistics & Machine Learning

Better science through better methods.

I am a highly active member of the astrostatistics community, serving through:

  • Forums: In forums such as the Facebook astrostatistics group, I answer astronomers' questions about statistics, data mining, machine learning, programming, etc.
  • Consulting: In my institute and in international collaborations, I enjoy advising colleagues and helping to design analyses.
  • Tutorials and workshops: These include workshops on X-ray spectral analysis and Jupyter tutorials.
  • State-of-the-art research into novel techniques, including:
  • Nested sampling

    Bayesian inference requires reliable Monte Carlo methods.

    I enjoy studying the performance and limitations of various algorithms, including Markov Chain Monte Carlo, Variational Bayes, Importance Sampling and Nested Sampling. For example, I have examined the limitations of MultiNest and a more robust alternative.

    I am the author of two popular nested sampling packages: PyMultiNest and the newer UltraNest.

  • Bayesian population inference

    Hierarchical Bayesian models are a powerful, self-consistent way to infer population distributions from uncertain measurements. They have wide applicability in astronomy.

    At the limit of our telescopes' ability, faint and obscured sources are missed. Modeling this selection bias (sample truncation) enables unbiased inference up to the detection limit.

    I have successfully used these approaches in large surveys to infer the abundance of Active Galactic Nuclei as a function of cosmic time, luminosity and obscuration, as well as the obscuration of gamma-ray bursts.

  • Big Data

    How can we analyse large survey datasets while still incorporating and gaining physical insight? A new inference algorithm can simultaneously analyse many datasets with arbitrarily complex physical models.

    I have extensive knowledge of neural networks, random forests and other machine learning techniques, as well as non-parametric inference. These are useful for inferring models when the data are better than our physical understanding. Recent applications include the development of spectral models for X-ray telescopes and robust Gaussian processes for exoplanet detection.
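
The selection-bias point made under Bayesian population inference can be illustrated with a toy sketch. It is not a hierarchical model and not the method used in the surveys mentioned above. It assumes a Gaussian population with known scatter, no measurement errors, a hard detection limit, and a simple grid search with a flat prior on the mean. All names and numbers (TRUE_MU, SIGMA, X_LIM, fit_mu) are hypothetical and chosen only for illustration.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    """Log-density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_detection_fraction(mu, sigma, x_lim):
    """Log-probability that a source drawn from the population exceeds x_lim."""
    return math.log(0.5 * math.erfc((x_lim - mu) / (sigma * math.sqrt(2.0))))


def fit_mu(detected, sigma, x_lim, grid, correct_for_selection):
    """Grid search for the population mean that maximises the log-likelihood.

    With correct_for_selection=True, each source's density is renormalised
    over the detectable range x > x_lim (a truncated-sample likelihood).
    """
    best_mu, best_logl = None, -math.inf
    for mu in grid:
        logl = sum(normal_logpdf(x, mu, sigma) for x in detected)
        if correct_for_selection:
            logl -= len(detected) * log_detection_fraction(mu, sigma, x_lim)
        if logl > best_logl:
            best_mu, best_logl = mu, logl
    return best_mu


# Hypothetical toy population: only sources brighter than X_LIM are detected.
rng = random.Random(42)
TRUE_MU, SIGMA, X_LIM = 0.0, 1.0, 0.5
population = [rng.gauss(TRUE_MU, SIGMA) for _ in range(5000)]
detected = [x for x in population if x > X_LIM]

grid = [i / 100.0 for i in range(-200, 201)]
mu_naive = fit_mu(detected, SIGMA, X_LIM, grid, correct_for_selection=False)
mu_corrected = fit_mu(detected, SIGMA, X_LIM, grid, correct_for_selection=True)

print(f"detected {len(detected)} of {len(population)} sources")
print(f"naive estimate of the mean:     {mu_naive:.2f}")
print(f"truncation-corrected estimate:  {mu_corrected:.2f}")
```

In this sketch the naive fit is pulled towards the bright, detected sources, while the truncation-corrected fit should land much closer to the true population mean. A realistic analysis would also model measurement uncertainties and a gradual rather than hard detection threshold.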

Getting statistics right