About Me

Researcher at the intersection of astronomy, statistics and computer science.

Astronomy Research

  • Growth of
    Black Holes

    In Astrophysics, I research growing black holes, their demographics and their immediate environments. Before matter falls into a black hole, it swirls around it and heats up, creating radiation. This makes distant galaxies shine bright as Active Galactic Nuclei (AGN). I observe this radiation because it tells us how much the black hole is currently growing.

    My research focuses on the epoch when this growth occurred most (z=0.5-3). For my PhD project I reconstructed the total growth of black holes over cosmic time using a large sample of distant AGN (2000, CDFS, COSMOS, AEGIS, XMM-XXL). This also requires a good understanding of the observations (astro-statistics) and the obscuration of AGN.

  • The Obscurer around
    Active Galactic Nuclei

    In most AGN, much of the radiation is swallowed by thick columns of gas and dust near the black hole. In my research I try to understand these gas clouds, in particular their location, extent/covering and relation to the black hole. During my PhD I investigated different obscurer geometries. I also placed the best constraints to date on the intrinsic covering fraction of the obscurer (77% Compton-thin, 38% Compton-thick obscured). The large-scale gas in galaxies can also obscure; in two recent papers I constrained how important this effect is. Understanding the nuclear obscurer is crucial for correctly inferring the intrinsic emission and therefore the black hole growth. Also, the mechanism producing these clouds is currently unknown.

  • Astrostatistics &

    I have published in Statistics, where I focus on nested sampling Monte Carlo algorithms and their performance. Population studies (hierarchical Bayesian inference) interest me, as does helping others with statistics problems.

    I write a lot of software for various purposes (>100 GitHub repos), and much of it is also used by others: I am the author of the PyMultiNest package and the Bayesian X-ray Astronomy (BXA) code. I think daily about new algorithms and solutions.
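As a minimal illustration of the nested sampling idea mentioned above, the following toy sketch estimates the evidence (marginal likelihood) of a one-dimensional problem. It is not the algorithm used in MultiNest, PyMultiNest or UltraNest: the Gaussian likelihood, the uniform prior on [0, 1], the function names and the naive rejection sampling from the prior are all assumptions chosen to keep the example short.

```python
import math
import random


def log_likelihood(x):
    # Illustrative Gaussian likelihood centred at 0.5 with width 0.1 (an assumption)
    return -0.5 * ((x - 0.5) / 0.1) ** 2


def nested_sampling_evidence(n_live=400, n_iter=2400, seed=1):
    """Toy nested sampling estimate of Z = integral of L(x) over a uniform prior on [0, 1]."""
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]
    live_logl = [log_likelihood(x) for x in live]
    z = 0.0
    x_prev = 1.0  # prior volume still enclosed by the live points
    for i in range(1, n_iter + 1):
        # Remove the live point with the lowest likelihood
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_min = live_logl[worst]
        # Expected prior volume shrinkage after i removals
        x_now = math.exp(-i / n_live)
        z += math.exp(logl_min) * (x_prev - x_now)
        x_prev = x_now
        # Replace it with a prior draw of higher likelihood (naive rejection sampling)
        while True:
            candidate = rng.random()
            if log_likelihood(candidate) > logl_min:
                break
        live[worst] = candidate
        live_logl[worst] = log_likelihood(candidate)
    # Add the contribution of the remaining live points
    z += x_prev * sum(math.exp(logl) for logl in live_logl) / n_live
    return z


if __name__ == "__main__":
    analytic = 0.1 * math.sqrt(2 * math.pi) * math.erf(0.5 / (0.1 * math.sqrt(2)))
    print("nested sampling estimate:", nested_sampling_evidence())
    print("analytic evidence:       ", analytic)
```

The estimate should scatter around the analytic value, with a scatter that shrinks as the number of live points grows. Practical implementations replace the rejection step with far more efficient constrained sampling, which is where the algorithms differ.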


  • 7 First-author Papers
  • 400 Citations
  • 13 Presentations at conferences
  • 25 Co-authored papers
  • 100 GitHub repositories
  • 2000 GitHub followers

Since I began publishing in 2014, my papers have received more than 400 citations (over 1100 if co-authored papers are included). You can find a full list on ADS. Here I mention some aspects of each of my accepted papers (1 more is in refereeing at the moment):

Year, Title, Authors • Astronomy aspects • Statistics aspects • Link
Buchner et al. (2019): X-ray spectral and eclipsing model of the clumpy obscurer in active galactic nuclei
What do we know about the obscurer around AGN? 1) We know what fraction of the population it obscures, and to what degree. 2) We know it is clumpy, because sometimes we see transitions from unobscured to obscured. From these eclipse events, we construct a clumpy torus geometry. Finally, we put it to the test by predicting X-ray spectra and comparing them against NuSTAR X-ray observations.
Supplementary material: 360° VR video of the clumpy obscurer around an AGN • More information & download of xspec spectral table models • Monte Carlo X-ray simulator available at https://github.com/JohannesBuchner/xars
Monte Carlo ray-tracing & efficient line-sphere intersections, nested sampling for X-ray spectral fitting
Buchner et al. (2019): On the Prevalence of Supermassive Black Holes over Cosmic Time
How many super-massive black holes are out there in the Universe? From the local Universe, we know they are quite common in the centers of galaxies. We also know that these galaxies were created from several collisions of galaxies at earlier times. Maybe only a fraction of those had to have black holes? Here we put limits on the process making super-massive black hole progenitors, and their total number. Future gravitational wave detectors and high-redshift quasar environments will help us make progress.
Analysis of very large and high-resolution N-body dark matter-only simulations
Buchner (2019): Collaborative Nested Sampling: Big Data vs. complex physical models
We want to analyse Big Data from upcoming large surveys with predictive physical models. Dealing with one object at a time can be too expensive to compute. Here I note that most data sets will look very similar, so one can re-use model computations and analyse multiple data sets simultaneously.
Supplementary material: Demo code on GitHub.
New algorithm: Collaborative Nested Sampling
Buchner, Schulze & Bauer (2017): Galaxy gas as obscurer: I. GRBs x-ray galaxies and find an NH ~ M* relation
We determined that the obscurer of long Gamma-ray Bursts (GRBs) is their host galaxy, by analysing the X-ray spectra of ~1000 GRBs with better methods, correlating with host stellar mass, and considering cosmological hydrodynamic simulations.
Supplementary material: Video explanation • Accessible write-up of this study • Catalogue of Swift afterglows and their obscurations (FITS).
Inference about the properties of a population. Model comparison. Reconstructs a 3d shape from random probes (tomography). Analysis of the gas in Illustris cosmological simulations.
Buchner & Bauer (2017): Galaxy gas as obscurer: II. Separating the galaxy-scale and nuclear obscurers of Active Galactic Nuclei
What fraction of AGN would be obscured if there were no torus? 40%! GRBs tell us how much gas there is in galaxies. The host galaxies of GRBs and AGN are slightly different, but this can be bridged, as we demonstrate in this paper. The nuclear obscurer is then described in a new model, the radiation-lifted torus, in which the Eddington accretion rate is the driving force of the changing obscurer.
Supplementary material: Video explanation • Accessible write-up of this and the previous study.
Analysis of large cosmological simulations (EAGLE), non-parametric kernel density estimates.
Buchner et al. (2015): Obscuration-dependent evolution of Active Galactic Nuclei
This luminosity-function type study reconstructs the distribution and evolution of AGN in obscuration and luminosity using a novel, robust non-parametric approach. We constrain the importance of Compton-thick AGN to the accretion history of the Universe, and the evolution of the obscured fraction.
Supplementary material: Video explanation • XLF as table for download: Space density of AGN as f(Lx, z, NH) • Total space density of AGN as f(Lx, z) • Plot for comparison.
Censored inference about the properties of a population. Bayesian field inference. Reconstructs a 3d smooth function under selection effects without assuming a shape, only smoothness. Uses Stan to reconstruct the growth of black holes over cosmic time.
Buchner et al. (2014): X-ray spectral modelling of the AGN obscuring region in the CDFS: Bayesian model selection and catalogue
Comparison of various models for the obscurer in AGN. Through model comparison between a disk, sphere and toroidal geometry, with the latter preferred, the obscurer was found to be extended but not fully covering, even for the Compton-thick sub-population.
Supplementary material: Vizier CDFS catalogue • BXA documentation
Presents several advances in X-ray spectral analysis methods: Bayesian parametric analysis, comparison of models, Goodness-of-Fit, nested sampling vs. MCMC vs. likelihood-contour error estimation
Buchner (2014): A statistical test for Nested Sampling algorithms
Statistics paper: Evaluation of MultiNest and similar algorithms
Supplementary material: Code for RadFriends and UltraNest on GitHub.
Analyses several nested sampling algorithms (e.g. MultiNest) for flaws using a new statistical test.
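The line-sphere intersections listed for the 2019 clumpy obscurer paper in the table above can be illustrated with a short sketch. This is a generic textbook formulation and is not taken from the xars code; the function name and argument layout are assumptions made for this example. A ray through a spherical cloud enters and leaves at two distances along the ray, and their difference is the path length through the cloud.

```python
import math


def line_sphere_intersection(origin, direction, center, radius):
    """Return the distances (t_near, t_far) along the line where it crosses
    the sphere, or None if the line misses it. Hypothetical helper for illustration."""
    # Normalise the direction so that t is a physical distance
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    # Vector from the sphere centre to the line origin
    oc = [o - c for o, c in zip(origin, center)]
    # Solve t^2 + 2*b*t + c = 0 for points origin + t*unit on the sphere
    b = sum(u * v for u, v in zip(unit, oc))
    c = sum(v * v for v in oc) - radius ** 2
    discriminant = b * b - c
    if discriminant < 0:
        return None  # the line passes outside the sphere
    root = math.sqrt(discriminant)
    return (-b - root, -b + root)


if __name__ == "__main__":
    hit = line_sphere_intersection((0, 0, 0), (1, 0, 0), (5, 0, 0), 1.0)
    print(hit)              # (4.0, 6.0)
    print(hit[1] - hit[0])  # path length through the sphere: 2.0
```

Negative distances mean the crossing lies behind the origin, so a ray tracer would typically discard those or clip them at zero.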


  • Consulting

    I am an active member of the Astrostatistics Facebook group, where we answer astronomers' questions about statistics, data mining, machine learning, programming, etc. I regularly answer questions, review papers on their use of statistics and write mini-tutorials. Similarly, I help my colleagues at our astronomy institute with statistics questions, and am engaged in collaborations as a statistics advisor.

  • Guidelines

    I have written a minimal statistics checklist to help you identify and fix common errors and misinterpretations in your analysis, or in a paper you are refereeing.

    Lectures and books can be found on that page too.

You can find my statistics software and papers in the previous and next sections.


I write software to make my life and the life of my colleagues easier. Perhaps you can take advantage of it too.

BXA (13 fans) Bayesian X-ray analysis (nested sampling for Xspec and Sherpa)
condor_optimization (12 fans) CONDOR (COnstrained, Non-linear, Direct, parallel Optimization using trust Region method for high-computing load function) allows continuous parameter optimization
massivedatans (10 fans) Big Data vs. complex physical models - a scalable nested sampling inference algorithm for many data sets
nway (31 fans) nway -- Bayesian cross-matching of astronomical catalogues
PyMultiNest (156 fans) Pythonic Bayesian inference and visualization for the MultiNest Nested Sampling Algorithm or MCMC. See also the tutorial, RMultiNest.
UltraNest (10 fans) Pythonic Nested Sampling Development Framework & UltraNest
APEMoST Automated Parameter Estimation and Model Selection Toolkit -- A fast MCMC implementation for Bayesian inference
astrostatistics-recipes Recipes and codes for common statistics problems
BayesianBlocks Computes Bayesian Blocks with visualisation
extinctionevents Bayesian Recurring Event analysis
jbopt Parameter space exploration toolbox
posterierr MCMC Posterior Sample Error Propagation
presentations Research presentations on Astronomy and Astrostatistics
ps2data Extract data from postscript plots
PyDNest Python connection to Diffusive Nested Sampling.
pymultinest-tutorial pymultinest tutorial
RMultiNest R wrapper for MultiNest
stagedstan Staged Stan Python package
syscorr Bayesian correlation swiss army knife
ultranest-js Nested Sampling for Javascript applications
Click for an animation of MCMC (thanks to chi-feng) and Nested Sampling:
Scientific data analysis
imagehash (1443 fans) A Python Perceptual Image Hashing Module
LSST-PLAsTiCC-classification-solution (17 fans) Solution to LSST-PLAsTiCC photometric transient classification challenge
regulargrid (11 fans) Regular Grid Multivariate linear interpolation
addspec.py Merging X-ray spectra
agnviz Interactive visualisation of the Structure of Active Galactic Nuclei
athena-point-source-simulator Simulating Compton-thick AGN for Athena
autobackgroundmodel Machine-learned parametric model for background spectra
exofind RV exoplanet fitting code based on ExoFit
hmf Halo Mass Function in Python
imagestack Compression of large sets/databases of images
intersection Ray tracing / Line intersection formulas for various 2d and 3d objects
LightRayRider Ray tracing of hydrodynamic simulations to compute column densities
lunar-occulation-calculator Calculator for occultation of celestial objects by the moon or planetary bodies
npyinterp Fast interpolation/integration for monotonically increasing numpy arrays
scientific-visualisation-360-vr Scientific Visualisations with 360° VR HOWTO, with povray and ffmpeg
simbad2kstars Simbad to KStars import
test-calculator Online Scientific computations
webscipy Interactive web pages for scientific computing with python
xars X-ray absorption re-emission scattering - Monte Carlo simulator for X-ray obscurers
xray-data-analysis-docker Make CIAO and HEASOFT docker images
Machine Learning
flight-reservation-emails (20 fans) Searches your emails for flight tickets & displays a summary with all flight details
spoken-command-recognition (40 fans) A large, free audio sample database (10M words pronounced), a test bed for voice activity detection algorithms and for single-syllable word recognition
Presentations, writing & publishing
languagecheck (40 fans) Improve the language of your paper before submission
python-epub-builder (16 fans) Python API for building EPUB books
research2epub (42 fans) Linearize research papers into readable ebooks.
ads-reference-search Search ADS by reference "Lastname et al. (2019)"
bibtex-html-review BibTeX Review HTML pages
activitytracker Track what you are spending your time on
note3 Notational Velocity clone written with Jython+SWT
photozqual Evaluating photo-z redshift methods
spuren A desktop search engine kept fast and simple
tnewmail Thunderbird new mail desktop notifications
workflow-copilot Profile your computer interaction and interrupt with news only when appropriate
Data visualization
matplotlib-xkcdify (22 fans) sketchy, imprecise plotting theme for matplotlib
uncertaincolors Uncertainty visualisation in maps: Display measurement value and error simultaneously with Python/matplotlib
Logic, Puzzles & Games
ultragem UltraGem is a match three board game (Bejeweled/CandyCrush clone), with an advanced automated solver
zwicky-morphological-analysis Zwicky's Morphological Analysis implemented in Python
Distributed Computing
Jake (18 fans) Collaborating on a project for ordinary people
Jarsync (35 fans) Java implementation of the Rsync protocol based on Jarsync by Casey Marshall
udt-java (96 fans) Java implementation of UDT (based on http://udt-java.sf.net by Bernd Schuller)
availablelater AvailableLaterObjects simplify writing asynchronous calls by decoupling the result and hiding Thread creation.
externals Converting Java projects to Maven without breaking Upstream
fss A computer-independent view and addressing scheme of a common folder.
ics The Interclient Communication Service provides low-level client communication methods
ska-SchedEng Scheduling Engine for large telescope arrays
ska-SchedEng-demo-nrao-field-system SchedulingEngine interface for controlling NRAO field system
ska-SchedUI Scheduling Web Interface for large telescope arrays
File systems
chunk-fuse (13 fans) Filesystem in User Space (fuse) using compressed and encrypted chunks
fresh-torrents Helps you find popular, cheap (bandwidth-wise) and new torrents
sparse-reiserfs-manager Script to create and resize reiserfs sub-filesystems stored in sparse files.
DHCProbe (30 fans) Send a DHCP request to DHCP server to check its configuration
PasswordTopologies Common Password Topologies
doithtml HTML progress reports for DoIt
ic4stan Information Criteria for Stan outputs
libreoffice-translate Translation of 5000 German comments in the LibreOffice code base
mediawiki2html_machine A pure-PHP Mediawiki to HTML converter
S4 output of Student Skill Sharpening Sessions
... and various code snippets