MCMC Introduction

by Jason Wang and Henry Ngo (2018)

Here, we will explain how to sample an orbit posterior using MCMC techniques. MCMC samplers take some time to fully converge on the complex posterior, but should be able to explore all posteriors in roughly the same amount of time (unlike OFTI). We will use the parallel-tempered version of the Affine-invariant sample from the ptemcee package, as the parallel tempering helps the walkers get out of local minima. Parallel-tempering can be disabled by setting the number of temperatures to 1, and will revert back to using the regular ensemble sampler from emcee.

Read in Data and Set up Sampler

We use orbitize.driver.Driver to streamline the processes of reading in data, setting up the two-body interaction, and setting up the MCMC sampler.

When setting up the sampler, we need to decide how many temperatures and how many walkers per temperature to use. Increasing the number of temperatures further ensures your walkers will explore all of parameter space and will not get stuck in local minima. Increasing the number of walkers gives you more samples to use, and, for the Affine-invariant sampler, a minimum number is required for good convergence. Of course, the tradeoff is that more samplers means more computation time. We find 20 temperatures and 1000 walkers to be reliable for convergence. Since this is a tutorial meant to be run quickly, we use fewer walkers and temperatures here.

Note that we will only use the samples from the lowest-temperature walkers. We also assume that our astrometric measurements follow a Gaussian distribution.

orbitize can also fit for the total mass of the system and system parallax, including marginalizing over the uncertainties in those parameters.

Python2 WARNING

When using Python2, there is an integer division bug in the ptemcee package. This results in no temperature adaptation. See Issue #128. Do not use parallel tempering (num_temps > 1) with Python2 until this is fixed.

[1]:
import numpy as np

import orbitize
from orbitize import driver
import multiprocessing as mp

filename = "{}/GJ504.csv".format(orbitize.DATADIR)

# system parameters
num_secondary_bodies = 1
system_mass = 1.75 # [Msol]
plx = 51.44 # [mas]
mass_err = 0.05 # [Msol]
plx_err = 0.12 # [mas]

# MCMC parameters
num_temps = 5
num_walkers = 20
num_threads = mp.cpu_count() # or a different number if you prefer


my_driver = driver.Driver(
    filename, 'MCMC', num_secondary_bodies, system_mass, plx, mass_err=mass_err, plx_err=plx_err,
    mcmc_kwargs={'num_temps': num_temps, 'num_walkers': num_walkers, 'num_threads': num_threads}
)

Running the MCMC Sampler

We need to pick how many steps the MCMC sampler should sample. Additionally, because the samples are correlated, we often only save every nth sample. This helps when we run a lot of samples, because saving all the samples requires too much disk space and many samples are unnecessary because they are correlated.

[2]:
total_orbits = 6000 # number of steps x number of walkers (at lowest temperature)
burn_steps = 10 # steps to burn in per walker
thin = 2 # only save every 2nd step

my_driver.sampler.run_sampler(total_orbits, burn_steps=burn_steps, thin=thin)
/Users/bluez3303/miniconda3/envs/python3.6/lib/python3.5/site-packages/orbitize/priors.py:163: RuntimeWarning: invalid value encountered in log
  lnprob = -np.log((element_array*normalizer))
/Users/bluez3303/miniconda3/envs/python3.6/lib/python3.5/site-packages/orbitize/priors.py:269: RuntimeWarning: invalid value encountered in log
  lnprob = np.log(np.sin(element_array)/normalization)
/Users/bluez3303/miniconda3/envs/python3.6/lib/python3.5/site-packages/orbitize/priors.py:163: RuntimeWarning: invalid value encountered in log
  lnprob = -np.log((element_array*normalizer))
/Users/bluez3303/miniconda3/envs/python3.6/lib/python3.5/site-packages/orbitize/priors.py:163: RuntimeWarning: invalid value encountered in log
  lnprob = -np.log((element_array*normalizer))
/Users/bluez3303/miniconda3/envs/python3.6/lib/python3.5/site-packages/orbitize/priors.py:163: RuntimeWarning: invalid value encountered in log
  lnprob = -np.log((element_array*normalizer))
/Users/bluez3303/miniconda3/envs/python3.6/lib/python3.5/site-packages/orbitize/priors.py:269: RuntimeWarning: invalid value encountered in log
  lnprob = np.log(np.sin(element_array)/normalization)
/Users/bluez3303/miniconda3/envs/python3.6/lib/python3.5/site-packages/orbitize/priors.py:269: RuntimeWarning: invalid value encountered in log
  lnprob = np.log(np.sin(element_array)/normalization)
/Users/bluez3303/miniconda3/envs/python3.6/lib/python3.5/site-packages/orbitize/priors.py:269: RuntimeWarning: invalid value encountered in log
  lnprob = np.log(np.sin(element_array)/normalization)
Burn in complete
300/300 steps completed
Run complete
[2]:
<ptemcee.sampler.Sampler at 0x11ada5780>

After completing the samples, the 'run_sampler' method also creates a 'Results' object that can be accessed with 'my_sampler.results'.

MCMC Diagnostics

The Sampler object also has two convenience functions to examine and modify the walkers in order to diagnose MCMC performance.

First, we can examine 5 randomly selected walkers for two parameters: semimajor axis and eccentricity. We expect 150 steps per walker since there were 6,000 orbits requested with 20 walkers, so that’s 300 orbits per walker. However, we have thinned by a factor of 2, so there are 150 saved steps.

[3]:
sma_chains, ecc_chains = my_driver.sampler.examine_chains(param_list=['sma1','ecc1'], n_walkers=5)
../_images/tutorials_MCMC_tutorial_9_0.png
../_images/tutorials_MCMC_tutorial_9_1.png

This method returns one matplotlib Figure object for each parameter. If no param_list given, all parameters are returned. Here, we told it to plot 5 randomly selected walkers but we could have specified exactly which walkers with the walker_list keyword. The step_range keyword also determines which steps in the chain are plotted (when nothing is given, the default is to plot all steps). We can also have these plots automatically generate if we called run_sampler with examine_chains=True.

Note that this is just a convenience function. It is possible to recreate these chains from reshaping the posterior samples and selecting the correct entries.

The second diagnostic tool is the chop_chains, which allows us to remove entries from the beginning and/or end of a chain. This updates the corresponding Results object stored in sampler (in this case, my_driver.sampler.results). The burn parameter specifies the number of steps to remove from the beginning (i.e. to add a burn-in to your chain) and the trim parameter specifies the number of steps to remove from the end. If only one parameter is given, it is assumed to be a burn value. If trim is not zero, the sampler object is also updated so that the current position (sampler.curr_pos) matches the new end point. This allows us to continue MCMC runs at the correct position, even if we have removed the last few steps of the chain.

Let’s remove the first and last 25 steps, leaving 100 orbits (or steps) per walker

[4]:
my_driver.sampler.chop_chains(burn=25,trim=25)
Chains successfully chopped. Results object updated.

Now we can examine the chains again to verify what we did. Note that the number of steps removed from either end of the chain is uniform across all walkers.

[5]:
sma_chains, ecc_chains = my_driver.sampler.examine_chains(param_list=['sma1','ecc1'], n_walkers=5)
../_images/tutorials_MCMC_tutorial_15_0.png
../_images/tutorials_MCMC_tutorial_15_1.png

Plotting Basics

We will make some basic plots to visualize the samples in 'my_driver.sampler.results'. Orbitize currently has two basic plotting functions which returns matplotlib Figure objects. First, we can make a corner plot (also known as triangle plot, scatterplot matrix, or pairs plot) to visualize correlations between pairs of orbit parameters:

[6]:
corner_plot_fig = my_driver.sampler.results.plot_corner() # Creates a corner plot and returns Figure object
corner_plot_fig.savefig('my_corner_plot.png') # This is matplotlib.figure.Figure.savefig()
WARNING:root:Too few points to create valid contours
WARNING:root:Too few points to create valid contours
../_images/tutorials_MCMC_tutorial_17_1.png

Next, we can plot a visualization of a selection of orbits sampled by our sampler. By default, the first epoch plotted is the year 2000 and 100 sampled orbits are displayed.

[7]:
epochs = my_driver.system.data_table['epoch']

orbit_plot_fig = my_driver.sampler.results.plot_orbits(
    object_to_plot = 1, # Plot orbits for the first (and only, in this case) companion
    num_orbits_to_plot= 100, # Will plot 100 randomly selected orbits of this companion
    start_mjd=epochs[0] # Minimum MJD for colorbar (here we choose first data epoch)
)
orbit_plot_fig.savefig('my_orbit_plot.png') # This is matplotlib.figure.Figure.savefig()
../_images/tutorials_MCMC_tutorial_19_0.png

For more advanced plotting options and suggestions on what to do with the returned matplotlib Figure objects, see the dedicated Plotting tutorial.

Saving and Loading Results

We will save the results in the HDF5 format. It will save two datasets: 'post' which will contain the posterior (the chains of the lowest temperature walkers) and 'lnlike' which has the corresponding probabilities. In addition, it saves 'sampler_name' as an attribute of the HDF5 root group.

[8]:
hdf5_filename='my_posterior.hdf5'
import os
# To avoid weird behaviours, delete saved file if it already exists from a previous run of this notebook
if os.path.isfile(hdf5_filename):
    os.remove(hdf5_filename)
my_driver.sampler.results.save_results(hdf5_filename)

Saving sampler results is a good idea when we want to analyze the results in a different script or when we want to save the output of a long MCMC run to avoid having to re-run it in the future. We can then load the saved results into a new blank results object.

[9]:
from orbitize import results
loaded_results = results.Results() # Create blank results object for loading
loaded_results.load_results(hdf5_filename)

Instead of loading results into an orbitize.results.Results object, we can also directly access the saved data using the 'h5py' python module.

[10]:
import h5py
filename = 'my_posterior.hdf5'
hf = h5py.File(filename,'r') # Opens file for reading
# Load up each dataset from hdf5 file
sampler_name = np.str(hf.attrs['sampler_name'])
post = np.array(hf.get('post'))
lnlike = np.array(hf.get('lnlike'))
hf.close() # Don't forget to close the file