Radial Velocity Tutorial for MCMC

By Roberto Tejada (2019)

This tutorial will assume the user is familiar with the Driver class and is acquainted with MCMC terminology. For more information about MCMC, see the MCMC Introduction Tutorial.

We explain how to jointly fit radial velocity data and relative astrometry using the MCMC technique. First we need a set of data containing radial velocity measurements. We check the data using read_input and observe the quant_type column for radial velocity data. For more information on orbitize.read_input.read_file(), see the Formatting Input Tutorial. You can fit for separate jitter and gamma terms for each RV instrument in your dataset by adding an “instrument” column to your data csv.

NOTE: Astrometry+RV fitting currently only works with MCMC and not OFTI.

Read and Format Data

[1]:
import numpy as np
import matplotlib.pyplot as plt
from orbitize import read_input, plot, priors, driver, DATADIR
import multiprocessing as mp

data_table = read_input.read_file("{}/HD4747.csv".format(DATADIR)) # print all columns
data_table.pprint_all()
  epoch   object  quant1  quant1_err quant2 quant2_err quant12_corr quant_type instrument
--------- ------ -------- ---------- ------ ---------- ------------ ---------- ----------
  56942.3      1    606.5        7.0 180.04       0.62          nan      seppa      defsp
  57031.2      1    606.6        6.4 180.52       0.58          nan      seppa      defsp
  57289.4      1    604.0        7.0  184.9        0.9          nan      seppa      defsp
50366.475      0 -0.54103    0.00123    nan        nan          nan         rv      defrv
50418.329      0 -0.40053    0.00206    nan        nan          nan         rv      defrv
50462.252      0 -0.24094     0.0011    nan        nan          nan         rv      defrv
50689.604      0  0.37292    0.00117    nan        nan          nan         rv      defrv
50784.271      0  0.46223    0.00133    nan        nan          nan         rv      defrv
50806.227      0  0.48519    0.00103    nan        nan          nan         rv      defrv
50837.217      0  0.49395    0.00117    nan        nan          nan         rv      defrv
50838.201      0  0.49751    0.00112    nan        nan          nan         rv      defrv
50839.211      0  0.50187    0.00112    nan        nan          nan         rv      defrv
 51009.62      0  0.53355    0.00135    nan        nan          nan         rv      defrv
51011.548      0  0.53164    0.00128    nan        nan          nan         rv      defrv
51013.602      0  0.53629     0.0016    nan        nan          nan         rv      defrv
51050.491      0  0.52154    0.00468    nan        nan          nan         rv      defrv
51170.248      0  0.50757     0.0014    nan        nan          nan         rv      defrv
51367.585      0  0.47678    0.00131    nan        nan          nan         rv      defrv
51409.527      0  0.46147    0.00523    nan        nan          nan         rv      defrv
51543.244      0  0.44311    0.00152    nan        nan          nan         rv      defrv
51550.229      0  0.43286    0.00147    nan        nan          nan         rv      defrv
51755.551      0  0.39329    0.00213    nan        nan          nan         rv      defrv
51899.284      0  0.36457    0.00163    nan        nan          nan         rv      defrv
52097.626      0  0.32986    0.00157    nan        nan          nan         rv      defrv
52488.542      0  0.26687     0.0015    nan        nan          nan         rv      defrv
52572.312      0  0.25035    0.00195    nan        nan          nan         rv      defrv
52987.228      0  0.19466    0.00238    nan        nan          nan         rv      defrv
52988.186      0  0.18469    0.00194    nan        nan          nan         rv      defrv
53238.456      0  0.16892    0.00134    nan        nan          nan         rv      defrv
53303.403      0  0.16769    0.00112    nan        nan          nan         rv      defrv
53339.265      0  0.16069    0.00119    nan        nan          nan         rv      defrv
53724.274      0  0.11302    0.00103    nan        nan          nan         rv      defrv
53724.276      0  0.11605    0.00112    nan        nan          nan         rv      defrv
54717.455      0  0.00984    0.00123    nan        nan          nan         rv      defrv
54718.508      0  0.01242    0.00115    nan        nan          nan         rv      defrv
 54719.51      0  0.01572    0.00123    nan        nan          nan         rv      defrv
 54720.47      0  0.01534    0.00113    nan        nan          nan         rv      defrv
54722.401      0  0.01479    0.00127    nan        nan          nan         rv      defrv
 54723.47      0  0.01422    0.00122    nan        nan          nan         rv      defrv
54724.474      0  0.01169     0.0012    nan        nan          nan         rv      defrv
54725.383      0  0.01383    0.00113    nan        nan          nan         rv      defrv
54726.505      0   0.0195    0.00123    nan        nan          nan         rv      defrv
54727.452      0   0.0175    0.00113    nan        nan          nan         rv      defrv
 55014.62      0 -0.00636    0.00141    nan        nan          nan         rv      defrv
55015.624      0 -0.00409    0.00138    nan        nan          nan         rv      defrv
55016.624      0 -0.00566    0.00121    nan        nan          nan         rv      defrv
55048.525      0 -0.01975    0.00124    nan        nan          nan         rv      defrv
55076.584      0 -0.01614    0.00128    nan        nan          nan         rv      defrv
55077.594      0 -0.01303    0.00126    nan        nan          nan         rv      defrv
55134.464      0 -0.01689    0.00136    nan        nan          nan         rv      defrv
55198.254      0 -0.02885     0.0012    nan        nan          nan         rv      defrv
55425.584      0 -0.04359    0.00125    nan        nan          nan         rv      defrv
55522.379      0  -0.0512     0.0013    nan        nan          nan         rv      defrv
55806.547      0 -0.07697     0.0013    nan        nan          nan         rv      defrv
 56148.55      0 -0.10429    0.00128    nan        nan          nan         rv      defrv
56319.201      0  -0.1102    0.00128    nan        nan          nan         rv      defrv
56327.208      0 -0.11332    0.00149    nan        nan          nan         rv      defrv
56507.645      0 -0.12324    0.00133    nan        nan          nan         rv      defrv
56912.534      0 -0.17085    0.00113    nan        nan          nan         rv      defrv

The quant_type column displays the type of data each row contains: astrometry (radec or seppa), or radial velocity (rv). For astrometry, quant1 column contains right ascension or separation, and the quant2 column contains declination or position angle. For rv data, quant1 contains radial velocity data in \(\mathrm{km/s}\), while quant2 is filled with nan to preserve the data structure. The table contains each respective error column.

We can now initialize the Driver class. MCMC samplers take time to converge to absolute maxima in parameter space, and the more parameters we introduce, the longer we expect it to take.

Create Driver Object

For joint orbital fits with RV data, we need to fit the stellar and companion masses (m0 and m1 respectively as separate free parameters). This differs from the astrometry-only case where fitting the total mass mtot suffices. We set the system keyword fit_secondary_mass to True when initializing the Driver object.

[2]:
filename = "{}/HD4747.csv".format(DATADIR)

# system parameters
num_secondary_bodies = 1
stellar_mass = 0.84 # [Msol]
plx = 53.18 # [mas]
mass_err = 0.04 # [Msol]
plx_err = 0.12 # [mas]

# MCMC parameters
num_temps = 5
num_walkers = 30
num_threads = 2 # or a different number if you prefer, eg mp.cpu_count()

my_driver = driver.Driver(
    filename, 'MCMC', num_secondary_bodies, stellar_mass, plx, mass_err=mass_err, plx_err=plx_err,
    system_kwargs = {'fit_secondary_mass':True, 'tau_ref_epoch':0},
    mcmc_kwargs={'num_temps': num_temps, 'num_walkers': num_walkers, 'num_threads': num_threads}
)

Since MCMC is an object in orbitize!, we can assign a variable to the sampler and work with this variable:

[3]:
m = my_driver.sampler

RV Priors

The priors for the two RV parameters, the radial velocity offset (gamma), and jitter (sigma), have default uniform prior and log uniform prior respectively. The gamma uniform prior is set between \((-5,5)\) \(\mathrm{km/s}\), and the jitter log uniform prior is set for (\(10^{-4},0.05\)) \(\mathrm{km/s}\). The prior for m1 is a log uniform prior and is set for (\(10^{-3},2.0\))\(M_\odot\). The current version of orbitize addressed in this tutorial returns the stellar radial velocity only.

NOTE: We may change the priors as instructed in the Modifying Priors tutorial:

[4]:
# getting the system object:
sys = my_driver.system

lab = sys.param_idx

print(sys.labels)
print(sys.sys_priors)

print(vars(sys.sys_priors[lab['m1']]))
['sma1', 'ecc1', 'inc1', 'aop1', 'pan1', 'tau1', 'plx', 'gamma_defrv', 'sigma_defrv', 'm1', 'm0']
[Log Uniform, Uniform, Sine, Uniform, Uniform, Uniform, Gaussian, Uniform, Log Uniform, Log Uniform, Gaussian]
{'minval': 1e-06, 'maxval': 2, 'logmin': -13.815510557964274, 'logmax': 0.6931471805599453}
[5]:
# change the m1 prior:
sys.sys_priors[lab['m1']] = priors.LogUniformPrior(1e-4, 0.5)

print(sys.labels)
print(sys.sys_priors)
print(vars(sys.sys_priors[lab['m1']]))
['sma1', 'ecc1', 'inc1', 'aop1', 'pan1', 'tau1', 'plx', 'gamma_defrv', 'sigma_defrv', 'm1', 'm0']
[Log Uniform, Uniform, Sine, Uniform, Uniform, Uniform, Gaussian, Uniform, Log Uniform, Log Uniform, Gaussian]
{'minval': 0.0001, 'maxval': 0.5, 'logmin': -9.210340371976182, 'logmax': -0.6931471805599453}

Running the MCMC Sampler

As noted in the MCMC Introduction Tutorial, we must choose the sampler step for MCMC and can save every \(nth\) sample to avoid using too much disk space using thin.

[6]:
total_orbits = 1000 # number of steps x number of walkers (at lowest temperature)
burn_steps = 10 # steps to burn in per walker
thin = 2 # only save every 2 steps
[7]:
m.run_sampler(total_orbits, burn_steps=burn_steps, thin=thin)
Starting Burn in
/home/sblunt/Projects/orbitize/orbitize/priors.py:354: RuntimeWarning: invalid value encountered in log
  lnprob = -np.log((element_array*normalizer))
/home/sblunt/Projects/orbitize/orbitize/priors.py:354: RuntimeWarning: invalid value encountered in log
  lnprob = -np.log((element_array*normalizer))
/home/sblunt/Projects/orbitize/orbitize/priors.py:463: RuntimeWarning: invalid value encountered in log
  lnprob = np.log(np.sin(element_array)/normalization)
/home/sblunt/Projects/orbitize/orbitize/priors.py:463: RuntimeWarning: invalid value encountered in log
  lnprob = np.log(np.sin(element_array)/normalization)
10/10 steps of burn-in complete
Burn in complete. Sampling posterior now.

Run complete
[7]:
<ptemcee.sampler.Sampler at 0x7f17c0d19b50>

Now we can plot the distribution of MCMC parameter of interest:

[8]:
accepted_m1 = m.results.post[:, lab['m1']]
plt.hist(accepted_m1,histtype='step')
plt.xlabel('m1'); plt.ylabel('number of orbits')
plt.show()
../_images/tutorials_RV_MCMC_Tutorial_15_0.svg

Saving Results over Extended MCMC Run

Sometimes our MCMC run will need to run for an extended period of time to let the walkers converge. To observe the convergence, we often need to see the walkers’ progress along parameter space. We can save the sampler results periodically and keep running the sampler until convergence. To run for a greater number of steps and periodically save the results, we can create a for-loop and run for as many iterations as we’d like.

[9]:
filename = "{}/HD4747.csv".format(DATADIR)

total_orbits = 1000 # number of steps x number of walkers (at lowest temperature)
burn_steps = 10 # steps to burn in per walker
thin = 2 # only save every 10th step

# system parameters
num_secondary_bodies = 1
stellar_mass = 0.84 # [Msol]
plx = 53.18 # [mas]
mass_err = 0.04 # [Msol]
plx_err = 0.12 # [mas]

# MCMC parameters
num_temps = 5
num_walkers = 30
num_threads = 2 # or a different number if you prefer, e.g. mp.cpu_count()


my_driver = driver.Driver(
    filename, 'MCMC', num_secondary_bodies, stellar_mass, plx, mass_err=mass_err, plx_err=plx_err,
    system_kwargs = {'fit_secondary_mass':True, 'tau_ref_epoch':0},
    mcmc_kwargs={'num_temps': num_temps, 'num_walkers': num_walkers, 'num_threads': num_threads}
)

m = my_driver.sampler

We’re now ready for the loop! The results object contains a save_results function which lets us save the results for our directory, and we will use the load_results object from results to access the data later. We also define the n_iter below to mark how many MCMC runs to save our within results.

NOTE: To avoid long convergence periods, we may initialize the walkers in a sphere around the global minima of the parameter space as outlined in our Modifying MCMC Initial Positions Tutorial.

[10]:
# file name to save as:
hdf5_filename = 'my_rv_posterior_%1d.hdf5'
[11]:
n_iter = 2 # number of iterations
for i in range(n_iter):
    # running the sampler:
    orbits = m.run_sampler(total_orbits, burn_steps=burn_steps, thin=thin)
    myResults = m.results
    hdf5_filename = 'my_rv_posterior_%1d.hdf5' % i
    myResults.save_results(hdf5_filename) # saves results object as an hdf5 file
Starting Burn in
/home/sblunt/Projects/orbitize/orbitize/priors.py:354: RuntimeWarning: invalid value encountered in log
  lnprob = -np.log((element_array*normalizer))
/home/sblunt/Projects/orbitize/orbitize/priors.py:463: RuntimeWarning: invalid value encountered in log
  lnprob = np.log(np.sin(element_array)/normalization)
/home/sblunt/Projects/orbitize/orbitize/priors.py:354: RuntimeWarning: invalid value encountered in log
  lnprob = -np.log((element_array*normalizer))
/home/sblunt/Projects/orbitize/orbitize/priors.py:463: RuntimeWarning: invalid value encountered in log
  lnprob = np.log(np.sin(element_array)/normalization)
10/10 steps of burn-in complete
Burn in complete. Sampling posterior now.

Run complete
Starting Burn in
/home/sblunt/Projects/orbitize/orbitize/priors.py:354: RuntimeWarning: invalid value encountered in log
  lnprob = -np.log((element_array*normalizer))
/home/sblunt/Projects/orbitize/orbitize/priors.py:354: RuntimeWarning: invalid value encountered in log
  lnprob = -np.log((element_array*normalizer))
/home/sblunt/Projects/orbitize/orbitize/priors.py:463: RuntimeWarning: invalid value encountered in log
  lnprob = np.log(np.sin(element_array)/normalization)
/home/sblunt/Projects/orbitize/orbitize/priors.py:463: RuntimeWarning: invalid value encountered in log
  lnprob = np.log(np.sin(element_array)/normalization)
10/10 steps of burn-in complete
Burn in complete. Sampling posterior now.

Run complete

Plotting and Accesing Saved Results

We can plot the corner plot saved in the results object by following the steps in the Advanced Plotting Tutorial:

[12]:
median_values = np.median(myResults.post,axis=0) # Compute median of each parameter
range_values = np.ones_like(median_values)*0.95 # Plot only 95% range for each parameter
corner_figure_median_95 = myResults.plot_corner(
    range=range_values,
    truths=median_values
)
../_images/tutorials_RV_MCMC_Tutorial_23_0.svg

As illustrated in the plot above, MCMC needs more time to run. We only performed two iterations in the loop to demonstrate its useage, but with increased n_iter, the trendplots saved in the loop and the corner plot will show how the walkers converge to absolute extrema in parameter space.

To access the saved data, we can read it into a results object as shown in the MCMC Introduction Tutorial:

[13]:
from orbitize import results

loaded_results = results.Results() # Create blank results object for loading
loaded_results.load_results('my_rv_posterior_%1d.hdf5' % (n_iter-1))

To demonstrate use of the loaded results file above, we can use the saved results to plot our orbital plots:

[14]:
epochs = my_driver.system.data_table['epoch']
orbit_plot_fig = loaded_results.plot_orbits(
    object_to_plot = 1, # Plot orbits for the first (and only, in this case) companion
    num_orbits_to_plot= 50, # Will plot 50 randomly selected orbits of this companion
    start_mjd=epochs[0], # Minimum MJD for colorbar (here we choose first data epoch)
)
WARNING: ErfaWarning: ERFA function "d2dtf" yielded 1 of "dubious year (Note 5)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "dtf2d" yielded 1 of "dubious year (Note 6)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "utctai" yielded 1 of "dubious year (Note 3)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "taiutc" yielded 1 of "dubious year (Note 4)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "dtf2d" yielded 10 of "dubious year (Note 6)" [astropy._erfa.core]
<Figure size 1008x432 with 0 Axes>
../_images/tutorials_RV_MCMC_Tutorial_28_2.svg

We can pass the rv_time_series = True argument in plot_orbits to display the RV time series plot as a third panel of plot_orbits:

[15]:
epochs = my_driver.system.data_table['epoch']

orbit_plot_fig = plot.plot_orbits(
    loaded_results,
    object_to_plot = 1, # Plot orbits for the first (and only, in this case) companion
    num_orbits_to_plot= 50, # Will plot 50 randomly selected orbits of this companion
    start_mjd=np.min(epochs), # Minimum MJD for colorbar (here we choose first data epoch)
    show_colorbar = True,
    rv_time_series = True
)
orbit_plot_fig.savefig('HD4747_rvtimeseries_panelplot.png', dpi=250)
WARNING: ErfaWarning: ERFA function "d2dtf" yielded 1 of "dubious year (Note 5)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "dtf2d" yielded 1 of "dubious year (Note 6)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "utctai" yielded 1 of "dubious year (Note 3)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "taiutc" yielded 1 of "dubious year (Note 4)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "dtf2d" yielded 4 of "dubious year (Note 6)" [astropy._erfa.core]
<Figure size 1008x432 with 0 Axes>
../_images/tutorials_RV_MCMC_Tutorial_30_2.svg