Radial Velocity Tutorial for MCMC
By Roberto Tejada (2019)
This tutorial will assume the user is familiar with the Driver
class and is acquainted with MCMC terminology. For more information about MCMC, see the MCMC Introduction Tutorial.
We explain how to jointly fit radial velocity data and relative astrometry using the MCMC technique. First we need a set of data containing radial velocity measurements. We check the data using read_input
and observe the quant_type
column for radial velocity data. For more information on orbitize.read_input.read_file()
, see the Formatting Input Tutorial. You can fit for separate jitter and gamma terms for each RV
instrument in your dataset by adding an “instrument” column to your data csv.
NOTE: Astrometry+RV fitting currently only works with MCMC and not OFTI.
Read and Format Data
[1]:
import numpy as np
import matplotlib.pyplot as plt
from orbitize import read_input, plot, priors, driver, DATADIR
import multiprocessing as mp
data_table = read_input.read_file("{}/HD4747.csv".format(DATADIR)) # print all columns
data_table.pprint_all()
epoch object quant1 quant1_err quant2 quant2_err quant12_corr quant_type instrument
--------- ------ -------- ---------- ------ ---------- ------------ ---------- ----------
56942.3 1 606.5 7.0 180.04 0.62 nan seppa defsp
57031.2 1 606.6 6.4 180.52 0.58 nan seppa defsp
57289.4 1 604.0 7.0 184.9 0.9 nan seppa defsp
50366.475 0 -0.54103 0.00123 nan nan nan rv defrv
50418.329 0 -0.40053 0.00206 nan nan nan rv defrv
50462.252 0 -0.24094 0.0011 nan nan nan rv defrv
50689.604 0 0.37292 0.00117 nan nan nan rv defrv
50784.271 0 0.46223 0.00133 nan nan nan rv defrv
50806.227 0 0.48519 0.00103 nan nan nan rv defrv
50837.217 0 0.49395 0.00117 nan nan nan rv defrv
50838.201 0 0.49751 0.00112 nan nan nan rv defrv
50839.211 0 0.50187 0.00112 nan nan nan rv defrv
51009.62 0 0.53355 0.00135 nan nan nan rv defrv
51011.548 0 0.53164 0.00128 nan nan nan rv defrv
51013.602 0 0.53629 0.0016 nan nan nan rv defrv
51050.491 0 0.52154 0.00468 nan nan nan rv defrv
51170.248 0 0.50757 0.0014 nan nan nan rv defrv
51367.585 0 0.47678 0.00131 nan nan nan rv defrv
51409.527 0 0.46147 0.00523 nan nan nan rv defrv
51543.244 0 0.44311 0.00152 nan nan nan rv defrv
51550.229 0 0.43286 0.00147 nan nan nan rv defrv
51755.551 0 0.39329 0.00213 nan nan nan rv defrv
51899.284 0 0.36457 0.00163 nan nan nan rv defrv
52097.626 0 0.32986 0.00157 nan nan nan rv defrv
52488.542 0 0.26687 0.0015 nan nan nan rv defrv
52572.312 0 0.25035 0.00195 nan nan nan rv defrv
52987.228 0 0.19466 0.00238 nan nan nan rv defrv
52988.186 0 0.18469 0.00194 nan nan nan rv defrv
53238.456 0 0.16892 0.00134 nan nan nan rv defrv
53303.403 0 0.16769 0.00112 nan nan nan rv defrv
53339.265 0 0.16069 0.00119 nan nan nan rv defrv
53724.274 0 0.11302 0.00103 nan nan nan rv defrv
53724.276 0 0.11605 0.00112 nan nan nan rv defrv
54717.455 0 0.00984 0.00123 nan nan nan rv defrv
54718.508 0 0.01242 0.00115 nan nan nan rv defrv
54719.51 0 0.01572 0.00123 nan nan nan rv defrv
54720.47 0 0.01534 0.00113 nan nan nan rv defrv
54722.401 0 0.01479 0.00127 nan nan nan rv defrv
54723.47 0 0.01422 0.00122 nan nan nan rv defrv
54724.474 0 0.01169 0.0012 nan nan nan rv defrv
54725.383 0 0.01383 0.00113 nan nan nan rv defrv
54726.505 0 0.0195 0.00123 nan nan nan rv defrv
54727.452 0 0.0175 0.00113 nan nan nan rv defrv
55014.62 0 -0.00636 0.00141 nan nan nan rv defrv
55015.624 0 -0.00409 0.00138 nan nan nan rv defrv
55016.624 0 -0.00566 0.00121 nan nan nan rv defrv
55048.525 0 -0.01975 0.00124 nan nan nan rv defrv
55076.584 0 -0.01614 0.00128 nan nan nan rv defrv
55077.594 0 -0.01303 0.00126 nan nan nan rv defrv
55134.464 0 -0.01689 0.00136 nan nan nan rv defrv
55198.254 0 -0.02885 0.0012 nan nan nan rv defrv
55425.584 0 -0.04359 0.00125 nan nan nan rv defrv
55522.379 0 -0.0512 0.0013 nan nan nan rv defrv
55806.547 0 -0.07697 0.0013 nan nan nan rv defrv
56148.55 0 -0.10429 0.00128 nan nan nan rv defrv
56319.201 0 -0.1102 0.00128 nan nan nan rv defrv
56327.208 0 -0.11332 0.00149 nan nan nan rv defrv
56507.645 0 -0.12324 0.00133 nan nan nan rv defrv
56912.534 0 -0.17085 0.00113 nan nan nan rv defrv
The quant_type
column displays the type of data each row contains: astrometry (radec or seppa), or radial velocity (rv). For astrometry, quant1
column contains right ascension or separation, and the quant2
column contains declination or position angle. For rv data, quant1
contains radial velocity data in \(\mathrm{km/s}\), while quant2
is filled with nan
to preserve the data structure. The table contains each respective error column.
We can now initialize the Driver
class. MCMC samplers take time to converge to absolute maxima in parameter space, and the more parameters we introduce, the longer we expect it to take.
Create Driver Object
For joint orbital fits with RV data, we need to fit the stellar and companion masses (m0
and m1
respectively as separate free parameters). This differs from the astrometry-only case where fitting the total mass mtot
suffices. We set the system keyword fit_secondary_mass
to True
when initializing the Driver object.
[2]:
filename = "{}/HD4747.csv".format(DATADIR)
# system parameters
num_secondary_bodies = 1
stellar_mass = 0.84 # [Msol]
plx = 53.18 # [mas]
mass_err = 0.04 # [Msol]
plx_err = 0.12 # [mas]
# MCMC parameters
num_temps = 5
num_walkers = 30
num_threads = 2 # or a different number if you prefer, eg mp.cpu_count()
my_driver = driver.Driver(
filename, 'MCMC', num_secondary_bodies, stellar_mass, plx, mass_err=mass_err, plx_err=plx_err,
system_kwargs = {'fit_secondary_mass':True, 'tau_ref_epoch':0},
mcmc_kwargs={'num_temps': num_temps, 'num_walkers': num_walkers, 'num_threads': num_threads}
)
Since MCMC is an object in orbitize!
, we can assign a variable to the sampler and work with this variable:
[3]:
m = my_driver.sampler
RV Priors
The priors for the two RV parameters, the radial velocity offset (gamma), and jitter (sigma), have default uniform prior and log uniform prior respectively. The gamma uniform prior is set between \((-5,5)\) \(\mathrm{km/s}\), and the jitter log uniform prior is set for (\(10^{-4},0.05\)) \(\mathrm{km/s}\). The prior for m1
is a log uniform prior and is set for (\(10^{-3},2.0\))\(M_\odot\). The current version of orbitize addressed in this tutorial returns the
stellar radial velocity only.
NOTE: We may change the priors as instructed in the Modifying Priors tutorial:
[4]:
# getting the system object:
sys = my_driver.system
lab = sys.param_idx
print(sys.labels)
print(sys.sys_priors)
print(vars(sys.sys_priors[lab['m1']]))
['sma1', 'ecc1', 'inc1', 'aop1', 'pan1', 'tau1', 'plx', 'gamma_defrv', 'sigma_defrv', 'm1', 'm0']
[Log Uniform, Uniform, Sine, Uniform, Uniform, Uniform, Gaussian, Uniform, Log Uniform, Log Uniform, Gaussian]
{'minval': 1e-06, 'maxval': 2, 'logmin': -13.815510557964274, 'logmax': 0.6931471805599453}
[5]:
# change the m1 prior:
sys.sys_priors[lab['m1']] = priors.LogUniformPrior(1e-4, 0.5)
print(sys.labels)
print(sys.sys_priors)
print(vars(sys.sys_priors[lab['m1']]))
['sma1', 'ecc1', 'inc1', 'aop1', 'pan1', 'tau1', 'plx', 'gamma_defrv', 'sigma_defrv', 'm1', 'm0']
[Log Uniform, Uniform, Sine, Uniform, Uniform, Uniform, Gaussian, Uniform, Log Uniform, Log Uniform, Gaussian]
{'minval': 0.0001, 'maxval': 0.5, 'logmin': -9.210340371976182, 'logmax': -0.6931471805599453}
Running the MCMC Sampler
As noted in the MCMC Introduction Tutorial, we must choose the sampler step for MCMC and can save every \(nth\) sample to avoid using too much disk space using thin
.
[6]:
total_orbits = 1000 # number of steps x number of walkers (at lowest temperature)
burn_steps = 10 # steps to burn in per walker
thin = 2 # only save every 2 steps
[7]:
m.run_sampler(total_orbits, burn_steps=burn_steps, thin=thin)
Starting Burn in
/home/sblunt/Projects/orbitize/orbitize/priors.py:354: RuntimeWarning: invalid value encountered in log
lnprob = -np.log((element_array*normalizer))
/home/sblunt/Projects/orbitize/orbitize/priors.py:354: RuntimeWarning: invalid value encountered in log
lnprob = -np.log((element_array*normalizer))
/home/sblunt/Projects/orbitize/orbitize/priors.py:463: RuntimeWarning: invalid value encountered in log
lnprob = np.log(np.sin(element_array)/normalization)
/home/sblunt/Projects/orbitize/orbitize/priors.py:463: RuntimeWarning: invalid value encountered in log
lnprob = np.log(np.sin(element_array)/normalization)
10/10 steps of burn-in complete
Burn in complete. Sampling posterior now.
Run complete
[7]:
<ptemcee.sampler.Sampler at 0x7f17c0d19b50>
Now we can plot the distribution of MCMC parameter of interest:
[8]:
accepted_m1 = m.results.post[:, lab['m1']]
plt.hist(accepted_m1,histtype='step')
plt.xlabel('m1'); plt.ylabel('number of orbits')
plt.show()
Saving Results over Extended MCMC Run
Sometimes our MCMC run will need to run for an extended period of time to let the walkers converge. To observe the convergence, we often need to see the walkers’ progress along parameter space. We can save the sampler results periodically and keep running the sampler until convergence. To run for a greater number of steps and periodically save the results, we can create a for-loop and run for as many iterations as we’d like.
[9]:
filename = "{}/HD4747.csv".format(DATADIR)
total_orbits = 1000 # number of steps x number of walkers (at lowest temperature)
burn_steps = 10 # steps to burn in per walker
thin = 2 # only save every 10th step
# system parameters
num_secondary_bodies = 1
stellar_mass = 0.84 # [Msol]
plx = 53.18 # [mas]
mass_err = 0.04 # [Msol]
plx_err = 0.12 # [mas]
# MCMC parameters
num_temps = 5
num_walkers = 30
num_threads = 2 # or a different number if you prefer, e.g. mp.cpu_count()
my_driver = driver.Driver(
filename, 'MCMC', num_secondary_bodies, stellar_mass, plx, mass_err=mass_err, plx_err=plx_err,
system_kwargs = {'fit_secondary_mass':True, 'tau_ref_epoch':0},
mcmc_kwargs={'num_temps': num_temps, 'num_walkers': num_walkers, 'num_threads': num_threads}
)
m = my_driver.sampler
We’re now ready for the loop! The results
object contains a save_results
function which lets us save the results for our directory, and we will use the load_results
object from results
to access the data later. We also define the n_iter
below to mark how many MCMC runs to save our within results.
NOTE: To avoid long convergence periods, we may initialize the walkers in a sphere around the global minima of the parameter space as outlined in our Modifying MCMC Initial Positions Tutorial.
[10]:
# file name to save as:
hdf5_filename = 'my_rv_posterior_%1d.hdf5'
[11]:
n_iter = 2 # number of iterations
for i in range(n_iter):
# running the sampler:
orbits = m.run_sampler(total_orbits, burn_steps=burn_steps, thin=thin)
myResults = m.results
hdf5_filename = 'my_rv_posterior_%1d.hdf5' % i
myResults.save_results(hdf5_filename) # saves results object as an hdf5 file
Starting Burn in
/home/sblunt/Projects/orbitize/orbitize/priors.py:354: RuntimeWarning: invalid value encountered in log
lnprob = -np.log((element_array*normalizer))
/home/sblunt/Projects/orbitize/orbitize/priors.py:463: RuntimeWarning: invalid value encountered in log
lnprob = np.log(np.sin(element_array)/normalization)
/home/sblunt/Projects/orbitize/orbitize/priors.py:354: RuntimeWarning: invalid value encountered in log
lnprob = -np.log((element_array*normalizer))
/home/sblunt/Projects/orbitize/orbitize/priors.py:463: RuntimeWarning: invalid value encountered in log
lnprob = np.log(np.sin(element_array)/normalization)
10/10 steps of burn-in complete
Burn in complete. Sampling posterior now.
Run complete
Starting Burn in
/home/sblunt/Projects/orbitize/orbitize/priors.py:354: RuntimeWarning: invalid value encountered in log
lnprob = -np.log((element_array*normalizer))
/home/sblunt/Projects/orbitize/orbitize/priors.py:354: RuntimeWarning: invalid value encountered in log
lnprob = -np.log((element_array*normalizer))
/home/sblunt/Projects/orbitize/orbitize/priors.py:463: RuntimeWarning: invalid value encountered in log
lnprob = np.log(np.sin(element_array)/normalization)
/home/sblunt/Projects/orbitize/orbitize/priors.py:463: RuntimeWarning: invalid value encountered in log
lnprob = np.log(np.sin(element_array)/normalization)
10/10 steps of burn-in complete
Burn in complete. Sampling posterior now.
Run complete
Plotting and Accesing Saved Results
We can plot the corner plot saved in the results object by following the steps in the Advanced Plotting Tutorial:
[12]:
median_values = np.median(myResults.post,axis=0) # Compute median of each parameter
range_values = np.ones_like(median_values)*0.95 # Plot only 95% range for each parameter
corner_figure_median_95 = myResults.plot_corner(
range=range_values,
truths=median_values
)
As illustrated in the plot above, MCMC needs more time to run. We only performed two iterations in the loop to demonstrate its useage, but with increased n_iter
, the trendplots saved in the loop and the corner plot will show how the walkers converge to absolute extrema in parameter space.
To access the saved data, we can read it into a results object as shown in the MCMC Introduction Tutorial:
[13]:
from orbitize import results
loaded_results = results.Results() # Create blank results object for loading
loaded_results.load_results('my_rv_posterior_%1d.hdf5' % (n_iter-1))
To demonstrate use of the loaded results file above, we can use the saved results to plot our orbital plots:
[14]:
epochs = my_driver.system.data_table['epoch']
orbit_plot_fig = loaded_results.plot_orbits(
object_to_plot = 1, # Plot orbits for the first (and only, in this case) companion
num_orbits_to_plot= 50, # Will plot 50 randomly selected orbits of this companion
start_mjd=epochs[0], # Minimum MJD for colorbar (here we choose first data epoch)
)
WARNING: ErfaWarning: ERFA function "d2dtf" yielded 1 of "dubious year (Note 5)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "dtf2d" yielded 1 of "dubious year (Note 6)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "utctai" yielded 1 of "dubious year (Note 3)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "taiutc" yielded 1 of "dubious year (Note 4)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "dtf2d" yielded 10 of "dubious year (Note 6)" [astropy._erfa.core]
<Figure size 1008x432 with 0 Axes>
We can pass the rv_time_series = True
argument in plot_orbits
to display the RV time series plot as a third panel of plot_orbits
:
[15]:
epochs = my_driver.system.data_table['epoch']
orbit_plot_fig = plot.plot_orbits(
loaded_results,
object_to_plot = 1, # Plot orbits for the first (and only, in this case) companion
num_orbits_to_plot= 50, # Will plot 50 randomly selected orbits of this companion
start_mjd=np.min(epochs), # Minimum MJD for colorbar (here we choose first data epoch)
show_colorbar = True,
rv_time_series = True
)
orbit_plot_fig.savefig('HD4747_rvtimeseries_panelplot.png', dpi=250)
WARNING: ErfaWarning: ERFA function "d2dtf" yielded 1 of "dubious year (Note 5)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "dtf2d" yielded 1 of "dubious year (Note 6)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "utctai" yielded 1 of "dubious year (Note 3)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "taiutc" yielded 1 of "dubious year (Note 4)" [astropy._erfa.core]
WARNING: ErfaWarning: ERFA function "dtf2d" yielded 4 of "dubious year (Note 6)" [astropy._erfa.core]
<Figure size 1008x432 with 0 Axes>