{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Modifying MCMC Initial Positions\n", "\n", "by Henry Ngo (2019) & Sarah Blunt (2021) & Mireya Arora (2021)\n", "\n", "When you set up the MCMC Sampler, the initial position of your walkers are randomly determined. Specifically, they are uniformly distributed in your Prior phase space. This tutorial will show you how to change this default behaviour so that the walkers can begin at locations you specify. For instance, if you have an initial guess for the best fitting orbit and want to use MCMC to explore posterior space around this peak, you may want to start your walkers at positions centered around this peak and distributed according to an N-dimensional Gaussian distribution. \n", "\n", "Note: This tutorial is meant to be read after reading the [MCMC Introduction tutorial](https://orbitize.readthedocs.io/en/latest/tutorials/MCMC_tutorial.html). If you are wondering what walkers are, you should start there!\n", "\n", "The `Driver` class is the main way you might interact with `orbitize!` as it automatically reads your input, creates all the `orbitize!` objects needed to do your calculation, and defaults to some commonly used parameters or settings. However, sometimes you want to work directly with the underlying API to do more advanced tasks such as changing the MCMC walkers' initial positions, or [modifying the priors](https://orbitize.readthedocs.io/en/latest/tutorials/Modifying_Priors.html).\n", "\n", "This tutorial walks you through how to do that. \n", "\n", "**Goals of this tutorial**:\n", "- Learn to modify the MCMC `Sampler` object\n", "- Learn about the structure of the `orbitize` code base" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Import modules" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from scipy.optimize import minimize as mn\n", "import orbitize\n", "from orbitize import driver\n", "import multiprocessing as mp" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## 1) Create Driver object\n", "\n", "First, let's begin as usual and create our `Driver` object, as in the MCMC Introduction tutorial." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "filename = \"{}/GJ504.csv\".format(orbitize.DATADIR)\n", "\n", "# system parameters\n", "num_secondary_bodies = 1\n", "total_mass = 1.75 # [Msol]\n", "plx = 51.44 # [mas]\n", "mass_err = 0.05 # [Msol]\n", "plx_err = 0.12 # [mas]\n", "\n", "# MCMC parameters\n", "num_temps = 5\n", "num_walkers = 30\n", "num_threads = mp.cpu_count() # or a different number if you prefer\n", "\n", "\n", "my_driver = driver.Driver(\n", " filename,\n", " \"MCMC\",\n", " num_secondary_bodies,\n", " total_mass,\n", " plx,\n", " mass_err=mass_err,\n", " plx_err=plx_err,\n", " mcmc_kwargs={\n", " \"num_temps\": num_temps,\n", " \"num_walkers\": num_walkers,\n", " \"num_threads\": num_threads,\n", " },\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## 2) Access the `Sampler` object to view the walker positions\n", "\n", "As mentioned in the introduction, the `Driver` class creates the objects needed for the orbit fit. At the time of this writing, it creates a `Sampler` object which you can access with the `.sampler` attribute and a `System` object which you can access with the `.system` attribute.\n", "\n", "The `Sampler` object contains all of the information used by the orbit sampling algorithm (OFTI or MCMC) to fit the orbit and determine the posteriors. The `System` object contains information about the astrophysical system itself (stellar and companion parameters, the input data, etc.). \n", "\n", "To see all of the attributes of the driver object, you can use `dir()`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(dir(my_driver))" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "This returns many other functions too, but you see `sampler` and `system` at the bottom. Don't forget that in Jupyter notebooks, you can use `my_driver?` to get the docstring for its class (i.e. the `Driver` class) and `my_driver??` to get the full source code of that class. You can also get this information in the API documentation.\n", "\n", "Now, let's list the attributes of the `my_driver.sampler` attribute." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(dir(my_driver.sampler))" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Again, you can use the `?` and `??` features as well as the API documentation to find out more. Here we see an attribute `curr_pos` which contains the current position of all the walkers for the MCMC sampler. These positions were generated upon initialization of the `Sampler` object, which happened as part of the initialization of the `Driver` object." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Examine `my_driver.sampler.curr_pos`\n", "\n", "`curr_pos` is an array and has shape (`n_temps`, `n_walkers`, `n_params`) for the parallel-tempered MCMC sampler and shape (`n_walkers`, `n_params`) for the affine-invariant ensemble sampler. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_driver.sampler.curr_pos.shape # Here we are using the parallel-tempered MCMC sampler" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Basically, this is the same shape as the output of the Sampler. Let's look at the start position of the first five walkers at the lowest temperature, to get a better sense of what the strucutre is like." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(my_driver.sampler.curr_pos[0, 0:5, :])" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## 3) Replace `curr_pos` with your own initial positions for walkers\n", "\n", "When the sampler is run with the `sampler.run_sampler()` method, it will start the walkers at the `curr_pos` values, run the MCMC forward for the given number of steps, and then update `curr_pos` to reflect where the walkers ended up. The next time `run_sampler()` is called, it does the same thing again.\n", "\n", "Here, you have just created the sampler but have not run it yet. So, if you update `curr_pos` with our own custom start locations, when you run the sampler, it will begin at your custom start locations instead." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 3.1) Generate your own initial positions\n", "\n", "There are many ways to create your own walker start distribution and what you want to do will depend on your science question and prior knowledge. \n", "\n", "If you have already generated and validated your own initial walker positions, you can skip down to the \"Update sampler position\". Some users use the output of OFTI or a previous MCMC run as the initial position.\n", "\n", "If you need to generate your own positions, read on. Here, let's assume you know a possible best fit value and your uncertainty in that fit. Perhaps you got this through a least squares minimization. So, let's create a distribution of walkers that are centered on the best fit value and distributed normallly with the 1-sigma in each dimension equal to the uncertainty on that best fit value." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "First, let's define the best fit value and the spread. As a reminder, the order of the parameters in the array is (for a single planet-star system): semimajor axis, eccentricity, inclination, argument of periastron, position angle of nodes, epoch of periastron passage, parallax and total mass. You can check the indices with this dict in the `system` object." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(my_driver.system.param_idx)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set centre and spread of the walker distribution\n", "# Values from Table 1 in Blunt et al. 2017, AJ, 153, 229\n", "sma_cen = 44.48\n", "sma_sig = 15.0\n", "ecc_cen = 0.0151\n", "ecc_sig = 0.175\n", "inc_cen = 2.30 # (131.7 deg)\n", "inc_sig = 0.279 # (16.0 deg)\n", "aop_cen = 1.60 # (91.7 deg)\n", "aop_sig = 1.05 # (60.0 deg)\n", "pan_cen = 2.33 # (133.7 deg)\n", "pan_sig = 0.872 # (50.0 deg)\n", "tau_cen = 0.77 # (2228.11 yr)\n", "tau_sig = 0.65 # (121.0 yr)\n", "\n", "# Note : parallax and stellar mass already defined above (plx, plx_err, total_mass, mass_err)\n", "walker_centres = np.array(\n", " [sma_cen, ecc_cen, inc_cen, aop_cen, pan_cen, tau_cen, plx, total_mass]\n", ")\n", "walker_1sigmas = np.array(\n", " [sma_sig, ecc_sig, inc_sig, aop_sig, pan_sig, tau_sig, plx_err, mass_err]\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "You can use `numpy.random.standard_normal` to generate normally distributed random numbers in the same shape as your walker initial positions (`my_driver.sampler.curr_pos.shape`). Then, multiply by `walker_1sigmas` to get the spread to match your desired distribution and add `walker_centres` to get the distribution centered on your desired values." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "curr_pos_shape = my_driver.sampler.curr_pos.shape # Get shape of walker positions\n", "\n", "# Draw from multi-variate normal distribution to generate new walker positions\n", "new_pos = np.random.standard_normal(curr_pos_shape) * walker_1sigmas + walker_centres" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 3.2) Using an optimizer to obtain a best fit value" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Other optimizing software can also be used to generate intial positions. Depending on the quality of data collected and whether a suitable guess array of parameters can be made, different optimizing software can provide better best fit values for for MCMC walkers. Below you will find a few options that cater to different scenarios." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 3.2a) Using `scipy.optimize.minimize`" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Assuming the data obtained allows for a suitable guess to be made for each parameter, a scipy.optimize.minimize software can be used to generate a best fit value. You may want to skip this step and input your guess values directly into MCMC's initial walker positions, however scipy can help refine the guess parameters.\n", "\n", "First, we define a new log liklihood function function `neg_logl` based on the guess values we have. \n", "Note, since we have predefined a good guess, from the aforementioned Table, as `walker_centres` we will continue to use it as a guess array for examples below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The following code performs a minimization whereas the log liklihood function is based on maximization so we redefine the\n", "# likelihood function is redefined to return -x to make this a minization scenario\n", "\n", "m = my_driver.sampler\n", "\n", "\n", "def neg_logl(paramarray):\n", " x = m._logl(\n", " paramarray, include_logp=True\n", " ) # set include_logp to true to include guess array in likelihood function\n", "\n", " return -x\n", "\n", "\n", "guessarray = walker_centres\n", "results = mn(neg_logl, guessarray, method=\"Powell\")\n", "print(results.x) # results.x is the best fit value" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "In our trials, Powell has given the best results, but you may replace it with a different minimizing method depending on your need. " ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 3.3) Scattering walkers" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "To set up MCMC so that it explores the nearby probablity space thoroughly and finds the global minimum, you can scatter the initial positions of the walkers around the best fit value. This can be done by adding random numbers to `results.x`\n", "\n", "This section overrides walker_1sigmas and creates a spread of `new_pos` in a different manner than above.\n", "The following is a template based on the aforementioned Table. The scatter is created using a variety of methods, we recommend reviewing the code to ensure it is compatible to your data." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_pos = np.random.standard_normal(curr_pos_shape) * 0.03 + results.x" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 3.4) Update sampler position\n", "\n", "After generating and validating your new walker positions, through whatever methods you choose, it's now time to update the sampler object to have its `curr_pos` be your new positions." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_driver.sampler.curr_pos = np.copy(new_pos)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 3.5) Validate your new positions\n", "\n", "Drawing from a normal distribution can cause your walkers to start outside of your prior space. See the Modifying Priors tutorial for information on how to interact with the prior objects, which would allow you to find the limits on each parameter set by the priors etc. \n", "\n", "Here, let's do something more simple and just check that all values are physically valid. After this we can begin to correct them.\n", "\n", "The following function can be used to identify walkers that have been initialized outside of the appropriate prior probability space. It will raise a `ValueError` if walkers are initialized outside of the priors. You should update your positions until this method runs without raising an error." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "try:\n", " my_driver.sampler.check_prior_support()\n", "except Exception as e:\n", " print(e)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "We should continue investigating which parameters are being initialized outside of the prior space until this function returns empty lists." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "And you're done! You can continue at \"Running the MCMC Sampler\" in the [MCMC Introduction Tutorial](https://orbitize.readthedocs.io/en/latest/tutorials/MCMC_tutorial.html#Running-the-MCMC-Sampler)" ] } ], "metadata": { "interpreter": { "hash": "e899b22145868d3cd465733d82c36c2ae3ac0d3591d6a0807ec2e5e577a9cf5c" }, "kernelspec": { "display_name": "Python 3.7.5 64-bit", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.4" } }, "nbformat": 4, "nbformat_minor": 2 }