PREFACE Workflow ================= This page describes the internal processing pipeline executed by :func:`~preface.run_preface`. The pipeline is divided into two broad phases: **Phase One** selects the most promising exoplanet targets by applying a selection metric, and **Phase Two** determines when those targets are observable from the chosen telescope location. The final output is a ranked list of transit events to direct observing efforts. The sections below describe each processing step in the order it is executed. .. contents:: :local: :depth: 1 ---- Phase One --------------------------- Phase One produces a ranked subset of TEPCat planets deemed viable for transmission spectroscopy. It proceeds through seven steps: - ``ModCheck``: Assembles the TEPCat exoplanet catalogue - ``ImpactMerger``: Recovers literature impact parameters from exoplanets.org - ``ExoplanetseuImpactMerger``: Recovers literature impact parameters from exoplanets.eu - ``WorkingTEPSetBuilder``: Calculates all remaining telescope- and time-independent parameters for selection metric calculation - ``RankMaker``: Calculates the selection metric for each planet from the working TEPCat dataset - ``Cutter``: Calculates the metric cutoff threshold, where planets with the metric lower than the threshold are deemed "non-viable" - ``ViabilitySplitter``: Divides full ranked catalogue into two output catalogues deemed "viable" and "non-viable" Step 1: ``ModCheck`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The input catalogue is primarily drawn from the Transiting ExoPlanet Catalogue (TEPCat), maintained by Dr John Southworth at Keele University. TEPCat was selected over alternatives such as exoplanets.eu for two reasons critical to pipeline operation: it provides transit durations (:math:`t_{14}`) for every entry, which is absent from most other catalogues yet essential for Phase Two transit predictions, and it provides complete mid-transit ephemerides in BJD\ :sub:`TDB`, which enables accurate prediction of future transit times. TEPCat is distributed as three separate files: - ``allplanets-csv.csv``: well-studied objects with full parameter sets. - ``kepplanets-csv.csv``: less-studied objects, typically faint Kepler targets. - ``observables.csv``: coordinates, magnitudes, ephemerides, durations, and depths for all planets in both of the above files. First, ``ModCheck`` checks the modification timestamp of each local copy. If any file is older than seven days, an updated version is downloaded from the TEPCat website before proceeding. Then, the three files are read into ``pandas`` DataFrames, cleaned, and concatenated into a single working table with consistent column headers. Any planet entry missing either transit duration (:math:`t_{14}`) or transit depth (:math:`\delta`) is removed, as both quantities are required at multiple downstream stages. Brown dwarf entries are also excluded. Finally, the cleaned catalogue is saved as ``FullTEPSet.csv`` and passed to the next step. Step 2: ``ImpactMerger`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The impact parameter :math:`b` is not provided by TEPCat's standard data files, yet it is required by Phase Two's transit geometry model. This step performs a first-pass recovery by cross-matching ``FullTEPSet.csv`` against the Exoplanet Orbit Database (exoplanets.org), which was found to hold the largest number of literature :math:`b`-values at the time of original pipeline development. Planet names are normalised between the two catalogues to ensure consistent matching, and every planet that appears in both catalogues receives its literature impact parameter at this stage. The result is written out as ``FullTEPSetWithExoOrgImpacts.csv``. Step 3: ``ExoplanetseuImpactMerger`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ For planets still missing :math:`b` after Step 2, a second cross-match is performed against a snapshot of the exoplanets.eu catalogue (currently 24 June, 2026). Although this catalogue holds fewer :math:`b`-values overall, it tends to be more up to date for recent discoveries. Planets appearing in both catalogues receive their impact parameters in this pass. The updated catalogue is saved as ``FullTEPSetWithAllImpacts.csv``. .. note:: At the time of initial deployment (September 2017), this two-pass recovery successfully retrieved literature impact parameters for 1,213 of 1,451 TEPCat entries (83.6%). The two external catalogue snapshots are static; the recovery rate may decline gradually as TEPCat continues to be updated with new discoveries. Step 4: ``WorkingTEPSetBuilder`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ At this stage, the catalogue contains all available literature parameters but is still missing several quantities required by the metric and by Phase Two. ``WorkingTEPSetBuilder`` derives these internally. **Stellar magnitudes.** TEPCat provides V- and K-band magnitudes for parent stars, but many instruments use filters outside these bands. Magnitudes in U, B, R, and I are recovered by applying published spectral type conversion tables (Mamajek 2017; Pecaut & Mamajek 2013) to each star's effective temperature and V-band magnitude. Sloan :math:`g`, :math:`r`, :math:`i`, and :math:`z` magnitudes are recovered using the empirical colour transformations of Jordi et al. (2007). Specialist longpass filters (e.g. the Liverpool Telescope RISE 720 nm filter) are handled using a flux-additivity approximation. **Semi-major axis.** Where not already present, the semi-major axis is computed from Kepler's Third Law: .. math:: a = \left( \frac{P^2\, M_*}{M_\odot} \right)^{1/3} where :math:`P` is the orbital period in years and :math:`M_*` is the stellar mass in solar masses. The result is expressed in AU. **Equilibrium temperature.** Planet equilibrium temperatures are computed assuming zero Bond albedo: .. math:: T_\mathrm{eq} = T_\mathrm{eff} \cdot \sqrt{\frac{R_*}{2a}} where :math:`T_\mathrm{eff}` is the stellar effective temperature and :math:`R_*` is the stellar radius expressed in AU. **Remaining impact parameters.** For the minority of planets for which :math:`b` could not be recovered from either external catalogue in Steps 2-3, an estimate is computed analytically using the transit geometry model of Seager & Mallén-Ornelas (2003): .. math:: b = \sqrt{ \left( \frac {1 + \frac{R_p}{R_*}} {\cos \left[ \frac{\pi t_{14}}{P} \right] } \right)^2 - \left( \frac{a}{R_*} \tan \left[ \frac{\pi t_{14}}{P} \right] \right)^2 } **Planet mass estimation.** Planetary masses cannot be derived using transit parameters alone, yet mass is required by the metric calculation for planets. For planets lacking a literature mass, PREFACE estimates :math:`M_p` from the planet's radius using the continuous mass-radius-temperature relation of Edmondson et al. (2023): .. list-table:: :header-rows: 1 :widths: 25 40 50 * - Regime - Condition - Power-law applied * - Rocky - :math:`R_p < 1.580 \,R_\oplus` - :math:`M_p = \left(R_p / 1.01\right)^{1/0.28}\,M_\oplus` * - Neptunian - :math:`1.580 \leq R_p < 9.683\,R_\oplus` - :math:`M_p = \left(R_p / 0.53\right)^{1/0.68}\,M_\oplus` * - Jovian (degenerate) - :math:`R_p \geq 9.683\,R_\oplus` - :math:`M_p = 71.70\,M_\oplus` The Neptunian-Jovian boundary is defined as the radius at which the mass-radius relation becomes degenerate for a Jovian planet with an equilibrium temperature of :math:`T_\mathrm{eq} = 500\,\mathrm{K}`. Beyond this radius, the mass-radius relation becomes degenerate, and a unique mass estimate is not recoverable from radius alone, therefore a conservative lower-bound mass estimate at the boundary is given. The final estimated mass is then converted from Earth masses to Jupiter masses before being stored. .. figure:: images/MR_relation.png :width: 95% :align: center **Figure.** Left: Planet mass as a function of planetary radius for the ranked target sample. Right: Distribution of planetary radii, showing the fraction of targets with measured (blue) and unmeasured (orange) masses within each radius bin. Step 5: ``RankMaker`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ For each planet in the working catalogue, ``RankMaker`` uses the chosen instrument's detector properties to compute an optimum exposure time and an approximate signal-to-noise ratio per planet. These quantities are used to evaluate the Phase One selection metric, which ranks planets by their expected spectroscopic return for the given instrument and filter combination. Metric scores are computed for all four available metric modes simultaneously: - ``Rank``: the standard photometric transmission spectroscopy metric (Morgan et al. 2019): .. math:: \mathcal{D} = C_{T}(\lambda) \cdot 10^{-0.2m_*} \cdot t_{14} \cdot \frac{T_\text{eq} R_p \delta}{M_p} where .. math:: C_{T}(\lambda) = N_\lambda^{-1/2} \cdot 10^{0.2m_\text{zp}} \cdot \left( \frac{t_\text{exp}}{t_\text{exp} + t_\text{overhead}} \right)^{1/2} contains all telescope-dependent variables. - ``Habitable_Rank``: the standard metric restricted to planets in or near the habitable zone of their host star. This effectively sets :math:`T_\mathrm{eq} = 1` for all targets, which can be followed by an application of a temperature mask afterwards. - ``Multi_Transit_Rank``: a metric weighted towards planets with frequent transits, suited to long-term monitoring campaigns: .. math:: \mathcal{D}_{\text{multi}} = \mathcal{D} \cdot P^{-1/2} - ``Multi_Transit_Habitable_Rank``: the habitable-zone variant of the multi-transit metric. The full catalogue is sorted by the standard ``Rank`` score from highest to lowest and saved as a ranked CSV. All four metric columns are retained so that the user can switch metric mode (via ``metric_mode``) without rerunning Phase One from scratch. .. note:: Exposure times and SNR values produced at this step are meaningful only in a relative sense. They are proportionality estimates used for ranking and are not a substitute for a dedicated exposure-time calculator when planning actual observations. Step 6: ``Cutter`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Because the Phase One metric is a proportionality rather than an absolute physical quantity, a fixed, physically motivated viability threshold cannot be defined. Instead, PREFACE uses the cumulative distribution of rank scores for a reference calibration instrument to place a data-driven cut. VLT/FORS2 (600RI+19 grism) is chosen as the calibration instrument for spectroscopy, because they access the largest viable subsets of targets, making the calibration cut as inclusive as possible. No corresponding calibration instrument is defined for photometry. The ``Cutter`` evaluates the cumulative rank-score distribution for the calibration instrument and identifies the minimum absolute rank score at which the cumulative fraction equals ``viable_cumulative_cut`` (e.g. ``0.97``). This threshold score is returned and passed to the ``ViabilitySplitter``. .. figure:: images/Cutter.png :width: 85% :align: center **Figure.** Cumulative fraction of the total ranking score contributed by the ranked target list. Horizontal dashed lines indicate the 0.80, 0.90, and 0.97 cumulative score thresholds. Step 7: ``ViabilitySplitter`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Using the threshold score from the Cutter, ``ViabilitySplitter`` divides the full ranked catalogue into two output files: - **Viable targets**: planets scoring at or above the threshold for the user's chosen instrument. These are passed to Phase Two for transit scheduling. - **Non-viable targets**: planets below the threshold. These are written to the ``nonviable_target_list/`` output directory for reference but take no further part in the pipeline. Both files are saved to the ``phase_1/`` output directory. The number of viable targets depends on both the chosen instrument and the value of ``viable_cumulative_cut``; a cut of ``0.90`` to ``0.97`` is recommended as it captures all planets in the productive upper "knee" of the cumulative distribution while excluding the long low-viability tail. ---- Phase Two ------------------------------ Phase Two takes the viable target list produced by Phase One and returns a ranked schedule of observable transit events within the user-specified observing window. For each planet, it propagates the transit ephemeris forward, evaluates every predicted event against astronomical night and altitude constraints, scores each observable event using an airmass- weighted transit coverage model and a moonlight noise model, and combines these with the Phase One metric score to produce a final event weight. Phase Two is the most computationally intensive stage of the pipeline, running at approximately 1-2 seconds per target for a viable target list of ~1000 planets in an 8-month observation window. Multiprocessing is applied at this stage using ``joblib``, distributing the workload across available CPU cores as configured by :class:`~preface.configs.MultiprocessingConfigurations`. PhaseTwo proceeds through three steps: - ``MultiprocessingWrapper``: Initializes necessary lookup tables and ``joblib`` multiprocessing for Phase Two - ``MultiprocessingProcess``: For each planet, find and rank viable transit events in the given observing window - ``PostCleaner``: Assembles ``MultiprocessingProcess`` outputs for each planet, then generates two final ranked transit datasets Step 1: ``MultiprocessingWrapper`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In order to reduce per-planet runtime during ``MultiprocessingProcess``, ``MultiprocessingWrapper`` first constructs the necessary lookup tables to be used across all parallel jobs, then dispatches those jobs via ``joblib``. The following preparatory steps are executed in the following sequence: **IERS table update.** Downloads or refreshes the IERS-A Earth orientation table required by Astropy for accurate coordinate transformations, falling back to an IERS mirror on network timeout. The cached copy is refreshed automatically if older than 7 days. **Telescope location & timezone resolution.** Gathers the IANA timezone of the telescope via its stored latitude and longitude for local-time handling during graphical output during multiprocessing. **Sun & Moon AltAz lookup table.** Precomputes Sun and Moon altitude/azimuth for every minute across the provided observation window (padded by 2 days). Positions are first computed at 10-minute precision in the local ``AltAz`` frame, then up-sampled to 1-minute resolution via cubic spline interpolation in Cartesian coordinates. Saved as a compressed Parquet file. **Lunar brightness lookup table.** If ``toggle_moonlight_noise`` is enabled, precomputes the Moon's hourly apparent magnitude using the Allen (1976) empirical phase-angle formula .. math:: m_\text{moon} (\lambda, \phi) = m_\text{full moon} (\lambda) + 0.026|\phi| + 4 \times 10^{-9} \cdot \phi^4 with lunar phase angles :math:`\phi` from ``pyephem`` and per-filter full-Moon magnitudes empirically calculated from Toledano et al. (2024)'s LIME toolbox sampling during October 2025 to May 2026 at TNT ULTRASPEC. Per-filter effective wavelengths :math:`\lambda` are retrieved from Bessell et al. (1998) and Fukugita et al. (1996). Saved as a compressed Parquet file. With all lookup tables in place, ``MultiprocessingProcess`` is then dispatched per viable planet as a ``joblib`` job, with progress tracked via ``tqdm``. Step 2: ``MultiprocessingProcess`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``MultiprocessingProcess`` is the core computational step of Phase Two, and the most computationally expensive section of the PREFACE pipeline. For this reason, it is dispatched once per viable planet as a ``joblib`` job. ``MultiprocessingProcess`` is divided into the following subprocesses: ephemeris propagation, transit contact times calculation, target local coordinate computation, nighttime observability classification, airmass-weighted scoring, transit coverage scoring, moonlight noise scoring, final transit decision metric scoring, and optional graphical output for every predicted transit event within the observing window. .. note:: TEPCat ephemerides are given in Barycentric Julian Date in Barycentric Dynamical Time (BJD\ :sub:`TDB`), which can differ from standard UTC by up to approximately 10 minutes due to relativistic corrections and accumulated leap seconds. This offset is appropriately handled internally within the pipeline. **Ephemeris propagation.** Transit first-contact times :math:`T_1` are propagated forward from the TEPCat reference ephemeris. The reference mid-transit time :math:`T_0` (in BJD\ :sub:`TDB`) is converted to a first-contact time by subtracting half the transit duration :math:`t_{14}`, and the integer epoch range spanning the observing window is computed. For each epoch :math:`n`, the transit start time :math:`T_{1,n}` and its associated timing uncertainty :math:`\sigma_n` are .. math:: T_{1,n} = T_{1,0} + n \cdot P .. math:: \sigma_n = \sqrt{\sigma_{T_0}^2 + \left( n \cdot \sigma_P \right)^2} **Transit contact time calculation.** For each predicted transit, the four contact times are derived from the transit geometry. The mid-transit :math:`T_0` and fourth contact :math:`T_4` follow directly from :math:`T_1` and :math:`t_{14}` .. math:: T_0 = T_1 + (0.5 \cdot t_{14}), \qquad T_4 = T_1 + t_{14} The internal contacts :math:`T_2` and :math:`T_3` are derived from the transit geometry model of Seager & Mallén-Ornelas (2003), from which .. math:: t_{23} = \frac{P}{\pi} \arcsin \left( \frac{ \sqrt{(R_*-R_p)^2 - (bR_*)^2} }{a} \right) and the ingress/egress duration .. math:: t_{12} = \frac{|t_{14} - t_{23}|}{2} follows. Subsequently, .. math:: T_2 = T_1 + t_{12}, \qquad T_4 = T_4 - t_{12} For grazing transits where impact parameter :math:`b = 0` and :math:`T_2` is unphysical, :math:`T_2` and :math:`T_3` are both set to :math:`T_0` and the event is treated as having no flat bottom. Finally, One-hour baseline windows are appended on either side of the transit: .. math:: T_\mathrm{BaseStart} = T_1 - 1\,\mathrm{hr}, \qquad T_\mathrm{BaseEnd} = T_4 + 1\,\mathrm{hr} **Target local coordinate computation.** The target's altitude and azimuth over time are computed for a 30-hour window (from 12 hours before closest midnight to 18 hours after) via Astropy's ``transform_to()`` method. To minimise per-event overhead, all time windows are batched and the altitude/azimuth coordinate transforms are performed simultaneously at 10-minute resolution in a single vectorised call, then interpolated to 1-minute resolution via cubic spline in Cartesian coordinates. Sun and Moon positions are drawn directly from the precomputed lookup table. **Nighttime observability classification.** First, the start and end of astronomical night are identified by detecting zero-crossings of the Sun's altitude profile relative to the :math:`-18°` astronomical twilight threshold, distinguishing sunfall (descending) from sunrise (ascending) crossings. Several multi-night edge cases are handled explicitly such that the night starts/ends are bounded by the 30-hour window and contain the corresponding transit's mid-transit time. Then, four binary altitude spot-check conditions are evaluated over the transit time window, nighttime window, and transit-night overlap windows. - *Target reaches minimum altitude:* target exceeds observable altitude at any point during the transit or night. - *Target above minimum altitude for entire transit:* target remains above :math:`30°` throughout the full transit-plus-baseline window. - *Target reaches minimum altitude at night:* target exceeds :math:`30°` at some point during the observing night. - *Target reaches minimum altitude during transit at night:* target exceeds :math:`30°` during the overlap of the transit window and the night. These conditions are combined into **strict** (``F``) and **lax** (``P``) altitude sets and evaluated against timing conditions on how the transit overlaps the night, producing one of the following internal rank markers: .. list-table:: :header-rows: 1 :widths: 15 85 * - Rank - Condition * - ``03_F`` - Night contains full transit + baselines. |br| Full transit + baselines always above observable altitude. |br| Target visible at some point during night. * - ``03_P`` - Night contains full transit + baselines. |br| Full transit + baselines at observable altitude during transit or during night. |br| Transit observable at some point during night. * - ``02_F`` - Night contains full transit but not full baselines. |br| Full transit + baselines always above observable altitude. |br| Target visible at some point during night. * - ``02_P`` - Night contains full transit but not full baselines. |br| Full transit + baselines at observable altitude during transit or during night. |br| Transit observable at some point during night. * - ``01_F`` - Night contains partial transits. |br| Full transit + baselines always above observable altitude. |br| Target visible at some point during night. * - ``01_P`` - Night contains partial transits. |br| Full transit + baselines at observable altitude during transit or during night. |br| Transit observable at some point during night. * - ``X`` - No part of the transit is observable. Event is discarded. .. |br| raw:: html
where the "minimum observable altitude" is :math:`30°`, which corresponds to an airmass of 2. The classification follows a top-down hierarchy: the strictest condition (``03_F``) is tested first, and the event falls through until a matching condition is found. **Integration limits.** For events ranked ``01`` or above, the usable observing window is bounded by integration limits :math:`L_1` and :math:`L_2`, defined as whichever is more restrictive between astronomical night and the times at which the target crosses the minimum observable altitude. For ``F``-ranked events, where the target remains above :math:`30°` throughout the transit: .. math:: L_1 = \max\left( T_\mathrm{BaseStart},\; T_\mathrm{sunfall} \right), \qquad L_2 = \min\left( T_\mathrm{BaseEnd},\; T_\mathrm{sunrise} \right) For ``P``-ranked events, the minimum-altitude crossing time(s) of the target's path are identified by detecting zero-crossings of altitude relative to :math:`30°`. A conditional logic gate handles scenarios including one or two crossing points, a rising or setting target, and varying orderings of the sunfall, altitude crossing, and transit contact times. Any event for which :math:`L_2 \leq L_1` after limit assignment is downgraded to rank ``X`` and discarded. **Airmass-weighted scoring.** The atmospheric seeing PSF :math:`\Theta` scales with airmass :math:`X = \csc(i)` at an altitude angle of :math:`i`, as :math:`\Theta \propto \lambda^{-1/5} \cdot X^{3/5}`. To penalise events at low altitude, an airmass weight is computed by the inverse of an intergral that describes how the airmass changes over the usable observing window: .. math:: W_\mathrm{airmass} = 3 \left( \frac{L_2-L_1}{\int_{L_1}^{L_2}X(t)^{3/5} dt} - \frac{2}{3} \right) The weight is amended to return a value of 1 for targets constantly at the zenith, decline steeply as altitude decreases, and have a value of 0 for targets constantly at :math:`30^\circ` altitude. .. figure:: images/AirmassMetric.png :width: 80% :align: center **Figure.** Airmass weighting function for a target observed at a constant altitude. **Transit coverage scoring.** Baseline coverage before and after the transit is essential for fitting the light curve and recovering limb-darkening parameters. The observable fraction of each of five segments within :math:`[L_1, L_2]` is evaluated: 1. Pre-transit baseline (:math:`T_\mathrm{BaseStart}` to :math:`T_1`) 2. Ingress (:math:`T_1` to :math:`T_2`) 3. Full mid-transit (:math:`T_2` to :math:`T_3`) 4. Egress (:math:`T_3` to :math:`T_4`) 5. Post-transit baseline (:math:`T_4` to :math:`T_\mathrm{BaseEnd}`) For each segment, the observable sub-interval is clipped to the overlap of the segment with the visibility window :math:`[L_1, L_2]`: .. list-table:: :header-rows: 1 :widths: 30 45 45 * - Segment - Lower clipped limit - Upper clipped limit * - Pre-transit baseline - :math:`L_{\mathrm{BS}1} = \max(L_1,\,T_\mathrm{BaseStart})` - :math:`L_{\mathrm{BS}2} = \min(L_2,\,T_1)` * - Ingress - :math:`L_{\mathrm{In}1} = \max(L_1,\,T_1)` - :math:`L_{\mathrm{In}2} = \min(L_2,\,T_2)` * - Full mid-transit - :math:`L_{\mathrm{Tr}1} = \max(L_1,\,T_2)` - :math:`L_{\mathrm{Tr}2} = \min(L_2,\,T_3)` * - Egress - :math:`L_{\mathrm{Eg}1} = \max(L_1,\,T_3)` - :math:`L_{\mathrm{Eg}2} = \min(L_2,\,T_4)` * - Post-transit baseline - :math:`L_{\mathrm{BE}1} = \max(L_1,\,T_4)` - :math:`L_{\mathrm{BE}2} = \min(L_2,\,T_\mathrm{BaseEnd})` In cases where the segment lies entirely outside the visibility window, both the clipped limits are set to zero. These clipped lengths are combined into three weight components: .. math:: W_\mathrm{base} &= \frac{(L_{\mathrm{BS}2} - L_{\mathrm{BS}1}) + (L_{\mathrm{BE}2} - L_{\mathrm{BE}1})}{\text{2 hr}} \\ W_\mathrm{trans} &= \frac{L_{\mathrm{Tr}2} - L_{\mathrm{Tr}1}}{T_3 - T_2} \\ W_\mathrm{inout} &= \frac{(L_{\mathrm{In}2} - L_{\mathrm{In}1}) + (L_{\mathrm{Eg}2} - L_{\mathrm{Eg}1})}{2\,(T_2 - T_1)} To which the composite event weight is: .. math:: W_\mathrm{event} = W_\mathrm{base} \cdot W_\mathrm{trans} \cdot W_\mathrm{inout} For grazing transits (:math:`T_2 = T_3`, so :math:`W_\mathrm{trans}` is undefined), the mid-transit term is omitted: .. math:: W_\mathrm{event} = W_\mathrm{base} \cdot W_\mathrm{inout} **Moonlight noise scoring.** If ``toggle_moonlight_noise`` is enabled, a moonlight noise weight is computed using the atmospheric scattering model of Winkler (2022), incorporating both Rayleigh and Mie scattering components. The scattered moonlight intensity after one-time scattering :math:`I_{L1}` (in flux/arcsec\ :sup:`2`) is modelled as: .. math:: I_{L1} = p(\theta) \cdot F^*_L \cdot \frac{\tau_R + \tau_M}{\tau} \cdot \sec\zeta\, \frac{e^{-\tau \sec\zeta} - e^{-\tau \sec z}}{\sec z - \sec\zeta} where the composite scattering phase function :math:`p(\theta)` is a weighted combination of the Rayleigh and Mie phase functions: .. math:: p(\theta) &= \frac{\tau_R\, p_R(\theta) + \tau_M\, p_M(\theta)}{\tau_R + \tau_M} \\ p_R(\theta) &= \frac{1}{4\pi} \frac{3(1-\chi)}{4(1+2\chi)} \left[\frac{1+3\chi}{1-\chi} + \cos^2\theta\right] \\ p_M(\theta) &= \frac{1}{4\pi} \frac{1-g^2}{(1+g^2-2g\cos\theta)^{3/2}} The quantities appearing in these expressions are defined as follows: .. list-table:: :header-rows: 1 :widths: 10 70 * - Symbol - Description * - :math:`F^*_L` - Top-of-atmosphere (TOA) lunar flux (in flux), |br| retrieved from precomputed lunar brightness LUT at start of observation * - :math:`\zeta` - Zenith angle of the observation target * - :math:`z` - Zenith angle of the Moon * - :math:`\theta` - Moon-target angular separation angle * - :math:`\tau_R` - Rayleigh scattering optical depth * - :math:`\tau_M` - Mie (aerosol) scattering optical depth, |br| sourced from the configurable ``scattering_aod`` parameter * - :math:`\tau_A` - Absorption optical depth, |br| sourced from the configurable ``absorption_aod`` parameter * - :math:`\tau` - Total optical depth (:math:`\tau = \tau_R + \tau_M + \tau_A`) * - :math:`\chi` - Rayleigh depolarisation factor, fixed at :math:`\chi = 0.0148` * - :math:`g` - Aerosol asymmetry factor, |br| sourced from the configurable ``asymmetry_factor`` parameter; |br| typically in the range 0.5-0.8 .. |br| raw:: html
The Rayleigh scattering optical depth is computed from the instrument altitude and filter wavelength as: .. math:: \tau_R = e^{-h/H}\,(1.229 \times 10^{10})\,\lambda^{-4.05} where :math:`h` is the instrument altitude in metres, :math:`H = 8500` m is Earth's scale height, and :math:`\lambda` is the filter's effective wavelength in nm. The scattered moonlight intensity :math:`I_{L1}` is then converted from flux/arcsec\ :sup:`2` to mag/arcsec\ :sup:`2` (:math:`\mu_\mathrm{bg,moon}`) using filter flux zeropoints. The sky brightness contribution over a sky patch of area :math:`A` (arcsec\ :sup:`2`) is: .. math:: m_\mathrm{bg,moon} = \mu_\mathrm{bg,moon} - 2.5\log(A) The SNR reduction factor, referred to as the moonlight noise weight, is then derived from the ratio of the new and old detectability scores :math:`\mathcal{D}`, which are directly proportional to the SNR. The resulting moonlight noise weight is: .. math:: W_\mathrm{Moon} = \left(1 + \frac{10^{-0.4(m_\mathrm{bg,moon} - \gamma)}} {10^{-0.4 m_*} + 10^{-0.4 m_\mathrm{sky}}}\right)^{-0.5} where :math:`\gamma` is a configurable "moonlight amplification factor" that adjusts the effective moon background brightness to provide stronger suppression under bright moonlight conditions. The default value (:math:`\gamma = 10`) is calibrated such that, for a full-moon sky background of :math:`\mu_\mathrm{bg,moon} = 17\,\mathrm{mag/arcsec^2}`, a sky patch of area :math:`A = 10\,\mathrm{arcsec^2}`, and a target of magnitude :math:`m_* = 12`, the resulting moonlight noise weight is approximately :math:`W_\mathrm{Moon} \approx 0.1`. .. figure:: images/MoonlightMetric.png :width: 99% :align: center **Figure.** Moonlight noise metric (Default aerosol parameters, Amplification factor 10) as a function of lunar illumination and the closest Moon-target separation. If you find the suppression too aggressive, try reducing the amplification factor to 8-9. If ``toggle_moonlight_noise`` is disabled, :math:`W_\mathrm{Moon} = 1`. **Final transit decision metric.** Finally, the final decision metric for a given transit event is the product of four terms: .. math:: \mathcal{D}_\mathrm{final} = \mathcal{D} \cdot W_\mathrm{airmass} \cdot W_\mathrm{event} \cdot W_\mathrm{Moon} where :math:`\mathcal{D}` is the Phase One metric score for the planet under the chosen ``metric_mode``. This formulation ensures the final ranking reflects both the intrinsic spectroscopic value of a target (Phase One) and the practical quality of the specific observing opportunity (Phase Two). **Graphical output.** If ``toggle_graph_outputs`` is enabled and the event weight :math:`W_\mathrm{event}` meets or exceeds ``event_weight_graph_threshold``, a sky chart is generated showing the target's altitude path (coloured by azimuth), the Moon's path, the observing night bounded by astronomical twilight, the transit window and baseline shaded in red, and altitude floor and ceiling limits. Moon illumination and closest lunar separation are annotated. Plots are saved as JPG files in ``phase_2/graph/`` per transit labelled with the event weight, internal rank, planet name, instrument, and transit time in local time. .. figure:: images/example_graph.jpg :width: 99% :align: center **Figure.** Example graphical output for K2-83b predicted at TNT ULTRASPEC. **Per-planet output.** All computed quantities (contact times, internal rank, UTC observation start and end, airmass metric, segment weights, event weight, moon noise metric, and final metric) are written to an individual CSV in ``phase_2/individual_planets/``. Internal state lists are cleared at the end of each planet's processing to prepare for the next job. Step 3: ``PostCleaner`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Once all per-planet event files have been generated by the multiprocessing pool, ``PostCleaner`` assembles the final pipeline outputs: **Full event list.** All per-planet CSV files are read then merged into a single DataFrame. Unobservable events (internal rank ``X``) are removed. The remaining events are sorted by final transit metric :math:`\mathcal{D}_\mathrm{final}` from highest to lowest and written to ``phase_2/full_ranked_event_list/``. This file is the primary output of the pipeline and can be used directly to construct observing schedules. **Cumulative observability scores.** For each planet, all events for which event weight :math:`W_\mathrm{event} > 0.5` (full transit capture with at least 50% baseline coverage on both sides) are identified. Their :math:`\cal{D}_\mathrm{final}` scores are summed to produce a "cumulative observability score", and the total count of qualifying events is recorded. These results are written to ``phase_2/cumulative_observability_scores/``. This product is intended to help campaign planners distinguish between planets that offer a single high-priority window and those that present multiple well-scored opportunities throughout the observation window. .. note:: Intermediate CSV files written to the pipeline's internal data store during the run can be removed after completion using :func:`~preface.wipe_intermediate_csvs`.