Sensitivity analysis¶
Sobol sensitivity indices answer the canonical which input drives the
output question for a stochastic sweep. gmat_sweep.sensitivity
(sobol_sample + sobol_analyze)
wraps SALib so you can produce a
Saltelli sample design, run the sweep through the existing sweep(samples=...)
path, and read the indices off without assembling SALib's problem
dict by hand.
Install¶
The dependency is gated behind an optional extra so the default install stays slim:
Workflow¶
The three steps are: build the sample design, run the sweep, analyse.
from gmat_sweep import sweep, sobol_sample, sobol_analyze
perturb = {
"CoastTime": ("uniform", 0.0, 1200.0),
"Inj.Element1": ("normal", 1.0, 0.05),
"Inj.Element2": ("normal", 0.0, 0.05),
"Inj.Element3": ("normal", 0.0, 0.05),
}
# 1. Saltelli/Sobol sample design — n*(2D+2) rows for D=4, so n=128 → 1280 runs.
samples = sobol_sample(perturb, n=128, seed=42)
# 2. Run the sweep through the explicit-row entry point.
df = sweep("injection_dispersion.script", samples=samples)
# 3. Sobol indices on the metric of interest.
indices = sobol_analyze(df, perturb, metric="Sat.X", seed=42)
print(indices)
sobol_analyze returns a tidy long DataFrame with columns
kind / param_a / param_b / value / conf:
| kind | param_a | param_b | value | conf |
|---|---|---|---|---|
| S1 | CoastTime | NaN | 0.41 | 0.03 |
| S1 | Inj.Element1 | NaN | 0.18 | 0.02 |
| ... | ||||
| ST | CoastTime | NaN | 0.55 | 0.04 |
| ... | ||||
| S2 | CoastTime | Inj.Element1 | 0.07 | 0.05 |
kind="S1" is the first-order index (parameter alone), "ST" is the
total-order index (parameter plus all its interactions), and "S2" is
the pairwise second-order index. conf is SALib's 95 % bootstrap
confidence half-width — interpret index values as value ± conf.
Sample size¶
The Saltelli design generates n * (2*D + 2) runs at
calc_second_order=True and n * (D + 2) runs at
calc_second_order=False, where D = len(perturb). Pick n as a
power of two; SALib's authors recommend n ≥ 1024 for stable estimates
on most engineering problems.
If the second-order pairwise indices aren't part of the question, trade them away to halve the run count:
samples = sobol_sample(perturb, n=128, seed=42, calc_second_order=False)
df = sweep("...", samples=samples)
indices = sobol_analyze(df, perturb, metric="Sat.X", seed=42, calc_second_order=False)
The flag must match between the two calls — sobol_analyze doesn't
re-derive it from the row count.
Reducing per-run output to a scalar¶
metric collapses the (run_id, time)-MultiIndexed sweep result to one
scalar per run:
metric="Sat.X"— convenience for value ofSat.Xat each run's final time-step (the typical end-of-mission state).metric=callable— receives the full DataFrame, returns apd.Seriesindexed byrun_id. Use this for derived quantities like miss distance, total Δv, or any quantile across the run's trajectory.
import numpy as np
def terminal_radius(df):
last = df.groupby(level="run_id").tail(1)
return np.sqrt(last["Sat.X"]**2 + last["Sat.Y"]**2 + last["Sat.Z"]**2)
indices = sobol_analyze(df, perturb, metric=terminal_radius, seed=42)
Distribution shapes¶
sobol_sample accepts the same perturb mapping as
monte_carlo: the three shorthand tuples
(("normal", mu, sigma), ("uniform", lo, hi), ("lognormal", mu, sigma))
and any pre-frozen scipy.stats.rv_frozen. The unit-cube design from
SALib is lifted into each parameter's marginal via the distribution's
ppf — generalising past SALib's own dists knob to anything
scipy.stats can freeze.
Failure modes¶
sobol_analyze rejects a sweep result that contains any __status != "ok"
rows: SALib cannot ingest the NaN-padded values that
failed/skipped runs leave behind. Filter explicitly first:
Be aware that filtering breaks the Saltelli design's row balance — if
runs failed in the middle, the indices on the survivors are no longer
unbiased estimates. For a clean answer, re-launch the failed rows
(gmat-sweep resume) before analysing.
Determinism¶
Two sobol_sample(..., seed=K) calls produce bit-equal DataFrames at
the same (perturb, n, seed, calc_second_order). Two
sobol_analyze(..., seed=K) calls produce bit-equal indices and
bootstrap confidence intervals against the same Y vector. Saltelli
seeding is independent of monte_carlo's
per-run-name seeding contract — they are separate samplers.
What SALib methods are wrapped¶
gmat_sweep.sensitivity ships Sobol only. SALib also exposes Morris,
FAST, RBD-FAST, and DGSM samplers and analyses; if your work needs one
of those, the same recipe (sobol_sample's unit-cube + ppf lift,
applied to the SALib sampler, then handed to sweep(samples=...))
generalises in three lines — open an issue if you'd like the wrappers
upstreamed.