Evaluating gmat-sweep

A ten-minute end-to-end recipe: install gmat-sweep, run a small grid sweep against a bundled mission script, and inspect the resulting manifest and per-run DataFrame.

Prerequisites

Python 3.10, 3.11, or 3.12.
A working GMAT install on the same machine. gmat-sweep does not bundle GMAT binaries; it depends on gmat-run for the single-run primitive, and gmat-run discovers your install at runtime. Follow gmat-run's install guide to download a build (R2025a or R2026a) from the SourceForge release page.

A quick sanity check that GMAT is wired up:

python -c "import gmat_run; inst = gmat_run.locate_gmat(); print(f'{inst.version} at {inst.root}')"

That prints, for example, R2026a at /home/you/gmat-R2026a. If it raises GmatNotFoundError, work through the gmat-run install guide before continuing — the typical fix is exporting GMAT_ROOT to point at your unpacked install.
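If the check fails, it can help to see what GMAT_ROOT currently points at before re-running it. The following is a minimal stdlib-only sketch. `describe_gmat_root` is a hypothetical helper, not part of gmat-run, and it only tests that the variable names an existing directory; it does not reproduce gmat-run's own discovery logic.

```python
import os
from pathlib import Path


def describe_gmat_root(env=None):
    """Return a one-line diagnosis of the GMAT_ROOT environment variable.

    Hypothetical helper for illustration only; gmat-run's real discovery
    may consult other locations as well.
    """
    env = os.environ if env is None else env
    value = env.get("GMAT_ROOT")
    if not value:
        return "GMAT_ROOT is not set"
    root = Path(value).expanduser()
    if not root.is_dir():
        return f"GMAT_ROOT points at {root}, which is not a directory"
    return f"GMAT_ROOT points at {root}"


if __name__ == "__main__":
    print(describe_gmat_root())
```

A directory that exists is necessary but not sufficient: the locate_gmat() one-liner above remains the authoritative check.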

Install gmat-sweep

pip install gmat-sweep

That pulls gmat-run as a transitive dependency. No extras are needed for this recipe.

The mission

Download mission.script and save it in your working directory. It is a 60-second LEO propagation: one Spacecraft (Sat) in a circular orbit with a 7000 km semi-major axis, a point-mass Earth force model, an RK89 propagator, and a ReportFile that records UTC time, ECI position, SMA, and ECC. It parses cleanly under R2025a and R2026a.

Run the sweep

Run mission.script three times with Sat.SMA = 7000, 7100, 7200:

from pathlib import Path

from gmat_sweep import sweep

out = Path("./sweep")
df = sweep(
    "mission.script",
    grid={"Sat.SMA": [7000, 7100, 7200]},
    out=out,
)
print(df.head())

This launches three fresh GMAT subprocesses, one per Sat.SMA value. The returned DataFrame is (run_id, time)-MultiIndexed; each row carries the columns declared in the ReportFile plus a __status column whose value is ok, failed, or skipped. Passing out=Path("./sweep") keeps the per-run Parquet files and the manifest on disk under that directory; omit it and they land in a temporary directory whose lifetime is tied to the returned DataFrame.
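To get a feel for the (run_id, time) layout without running GMAT, the sketch below builds a small stand-in frame with pandas and shows two common selections. The values are invented, and the datetime dtype of the time level and run_id numbering from 0 are assumptions rather than documented behaviour.

```python
import pandas as pd

# Stand-in for the frame sweep() returns: two runs, two report rows each.
# The datetime dtype of "time" and run_id starting at 0 are assumptions.
times = pd.to_datetime(["2026-01-01 00:00:00", "2026-01-01 00:01:00"])
index = pd.MultiIndex.from_product([[0, 1], times], names=["run_id", "time"])
demo = pd.DataFrame(
    {
        "Sat.Earth.SMA": [7000.0, 7000.0, 7100.0, 7100.0],
        "__status": ["ok", "ok", "ok", "ok"],
    },
    index=index,
)

# All rows for one run, indexed by time alone.
one_run = demo.xs(1, level="run_id")

# Only rows whose run completed successfully.
ok_rows = demo[demo["__status"] == "ok"]

print(one_run)
print(len(ok_rows))
```

The same two selections should apply unchanged to a real sweep result, provided the index levels carry the names run_id and time.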

The same call from a shell:

gmat-sweep run --grid Sat.SMA=7000:7200:3 --out ./sweep mission.script

--grid name=lo:hi:count produces count evenly spaced points from lo to hi inclusive. See Parameter spec for the full grammar.
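The lo:hi:count rule can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not gmat-sweep's actual parser: `expand_grid_spec` is a hypothetical name, and returning just lo when count is 1 is an assumption.

```python
def expand_grid_spec(spec):
    """Expand 'name=lo:hi:count' into (name, list of evenly spaced values).

    Hypothetical helper; it mirrors the documented rule, not the real parser.
    """
    name, _, range_part = spec.partition("=")
    lo_text, hi_text, count_text = range_part.split(":")
    lo, hi, count = float(lo_text), float(hi_text), int(count_text)
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        # Assumption: a single point collapses to lo.
        return name, [lo]
    step = (hi - lo) / (count - 1)
    # Both endpoints are included: i = 0 gives lo, i = count - 1 gives hi.
    return name, [lo + i * step for i in range(count)]


print(expand_grid_spec("Sat.SMA=7000:7200:3"))
# ('Sat.SMA', [7000.0, 7100.0, 7200.0])
```

Dividing by count - 1 rather than count is what makes hi inclusive, which is why 7000:7200:3 yields a 100 km spacing.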

Inspect the manifest

Every sweep writes a JSON Lines manifest alongside its outputs. The file is append-only and fsync'd after every entry. The first line is a header with sweep-level metadata; each remaining line is one run. For a 3-run sweep the file has four lines total.

The header line:

head -1 ./sweep/manifest.jsonl | python -m json.tool

The header records the canonical inputs and the software-version fingerprint:

script_sha256 — SHA-256 of the canonicalised mission.script. A resumed sweep validates this before re-running anything.
parameter_spec — the grid definition ({"Sat.SMA": [7000, 7100, 7200], "_kind": "grid"}).
gmat_sweep_version, gmat_run_version, gmat_install_version, python_version, os_platform — the full reproduction fingerprint.
run_count — number of runs the manifest expects.
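Because the header carries run_count, a short script can tell whether a manifest is complete. The sketch below uses only the standard library and runs against a stand-in file with invented values. `read_manifest` and `missing_run_count` are hypothetical helpers, and the check assumes an interrupted sweep leaves fewer per-run lines than run_count.

```python
import json
import tempfile
from pathlib import Path


def read_manifest(path):
    """Split a JSON Lines manifest into (header, list of per-run entries)."""
    with open(path, encoding="utf-8") as handle:
        records = [json.loads(line) for line in handle if line.strip()]
    if not records:
        raise ValueError(f"{path} is empty")
    return records[0], records[1:]


def missing_run_count(header, runs):
    """Number of runs the header expects that have no entry yet."""
    return header["run_count"] - len(runs)


# Stand-in manifest with invented values; run_id starting at 0 is an assumption.
sample_lines = [
    {"parameter_spec": {"Sat.SMA": [7000, 7100, 7200], "_kind": "grid"}, "run_count": 3},
    {"run_id": 0, "status": "ok", "overrides": {"Sat.SMA": 7000}},
    {"run_id": 1, "status": "ok", "overrides": {"Sat.SMA": 7100}},
]
with tempfile.TemporaryDirectory() as tmp:
    manifest_path = Path(tmp) / "manifest.jsonl"
    text = "\n".join(json.dumps(record) for record in sample_lines) + "\n"
    manifest_path.write_text(text, encoding="utf-8")
    header, runs = read_manifest(manifest_path)

print(missing_run_count(header, runs))  # 1: the third run has no entry
```

Pointing read_manifest at ./sweep/manifest.jsonl after the three-run sweep above should report zero missing runs.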

A per-run line:

head -2 ./sweep/manifest.jsonl | tail -1 | python -m json.tool

Each per-run line records what happened for one run:

run_id — sequential integer assigned at grid-expansion time.
status — ok, failed, or skipped.
overrides — this run's parameter overrides (e.g. {"Sat.SMA": 7100}).
output_paths — paths to the per-run Parquet files.
started_at / ended_at / duration_s — wall-clock timing.
stderr — captured GMAT stderr if status is failed; null otherwise.

Full field reference: Manifest schema.
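Once the per-run lines are parsed into dictionaries (for example with json.loads on every line after the first), a small summary is often the first thing worth computing. The sketch below counts runs per status and collects stderr for the failed ones. `summarise_runs` is a hypothetical helper and the entries are invented; only the field names come from the list above.

```python
from collections import Counter


def summarise_runs(runs):
    """Count runs per status and collect stderr text for the failed ones."""
    counts = Counter(run["status"] for run in runs)
    failures = {
        run["run_id"]: run["stderr"] for run in runs if run["status"] == "failed"
    }
    return counts, failures


# Invented per-run entries; run_id starting at 0 is an assumption.
runs = [
    {"run_id": 0, "status": "ok", "overrides": {"Sat.SMA": 7000}, "stderr": None},
    {"run_id": 1, "status": "failed", "overrides": {"Sat.SMA": 7100}, "stderr": "example error text"},
    {"run_id": 2, "status": "ok", "overrides": {"Sat.SMA": 7200}, "stderr": None},
]
counts, failures = summarise_runs(runs)
print(dict(counts))
print(failures)
```

Keying the failures by run_id makes it easy to look up the matching overrides and see which parameter values triggered the error.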

Inspect the DataFrame

The DataFrame has one row per (run_id, time-step). To compare runs at a single time point — for instance, the final report row of each run:

final_step = df.groupby("run_id").tail(1)
print(final_step[["Sat.Earth.SMA", "__status"]])

You should see three rows, one per run_id, with Sat.Earth.SMA close to the input values (the 60-second propagation does not change SMA appreciably for an unperturbed circular orbit) and __status == "ok" everywhere.
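The "close to the input values" check can be automated. The following is a sketch under stated assumptions: `final_sma_matches` is a hypothetical helper, the 1e-6 relative tolerance is an arbitrary choice, and the demo frame uses invented values (with run 2 deliberately off) in place of a real sweep result.

```python
import math

import pandas as pd


def final_sma_matches(df, expected, rel_tol=1e-6):
    """Map run_id -> True if the last reported SMA is close to expected[run_id]."""
    # Last report row of each run, then drop "time" so run_id is the only key.
    final = df.groupby(level="run_id").tail(1)
    sma = final["Sat.Earth.SMA"].droplevel("time")
    return {
        run_id: math.isclose(sma.loc[run_id], target, rel_tol=rel_tol)
        for run_id, target in expected.items()
    }


# Stand-in frame with invented values; run 2 ends 50 km away from its input.
times = pd.to_datetime(["2026-01-01 00:00:00", "2026-01-01 00:01:00"])
index = pd.MultiIndex.from_product([[0, 1, 2], times], names=["run_id", "time"])
demo = pd.DataFrame(
    {"Sat.Earth.SMA": [7000.0, 7000.0, 7100.0, 7100.0, 7200.0, 7250.0]},
    index=index,
)
result = final_sma_matches(demo, {0: 7000.0, 1: 7100.0, 2: 7200.0})
print(result)  # {0: True, 1: True, 2: False}
```

With a real sweep, the expected mapping can be built from each run's overrides entry in the manifest rather than typed by hand.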

Where to go next

Getting started — the four-line vision snippet and the (run_id, time)-MultiIndex contract.
Backends — LocalJoblibPool (default), DaskPool, RayPool, KubernetesJobPool, MPIPool, and friends, for scaling the same sweep() call across multiple hosts.
Monte Carlo — stochastic dispersion sweeps with named distributions and a determinism contract.
Resume — finishing a sweep that was killed mid-run from the manifest.
Examples — twelve runnable notebooks covering Latin hypercube, Sobol sensitivity, Dask / Ray cluster recipes, archive bundles, solver convergence, and downstream-consumer pipelines.