Grammar surface
This page describes what the parser understands: the structure of a GMAT script, the concrete syntax tree (CST) node taxonomy, and the constructs that are covered versus deferred. For the reasoning behind these choices, see the design decisions.
The language model
A GMAT script has two sections, in strict order:
- Configuration: `Create` resource declarations and literal `resource.field = value` assignments.
- Mission sequence: everything after the `BeginMissionSequence` marker, an ordered list of commands and control-flow / solver blocks.
The split is positional, not lexical: the same assignment syntax appears in both sections, and
which section a statement belongs to is determined by its position relative to
BeginMissionSequence, not by a different node type. GMAT enforces extra rules per section (for
example, configuration assignments must be literal, and Create is illegal after
BeginMissionSequence); those are semantic rules a later linter owns, not the grammar. The grammar
is deliberately permissive — it accepts the superset and leaves what GMAT would reject to later
layers.
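The positional split described above can be sketched in a few lines of Python. This is a line-based approximation for illustration only, not the parser's implementation; `split_sections` and `MARKER` are hypothetical names, and the sample script is invented.

```python
import re

# Hypothetical helper, not part of the parser: a line-based approximation
# of the positional split around the BeginMissionSequence marker.
MARKER = re.compile(r"^\s*BeginMissionSequence\s*;?\s*(%.*)?$")


def split_sections(script: str) -> tuple[list[str], list[str]]:
    """Return (configuration_lines, mission_lines), excluding the marker."""
    lines = script.splitlines()
    for i, line in enumerate(lines):
        if MARKER.match(line):
            return lines[:i], lines[i + 1:]
    # No marker: a configuration-only file, which the grammar accepts.
    return lines, []


script = """Create Spacecraft Sat
Sat.SMA = 7000
BeginMissionSequence
Sat.SMA = Sat.SMA + 100
"""
config, mission = split_sections(script)
```

Both `Sat.SMA` lines use the same assignment syntax; only their position relative to the marker decides which section they land in.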
GmatFunction (.gmf) files parse with the same grammar. They are a superset of the script
surface, adding only a function header and Global declarations.
Generic over enumerated
The grammar never enumerates resource types or command keywords. The R2026a corpus alone has 67
distinct Create types and roughly 37 command keywords, and GMAT plugins add more — so:
- `Create <Type> <name> …` parses with `<Type>` as any identifier. There is one generic `create_command`, not a rule per resource family.
- An unrecognised command keyword parses as a generic `command` node, not an error.
Only the constructs the grammar must understand structurally — assignments, the bracket-output call form, the matched begin/end blocks, the section boundary, includes, and comments — get their own node type. Everything else is recovered by later layers from a field catalogue, not from the parse tree. A resource type or command keyword added or removed in any GMAT release therefore parses with no grammar change.
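To make the "generic over enumerated" idea concrete, here is a minimal Python sketch of a single pattern that accepts any identifier as the resource type. It is a regex approximation, not the grammar itself; `parse_create` is a hypothetical name, and `MyPluginThing` is an invented type standing in for something a plugin might add.

```python
import re

IDENT = r"[A-Za-z_][A-Za-z0-9_]*"

# One generic pattern: no list of resource types appears anywhere.
CREATE = re.compile(
    rf"^\s*Create\s+(?P<type>{IDENT})\s+(?P<names>[^;%]+?)\s*;?\s*$"
)


def parse_create(line: str):
    """Return (type, [names]) for a Create line, or None for anything else."""
    m = CREATE.match(line)
    if m is None:
        return None
    # Assumption for this sketch: names are separated by whitespace.
    return m.group("type"), m.group("names").split()


known = parse_create("Create Variable x y z")
plugin = parse_create("Create MyPluginThing Widget1;")
other = parse_create("Sat.SMA = 7000")
```

Because the type is matched as a plain identifier, the invented `MyPluginThing` is handled exactly like `Variable`, with no change to the pattern.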
Node taxonomy
The named node types the parser produces, grouped by role.
Top level and structural
| Node | Surface | Notes |
|---|---|---|
| `source_file` | the whole file | the root node |
| `comment` | `% …` to end of line | attaches anywhere, including mid-construct; no block comments; a `%` inside a string is data, not a comment |
| `include` | `#Include 'path'` | preprocessor directive; top level only; trailing `;` optional |
| `create_command` | `Create <type> <name> [<name> …]` | `<type>` is any identifier; one or more names (`Create Variable x y z`); an `Array` declaration carries its `[r,c]` size as an `array_size` |
| `begin_mission_sequence` | `BeginMissionSequence` | the configuration ↔ mission-sequence boundary |
Commands
| Node | Surface | Notes |
|---|---|---|
| `assignment_command` | `[GMAT] [label] <lhs> = <rhs>` | optional leading `GMAT` keyword; optional `'label'`; `<lhs>` is a reference or array-indexed target; `<rhs>` is the full expression grammar. The same node serves a literal configuration assignment and a computed mission-sequence assignment |
| `function_call_command` | `[<out>, …] = <name>(<args>)` | the bracket-list LHS (an `output_list`) distinguishes it from an assignment; `<name>` may be dotted; `<args>` is an `argument_list`. This is the modelled function-call form |
| `command` | `<keyword> [label] <args …>` | the generic command: `Propagate`, `Maneuver`, `Report`, `Vary`, `Achieve`, `Minimize`, `Toggle`, `Save`, `Stop`, `Global`, `BeginFiniteBurn` / `EndFiniteBurn`, and any unrecognised keyword. A bare no-output call (`MyFunc(args);`) is a `command` too |
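The three command forms in the table can be told apart by surface shape alone. The Python sketch below approximates that decision for a single statement; it is an illustration under simplifying assumptions (no labels, no block headers such as `For`), and `has_top_level_equals` and `statement_kind` are hypothetical names rather than anything the parser exposes.

```python
def has_top_level_equals(s: str) -> bool:
    """True if s has a lone '=' outside brackets, braces, parens and strings."""
    depth = 0
    in_string = False
    for i, ch in enumerate(s):
        if in_string:
            in_string = ch != "'"  # a closing quote ends the string
            continue
        if ch == "'":
            in_string = True
        elif ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif ch == "=" and depth == 0:
            prev = s[i - 1] if i > 0 else ""
            nxt = s[i + 1] if i + 1 < len(s) else ""
            # Skip the relational operators ==, <=, >=, ~=.
            if prev not in {"<", ">", "~", "="} and nxt != "=":
                return True
    return False


def statement_kind(statement: str) -> str:
    s = statement.strip()
    if has_top_level_equals(s):
        # A bracket-list LHS marks the function-call form.
        return "function_call_command" if s.startswith("[") else "assignment_command"
    # Everything else, including unknown keywords, is the generic command.
    return "command"
```

Note that the `=` inside a `Propagate` brace option block sits at nesting depth 1, so the statement still classifies as a generic `command`.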
Blocks
Matched begin/end constructs with a nested command body.
| Node | Surface | Notes |
|---|---|---|
| `if_statement` | `If <cond> … [Else …] EndIf` | an `Else` branch is an `else_clause` |
| `for_statement` | `For <var> = <range> … EndFor` | the `start:step:stop` or `start:stop` range is a `for_range` |
| `while_statement` | `While <cond> … EndWhile` | |
| `target_statement` | `Target <solver> [{opts}] … EndTarget` | nests `Vary` / `Achieve` / etc. as ordinary commands; brace options are `option_assignment`s |
| `optimize_statement` | `Optimize <solver> [{opts}] … EndOptimize` | nests `Vary` / `Minimize` / `NonlinearConstraint` |
| `script_block` | `BeginScript … EndScript` | opaque: the body is a single raw-text `script_body` token, not re-parsed |
BeginFiniteBurn / EndFiniteBurn are not blocks
The commands they bracket are flat siblings, not a nested body, so each parses as an ordinary
command. The same goes for BeginFileThrust / EndFileThrust. Pairing them, if ever needed,
is a job for a layer above the grammar.
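As an illustration of what such a higher layer might look like, the Python sketch below pairs flat `BeginFiniteBurn` / `EndFiniteBurn` commands after parsing. It assumes the commands arrive as plain strings and that a pair shares a textually identical first argument; `pair_finite_burns` is a hypothetical name, and the burn and spacecraft names are invented.

```python
def pair_finite_burns(commands: list[str]) -> list[tuple[int, int]]:
    """Pair flat Begin/End commands by their first argument.

    Returns (begin_index, end_index) pairs into the command list.
    """
    open_burns: dict[str, int] = {}
    pairs: list[tuple[int, int]] = []
    for i, cmd in enumerate(commands):
        parts = cmd.replace(";", " ").split()
        if len(parts) < 2:
            continue
        keyword, burn = parts[0], parts[1]
        if keyword == "BeginFiniteBurn":
            open_burns[burn] = i
        elif keyword == "EndFiniteBurn" and burn in open_burns:
            pairs.append((open_burns.pop(burn), i))
    return pairs


commands = [
    "BeginFiniteBurn Burn1(Sat);",
    "Propagate Prop(Sat) {Sat.ElapsedSecs = 600};",
    "EndFiniteBurn Burn1(Sat);",
]
pairs = pair_finite_burns(commands)
```

The `Propagate` between the two markers stays a sibling in the list; the pairing is extra information layered on top, not a change to the tree shape.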
Values and expressions
The right-hand-side grammar.
| Node | Surface | Notes |
|---|---|---|
| `identifier` | `Sat`, `true`, `On` | case-sensitive; `true` / `false` / `On` / `Off` are lexically identifiers (their booleanness is a catalogue fact, not a node type) |
| `member_expression` | `Sat.Earth.RMAG`, `FM.GravityField.Earth.PotentialFile` | a dotted reference path of any depth |
| `call_expression` | `A(1,1)`, `sqrt(x)`, `cross(r1, v1)` | a postfix `(<args>)`. Array indexing and function invocation are syntactically identical, so they share one node; which of the two it is, is a semantic question |
| `number` | `7000`, `1.25e-1`, `1e+070` | integer / real / scientific; tolerates a zero-padded exponent |
| `string` | `'01 Jan 2025 12:00:00.000'` | single-quoted; no escapes; cannot contain `'`, a newline, or `%` |
| `array_literal` | `[1 2 3]`, `[1 0 0; 0 1 0; 0 0 1]` | square brackets; elements separated by whitespace or commas; `;` separates rows of a 2-D matrix |
| `list` | `{Earth}`, `{Sun, Luna}`, `{}` | brace-list; comma-separated; may be empty; nestable |
| `binary_expression` | `a + b`, `x^2`, `Sat.TA > 90`, `a & b` | arithmetic `+ - * / ^`; relational `< <= > >= == ~=`; logical `& \|` |
| `unary_expression` | `-Element1`, `+x` | a leading sign |
| `parenthesized_expression` | `(a + b)` | grouping |
| `unquoted_value` | `VectorType = Relative Position`, `Epoch = 19 Aug 2015 00:00:00.000` | the raw rest of a logical line, used when the value is not one of the structured forms above (multi-word enums, unquoted paths and dates) |
| `command_label` | `'Raise apogee'` | a single-quoted label that is a statement's first element (before the command keyword or assignment LHS) |
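Since `A(1,1)` and `sqrt(x)` share one `call_expression` node, telling them apart needs information from outside the parse tree. A minimal Python sketch of that semantic step follows, assuming a later layer has already collected the names declared with `Create Array`; `resolve_call` and `declared_arrays` are hypothetical names, and the regex is a rough stand-in for the real node.

```python
import re

# Rough stand-in for a call_expression: a (possibly dotted) name, then (...).
CALL = re.compile(r"^([A-Za-z_][A-Za-z0-9_.]*)\((.*)\)$")


def resolve_call(expr: str, declared_arrays: set[str]) -> str:
    """Classify a call-shaped expression using declared array names."""
    m = CALL.match(expr.strip())
    if m is None:
        return "not_a_call"
    # Same syntax either way; only the declaration catalogue decides.
    return "array_index" if m.group(1) in declared_arrays else "function_call"


arrays = {"A"}  # e.g. gathered from a `Create Array A[3,3]` declaration
```

With `arrays` as above, `resolve_call("A(1,1)", arrays)` yields `"array_index"` while `resolve_call("sqrt(x)", arrays)` yields `"function_call"`, even though both have the same surface form.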
GmatFunction header
| Node | Surface | Notes |
|---|---|---|
| `function_definition` | `function [<out>, …] = <name>(<params>)` | the `.gmf` header. The output list (an `output_list`) and the parameter list (a `parameter_list`) are each optional; the trailing `;` is optional. `Global <name> …` declarations parse as generic commands |
Layout and re-emission
Whitespace, newlines, comments, and the ... line continuation are preserved as the parser's
between-token text, so re-emission is lossless: concatenating every leaf token together with the
interstitial layout reproduces the input byte-for-byte. ... before a newline continues a
statement — it is layout, not a node. The statement terminator ; is optional and preserved
verbatim where present.
The library performs no end-of-line normalisation — it reads and writes the source's original line endings exactly, never converting CRLF↔LF.
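The round-trip property can be illustrated with a toy tokenizer in Python. This is a sketch of the idea only (tokens carry the layout that precedes them, and concatenation restores the input); the regexes, `tokenize`, and `emit` are hypothetical and far cruder than the real parser.

```python
import re

# Layout: whitespace (including \r\n), "..." continuations, and % comments.
LAYOUT = re.compile(r"(?:\s+|\.\.\.|%[^\r\n]*)+")
# Toy tokens: a lone ';' or a run of anything that is not layout or ';'.
TOKEN = re.compile(r";|[^\s%;]+")


def tokenize(src: str):
    """Return ([(layout_before, token)], trailing_layout)."""
    pieces = []
    layout = ""
    pos = 0
    while pos < len(src):
        m = LAYOUT.match(src, pos)
        if m:
            layout += m.group()
            pos = m.end()
            continue
        m = TOKEN.match(src, pos)
        text = m.group() if m else src[pos]  # fallback keeps every byte
        pieces.append((layout, text))
        layout = ""
        pos += len(text)
    return pieces, layout


def emit(pieces, trailing: str) -> str:
    """Concatenate each token with the layout that preceded it."""
    return "".join(layout + token for layout, token in pieces) + trailing


src = "Sat.SMA = 7000 + ...\r\n    100; % raise\r\n"
pieces, trailing = tokenize(src)
```

The CRLF line endings, the continuation, and the comment all live in the layout strings, so `emit(pieces, trailing)` gives back `src` unchanged. When trying this on real files in Python, open them with `newline=""` (or in binary mode) so that Python's own newline translation does not alter the line endings first.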
Covered
These parse with zero ERROR nodes across the whole R2026a stock corpus:
- Every `Create` resource declaration, handled generically across every resource family.
- The configuration section: dotted `resource.field = value` assignments, the optional `GMAT` keyword, `Array` declarations `A[r,c]` and access `A(i,j)`, brace-lists, square-bracket array and 2-D matrix literals, multi-word unquoted values, comments, blank lines, the `...` continuation.
- `#Include 'path'`.
- `BeginMissionSequence` and the mission sequence: the generic command set and the full expression grammar (arithmetic, relational / logical, function calls).
- The `Propagate` argument grammar, including `Prop(Sat) {Sat.ElapsedSecs = 8640}` brace option blocks, multi-spacecraft propagation, chained propagators, and the `BackProp` / `Synchronized` modifiers.
- Control-flow and solver blocks: `If` / `Else` / `EndIf`, `For` / `EndFor`, `While` / `EndWhile`, `Target` / `EndTarget`, `Optimize` / `EndOptimize`, with solver-mode brace options and nested commands.
- `BeginScript` / `EndScript` (opaque body) and `BeginFiniteBurn` / `EndFiniteBurn` (a flat command pair).
- Command labels on any command.
- The bracket-output function-call command `[out, …] = name(args)` (including dotted names) and the bare no-output call `Name(args);`.
- GmatFunction (`.gmf`) files: the `function` header in all its forms, `Global` declarations, and the otherwise identical surface.
- A configuration-only file with no `BeginMissionSequence` (the boundary marker is optional in the grammar even though a runnable mission needs it).
Deferred and best-effort
These parse through the generic fallback, so the file still round-trips, but they are not first-class constructs and are not corpus-tested:
- Older-release syntax. R2026a is the target; older files are best-effort.
- `ElseIf`. It recovers if encountered, but is not a first-class `if_statement` branch.
- `BeginScript` / `EndScript` bodies. Opaque by design: the raw text round-trips but is not parsed into structure.
- MATLAB / Python callback internals. The call command parses; what the external function does is out of scope.
What it does not do
The grammar answers "is this well-formed GMAT script text". It does not check whether a field name, resource type, or enum value is valid — those are semantics that vary by GMAT release and belong to a later linter. And it never runs anything: "does it parse" is structural; "does it run / converge" is GMAT's question. See What it is not.