This document describes the identifier-shape rules used by the TDL textual
grammar, the rationale for each rule's admission set, and the principles that
govern which rule a given name slot or cross-reference should use.
## Why multiple rules
In Xtext, every literal string in the grammar (`'name'`, `'type'`,
`'component'`, …) becomes its own keyword terminal at lexer time. The lexer
runs before the parser and produces keyword tokens whenever a keyword pattern
matches; the parser cannot retroactively ask the lexer to treat a keyword
token as a generic identifier. So if a user writes `Type name`, the second
token is lexed as the keyword `name`, not as `ID`.
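
The keyword-first behaviour can be modelled with a toy lexer. This is only a sketch in Python, not Xtext's generated lexer: the `KEYWORDS` set is a hypothetical four-word subset of the grammar's literals, and the regular expression approximates the standard `ID` terminal, including its optional `^` escape.

```python
import re

# Hypothetical subset of grammar literals; each would become a keyword terminal.
KEYWORDS = {"Type", "name", "type", "component"}

# Approximation of the ID terminal: optional '^', then a letter or '_',
# then letters, digits or '_'.
TOKEN = re.compile(r"\^?[A-Za-z_][A-Za-z_0-9]*")


def lex(text):
    """Return (kind, value) pairs, giving keywords priority over ID."""
    tokens = []
    for match in TOKEN.finditer(text):
        word = match.group()
        if word.startswith("^"):
            # '^' escapes a keyword so that it lexes as a plain identifier.
            tokens.append(("ID", word[1:]))
        elif word in KEYWORDS:
            tokens.append(("KEYWORD", word))
        else:
            tokens.append(("ID", word))
    return tokens


print(lex("Type name"))     # [('KEYWORD', 'Type'), ('KEYWORD', 'name')]
print(lex("Type ^name"))    # [('KEYWORD', 'Type'), ('ID', 'name')]
print(lex("Type Message"))  # [('KEYWORD', 'Type'), ('ID', 'Message')]
```

In this model the decision is made per word, with no knowledge of what the parser expects next, which is why the name position in `Type name` never receives an `ID` token.
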
To allow keywords to be used as names anyway, an identifier rule lists the
keywords as alternatives — `Identifier: ID | 'name' | 'type' | …`. The parser
then accepts either token kind in the name position. Each keyword added to
such a rule is a usability win (one fewer identifier the user has to escape
with `^`) but spends ambiguity budget: the parser must be able to
distinguish the keyword used as a name from any other role that keyword plays
at the same syntactic position.
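
As a rough model of what such a rule admits, the check below treats a name slot as accepting either an `ID` token or a keyword token from the rule's admission set. It is a sketch only: `ADMITTED_KEYWORDS` is a hypothetical two-word subset, and tokens are plain `(kind, value)` tuples rather than anything produced by Xtext.

```python
# Hypothetical admission set for one identifier rule (illustrative subset).
ADMITTED_KEYWORDS = frozenset({"name", "type"})


def accepts_as_name(token, admitted=ADMITTED_KEYWORDS):
    """Model `Identifier: ID | 'name' | 'type'` for one (kind, value) token."""
    kind, value = token
    return kind == "ID" or (kind == "KEYWORD" and value in admitted)


print(accepts_as_name(("ID", "Message")))    # True
print(accepts_as_name(("KEYWORD", "name")))  # True: listed as an alternative
print(accepts_as_name(("KEYWORD", "size")))  # False: would need the ^ escape
```

The model covers only admission. It says nothing about ambiguity, which depends on what else the parser could do with the same keyword at that position.
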
Different name positions in the grammar tolerate different amounts of
admission. The rules below are factored along that axis so that each
position uses the widest rule it can while staying unambiguous.
## Position taxonomy
Each name slot in the grammar belongs to one of these position kinds:
| Position | Description | Example |
|---|---|---|
| **Free** | Reachable as a first token of an expression (DataUseWrapped). Competes with `'instance'`, `'parameter'`, `'omit'`, `'an'`, `'size'`, `'not'`, `'-'`, `'('`, `'?'`, `'*'`, `'@'` as alternative starts. | `dataElement` in DataElementUse, `componentInstance` in VariableUse |
| **Tight** | Anchored on at least one side by a structural token (`.`, `(`, type-ref, `=`, `\|`, `,`, `)`). No competing alternative starts with an identifier in this state. | Member name after `.`, parameter name after `(` in a binding |
| **Introducer-anchored** | Preceded by a strong introducer keyword (`Type`, `Action`, `'execute'`, `'perform'`, …). The next token's role is fixed by the parent rule. | Top-level decl names, anchored cross-refs |
| **Qualifier-prefixed** (TO) | Preceded by a greedy `Qualifier*` chain in TO contexts. Both the qualifier loop and the name slot accept any identifier; the parser greedily takes all but the last. | TO `LiteralValueFragment.name`, `DataContent.name` |
Free positions are the most constrained: a keyword admitted there must not be
a sibling-first token, a Qualifier-vocabulary word, or an expression
operator. Tight positions are looser. Introducer-anchored positions are the
loosest, bounded mostly by the rule's *follower* keyword set.
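
For the Qualifier-prefixed kind, the "all but the last" behaviour can be illustrated in a few lines. This is a sketch of the outcome only, not of how the generated parser reaches it, and the example words are invented.

```python
def split_qualified_name(words):
    """Split identifier-like words into (qualifiers, name).

    The qualifier loop is greedy, so every word except the last is treated
    as a qualifier and the last word fills the name slot.
    """
    if not words:
        raise ValueError("a name slot needs at least one identifier")
    return list(words[:-1]), words[-1]


print(split_qualified_name(["valid", "request", "message"]))
# (['valid', 'request'], 'message')
print(split_qualified_name(["message"]))
# ([], 'message')
```
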
## TDL-side rules (three tiers)
### `Identifier` — free-position rule
Used wherever a name appears at a position reachable from inside an
expression: bare `DataElementUse`, `VariableUse`'s component segment,
cross-refs to `NamedElement`, `DataType`, `ComponentInstance`, `Variable`,
`Timer`, `TimeLabel`, `DataInstance`, formal parameters, etc.
Admits a conservative keyword set:
```
'name' 'type' 'value' 'attribute'
'time' 'point' 'default'
'entity' 'event' 'component' 'variable' 'timer'
'argument' 'action' 'behaviour'
'verdict' 'exception'
'get'
'when' 'then'
'check' 'where'
'sends' 'receives' 'triggers' 'accepts'
```
Excluded and why:
| Excluded | Reason |
|---|---|
| `size`, `instance`, `parameter`, `omit`, `not`, `an` | First-token of a sibling alternative under `DataUseWrapped` |
| `start`, `stop`, `from`, `to`, `before`, `after`, `during`, `within`, `of`, `in`, `on`, `for`, `by`, `into` | Member of a typed `Qualifier` rule (CommonWord, Direction, TimeConstraint). Admitting them would degrade TO comment classification |
| `a`, `an`, `the` | `ArticleQualifier` member; `'an'` also a sibling-first of `DataInstanceUse` |
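
The admitted and excluded sets above are meant to be disjoint, and that can be checked mechanically. The sketch below copies both lists from this section; the group labels (`sibling-first`, `qualifier`, `article`) are shorthand invented for the example, not names from the grammar.

```python
ADMITTED = {
    "name", "type", "value", "attribute",
    "time", "point", "default",
    "entity", "event", "component", "variable", "timer",
    "argument", "action", "behaviour",
    "verdict", "exception",
    "get",
    "when", "then",
    "check", "where",
    "sends", "receives", "triggers", "accepts",
}

# Exclusion groups, one per row of the table above (labels are illustrative).
EXCLUDED = {
    "sibling-first": {"size", "instance", "parameter", "omit", "not", "an"},
    "qualifier": {"start", "stop", "from", "to", "before", "after", "during",
                  "within", "of", "in", "on", "for", "by", "into"},
    "article": {"a", "an", "the"},
}


def conflicts(admitted, excluded):
    """Map each exclusion group to the admitted keywords that fall inside it."""
    return {group: sorted(admitted & words)
            for group, words in excluded.items() if admitted & words}


def exclusion_reasons(keyword, excluded=EXCLUDED):
    """List the exclusion groups that contain the given keyword."""
    return sorted(group for group, words in excluded.items() if keyword in words)


print(conflicts(ADMITTED, EXCLUDED))  # {}
print(exclusion_reasons("an"))        # ['article', 'sibling-first']
print(exclusion_reasons("name"))      # []
```

A check of this shape could run whenever a keyword is added to the rule, so that a word already claimed by an exclusion group is caught before it reaches the grammar.
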