Revamp of identifier rules (TO BE REVIEWED) (a2582831) · Commits · TDL Open Source / TOP IDE

plugins/org.etsi.mts.tdl.tx/IDENTIFIER_RULES.md

0 → 100644

+277 −0

		# Identifier rules in `TDLtx.xtext`

		This document describes the identifier-shape rules used by the TDL textual
		grammar, the rationale for each rule's admission set, and the principles that
		govern when a name slot or cross-ref should use which rule.

		## Why multiple rules

		In Xtext, every literal string in the grammar (`'name'`, `'type'`,
		`'component'`, …) becomes its own keyword terminal at lexer time. The lexer
		runs before the parser and produces keyword tokens whenever a keyword pattern
		matches; the parser cannot retroactively ask the lexer to treat a keyword
		token as a generic identifier. So if a user writes `Type name`, the second
		token is lexed as the keyword `name`, not as `ID`.

		To allow keywords to be used as names anyway, an identifier rule lists the
		keywords as alternatives: `Identifier: ID | 'name' | 'type' | …`. The parser
		then accepts either token kind in the name position. Each keyword added to
		such a rule is a usability win (one less identifier the user has to escape
		with `^`) but costs ambiguity budget: the parser has to be able to
		distinguish that keyword as a name from any other role the keyword plays at
		the same syntactic position.

		Different name positions in the grammar tolerate different amounts of
		admission. The rules below are factored along that axis so that each
		position uses the widest rule it can while staying unambiguous.
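
		The keyword-shadowing effect can be reproduced in a few lines. The grammar below is a minimal sketch, not an excerpt from `TDLtx.xtext`: the grammar name, the namespace URI, and the `Model` and `DataType` rules are made up for illustration, and only the shape of `Identifier` follows the description above.

		```xtext
		// Hypothetical demo grammar; names and URI are assumptions.
		grammar org.example.KeywordDemo with org.eclipse.xtext.common.Terminals

		generate keywordDemo "http://www.example.org/keywordDemo"

		Model:
		    types+=DataType*;

		// 'Type' and 'name' appear as literals, so the lexer emits them as
		// keyword tokens wherever they occur in the input.
		DataType:
		    'Type' name=Identifier ('name' label=STRING)?;

		// Datatype rule: accepts the generic ID token or one of the listed
		// keyword tokens and returns the matched text as a string.
		Identifier:
		    ID | 'name' | 'type';
		```

		With `name=ID` in place of `name=Identifier`, the input `Type name` would be rejected, because `name` reaches the parser as a keyword token. With the rule as sketched, both `Type foo` and `Type name` should be accepted.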

		## Position taxonomy

		Each name slot in the grammar belongs to one of these position kinds:

		| Position | Description | Example |
		|---|---|---|
		| Free | Reachable as a first token of an expression (DataUseWrapped). Competes with `'instance'`, `'parameter'`, `'omit'`, `'an'`, `'size'`, `'not'`, `'-'`, `'('`, `'?'`, `'*'`, `'@'` as alternative starts. | `dataElement` in DataElementUse, `componentInstance` in VariableUse |
		| Tight | Anchored on at least one side by a structural token (`.`, `(`, type-ref, `=`, `\|`, `,`, `)`). No competing alternative starts with an identifier in this state. | Member name after `.`, parameter name after `(` in a binding |
		| Introducer-anchored | Preceded by a strong introducer keyword (`Type`, `Action`, `'execute'`, `'perform'`, …). The next token's role is fixed by the parent rule. | Top-level decl names, anchored cross-refs |
		| Qualifier-prefixed (TO) | Preceded by `Qualifier*` greedy chain in TO contexts. Both the qualifier loop and the name slot accept any identifier; the parser greedily takes all but the last. | TO `LiteralValueFragment.name`, `DataContent.name` |

		Free positions are the most constrained — every keyword admitted there must
		not be a sibling-first token, not a Qualifier-vocabulary word, and not an
		expression operator. Tight positions are looser. Introducer-anchored
		positions are the loosest, bounded mostly by what the rule's follower
		keyword set is.

		## TDL-side rules (three tiers)

		### `Identifier` — free-position rule

		Used wherever a name appears at a position reachable from inside an
		expression: bare `DataElementUse`, `VariableUse`'s component segment,
		cross-refs to `NamedElement`, `DataType`, `ComponentInstance`, `Variable`,
		`Timer`, `TimeLabel`, `DataInstance`, formal parameters, etc.

		Admits a conservative keyword set:

		```
		'name' 'type' 'value' 'attribute'
		'time' 'point' 'default'
		'entity' 'event' 'component' 'variable' 'timer'
		'argument' 'action' 'behaviour'
		'verdict' 'exception'
		'get'
		'when' 'then'
		'check' 'where'
		'sends' 'receives' 'triggers' 'accepts'
		```

		Excluded and why:

		| Excluded | Reason |
		|---|---|
		| `size`, `instance`, `parameter`, `omit`, `not`, `an` | First-token of a sibling alternative under `DataUseWrapped` |
		| `start`, `stop`, `from`, `to`, `before`, `after`, `during`, `within`, `of`, `in`, `on`, `for`, `by`, `into` | Member of a typed `Qualifier` rule (CommonWord, Direction, TimeConstraint). Admitting them would degrade TO comment classification |
		| `a`, `an`, `the` | `ArticleQualifier` member; `'an'` also a sibling-first of `DataInstanceUse` |
		| `and`, `or`, `xor`, `not`, `mod`, `as` | Expression operators / cast keyword |
		| `gate` | Would clash with the `'on' (ComponentInstance \| 'gate' GR)` discriminator in `Quiescence` |
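
		Written out as an Xtext datatype rule, the admission set listed in this section might look as follows. This is a sketch assembled from the keyword list, not a copy of the rule in `TDLtx.xtext`; the ordering and line grouping are illustrative.

		```xtext
		// Sketch of the free-position rule. A datatype rule like this returns
		// the matched token text as a string.
		Identifier:
		    ID
		    | 'name' | 'type' | 'value' | 'attribute'
		    | 'time' | 'point' | 'default'
		    | 'entity' | 'event' | 'component' | 'variable' | 'timer'
		    | 'argument' | 'action' | 'behaviour'
		    | 'verdict' | 'exception'
		    | 'get'
		    | 'when' | 'then'
		    | 'check' | 'where'
		    | 'sends' | 'receives' | 'triggers' | 'accepts';
		```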

		### `TightName` — tight-position rule

		Used at: `Member.name` (struct decl), `MemberReference` (after `.`),
		`MemberAssignment` cross-ref, TDL-side `ParameterBinding.parameter` cross-ref.

		Extends `Identifier` with keywords that are blocked from `Identifier`
		by expression-level conflicts but become safe when the position is
		anchored on at least one side by a structural token:

		```
		+ 'start' 'stop'
		+ 'from' 'to'
		+ 'before' 'after'
		+ 'size'
		+ 'instance' 'parameter'
		+ 'gate'
		```

		(The `Identifier`-blocking reasons — sibling-first tokens, Qualifier
		vocabulary, Quiescence's `'gate'` discriminator — are all unreachable from
		inside a tight-anchored slot, so the keywords become safe.)
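
		A possible shape for the rule and for one of its slots is sketched below. It assumes `TightName` is written as `Identifier` plus the extra keywords (the real grammar may enumerate them differently), and the feature name `member` is hypothetical.

		```xtext
		// Sketch only: reuses Identifier and adds the tight-position extras.
		TightName:
		    Identifier
		    | 'start' | 'stop'
		    | 'from' | 'to'
		    | 'before' | 'after'
		    | 'size'
		    | 'instance' | 'parameter'
		    | 'gate';

		// Hypothetical tight slot: the '.' anchors the position, so a token
		// such as 'size' or 'from' can only be a member name here.
		MemberReference:
		    '.' member=[Member|TightName];
		```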

		### `DeclName` — introducer-anchored rule

		Used at TDL-side decl-name slots (top-level introducer-anchored) and at
		their anchored cross-refs:

		- Decls: `TestObjective`, `Constraint`, `ProcedureSignature`, `Action`,
		`Function`, `GateType`, `ComponentType`, `TestConfiguration`,
		`TestDescription`, `Time`, `Timer`.
		- Cross-refs: `extending=[PackageableElement|DeclName]`, `Import`'s
		packageable-element list, `'execute' [TestDescription|DeclName]`,
		`'perform' [Action|DeclName]`, `'uses' [TestConfiguration|DeclName]`,
		`Objective: [TestObjective|DeclName]`, `'calls'`/`'responds with'`
		`[ProcedureSignature|DeclName]`, `'instance' 'returned' 'from'`
		`[Function|DeclName]`, `[ComponentType|DeclName]` and `[GateType|DeclName]`
		in typed-decl positions, `[ConstraintType|DeclName]`,
		`[MappableDataElement|DeclName]` in `Map`, `[Timer|DeclName]` after `'::'`.

		Admits everything `Identifier` admits, plus the same extras `TightName`
		adds, plus a few more that are safe only in introducer-anchored positions:

		```
		+ everything Identifier admits
		+ everything TightName adds
		+ 'during' 'within'
		+ 'omit'
		```

		Excluded and why:

		| Excluded | Reason |
		|---|---|
		| `with` | Universal optional decl follower (body-block opener) |
		| `extends` | Follower of `Type`, `Structure`, `Component`, `Gate` decl names |
		| `optional` | Follower of `Structure` decl name |
		| `of` | Follower of `Collection` decl name |
		| `returns` | Follower of `Function` decl name |
		| `accepts` | Follower of `Gate` decl name |
		| `uses` | Follower of `TestDescription` decl name |
		| `now` | Follower of `TimeLabel` decl name |
		| `as` | Follower of `ComponentInstance` decl name; introducer of mapping clauses |
		| `Description`, `Reference`, `Configuration`, `PICS`, `PIXIT`, `Bindings`, `Objective` | TP / Variant header-section keywords |
		| `Note` | Annotation-comment introducer |
		| Top-level / capitalized section keywords (`Package`, `Type`, `Structure`, `Action`, `Function`, …) | Sibling decl introducers |
		| `and`, `or`, `xor`, `not`, `mod` | Expression operators |
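
		As a sketch, the rule can be read as `TightName` plus the introducer-only extras. The layering is an assumption about how the rule could be written, and the host rule `TestDescriptionReference` with its feature name is hypothetical; only the `'execute' [TestDescription|DeclName]` shape comes from the list above.

		```xtext
		// Sketch only: the widest TDL-side rule, built on top of TightName.
		DeclName:
		    TightName
		    | 'during' | 'within'
		    | 'omit';

		// Hypothetical anchored cross-reference. The 'execute' introducer fixes
		// the role of the next token, so a keyword such as 'omit' can only be
		// a test description name at this point.
		TestDescriptionReference:
		    'execute' testDescription=[TestDescription|DeclName];
		```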

		## Auxiliary rules

		### `AIdentifier` — annotation key/name

		Annotation keys and `AnnotationType` decl names. Extends `Identifier` with
		the multi-word Test-Purpose-block keywords so `@Initial conditions`,
		`@Expected behaviour`, `@PICS`, `@PIXIT`, `@Test Purpose Description` parse
		as annotation keys. `@when`, `@then`, `@check`, `@where` parse via the
		`Identifier` branch since those four keywords are admitted there.

		### `CheckIdentifier` — closed vocabulary

		A two-keyword choice (`check` | `where`) used at the annotation key slot
		that introduces a check-style annotation. Despite the `Identifier` suffix,
		this is a closed constraint, not a generalised name rule.

		### `KIdentifier` — TO Event names

		`ID` plus the gate-action keywords `'sends'`, `'receives'`, `'triggers'`,
		`'accepts'`, and `'in'`. Used for `to::Event` decl and refs only. Kept
		TO-local because Event naming is part of the TO sub-language's
		natural-language style.

		### `GRIdentifier` — qualified gate reference

		`Identifier ('::' Identifier)?` — qualified `component::gate` form. Used
		only at cross-ref slots that resolve to `GateReference` (`Quiescence`'s gate
		alternative, Message/ReceiveMessage source/target, ProcedureCall source,
		etc.). Each segment uses `Identifier` so the auto-qualified form
		`<CI.name>::<GI.name>` resolves correctly when component or gate names use
		keyword admissions — both `ComponentInstance.name` and `GateInstance.name`
		are on `Identifier`.

		The single-segment form (no `::`) targets the local-alias slot on
		`GateReference` (`name=ID`). Local aliases are intentionally narrow (raw
		`ID`); cross-ref input that uses an `Identifier` keyword as the
		single-segment form will parse but always fail to resolve, which is
		harmless.
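
		The rule shape quoted at the start of this section, together with the `'gate'` discriminator it is used behind, can be sketched as below. The host rule name `QuiescenceTarget` and both feature names are hypothetical; only the `'on' (… | 'gate' …)` shape is taken from this document.

		```xtext
		// One or two Identifier segments, joined by '::' in the qualified form.
		GRIdentifier:
		    Identifier ('::' Identifier)?;

		// Hypothetical fragment: after 'on', the parser chooses between a
		// component instance reference and a gate reference by looking for
		// the 'gate' keyword. Admitting 'gate' into Identifier would make
		// that first token ambiguous.
		QuiescenceTarget:
		    'on' (componentInstance=[ComponentInstance|Identifier]
		        | 'gate' gateReference=[GateReference|GRIdentifier]);
		```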

		### `PackageName` — raw `ID`

		The Package decl name and each segment of `QIdentifier`. Kept narrow
		deliberately: package names appear as segments of qualified paths, and
		qualified paths may meet expression context in future extensions. Keeping
		the segment admission narrow avoids ambiguity at those future positions.

		### `QIdentifier` — dotted package path

		`PackageName ('.' PackageName)*`. Used only at
		`'from' importedPackage=[Package|QIdentifier]`.
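
		Put together, the two package-related rules and their single use site might read as follows. The host rule name `ImportSource` is hypothetical and stands in for whatever rule carries the `'from'` clause in the real grammar.

		```xtext
		// Package names stay on raw ID: no keyword admissions.
		PackageName:
		    ID;

		// Dotted path made of PackageName segments.
		QIdentifier:
		    PackageName ('.' PackageName)*;

		// Hypothetical host rule for the documented 'from' clause.
		ImportSource:
		    'from' importedPackage=[Package|QIdentifier];
		```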

		### `NIdentifier` — numeric label

		`'-'? INT ('.' INT)?`. Despite the `Identifier` suffix, the rule does not
		reference `ID`. It appears in name slots where numeric labels are
		acceptable (clause numbers in test-purpose descriptions, AnyValue /
		SpecialValue numeric names, TO content names).
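
		As a datatype rule this is a one-liner. The sketch assumes `INT` is the terminal inherited from `org.eclipse.xtext.common.Terminals`, which this document does not state.

		```xtext
		// Optional sign, an integer part, and at most one '.'-separated
		// fractional or sub-clause part.
		NIdentifier:
		    '-'? INT ('.' INT)?;
		```

		Under this shape, labels such as `5`, `-1`, and `6.2` match, while a three-part label such as `6.2.1` does not, since only one `.` segment is allowed.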

		## TO sub-language: `TOIdentifier`

		Used in TO-only rules (`StructuredTestObjective`, `Variant`,
		`TestPurposeDescription`, `EventSpecificationTemplate`, `Entity`, `PICS`,
		the `Qualifier` family, `LiteralValueFragment`, `DataContent`,
		`LiteralValueReference` and `ContentReference` cross-refs, `TODataElementUse`,
		`TOParameterBinding`).

		Admits only the four canonical keywords (`'name'`, `'type'`, `'value'`,
		`'attribute'`). Held narrow so that TDL-side widening of `Identifier` does
		not leak into TO parsing — the TO sub-language has its own evolution path
		and its `Qualifier*` chains are sensitive to the keyword admission set
		(both for parsing safety, since unbounded keyword admission grows the
		LL(*) lookahead, and for fidelity, since admitting Qualifier-vocabulary
		keywords into `TOIdentifier` would degrade typed-comment classification).

		## Symmetry between decl and ref rules

		A name a user can declare must also be referenceable. For each element
		kind, the decl rule and the cross-ref rule for that element must be
		keyword-compatible: the cross-ref rule must admit at least every keyword
		the decl rule does, otherwise a declared name becomes unreferenceable.
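
		The requirement can be illustrated with a declaration and a cross-reference for the same element kind. The sketch below is hypothetical: the rule bodies are abbreviated, and the rule name `ActionReference` and the feature name `action` are assumptions. Only the `'perform' [Action|DeclName]` shape is taken from this document.

		```xtext
		// Symmetric: declaration and cross-reference parse the name with the
		// same rule, so every declarable name is also referenceable.
		Action:
		    'Action' name=DeclName;

		ActionReference:
		    'perform' action=[Action|DeclName];

		// Asymmetric (to be avoided): `Action start` would be a legal
		// declaration, but `perform start` could not be parsed, because
		// Identifier does not admit 'start'.
		// BrokenActionReference:
		//     'perform' action=[Action|Identifier];
		```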

		The current splits respect this:

		- Element kinds whose cross-refs reach expression positions (`DataType`,
		`ComponentInstance`, `GateInstance`, `Variable`, `TimeLabel`,
		`NamedElement`, `DataInstance`) keep their decl names on `Identifier`
		so the cross-ref can stay on `Identifier` too without expression-level
		conflicts.
		- Element kinds whose cross-refs are always introducer-anchored
		(`TestObjective`, `TestConfiguration`, `TestDescription`, `Action`,
		`Function`, `ConstraintType`, `ComponentType`, `GateType`,
		`ProcedureSignature`, `Time`, `Timer`, `MappableDataElement`,
		`PackageableElement` via `Import`/`extending`) use `DeclName` on both
		sides.
		- `Member`, `FormalParameter`, `ProcedureParameter`, and the abstract
		`Parameter` cross-refs (in `ParameterBinding`, `ParameterMapping`,
		`ValueAssignmentProcedure`, and the `'parameter' [FormalParameter|...]`
		slot) all use `TightName` on both sides.
		- `GateReference` covers two cases in one rule. Auto-qualified
		`<CI.name>::<GI.name>` references parse via `GRIdentifier`'s two-segment
		form, where each segment is `Identifier` to match the underlying
		`ComponentInstance.name` and `GateInstance.name` rules. Local aliases
		declared as `name=ID` are matched by the single-segment form of
		`GRIdentifier`; an `Identifier` keyword in the single-segment form parses
		but never resolves (no decl admits keyword aliases), which is harmless.
		- TO sub-language uses `TOIdentifier` on both sides.

		Some intentional asymmetries:

		- The `MappableDataElement` cross-ref uses `DeclName` to cover its widest
		subclass (`Action`, on `DeclName`); `DataInstance` and `DataType` are
		narrower (`Identifier`) but still resolve through the wider rule.
		- `GateReference.name` and `ExtendedGateReference.name` local-alias slots
		use raw `ID` rather than the wider `Identifier` admitted by the
		single-segment cross-ref form. The narrower decl is deliberate —
		aliases are not expected to use keyword names — and the wider cross-ref
		side just yields harmless unresolvable lookups for keyword input.

		## Trade-offs at a glance

		| Rule | Reach | Admits keywords | Use it for |
		|---|---|---|---|
		| `Identifier` | Free (expression-reachable) | Most conservative | Names referenced in expressions |
		| `TightName` | Tight (anchored on at least one side) | + sibling-first / Qualifier-vocab safe | Member / parameter-binding tight slots |
		| `DeclName` | Introducer-anchored | + nearly all non-follower keywords | Top-level decls and their anchored cross-refs |
		| `TOIdentifier` | TO sub-language only | Frozen narrow | All TO name positions |
		| `AIdentifier` | Annotation key/name | Identifier + TP-block multi-word keywords | Annotation key/name slots |
		| `KIdentifier` | TO Event names | ID + gate-action keywords | `to::Event` decl/ref |
		| `CheckIdentifier` | Annotation key for check/where | Closed two-keyword choice | Check-annotation key only |
		| `GRIdentifier` | Gate-reference cross-ref | Identifier per segment | `component::gate` cross-ref |
		| `PackageName` | Package decl + QIdentifier segment | Raw ID | Package names and qualified-path segments |
		| `QIdentifier` | Qualified package path | Raw ID per segment | `'from' [Package\|QIdentifier]` only |
		| `NIdentifier` | Numeric-as-name | Numeric literals only | Name slots accepting numeric labels |

plugins/org.etsi.mts.tdl.tx/src/org/etsi/mts/tdl/TDLtx.xtext

+198 −106
