diff --git a/src/macros-by-example.md b/src/macros-by-example.md
index e95cd2e64..17e738897 100644
--- a/src/macros-by-example.md
+++ b/src/macros-by-example.md
@@ -1,5 +1,8 @@
# Macros By Example
+r[macro.decl]
+
+r[macro.decl.syntax]
> **Syntax**\
> _MacroRulesDefinition_ :\
> `macro_rules` `!` [IDENTIFIER] _MacroRulesDef_
@@ -39,6 +42,7 @@
> _MacroTranscriber_ :\
> [_DelimTokenTree_]
+r[macro.decl.intro]
`macro_rules` allows users to define syntax extensions in a declarative way. We
call such extensions "macros by example" or simply "macros".
@@ -51,10 +55,15 @@ items), types, or patterns.
## Transcribing
+r[macro.decl.transcription]
+
+r[macro.decl.transcription.intro]
When a macro is invoked, the macro expander looks up macro invocations by name,
and tries each macro rule in turn. It transcribes the first successful match; if
-this results in an error, then future matches are not tried. When matching, no
-lookahead is performed; if the compiler cannot unambiguously determine how to
+this results in an error, then future matches are not tried.
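As a minimal sketch of rule ordering (the macro name `classify` is hypothetical and chosen only for illustration), the rules below are tried from top to bottom, and the first one that matches is the one transcribed:

```rust
// Hypothetical macro for illustration: rules are tried in order.
macro_rules! classify {
    // Tried first: matches only a single identifier.
    ($x:ident) => { "identifier" };
    // Tried only if the rule above fails to match.
    ($x:expr) => { "expression" };
}

fn main() {
    // `foo` matches the first rule, so the second rule is never considered.
    assert_eq!(classify!(foo), "identifier");
    // `1 + 2` does not start with an identifier, so matching falls through
    // to the second rule.
    assert_eq!(classify!(1 + 2), "expression");
}
```

If the two rules were swapped, `classify!(foo)` would match the `expr` rule first, because a lone identifier is also a valid expression.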
+
+r[macro.decl.transcription.lookahead]
+When matching, no lookahead is performed; if the compiler cannot unambiguously determine how to
parse the macro invocation one token at a time, then it is an error. In the
following example, the compiler does not look ahead past the identifier to see
if the following token is a `)`, even though that would allow it to parse the
@@ -68,6 +77,7 @@ macro_rules! ambiguity {
ambiguity!(error); // Error: local ambiguity
```
+r[macro.decl.transcription.syntax]
In both the matcher and the transcriber, the `$` token is used to invoke special
behaviours from the macro engine (described below in [Metavariables] and
[Repetitions]). Tokens that aren't part of such an invocation are matched and
@@ -78,6 +88,8 @@ instance, the matcher `(())` will match `{()}` but not `{{}}`. The character
### Forwarding a matched fragment
+r[macro.decl.transcription.fragment]
+
When forwarding a matched fragment to another macro-by-example, matchers in
the second macro will see an opaque AST of the fragment type. The second macro
can't use literal tokens to match the fragments in the matcher, only a
@@ -116,9 +128,14 @@ foo!(3);
## Metavariables
+r[macro.decl.meta]
+
+r[macro.decl.meta.intro]
In the matcher, `$` _name_ `:` _fragment-specifier_ matches a Rust syntax
-fragment of the kind specified and binds it to the metavariable `$`_name_. Valid
-fragment specifiers are:
+fragment of the kind specified and binds it to the metavariable `$`_name_.
+
+r[macro.decl.meta.specifier]
+Valid fragment specifiers are:
* `item`: an [_Item_]
* `block`: a [_BlockExpression_]
@@ -136,18 +153,23 @@ fragment specifiers are:
* `vis`: a possibly empty [_Visibility_] qualifier
* `literal`: matches `-`?[_LiteralExpression_]
+r[macro.decl.meta.transcription]
In the transcriber, metavariables are referred to simply by `$`_name_, since
the fragment kind is specified in the matcher. Metavariables are replaced with
-the syntax element that matched them. The keyword metavariable `$crate` can be
-used to refer to the current crate; see [Hygiene] below. Metavariables can be
+the syntax element that matched them.
+
+r[macro.decl.meta.dollar-crate]
+The keyword metavariable `$crate` can be used to refer to the current crate; see [Hygiene] below. Metavariables can be
transcribed more than once or not at all.
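The following sketch illustrates transcribing a metavariable more than once and not at all. The macro names `square` and `first` are assumptions made for this example only:

```rust
// Hypothetical macros for illustration.
macro_rules! square {
    // `$x` is transcribed twice.
    ($x:expr) => { $x * $x };
}

macro_rules! first {
    // `$b` is matched but never transcribed.
    ($a:expr, $b:expr) => { $a };
}

fn main() {
    assert_eq!(square!(3), 9);
    // The matched expression is substituted as a single syntax element,
    // so this behaves like (1 + 2) * (1 + 2), not 1 + 2 * 1 + 2.
    assert_eq!(square!(1 + 2), 9);
    assert_eq!(first!("kept", "dropped"), "kept");
}
```

Note that transcribing an `expr` metavariable twice also evaluates the expression twice, which matters if it has side effects.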
+r[macro.decl.meta.expr-underscore]
For reasons of backwards compatibility, though `_` [is also an
expression][_UnderscoreExpression_], a standalone underscore is not matched by
the `expr` fragment specifier. However, `_` is matched by the `expr` fragment
specifier when it appears as a subexpression.
For the same reason, a standalone [const block] is not matched but it is matched when appearing as a subexpression.
+r[macro.decl.meta.edition2021]
> **Edition differences**: Starting with the 2021 edition, `pat` fragment-specifiers match top-level or-patterns (that is, they accept [_Pattern_]).
>
> Before the 2021 edition, they match exactly the same fragments as `pat_param` (that is, they accept [_PatternNoTopAlt_]).
@@ -156,22 +178,31 @@ For the same reason, a standalone [const block] is not matched but it is matched
## Repetitions
+r[macro.decl.repetition]
+
+r[macro.decl.repetition.intro]
In both the matcher and transcriber, repetitions are indicated by placing the
tokens to be repeated inside `$(`…`)`, followed by a repetition operator,
-optionally with a separator token between. The separator token can be any token
+optionally with a separator token between.
+
+r[macro.decl.repetition.separator]
+The separator token can be any token
other than a delimiter or one of the repetition operators, but `;` and `,` are
the most common. For instance, `$( $i:ident ),*` represents any number of
identifiers separated by commas. Nested repetitions are permitted.
+r[macro.decl.repetition.operators]
The repetition operators are:
- `*` --- indicates any number of repetitions.
- `+` --- indicates any number but at least one.
- `?` --- indicates an optional fragment with zero or one occurrence.
+r[macro.decl.repetition.optional-restriction]
Since `?` represents at most one occurrence, it cannot be used with a
separator.
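A short sketch of the three operators in use follows. The macro names `sum` and `names` are hypothetical and serve only to illustrate the syntax:

```rust
// Hypothetical macros for illustration.
macro_rules! sum {
    // `$( $x:expr ),*` : zero or more expressions separated by commas.
    // `$(,)?`          : an optional trailing comma; `?` takes no separator.
    ($($x:expr),* $(,)?) => { 0 $(+ $x)* };
}

macro_rules! names {
    // `+` requires at least one identifier.
    ($($i:ident),+) => { [$(stringify!($i)),+] };
}

fn main() {
    assert_eq!(sum!(), 0);
    assert_eq!(sum!(1, 2, 3), 6);
    assert_eq!(sum!(1, 2, 3,), 6);
    assert_eq!(names!(a, b, c), ["a", "b", "c"]);
    // names!() would fail to match, because `+` needs at least one repetition.
}
```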
+r[macro.decl.repetition.fragment]
The repeated fragment both matches and transcribes to the specified number of
the fragment, separated by the separator token. Metavariables are matched to
every repetition of their corresponding fragment. For instance, the `$( $i:ident
@@ -198,6 +229,9 @@ compiler knows how to expand them properly:
## Scoping, Exporting, and Importing
+r[macro.decl.scope]
+
+r[macro.decl.scope.intro]
For historical reasons, the scoping of macros by example does not work entirely
like items. Macros have two forms of scope: textual scope, and path-based scope.
Textual scope is based on the order that things appear in source files, or even
@@ -205,6 +239,7 @@ across multiple files, and is the default scoping. It is explained further below
Path-based scope works exactly the same way that item scoping does. The scoping,
exporting, and importing of macros is controlled largely by attributes.
+r[macro.decl.scope.unqualified]
When a macro is invoked by an unqualified identifier (not part of a multi-part
path), it is first looked up in textual scoping. If this does not yield any
results, then it is looked up in path-based scoping. If the macro's name is
@@ -224,6 +259,9 @@ self::lazy_static!{} // Path-based lookup ignores our macro, finds imported one.
### Textual Scope
+r[macro.decl.scope.textual]
+
+r[macro.decl.scope.textual.intro]
Textual scope is based largely on the order that things appear in source files,
and works similarly to the scope of local variables declared with `let` except
it also applies at the module level. When `macro_rules!` is used to define a
@@ -253,6 +291,7 @@ mod has_macro {
m!{} // OK: appears after declaration of m in src/lib.rs
```
+r[macro.decl.scope.textual.shadow]
It is not an error to define a macro multiple times; the most recent declaration
will shadow the previous one unless it has gone out of scope.
@@ -293,12 +332,14 @@ fn foo() {
m!();
}
-
// m!(); // Error: m is not in scope.
```
### The `macro_use` attribute
+r[macro.decl.scope.macro_use]
+
+r[macro.decl.scope.macro_use.mod-decl]
The *`macro_use` attribute* has two purposes. First, it can be used to make a
module's macro scope not end when the module is closed, by applying it to a
module:
@@ -314,6 +355,7 @@ mod inner {
m!();
```
+r[macro.decl.scope.macro_use.prelude]
Second, it can be used to import macros from another crate, by attaching it to
an `extern crate` declaration appearing in the crate's root module. Macros
imported this way are imported into the [`macro_use` prelude], not textually,
@@ -332,11 +374,15 @@ lazy_static!{}
// self::lazy_static!{} // Error: lazy_static is not defined in `self`
```
+r[macro.decl.scope.macro_use.export]
Macros to be imported with `#[macro_use]` must be exported with
`#[macro_export]`, which is described below.
### Path-Based Scope
+r[macro.decl.scope.path]
+
+r[macro.decl.scope.path.intro]
By default, a macro has no path-based scope. However, if it has the
`#[macro_export]` attribute, then it is declared in the crate root scope and can
be referred to normally as such:
@@ -358,11 +404,15 @@ mod mac {
}
```
+r[macro.decl.scope.path.export]
Macros labeled with `#[macro_export]` are always `pub` and can be referred to
by other crates, either by path or by `#[macro_use]` as described above.
## Hygiene
+r[macro.decl.hygiene]
+
+r[macro.decl.hygiene.intro]
By default, all identifiers referred to in a macro are expanded as-is, and are
looked up at the macro's invocation site. This can lead to issues if a macro
refers to an item or macro which isn't in scope at the invocation site. To
@@ -406,6 +456,7 @@ pub mod inner {
}
```
+r[macro.decl.hygiene.vis]
Additionally, even though `$crate` allows a macro to refer to items within its
own crate when expanding, its use has no effect on visibility. An item or macro
referred to must still be visible from the invocation site. In the following
@@ -429,6 +480,7 @@ fn foo() {}
> modified to use `$crate` or `local_inner_macros` to work well with path-based
> imports.
+r[macro.decl.hygiene.local_inner_macros]
When a macro is exported, the `#[macro_export]` attribute can have the
`local_inner_macros` keyword added to automatically prefix all contained macro
invocations with `$crate::`. This is intended primarily as a tool to migrate
@@ -449,9 +501,14 @@ macro_rules! helper {
## Follow-set Ambiguity Restrictions
+r[macro.decl.follow-set]
+
+r[macro.decl.follow-set.intro]
The parser used by the macro system is reasonably powerful, but it is limited in
-order to prevent ambiguity in current or future versions of the language. In
-particular, in addition to the rule about ambiguous expansions, a nonterminal
+order to prevent ambiguity in current or future versions of the language.
+
+r[macro.decl.follow-set.token-restriction]
+In particular, in addition to the rule about ambiguous expansions, a nonterminal
matched by a metavariable must be followed by a token that has been determined to
be safe to use after that kind of match.
@@ -464,19 +521,32 @@ matcher would become ambiguous or would misparse, breaking working code.
Matchers like `$i:expr,` or `$i:expr;` would be legal, however, because `,` and
`;` are legal expression separators. The specific rules are:
+r[macro.decl.follow-set.token-expr-stmt]
* `expr` and `stmt` may only be followed by one of: `=>`, `,`, or `;`.
+
+r[macro.decl.follow-set.token-pat_param]
* `pat_param` may only be followed by one of: `=>`, `,`, `=`, `|`, `if`, or `in`.
+
+r[macro.decl.follow-set.token-pat]
* `pat` may only be followed by one of: `=>`, `,`, `=`, `if`, or `in`.
+
+r[macro.decl.follow-set.token-path-ty]
* `path` and `ty` may only be followed by one of: `=>`, `,`, `=`, `|`, `;`,
`:`, `>`, `>>`, `[`, `{`, `as`, `where`, or a metavariable with a `block`
fragment specifier.
+
+r[macro.decl.follow-set.token-vis]
* `vis` may only be followed by one of: `,`, an identifier other than a
non-raw `priv`, any token that can begin a type, or a metavariable with an
`ident`, `ty`, or `path` fragment specifier.
+
+r[macro.decl.follow-set.token-other]
* All other fragment specifiers have no restrictions.
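As a rough illustration of these rules, the sketch below defines a matcher in which every `expr` fragment is followed by a permitted token. The macro name `pair` is hypothetical, and the commented-out definition is an example of a matcher that the rules above would reject:

```rust
// Hypothetical macro for illustration.
// `$k:expr` is followed by `=>`, which is in the follow set of `expr`.
// `$v:expr` ends the matcher, so nothing follows it.
macro_rules! pair {
    ($k:expr => $v:expr) => { ($k, $v) };
}

// Rejected at definition time: `~` is not in the follow set of `expr`.
// macro_rules! bad {
//     ($a:expr ~ $b:expr) => { ($a, $b) };
// }

fn main() {
    assert_eq!(pair!(1 + 1 => "two"), (2, "two"));
}
```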
+r[macro.decl.follow-set.edition2021]
> **Edition differences**: Before the 2021 edition, `pat` may also be followed by `|`.
+r[macro.decl.follow-set.repetition]
When repetitions are involved, then the rules apply to every possible number of
expansions, taking separators into account. This means:
@@ -490,7 +560,6 @@ expansions, taking separators into account. This means:
* If the repetition can match zero times (`*` or `?`), then whatever comes
after must be able to follow whatever comes before.
-
For more detail, see the [formal specification].
[const block]: expressions/block-expr.md#const-blocks
diff --git a/src/procedural-macros.md b/src/procedural-macros.md
index a97755f7f..0ae6e26d5 100644
--- a/src/procedural-macros.md
+++ b/src/procedural-macros.md
@@ -1,5 +1,8 @@
## Procedural Macros
+r[macro.proc]
+
+r[macro.proc.intro]
*Procedural macros* allow creating syntax extensions as the execution of a function.
Procedural macros come in one of three flavors:
@@ -11,6 +14,7 @@ Procedural macros allow you to run code at compile time that operates over Rust
syntax, both consuming and producing Rust syntax. You can sort of think of
procedural macros as functions from an AST to another AST.
+r[macro.proc.def]
Procedural macros must be defined in the root of a crate with the [crate type] of
`proc-macro`.
The macros may not be used from the crate where they are defined, and can only be used when imported in another crate.
@@ -23,6 +27,7 @@ The macros may not be used from the crate where they are defined, and can only b
> proc-macro = true
> ```
+r[macro.proc.result]
As functions, they must either return syntax, panic, or loop endlessly. Returned
syntax either replaces or adds the syntax depending on the kind of procedural
macro. Panics are caught by the compiler and are turned into a compiler error.
@@ -34,15 +39,20 @@ that the compiler has access to. Similarly, file access is the same. Because
of this, procedural macros have the same security concerns that [Cargo's
build scripts] have.
+r[macro.proc.error]
Procedural macros have two ways of reporting errors. The first is to panic. The
second is to emit a [`compile_error`] macro invocation.
### The `proc_macro` crate
+r[macro.proc.proc_macro]
+
+r[macro.proc.proc_macro.intro]
Procedural macro crates will almost always link to the compiler-provided
[`proc_macro` crate]. The `proc_macro` crate provides types required for
writing procedural macros and facilities to make it easier.
+r[macro.proc.proc_macro.token-stream]
This crate primarily contains a [`TokenStream`] type. Procedural macros operate
over *token streams* instead of AST nodes, which is a far more stable interface
over time for both the compiler and for procedural macros to target. A
@@ -51,6 +61,7 @@ can roughly be thought of as lexical token. For example `foo` is an `Ident`
token, `.` is a `Punct` token, and `1.2` is a `Literal` token. The `TokenStream`
type, unlike `Vec`, is cheap to clone.
+r[macro.proc.proc_macro.span]
All tokens have an associated `Span`. A `Span` is an opaque value that cannot
be modified but can be manufactured. `Span`s represent an extent of source
code within a program and are primarily used for error reporting. While you
@@ -59,6 +70,8 @@ with any token, such as through getting a `Span` from another token.
### Procedural macro hygiene
+r[macro.proc.hygiene]
+
Procedural macros are *unhygienic*. This means they behave as if the output
token stream was simply written inline to the code it's next to. This means that
it's affected by external items and also affects external imports.
@@ -71,13 +84,19 @@ other functions (like `__internal_foo` instead of `foo`).
### Function-like procedural macros
+r[macro.proc.function]
+
+r[macro.proc.function.intro]
*Function-like procedural macros* are procedural macros that are invoked using
the macro invocation operator (`!`).
+r[macro.proc.function.def]
These macros are defined by a [public] [function] with the `proc_macro`
[attribute] and a signature of `(TokenStream) -> TokenStream`. The input
[`TokenStream`] is what is inside the delimiters of the macro invocation and the
output [`TokenStream`] replaces the entire macro invocation.
+
+r[macro.proc.function.namespace]
The `proc_macro` attribute defines the macro in the [macro namespace] in the root of the crate.
For example, the following macro definition ignores its input and outputs a
@@ -109,6 +128,7 @@ fn main() {
}
```
+r[macro.proc.function.invocation]
Function-like procedural macros may be invoked in any macro invocation
position, which includes [statements], [expressions], [patterns], [type
expressions], [item] positions, including items in [`extern` blocks], inherent
@@ -116,14 +136,21 @@ and trait [implementations], and [trait definitions].
### Derive macros
+r[macro.proc.derive]
+
+r[macro.proc.derive.intro]
*Derive macros* define new inputs for the [`derive` attribute]. These macros
can create new [items] given the token stream of a [struct], [enum], or [union].
They can also define [derive macro helper attributes].
+r[macro.proc.derive.def]
Custom derive macros are defined by a [public] [function] with the
`proc_macro_derive` attribute and a signature of `(TokenStream) -> TokenStream`.
+
+r[macro.proc.derive.namespace]
The `proc_macro_derive` attribute defines the custom derive in the [macro namespace] in the root of the crate.
+r[macro.proc.derive.output]
The input [`TokenStream`] is the token stream of the item that has the `derive`
attribute on it. The output [`TokenStream`] must be a set of items that are
then appended to the [module] or [block] that the item from the input
@@ -161,11 +188,15 @@ fn main() {
#### Derive macro helper attributes
+r[macro.proc.derive.attributes]
+
+r[macro.proc.derive.attributes.intro]
Derive macros can add additional [attributes] into the scope of the [item]
they are on. Said attributes are called *derive macro helper attributes*. These
attributes are [inert], and their only purpose is to be fed into the derive
macro that defined them. That said, they can be seen by all macros.
+r[macro.proc.derive.attributes.def]
The way to define helper attributes is to put an `attributes` key in the
`proc_macro_derive` macro with a comma-separated list of identifiers that are
the names of the helper attributes.
@@ -197,10 +228,14 @@ struct Struct {
### Attribute macros
+r[macro.proc.attribute]
+
+r[macro.proc.attribute.intro]
*Attribute macros* define new [outer attributes][attributes] which can be
attached to [items], including items in [`extern` blocks], inherent and trait
[implementations], and [trait definitions].
+r[macro.proc.attribute.def]
Attribute macros are defined by a [public] [function] with the
`proc_macro_attribute` [attribute] that has a signature of `(TokenStream,
TokenStream) -> TokenStream`. The first [`TokenStream`] is the delimited token
@@ -209,6 +244,8 @@ the attribute is written as a bare attribute name, the attribute
[`TokenStream`] is empty. The second [`TokenStream`] is the rest of the [item]
including other [attributes] on the [item]. The returned [`TokenStream`]
replaces the [item] with an arbitrary number of [items].
+
+r[macro.proc.attribute.namespace]
The `proc_macro_attribute` attribute defines the attribute in the [macro namespace] in the root of the crate.
For example, this attribute macro takes the input stream and returns it as is,
@@ -278,9 +315,13 @@ fn invoke4() {}
### Declarative macro tokens and procedural macro tokens
+r[macro.proc.token]
+
+r[macro.proc.token.intro]
Declarative `macro_rules` macros and procedural macros use similar, but
different definitions for tokens (or rather [`TokenTree`s]).
+r[macro.proc.token.macro_rules]
Token trees in `macro_rules` (corresponding to `tt` matchers) are defined as:
- Delimited groups (`(...)`, `{...}`, etc)
- All operators supported by the language, both single-character and
@@ -296,6 +337,7 @@ Token trees in `macro_rules` (corresponding to `tt` matchers) are defined as
expansion, which will be considered a single token tree regardless of the
passed expression)
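The `macro_rules` definition above can be observed with a small token-tree counter. This is only a sketch, and the macro name `count_tts` is an assumption made for illustration:

```rust
// Hypothetical macro for illustration: counts the token trees it receives.
macro_rules! count_tts {
    () => { 0usize };
    // Peel off one token tree and recurse on the rest.
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

fn main() {
    // `a`, `+=`, `b`: the multi-character operator is a single token tree.
    assert_eq!(count_tts!(a += b), 3);
    // Each delimited group is one token tree, regardless of its contents.
    assert_eq!(count_tts!((a b c) [d e]), 2);
}
```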
+r[macro.proc.token.tree]
Token trees in procedural macros are defined as:
- Delimited groups (`(...)`, `{...}`, etc)
- All punctuation characters used in operators supported by the language (`+`,
@@ -306,11 +348,13 @@ Token trees in procedural macros are defined as
and floating point literals.
- Identifiers, including keywords (`ident`, `r#ident`, `fn`)
+r[macro.proc.token.conversion.intro]
Mismatches between these two definitions are accounted for when token streams
are passed to and from procedural macros. \
Note that the conversions below may happen lazily, so they might not happen if
the tokens are not actually inspected.
+r[macro.proc.token.conversion.to-proc_macro]
When passed to a procedural macro:
- All multi-character operators are broken into single characters.
- Lifetimes are broken into a `'` character and an identifier.
@@ -322,6 +366,7 @@ When passed to a proc-macro
- `tt` and `ident` substitutions are never wrapped into such groups and
always represented as their underlying token trees.
+r[macro.proc.token.conversion.from-proc_macro]
When emitted from a procedural macro:
- Punctuation characters are glued into multi-character operators
when applicable.
@@ -330,6 +375,7 @@ When emitted from a proc macro
possibly wrapped into a delimited group ([`Group`]) with implicit delimiters
([`Delimiter::None`]) when it's necessary for preserving parsing priorities.
+r[macro.proc.token.doc-comment]
Note that neither declarative nor procedural macros support doc comment tokens
(e.g. `/// Doc`), so they are always converted to token streams representing
their equivalent `#[doc = r"str"]` attributes when passed to macros.