RFC: ? repetition in macro rules #2298

Merged
merged 6 commits into from
Feb 27, 2018
Changes from 2 commits
171 changes: 171 additions & 0 deletions text/0000-macro-at-most-once-rep.md
@@ -0,0 +1,171 @@
- Feature Name: macro-at-most-once-rep
- Start Date: 2018-01-17
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)


Summary
-------

Add a repetition specifier to macros to repeat a pattern at most once: `$(pat)?`. Here, `?` behaves like `+` or `*` but represents at most one repetition of `pat`.

Motivation
----------

Contributor

Another motivation you could add is an argument from familiarity with regular expressions.

Member Author

I mention this in the rationale as a reason why ? was chosen, but since the RFC explicitly chooses to not add {M, N}, I think it would be hard to add familiarity with regexes to the motivation.

There are two specific use cases in mind.

## Macro rules with optional parts

Member

Could you point out some macros from crates.io that would benefit from this syntax for optional parts? The example given here strikes me as contrived.

Contributor

@durka durka Feb 10, 2018

I would have used it in this crate to match the following syntaxes in one rule:

  1. trait Foo { ... }
  2. trait Foo: Bar { ... }
  3. pub trait Foo { ... }
  4. pub trait Foo: Bar { ... }

namely the rule:

($vis:vis trait $traitname:ident $(: $supertrait:ident)? { $($body:tt)* })

(edited to fix local ambiguity)

Member Author

This also strikes me as a candidate for reimplementation: https://docs.rs/clap/2.29.4/clap/macro.clap_app.html


Currently, you just have to write two rules and possibly have one "desugar" to the other.

```rust
macro_rules! foo {
    (do $b:block) => {
        $b
    };
    (do $b1:block and $b2:block) => {
        foo!(do $b1);
        $b2
    }
}
```

Under this RFC, one would simply write:

```rust
macro_rules! foo {
    (do $b1:block $(and $b2:block)?) => {
        $b1
        $($b2)?
    }
}
```
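
As a runnable sketch of the same idea: two bare blocks in sequence do not form a single expression, so this variant sums the values of the blocks instead. The macro name `run` and the summing behaviour are assumptions made for illustration and are not part of the RFC text.

```rust
// Hypothetical macro `run!`: evaluates the first block and, if the optional
// `and { ... }` part is present, adds the value of the second block to it.
macro_rules! run {
    (do $b1:block $(and $b2:block)?) => {{
        // `mut` is unused when the optional part is absent.
        #[allow(unused_mut)]
        let mut total = $b1;
        // Transcribed once if `and $b2` was matched, otherwise not at all.
        $( total += $b2; )?
        total
    }};
}
```

A single rule now covers both `run!(do { 1 })` and `run!(do { 1 } and { 2 })`.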

## Trailing commas

Currently, the best way to make a rule tolerate trailing commas is to create another identical rule that has a comma at the end:

```rust
macro_rules! foo {
    ($(pat),*,) => { foo!( $(pat),* ) };
Member

This rule should be $(pat),+, so that foo!(,) is not permitted.

Member Author

👍 will fix shortly

    ($(pat),*) => {
        // do stuff
    }
}
```

or to allow multiple trailing commas:

```rust
macro_rules! foo {
    ($(pat),* $(,)*) => {
        // do stuff
    }
}
```

This is unergonomic and clutters up macro definitions needlessly. Under this RFC, one would simply write:

```rust
macro_rules! foo {
    ($(pat),* $(,)?) => {
Member

It sounds like better support for trailing delimiters is a major part of the motivation for this RFC and why people are excited about it. But the old approach shown under "Currently" still seems better than the one here because with ? we cannot disallow an invocation consisting of only a comma, foo!(,). Is the idea that $(,)? would be a quick and easy shorthand but macro authors would continue to use the old approach if they want their macro to be maximally correct? If so, could you brainstorm some ideas for how optional trailing delimiters could be better supported?

Contributor

One way would be a new repetition operator, let's say $(pat),# (# obviously up for 🚲 🏡 ), which accepts a delimited list with optional trailing separator, the same way Rust does in most contexts. I think that could be done orthogonally to this RFC.

Member Author

Hmm... that's true. It works well if you have + repetition.

One option is to combine ? with | as described in #2298 (comment)

Member Author

Another option is to create another repetition mode which indicates trailing commas: $[pat],*

        // do stuff
    }
}
```
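
To make the trailing-comma case concrete, here is a minimal sketch using an expression fragment in place of the literal `pat` placeholder. The macro name `sum` and the choice to add the arguments together are assumptions made for illustration only.

```rust
// Hypothetical macro `sum!`: adds up a comma-separated list of expressions
// and tolerates at most one trailing comma via `$(,)?`.
macro_rules! sum {
    ($($x:expr),* $(,)?) => {
        0 $(+ $x)*
    };
}
```

With this rule, `sum!(1, 2, 3)` and `sum!(1, 2, 3,)` expand to the same expression, while `sum!(1, 2, 3,,)` is rejected because `?` allows the comma at most once.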

Guide-level explanation
-----------------------

In Rust macros, you specify some "rules" which define how the macro is used and what it transforms to. For each rule, there is a pattern and a body:

```rust
macro_rules! foo {
    (pattern) => { body }
}
```

The pattern portion is composed of zero or more subpatterns concatenated together. One possible subpattern is to repeat another subpattern some number of times. This is useful when writing variadic macros (e.g. `println`):

```rust
macro_rules! println {
    // Takes a variable number of arguments after the template
    ($template:expr $(, $args:expr)*) => { ... }
}
```
which can be invoked like so:
```rust
println!("") // 0 args
println!("", foo) // 1 arg
println!("", foo, bar) // 2 args
...
```

The `*` in the pattern of this example indicates "0 or more repetitions". One can also use `+` for "at _least_ one repetition" or `?` for "at _most_ one repetition".
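
The three operators can be compared side by side. The sketch below uses hypothetical macro names; each macro collects the identifiers it was given as strings. It assumes the form of `?` that takes no separator, which is how the operator behaves in Rust as eventually stabilized.

```rust
// Illustrative macros (names are hypothetical, not from the RFC).

// `*`: zero or more comma-separated identifiers.
macro_rules! zero_or_more {
    ($($name:ident),*) => {{
        let names: Vec<&str> = vec![$(stringify!($name)),*];
        names
    }};
}

// `+`: at least one comma-separated identifier.
macro_rules! one_or_more {
    ($($name:ident),+) => {{
        let names: Vec<&str> = vec![$(stringify!($name)),+];
        names
    }};
}

// `?`: at most one identifier (no separator is involved).
macro_rules! at_most_one {
    ($($name:ident)?) => {{
        let names: Vec<&str> = vec![$(stringify!($name))?];
        names
    }};
}
```

Here `one_or_more!()` and `at_most_one!(a b)` fail to compile, since neither invocation matches its rule.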

In the body of a rule, one can repeat some code for every occurrence of the pattern in the invocation:

```rust
macro_rules! foo {
    ($($pat:expr),*) => {
        $(
            println!("{}", $pat);
        )* // Repeat for each `expr` passed to the macro
    }
}
```

The same can be done for `+` and `?`.

The `?` operator is particularly useful for making macro rules with optional components in the invocation or for making macros tolerate trailing commas.
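
As an illustration of an optional component, the rule suggested in review for matching a trait with an optional supertrait can be sketched as follows. The names `declare_trait`, `Named`, `Greeter`, and `World` are hypothetical, and the macro simply re-emits the trait it matched.

```rust
// Hypothetical macro: accepts `trait Foo { ... }`, `trait Foo: Bar { ... }`,
// and the `pub` variants of both, using a single rule.
macro_rules! declare_trait {
    ($vis:vis trait $name:ident $(: $sup:ident)? { $($body:tt)* }) => {
        // The optional supertrait is transcribed only if it was matched.
        $vis trait $name $(: $sup)? { $($body)* }
    };
}

// No supertrait, no visibility.
declare_trait!(trait Named {
    fn name(&self) -> String;
});

// Both the visibility and the optional supertrait are present.
declare_trait!(pub trait Greeter: Named {
    fn greet(&self) -> String {
        format!("hello, {}", self.name())
    }
});

struct World;

impl Named for World {
    fn name(&self) -> String {
        "world".to_string()
    }
}

impl Greeter for World {}
```

Without `?`, covering the supertrait and non-supertrait forms would take two near-identical rules.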

Reference-level explanation
---------------------------

`?` is identical to `+` and `*` in use except that it represents "at most once" repetition.

Drawbacks
---------
While there are grammar ambiguities, they can be easily fixed, as noted by @kennytm [here](https://internals.rust-lang.org/t/pre-rfc-at-most-one-repetition-macro-patterns/6557/2?u=mark-i-m):

> There is ambiguity: $($x:ident)?+ today matches a?b?c and not a+. Fortunately this is easy to resolve: you just look one more token ahead and always treat ?* and ?+ to mean separate by the question mark token.

Member

This should better be put back to reference-level explanation. You cannot implement ? without clarifying this bit.


Rationale and Alternatives
--------------------------

The implementation of `?` ought to be very similar to `+` and `*`. Only the parser needs to change; to the author's knowledge, it would not be technically difficult to implement, nor would it add much complexity to the compiler.

The `?` character is chosen because
- Although it introduces grammar ambiguities (noted above), they can be easily fixed
- It is consistent with common regex syntax, as are `+` and `*`
- It intuitively expresses "this pattern is optional"

One alternative to alleviate the trailing comma paper cut is to allow trailing commas automatically for any pattern repetitions. This would be a breaking change. Also, it would allow trailing commas in potentially unwanted places. For example:

```rust
macro_rules! foo {
    ($($pat:expr),*; $(foo),*) => {
        $(
            println!("{}", $pat);
        )* // Repeat for each `expr` passed to the macro
    }
}
```
would allow
```rust
foo! {
    x,; foo
}
```

Also, rather than have `?` be a repetition operator, we could have the compiler do a "copy/paste" of the rule and insert the optional pattern. Implementation-wise, this might reuse less code than the proposal. It is also probably harder to teach; this RFC is very easy to teach because `?` is another operator like `+` or `*`.

We could use a symbol other than `?`, but it's not clear what other options might be better. `?` has the advantage of already being known in common regex syntax as "optional".

It has also been suggested to add `{M, N}` (at least `M` but no more than `N`) either in addition to or as an alternative to `?`. Like `?`, `{M, N}` is common regex syntax and has the same implementation difficulty level. However, it's not clear how useful such a pattern would be. In particular, we can't think of any other language that includes this sort of "partially-variadic" argument list. It is also questionable why one would want to _syntactically_ repeat some piece of code between `M` and `N` times. Thus, this RFC does not propose to add `{M, N}` at this time (though we note that it is forward-compatible).

Finally, we could do nothing and wait for macros 2.0. However, it will be a while (possibly years) before that lands in stable Rust, and the current implementation and proposals are not very well-defined yet. In the meantime, it would be nice to have a fix for this paper cut. This proposal does not add a lot of complexity, but does nicely fill the gap.

Unresolved Questions
--------------------
None that I can think of...