Add Unicode XID related functionalities to proc_macro crate #540
Comments
If the intent is to match what Rust considers a valid identifier, the XID start/continue functions may be a bit too low-level. They’re not quite enough on their own to ensure Can you give some examples of how, exactly, proc macros would use this new API?
Maybe something like
On the other hand, if we forget about proc macros for a while, I wouldn’t mind the standard library providing the character classes. The most downloaded reverse dependencies of These crates don’t need to match the unicode version used by rustc, but if they use the standard library for other unicode character classes then they already need to recompile with a new Rust release to fully pick up new Unicode versions. It doesn’t need to be in
I did suggest adding these functions to
I won't propose anything here; instead I'll just show a few examples I collected from crates.io.
Case 1: rocket-codegen
The code here: https://github.com/rwf2/Rocket/blob/28891e8072136f4641a33fb8c3f2aafce9d88d5b/core/codegen/src/attribute/param/parse.rs#L242C4-L248
Case 2: derive-more
The code here
Case 3: bunt-macros
The code here
My original motivation was just to spare crates like these from bumping their xid crate dependencies each year after a new Unicode release, following the Rust toolchain's updates. Instead they could just use data provided by the standard library. Maybe the motivation wasn't strong enough to put these into
For implementing other languages, those languages often have their own pace of following Unicode releases, so the community-crate-based approach might actually serve them better. For example,
Thank you for the examples. I've looked a little bit into each and how they diverge from what
The most surprising thing I've learned now is that
I don't think most dependents of
Especially for the purpose of defining "what's an identifier", making it precise is not very important, because you don't get users filing issues like "I wanted to use U+10940 SIDETIC LETTER N01 in my variable names, when will you update to Unicode 17?", nor will anyone complain when new code points are eventually accepted as part of identifiers.
Proposal
Problem statement
When generating code in proc macros, one sometimes needs to generate identifiers from user-provided strings. Such strings must be checked first to determine whether they can actually be used as identifiers.
One can use community-provided crates to do so, but they can easily go out of sync with later rustc updates, and macro crates must bump those dependencies to new releases to keep up.
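To make the kind of check described above concrete, here is a minimal sketch of an identifier-shape test that takes the XID_Start and XID_Continue predicates as parameters. The function name `has_identifier_shape` and the ASCII-only stand-in predicates are assumptions made for illustration; they are not part of the proposal, and real code would plug in full Unicode tables from a crate or from whatever API this proposal ends up with. The sketch covers shape only: it does not reject keywords, handle raw identifiers, or apply NFC normalization.

```rust
/// Returns true if `s` has the shape of a Rust identifier, given predicates
/// for the XID_Start and XID_Continue character classes.
///
/// Shape only: keywords, raw identifiers (`r#foo`) and NFC normalization
/// are deliberately out of scope for this sketch.
fn has_identifier_shape(
    s: &str,
    is_xid_start: impl Fn(char) -> bool,
    is_xid_continue: impl Fn(char) -> bool,
) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        // An empty string is never an identifier.
        None => false,
        // A leading underscore must be followed by at least one more
        // character, because a lone `_` is not an identifier.
        Some('_') => {
            let rest = chars.as_str();
            !rest.is_empty() && rest.chars().all(|c| is_xid_continue(c))
        }
        // Otherwise: one XID_Start character, then only XID_Continue.
        Some(first) => is_xid_start(first) && chars.all(|c| is_xid_continue(c)),
    }
}

// ASCII-only stand-ins (assumption): these are NOT the real Unicode classes,
// they only keep the example free of external dependencies.
fn ascii_xid_start(c: char) -> bool {
    c.is_ascii_alphabetic()
}

fn ascii_xid_continue(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}
```

A proc macro could run a check like this before constructing an identifier from user input and report a proper error when it fails.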
Motivating examples or use cases
Solution sketch
I'd propose adding a new trait and implementing it on char within the proc_macro crate.
Alternatives
Add such methods on char as inherent methods. (This might not be worthwhile, because normal Rust programs rarely need XID-related checks.)
Leave things as is.
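For concreteness, the trait from the solution sketch might take roughly the following shape. This is only an illustration: the trait name `XidExt` and the method names are hypothetical, since the issue text does not spell them out, and the bodies are ASCII-only placeholders standing in for the Unicode tables a real implementation would consult.

```rust
/// Hypothetical extension trait. The name and method names are
/// illustrative and are not taken from the proposal.
trait XidExt {
    /// Whether the character may start an identifier (XID_Start).
    fn is_xid_start(self) -> bool;
    /// Whether the character may continue an identifier (XID_Continue).
    fn is_xid_continue(self) -> bool;
}

impl XidExt for char {
    // ASCII-only placeholder (assumption): a real implementation would
    // look the character up in Unicode XID_Start data.
    fn is_xid_start(self) -> bool {
        self.is_ascii_alphabetic()
    }

    // ASCII-only placeholder (assumption): `_` is XID_Continue but not
    // XID_Start, which the placeholder reflects.
    fn is_xid_continue(self) -> bool {
        self.is_ascii_alphanumeric() || self == '_'
    }
}
```

With the trait in scope, a macro author could write `c.is_xid_start()` on any `char`. The inherent-method alternative would offer the same calls without an import, at the cost of growing the API surface of char for all programs.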
Links and related work
What happens now?
This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.
Possible responses
The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):
Second, if there's a concrete solution: