Give tip for unicode chars which might not be visible when rendered #100439
(rust-highfive has picked a reviewer for you, use r? to override)
@@ -322,6 +322,8 @@ impl<'a> StringReader<'a> {
let token = unicode_chars::check_for_substitution(self, start, c, &mut err);
if c == '\x00' {
    err.help("source files must contain UTF-8 encoded text, unexpected null bytes might occur when a different encoding is used");
} else if let Ok(v) = &err.suggestions && v.len() == 0 && !c.is_ascii() {
I think this needs a different heuristic for "is not visible when rendered", since this triggers on every non-ASCII char.
Also, why are you checking err.suggestions here?
Yes, I'm still thinking about a better method for checking Unicode characters.
I'm checking err.suggestions here because we already have a strategy for suggesting possible substitutions:
rust/compiler/rustc_parse/src/lexer/mod.rs
Line 322 in d76952f
let token = unicode_chars::check_for_substitution(self, start, c, &mut err);
My idea is that if we already have such a suggestion, we don't add another, since the existing one is clear enough:
--> /home/cat/code/rust/src/test/ui/parser/unicode-quote-chars.rs:2:14
|
2 | println!(“hello world”);
| ^
|
= help: Unicode character \u{323} might not be visible when rendered
help: Unicode characters '“' (Left Double Quotation Mark) and '”' (Right Double Quotation Mark) look like '"' (Quotation Mark), but are not
I think the "= help: Unicode character \u{323} might not be visible when rendered" line would be redundant here.
I'm checking the err.suggestions here, because we already have a strategy to give suggestions for those possible substitution,
I think if you change your PR to only note characters that have a rendered width of zero (e.g. combining characters), then this check becomes redundant.
I haven't found a better way to identify "not visible when rendered" chars.
We have a test case in
rust/src/test/ui/issues/issue-29227.rs
Line 125 in b998821
fn is_es_whitespace(self) -> bool {
The unicode_space_separator, unicode_space_separator and combining_spacing_mark categories seem to contain the chars that are hard to spot, but those are not totally "not visible" either, and I think introducing such big tables into the compiler code base may be too heavy.
@rustbot author
You'll need to use the unicode table generator to generate a table of the characters with General Category=Nonspacing Mark and/or Bidi_Class=Nonspacing_Mark to warn on. Adding the table to rustc is probably fine; adding it to std is undesirable. The UI test is unrelated.
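To illustrate the shape of the check being suggested, here is a minimal sketch of a range-table lookup. The table name, the function name, and the three ranges are assumptions chosen for illustration: they are a small hand-picked excerpt of Nonspacing Mark code points, not the output of the unicode table generator and not the code that ended up in rustc.

```rust
use std::cmp::Ordering;

/// Hypothetical, hand-picked excerpt of inclusive code point ranges whose
/// characters have General Category = Nonspacing Mark (Mn).
/// A real table would be generated from the Unicode data files.
/// The ranges must stay sorted and non-overlapping for the binary search.
const NONSPACING_MARK_SAMPLE: &[(u32, u32)] = &[
    (0x0300, 0x036F), // Combining Diacritical Marks
    (0xFE00, 0xFE0F), // Variation Selectors
    (0xFE20, 0xFE2F), // Combining Half Marks
];

/// Returns true if `c` falls inside one of the sample ranges above.
fn is_nonspacing_mark_sample(c: char) -> bool {
    let cp = c as u32;
    NONSPACING_MARK_SAMPLE
        .binary_search_by(|&(lo, hi)| {
            if hi < cp {
                // The whole range lies below the code point.
                Ordering::Less
            } else if lo > cp {
                // The whole range lies above the code point.
                Ordering::Greater
            } else {
                // lo <= cp <= hi: the code point is inside this range.
                Ordering::Equal
            }
        })
        .is_ok()
}
```

With a check of this kind, the lexer could attach the "might not be visible when rendered" help only for characters such as U+0323, rather than for every non-ASCII character, which would also make the err.suggestions test unnecessary.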
☔ The latest upstream changes (presumably #102302) made this pull request unmergeable. Please resolve the merge conflicts.
ping from triage: Can you please address the merge conflicts - and post your status on this PR? FYI: when a PR is ready for review, send a message containing
Fixes #100388