RFC: syntax tree patterns #3875
Just skimmed through this RFC. That sounds like an interesting project.
s/clippy/Clippy/
Following the motivation above, the first goal of this RFC is to **simplify writing and reading lints**.

The second part of the motivation is clippy's dependence on unstable compiler-internal data structures. Clippy lints are currently written against the compiler's AST / HIR which means that even small changes in these data structures might break a lot of lints. The second goal of this RFC is to **make lints independant of the compiler's AST / HIR data structures**.
NIT: "independant" should be spelled "independent":
The second part of the motivation is clippy's dependence on unstable compiler-internal data structures. Clippy lints are currently written against the compiler's AST / HIR which means that even small changes in these data structures might break a lot of lints. The second goal of this RFC is to **make lints independent of the compiler's AST / HIR data structures**.
#### Applicability

Even though I'd expect that a lot of lints can be written using the proposed pattern syntax, it's unlikely that all lints can be expressed using patterns. I suspect that there will still be lints that need to be implemented by writing custom pattern matching code. This would lead to a mix within clippy's codebase where some lints are implemented using patterns and others aren't. This inconsistency might be considered a drawback.
Regarding this, I especially see potential to provide recurring patterns in the `utils` module, which can be used across lints.
Yes, what I'd eventually like to see is that all lints either use something from `utils` or something from the macro.
Eventually, we can even make these `utils` checks part of the macro, by allowing you to add additional checks like `where named implements Trait`.
(This opens up the opportunity to write lints as text files devoid of Rust code.)
It might also be nice to have an external crate that could be re-used by rustc and potentially other linters.
This pattern matches arrays that end with at least one literal. Now given the array `[x, 1, 2]`, should `1` be matched as part of the `_*` or the `Lit(_)+` part of the pattern? The difference is important because the named submatch `#literals` would contain 1 or 2 elements depending on how the pattern is matched. In regular expressions, this problem is solved by matching greedily by default and non-greedily on request.
I vote for matching regex style. If the pattern macro is sold as "like regex" it should be as near as possible to the regex behavior, so it can be picked up more easily.
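To make the two readings concrete, here is a minimal Rust sketch of the ambiguity for a pattern of the shape `_* Lit(_)+#literals`. The `Elem` type and the `match_trailing_literals` function are hypothetical and only model this one example; they are not part of the RFC or of Clippy.

```rust
/// Simplified stand-in for an array element; not Clippy's real AST.
#[derive(Debug, Clone, PartialEq)]
enum Elem {
    Lit(i64),
    Other(&'static str),
}

/// Matches `_* Lit(_)+#literals` against `elems` and returns the `#literals` submatch.
/// With `greedy = true` the `_*` part takes as many elements as it can,
/// otherwise it takes as few as it can.
fn match_trailing_literals(elems: &[Elem], greedy: bool) -> Option<&[Elem]> {
    // A split point `i` is valid if everything from `i` onwards is a literal.
    // Because `i < elems.len()`, the suffix is never empty, as `Lit(_)+` requires.
    let valid = |i: usize| elems[i..].iter().all(|e| matches!(e, Elem::Lit(_)));
    let split = if greedy {
        // Greedy `_*`: prefer the latest valid split point.
        (0..elems.len()).rev().find(|&i| valid(i))
    } else {
        // Non-greedy `_*`: prefer the earliest valid split point.
        (0..elems.len()).find(|&i| valid(i))
    };
    split.map(|i| &elems[i..])
}
```

For `[x, 1, 2]`, the greedy reading of `_*` binds `#literals` to `[2]`, while the non-greedy reading binds it to `[1, 2]`. An array that does not end in a literal is rejected in both modes.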
This is a very interesting idea! I think it could help a lot for newcomers, especially because the
I have a question, from the maintainability point of view expressed in the RFC: this would shift the maintenance burden from "all the lint code" to just "the code that translates from the Syntax Tree Patterns to the current AST, HIR and MIR APIs", right?
@felix91gr Yes, that's the idea. Implementation detail changes (e.g. renaming
There are two restrictions to this though:
Regarding (1), I don't really know how likely larger changes to the AST / HIR data structures are at this point. Are there any plans already? What are the most frequent changes to these data structures? The second point could probably be solved by abstracting common additional checks (e.g. getting the span of a node) into functions in
Does this answer your question?
@fkohlgrueber yes it does! :) Regarding the two limitations,
PS: take a look at https://github.com/rust-lang/rust-clippy/projects/2, it would seem to me that this RFC has a place in there somewhere.
This is really well written and designed, kudos! I have some minor side comments but they shouldn't affect the RFC in its current form.
Thanks for working on this, I've wanted something like this for ages!
Block(Seq<Stmt>),
}
Won't we also need to deal with impls, structs, etc.? A lot of Clippy lints deal with items, not expressions.
Of course as a first step this is still amazing :)
The difference compared to the currently proposed two-stage filtering is that using early filtering, the condition (`!in_macro(#then.span)` in this case) would be evaluated as soon as the `Block(_)#then` was matched.
We could allow `where` statements to be shoved in anywhere, allowing the user to specify when the filtering happens. Not necessary for now, but nice to have.
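The difference between the two filtering strategies can be sketched roughly as follows. This is an illustration under stated assumptions, not the RFC's implementation: `Block`, `IfExpr`, `two_stage` and `early` are hypothetical names, the `from_macro` flag stands in for what `in_macro(#then.span)` would report, and the second sub-pattern (an else branch must be present) is invented for the example. Each function also returns how many sub-patterns it tried, to show where the work is saved.

```rust
/// Toy nodes; `from_macro` stands in for the result of `in_macro(span)`.
struct Block {
    from_macro: bool,
}

struct IfExpr {
    then: Block,
    else_branch: Option<Block>,
}

/// Two-stage filtering: match the whole pattern first, then apply the condition.
/// Returns the `#then` submatch (if any) and the number of sub-patterns tried.
fn two_stage(e: &IfExpr) -> (Option<&Block>, u32) {
    let then = &e.then; // sub-pattern 1: `Block(_)#then`
    let has_else = e.else_branch.is_some(); // sub-pattern 2: an else branch exists
    let keep = has_else && !then.from_macro; // condition is checked only at the end
    (keep.then_some(then), 2)
}

/// Early filtering: check the condition as soon as `#then` is bound.
fn early(e: &IfExpr) -> (Option<&Block>, u32) {
    let then = &e.then; // sub-pattern 1: `Block(_)#then`
    if then.from_macro {
        return (None, 1); // rejected before sub-pattern 2 is tried
    }
    (e.else_branch.as_ref().map(|_| then), 2)
}
```

Both strategies accept and reject the same inputs; they differ only in when the condition runs, which matters when the remaining sub-patterns are expensive to match.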
#### Match descendant

A lot of lints currently implement custom visitors that check whether any subtree (which might not be a direct descendant) of the current node matches some properties. This cannot be expressed with the proposed pattern syntax. Extending the pattern syntax to allow patterns like "a function that contains at least two return statements" could be a practical addition.
This can also be expressed as a mix of a walking function in utils and patterns with the current proposal.
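A rough sketch of that combination, assuming a toy tree: a generic walking helper visits every descendant and a small per-node pattern decides what counts. The `Node` enum, `count_descendants` and `has_multiple_returns` are hypothetical and do not correspond to existing Clippy utilities or to the RFC's pattern macro.

```rust
/// Toy syntax tree; the node kinds are illustrative, not Clippy's AST or HIR.
enum Node {
    Return,
    Lit(i64),
    Block(Vec<Node>),
    If(Box<Node>, Box<Node>, Option<Box<Node>>),
}

/// Hypothetical utils-style walker: counts every node in the subtree
/// (including `root` itself) for which `pattern` returns true.
fn count_descendants(root: &Node, pattern: &dyn Fn(&Node) -> bool) -> usize {
    let here = usize::from(pattern(root));
    let below: usize = match root {
        Node::Block(stmts) => stmts.iter().map(|s| count_descendants(s, pattern)).sum(),
        Node::If(cond, then, els) => {
            count_descendants(cond, pattern)
                + count_descendants(then, pattern)
                + els.as_deref().map_or(0, |e| count_descendants(e, pattern))
        }
        // Leaf nodes have no children to visit.
        Node::Return | Node::Lit(_) => 0,
    };
    here + below
}

/// "A function body that contains at least two return statements."
fn has_multiple_returns(body: &Node) -> bool {
    count_descendants(body, &|n: &Node| matches!(n, Node::Return)) >= 2
}
```

Here the closure plays the role of the pattern and the walker supplies the "any descendant" part that the pattern syntax itself cannot express.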
Lit(Alt<Lit>),
Array(Seq<Expr>),
Block_(Alt<BlockType>),
If(Alt<Expr>, Alt<BlockType>, Opt<Expr>),
For consistency, perhaps users of `opt` can be made to use named parameters? I.e. `If(_, _)` vs `If(_, _, then=_)` vs `If(_, _, then=_?)`. It might be cleaner that way. I'm not sure.
Friendly ping @fkohlgrueber. Any updates on this project?
Hey Philipp, thanks for reminding me. I wanted to post updates here some time ago but unfortunately forgot to do so. I've continued to work on the project over the last two months and implemented some additional features (e.g. functional composition). I'm currently focusing on writing my thesis report and after that is finished, I'll have time to work on the integration into Clippy again.
Hello everyone! It's been some time, but I can now share my finished thesis report with you: https://fkohlgrueber.github.io/thesis.pdf
Over the next six months, I'll continue to work on this topic approximately one day per week. My goal is to integrate my thesis work into Clippy by February 2020. I'm currently thinking about how to do the integration and would like to hear your thoughts on this.
Since it's been quite some time since I did most of the coding, I'll first update my projects to work with the latest clippy / rustc nightly. After that, I'm not sure how to proceed further. One possible way to go would be to integrate the pattern-based
What do you think?
Awesome! I think the first step would be to integrate an MVP of the pattern code in Clippy, which can be used optionally. Then documentation with "How do I switch to the pattern based code?"-instructions would be great. This documentation could be created while making the first lint (
Once it works with the first lint, it would be great to create one or more tracking issues, where lints are grouped by:
Making such a grouping of lints could be time consuming, but I think that's the best way to get contributors to help you with the transition. (And "There are 311 lints included in this crate!", so I think you'll need help on this project 😉). Once that is done we can swap to the patterns one lint at a time.
@flip1995's plan sounds good to me! Probably should open a tracking issue for this.
I just found the time to read the RFC. Some thoughts:
I should have been more explicit that the RFC linked above describes a preliminary version of the concept that has been extended afterwards. The thesis documents the current state and is a better description of the concept. The main differences are:
@llogiq I didn't know about PMD, thanks for the hint. Could you elaborate on what you mean by type patterns? My understanding currently is that type information is only useful after name resolution has been performed, which would mean using HIR, right? What's the difference between
That makes sense. Perhaps it's a good idea to update the PR description and add a link to the thesis so that others can instantly find it instead of having to read the comments?
@JeanMertz That's true. I updated the PR's description.
rust analyzer will probably integrate with this at some point: rust-lang/rust-analyzer#8696 (comment) and rust-lang/rust-analyzer#8696 (comment). Currently there exists
@rust-lang/clippy So it occurred to me that this PR is still open, and even though we're not working on this, do y'all think we should still merge this? Overall it's a very neat design and will likely work into the lang team's plans of doing custom lints at some point. I'd definitely like to have this doc in a spot that's a bit more permanent than just a PR 😄
I plan to finish the work on the first version of the Clippy book on Saturday. So I would wait until the book is merged and then include this into the book. But sure, I'm fine with merging this 👍
Sounds good. I figure this doesn't necessarily go in the book yet because it's a proposed design, but if we can incorporate it that would still be great!
Omg omg omg omg this is such a cool proposal, and it's been a while so I feared it might never be merged. I'm glad y'all have regained time to work on it / think about it. I'm looking forward to it! I hope things are okay for you both, @Manishearth and @flip1995. Times have been hard ❤️🩹
I guess I'll turn the "Roadmap" section that currently exists in the draft of the book into a "Proposal" section. This proposal section will then only include the 2021 Roadmap (that we aren't close to completing in 2022 lol) and this for now. But we leave things open for future bigger proposals for Clippy / linting in general.
@felix91gr Finding time to work on this is out of scope for me for the foreseeable future. But I like the proposal in general, so I'm happy to do the smallest possible amount of work for moving this forward with merging it so it becomes more official and maybe someone else feels encouraged to work on it. Hope you're also doing well!
Ah, makes sense!
I totally dropped the ball on this but I'm at least going to hit merge, and @flip1995 can look at moving it into the book?
Move syntax tree patterns RFC to the book r? `@Manishearth` Follow up to #3875 changelog: none
This PR adds an RFC for implementing lints using syntax tree patterns.
Rendered RFC (outdated)
Update: A more thorough description of the current concept is given in my thesis.
cc @oli-obk