-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Just use trees for HIR? #8713
Comments
Note: got this idea after reading "early on, we decided that syntax trees would be the basic API of Roslyn" in some blog. My first reaction was "of course we can't do that, because macros", but then I realized that a macro can have both expansion and argument as children... |
So, what would that concretely mean compared to the current API? I think 1. some types that currently are just data would gain an identity / location in the tree (e.g. so you can go back "up" from a I think this does sound like a good idea, and would probably give us some more guidance as to how to design the HIR APIs... |
Would there be a |
Yeah, I think that's roughly what I have in mind. I'd add the following as well:
|
Probably makes sense to take a second look at https://docs.microsoft.com/en-us/dotnet/api/microsoft.codeanalysis.semanticmodel?view=roslyn-dotnet-3.9.0, it feels like it has a lot of goodness
|
(Perpetual) Question: what should be the underlying storage strategy for such a hir? I can see three different strategies: // I. Copy type with reference to context
struct Store { ... }
#[derive(Clone, Copy)]
pub struct IfExpression<'a> {
id: u32,
store: &'a Store
}
impl IfExpression<'a> {
pub fn cond(self) -> Expression<'a> { ... }
}
// II. Copy type with explicitly passed-in context
pub struct Store { ... }
#[derive(Clone, Copy)]
pub struct IfExpression {
id: u32,
}
impl IfExpression {
pub fn cond(self, store: &Store) -> Expression { ... }
}
// III. !Copy type with shared reference to the context
#[derive(Clone)]
pub struct IfExpression {
id: u32,
store: Rc<Store>
}
impl IfExpression {
pub fn cond(self) -> Expression { ... }
} |
I feel like option 2 could get annoying when you need to pass the store as well as the database to some function calls. |
I think I would go with option 2 for now, like with the current HIR. I think the store can and maybe has to contain a reference to the DB as well? |
Yeah, the store would have to have a It also seems that the |
Observation: III is basically how our syntax tree is setup actually. |
Yeah. On the other hand, you don't usually need the DB to manipulate syntax trees, but you will need it to do things with the HIR. And I don't think we can put a DB reference into an |
You don't need DB literary, but a reference to a single SyntaxNode does give access to the whole. The role of DB is played by the root node, which owns the underlying data (green node). I don't think we can put a reference to db, but, presumably, we can store But it's a good point that, fundamentally, we have a lifetime here -- everything happens in a context of a running query. Even if we do fully owned API, that would be a footgun. If you store a hir node in some long-living datastructure, you'll notice that you can't actually apply modifications to the database, because the stored node will keep a database reference alive. This I think rules out the third variant. Which is good -- it's the most tantalizing one! |
Yeah. Of the remaining ones, option 1 is more inconvenient than 2, but it makes the connection of the nodes to the store clearer, and maybe that's a good thing 🤔 |
The extra reference in option 1 is quite significant wrt. the memory usage, right? |
Not sure, since we won't keep many of these around usually, I think? |
Yeah, due to the lifetime, neither setup will be stored in long-term storage. Although optimizing transient memory usage and cache efficiency will be nice! Couple of more considerations:
|
You could have a |
I am warming up to the idea more and more. At the same time, I feel a bit scared -- it looks like this might be the last time we have a chance to re-design the whole API. I feel a bit of waterfall can be helpful here! So, I suggest using this document to collaboratively iterate on the design: https://hackmd.io/@matklad/rJzQhvk2u/edit It's very bare bones, I plan to fill it in in the coming weeks, but I want to lay the design framework here upfront! |
Hello, I'm currently working on marker (formally rust-linting) where I try to create a stable driver independent linting interface for Rust. The IR will be based on the Rust Reference, to ensure that it's (as much as possible) driver independent. From reading this discussion and the linked document, it looks the goal of the these project mostly align. The project will mostly focus on usability and less on performance, which was mentioned in the linked document. However, based on the input, I'll try to make a few more things lazy evaluated and leave the cashing/node storage up to the driver. Right now, the project is in a very early stage and the current test representation has to be rewritten for various reasons. So, there is probably very little/nothing that can be done right now to help. I mainly wanted to get this idea out there. 🙃 Though, @HKalbasi has expressed interest to test rust-analyzer as a driver. |
EDIT: this now has a design doc: https://hackmd.io/@matklad/rJzQhvk2u/edit
One of our early design decisions was to keep all semantic info out of the syntax trees. I think that worked well for the compiler side of rust-analyzer -- we managed to do a bunch of relatively clear IRs, which are relatively incremental.
However, I am not so convinced that this split works quite as well for the hir API. When implementing, eg, assists, working with syntax is easy. Patterns like
node.ancestors().find_map(ast::Expr::cast)
are emerging nice idioms. Going from syntax to semantics and back creates a lot of friction, both because there are two layers to start with, and because the bridges between them are unclear.So I am thinking, what if hir exported API which looked exactly as our syntax-tree based API, but with additional semantic info? So that, you can ask
.type
for an expression,.resolve
for path, etc.When you ask for
.parent
of a module/file, you transparently get the module in another file.For macro calls, there are two children:
.arg()
and.expansion()
, and.expansion()
goes into another file.Similarly,
.parent
transparently goes via expansion stack.The text was updated successfully, but these errors were encountered: