-
Notifications
You must be signed in to change notification settings - Fork 11.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Move devex] Move source code linters to explain language subset #20
Comments
@sblackshear - this looks like a useful thing to have for GDC and Move seems to have some support for (custom?) linting already? I don't want to spend too much time looking into it if you think it's too much to chew for me at this point, which is very likely (I doubt I could write a Move's version of clippy from scratch including Move-side support for custom linting by the GDC deadline :-) ). Still, perhaps you can share some opinion and/or give some advice here? |
Here are some ideas on how to get this to work, after having some conversations on the topic with @tnowacki. The main goal is to (mostly) provide support for writing linters to blockchain (and Move infrastructure) developers, rather than to the end users (programmers targeting a given blockchain and its Move flavor). This is mostly pragmatic, as the latter would be preferred, but considering resources and time required to create a linting framework supporting some kind of declarative linter specification is a bit beyond our reach right now (and probably unnecessary). The idea the is to provide a rather simple set of primitives over the existing compiler that would support writing linting checks. We could provide a fairly "light" linting CLI that would offer the following:
Most of this code would live in the Move repo, including linters for core Move. A linter "client" (e.g. Sui) would use the driver implementation, optionally some/all core Move linters, and build a custom CLI tool with additional "abstracted" linters that would specify customized checks for a given client. |
I think most of the linting can be done over the typed AST, but I just remembered that there is this ObjectID escape analysis that is super general. It will need the abstract interpreter. Luckily exposing just that pass should be really really easy, as the abstract interpreter is already setup with a very general trait system |
I think an expression visitor could be a helpful utility, and would be helpful for writing the check that bans global storage operations. Writing that visitor is non-trivial due to the recursive structure. Everything else for the typed AST should be easy enough |
This one will be the most annoying for the compiler in its current architecture. There is this trait though for the diagnostic codes I set up, maybe we could make one just for sui things |
We should talk about this in more detail as I am not sure if ObjectID escape analysis comes before or after the typed AST is produced, nor why we necessarily have to care about it... |
I thought it would be pretty non-controversial in that we could re-use the diagnostics facility to easily generate error messages. Something like instantiating the Diagnostics struct with the location, error message etc. and re-using the existing facility to format and display it. If it's problematic, we could adopt a different (simpler?) way of reporting linting errors. |
I think we should consider using move-model for linters. It already exists, has hooks for reporting source diagnostics, and has an IR that has been used to write several linter-like tools (e.g., the escape analysis for the robust safety paper, some lints on prover specs) and is pretty nice to work with. Importantly, it supports interprocedural analysis, which I think will be important for a number of linters, but will be tougher to do inside the compiler because it heavily leverages modularity (as it should). |
The move model might be okay for some passes but really lacks for anything detailed. The biggest thing here is it does not have any source level expressions for function definitions, just compiled bytecode.
Interprocedural analysis something we have actively avoided for the source language. These sorts of passes are slow and brittle, and tend to lead to rather confusing error messages. They are really good for static analysis type checks, since if it is confusing or inaccurate, they can be ignored (which might certainly be true for certain "linter" style pases). But if you want to use this for anything that is more a "type system" style pass, you really want the compiler and all of its tooling. |
@awelc, sorry I missed your comments
The current rules in the Sui verifier I believe are control flow sensitive. So you will need to use the
Now that I think about it, it might not be that bad? We just have the Sui linter have a diagnostic code that implements |
I believe Move models produced from source have very detailed source context information. I have used this feature in diagnostics, anyway. I'm sure there are cases where it falls short of what you can do in the compiler, but my expectation is that the error messages we can produce will still be very high quality. |
Definitely agree with avoiding this in the core analyses performed by the compiler. But for linters (as we're discussing in this task), I think we want to give the linter writer as much power/flexibility as we can. As you say, linters can be ignored, so there's no problem if someone writes a slow or flaky one. |
This isn't for "general" linters though. This is things like "do my I would consider them more "Move type system extensions" rather than the sort of checks you would find in a tool like clippy |
And I would hope so as they are strictly a subset of the information in the AST ;) |
I guess my question is: what would go wrong if we implemented our Sui linters as |
I think we need both. The idea was to have a separate linting tool that could have different "severity" levels. The "Move type system extensions" linting should probably be directly integrated with the system so that these checks cannot be bypassed. |
I am certainly not opposed to having both. I think this is just a sequencing/prioritization question. |
Agreed. They could also reuse at least part of the common infrastructure, so it would be nice (though not strictly necessary) to agree on the infrastructure that could work for both linter types. |
I would rather never extend the move-model unless you want to work over the bytecode or want to do an inter-procedural analysis, so if you want one mechanism, use the ASTs.
Already doable in the source languages diagnostics, you can even have non-skippable errors that don't stop compilation |
This does not answer my question of what would go wrong--I would like to understand why you are so opposed to extending it :). |
(to be clear by extending I mean building a thing on top of it) I oppose it because I just could not imagine a linter framework that does not want the source expressions. Thats almost literally all of the linter cases (at least for clippy), "You had XYZ expression. ABC expression is cleaner". The move-model does not contain information about source language expressions outside of whats in the source map (mapping from bytecode instruction to source file location) Similarly for that reason, it is literally impossible to use the move-model for IDE features like hover types or type-based autocomplete (which will be a thing when we eventually have receiver style syntax) For the Sui type system cases:
The only one I think would produce equally readable errors are the "key" rules and maybe global storage banning. For entry points, the type checker and the typed AST has a lot of built up infra for good errors when comparing types or for checking ability constraints (as you might imagine lol). For the ID checks, you will be able to do these checks at the bytecode level, but you will lose a lot of structured information from the source language. As a further side note, Brian and I talked about trying to fix the move-model a bit before he left. Obviously we never started on it in much detail, but part of the problem is that the move-model is a lot of code for not doing much
So much of the complexity in the move-model was maybe needed when it was written, but is outdated now. In short, something like the move-model probably should exist, but hopefully something that doesn't have its own custom interning or location data that is at best out of step with, or at worst duplicates, the tooling in the package system, compiler, IR, unit tester, etc. |
Got it + makes sense. The goal of this task is to do better than raising an opaque verifier error for the Sui-specific passes, and it sounds like we won't be able to do much better unless we work inside the compiler. I would still be hesitant to go too far down the path of providing a generic linter framework inside the compiler when it will be limited w.r.t the kinds of analyses it can do compared to |
Adding to this, we should also consider linting against useless transfer wrappers. But this might be a bit tricky to do |
Tests can now add compute_unit_limit as a config. Followups: Remove DEFAULT_EXECUTION_BOUND_SOLANA_STDLIB_TEST added in MystenLabs#18 Parse GAS_BUDGET_HEAP_SIZE Parse GAS_BUDGET_MAX_CALL_DEPTH Fixes: MystenLabs#20
Tests can now add compute_unit_limit as a config. Followups - Remove `DEFAULT_EXECUTION_BOUND_SOLANA_STDLIB_TEST` added in MystenLabs#18 - Parse GAS_BUDGET_HEAP_SIZE - Parse GAS_BUDGET_MAX_CALL_DEPTH Fixes: MystenLabs#20
…own gas budget parameters (MystenLabs#26) Follow up from MystenLabs#24 Fixes: MystenLabs#20
FastX needs a number of custom bytecode verifier passes (see #18, #19, #16, #15) to keep users in the no-shared-memory subset of Move. This is sufficient for on-chain safety, but for an end-user trying to write code in the Move source language, finding out that you've strayed from the subset via a bytecode verifier error at module publish time is not a great experience.
We should address this is by adding custom linters that enforce the subset rules at the source level. See this for some examples in the existing compiler. The code for this linters should live in the fastNFT repo. We will probably need to extend the upstream compiler to make it easy to plug in custom linters.
The text was updated successfully, but these errors were encountered: