Incremental verify checkpoints #4487
Thanks for this change, sorry we kept going back and forth so much in the design phase :(. I did a quick pass, but I think there are a couple of issues with the interface that need to be fixed, then I'll do another pass once things are working a bit better. In particular,
stellar-core --conf test.cfg verify-checkpoints --trusted-hash-file does-not-exist
crashes after syncing with the network, but it looks like this should work based on the help comment for --trusted-hash-file. Either the comment should be changed and this error check should happen on startup if this is intended behavior, or it should be addressed.
I'm also not quite sure what the intended interface for this is. It looks like in the doc, we have
stellar-core verify-checkpoints --conf=core.cfg --trusted-hash-file=path/to/verified.json
which takes in a previous file called path/to/verified.json and, at the end of the call, updates path/to/verified.json such that it contains hashes up to LCL. However, it looks like the interface has changed in this PR, where we take in
stellar-core verify-checkpoints --trusted-hash-file=path/to/verified.json --output-file=path/to/verified2.json
where the output file is a new file which contains the hashes from path/to/verified.json. The issue is, this doesn't actually work as an append operation, as the --output-file must not be the same as the --trusted-hash-file. To demonstrate this, I ran the following commands on testnet:
stellar-core --conf testnet.cfg verify-checkpoints --output-file out --from-ledger 249443
This command succeeded. After a few checkpoints passed, I then attempted to append to the file to catch up to lcl with
stellar-core --conf testnet.cfg verify-checkpoints --output-file out --trusted-hash-file out
which crashed. I doubt that Horizon operators will want to manage a collection of files, so we probably do want a true append operation.
While I found a couple of issues, I think it would be helpful to:
- Add validity checking on startup. If we crash due to a file not existing that's fine, but this should happen immediately on startup and not after waiting for the network's next checkpoint ledger.
- Take a step back and solidify what the interface should be. I know we've had some irl conversations back and forth and the expectations have been changing a lot throughout, but currently the design doc, the commands.md doc, and the command line "help" output all define different, mutually exclusive interfaces. I think this is making review and implementation a bit tricky.
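To make the "fail immediately on startup" request concrete, here is a minimal Python sketch of argument validation that runs before any network sync. It is not stellar-core's implementation (which is C++). The function name `validate_verify_args` is hypothetical, and treating --output-file as required is an assumption drawn from this thread.

```python
import os
from typing import Optional


def validate_verify_args(trusted_hash_file: Optional[str],
                         from_ledger: Optional[int],
                         output_file: Optional[str]) -> None:
    """Reject bad argument combinations before any network work starts.

    Hypothetical helper for illustration only; the checks mirror the
    points raised in this review, not the actual stellar-core code.
    """
    # Assumption: an output file must always be given.
    if not output_file:
        raise ValueError("--output-file is required")
    # The PR description states that combining these two flags is an error.
    if trusted_hash_file is not None and from_ledger is not None:
        raise ValueError(
            "--from-ledger and --trusted-hash-file are mutually exclusive")
    # A missing trusted hash file should be reported right away,
    # not after waiting for the next checkpoint ledger.
    if trusted_hash_file is not None and not os.path.isfile(trusted_hash_file):
        raise ValueError(
            f"trusted hash file does not exist: {trusted_hash_file}")
```

Calling such a check first means a typo in a path costs the operator a fraction of a second instead of a wait for the next checkpoint.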
Do you know what error was printed when you ran this? For me I get
I agree that the error reporting should happen earlier. I thought that calling
Correct, the design was updated not to append to the
I've spotted a typo in commands.md (
Ya the error I was referring to was that one, with no output-file.
Sounds like a good idea!
That definitely cleans up most of it, but I think there's still an issue in the command help message for "--trusted-hash-file":
I don't think a non-existent file should be valid, and we should probably just crash immediately on startup in this case.
Overall working much better! A few small issues regarding graceful failure and making sure we don't corrupt output files.
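One common way to avoid leaving a half-written output file behind on failure is to write to a temporary file in the same directory and rename it into place only once the write has succeeded. The sketch below illustrates that general pattern in Python. It is not taken from this PR; the function name `write_hashes_atomically` and the JSON list-of-pairs layout are assumptions made for illustration.

```python
import json
import os
import tempfile


def write_hashes_atomically(path, entries):
    """Write `entries` as JSON to `path` without exposing partial output.

    Hypothetical helper: `entries` is assumed to be a list of
    [ledger_seq, hash] pairs.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(entries, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace overwrites the destination in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, drop the temp file and leave `path` untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern an existing output file is either fully replaced or left exactly as it was.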
Looks good! Just a few final cleanups and one edge case question.
LGTM!
Resolves #4454
Description
Adds --trusted-hash-file argument to the verify-checkpoints command to support appending new verified checkpoints starting from the last checkpoint in the trusted hash file.
Adds --from-ledger to support generating a verified checkpoint hash file starting from a specific ledger to LCL/specified end ledger.
Design doc: https://docs.google.com/document/d/1GRzHAO4_YrfanXqoVc1UDIMhUV10PFqIMQyOxlPOW_s/edit
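As a rough illustration of "starting from the last checkpoint in the trusted hash file", the Python sketch below picks the highest ledger recorded in such a file. The file layout (a JSON array of [ledger_seq, hash] pairs) and the function name `latest_trusted_checkpoint` are assumptions for illustration; the actual format and lookup are defined by the stellar-core code in this PR.

```python
import json


def latest_trusted_checkpoint(path):
    """Return (ledger_seq, hash) for the highest ledger in the file.

    Assumes the file holds a JSON array of [ledger_seq, hash] pairs;
    this layout is an assumption, not confirmed by the PR text.
    """
    with open(path) as f:
        entries = json.load(f)
    if not entries:
        raise ValueError("trusted hash file contains no checkpoints")
    # Do not rely on the file's ordering; take the maximum ledger explicitly.
    ledger_seq, ledger_hash = max(entries, key=lambda entry: entry[0])
    return ledger_seq, ledger_hash
```

Verification would then only need to cover checkpoints after the returned ledger, up to LCL.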
Usage example:
--from-ledger:
% src/stellar-core verify-checkpoints --from-ledger=53736369 --output-file=out.json --conf=../stellar-core.cfg
Result:
Append to existing file:
% src/stellar-core verify-checkpoints --trusted-hash-file=out.json --output-file=out2.json --conf=../stellar-core.cfg
Result:
Usage of both --from-ledger and --trusted-hash-file -> ERROR
Performance
Time for verification of checkpoints from --from-ledger=53737040 to LCL=53739327:
Output: hashes for checkpoints 53737023 to 53739327, a total of 2304 ledgers = 2287 ledgers (from --from-ledger=53737040 to LCL=53739327) + 17 ledgers (from checkpoint 53737023 to --from-ledger=53737040).
205 seconds / 2304 ledgers = 0.09 seconds (90 milliseconds) per ledger
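The ledger counts above can be re-derived with a few lines of Python. This sketch assumes a checkpoint frequency of 64 ledgers, with checkpoint ledgers of the form 64*k - 1, which matches the two checkpoint numbers quoted here. The helper name `checkpoint_at_or_before` is hypothetical.

```python
CHECKPOINT_FREQUENCY = 64  # assumed: one checkpoint every 64 ledgers


def checkpoint_at_or_before(ledger):
    """Largest ledger of the form 64*k - 1 that is <= ledger.

    Returns -1 for ledgers below the first checkpoint (63).
    """
    return ((ledger + 1) // CHECKPOINT_FREQUENCY) * CHECKPOINT_FREQUENCY - 1


from_ledger = 53737040
lcl = 53739327
elapsed_seconds = 205

start = checkpoint_at_or_before(from_ledger)   # 53737023
total_ledgers = lcl - start                    # 2304
requested_ledgers = lcl - from_ledger          # 2287
rounding_ledgers = from_ledger - start         # 17
seconds_per_ledger = elapsed_seconds / total_ledgers

print(start, total_ledgers, requested_ledgers, rounding_ledgers)
print(round(seconds_per_ledger, 2))            # 0.09
```

Rounding --from-ledger down to the enclosing checkpoint boundary accounts for the difference between the 2287 requested ledgers and the 2304 ledgers actually covered.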
Caveat: There is an overhead as the LCL is obtained from the network. On average we will wait 1/2 a checkpoint (32 ledgers) to find a checkpoint boundary LCL (32 ledgers * 5 seconds = 160 seconds).
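The half-checkpoint estimate can be checked by averaging the distance to the next checkpoint boundary over every position within a checkpoint. This is a back-of-the-envelope Python sketch assuming 64 ledgers per checkpoint and roughly 5 seconds per ledger; the function name is hypothetical.

```python
CHECKPOINT_FREQUENCY = 64   # assumed ledgers per checkpoint
LEDGER_CLOSE_SECONDS = 5    # assumed average ledger close time


def ledgers_until_next_checkpoint(ledger):
    """Ledgers to wait until the next ledger of the form 64*k - 1.

    Returns 0 when `ledger` is itself a checkpoint ledger.
    """
    return (-(ledger + 1)) % CHECKPOINT_FREQUENCY


# Average the wait over one full checkpoint's worth of starting points.
waits = [ledgers_until_next_checkpoint(n) for n in range(CHECKPOINT_FREQUENCY)]
mean_wait_ledgers = sum(waits) / len(waits)                   # 31.5
mean_wait_seconds = mean_wait_ledgers * LEDGER_CLOSE_SECONDS  # 157.5

print(mean_wait_ledgers, mean_wait_seconds)
```

The exact average is 31.5 ledgers, or about 157.5 seconds, which agrees with the rounded figure of 32 ledgers and 160 seconds.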
Checklist
- clang-format v8.0.0 (via make format or the Visual Studio extension)