-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure the unpacked chunk is passed from the H2 CM peer to the H1 CM peer #628
Conversation
JamesPiechota
commented
Oct 10, 2024
•
edited
Loading
edited
- refactor the code for reading the PoA from disk (and then going to peers if it can't be found)
- after finding a solution, ensure that the H2 CM peer reads the PoA - including the unpacked_chunk - before passing it off to the H1 CM peer
197864c
to
f040e93
Compare
wait_for_cross_node(_Miners, _ValidatorNode, _CurrentHeight, _ExpectedPartitions, 0) -> | ||
?assert(false, "Timed out waiting for a cross-node solution"); | ||
wait_for_cross_node(_Miners, _ValidatorNode, _CurrentHeight, _ExpectedPartitions, 0) -> | ||
?assert(false, "Cross-node solutions can only have 2 partitions."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A check for ExpectedPartitions' length missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right. Thanks! fixed
{modules_covering_recall_byte, ModuleIDs}]), | ||
ar:console("WARNING: we have mined a block but did not find the PoA1 proofs " | ||
"locally - searching the peers...~n"), | ||
case fetch_poa_from_peers(RecallByte1, PackingDifficulty) of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we remove fetch_poa_from_peers from here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it all happens in prepare_poa now up here: https://github.com/ArweaveTeam/arweave/pull/628/files#diff-ea8e957c63acb7497b0ed4f9201b40f18e5d5b4fa371e67b4cff05850e4df854R646
We had 3 spots where we potentially needed to read or fetch poa when preparing a solution poa1, preparing a solution poa2, and before sending an H2 candidate to a CM peer. Previously the 3rd case was missing the read and fetch. So the intent was to pull the logic together into a single function (prepare_poa) and then call that function from each of the 3 spots that need to read/fetch the poa.
that was the intent at least...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, I see, thank you
prepare_solution(poa1, Candidate, Solution#mining_solution{ poa2 = PoA2 }); | ||
{error, Error} -> | ||
Chunk2Binary = case Chunk2 of | ||
not_set -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not a realistic case that we do not have the second chunk and leave it to the exit peer to fetch it, does it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah you're right, that case statement was preserving the existing logic but fixing a crash. the previous prepare_solution(poa2
call didn't check whether Chunk2 was set or not and just sent it directly to the exit peer here: https://github.com/ArweaveTeam/arweave/pull/628/files#diff-ea8e957c63acb7497b0ed4f9201b40f18e5d5b4fa371e67b4cff05850e4df854L773
Unfortunately when Chunk2 was not_set
it caused a crash during ar_serialize
. So the above change just fixes that ar_serialize
crash, but doesn't change the overall logic of deferring to the exit peer (which I agree, doesn't buy us much if it's the Chunk data that's missing rather than proof data)
That said I didn't fully investigate why chunk2
was not_set
in the first place... so unclear to me how we get to this code path anyways. The case
statement "fix" was just in response to the stack traces shared by miners which were showing a crash in ar_serialize
due to the poa.chunk value being not_set
rather than a binary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, interesting
{error, Error} -> | ||
case Chunk1 of | ||
not_set -> | ||
Packing = ar_block:get_packing(PackingDifficulty, MiningAddress), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR did not introduce it (I did here,) but it does not look quite right - we should wrap the Solution#mining_solution{ poa1 = #poa{ chunk = SubChunk } }
below in may_be_leave_it_to_exit_peer
to make sure it errors our on exit peer because there are no proofs, only the raw chunk. Should we fix it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thank you!