Feature/packing difficulty #590

Merged: 20 commits merged into master from feature/packing-difficulty on Sep 20, 2024

Conversation

ldmberman
Member

No description provided.

@ldmberman ldmberman requested review from vird and JamesPiechota July 16, 2024 21:37
@ldmberman ldmberman force-pushed the feature/packing-difficulty branch from d723102 to eda26c0 on August 1, 2024 09:46
for (int k = 0; k < offsetByteSize; k++) {
offsetBytes[k] = ((offset + subChunkSize) >> (8 * (offsetByteSize - 1 - k))) & 0xFF;
}
SHA256_CTX sha256;
Collaborator

Is this a correct understanding of this block?

  1. It computes a feistel hash which is used as the encryption key for the sub-chunk
  2. the feistel hash is generated from the packing address (i.e. inputData) and the offset of the sub-chunk
  3. this guarantees that the encryption key for each sub-chunk is unique but deterministic

If that's correct maybe we could add something like that as a comment? e.g.

// Sub-chunk encryption key is the feistel hash of the input data and the sub-chunk offset

Member Author

It is a sha2 hash (not a feistel)

Collaborator

haha, my ignorance is showing. I was going off of pattern matching with the feistel_hash() function: https://github.com/ArweaveTeam/arweave/blob/master/apps/arweave/c_src/feistel_msgsize_key_cipher.cpp#L7-L13

The code blocks look pretty similar - is the main difference that a feistel hash forces a 32-byte length? (or is it just a misnaming of the feistel_hash(...) function that I should ignore?)

Member Author

The block computes sha256(<< packing_key_without_sub_chunk_offset, sub_chunk_offset >>).
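For readers skimming the thread, here is a condensed, self-contained sketch of that key derivation, assembled from the snippets above. It is only an illustration: the function name and the packingKey/packingKeyLen parameters are placeholders, and the deprecated OpenSSL SHA256_* calls are used solely because the diff itself uses them.

#include <stddef.h>
#include <stdint.h>
#include <openssl/sha.h>

// Sketch: per-sub-chunk key = SHA-256(packing key || 3-byte big-endian sub-chunk offset).
// In the diff, the offset fed in is (offset + subChunkSize), i.e. the end offset of the
// current sub-chunk.
static void derive_sub_chunk_key(const unsigned char *packingKey, size_t packingKeyLen,
		uint32_t subChunkOffset, unsigned char key[SHA256_DIGEST_LENGTH]) {
	// 3 bytes is sufficient to represent offsets up to at most MAX_CHUNK_SIZE.
	const int offsetByteSize = 3;
	unsigned char offsetBytes[3];
	for (int k = 0; k < offsetByteSize; k++) {
		offsetBytes[k] = (subChunkOffset >> (8 * (offsetByteSize - 1 - k))) & 0xFF;
	}
	SHA256_CTX sha256;
	SHA256_Init(&sha256);
	SHA256_Update(&sha256, packingKey, packingKeyLen);
	SHA256_Update(&sha256, offsetBytes, offsetByteSize);
	SHA256_Final(key, &sha256);
}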

// 3 bytes is sufficient to represent offsets up to at most MAX_CHUNK_SIZE.
int offsetByteSize = 3;
unsigned char offsetBytes[offsetByteSize];
for (int k = 0; k < offsetByteSize; k++) {
Collaborator

Suggested change
for (int k = 0; k < offsetByteSize; k++) {
// Byte string representation of the sub-chunk offset: i * subChunkSize
for (int k = 0; k < offsetByteSize; k++) {

SHA256_Update(&sha256, offsetBytes, offsetByteSize);
SHA256_Final(key, &sha256);

for (int j = 0; j < iterations; j++) {
Collaborator

Suggested change
for (int j = 0; j < iterations; j++) {
// Sequentially encrypt each sub-chunk 'iterations' times.
// The encrypted output of each iteration is the input for the following iteration.
for (int j = 0; j < iterations; j++) {

uint32_t subChunkSize = MAX_CHUNK_SIZE / subChunkCount;
uint32_t offset = 0;
unsigned char key[PACKING_KEY_SIZE];
for (int i = 0; i < subChunkCount; i++) {
Collaborator

Suggested change
for (int i = 0; i < subChunkCount; i++) {
// Encrypt each sub-chunk independently and then concatenate the encrypted sub-chunks to yield the encrypted composite chunk
for (int i = 0; i < subChunkCount; i++) {

uint32_t subChunkSize = outChunkLen / subChunkCount;
uint32_t offset = 0;
unsigned char key[PACKING_KEY_SIZE];
for (int i = 0; i < subChunkCount; i++) {
Collaborator

Suggested change
for (int i = 0; i < subChunkCount; i++) {
// Decrypt each sub-chunk independently and then concatenate the decrypted sub-chunks to yield the decrypted composite chunk
for (int i = 0; i < subChunkCount; i++) {

unsigned char* decryptedChunk = enif_make_new_binary(envPtr, outChunkLen,
&decryptedChunkTerm);
unsigned char* decryptedSubChunk;
// Both MAX_CHUNK_SIZE and subChunkCount are multiples of 64 so all sub-chunks
Collaborator

Suggested change
// Both MAX_CHUNK_SIZE and subChunkCount are multiples of 64 so all sub-chunks
// Both outChunkLen and subChunkCount are multiples of 64 so all sub-chunks

iterations < 1) {
return enif_make_badarg(envPtr);
}
if (!enif_get_int(envPtr, argv[8], &subChunkCount) ||
Collaborator

Do we also need to test that subChunkCount is a multiple of 64? e.g. I believe a subChunkCount of 2 would pass these checks but would then run afoul of this comment // Both MAX_CHUNK_SIZE and subChunkCount are multiples of 64

Collaborator

Or, maybe that comment just needs to be updated - since we don't care if subChunkCount is a multiple of 64 so long as all sub-chunks are the same size and that size is a multiple of 64.
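For what it's worth, a minimal sketch of the kind of argument check being discussed here (the names outChunkLen and subChunkCount follow the snippets above; whether the PR ends up validating exactly this is an assumption):

#include <stdbool.h>
#include <stdint.h>

// Sketch: accept a sub-chunk count only if it splits the chunk into equal
// sub-chunks whose size is a multiple of 64, as the comment in the diff requires.
static bool is_valid_sub_chunk_count(uint32_t outChunkLen, int subChunkCount) {
	if (subChunkCount < 1) {
		return false;
	}
	if (outChunkLen % (uint32_t)subChunkCount != 0) {
		return false; // sub-chunks would not all be the same size
	}
	uint32_t subChunkSize = outChunkLen / (uint32_t)subChunkCount;
	return subChunkSize % 64 == 0;
}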

for (int k = 0; k < offsetByteSize; k++) {
offsetBytes[k] = ((offset + subChunkSize) >> (8 * (offsetByteSize - 1 - k))) & 0xFF;
}
SHA256_CTX sha256;
Collaborator

Suggested change
SHA256_CTX sha256;
// Sub-chunk encryption key is the feistel hash of the input data and the sub-chunk offset
SHA256_CTX sha256;

SHA256_Update(&sha256, offsetBytes, offsetByteSize);
SHA256_Final(key, &sha256);

for (int j = 0; j < iterations; j++) {
Collaborator

Suggested change
for (int j = 0; j < iterations; j++) {
// Sequentially encrypt each sub-chunk 'iterations' times.
// The encrypted output of each iteration is the input for the following iteration.
for (int j = 0; j < iterations; j++) {

// 3 bytes is sufficient to represent offsets up to at most MAX_CHUNK_SIZE.
int offsetByteSize = 3;
unsigned char offsetBytes[offsetByteSize];
for (int k = 0; k < offsetByteSize; k++) {
Collaborator

Suggested change
for (int k = 0; k < offsetByteSize; k++) {
// Byte string representation of the sub-chunk offset: i * subChunkSize
for (int k = 0; k < offsetByteSize; k++) {

%% The number of sub-chunks in a compositely packed chunk with the packing difficulty
%% between 1 and 32 incl. The composite packing with the packing difficulty 1 matches
%% approximately the non-composite 2.6 packing in terms of computational costs.
-define(PACKING_DIFFICULTY_ONE_SUB_CHUNK_COUNT, 32).
Collaborator

The number of sub-chunks is the same for every packing difficulty, right? i.e. this isn't specifically about "Packing Difficulty 1" right? Or have I misunderstood?

I.e. could this be called COMPOSITE_PACKING_SUB_CHUNK_COUNT or something and still be accurate?

Collaborator

Hm, okay I see elsewhere we use the term "atomic sub-chunk" or "basic atomic sub-chunk" - are those different from a "packing difficulty 1 sub-chunk"?

If they're all the same thing, what do you think about just labeling them all "sub-chunk" and dropping the qualifiers - I don't think we have the concept of a sub-chunk anywhere else in the code yet, right?
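As a side note on the numbers involved (assuming the standard 262144-byte chunk size, which this diff does not restate): 32 sub-chunks per chunk gives 8192-byte sub-chunks, matching the 8192-byte sub-chunk mentioned in the record-field suggestion further down.

// Illustration only; maxChunkSize is an assumed value, not quoted from this diff.
#include <assert.h>
#include <stdint.h>

int main(void) {
	const uint32_t maxChunkSize = 262144;         // assumed MAX_CHUNK_SIZE (256 KiB)
	const uint32_t subChunkCount = 32;            // ?PACKING_DIFFICULTY_ONE_SUB_CHUNK_COUNT
	assert(maxChunkSize / subChunkCount == 8192); // the sub-chunk size referenced elsewhere in this PR
	return 0;
}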

@@ -98,12 +98,13 @@ show_help() ->
"a particular data range. The data and metadata related to the module "
"are stored in a dedicated folder "
"([data_dir]/storage_modules/storage_module_[partition_number]_[packing]/"
") where packing is either a mining address or \"unpacked\"."
" Example: storage_module 0,En2eqsVJARnTVOSh723PBXAKGmKgrGSjQ2YIGwE_ZRI. "
") where packing is either a mining address + : + packing difficulty "
Collaborator

Suggested change
") where packing is either a mining address + : + packing difficulty "
") where packing is either <mining address>:<packing difficulty> "

@@ -359,7 +376,8 @@
%% of the corresponding transaction. Proofs the inclusion of the chunk
%% in the corresponding "data_root" under a particular offset.
data_path = <<>>,
chunk = <<>>
chunk = <<>>,
Collaborator

Suggested change
chunk = <<>>,
%% When packing difficulty is 0, chunk stores a full packed chunk
%% When packing difficulty >= 1, chunk stores an 8192-byte packed sub-chunk
chunk = <<>>,

@@ -359,7 +376,8 @@
%% of the corresponding transaction. Proofs the inclusion of the chunk
%% in the corresponding "data_root" under a particular offset.
data_path = <<>>,
chunk = <<>>
chunk = <<>>,
unpacked_chunk = <<>>
Collaborator

Suggested change
unpacked_chunk = <<>>
%% When packing difficulty is 0, unpacked_chunk remains <<>>
%% When packing difficulty >= 1, unpacked_chunk stores a full unpacked chunk
unpacked_chunk = <<>>

@@ -526,7 +605,9 @@ hash_wallet_list(WalletList) ->
(ar_serialize:encode_bin(LastTX, 8))/binary,
(ar_serialize:encode_int(Denomination, 8))/binary,
MiningPermissionBin/binary >>,
crypto:hash(sha384, Preimage)
Alg = case Height >= ar_fork:height_2_8() of true -> sha256;
Collaborator

I forget the context on this change. Is this required for the packing difficulty changes, or are we taking the opportunity to make this change since we have to do a hard fork anyway?

And the rationale is to reduce the size of the wallet list?

ChunkOffset, TXRoot, Chunk, ChunkSize, RandomXStateRef, External)
when RequestedAddr == StoredAddr,
StoredPackingDifficulty > RequestedPackingDifficulty ->
repack_no_nif({RequestedPacking, StoredPacking, ChunkOffset, TXRoot, Chunk,
Collaborator

These all use repack_no_nif because the nif doesn't handle repacking across formats (e.g. legacy to composite, composite to legacy, or changing composite difficulties)?

Member Author

Yes, except that the current NIF implementation can change the packing difficulty up (make it more difficult) but not down.
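To make the fallback rule concrete, here is a tiny sketch of the constraint as I read this thread (the real dispatch is the Erlang function clauses above; the helper name and boolean parameters are hypothetical):

#include <stdbool.h>

// Sketch: the composite re-encryption NIF can only keep the packing composite and
// raise (or keep) the packing difficulty; crossing legacy <-> composite, or lowering
// the difficulty, falls back to repack_no_nif (unpack, then pack again).
static bool can_use_composite_repack_nif(bool stored_is_composite, bool requested_is_composite,
		int stored_difficulty, int requested_difficulty) {
	return stored_is_composite && requested_is_composite
			&& requested_difficulty >= stored_difficulty;
}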

{<<"bucketsize">>, BucketSize},
{<<"addr">>, EncodedAddr},
{<<"pdiff">>, PackingDifficulty}
]} when is_integer(PackingDifficulty), PackingDifficulty >= 0,
Collaborator

Should this check >= 1? It looks like we only serialize to {pdiff, xxx} if the difficulty is >= 1, so if we get a 0 here it might indicate some weird error?

Indices = collect_missing_tx_indices(Prefixes),
IsSolutionHashKnown =
Collaborator

Why are we removing all this code?

Member Author

This one might be questionable, but I felt like this feature is more trouble to maintain than it is actually useful. The feature is about validators optionally informing the block sender that they already have the chunks, so that some bandwidth is saved while the validator reads and packs the chunk(s) themselves.

Thinking about it now, I believe the feature might in fact be harmful in the long run, especially given the composite packing specifics, because it may make miners rely excessively on validators extracting the chunks; if such validators disappear over time, the network might not notice and might not react sufficiently quickly. In other words, I would rather make the network spend some bandwidth but require all miners to set up the chunk building correctly from the start.

Collaborator

works for me! I'm a fan of removing code.

get_max_nonce(0) ->
get_max_nonce2((?RECALL_RANGE_SIZE) div ?DATA_CHUNK_SIZE);
get_max_nonce(PackingDifficulty) ->
AdjustedRecallRangeSize = ?RECALL_RANGE_SIZE div PackingDifficulty,
Collaborator

Suggested change
AdjustedRecallRangeSize = ?RECALL_RANGE_SIZE div PackingDifficulty,
AdjustedRecallRangeSize = get_recall_range(PackingDifficulty),

@@ -510,17 +566,29 @@ pre_validate_cumulative_difficulty(B, PrevB, SolutionResigned, Peer) ->
gen_server:cast(?MODULE, {enqueue, {B, PrevB, true, Peer}}),
enqueued;
false ->
pre_validate_quick_pow(B, PrevB, false, Peer)
pre_validate_packing_difficulty(B, PrevB, false, Peer)
Collaborator

Is this ever called with SolutionResigned == true?

Member Author

Oh, good catch! This is wrong, fixed

@@ -34,6 +34,9 @@

encode_packing({spora_2_6, Addr}) ->
"spora_2_6_" ++ binary_to_list(ar_util:encode(Addr));
encode_packing({composite, Addr, PackingDifficulty}) ->
"composite_" ++ binary_to_list(ar_util:encode(Addr)) ++ ":"
Collaborator

Flagging this discussion here (it was started in the other repo). @vird pointed out that : is used for the address checksum and it will cause a problem on Windows (eventually). So perhaps we can select a different delimiter?

Slack discussion here: https://digitalhistor-8tx7838.slack.com/archives/C04Q93X02AJ/p1723576491561449

Maybe we use . as the delimiter?

Member Author

I like .!

@ldmberman ldmberman force-pushed the feature/packing-difficulty branch 3 times, most recently from 8af401a to 77af830 on August 15, 2024 12:50
@ldmberman ldmberman force-pushed the feature/packing-difficulty branch from 7fb7006 to 2710c1f on August 28, 2024 09:37
@ldmberman ldmberman force-pushed the feature/packing-difficulty branch from b9f077c to 8a9ca76 on September 5, 2024 17:00
Lev Berman and others added 20 commits September 20, 2024 20:41
The commit introduces the new hard fork (2.8); the height is not set yet.

Composite packing makes it possible to extract more performance out of bigger-capacity
lower-bandwidth drives by choosing a costlier packing that reduces the
required mining bandwidth.
some bug fixes and some refactoring

Co-authored-by: James Piechota <piechota@gmail.com>
Remove old randomx nifs:

bulk_hash_fast_nif
hash_fast_verify_nif
hash_fast_long_with_entropy_nif
bulk_hash_fast_long_with_entropy_nif
vdf_parallel_sha_verify_nif
- Fix a bug where the H1 peer receives H2 and builds an incorrect solution after fetching proofs
  from the network instead of the local storage;
- Do not give up solutions when a CM miner cannot find proofs or VDF
  steps - delegate it to the exit peer;
- Fix incorrect sub-chunk picking during solution validation by the
  mining server.
Adjusting the hash rate by (TwoChunkCount + OneChunkCount) / TwoChunkCount counts
the number of all the hashing attempts. In other words, the adjusted value is useful
when we want to see the total amount of CPU work put into mining a block.
This is not what we use it for. We use it as a denominator when computing
a share contributed by a single partition - see ar_pricing:get_v2_price_per_gib_minute.
Therefore, the hash rate computed here needs to have the same "units" as
the hash rate we estimate for the partition - the "normalized" hash rate
where one recall range only produces 4 nonces (chunk-1) plus up to 400 nonces (chunk-2).

Note that we did not even adjust it by (TwoChunkCount + OneChunkCount) / TwoChunkCount but
by 100 div (100 + 1), which is wrong.
- Move Fast/Light flag into the randomx state

now we only need the _fast vs _light methods on init

- Add randomx_info_nif and update lib/RandomX submodule

randomx_info_nif provides some info about a RandomX state -
currently only used in tests.

Updated lib/RandomX submodule exposes randomx dataset and
cache size values via -D build flags

- Change up how we handle DEBUG vs. non-DEBUG for ar_mine_randomx.erl

No longer use -define; instead switch behavior based on the
randomx State. This will let us test the non-DEBUG functionality
for some more code paths

- Add more ar_mine_randomx tests:

1. Test for a memory corruption issue in the decrypt_composite_nif
2. Tests for the NIF wrappers

- Remove the reencrypt_legacy_composit nif

  in preparation for increasing the randomx dataset for composite packing

- Remove obsolete c_src tests binary
- Remove ar_randomx_state
- Applies to composite packing only;
- Composite packing with packing difficulty=2 has a recall range
  of 25 MiB, packing difficulty=3 has 12.5 MiB, etc.
Build two versions of the randomx NIFs: one for the 512 MiB dataset
(rx512) and one for the 4096 MiB dataset (rx4096)
On macOS the symbols in each of the statically linked librandomx.a libraries conflicted at runtime, causing errors or segfaults. The fix is to build the libraries with the -fvisibility=hidden flag. This prevents the symbols from being exported beyond the .so that the libraries are linked into.
NODE_NAME='blah@127.0.0.1' ./bin/start xxxx
@JamesPiechota JamesPiechota force-pushed the feature/packing-difficulty branch from d42664f to 50a0afe on September 20, 2024 20:41
@JamesPiechota JamesPiechota merged commit 5e77ca1 into master Sep 20, 2024
2 checks passed