Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new method to deal with offset==0 erroneous edge case #3937

Merged
merged 3 commits into from
Mar 12, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 0 additions & 14 deletions doc/decompressor_accepted_invalid_data.md

This file was deleted.

60 changes: 60 additions & 0 deletions doc/decompressor_permissive.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
Decompressor Permissiveness to Invalid Data
===========================================

This document describes the behavior of the reference decompressor in cases
where it accepts formally invalid data instead of reporting an error.

While the reference decompressor *must* decode any compliant frame following
the specification, its ability to detect erroneous data is on a best effort
basis: the decoder may accept input data that would be formally invalid,
when it causes no risk to the decoder, and which detection would cost too much
complexity or speed regression.

In practice, the vast majority of invalid data are detected, if only because
many corruption events are dangerous for the decoder process (such as
requesting an out-of-bound memory access) and many more are easy to check.

This document lists a few known cases where invalid data was formerly accepted
by the decoder, and what has changed since.


Offset == 0
-----------

**Last affected version**: v1.5.5

**Produced by the reference compressor**: No

**Example Frame**: `28b5 2ffd 0000 4500 0008 0002 002f 430b ae`

If a sequence is decoded with `literals_length = 0` and `offset_value = 3`
while `Repeated_Offset_1 = 1`, the computed offset will be `0`, which is
invalid.

The reference decompressor up to v1.5.5 processes this case as if the computed
offset was `1`, including inserting `1` into the repeated offset list.
This prevents the output buffer from remaining uninitialized, thus denying a
potential attack vector from an untrusted source.
However, in the rare case where this scenario would be the outcome of a
transmission or storage error, the decoder relies on the checksum to detect
the error.

In newer versions, this case is always detected and reported as a corruption error.


Non-zeroes reserved bits
------------------------

**Last affected version**: v1.5.5

**Produced by the reference compressor**: No

The Sequences section of each block has a header, and one of its elements is a
byte, which describes the compression mode of each symbol.
This byte contains 2 reserved bits which must be set to zero.

The reference decompressor up to v1.5.5 just ignores these 2 bits.
This behavior has no consequence for the rest of the frame decoding process.

In newer versions, the 2 reserved bits are actively checked for value zero,
and the decoder reports a corruption error if they are not.
2 changes: 1 addition & 1 deletion lib/decompress/zstd_decompress_block.c
Original file line number Diff line number Diff line change
@@ -1305,7 +1305,7 @@ ZSTD_decodeSequence(seqState_t* seqState, const ZSTD_longOffset_e longOffsets, c
} else {
offset = ofBase + ll0 + BIT_readBitsFast(&seqState->DStream, 1);
{ size_t temp = (offset==3) ? seqState->prevOffset[0] - 1 : seqState->prevOffset[offset];
temp += !temp; /* 0 is not valid; input is corrupted; force offset to 1 */
temp -= !temp; /* 0 is not valid: input corrupted => force offset to -1 => corruption detected at execSequence */
if (offset != 1) seqState->prevOffset[2] = seqState->prevOffset[1];
seqState->prevOffset[1] = seqState->prevOffset[0];
seqState->prevOffset[0] = offset = temp;
Binary file added tests/golden-decompression-errors/off0.bin.zst
Binary file not shown.