gh-151308: Avoid huge pre-allocation in wave.readframes() for crafted files #151488
Open
iamsharduld wants to merge 1 commit into
…rafted files A WAV data chunk records its size in a 4-byte header field that is not validated against the data actually present in the file. A small, truncated, or maliciously crafted file could therefore claim a chunk of several gigabytes and make wave.Wave_read.readframes() pre-allocate that much memory via a single file.read(chunksize) call, leading to a MemoryError (or memory exhaustion) from a tiny input. When the underlying file is seekable, clamp each read in the internal _Chunk.read() to the number of bytes physically available, so we never allocate more than the file can actually provide. The data returned for valid files is unchanged.
The only red check, |
A WAV data chunk records its size in a 4-byte header field that is not
validated against the data actually present in the file. A small,
truncated, or maliciously crafted file can therefore claim a chunk of
several gigabytes and make wave.Wave_read.readframes() pre-allocate that
much memory via a single file.read(chunksize) call, leading to a
MemoryError (or memory exhaustion) from a tiny input.

When the underlying file is seekable, this clamps each read in the internal
_Chunk.read() to the number of bytes physically available, so we never
allocate more than the file can actually provide. The data returned for
valid files is unchanged.
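
To make the mismatch concrete, here is a minimal sketch that builds a tiny WAV file in memory whose data chunk header claims almost 4 GiB while only 8 bytes of payload follow. The helper name build_crafted_wav and the format values (PCM, mono, 8 kHz, 16-bit) are assumptions chosen for illustration; they are not taken from this PR or its tests.

```python
import struct


def build_crafted_wav(claimed_data_size, payload=b"\x00\x00" * 4):
    """Return WAV bytes whose data chunk claims `claimed_data_size` bytes.

    Hypothetical helper for illustration only; not part of the wave module.
    """
    # fmt chunk body: PCM (1), 1 channel, 8000 Hz, 16000 bytes/s,
    # block align 2, 16 bits per sample.
    fmt_body = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
    fmt_chunk = b"fmt " + struct.pack("<I", len(fmt_body)) + fmt_body
    # The 4-byte size field is written as given, regardless of the payload.
    data_chunk = b"data" + struct.pack("<I", claimed_data_size) + payload
    riff_body = b"WAVE" + fmt_chunk + data_chunk
    return b"RIFF" + struct.pack("<I", len(riff_body)) + riff_body


blob = build_crafted_wav(0xFFFFFFF0)

# Compare what the header declares with what the file can actually provide.
data_offset = blob.index(b"data")
(declared,) = struct.unpack_from("<I", blob, data_offset + 4)
available = len(blob) - (data_offset + 8)

# A clamped read would request at most this many bytes.
clamped = min(declared, available)
```

Here the whole file is 52 bytes, the header declares 4294967280 bytes of audio, and only 8 are present, so a clamped request asks for 8 bytes rather than roughly 4 GiB.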
Only the raw file object is probed, never a parent _Chunk, so the probe
can't seek to an untrusted chunk size (which would overflow on 32-bit
platforms such as WASI). Non-seekable streams retain the previous
behaviour, since their size can't be probed without buffering; the
realistic attack vector is a .wav file on disk, which is fully covered.

wave.readframes() via Crafted Chunk Size #151308
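
The probing strategy described above could look roughly like the following sketch. It is not the code in this PR: bytes_available, clamped_read, and the NonSeekable test double are hypothetical names, and the sketch only assumes a file object with the standard seekable(), tell(), seek(), and read() methods.

```python
import io


def bytes_available(raw):
    """Return the bytes left in a seekable raw file, or None if unknown.

    Hypothetical helper: probes only the raw file object and restores
    the original position afterwards.
    """
    try:
        if not raw.seekable():
            return None
        pos = raw.tell()
        raw.seek(0, io.SEEK_END)
        end = raw.tell()
        raw.seek(pos)
    except (AttributeError, OSError):
        # Treat anything that cannot be probed like a non-seekable stream.
        return None
    return max(end - pos, 0)


def clamped_read(raw, size):
    """Read up to `size` bytes, never requesting more than the file holds."""
    available = bytes_available(raw)
    if available is not None:
        size = min(size, available)
    # Non-seekable streams keep the unclamped request.
    return raw.read(size)


class NonSeekable(io.BytesIO):
    """Test double that reports itself as non-seekable (illustration only)."""

    def seekable(self):
        return False


f = io.BytesIO(b"abcdefgh")
f.read(3)                            # consume part of the stream first
remaining = bytes_available(f)       # probe must not move the position
tail = clamped_read(f, 0xFFFFFFF0)   # untrusted size is clamped to 5 bytes
unknown = bytes_available(NonSeekable(b"abcd"))
```

The probe measures from the current position to the end of the raw file, so it never seeks to an offset derived from the untrusted chunk size.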