gh-151308: Avoid huge pre-allocation in wave.readframes() for crafted files by iamsharduld · Pull Request #151488 · python/cpython

iamsharduld · 2026-06-15T08:22:04Z

A WAV data chunk records its size in a 4-byte header field that is not
validated against the data actually present in the file. A small,
truncated, or maliciously crafted file can therefore claim a chunk of
several gigabytes and make wave.Wave_read.readframes() pre-allocate that
much memory via a single file.read(chunksize) call, leading to a
MemoryError (or memory exhaustion) from a tiny input.

When the underlying file is seekable, this clamps each read in the internal
_Chunk.read() to the number of bytes physically available, so we never
allocate more than the file can actually provide. The data returned for
valid files is unchanged.

Only the raw file object is probed, never a parent _Chunk, so the probe
can't seek to an untrusted chunk size (which would overflow on 32-bit
platforms such as WASI). Non-seekable streams retain the previous
behaviour, since their size can't be probed without buffering; the
realistic attack vector is a .wav file on disk, which is fully covered.

Issue: Memory Exhaustion in wave.readframes() via Crafted Chunk Size #151308

…rafted files A WAV data chunk records its size in a 4-byte header field that is not validated against the data actually present in the file. A small, truncated, or maliciously crafted file could therefore claim a chunk of several gigabytes and make wave.Wave_read.readframes() pre-allocate that much memory via a single file.read(chunksize) call, leading to a MemoryError (or memory exhaustion) from a tiny input. When the underlying file is seekable, clamp each read in the internal _Chunk.read() to the number of bytes physically available, so we never allocate more than the file can actually provide. The data returned for valid files is unchanged.

iamsharduld · 2026-06-15T08:59:01Z

The only red check, Docs / Check for removed HTML IDs, fails in the Find merge base step with fatal: shallow file has changed since we read it — the known shallow-clone CI flake from #151365 (git 2.54.0 regression), unrelated to this PR: it changes no docs, and the doc-diff steps were all skipped. Could a maintainer re-run that job? All other checks pass.

bedevere-app Bot added the awaiting review label Jun 15, 2026

bedevere-app Bot mentioned this pull request Jun 15, 2026

Memory Exhaustion in wave.readframes() via Crafted Chunk Size #151308

Open

iamsharduld mentioned this pull request Jun 15, 2026

gh-151308: Avoid huge pre-allocation in wave.readframes() for crafted files #151487

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

gh-151308: Avoid huge pre-allocation in wave.readframes() for crafted files#151488

gh-151308: Avoid huge pre-allocation in wave.readframes() for crafted files#151488
iamsharduld wants to merge 1 commit into
python:mainfrom
iamsharduld:gh-151308-wave-readframes

iamsharduld commented Jun 15, 2026

Uh oh!

iamsharduld commented Jun 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

iamsharduld commented Jun 15, 2026

Uh oh!

iamsharduld commented Jun 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant