Skip to content

feat(storage): implement GCS overrun logging and resume prevention#16150

Open
v-pratap wants to merge 10 commits into
googleapis:mainfrom
v-pratap:log-extra-reads-2
Open

feat(storage): implement GCS overrun logging and resume prevention#16150
v-pratap wants to merge 10 commits into
googleapis:mainfrom
v-pratap:log-extra-reads-2

Conversation

@v-pratap

Copy link
Copy Markdown
Contributor

This commit implements overrun logging, satisfied-resume prevention, and server-side transcoding bypass for both the synchronous and asynchronous GCS download paths.

Synchronous Path:

  • Introduced a private 'OverrunLoggingObjectReadSource' decorator inside 'object_read_streambuf.cc' that wraps the transport source to track received bytes lock-free.
  • Logs a warning with the exact overshoot byte count, bucket, and object name if GCS sends more bytes than requested.
  • Leaves 'object_read_streambuf.h' and 'object_read_stream.h' completely clean, untouched, and thread-safe.

Asynchronous Path (gRPC BiDi Stream & Streaming Reader):

  • Integrated chunk-level byte tracking inside 'ReadRange' and 'AsyncReaderConnectionResume'.
  • Under server-side transcoding (gzip), suppresses overrun warnings and bypasses checksum validation failures (since decompressed bytes naturally exceed the compressed request and mismatch the original checksums).
  • Implemented satisfied-resume prevention: if a connection fails but the requested number of bytes has already been received, we refuse to reconnect to prevent GCS from sending unwanted extra bytes, and return Status::OK directly, masking any subsequent stream errors.
  • Defined overloaded constructors to maintain 100% backward compatibility for all existing tests and client code.

@v-pratap v-pratap requested review from a team as code owners June 10, 2026 10:52
@product-auto-label product-auto-label Bot added the api: storage Issues related to the Cloud Storage API. label Jun 10, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces overrun logging and transcoding detection when reading objects from GCS, updating classes like ReadRange, AsyncReaderConnectionResume, and ObjectReadStreambuf to track received bytes and suppress warnings or checksum validation for transcoded data. The code review identified several critical issues: data races in ObjectDescriptorImpl::OnRead due to accessing metadata_ after unlocking and in AsyncReaderConnectionResume::Reconnect due to accessing impl_ without a lock; potential undefined behavior from signed integer overflow when negating p.start at its minimum value; duplicated overrun check logic in OverrunLoggingObjectReadSource; and a bug where casting a negative requested_length_ to std::size_t causes an unsigned wrap-around, bypassing the overrun check.

Comment thread google/cloud/storage/internal/async/object_descriptor_impl.cc
Comment thread google/cloud/storage/internal/async/reader_connection_resume.cc Outdated
Comment thread google/cloud/storage/internal/async/object_descriptor_impl.cc Outdated
Comment thread google/cloud/storage/internal/object_read_streambuf.cc Outdated
Comment thread google/cloud/storage/internal/object_read_streambuf.cc
@codecov

codecov Bot commented Jun 10, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 98.61751% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.21%. Comparing base (ce5aa34) to head (5a7d764).

Files with missing lines Patch % Lines
...d/storage/internal/async/object_descriptor_impl.cc 80.00% 3 Missing ⚠️
...le/cloud/storage/internal/async/read_range_test.cc 98.27% 2 Missing ⚠️
...le/cloud/storage/internal/object_read_streambuf.cc 95.45% 2 Missing ⚠️
google/cloud/storage/internal/async/read_range.cc 95.83% 1 Missing ⚠️
...storage/internal/async/reader_connection_resume.cc 96.55% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main   #16150    +/-   ##
========================================
  Coverage   92.20%   92.21%            
========================================
  Files        2264     2264            
  Lines      209092   209725   +633     
========================================
+ Hits       192802   193405   +603     
- Misses      16290    16320    +30     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@v-pratap v-pratap requested a review from kalragauri June 11, 2026 04:56
Comment thread google/cloud/storage/internal/async/object_descriptor_impl.cc
Comment thread google/cloud/storage/internal/async/object_descriptor_impl.cc
Comment thread google/cloud/storage/internal/object_read_streambuf.cc Outdated
Comment thread google/cloud/storage/internal/async/read_range.cc
Comment thread google/cloud/storage/internal/object_read_streambuf.cc Outdated
@kalragauri

Copy link
Copy Markdown
Contributor

Overall looks good - added some minor comments.

v-pratap added 9 commits June 15, 2026 07:15
This commit implements overrun logging, satisfied-resume prevention, and server-side transcoding bypass for both the synchronous and asynchronous GCS download paths.

Synchronous Path:
- Introduced a private 'OverrunLoggingObjectReadSource' decorator inside 'object_read_streambuf.cc' that wraps the transport source to track received bytes lock-free.
- Logs a warning with the exact overshoot byte count, bucket, and object name if GCS sends more bytes than requested.
- Leaves 'object_read_streambuf.h' and 'object_read_stream.h' completely clean, untouched, and thread-safe.

Asynchronous Path (gRPC BiDi Stream & Streaming Reader):
- Integrated chunk-level byte tracking inside 'ReadRange' and 'AsyncReaderConnectionResume'.
- Under server-side transcoding (gzip), suppresses overrun warnings and bypasses checksum validation failures (since decompressed bytes naturally exceed the compressed request and mismatch the original checksums).
- Implemented satisfied-resume prevention: if a connection fails but the requested number of bytes has already been received, we refuse to reconnect to prevent GCS from sending unwanted extra bytes, and return Status::OK directly, masking any subsequent stream errors.
- Defined overloaded constructors to maintain 100% backward compatibility for all existing tests and client code.

Testing:
- Added 15 new, comprehensive unit tests covering all overrun, tail-read, transcoding, and resume scenarios.
- Fix data race on metadata_ in ObjectDescriptorImpl::OnRead by extracting is_transcoded status before unlocking the mutex.
- Fix data race on impl_ in AsyncReaderConnectionResume::Reconnect by calling CurrentImpl() to safely acquire the lock.
- Fix potential signed integer negation overflow undefined behavior in ObjectDescriptorImpl::Read by explicitly checking for numeric_limits min() and capping the limit.
- De-duplicate overrun check logic in OverrunLoggingObjectReadSource by calling CheckOverrun() directly in Read().
- Fix potential signed-to-unsigned cast wrap-around bug in CheckOverrun() by checking that requested_length_ is non-negative before casting.
- Dynamically verify if server-side decompressive transcoding actually occurred by comparing total received bytes to the object's original stored size (when available).
- Prevents silently ignoring real data corruption when a user explicitly requests compressed gzip-encoded bytes and receives a corrupted file.
- Falls back to the standard is_transcoded flag if the object size is unknown (maintaining 100% backward compatibility and preventing test breakages).
- Applied this robust check to both ReadRange (async BiDi range reads) and AsyncReaderConnectionResume (async streaming reader).
- Added a new unit test 'TranscodingFailsOnRealChecksumMismatch' to verify that a checksum mismatch is correctly reported as an error if the received size matches the compressed object size.
- Resolved a compiler signedness warning in read_range.cc by casting object_size to std::size_t.
- Made the 'logged_warning_' flag in ObjectReadStreambuf a std::atomic<bool> to ensure absolute thread-safety.
- Refactored overrun warning logs to print once at the end of the download (stream Close or completion/failure) rather than immediately on the first overrunning chunk.
- Aligns the C++ GCS client's logging behavior with the official Google Cloud Go client (cloud.google.com/go/storage).
- Ensures that the warning log reports the true, final total overrun byte count instead of a partial overrun count from the first chunk.
- Sync Path: Removed CheckOverrun from the hot Read() path in ObjectReadStreambuf, moving it to Close() and the decorator destructor (~OverrunLoggingObjectReadSource).
- Async Range Path: Deferred CheckOverrun in ReadRange to the final range_end and OnFinish paths.
- Async Resume Path: Deferred CheckOverrun in AsyncReaderConnectionResume to the final status.ok, satisfied, and stop paths.
- Updated all unit tests in object_read_streambuf_test.cc, read_range_test.cc, and reader_connection_resume_test.cc to drive the streams to completion (EOF/Close) before verifying the logs.
- Converted the logged_warning_ check-and-set in ObjectReadStreambuf to a thread-safe atomic exchange (!logged_warning_.exchange(true)) to prevent any potential check-then-act race conditions.
- Added explicit static_cast<std::size_t> casts to requested_length_ in both ReadRange and AsyncReaderConnectionResume overrun logging statements. This prevents signed/unsigned mixing arithmetic and conforms perfectly to strict compiler sign-comparison flags.
- Documented and verified architectural correctness of total_received_bytes_ relative counting (essential for ranged reads starting at non-zero offsets) and the >0 read_limit check (mandatory for GCS proto3 default semantics).
@v-pratap v-pratap force-pushed the log-extra-reads-2 branch from 1e5b546 to 957d8a0 Compare June 15, 2026 07:15
@v-pratap v-pratap requested a review from kalragauri June 15, 2026 07:18

if (data.range_end()) {
status_ = Status{};
CheckOverrun();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do this if we have satisfied the requested length, even if range_end is false?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think yes, Once we get all the bytes the user asked for, the download is basically done. If we wait for the server to close the stream, any sudden network drop in the meantime would fail the download, even though we already have all the data. Completing immediately avoids this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: storage Issues related to the Cloud Storage API.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants