Skip to content

fix(storage): fix flaky storage integration tests#13253

Draft
nidhiii-27 wants to merge 4 commits into
mainfrom
storage/flaky-test-fix
Draft

fix(storage): fix flaky storage integration tests#13253
nidhiii-27 wants to merge 4 commits into
mainfrom
storage/flaky-test-fix

Conversation

@nidhiii-27
Copy link
Copy Markdown
Contributor

This PR addresses intermittent flakiness, connection refused errors, and state validation exceptions in the Google Cloud Storage integration and unit test suites under highly concurrent CI (Kokoro) environments.

Key Fixes

  1. Testbench Lifecycle Isolation (Forks Race Condition):

    • Problem: Maven's parallel JVM forks (<forkCount>1C</forkCount>) collided on hardcoded ports (9000/9005) and container names. When a fork completed first, it prematurely killed the shared Testbench container, causing all remaining active tests in other forks to fail with ConnectException: Connection refused.
    • Solution: Implemented dynamic port allocation and unique container naming (fork-<port>) in Registry.java. This fully isolates each parallel JVM process's testbench container instance.
  2. Mock Concurrency Isolation (DirectWriteService):

    • Problem: The mock DirectWriteService in ITGapicUnbufferedWritableByteChannelTest held requests in a shared mutable class field. Under concurrent gRPC threads in CI, the request builder got corrupted, throwing false-positive PERMISSION_DENIED: Unexpected request chain errors.
    • Solution: Refactored DirectWriteService to keep the request builder completely isolated and local to each stream observer instance.
  3. Graceful Terminal Success Handling:

    • Problem: During channel auto-close, buffered.close() flushes the final bytes, successfully transitioning the stream's state to TERMINAL_SUCCESS. Immediately after, unbuffered.close() runs and calls finishWrite again. Querying the terminal state threw IllegalStateException: state mismatch.
    • Solution: Modified finishWrite and closeStream inside BidiUploadStreamingStream.java to return true (safe no-op) immediately if the upload state has already successfully reached TERMINAL_SUCCESS. Also updated mock classes in BidiUploadTest to support the new state checks.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces state checks in BidiUploadStreamingStream to return early when the stream is in a terminal success state during finishWrite and closeStream operations. It also refactors DirectWriteService in the integration tests to scope the request builder locally within the writeObject method. The review feedback suggests extending the state checks to cover all terminal states, including failures, to prevent potential IllegalStateException during subsequent state transitions.

Comment on lines +168 to +170
if (state.getState() == State.TERMINAL_SUCCESS) {
return true;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The check only accounts for State.TERMINAL_SUCCESS. If the stream has reached a terminal failure state (e.g., TERMINAL_FAILURE), subsequent calls to finishWrite might still trigger an IllegalStateException when calling state.isFinalizing() or other methods that enforce state transitions. Consider checking for any terminal state to ensure the method is robust against all completed stream states.

Comment on lines +198 to +200
if (state.getState() == State.TERMINAL_SUCCESS) {
return true;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the change in finishWrite, this check only handles TERMINAL_SUCCESS. To prevent IllegalStateException when the stream has failed or been cancelled, it would be safer to check for any terminal state before proceeding with state.finalFlush(length).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant