Python(fix): add channel timeout #609

Merged
alexluck-sift merged 3 commits into main from al/python/chore/timeouts on Jun 3, 2026
Collaborator

@alexluck-sift alexluck-sift commented Jun 2, 2026

Summary

Synchronous SDK calls could hang indefinitely. A stalled gRPC connection, or a call issued after the client's background event loop had stopped during teardown, left the calling thread blocked forever because neither the RPC nor the blocking wait had a deadline. This adds a default request deadline plus two backstops so a stalled or post-shutdown call fails with a clear error instead of hanging.

Changes

  • Default RPC deadline (primary). Unary calls now carry a 60s deadline via the gRPC service config (methodConfig.timeout), alongside the existing retry policy. A stalled call fails with DEADLINE_EXCEEDED rather than hanging. Configurable per connection via GrpcConfig(request_timeout=...), overridable per call, and disabled by passing None.
  • Bounded synchronous wait. The sync wrapper's .result() now uses a deadline (RPC timeout plus a margin) and raises TimeoutError after cancelling the request. The gRPC deadline fires first, so this only trips if that fails.
  • Fail fast after close. A call made once the background loop has stopped now raises a clear RuntimeError instead of scheduling onto a dead loop and blocking.
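
The two backstops can be sketched as follows. This is a minimal illustration, not the SDK's code: `BackgroundLoop`, `run_sync`, and `sync_wait_timeout` are hypothetical names, the 15s margin is inferred from the 75s cap mentioned in review, and treating `None` as "no sync deadline" is an assumption.

```python
import asyncio
import concurrent.futures
import threading
from typing import Optional


class BackgroundLoop:
    """Hypothetical stand-in for the client's background event loop."""

    def __init__(self) -> None:
        self.loop = asyncio.new_event_loop()
        self._started = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        self._started.wait()  # return only once the loop is actually running

    def _run(self) -> None:
        self.loop.call_soon(self._started.set)
        self.loop.run_forever()

    def close(self) -> None:
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()


def sync_wait_timeout(request_timeout: Optional[float], margin: float = 15.0) -> Optional[float]:
    """RPC timeout plus a margin; None (assumed) means no sync deadline."""
    return None if request_timeout is None else request_timeout + margin


def run_sync(bg: BackgroundLoop, coro, timeout: Optional[float]):
    loop = bg.loop
    # Fail fast: never schedule onto a loop that has stopped or been closed.
    if loop.is_closed() or not loop.is_running():
        coro.close()  # avoids a "coroutine was never awaited" warning
        raise RuntimeError("client is closed: background event loop is not running")
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    try:
        # Bounded wait: the calling thread can no longer block forever.
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        if future.done():
            raise  # the coroutine itself raised a timeout; propagate it
        future.cancel()  # cancel the in-flight request before reporting
        raise TimeoutError(f"call did not complete within {timeout}s") from None
```

With a 60s request timeout, `sync_wait_timeout(60.0)` returns 75.0, so in this sketch the gRPC deadline would normally fire well before the sync wait does.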

Testing

  • _internal unit suite passes, including new coverage for the sync deadline and the post-close guard.
  • Confirmed the deadline lands in the channel service config: 60s default, explicit override honored, omitted when disabled.
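
For reference, a sketch of the three service-config cases checked above (default, override, disabled). `build_service_config` and `DEFAULT_REQUEST_TIMEOUT` are hypothetical names, and the `initialBackoff`, `backoffMultiplier`, and `retryableStatusCodes` values are placeholders rather than the SDK's actual retry settings; only the 60s default, 5 attempts, and 5s max backoff come from this PR.

```python
import json
from typing import Optional

DEFAULT_REQUEST_TIMEOUT = 60.0  # seconds; the default described in this PR


def _duration(seconds: float) -> str:
    """Format seconds as a protobuf JSON duration string, e.g. 60 -> "60s"."""
    return f"{seconds:.3f}".rstrip("0").rstrip(".") + "s"


def build_service_config(request_timeout: Optional[float] = DEFAULT_REQUEST_TIMEOUT) -> str:
    method_config = {
        "name": [{}],  # an empty name entry is the wildcard: matches every method
        "retryPolicy": {
            "maxAttempts": 5,
            "initialBackoff": "0.1s",  # assumed value
            "maxBackoff": "5s",
            "backoffMultiplier": 2,  # assumed value
            "retryableStatusCodes": ["UNAVAILABLE"],  # assumed value
        },
    }
    # Passing None disables the deadline: the key is omitted entirely.
    if request_timeout is not None:
        method_config["timeout"] = _duration(request_timeout)
    return json.dumps({"methodConfig": [method_config]})


print(build_service_config())      # default: includes "timeout": "60s"
print(build_service_config(2.5))   # explicit override: "timeout": "2.5s"
print(build_service_config(None))  # disabled: no "timeout" key
```

The resulting JSON string would be passed to the channel through the `grpc.service_config` channel option.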

Notes

The default of 60s sits above the retry budget (5 attempts, 5s max backoff). The service-config timeout applies to all methods via the wildcard matcher; this is safe because high-throughput streaming runs through the Rust stream bindings on a separate connection, not this channel.
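
One way to read "sits above the retry budget" is as an upper bound on time spent backing off. The arithmetic below is a rough sketch that ignores jitter and the duration of the attempts themselves; the function name is illustrative.

```python
MAX_ATTEMPTS = 5          # from the retry policy
MAX_BACKOFF_S = 5.0       # from the retry policy
REQUEST_TIMEOUT_S = 60.0  # default deadline added in this PR


def worst_case_backoff(max_attempts: int, max_backoff_s: float) -> float:
    # N attempts are separated by N - 1 waits, each capped at max_backoff_s.
    return (max_attempts - 1) * max_backoff_s


total_backoff = worst_case_backoff(MAX_ATTEMPTS, MAX_BACKOFF_S)
headroom = REQUEST_TIMEOUT_S - total_backoff
print(total_backoff)  # 20.0
print(headroom)       # 40.0
```

Under these assumptions, at most 20s of the 60s deadline goes to backoff, leaving 40s for the attempts themselves.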

@alexluck-sift alexluck-sift force-pushed the al/python/chore/timeouts branch from 18f62e5 to a276406 on June 2, 2026 19:46
Contributor

github-actions Bot commented Jun 2, 2026

Python docs preview: https://sift-stack.github.io/sift/python/pr-609/

Deployed from eeca98a. The link may take up to a minute to become live as GitHub Pages propagates.

Use dict.get for the NotRequired "timeout" key so pyright's
reportTypedDictNotRequiredAccess no longer fails static-checks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexluck-sift alexluck-sift merged commit 07ec3d9 into main Jun 3, 2026
23 checks passed
@alexluck-sift alexluck-sift deleted the al/python/chore/timeouts branch June 3, 2026 00:22
timeout = getattr(client, "sync_call_timeout", None)
future = asyncio.run_coroutine_threadsafe(coro, loop)
try:
    return future.result(timeout=timeout)
Contributor

It seems like this timeout cap (75s) wrapping the coroutine will abort long-running sync operations such as wait_until_complete / wait_and_download even while they're still polling, or when the user passes a larger timeout_sec arg.


3 participants