Skip to content

Implement Standalone Nexus Operations#1461

Open
VegetarianOrc wants to merge 34 commits into
mainfrom
amazzeo/sano
Open

Implement Standalone Nexus Operations#1461
VegetarianOrc wants to merge 34 commits into
mainfrom
amazzeo/sano

Conversation

@VegetarianOrc
Copy link
Copy Markdown
Contributor

@VegetarianOrc VegetarianOrc commented Apr 17, 2026

I recommend viewing this diff with the histogram diff algorithm to reduce noise and see the additions more clearly. You can view that by checking out the branch and running git diff --histogram origin/main

Add Standalone Nexus Operation Support

API Changes

  • Client.create_nexus_client() - Create a typed NexusClient from an endpoint & nexusrpc.Service
  • Client.list_nexus_operations() - List operations via visibility query
  • Client.count_nexus_operations() - Count operations via visibility query
  • Client.get_nexus_operation_handle() - Create a typed NexusOperationHandle to an existing operation
  • NexusClient.start_operation() - Start a Nexus Operation and return a NexusOperationHandle.
  • NexusClient.execute_operation() - Same as start_operation, but poll the handle for the final result.
  • NexusOperationHandle.result() / .describe() / .cancel() / .terminate() / property accessors - interact with an existing operation.
  • Supporting types like input types, enums, and errors added as appropriate.

Interceptor Support

The following methods have been added to OutboundInterceptor

  • start_nexus_operation()
  • describe_nexus_operation()
  • cancel_nexus_operation()
  • terminate_nexus_operation()
  • list_nexus_operations()
  • count_nexus_operations()

Testing

Added integration test suite tests/nexus/test_standalone_operations.py (858 lines) covering start, get result, describe, cancel, terminate, list, count, ID conflict/reuse policies, and interceptor integration

Misc Changes

  • Update ruff target version to py310 to match the minimum support Python version.

Copy link
Copy Markdown

@Evanthx Evanthx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good overall!

Comment thread temporalio/api/compute/v1/config_pb2.pyi
Comment thread temporalio/client.py Outdated
Comment thread temporalio/client.py
…n defaults for id_reuse_policy and id_conflict_policy. Update integration tests with better typing and new assertions for the newly interceptable get_nexus_operation_result
Comment thread temporalio/client.py
…ch start/execute operation. Fix hardcoded retryable=True in fallback error serialization. Add type tests and rename test/nexus/test_type_errors.py to ensure that the type test file properly executes nexus tests.
Copy link
Copy Markdown
Contributor

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a quick passthrough

Comment thread temporalio/client.py
Comment on lines +4825 to +4828
_metadata_decoded: bool = field(
kw_only=True, default=False, compare=False, repr=False
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we need a field for this. I know we did this for workflow description, but I think it's reasonable to just lazy decode on demand without mutating _static_details or _static_summary.

nbd either way, thought I'd mention it

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to remove if we feel like it's an over optimization

Comment thread temporalio/client.py
@VegetarianOrc VegetarianOrc marked this pull request as ready for review May 12, 2026 17:45
@VegetarianOrc VegetarianOrc requested a review from a team as a code owner May 12, 2026 17:45
Copy link
Copy Markdown

@Evanthx Evanthx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor comments

Comment thread temporalio/client.py Outdated
Comment thread temporalio/client.py
Comment thread temporalio/nexus/_operation_context.py Outdated
# Task completion should never be dropped in case of cancellation.
# The Rust future in core must complete for shutdown to happen without
# hanging.
async def _complete_task(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed for stand alone nexus operations or is this fixing a different bug?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a different bug, but implementing standalone nexus operations surfaced the race condition. Specifically starting an operation and then terminating it before the completion has been sent to Core. Previously, this wasn't caught as the cancellation propagation from a caller workflow was slow enough to not trigger the race.

Happy to move it to a separate PR to limit the scope if you'd like.

Copy link
Copy Markdown
Contributor

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Client impl and tests for SANO lgtm, though you may want to get additional approval from nexus folks and @tconley1428 - i wouldn't merge with only my approval

Comment thread tests/nexus/test_standalone_operations.py Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants