feature: add more object storage backends #346

Open
liujiwen-up wants to merge 2 commits into apache:main from liujiwen-up:object-storage-opendal-support

@liujiwen-up liujiwen-up commented May 29, 2026

Purpose

Linked issue: close #xxx

Add OpenDAL-backed support for more object storage backends in Paimon Rust, covering COS, Azure Data Lake Storage Gen2, OBS, and GCS.

Brief change log

Added storage feature flags for storage-cos, storage-azdls, storage-obs, and storage-gcs.
Added storage-all coverage for the new backends.
Implemented backend config parsing and OpenDAL operator construction for:

  • Tencent Cloud COS
  • Azure Data Lake Storage Gen2
  • Huawei Cloud OBS
  • Google Cloud Storage

Added scheme aliases such as cosn, gs, abfs, and abfss.
Added object storage FileIO path tests to verify scheme detection and relative path handling.
Extended GCS config alias support for Hadoop/OpenDAL-style option keys.
Updated docs to list and show usage examples for the new storage backends.

Tests

cargo check -p paimon
cargo check -p paimon --features storage-all
cargo test -p paimon --features storage-all io::object_storage_path_test --no-fail-fast
cargo test -p paimon --features storage-all io::storage_gcs --no-fail-fast
cargo test -p paimon --features storage-all io:: --no-fail-fast
cargo test -p paimon --features storage-all --no-fail-fast

API and Format

This change adds optional Cargo feature flags and storage backend support. It does not change the Paimon table format or existing storage format.

Documentation

Updated documentation to include the newly supported storage backends and basic configuration examples.


@QuakeWang QuakeWang left a comment

LGTM

@liujiwen-up please fix the CI check

Comment thread crates/paimon/src/io/storage_gcs.rs Outdated
Ok(Operator::new(builder)?.finish())
}

fn normalize_config(props: HashMap<String, String>) -> HashMap<String, String> {

nit: The config normalization and mirrored-key logic are duplicated across the backend modules. Maybe a small shared helper could be useful if we keep adding storage backends.
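
One possible shape for such a shared helper, as a minimal sketch only. The diff context does not show what `normalize_config` does internally, so this assumes it copies values from alias keys onto canonical keys while keeping both. The name `normalize_with_aliases` and the key names used with it are hypothetical and are not taken from the PR.

```rust
use std::collections::HashMap;

/// Hypothetical shared helper (not from the PR).
/// For each (alias, canonical) pair, mirror the alias value onto the canonical
/// key unless the canonical key is already set. Alias keys are left in place.
fn normalize_with_aliases(
    mut props: HashMap<String, String>,
    aliases: &[(&str, &str)],
) -> HashMap<String, String> {
    for (alias, canonical) in aliases {
        // An explicitly set canonical key always wins over its alias.
        if props.contains_key(*canonical) {
            continue;
        }
        if let Some(value) = props.get(*alias).cloned() {
            props.insert((*canonical).to_string(), value);
        }
    }
    props
}
```

Each backend module would then only declare its own alias table, for example `&[("fs.gs.bucket", "gcs.bucket")]` (illustrative keys), and call the helper instead of repeating the loop.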

@XiaoHongbo-Hope

Non-blocking: could we add a test for listing an ADLS filesystem root without a trailing slash, e.g. abfs://fs@account.dfs.core.windows.net? azdls_relative_path can return an empty relative path, and list_status reconstructs child paths with base_path + entry_path, so a root path without a trailing / may produce malformed returned paths.
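
To illustrate the concern, here is a minimal sketch of a join that tolerates a root path with or without a trailing slash. The actual `list_status` code is not shown in this thread, so `join_child_path` is a hypothetical helper and plain string concatenation is only assumed to be what `base_path + entry_path` does.

```rust
/// Hypothetical helper (not from the PR): join a base URI and an entry path
/// so that exactly one '/' separates them.
fn join_child_path(base_path: &str, entry_path: &str) -> String {
    let base = base_path.trim_end_matches('/');
    let entry = entry_path.trim_start_matches('/');
    format!("{base}/{entry}")
}

/// The naive concatenation that the comment above is worried about.
fn naive_join(base_path: &str, entry_path: &str) -> String {
    format!("{base_path}{entry_path}")
}
```

With the root `abfs://fs@account.dfs.core.windows.net` and the entry `dir/file.parquet`, the naive version fuses the host and the first path segment, while the normalizing version yields the same result whether or not the root ends in `/`. A test along these lines could assert both cases.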

feature = "storage-gcs",
feature = "storage-obs"
))]
fn bucket_and_relative_path<'a>(

nit: This generalized bucket_and_relative_path(storage_name, allowed_schemes) duplicates the logic of the existing OSS/S3-specific version above. Consider unifying into one method — OSS/S3 callers just pass their own allowed_schemes slice.
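
A rough sketch of what a single unified function could look like. Only the name `bucket_and_relative_path` and the `storage_name` / `allowed_schemes` parameters come from the comment above. The `path` parameter, the return type, the `String` error, and the case-insensitive scheme match are assumptions made for illustration, and the real code presumably uses the crate's own error type.

```rust
/// Sketch under stated assumptions: split `scheme://bucket/relative` into
/// (bucket, relative), accepting only schemes listed in `allowed_schemes`.
fn bucket_and_relative_path<'a>(
    path: &'a str,
    storage_name: &str,
    allowed_schemes: &[&str],
) -> Result<(&'a str, &'a str), String> {
    let (scheme, rest) = path
        .split_once("://")
        .ok_or_else(|| format!("Invalid {storage_name} path: {path}"))?;
    // Assumption: schemes are compared case-insensitively.
    if !allowed_schemes.iter().any(|s| s.eq_ignore_ascii_case(scheme)) {
        return Err(format!(
            "Unsupported scheme '{scheme}' for {storage_name}: {path}"
        ));
    }
    // A path with no '/' after the bucket has an empty relative part.
    let (bucket, relative) = match rest.split_once('/') {
        Some((bucket, relative)) => (bucket, relative),
        None => (rest, ""),
    };
    if bucket.is_empty() {
        return Err(format!("Missing bucket in {storage_name} path: {path}"));
    }
    Ok((bucket, relative))
}
```

Under this shape, each backend differs only in the slice it passes, for example `&["gs"]` for GCS or `&["cos", "cosn"]` for COS (the exact scheme lists are assumptions here), so the OSS/S3 variant would no longer need a separate copy.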

3 participants