[python] Fix Daft Paimon write column alignment#7947

Open
QuakeWang wants to merge 1 commit into apache:master from QuakeWang:fix/daft-paimon-column-order

Conversation

@QuakeWang
Contributor

What

  • Align Daft write batches to the Paimon target schema by column name before casting.
  • Reject missing, extra, or duplicate fields instead of silently writing wrong data.
  • Add regression coverage with native pypaimon reader verification.
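
The name-based alignment and validation listed above might look roughly like the following. This is a minimal sketch, not the code in this PR: `align_by_name` is a hypothetical helper that works on plain lists of column names and returns the positions to pick from the incoming batch so that the columns land in target-schema order.

```python
from collections import Counter


def align_by_name(batch_names, target_names):
    """Return indices into batch_names that reorder them to match target_names.

    Hypothetical helper for illustration; raises ValueError on duplicate,
    missing, or extra fields instead of guessing.
    """
    # Duplicate names make a name-based lookup ambiguous, so reject them first.
    duplicates = sorted(n for n, c in Counter(batch_names).items() if c > 1)
    if duplicates:
        raise ValueError(f"duplicate fields in batch: {duplicates}")

    # The two name sets must match exactly: nothing missing, nothing extra.
    batch_set, target_set = set(batch_names), set(target_names)
    missing = sorted(target_set - batch_set)
    extra = sorted(batch_set - target_set)
    if missing or extra:
        raise ValueError(f"schema mismatch: missing={missing}, extra={extra}")

    # Map each name to its position in the batch, then walk the target order.
    position = {name: i for i, name in enumerate(batch_names)}
    return [position[name] for name in target_names]


order = align_by_name(["name", "id", "ts"], ["id", "name", "ts"])
print(order)  # [1, 0, 2]
```

With a PyArrow table, an order like this could be applied with `table.select(order)` before `table.cast(target_schema)`; whether the sink is implemented that way is an assumption here.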

Tests

  • DAFT_RUNNER=native /tmp/paimon-daft-pr6951-venv/bin/pytest -q paimon-python/pypaimon/tests/daft/daft_sink_test.py

Related: Eventual-Inc/Daft#6951

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
@QuakeWang
Contributor Author

PTAL @chenghuichen

@chenghuichen
Contributor

+1.
This PR was migrated from Daft, and the content is solid; I've already reviewed it there. Daft is order-agnostic on DataFrame columns, while pypaimon requires a strict schema match, including column order, so aligning by name in the sink is the necessary fix.
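
To illustrate why column order matters here, the toy example below contrasts positional and name-based pairing. It uses plain dicts and hypothetical field names (`id`, `score`), not the Daft or pypaimon APIs. When both columns share a type, positional pairing raises no error and simply attaches the values to the wrong fields.

```python
# Order the table schema expects (hypothetical field names).
target_fields = ["id", "score"]

# A row whose columns arrive in a different order than the target schema.
row_from_dataframe = {"score": 7, "id": 1}

# Positional pairing: each value is attached to whatever name sits at that index.
by_position = dict(zip(target_fields, row_from_dataframe.values()))

# Name-based pairing: each target field looks up its own value.
by_name = {field: row_from_dataframe[field] for field in target_fields}

print(by_position)  # {'id': 7, 'score': 1}
print(by_name)      # {'id': 1, 'score': 7}
```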

2 participants