[format] Add row-oriented file format with O(1) row-number lookups #7934
Introduce a new .row file format optimized for fast point lookups by row number, designed for deletion vector applications and changelog materialization. The format stores data in ZSTD-compressed blocks with a block index enabling binary search by row number.

Key components:
- RowFormatWriter/Reader: block-level write and read with projection and selection (RoaringBitmap) pushdown
- BlockPrefetcher: concurrent IO with range coalescing (merges adjacent blocks within 256KB gap, up to 2MB per range) and prefetch sliding window
- InputStreamPool: lazy stream pool that opens streams on demand for concurrent reads
- RowBlockWriter/Reader: compact row serialization supporting all Paimon types including nested ARRAY, MAP, ROW, and VARIANT
- RowBlockIndex: delta+zigzag+varint encoded block metadata
- Documentation: rowformat.md specification and fileformat.md updates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
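To make the range coalescing rule in the BlockPrefetcher bullet concrete, here is a minimal sketch in Java. It is not the PR's implementation: the class name `RangeCoalescer`, the `Range` type, and the choice to treat both limits as inclusive are assumptions for illustration. Only the 256KB gap and 2MB cap come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from the PR.
final class RangeCoalescer {
    static final long MAX_GAP = 256L * 1024;        // 256KB gap limit from the description
    static final long MAX_RANGE = 2L * 1024 * 1024; // 2MB cap per merged range

    /** Half-open byte range [start, end). */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Merges neighbouring blocks; input must be sorted by start and non-overlapping. */
    static List<Range> coalesce(List<Range> blocks) {
        List<Range> out = new ArrayList<>();
        Range cur = null;
        for (Range b : blocks) {
            boolean closeEnough = cur != null && b.start - cur.end <= MAX_GAP;
            boolean smallEnough = cur != null && b.end - cur.start <= MAX_RANGE;
            if (closeEnough && smallEnough) {
                // Extend the current range; the gap bytes are read and discarded.
                cur = new Range(cur.start, b.end);
            } else {
                if (cur != null) {
                    out.add(cur);
                }
                cur = b;
            }
        }
        if (cur != null) {
            out.add(cur);
        }
        return out;
    }
}
```

The trade-off is the usual one for object stores: reading a small gap is typically cheaper than issuing another request, while the size cap keeps a single read from blocking the prefetch window for too long.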
Thanks for the PR. I found one correctness blocker and a couple of contract/resource issues in the current head (
The existing new tests pass for me:
    mvn -pl paimon-format -DskipITs -Dcheckstyle.skip -Drat.skip=true -Dspotless.check.skip=true -Dtest='org.apache.paimon.format.row.*Test' test
But the two temporary contract tests above expose the nested projection and validation issues.
I re-reviewed the latest head (
A minimal reproducer is:
RowType elementType = new RowType(Arrays.asList(
new DataField(10, "a", new IntType()),
new DataField(11, "b", new IntType())));
RowType dataSchema = new RowType(Collections.singletonList(
new DataField(0, "arr", new ArrayType(elementType))));
RowType projectedElementType = new RowType(Collections.singletonList(
new DataField(11, "b", new IntType())));
RowType projectedSchema = new RowType(Collections.singletonList(
new DataField(0, "arr", new ArrayType(projectedElementType))));
// Write: arr = [ROW<a=1, b=100>]
// Read with projectedSchema:
InternalArray array = row.getArray(0);
assertThat(array.getRow(0, 1).getInt(0)).isEqualTo(100);
This currently returns Could you either make nested projection preserve the same semantics through
Thanks for the update. The direct
RowType elementType = new RowType(Arrays.asList(
new DataField(10, "a", new IntType()),
new DataField(11, "b", new IntType())));
RowType dataSchema = new RowType(Collections.singletonList(
new DataField(0, "arr", new ArrayType(new ArrayType(elementType)))));
RowType projectedElementType = new RowType(Collections.singletonList(
new DataField(11, "b", new IntType())));
RowType projectedSchema = new RowType(Collections.singletonList(
new DataField(0, "arr", new ArrayType(new ArrayType(projectedElementType)))));
// arr = [[ROW<a=1, b=100>]]
assertThat(row.getArray(0).getArray(0).getRow(0, 1).getInt(0)).isEqualTo(100);
This currently returns
Could you make the collection projection logic recursive for nested
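For illustration, a recursive projection over nested collections could look roughly like the sketch below. It deliberately does not use Paimon's classes: `Type`, `ArrayT`, `RowT`, `Field`, and `Mapper` are hypothetical stand-ins, rows are modelled as `Object[]` and arrays as `List`, and only ARRAY and ROW are handled. The point is that the mapper for a collection is built from the mapper of its element type, so any nesting depth is covered by the same code path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types; this is not Paimon's API.
final class NestedProjectionSketch {
    interface Type {}

    static final class IntT implements Type {}

    static final class ArrayT implements Type {
        final Type element;
        ArrayT(Type element) { this.element = element; }
    }

    static final class Field {
        final int id;
        final Type type;
        Field(int id, Type type) { this.id = id; this.type = type; }
    }

    static final class RowT implements Type {
        final List<Field> fields;
        RowT(List<Field> fields) { this.fields = fields; }
    }

    /** Converts a stored value into its projected shape. */
    interface Mapper { Object apply(Object value); }

    /** Builds a mapper by walking the data and projected types in parallel. */
    static Mapper mapper(Type data, Type projected) {
        if (data instanceof ArrayT) {
            // Recurse on the element type, so ARRAY<ARRAY<ROW>> needs no special case.
            Mapper element = mapper(((ArrayT) data).element, ((ArrayT) projected).element);
            return v -> {
                if (v == null) return null;
                List<Object> out = new ArrayList<>();
                for (Object e : (List<?>) v) out.add(element.apply(e));
                return out;
            };
        }
        if (data instanceof RowT) {
            RowT d = (RowT) data;
            RowT p = (RowT) projected;
            int n = p.fields.size();
            int[] src = new int[n];
            Mapper[] child = new Mapper[n];
            for (int i = 0; i < n; i++) {
                // Match projected fields to stored fields by field id, not by position.
                src[i] = indexOfId(d, p.fields.get(i).id);
                child[i] = mapper(d.fields.get(src[i]).type, p.fields.get(i).type);
            }
            return v -> {
                if (v == null) return null;
                Object[] in = (Object[]) v;
                Object[] out = new Object[n];
                for (int i = 0; i < n; i++) out[i] = child[i].apply(in[src[i]]);
                return out;
            };
        }
        return v -> v; // primitive types pass through unchanged
    }

    static int indexOfId(RowT row, int id) {
        for (int i = 0; i < row.fields.size(); i++) {
            if (row.fields.get(i).id == id) return i;
        }
        throw new IllegalArgumentException("field id not in data schema: " + id);
    }
}
```

With the schema from the reproducer above (field ids 10 and 11, projecting only id 11), the mapper built for the doubly nested array yields rows whose position 0 holds the value of `b`.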
Thanks for the update. I rechecked the latest head ( |
leaves12138 left a comment
I rechecked the latest head (06276b0a79bf). The existing row-format targeted tests pass, and the additional nested collection / MULTISET projection edge cases also pass. The previous correctness concerns are addressed.
Introduce a new .row file format optimized for fast point lookups by row number, designed for data evolution tables. The format stores data in ZSTD-compressed blocks with a block index enabling binary search by row number.
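As a rough illustration of the block lookup, the following Java sketch finds the block that holds a given row number. It assumes the decoded index exposes the first row number of each block in ascending order; the class name `BlockLookup` and the `long[]` layout are hypothetical and not taken from the PR.

```java
// Hypothetical sketch; the real index layout in the PR may differ.
final class BlockLookup {
    /**
     * Returns the index of the last block whose first row number is <= rowNumber,
     * or -1 if rowNumber precedes every block or the index is empty.
     * The caller still has to check rowNumber against the file's total row count.
     */
    static int findBlock(long[] firstRowOfBlock, long rowNumber) {
        int lo = 0;
        int hi = firstRowOfBlock.length - 1;
        int answer = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1; // unsigned shift avoids int overflow
            if (firstRowOfBlock[mid] <= rowNumber) {
                answer = mid;   // candidate; a later block may also qualify
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return answer;
    }
}
```

Once the block is known, the row's position inside it is `rowNumber - firstRowOfBlock[block]`, so only that one block needs to be decompressed.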
Key components: