[common] Introduce RTreeFileIndex for query optimization by xuzifu666 · Pull Request #7919 · apache/paimon

xuzifu666 · 2026-05-20T12:23:57Z

Purpose

Paimon currently does not support R-tree indexes. See the paper at https://postgis.net/docs/support/rtree.pdf for a description of how to implement this index.
The following are the relevant benchmark test results:

Hardware Configuration

CPU: MacBook Pro (M-series processor)
Memory: 16GB LPDDR5
Operating System: macOS 14.x

Software Configuration

Java Version: OpenJDK 11+
Build: Maven 3.8.x

Test Parameters

Warmup Iterations: 3
Benchmark Iterations: 10
Query Count: 1000-10000 queries
Random Seed: 42 (for reproducibility)

Query Performance (10,000 queries)

R-Tree:       0.47 µs per query
Linear Scan:  464.41 µs per query
Speedup:      985.58×
Average results per query: 20 records

Analysis by Dataset Size

Dataset Size	R-Tree (µs)	Linear Scan (µs)	Speedup	Query Selectivity
1K	0.20	14.90	75×	2%
10K	0.12	50.24	403×	2%
100K	0.35	492.44	1407×	2%
1M	0.39	495.25	1279×	2%

Query Type	Area Size	R-Tree (µs)	Linear Scan (µs)	Speedup	Selectivity
Small	500×500	0.22	366.27	1684×	0.02%
Medium	1500×1500	0.21	400.52	1899×	0.02%
Large	5000×5000	0.28	556.48	1997×	0.03%

Point Query vs Range Query
Search Performance on 100K Dataset:

Point queries (1000):      303.76 µs/query (with warmup optimization)
Range queries (100):       357.04 µs/query
Linear scan (100 scans):   65170.62 µs/scan

Improvement vs Linear Scan:
Point query: 214× speedup
Range query: 182× speedup

Sequential Data Access Pattern

1M grid data (1000×1000 points)

Average query time: 1.54 µs
Results returned: 30 records

Performance Characteristics:
- First query: 8.38 µs (cache warmup)
- Subsequent queries: 0.67-0.88 µs (steady state)

Tests

Run comparison benchmark
org.apache.paimon.fileindex.rtree.RTreeVsLinearScanBenchmark

Run detailed benchmark
org.apache.paimon.fileindex.rtree.RTreeBenchmark

Here is a schematic diagram of the implementation:

R-Tree Basic Structure

                        ┌─────────────────────────────────────┐
                        │         Root Node                   │
                        │  (Internal, stores child bboxes)    │
                        │  bbox: [0,0] to [100,100]           │
                        └──────────┬──────────────┬────────────┘
                                   │              │
                    ┌──────────────┘              └─────────────────┐
                    │                                              │
        ┌───────────▼──────────────┐              ┌────────────────▼──────┐
        │  Internal Node 1         │              │  Internal Node 2      │
        │  bbox: [0,0]-[50,100]    │              │  bbox: [50,0]-[100,100]
        └───────┬──────────────────┘              └────────────┬──────────┘
                │                                             │
        ┌───────┴────────┐                           ┌────────┴────────┐
        │                │                           │                 │
   ┌────▼──────┐  ┌─────▼──────┐           ┌───────▼────┐  ┌────────▼─┐
   │ Leaf 1    │  │ Leaf 2     │           │ Leaf 3     │  │ Leaf 4   │
   │ bbox:     │  │ bbox:      │           │ bbox:      │  │ bbox:    │
   │ [0,0]-    │  │ [20,30]-   │           │ [50,50]-   │  │ [75,75]- │
   │ [20,30]   │  │ [40,50]    │           │ [70,70]    │  │ [100,100]
   │ rowId: 1  │  │ rowId: 2   │           │ rowId: 3   │  │ rowId: 4 │
   └───────────┘  └────────────┘           └────────────┘  └──────────┘

Node Split Process
2.1 Full Node Before Split

┌──────────────────────────────────────────────┐
│           Full Leaf Node (maxEntries=4)      │
│  [1-entry]  [2-entry]  [3-entry]  [4-entry] │
│   point(5,5) point(8,3) point(10,7) point(12,9)
│                                              │
│  Need to insert 5th point: (15, 12) ✗ Over  │
└──────────────────────────────────────────────┘
                    │
                    │ Trigger split
                    ▼

2.2 Split Process

                 Before split (5 entries)
    ┌───────────────────────────────────────┐
    │ Node A: [①②③④⑤]                     │
    │ bbox: [5,3] to [15,12]                 │
    └─────────────────┬─────────────────────┘
                      │
            ┌─────────┴──────────┐
            │ Linear split:      │
            │ Keep first 2.5 ≈ 2 │
            │ New node gets 3    │
            ▼                    ▼

    ┌──────────────────┐        ┌─────────────────┐
    │ Node A (Leaf)    │        │ Node B (Leaf)   │
    │ [①②]            │        │ [③④⑤]          │
    │ bbox: [5,3]-     │        │ bbox: [10,7]-   │
    │       [8,5]      │        │       [15,12]   │
    └────────┬─────────┘        └────────┬────────┘
             │                          │
             └──────────┬───────────────┘
                        │
                        ▼
             ┌──────────────────────┐
             │  Parent Node Updated │
             │  Add Node B ref      │
             │  [Node A] [Node B]   │
             └──────────────────────┘

2.3 Leaf Node Content Details

Before split:
┌─────────────────────────────────────────────────┐
│          RTreeNode (leaf=true)                  │
│                                                  │
│  leafEntries: [                                  │
│    LeafEntry(rowId=1, bbox=[5,5]-[5,5]),        │
│    LeafEntry(rowId=2, bbox=[8,3]-[8,3]),        │
│    LeafEntry(rowId=3, bbox=[10,7]-[10,7]),      │
│    LeafEntry(rowId=4, bbox=[12,9]-[12,9]),      │
│    LeafEntry(rowId=5, bbox=[15,12]-[15,12])     │
│  ]                                               │
│                                                  │
│  boundingBox: [5,3] to [15,12]                   │
│  (automatically expanded from all entries)       │
└─────────────────────────────────────────────────┘

After split:
┌────────────────────────────────┐  ┌──────────────────────────┐
│   Node A (leaf=true)           │  │   Node B (leaf=true)     │
│                                │  │                          │
│ leafEntries: [                 │  │ leafEntries: [           │
│   LeafEntry(1, [5,5]),         │  │   LeafEntry(3, [10,7]),  │
│   LeafEntry(2, [8,3])          │  │   LeafEntry(4, [12,9]),  │
│ ]                              │  │   LeafEntry(5, [15,12])  │
│                                │  │ ]                        │
│ bbox: [5,3]-[8,5]              │  │ bbox: [10,7]-[15,12]     │
└────────────────────────────────┘  └──────────────────────────┘
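The midpoint split drawn above can be sketched in a few lines. This is a minimal illustration, not the PR's code: the class and method names are hypothetical, entries are plain 2-D points, and a box is a `double[]` of `{minX, minY, maxX, maxY}`. It splits at `entries.size() / 2`, as the reviewed revision did, and recomputes each half's bounding box from the entries it actually holds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: midpoint split of 2-D points plus bbox recomputation. */
final class MidpointSplitSketch {

    /** Returns {minX, minY, maxX, maxY} covering all given points. */
    static double[] boundingBox(List<double[]> points) {
        double[] box = {
            Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY,
            Double.NEGATIVE_INFINITY, Double.NEGATIVE_INFINITY
        };
        for (double[] p : points) {
            box[0] = Math.min(box[0], p[0]);
            box[1] = Math.min(box[1], p[1]);
            box[2] = Math.max(box[2], p[0]);
            box[3] = Math.max(box[3], p[1]);
        }
        return box;
    }

    /** Splits at size/2: the first half stays in place, the rest moves to a new node. */
    static List<List<double[]>> split(List<double[]> entries) {
        int mid = entries.size() / 2;
        List<List<double[]>> halves = new ArrayList<>();
        halves.add(new ArrayList<>(entries.subList(0, mid)));
        halves.add(new ArrayList<>(entries.subList(mid, entries.size())));
        return halves;
    }

    public static void main(String[] args) {
        List<double[]> entries = Arrays.asList(
                new double[] {5, 5}, new double[] {8, 3}, new double[] {10, 7},
                new double[] {12, 9}, new double[] {15, 12});
        for (List<double[]> half : split(entries)) {
            System.out.println(half.size() + " entries, bbox "
                    + Arrays.toString(boundingBox(half)));
        }
    }
}
```

Because the split position depends only on insertion order, the two resulting boxes can overlap heavily for unsorted input.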

Query Process
3.1 Point Query Flow

             Input: Point (60, 20)
                    │
                    ▼
        ┌─────────────────────────┐
        │ Check Root bbox         │
        │ Does Point(60,20) inter-│
        │ sect [0,0]-[100,100]?   │
        │       Yes ✓             │
        └────────┬────────────────┘
                 │
        ┌────────▼────────────────┐
        │ Check child bbox        │
        │ Point in [0,0]-[50,100]?│
        │       No ✗              │
        │ Point in [50,0]-        │
        │ [100,100]?              │
        │       Yes ✓             │
        └────────┬────────────────┘
                 │
        ┌────────▼──────────────────────┐
        │ Recursively check Internal    │
        │ Node 2                        │
        │ Point in [50,50]-[70,70]?     │
        │       No ✗                    │
        │ Point in [75,75]-[100,100]?   │
        │       No ✗                    │
        └────────┬──────────────────────┘
                 │
        ┌────────▼──────────────────┐
        │ Leaf level reached        │
        │ No leaf bbox contains     │
        │ Point(60,20)              │
        └───────────────────────────┘
                 │
                 ▼
            ┌──────────┐
            │ Empty    │
            │ result   │
            └──────────┘

3.2 Range Query Flow

Input: BoundingBox [10,5] to [50,60]
           │
           ▼
   ┌───────────────────────────────┐
   │ Check Root bbox intersection  │
   │ [10,5]-[50,60] intersects     │
   │ [0,0]-[100,100]?              │
   │      Yes ✓ Continue           │
   └───────┬─────────────────────┬─┘
           │                     │
   ┌───────▼──────────┐  ┌──────▼──────────┐
   │ Subtree1 inter?  │  │ Subtree2 inter? │
   │ [0,0]-[50,100]?  │  │ [50,0]-[100,100]
   │ Yes ✓ Recurse    │  │ Yes ✓ Recurse  │
   └───────┬──────────┘  └──────┬──────────┘
           │                    │
   ┌───────▼──────┐    ┌────────▼──────┐
   │ Check Leaf   │    │ Check Leaf    │
   │ 1-2 entries  │    │ 3-4 entries   │
   │              │    │               │
   └───────┬──────┘    └────────┬──────┘
           │                    │
   ┌───────▼─────────────────┬──▼──────┐
   │ Return all intersecting │ rowIds  │
   │ [rowId1, rowId3, ...]   │         │
   └──────────────────────────────────┘

Complete Insert Flow

Insert operation: insert(point=[25, 35], rowId=10)

         ┌──────────────────────────────┐
         │ 1. Choose best path          │
         │ (minimum expansion area)     │
         └──────────────┬───────────────┘
                        │
         ┌──────────────▼───────────────┐
         │ 2. Recursively descend       │
         │ from Root to leaf            │
         │ Calculate expansion cost     │
         └──────────────┬───────────────┘
                        │
         ┌──────────────▼───────────────┐
         │ 3. Reach leaf node          │
         │ Create LeafEntry            │
         └──────────────┬───────────────┘
                        │
         ┌──────────────▼───────────────┐
         │ 4. Add Entry                │
         │ node.addLeafEntry(entry)    │
         │ Update bbox                 │
         └──────────────┬───────────────┘
                        │
         ┌──────────────▼───────────────┐
         │ 5. Check capacity           │
         │ if (node.canSplit())        │
         │   splitNode()               │
         └──────────────┬───────────────┘
                        │
         ┌──────────────▼───────────────┐
         │ 6. If split:                │
         │ Parent adds new node        │
         │ Possible cascading split    │
         └──────────────┬───────────────┘
                        │
         ┌──────────────▼───────────────┐
         │ 7. Backtrack update parent  │
         │ bboxes until Root           │
         └──────────────┬───────────────┘
                        │
                        ▼
                   ✓ Insert complete
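Steps 1 and 2 of the flow above pick, at each level, the child whose bounding box needs the least expansion to cover the new point. A minimal sketch of that choice follows, assuming 2-D boxes stored as `{minX, minY, maxX, maxY}`. The class and method names are hypothetical, and the tie-break on smaller area is an assumption borrowed from the classic R-tree ChooseLeaf rule rather than something the PR states.

```java
/** Illustrative only: choose the child whose bbox grows least when a point is added. */
final class ChooseSubtreeSketch {

    /** Area of a box given as {minX, minY, maxX, maxY}. */
    static double area(double[] b) {
        return (b[2] - b[0]) * (b[3] - b[1]);
    }

    /** Extra area the box needs in order to also cover the point (x, y). */
    static double enlargement(double[] box, double x, double y) {
        double[] grown = {
            Math.min(box[0], x), Math.min(box[1], y),
            Math.max(box[2], x), Math.max(box[3], y)
        };
        return area(grown) - area(box);
    }

    /** Index of the child needing the least enlargement; ties go to the smaller box. */
    static int chooseChild(double[][] childBoxes, double x, double y) {
        int best = 0;
        for (int i = 1; i < childBoxes.length; i++) {
            double cost = enlargement(childBoxes[i], x, y);
            double bestCost = enlargement(childBoxes[best], x, y);
            boolean smallerOnTie =
                    cost == bestCost && area(childBoxes[i]) < area(childBoxes[best]);
            if (cost < bestCost || smallerOnTie) {
                best = i;
            }
        }
        return best;
    }
}
```

Applying `chooseChild` repeatedly from the root down yields the leaf that receives the new entry; the split check and the bbox updates on the way back up are separate steps.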

Data Serialization Flow

In-memory RTree structure
       │
       ▼
┌──────────────────────────────────────────┐
│ Serialization format:                    │
│                                          │
│ ┌─ Metadata                             │
│ │ ├─ dimensions: int (2)                │
│ │ ├─ maxEntries: int (32)               │
│ │ └─ treeSize: int (1000)               │
│ │                                       │
│ ├─ Recursively serialize Node:          │
│ │ ├─ isLeaf: boolean (true/false)      │
│ │ ├─ entryCount: int (N)               │
│ │ ├─ boundingBox: double[4]            │
│ │ │  (min_x, min_y, max_x, max_y)      │
│ │ │                                    │
│ │ └─ If Leaf:                          │
│ │    ├─ rowId1: int + bbox             │
│ │    ├─ rowId2: int + bbox             │
│ │    └─ ...                            │
│ │                                       │
│ │ If Internal:                          │
│ │    ├─ Child1 Node (recursive)        │
│ │    ├─ Child2 Node (recursive)        │
│ │    └─ ...                            │
│ │                                       │
│ └─ All nodes in DFS order              │
│                                          │
│ Result: byte[] (binary data)            │
└──────────────────────────────────────────┘
       │
       ▼
   File storage
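The DFS layout above can be sketched as a write/read pair. This is a simplified illustration under stated assumptions: the `Node` class and all method names are hypothetical, the metadata header is omitted, and boxes are fixed at two dimensions. The point it shows is that the reader takes `isLeaf` from the stream for every node it creates, not only for the root.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: DFS serialization that round-trips the leaf flag of every node. */
final class RTreeSerdeSketch {

    static final class Node {
        final boolean leaf;
        final double[] bbox; // {minX, minY, maxX, maxY}
        final List<Node> children = new ArrayList<>();
        final List<Integer> rowIds = new ArrayList<>();
        final List<double[]> entryBoxes = new ArrayList<>();

        Node(boolean leaf, double[] bbox) {
            this.leaf = leaf;
            this.bbox = bbox;
        }
    }

    static void writeBox(DataOutputStream out, double[] b) throws IOException {
        for (int i = 0; i < 4; i++) {
            out.writeDouble(b[i]);
        }
    }

    static double[] readBox(DataInputStream in) throws IOException {
        double[] b = new double[4];
        for (int i = 0; i < 4; i++) {
            b[i] = in.readDouble();
        }
        return b;
    }

    /** Writes isLeaf, entry count, bbox, then either leaf entries or child nodes. */
    static void write(DataOutputStream out, Node n) throws IOException {
        out.writeBoolean(n.leaf);
        out.writeInt(n.leaf ? n.rowIds.size() : n.children.size());
        writeBox(out, n.bbox);
        if (n.leaf) {
            for (int i = 0; i < n.rowIds.size(); i++) {
                out.writeInt(n.rowIds.get(i));
                writeBox(out, n.entryBoxes.get(i));
            }
        } else {
            for (Node child : n.children) {
                write(out, child);
            }
        }
    }

    /** Reads one node; the leaf flag comes from the stream at every recursion level. */
    static Node read(DataInputStream in) throws IOException {
        boolean leaf = in.readBoolean();
        int count = in.readInt();
        Node n = new Node(leaf, readBox(in));
        for (int i = 0; i < count; i++) {
            if (leaf) {
                n.rowIds.add(in.readInt());
                n.entryBoxes.add(readBox(in));
            } else {
                n.children.add(read(in));
            }
        }
        return n;
    }

    /** Serializes to a byte[] and reads it back, wrapping IOException for convenience. */
    static Node roundTrip(Node root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                write(out, root);
            }
            return read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```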

Class Relationship Diagram

┌─────────────────────────────────────────────────────┐
│                     RTree                          │
│  - root: RTreeNode                                 │
│  - dimensions: int                                 │
│  - maxEntries: int                                 │
│                                                     │
│  + insert(point[], rowId)                          │
│  + search(BoundingBox): List<rowId>                │
│  + search(point[]): List<rowId>                    │
└─────────────────────────────────────────────────────┘
                        │
                        │ contains
                        │ (tree structure)
                        ▼
┌──────────────────────────────────────────────────────────┐
│                    RTreeNode                           │
│  - boundingBox: BoundingBox                            │
│  - leafEntries: List<LeafEntry> (if leaf=true)        │
│  - children: List<RTreeNode> (if leaf=false)          │
│  - parent: RTreeNode                                  │
│  - isLeaf: boolean                                    │
│  - maxEntries: int                                    │
│                                                        │
│  + isLeaf(): boolean                                  │
│  + addLeafEntry(LeafEntry)                           │
│  + addChild(RTreeNode)                               │
│  + canSplit(): boolean                               │
│  + getBoundingBox(): BoundingBox                      │
└──────────────────────────────────────────────────────────┘
         │                              │
         │ contains                     │ references
         │                              │ parent
         ▼                              ▼
    ┌──────────────────┐         ┌─────────────┐
    │   LeafEntry      │         │BoundingBox  │
    │                  │         │             │
    │ - rowId: int     │         │ - min[]     │
    │ - bbox:BBox      │         │ - max[]     │
    │                  │         │ - dimensions│
    │ + getRowId()     │         │             │
    │ + getBbox()      │         │ + expand()  │
    └──────────────────┘         │ + intersects
                                 │ + contains()
                                 └─────────────┘

Typical Query Examples
Example 1: Point Query

Query: search(Point[15, 35])

Tree structure:
           Root [0,0]-[100,100]
          /            \
      [0,0]-[50,100]   [50,0]-[100,100]
      /         \         /          \
   Leaf1      Leaf2    Leaf3      Leaf4
 [5,5]      [20,30]  [50,50]    [75,75]
 
Execution:
Step 1: Point[15,35] in Root bbox? YES → Continue
Step 2: Point in left subtree [0,0]-[50,100]? YES → Recurse
        Point in right subtree [50,0]-[100,100]? NO → Skip
Step 3: Check Leaf1 [5,5]-[5,5]: Point inside? NO
Step 4: Check Leaf2 [20,30]-[20,30]: Point inside? NO
Step 5: Result: Empty

Complexity: O(log 4) = O(1) for 4 leaves
           Actual operations: 5 bbox checks (root, 2 subtrees, 2 leaves)

Example 2: Range Query

Query: search(BoundingBox[10,10]-[60,60])

Execution:
Step 1: Range[10,10]-[60,60] intersects Root[0,0]-[100,100]? YES
Step 2: Intersects left subtree[0,0]-[50,100]? YES → Recurse
        Intersects right subtree[50,0]-[100,100]? YES → Recurse
Step 3: Left subtree:
        Leaf1[5,5] in range? NO
        Leaf2[20,30] in range? YES → Add to result
Step 4: Right subtree:
        Leaf3[50,50] in range? YES → Add to result
        Leaf4[75,75] in range? NO
Step 5: Result: [Leaf2_rowId, Leaf3_rowId]

Complexity: O(log N + K)
           where N=4, K=2 (result count)

Memory Layout

Java heap memory R-Tree structure:

┌─────────────────────────────────────────────────────┐
│                  RTree object                      │
│  ┌──────────────────────────────────────────┐      │
│  │ root: RTreeNode                          │      │
│  │ dimensions: 2                            │      │
│  │ maxEntries: 32                           │      │
│  └──────────────────────────────────────────┘      │
└────────────┬──────────────────────────────────────┘
             │ references
             ▼
    ┌────────────────────┐
    │   RTreeNode obj    │ (root)
    │ (internal node)    │
    │                    │
    │ children:          │
    │  [Node1 ref]       │
    │  [Node2 ref]       │
    │  [Node3 ref]       │
    │                    │
    │ leafEntries: []    │
    │ (empty)            │
    │                    │
    │ boundingBox:       │
    │  ├─ min: [0, 0]    │
    │  └─ max: [100,100] │
    │                    │
    │ parent: null       │
    └────┬─┬─┬───────────┘
         │ │ │
    ┌────┘ │ │
    │      │ └────────┐
    │      └──────┐   │
    ▼             ▼   ▼
  Node1         Node2 Node3
  (leaf)        (leaf)(leaf)
   │             │     │
   ▼             ▼     ▼
[LeafEntry]  [LE][LE] [LE][LE][LE]
rowId:1      rowId:2-3 rowId:4-6

JingsongLi · 2026-05-21T02:35:59Z

+        } else {
+            for (int i = 0; i < entryCount; i++) {
+                RTreeNode child = new RTreeNode(dimensions, maxEntries, false);
+                node.addChild(child);


RTreeNode.isLeaf is final. When the RTree constructor creates the root, isLeaf=true. During deserialization, however, if the root of the tree is an internal node, the deserialization code adds children to a node whose isLeaf is still true. When searching afterwards, node.isLeaf() returns true, so the search reads leafRowIds instead of recursing into the children. The deserialized tree cannot be queried correctly at all.

I changed private final boolean isLeaf → private boolean isLeaf, added setLeaf(boolean) method, then called setLeaf() during deserialization to correct the root node's leaf flag.

JingsongLi · 2026-05-21T02:36:45Z

+
+        RTreeNode newNode = new RTreeNode(dimensions, maxEntries, true);
+
+        int mid = entries.size() / 2;


This completely disregards spatial location. The correct R-Tree splitting should minimize the MBR overlap between two result nodes, otherwise a large number of nodes will "intersect" during queries, degenerating into linear scans.

I made these changes:

Implemented QuadraticSplit algorithm (2-step approach):
a. PickSeeds: Select two entries with maximum distance as initial seeds
b. Assign: Assign remaining entries to group with minimum bbox expansion

Implemented QuadraticSplitInternal: Same algorithm for internal node splitting

Modified RTree.java to use QuadraticSplit instead of linear splitting
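The two steps listed above can be sketched for 2-D points as follows. This is a hedged illustration, not the code in the PR: the class and helper names are hypothetical, it follows the comment's description of seeding by maximum distance, and it omits the minimum-fill guarantee that a production split normally enforces.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: seed with the two most distant points, then assign by least growth. */
final class QuadraticSplitSketch {

    /** Area of a box given as {minX, minY, maxX, maxY}. */
    static double area(double[] b) {
        return (b[2] - b[0]) * (b[3] - b[1]);
    }

    /** Returns a new box that covers both the old box and the point. */
    static double[] expand(double[] b, double[] p) {
        return new double[] {
            Math.min(b[0], p[0]), Math.min(b[1], p[1]),
            Math.max(b[2], p[0]), Math.max(b[3], p[1])
        };
    }

    /** Splits at least two points into two groups; group 0 holds seed 1, group 1 seed 2. */
    static List<List<double[]>> split(List<double[]> pts) {
        // PickSeeds: compare every pair and keep the most distant one (O(n^2)).
        int s1 = 0;
        int s2 = 1;
        double best = -1;
        for (int i = 0; i < pts.size(); i++) {
            for (int j = i + 1; j < pts.size(); j++) {
                double dx = pts.get(i)[0] - pts.get(j)[0];
                double dy = pts.get(i)[1] - pts.get(j)[1];
                double d = dx * dx + dy * dy;
                if (d > best) {
                    best = d;
                    s1 = i;
                    s2 = j;
                }
            }
        }
        List<double[]> g1 = new ArrayList<>();
        List<double[]> g2 = new ArrayList<>();
        g1.add(pts.get(s1));
        g2.add(pts.get(s2));
        double[] b1 = expand(new double[] {pts.get(s1)[0], pts.get(s1)[1],
                pts.get(s1)[0], pts.get(s1)[1]}, pts.get(s1));
        double[] b2 = expand(new double[] {pts.get(s2)[0], pts.get(s2)[1],
                pts.get(s2)[0], pts.get(s2)[1]}, pts.get(s2));
        // Assign: each remaining point joins the group whose box grows least.
        for (int k = 0; k < pts.size(); k++) {
            if (k == s1 || k == s2) {
                continue;
            }
            double[] e1 = expand(b1, pts.get(k));
            double[] e2 = expand(b2, pts.get(k));
            if (area(e1) - area(b1) <= area(e2) - area(b2)) {
                g1.add(pts.get(k));
                b1 = e1;
            } else {
                g2.add(pts.get(k));
                b2 = e2;
            }
        }
        List<List<double[]>> groups = new ArrayList<>();
        groups.add(g1);
        groups.add(g2);
        return groups;
    }
}
```

Unlike a midpoint split, the grouping here depends on where the entries lie rather than on the order in which they were inserted.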

JingsongLi · 2026-05-21T02:37:09Z

+
+        if (node.isLeaf()) {
+            for (Integer rowId : node.getLeafRowIds()) {
+                results.add(rowId);


Row IDs are added to the result directly, without checking whether entry.bbox intersects searchBox.

Changed to:

Enhanced RTree.java search() method:
a. Leaf nodes: Added per-entry checking entry.getBbox().intersects(searchBox)
b. Previously only checked node.getBoundingBox().intersects()

LeafEntry structure: Stores rowId + bbox for precision verification
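A minimal sketch of the per-entry check described above, assuming 2-D boxes stored as `{minX, minY, maxX, maxY}` with inclusive bounds. The `Node` class and the `leaf`/`internal` helpers are hypothetical stand-ins for RTreeNode and LeafEntry, not the PR's classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: range search that tests each leaf entry, not just the node bbox. */
final class RangeSearchSketch {

    static final class Node {
        final boolean leaf;
        final double[] bbox;
        final List<Node> children = new ArrayList<>();
        final List<Integer> rowIds = new ArrayList<>();
        final List<double[]> entryBoxes = new ArrayList<>();

        Node(boolean leaf, double[] bbox) {
            this.leaf = leaf;
            this.bbox = bbox;
        }
    }

    static double[] union(double[] a, double[] b) {
        return new double[] {
            Math.min(a[0], b[0]), Math.min(a[1], b[1]),
            Math.max(a[2], b[2]), Math.max(a[3], b[3])
        };
    }

    /** Builds a leaf from points, numbering row IDs from firstRowId upward. */
    static Node leaf(int firstRowId, double[]... points) {
        double[] box = {points[0][0], points[0][1], points[0][0], points[0][1]};
        for (double[] p : points) {
            box = union(box, new double[] {p[0], p[1], p[0], p[1]});
        }
        Node n = new Node(true, box);
        for (int i = 0; i < points.length; i++) {
            n.rowIds.add(firstRowId + i);
            n.entryBoxes.add(new double[] {points[i][0], points[i][1],
                    points[i][0], points[i][1]});
        }
        return n;
    }

    /** Builds an internal node whose bbox covers all children. */
    static Node internal(Node... kids) {
        double[] box = kids[0].bbox;
        for (Node k : kids) {
            box = union(box, k.bbox);
        }
        Node n = new Node(false, box);
        n.children.addAll(Arrays.asList(kids));
        return n;
    }

    static boolean intersects(double[] a, double[] b) {
        return a[0] <= b[2] && b[0] <= a[2] && a[1] <= b[3] && b[1] <= a[3];
    }

    /** Prunes by node bbox, then filters every leaf entry against the query box. */
    static void search(Node n, double[] query, List<Integer> out) {
        if (!intersects(n.bbox, query)) {
            return;
        }
        if (n.leaf) {
            for (int i = 0; i < n.rowIds.size(); i++) {
                if (intersects(n.entryBoxes.get(i), query)) {
                    out.add(n.rowIds.get(i));
                }
            }
        } else {
            for (Node child : n.children) {
                search(child, query, out);
            }
        }
    }
}
```

Without the inner `intersects` test, every entry of a leaf whose box merely overlaps the query would be reported, including entries that lie outside it.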

JingsongLi · 2026-05-21T02:37:36Z

+import org.apache.paimon.types.DataType;
+
+/** The implementation of R-Tree file index. */
+public class RTreeFileIndex implements FileIndexer {


As a file index, all data is already known at write time, and a tree built by inserting items one at a time is of much lower quality than one built with STR (Sort-Tile-Recursive) bulk loading.

My current alternative is:

Implemented STRBulkLoader (Sort-Tile-Recursive algorithm):
a. Sort entries by current dimension
b. Partition into vertical tiles (~maxEntries per tile)
c. Recursively process each tile with next dimension
d. Build tree bottom-up

Enhanced RTreeFileIndexWriter:
a. write() method: Collect entries into list
b. serializedBytes(): Use STRBulkLoader for batch tree construction
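The leaf-packing stage of STR can be sketched for 2-D points as below. This is an illustration under assumptions: the class and method names are hypothetical, and the slice count `ceil(sqrt(leafCount))` is the textbook STR choice, which may differ from the tile sizing used in this PR. Upper levels would be built by applying the same packing to the resulting leaf boxes, which the sketch leaves out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: Sort-Tile-Recursive packing of 2-D points into leaves. */
final class StrPackingSketch {

    /** Groups points into leaves of at most maxEntries, tiling by x and then by y. */
    static List<List<double[]>> packLeaves(List<double[]> points, int maxEntries) {
        // Work on a copy so the caller's list keeps its order.
        List<double[]> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingDouble((double[] p) -> p[0]));

        int leafCount = (sorted.size() + maxEntries - 1) / maxEntries;
        int slices = (int) Math.ceil(Math.sqrt(leafCount));
        int sliceSize = slices * maxEntries;

        List<List<double[]>> leaves = new ArrayList<>();
        for (int s = 0; s < sorted.size(); s += sliceSize) {
            // Each vertical slice is copied before sorting by y.
            int sliceEnd = Math.min(s + sliceSize, sorted.size());
            List<double[]> slice = new ArrayList<>(sorted.subList(s, sliceEnd));
            slice.sort(Comparator.comparingDouble((double[] p) -> p[1]));
            for (int i = 0; i < slice.size(); i += maxEntries) {
                int leafEnd = Math.min(i + maxEntries, slice.size());
                leaves.add(new ArrayList<>(slice.subList(i, leafEnd)));
            }
        }
        return leaves;
    }
}
```

Since all entries are known before packing, every leaf except possibly the last in each slice is filled to capacity, which is where bulk loading gains over one-by-one insertion.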

xuzifu666 · 2026-05-21T04:35:36Z

@JingsongLi Thank you for the review! Your comments are very helpful, and I will refine the PR to address these issues.

leaves12138 · 2026-05-21T14:27:09Z

Thanks for the update. I found two blocking correctness/contract issues in the current head (a4143bf63109).

1. RTreeFileIndexWriter does not accept the object type passed by the real file-index write path for ARRAY<DOUBLE> columns. DataFileIndexWriter.FileIndexMaintainer.write() calls fileIndexWriter.writeRecord(getter.getFieldOrNull(row)), so an array field is passed as Paimon's InternalArray implementation, e.g. GenericArray / binary array, not as double[] or java.util.List. However, RTreeFileIndexWriter.extractPoint() only handles List and double[]. A normal write through the file-index path can therefore fail with:
```
Cannot extract point from: org.apache.paimon.data.GenericArray
```
2. Production deserialization still loses the leaf flag for non-root nodes. RTreeFileIndexReader.deserializeNode() only calls node.setLeaf(isLeaf) for the root. Child nodes are always created with new RTreeNode(..., false), and the recursive call does not update their serialized isLeaf value. As a result, leaf children under an internal node remain isLeaf=false; after serializing a multi-level tree with RTreeFileIndexWriter and reading it back with RTreeFileIndex.createReader(), an equality query for an existing point can return SKIP.

I reproduced both with a small contract test:

writer.writeRecord(new GenericArray(new double[] {1.0, 2.0})) throws from RTreeFileIndexWriter.extractPoint().
Writing 200 points, serializing with RTreeFileIndexWriter, deserializing with the production RTreeFileIndexReader, and querying [50.0, 50.0] returns no match.

The existing tests pass because they mostly call writer.write(new double[] {...}) directly, bypassing FileIndexWriter.writeRecord(), and RTreeSerializationTest uses its own test deserializer that constructs nodes with the serialized isLeaf flag, bypassing the production reader bug.

Please fix these before merge:

Make the writer handle Paimon's InternalArray for ARRAY<DOUBLE> values, and check whether reader literals need the same treatment.
Preserve isLeaf for every node during production deserialization, not only the root.
Add regression coverage through the real FileIndexWriter.writeRecord() / RTreeFileIndex.createReader() path.

xuzifu666 · 2026-05-22T02:57:47Z

Thank you for the suggestions! I have made the relevant changes, PTAL @leaves12138

xuzifu666 · 2026-05-23T01:45:34Z

I believe all comments have now been addressed. Could you review it again when you have time? @JingsongLi

JingsongLi

Review

Design Issues

1. visitEqual semantics don't fit spatial queries.

The reader only overrides visitEqual. In practice, spatial queries are range queries (WHERE point WITHIN bbox), not equality checks. The predicate system's Equal function passes the literal through LeafPredicate.literals(), which gets serialized/deserialized via JSON — a double[] or BoundingBox object won't survive that roundtrip without custom serialization support in LeafPredicate.deserializeLiterals().

How does the user actually trigger this index from SQL? There's no predicate pushdown path that converts a spatial function (e.g., ST_Within, ST_Intersects) into a visitEqual call with a BoundingBox literal. This means the index is unreachable from the query engine as implemented.

2. Row ID uses int — overflows at 2B+ rows.

LeafEntry.rowId is int. A single Parquet file can exceed 2 billion rows. The serialization format also uses dos.writeInt(entry.getRowId()). This should be long to match Paimon's row addressing.
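A small sketch of the failure mode behind this point, using hypothetical helper names. Narrowing a row position above `Integer.MAX_VALUE` to `int` wraps silently to a negative number, whereas `Math.toIntExact` at least fails loudly; storing the value as `long` avoids the problem altogether.

```java
/** Illustrative only: what happens when a row position above 2^31 - 1 is kept in an int. */
final class RowIdWidthSketch {

    /** Plain narrowing cast: wraps around without any error. */
    static int asIntRowId(long rowPosition) {
        return (int) rowPosition;
    }

    /** Checked narrowing: throws ArithmeticException when the value does not fit. */
    static int checkedIntRowId(long rowPosition) {
        return Math.toIntExact(rowPosition);
    }

    public static void main(String[] args) {
        long position = 3_000_000_000L;
        System.out.println(asIntRowId(position)); // a negative row ID
        System.out.println(position);             // the long keeps the real value
    }
}
```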

3. The entire R-Tree is deserialized into memory at read time.

RTreeFileIndexReader.deserializeRTree() reconstructs the full tree in Java heap with all node objects. For a data file with millions of points, this creates millions of RTreeNode, LeafEntry, BoundingBox objects — severe GC pressure. Compare with the bitmap index which uses a compact RoaringBitmap32 that stays serialized until needed.

Consider a zero-copy or on-disk traversal approach: serialize in BFS/DFS order, seek through the byte stream during search without materializing all nodes.

4. RTreeIndexResult.remain() is semantically wrong.

public boolean remain() {
    return !getBitmap().isEmpty();
}

remain() should return true when the file might contain matching rows (i.e., cannot be skipped). A non-empty bitmap correctly means "some rows match, keep the file." However, what remain() does NOT do is filter at row level — the bitmap is constructed but never exposed to the reader pipeline for row-level filtering. The RTreeIndexResult has a rowCount field but it's unused for anything meaningful.

If the goal is row-level filtering (skip specific rows within a file), the result should implement a BitmapIndexResult-compatible interface. Otherwise the R-Tree only provides file-level skip/remain decisions, which limits its value to files with very few matching rows.

5. Writer uses one-by-one insertion, then bulk-loads — confusing dual path.

The writer collects all entries in a List<LeafEntry>, then calls buildRTreeWithSTRBulkLoader(). Good — STR bulk loading is efficient. But the RTree.insert() method (quadratic split path) is unused in production and only exists for tests/benchmarks. This dead code adds maintenance burden. Consider removing the dynamic insertion path or clearly marking it as test-only.

Implementation Issues

6. BoundingBox.fromPoint() passes the same array as both min and max.

public static BoundingBox fromPoint(double[] point) {
    return new BoundingBox(point, point);
}

The constructor clones both, so this is safe. But it creates two array copies for every point — significant allocation pressure during bulk insert. Consider a specialized point constructor that stores a single array.

7. QuadraticSplit is O(n²) and duplicates logic.

QuadraticSplit and QuadraticSplitInternal are nearly identical except for the entry type (LeafEntry vs RTreeNode). This should be a single generic class. More importantly, since the writer uses STR bulk loading, the quadratic split is only triggered during dynamic insertion (which is unused in production). Dead code.

8. No support for visitIn, visitLessThan, visitGreaterThan, visitBetween.

Spatial queries naturally map to range predicates. The reader doesn't override visitBetween or visitLessOrEqual/visitGreaterOrEqual, which means multi-dimensional range filters from the optimizer won't use this index.

9. STRBulkLoader.buildLevel() modifies the input list by sorting it.

entries.sort((a, b) -> Double.compare(...));

This mutates the caller's list. Should sort a copy (the new ArrayList<>(entries) is done at the top level but not in recursive calls since entries.subList(i, end) is a view).
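The aliasing behind this point is standard `java.util.List` behavior: `subList` returns a view, so sorting the view reorders the backing list. The sketch below uses hypothetical method names and plain integers in place of the loader's entries.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: sorting a subList view versus sorting a copy of it. */
final class SubListSortSketch {

    /** Sorts the view in place, which also reorders the caller's backing list. */
    static void sortView(List<Integer> entries, int from, int to) {
        entries.subList(from, to).sort(Comparator.naturalOrder());
    }

    /** Sorts a private copy of the range and leaves the caller's list untouched. */
    static List<Integer> sortedCopy(List<Integer> entries, int from, int to) {
        List<Integer> copy = new ArrayList<>(entries.subList(from, to));
        copy.sort(Comparator.naturalOrder());
        return copy;
    }
}
```

Copying each range costs one extra allocation per recursive call; whether that is worth it depends on whether callers rely on the input order.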

10. Serialization uses Java DataOutputStream — not portable.

Paimon's other indexes use MemorySegment-based serialization for alignment with the internal memory model. Using Java DataOutputStream/DataInputStream means no zero-copy reading, extra allocation, and platform-dependent behavior for edge cases.

Test Issues

11. Benchmark classes are in src/test but aren't JUnit tests.

RTreeBenchmark and RTreeVsLinearScanBenchmark have public static void main() methods but no @Test annotations. They won't run in CI. Either convert to JMH with proper @Benchmark annotations, or move to a dedicated benchmark module. RTreeJMHBenchmark exists but it's also in test — it needs the JMH dependency properly configured.

12. Too many test classes for a single PR.

9 test classes (BoundingBoxTest, RTreeCriticalFixTest, RTreeFileIndexTest, RTreeIntegrationTest, RTreeQuadraticSplitTest, RTreeSTRBulkLoaderTest, RTreeSerializationTest, RTreeSplitFixTest, RTreeTest) plus 3 benchmark classes. This is hard to review. Consolidate to 2-3 test classes: unit tests, integration test, and (optional) benchmark.

Summary

The core R-Tree algorithm implementation is reasonable, but the integration with Paimon's predicate/file-index framework is incomplete. Without a way to push spatial predicates from the query engine to visitEqual with proper literal serialization, this index cannot be triggered from SQL. I'd suggest:

Define how spatial predicates flow from SQL → optimizer → file index reader
Use long for row IDs
Avoid full in-memory deserialization — use on-disk traversal
Remove dead code (dynamic insertion, quadratic split) from production
Make RTreeIndexResult interoperable with BitmapIndexResult for row-level filtering

JingsongLi requested changes May 21, 2026

xuzifu666 added 17 commits May 22, 2026 19:45

[core] Support rtree file index

7b5d34b

add files

9ac7c8c

add files

0808ff4

add files

3552934

add docs

32cdc22

improve adjustParent

575f2f1

serialization fix

c23e2e1

Address 3

4fe8c81

Addressed

40bfb20

fix

2f53821

fix

ce2bb4f

addressed

485300e

Addressed

e43d796

Addressed

42ec52f

fix doc

10a363b

fix test

4dc44c3

add docs

0d22805

xuzifu666 force-pushed the rtree_support branch from 54edc79 to 0d22805 Compare May 22, 2026 11:51

xuzifu666 added 2 commits May 22, 2026 20:07

fix doc

4b217e1

improve comments

3c70745

xuzifu666 requested a review from JingsongLi May 23, 2026 01:45

JingsongLi reviewed May 23, 2026


		RTreeNode newNode = new RTreeNode(dimensions, maxEntries, true);

		int mid = entries.size() / 2;
