[Feature] ParquetReader: f32/f64 columns + Arrow cast #1340

@0lai0

What

Make ParquetReader generic over FloatElem so it can read both Float32 and Float64 Parquet columns, with zero-copy for same-dtype and an Arrow cast for cross-dtype reads.

Why

ParquetReader::new rejects non-Float64 columns at the schema check (readers/parquet.rs ~line 91), so f32 Parquet files cannot be read at all today. This blocks any pipeline that uses an f32 file as its source.

How

  • Introduce ParquetReader<T: FloatElem> accepting List<Float32> and List<Float64> (and FixedSizeList variants)
  • Same dtype → zero-copy path; different dtype → arrow::compute::cast with safe cast options (the NaN/Inf/overflow behavior must be documented)
  • Apply the same policy to ParquetStreamingReader
  • New test file qdp-core/tests/parquet_f32.rs covering three cases: an f32 column read as f32, an f64 column cast to f32, and an unsupported column type returning InvalidInput with the dtype in the message
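
To make the same-dtype vs. cross-dtype split concrete, here is a minimal std-only sketch. It is not the proposed implementation: `FloatColumn` is a hypothetical stand-in for the values buffer of a decoded Arrow array, and the `FloatElem` trait shape (`DTYPE`, `read`) is an assumption, since this issue names the trait but does not define it. `Cow::Borrowed` stands in for the zero-copy path and `Cow::Owned` for the cast path. The cast here uses Rust's `as`, which keeps NaN as NaN, keeps ±Inf, and turns finite f64 values outside the f32 range into ±Inf; whether arrow::compute::cast with safe options behaves the same way is exactly what the doc comment should pin down.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for the values buffer of a decoded float column.
#[derive(Debug, Clone, Copy)]
pub enum FloatColumn<'a> {
    F32(&'a [f32]),
    F64(&'a [f64]),
}

/// Assumed shape of the element trait; the real `FloatElem` may differ.
pub trait FloatElem: Copy + Sized + 'static {
    /// Human-readable dtype name, useful in error messages.
    const DTYPE: &'static str;

    /// Borrow the buffer when the dtypes match, otherwise cast into a new Vec.
    fn read<'a>(col: FloatColumn<'a>) -> Cow<'a, [Self]>;
}

impl FloatElem for f32 {
    const DTYPE: &'static str = "Float32";

    fn read<'a>(col: FloatColumn<'a>) -> Cow<'a, [f32]> {
        match col {
            // Same dtype: no copy, just hand back the borrowed slice.
            FloatColumn::F32(v) => Cow::Borrowed(v),
            // Narrowing cast: rounds to nearest f32, overflow becomes +/-Inf,
            // NaN stays NaN.
            FloatColumn::F64(v) => Cow::Owned(v.iter().map(|&x| x as f32).collect()),
        }
    }
}

impl FloatElem for f64 {
    const DTYPE: &'static str = "Float64";

    fn read<'a>(col: FloatColumn<'a>) -> Cow<'a, [f64]> {
        match col {
            // Widening cast: every f32 value is exactly representable as f64.
            FloatColumn::F32(v) => Cow::Owned(v.iter().map(|&x| f64::from(x)).collect()),
            // Same dtype: no copy.
            FloatColumn::F64(v) => Cow::Borrowed(v),
        }
    }
}
```

With this shape, `<f32 as FloatElem>::read(FloatColumn::F32(&buf))` borrows `buf`, while `<f32 as FloatElem>::read(FloatColumn::F64(&buf))` allocates. Returning `Cow` keeps a single return type for both paths, which is one possible design; the real reader may prefer to return Arrow arrays directly.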

Out of scope: pipeline_runner wiring

Acceptance criteria:

  • ParquetReader::<f32> succeeds on f32 column
  • f64 column read as f32: documented cast rules applied
  • Unsupported column type returns InvalidInput error with column dtype in message
  • New tests in qdp-core/tests/parquet_f32.rs
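
For the third criterion, a rough std-only sketch of a schema check that carries the offending dtype into the error message. Everything here is hypothetical: `ColumnType`, `FloatKind`, `ReaderError`, and `check_column` are illustration names, and the crate's real error type and its `InvalidInput` variant may be shaped differently.

```rust
use std::fmt;

/// Hypothetical subset of column types the schema check could encounter.
#[derive(Debug, Clone, PartialEq)]
pub enum ColumnType {
    ListFloat32,
    ListFloat64,
    FixedSizeListFloat32(usize),
    FixedSizeListFloat64(usize),
    /// Any other type, carried as its printed dtype (e.g. "List<Int64>").
    Other(String),
}

/// Element type the column stores on disk.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FloatKind {
    F32,
    F64,
}

/// Stand-in for the crate's error type; only the variant used here is modeled.
#[derive(Debug, PartialEq)]
pub enum ReaderError {
    InvalidInput(String),
}

impl fmt::Display for ReaderError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ReaderError::InvalidInput(msg) => write!(f, "invalid input: {}", msg),
        }
    }
}

impl std::error::Error for ReaderError {}

/// Accept the four supported float layouts; reject everything else and
/// include both the column name and its dtype in the message.
pub fn check_column(name: &str, ty: &ColumnType) -> Result<FloatKind, ReaderError> {
    match ty {
        ColumnType::ListFloat32 | ColumnType::FixedSizeListFloat32(_) => Ok(FloatKind::F32),
        ColumnType::ListFloat64 | ColumnType::FixedSizeListFloat64(_) => Ok(FloatKind::F64),
        ColumnType::Other(dtype) => Err(ReaderError::InvalidInput(format!(
            "column '{}' has unsupported type {}; expected List or FixedSizeList of Float32/Float64",
            name, dtype
        ))),
    }
}
```

A test in qdp-core/tests/parquet_f32.rs could then assert on the message with something like `err.to_string().contains("Int64")`, so the check fails if the dtype is ever dropped from the error.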
