[Feature] Pipeline file load respects PipelineConfig.dtype #1341

@0lai0

What

Wire PipelineConfig.dtype through read_file_by_extension and the producer layer so that f32-configured pipelines actually reach the f32 encoder kernel when loading from file sources.

Why

read_file_by_extension (in pipeline_runner.rs) always returns Vec<f64>. InMemoryProducer and StreamingProducer populate BatchData::F64 even when config.dtype == Float32, so file sources never reach the f32 encoder and the user's dtype choice is silently ignored.

How

  • read_file_by_extension (and the streaming path) select the f32 or f64 reader based on PipelineConfig.dtype
  • InMemoryProducer / StreamingProducer use the matching BatchData variant
  • Python API (QuantumDataLoader.source_file) unchanged at the surface
  • Integration test: f32 Parquet + dtype=f32 config → encode_batch_f32_for_pipeline path is exercised
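
A minimal sketch of the dispatch described in the list above, under stated assumptions: `Dtype`, the shape of `PipelineConfig`, the helper `load_batch`, and the raw-bytes readers are hypothetical stand-ins, not the project's actual code. Only the names `PipelineConfig.dtype`, `BatchData::F32` and `BatchData::F64` come from this issue. The point is that the reader and the `BatchData` variant are chosen in one place, from the config.

```rust
// Hypothetical stand-ins: names mirror the issue text, but fields and
// signatures are assumptions made for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Dtype {
    Float32,
    Float64,
}

#[derive(Debug, PartialEq)]
pub enum BatchData {
    F32(Vec<f32>),
    F64(Vec<f64>),
}

pub struct PipelineConfig {
    pub dtype: Dtype,
}

// Placeholder readers over little-endian bytes. A real reader would
// decode Parquet or another file format instead.
fn read_f32(raw: &[u8]) -> Vec<f32> {
    raw.chunks_exact(4)
        .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect()
}

fn read_f64(raw: &[u8]) -> Vec<f64> {
    raw.chunks_exact(8)
        .map(|b| {
            let mut bytes = [0u8; 8];
            bytes.copy_from_slice(b);
            f64::from_le_bytes(bytes)
        })
        .collect()
}

// Single dispatch point: the configured dtype picks both the reader and
// the BatchData variant, so the two cannot drift apart.
pub fn load_batch(raw: &[u8], config: &PipelineConfig) -> BatchData {
    match config.dtype {
        Dtype::Float32 => BatchData::F32(read_f32(raw)),
        Dtype::Float64 => BatchData::F64(read_f64(raw)),
    }
}
```

Keeping the `match` in one function means a producer cannot wrap f32-configured data in `BatchData::F64` by accident; whether the real producers should share such a helper is a design choice for the PR.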

Acceptance criteria:

  • Integration test: f32 Parquet + dtype=f32 config → encode_batch_f32_for_pipeline path is used
  • Existing f64 file tests still pass
  • No silent fallback to the f64 kernel when the user requests an f32 file pipeline
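
One way the "no silent f64" criterion could be enforced is an explicit check that the batch variant matches the requested dtype before a kernel is chosen. The sketch below is an assumption-laden illustration: `Dtype`, `Kernel`, and `select_kernel` are hypothetical names, and the real encoder entry points (such as `encode_batch_f32_for_pipeline`) are not modeled here.

```rust
// Hypothetical types for illustration; only the BatchData variant names
// are taken from the issue text.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Dtype {
    Float32,
    Float64,
}

pub enum BatchData {
    F32(Vec<f32>),
    F64(Vec<f64>),
}

// Label for the kernel a batch would be routed to (hypothetical).
#[derive(Debug, PartialEq)]
pub enum Kernel {
    F32,
    F64,
}

// Returns the kernel for a batch, or an error when the batch variant
// disagrees with the requested dtype, instead of silently falling back.
pub fn select_kernel(requested: Dtype, batch: &BatchData) -> Result<Kernel, String> {
    match (requested, batch) {
        (Dtype::Float32, BatchData::F32(_)) => Ok(Kernel::F32),
        (Dtype::Float64, BatchData::F64(_)) => Ok(Kernel::F64),
        (want, _) => Err(format!(
            "batch variant does not match requested dtype {:?}",
            want
        )),
    }
}
```

An integration test could then assert that an f32 config paired with an F64 batch yields an error rather than an f64 encode.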
