[C++][Gandiva] Duplicate function aliases with same parameters #49985

@lriggs

Problem

It is possible to register different Gandiva functions with the same alias and parameters but different return types, resulting in confusing function overloads.

For example, DATE_EXTRACTION_TRUNCATION_FNS in cpp/src/gandiva/function_registry_datetime.cc was invoked twice with the same SQL alias lists: once for the extract* functions (which return int64) and once for the date_trunc_* functions (which return the input date/timestamp type):

DATE_EXTRACTION_TRUNCATION_FNS(EXTRACT_SAFE_NULL_IF_NULL, extract)
DATE_EXTRACTION_TRUNCATION_FNS(TRUNCATE_SAFE_NULL_IF_NULL, date_trunc_)

As a result the registry contained four entries for day(...) where there should have been two:

int64     day(timestamp) → extractDay_timestamp
int64     day(date)      → extractDay_date64
timestamp day(timestamp) → date_trunc_Day_timestamp
date      day(date)      → date_trunc_Day_date64

The same problem existed for every calendar-unit alias: year, month, quarter, week, weekofyear, yearweek, dayofmonth, hour, minute, second. Which overload was resolved depended on the return type the caller inferred, which is not the SQL semantics anyone expects from day(timestamp_col).

FunctionRegistry::Add silently allowed these registrations: unordered_map::emplace keeps the first entry for a given key and discards later ones with no warning.
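To illustrate the emplace behavior described here, the following is a minimal standalone sketch, not Gandiva's actual code. SigKey, SigKeyHash, Entry, Registry and AddChecked are hypothetical names invented for this example, and the key is assumed to consist of the alias and parameter types only. It shows that std::unordered_map::emplace reports a rejected duplicate through the bool in its return value, which a registry could surface instead of ignoring.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical registry key: alias plus parameter types.
// The return type is deliberately NOT part of the key in this sketch.
struct SigKey {
  std::string name;
  std::vector<std::string> params;
  bool operator==(const SigKey& other) const {
    return name == other.name && params == other.params;
  }
};

// Simple hash combining the alias and each parameter type name.
struct SigKeyHash {
  std::size_t operator()(const SigKey& k) const {
    std::size_t h = std::hash<std::string>{}(k.name);
    for (const auto& p : k.params) {
      h = h * 31 + std::hash<std::string>{}(p);
    }
    return h;
  }
};

// Hypothetical registry value: return type and implementation symbol.
struct Entry {
  std::string return_type;
  std::string impl;
};

using Registry = std::unordered_map<SigKey, Entry, SigKeyHash>;

// Returns true if the entry was registered, false if an entry with the
// same alias and parameters already existed. emplace never overwrites an
// existing key; it signals the rejection via the bool in its result.
bool AddChecked(Registry& reg, const SigKey& key, const Entry& entry) {
  return reg.emplace(key, entry).second;
}
```

With this sketch, registering day(timestamp) for extractDay_timestamp succeeds, while a second registration of day(timestamp) for date_trunc_Day_timestamp returns false and leaves the first entry in place. A caller that ignores the return value would see exactly the silent behavior described above.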

Component(s)

Gandiva
