Skip to content

chore: reuse time bin logic across apis#1660

Open
nikhilsinhaparseable wants to merge 1 commit into
parseablehq:mainfrom
nikhilsinhaparseable:reuse-time-bin
Open

chore: reuse time bin logic across apis#1660
nikhilsinhaparseable wants to merge 1 commit into
parseablehq:mainfrom
nikhilsinhaparseable:reuse-time-bin

Conversation

@nikhilsinhaparseable
Copy link
Copy Markdown
Contributor

@nikhilsinhaparseable nikhilsinhaparseable commented Jun 2, 2026

remove binning logic in counts api
make reusable component
reuse in counts api, errors api, agent-observability related apis

Summary by CodeRabbit

  • New Features
    • Added time-binning utilities for computing intervals, generating expected bin boundaries, and normalizing timestamp formats for more consistent bin alignment.
  • Refactor
    • Count queries now compute adaptive time-interval binning based on the requested range instead of using fixed thresholds, improving aggregation accuracy and consistency.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 2, 2026

Review Change Stack

Walkthrough

Extracts time-binning logic into new utilities and updates CountsRequest SQL generation to compute DATE_BIN intervals dynamically, using a shared epoch anchor for bin start/end expressions.

Changes

Dynamic time-binning for counts query

Layer / File(s) Summary
Time-binning utility functions
src/utils/time.rs
Adds DATE_BIN_EPOCH_ANCHOR and exported helpers: count_api_bin_interval, interval_for_num_bins, expected_time_bins, and match_time_bin_key.
Counts query refactoring
src/query/mod.rs
Imports the new utilities and replaces hardcoded threshold-based DATE_BIN interval selection with a dynamic bin_interval from count_api_bin_interval, using DATE_BIN_EPOCH_ANCHOR to build *_bin_start_time_ and *_bin_end_time_ SQL expressions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • parseablehq/parseable#1425: Prior refactor of counts query date-binning SQL that relates to this PR's interval selection and formatting changes.

Poem

I nibble at epochs, carrots in rows, 🥕
From fixed ticks to intervals that know,
I stitch anchors and bins through night,
So queries hop in measured flight,
A rabbit's tally, soft and bright.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is minimal and lacks the required template structure including a detailed description section, rationale, key changes, and testing/documentation checklists. Expand description to follow the template: add detailed goal/rationale, document which APIs are affected, explain the chosen approach, and include the testing and documentation checklists.
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'chore: reuse time bin logic across apis' accurately describes the main objective of consolidating and reusing time binning logic across multiple APIs.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

remove binning logic in counts api
make reusable component
reuse in counts api, errors api, agent-observability related apis
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/utils/time.rs (1)

60-121: ⚡ Quick win

Missing unit tests for new time-binning utilities.

The four new exported functions (count_api_bin_interval, interval_for_num_bins, expected_time_bins, match_time_bin_key) lack unit test coverage. These are foundational utilities that will be reused across multiple APIs per the PR objectives.

Do you want me to generate unit tests for these functions?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/utils/time.rs` around lines 60 - 121, Tests are missing for the four
exported utilities: count_api_bin_interval, interval_for_num_bins,
expected_time_bins, and match_time_bin_key; add a new test module (or file) that
exercises these functions: verify count_api_bin_interval returns
"1m","5m","1h","1d" for durations around the 5-minute, 24-hour, and 10-day
thresholds using DateTime<Utc> instances; test interval_for_num_bins for correct
seconds calculation and formatting and for edge case num_bins=0/1 (or guard
against 0 by using at least 1); test expected_time_bins returns num_bins
entries, that bins cover contiguous ranges and that the final bin_end equals the
provided end; and test match_time_bin_key with RFC3339 strings with and without
timezone suffixes (Z, +00:00, other offsets) to ensure normalization and
matching within 1 second and that non-matching inputs are returned unchanged—use
DateTime::parse_from_rfc3339 and Utc conversions in asserts and reference the
functions count_api_bin_interval, interval_for_num_bins, expected_time_bins, and
match_time_bin_key when locating code to test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/utils/time.rs`:
- Around line 101-106: The timezone normalization in match_time_bin_key
incorrectly treats strings with negative offsets as lacking a timezone; change
the logic that builds normalized from sql_bin_start so it detects any explicit
timezone suffix (either 'Z' or a +/-HH:MM offset) instead of only checking for
'+'; for example, use a regex or attempt parse_from_rfc3339 to detect a present
offset and only append "+00:00" when no timezone is present, keeping function
name match_time_bin_key and variable sql_bin_start as the points to modify.

---

Nitpick comments:
In `@src/utils/time.rs`:
- Around line 60-121: Tests are missing for the four exported utilities:
count_api_bin_interval, interval_for_num_bins, expected_time_bins, and
match_time_bin_key; add a new test module (or file) that exercises these
functions: verify count_api_bin_interval returns "1m","5m","1h","1d" for
durations around the 5-minute, 24-hour, and 10-day thresholds using
DateTime<Utc> instances; test interval_for_num_bins for correct seconds
calculation and formatting and for edge case num_bins=0/1 (or guard against 0 by
using at least 1); test expected_time_bins returns num_bins entries, that bins
cover contiguous ranges and that the final bin_end equals the provided end; and
test match_time_bin_key with RFC3339 strings with and without timezone suffixes
(Z, +00:00, other offsets) to ensure normalization and matching within 1 second
and that non-matching inputs are returned unchanged—use
DateTime::parse_from_rfc3339 and Utc conversions in asserts and reference the
functions count_api_bin_interval, interval_for_num_bins, expected_time_bins, and
match_time_bin_key when locating code to test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f5900f5a-f916-4783-a03d-a99b6fa0c426

📥 Commits

Reviewing files that changed from the base of the PR and between cbfbfa1 and 06d123e.

📒 Files selected for processing (2)
  • src/query/mod.rs
  • src/utils/time.rs

Comment thread src/utils/time.rs
Comment on lines +101 to +106
pub fn match_time_bin_key(sql_bin_start: &str, expected_bins: &[(String, String)]) -> String {
let normalized = if !sql_bin_start.contains('+') && !sql_bin_start.ends_with('Z') {
format!("{sql_bin_start}+00:00")
} else {
sql_bin_start.to_string()
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Timezone normalization doesn't handle negative UTC offsets.

If sql_bin_start contains a negative timezone offset (e.g., "2023-01-01T00:00:00-05:00"), it won't contain '+' and won't end with 'Z', causing the code to incorrectly append "+00:00". The subsequent parse_from_rfc3339 would fail on the malformed string, and the function falls back to returning the original—but no matching occurs.

🔧 Proposed fix to detect existing timezone offsets
 pub fn match_time_bin_key(sql_bin_start: &str, expected_bins: &[(String, String)]) -> String {
-    let normalized = if !sql_bin_start.contains('+') && !sql_bin_start.ends_with('Z') {
+    // Check for existing timezone: ends with Z, or contains + or - followed by HH:MM pattern
+    let has_tz = sql_bin_start.ends_with('Z')
+        || sql_bin_start.contains('+')
+        || sql_bin_start.rfind('-').map_or(false, |pos| pos > 10); // '-' after date portion
+    let normalized = if !has_tz {
         format!("{sql_bin_start}+00:00")
     } else {
         sql_bin_start.to_string()
     };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub fn match_time_bin_key(sql_bin_start: &str, expected_bins: &[(String, String)]) -> String {
let normalized = if !sql_bin_start.contains('+') && !sql_bin_start.ends_with('Z') {
format!("{sql_bin_start}+00:00")
} else {
sql_bin_start.to_string()
};
pub fn match_time_bin_key(sql_bin_start: &str, expected_bins: &[(String, String)]) -> String {
// Check for existing timezone: ends with Z, or contains + or - followed by HH:MM pattern
let has_tz = sql_bin_start.ends_with('Z')
|| sql_bin_start.contains('+')
|| sql_bin_start.rfind('-').map_or(false, |pos| pos > 10); // '-' after date portion
let normalized = if !has_tz {
format!("{sql_bin_start}+00:00")
} else {
sql_bin_start.to_string()
};
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/utils/time.rs` around lines 101 - 106, The timezone normalization in
match_time_bin_key incorrectly treats strings with negative offsets as lacking a
timezone; change the logic that builds normalized from sql_bin_start so it
detects any explicit timezone suffix (either 'Z' or a +/-HH:MM offset) instead
of only checking for '+'; for example, use a regex or attempt parse_from_rfc3339
to detect a present offset and only append "+00:00" when no timezone is present,
keeping function name match_time_bin_key and variable sql_bin_start as the
points to modify.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
src/utils/time.rs (1)

101-106: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Negative UTC offsets are still normalized incorrectly.

Strings like 2023-01-01T00:00:00-05:00 still fail the '+'/'Z' check, so this appends +00:00 and produces an invalid RFC3339 value. That leaves valid SQL timestamps unmatched.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/utils/time.rs` around lines 101 - 106, The code in match_time_bin_key
incorrectly assumes absence of '+' or 'Z' means no timezone and appends
"+00:00", which breaks negative offsets like "2023-01-01T00:00:00-05:00"; update
the normalization to detect a timezone suffix properly by checking for a
trailing 'Z' or a numeric offset pattern (e.g. use a regex like
r"[+-]\d{2}:\d{2}$" or equivalent) against sql_bin_start, and only append
"+00:00" when that pattern does not match; modify the normalization logic around
sql_bin_start in match_time_bin_key to use this check.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/utils/time.rs`:
- Around line 80-97: The expected_time_bins function currently anchors bins at
time_range.start which diverges from the SQL DATE_BIN(...,
DATE_BIN_EPOCH_ANCHOR) behavior; change expected_time_bins to compute the first
bin start aligned to the same epoch anchor used by SQL (or accept a resolved
anchor/bin_start parameter) and then generate subsequent bins by adding
bin_seconds increments so the Rust keys match the SQL keys; modify the signature
of expected_time_bins (or add an extra parameter) to accept an epoch_anchor or a
precomputed first_bin_start, locate logic in expected_time_bins and replace the
bin_start/bin_end calculation to start from that aligned anchor (use TimeRange
and num_bins to compute total_seconds and bin_seconds as before), and ensure the
last bin_end still uses time_range.end.
- Around line 74-77: interval_for_num_bins currently divides by num_bins and
will panic on num_bins == 0; change the function to validate num_bins first and
return a Result<String, ValidationError> (or a suitable crate error) instead of
a bare String, returning Err when num_bins == 0 and Ok(format!("{bin_seconds}
seconds")) otherwise; apply the same fix to the other public helper that
performs division by num_bins (the sibling helper defined nearby), update
callers to handle the Result, and preserve the existing calculation logic after
the guard.

---

Duplicate comments:
In `@src/utils/time.rs`:
- Around line 101-106: The code in match_time_bin_key incorrectly assumes
absence of '+' or 'Z' means no timezone and appends "+00:00", which breaks
negative offsets like "2023-01-01T00:00:00-05:00"; update the normalization to
detect a timezone suffix properly by checking for a trailing 'Z' or a numeric
offset pattern (e.g. use a regex like r"[+-]\d{2}:\d{2}$" or equivalent) against
sql_bin_start, and only append "+00:00" when that pattern does not match; modify
the normalization logic around sql_bin_start in match_time_bin_key to use this
check.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 416da683-6946-4d4e-a3ca-14a900e1e4e0

📥 Commits

Reviewing files that changed from the base of the PR and between 06d123e and 28361c9.

📒 Files selected for processing (2)
  • src/query/mod.rs
  • src/utils/time.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/query/mod.rs

Comment thread src/utils/time.rs
Comment on lines +74 to +77
pub fn interval_for_num_bins(start: &DateTime<Utc>, end: &DateTime<Utc>, num_bins: u64) -> String {
let total_seconds = end.signed_duration_since(*start).num_seconds().max(1) as u64;
let bin_seconds = (total_seconds / num_bins).max(1);
format!("{bin_seconds} seconds")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard num_bins before dividing.

Both public helpers panic on num_bins == 0 here. That turns a bad bin-count input into a runtime failure instead of a handled validation error.

Also applies to: 80-86

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/utils/time.rs` around lines 74 - 77, interval_for_num_bins currently
divides by num_bins and will panic on num_bins == 0; change the function to
validate num_bins first and return a Result<String, ValidationError> (or a
suitable crate error) instead of a bare String, returning Err when num_bins == 0
and Ok(format!("{bin_seconds} seconds")) otherwise; apply the same fix to the
other public helper that performs division by num_bins (the sibling helper
defined nearby), update callers to handle the Result, and preserve the existing
calculation logic after the guard.

Comment thread src/utils/time.rs
Comment on lines +80 to +97
pub fn expected_time_bins(time_range: &TimeRange, num_bins: u64) -> Vec<(String, String)> {
let total_seconds = time_range
.end
.signed_duration_since(time_range.start)
.num_seconds()
.max(1) as u64;
let bin_seconds = (total_seconds / num_bins).max(1);

(0..num_bins)
.map(|i| {
let bin_start = time_range.start + chrono::Duration::seconds((i * bin_seconds) as i64);
let bin_end = if i == num_bins - 1 {
time_range.end
} else {
time_range.start + chrono::Duration::seconds(((i + 1) * bin_seconds) as i64)
};
(bin_start.to_rfc3339(), bin_end.to_rfc3339())
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

expected_time_bins is using a different anchor than the SQL bins.

The query layer now bins with DATE_BIN(..., DATE_BIN_EPOCH_ANCHOR), but this helper still starts every bucket at time_range.start. For any unaligned range, the generated keys drift from the SQL keys, so match_time_bin_key stops matching them. Please derive these boundaries from the same epoch anchor (or pass the already-resolved SQL bin start values through) so both sides use one contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/utils/time.rs` around lines 80 - 97, The expected_time_bins function
currently anchors bins at time_range.start which diverges from the SQL
DATE_BIN(..., DATE_BIN_EPOCH_ANCHOR) behavior; change expected_time_bins to
compute the first bin start aligned to the same epoch anchor used by SQL (or
accept a resolved anchor/bin_start parameter) and then generate subsequent bins
by adding bin_seconds increments so the Rust keys match the SQL keys; modify the
signature of expected_time_bins (or add an extra parameter) to accept an
epoch_anchor or a precomputed first_bin_start, locate logic in
expected_time_bins and replace the bin_start/bin_end calculation to start from
that aligned anchor (use TimeRange and num_bins to compute total_seconds and
bin_seconds as before), and ensure the last bin_end still uses time_range.end.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant