feat: for offsite audits, read brand presence execution data from PostgREST (LLMO-4307) by rarescheseli · Pull Request #2389 · adobe/spacecat-audit-worker

rarescheseli · 2026-04-15T12:40:56Z

Summary:

Add PostgREST as the primary data source for brand presence execution data in offsite audits, falling back to the existing SharePoint file-backed approach when no DB rows are found
Extract isBrandalfEnabled and resolveOrganizationIdForSite into a shared brandalf-utils.js module, reused by the offsite handler, enrichment, and customer analysis flows
Rename brand-presence-enrichment.js to offsite-brand-presence-enrichment.js for consistency with related modules

Please ensure your pull request adheres to the following guidelines:

make sure to link the related issues in this description
when merging / squashing, make sure the fixed issue references are visible in the commits, for easy compilation of release notes
If data sources for any opportunity has been updated/added, please update the wiki for same opportunity.

Related Issues

Thanks for contributing!

github-actions · 2026-04-15T12:58:53Z

This PR will trigger no release when merged.

codecov · 2026-04-16T06:49:44Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

…o-json

solaris007

Hey @rarescheseli,

This PR directly addresses one of the key findings from the aem.live traffic spike investigation - adding the brandalf guard to brand-presence-enrichment.js that was missing. The overall approach is solid: PostgREST as primary source, feature-flag-gated rollout, legacy fallback, and thorough test coverage. Good work on the module extraction and the legacy format mapping - changing one layer at a time is the right transitional strategy.

Strengths

Clean extraction of isBrandalfEnabled and resolveOrganizationIdForSite into shared brandalf-utils.js with improved null-safety over the inline original
Well-designed org ID resolution chain: site.getOrganizationId() -> explicit fallback -> Site.findById() lookup
All PostgREST interactions use parameterized queries via supabase-js fluent API - no injection risk. Every query filters by both organization_id AND site_id for tenant isolation
Fail-closed design throughout - any PostgREST failure returns null, cleanly triggering the legacy path
mapExecutionsToLegacyBrandPresenceRows reshapes DB results into the existing { Sources, Region, Topics, Prompt, Category } shape so downstream consumers stay untouched
Thorough test coverage across all new modules: brandalf-utils.test.js covers all 15 branches, offsite-brand-presence-postgrest.test.js covers pagination, chunking, null inputs, and error paths
Safe rollout via feature flag - can be enabled per-org without risk to non-migrated sites

Issues

Important (Should Fix)

1. Fallback to aem.live for brandalf-enabled sites defeats the traffic reduction goal

In both offsiteBrandPresenceRunner and computeTopicsFromBrandPresence, when isBrandalfOrg is true but loadBrandPresenceDataFromPostgrest returns null (no data, empty sources, or error), the code falls through to the legacy aem.live file-fetch path. But the whole motivation for this PR is that migrated sites generate 404s against aem.live. The fallback re-creates the exact problem.

For brandalf-enabled sites, "no data in PostgREST" should mean "no brand presence data exists for this period" - not "try fetching from the old path that will 404." The fallback should only apply when isBrandalfOrg is false.

Recommendation: after the PostgREST attempt, if isBrandalfOrg is true, return an empty/no-data result instead of falling through to the aem.live path.

2. Feature flag HTTP call per audit invocation - no caching

isBrandalfEnabled issues a fresh HTTP request to /organizations/{id}/feature-flags on every audit invocation. At 212K invocations/day (and growing), that's 212K round trips to the SpaceCat API just for flag checks. A module-level Map keyed by organizationId with a 5-minute TTL would cut this traffic by orders of magnitude.

3. Source query fan-out scales poorly

fetchSourcesForExecutionIds chunks execution IDs into groups of 50 and queries them sequentially. A site with 5000 executions yields 100 sequential PostgREST round trips. Options: increase EXECUTION_ID_CHUNK_SIZE to 200-500 (Postgres handles .in() with hundreds of UUIDs comfortably), use Promise.all with a concurrency limiter, or query brand_presence_sources by the same (org_id, site_id, date_range) filter to avoid the ID fan-out entirely.

4. Duplicated PostgREST-first orchestration logic

The same sequence (resolve org ID -> check brandalf flag -> load from PostgREST -> fallback) is repeated nearly verbatim in offsiteBrandPresenceRunner and computeTopicsFromBrandPresence. If the fallback strategy changes (e.g., fixing issue 1 above), both sites must be updated in lockstep. Consider extracting a shared loadBrandPresenceWithFallback(siteId, context) function.

5. No successful multi-page pagination test

The pagination test for fetchExecutionsBatched creates a 5000-row first batch, then the second batch errors. There is no test where two pages both succeed and results are accumulated correctly. Add a test with a first batch of exactly EXECUTION_FETCH_BATCH_SIZE rows and a second batch of fewer rows, then assert the combined count.

Minor (Nice to Have)

6. 'all' -> 'chatgpt-paid' mapping is undocumented - In BRAND_PRESENCE_DB_MODEL_BY_PROVIDER, all: 'chatgpt-paid' is semantically surprising. A one-line comment explaining why would help future maintainers.

7. Wasted topicMap computation in handler.js PostgREST path - extractUrlsAndTopics populates topicMap but it goes out of scope unused - only allUrls is consumed downstream.

8. computeTopicsFromBrandPresence doesn't receive the site object - Only siteId is passed, forcing resolveOrganizationIdForSite to do a Site.findById() lookup. All callers (cited, reddit, youtube handlers) already have site in scope. Threading it through would eliminate one DB round trip per invocation.

Recommendations

Issue 1 is the highest-priority fix - it directly determines whether this PR achieves its goal of reducing aem.live traffic for migrated sites
Issues 2-4 can ship in a fast-follow if the team prefers, but the caching (issue 2) will matter at current scale
The companion PR adobe/spacecat-api-service#2229 (lowering default proxy limit to 5K) is already approved - together these two PRs address the top recommendations from the traffic spike analysis

Assessment

Ready to merge? With fixes

The code is well-structured, well-tested, and directly addresses the aem.live traffic spike root cause. The one issue that should be fixed before merge is the fallback behavior for brandalf-enabled sites (issue 1) - without it, migrated sites will still generate 404 traffic against aem.live, which is the problem this PR is meant to solve. The remaining issues (caching, fan-out, duplication, pagination test) are important for operational health and can be addressed in a fast-follow.

Next Steps

Fix the brandalf-enabled fallback (issue 1) - short-circuit to empty result instead of falling through to aem.live
Add the successful multi-page pagination test (issue 5)
Consider the caching, fan-out, and duplication improvements (issues 2-4) as fast-follow

rarescheseli · 2026-04-23T15:13:18Z

- * Gets the ISO week number and year for the previous two weeks.
- * @returns {Array<{ week: number, year: number }>} Previous two weeks (most recent first)
- */
-function getPreviousWeeks() {


This block was moved to offsite-brand-presence-enrichment.js

…o-json

solaris007

Hey @rarescheseli,

All five Important items from the prior review are resolved, most exceeding what was asked:

Strengths (previously flagged, now resolved)

Item 1 (brandalf fallback): loadBrandPresenceData returns null for brandalf orgs when PostgREST returns no data. The legacy file path is reachable only when isBrandalfOrg === false. Fix is in the right place.
Item 3 (source query fan-out): Exceeded expectations. Replaced the N/50 chunked fan-out with a single PostgREST FK-embedded query (brand_presence_sources(source_urls(url))) plus keyset pagination on (execution_date, id). Both the round-trip pattern and the offset performance cliff are gone.
Item 4 (duplicated orchestration): loadBrandPresenceData in enrichment.js is now the single canonical entry point. Both offsiteBrandPresenceRunner and computeTopicsFromBrandPresence delegate to it. handler.js dropped 270 lines, gained 36.
Item 5 (site threading): computeTopicsFromBrandPresence(siteId, context, site) now accepts the site entity. All three callers (cited, reddit, youtube) forward it, avoiding redundant Site.findById().
Pagination coverage: New test accumulates rows across two successful pages via keyset cursor seeds a full first page (5000 rows) plus a partial second page (3 rows) and asserts correct accumulation. Prior gap closed.
Orphan-source rows eliminated structurally: The embedded FK join means orphan source rows cannot exist by construction, not by guard code.
Observability bonus: USER_AGENT = 'Offsite Audits - Spacecat/1.0' and &source=offsite-audits query param added to outbound calls.

Issues

Important (Should Fix)

1. Stale log message in loadBrandPresenceDataFromPostgrest catch block. src/utils/offsite-brand-presence-postgrest.js (~line 177) still reads "Falling back to file-backed brand presence for site ${siteId}". With the no-fallback behavior for brandalf orgs, this message is now actively misleading on every PostgREST failure - ops will read it as "we fell back to files" when in fact the function returns null and no fallback occurs. Reword to e.g. "PostgREST query failed for site ${siteId}: ${error.message}" and update the corresponding test matcher. One-line fix with real incident-response value.

2. No test pins the site-parameter forwarding through computeTopicsFromBrandPresence. The signature change to accept site is the mechanism that avoids the extra org lookup (item 5), but no test passes a non-undefined site and asserts that resolveOrganizationIdForSite received it. A regression that drops the forwarding (e.g., someone reintroduces a 2-arg call) would not be caught. Add one test that stubs resolveOrganizationIdForSite, calls with a fakeSite, and asserts .firstCall.args[0].site === fakeSite.

Minor (Nice to Have)

Fail-open on brandalf check blip (pre-existing, impact increased). isBrandalfEnabled returns false for non-2xx responses and thrown errors. In loadBrandPresenceData, that means a feature-flag endpoint outage silently routes a brandalf-enabled site to the legacy aem.live path - the exact behavior item 1 was meant to prevent. Distinguishing "flag is false" from "flag check failed" (e.g., return null/throw for failure) would close this hole. Acceptable as a tracked follow-up, not blocking.
isBrandalfEnabled caching (item 2, not addressed). Still one HTTP call per audit invocation. Post-consolidation the call fires once per audit rather than once per provider, so the rate is reduced. Acceptable as a tracked follow-up.
'all' -> 'chatgpt-paid' mapping undocumented. A two-line comment would prevent future confusion.
fetchExecutionsWithSources has no max-iteration guard. A MAX_PAGES constant (e.g., 50 = 250K executions) would prevent a pathological condition from looping the Lambda to timeout.
Keyset cursor interpolation trust assumption implicit. Values are ISO dates and UUIDs (safe today). A one-line comment noting this prevents future schema-change surprises.
Cross-year boundary test coverage narrowed. Old handler test exercised full QI-fetch flow with [{w:1,y:2026},{w:52,y:2025}]. New test mocks at the loadBrandPresenceData boundary. Add one cross-year case in the enrichment layer to retain that coverage.

Assessment

Ready to merge? With fixes.

The prior CHANGES_REQUESTED items are all resolved, with the source-query refactor substantially exceeding what was asked. The two remaining Important items are a one-line log reword and a small test addition. The fail-open and caching concerns are pre-existing, their impact is reduced by this PR's consolidation, and they have a clean follow-up path.

Next Steps

Reword the stale catch-block log in loadBrandPresenceDataFromPostgrest and update the test matcher.
Add one test pinning site parameter forwarding through computeTopicsFromBrandPresence.
File a follow-up for the fail-open brandalf check and the caching concern so they do not get lost.

solaris007 · 2026-04-28T16:23:22Z

+      return null;
+    }
+
+    const rows = mapExecutionsToLegacyBrandPresenceRows(executions);


Important: This log message still reads "Falling back to file-backed brand presence" but with the new no-fallback behavior for brandalf orgs, no fallback occurs - the function returns null. Reword to e.g. PostgREST query failed for site ${siteId}: ${error.message} and update the test matcher. One-line fix, real incident-response value.

good catch! Addressed it in 827daa8.

…o-json

solaris007

Hey @rarescheseli,

All prior findings addressed cleanly in the two new commits.

Strengths (previously flagged, now resolved)

Stale log message - loadBrandPresenceDataFromPostgrest catch block now reads "PostgREST query failed for site ${siteId}: ${error.message}" and the test matcher is updated to match. No more misleading "fallback" wording during incidents.
Site parameter forwarding test - new test "forwards the provided site object to organization resolution" stubs resolveOrganizationIdForSite and asserts site: fakeSite was passed through. Closes the regression-window for the item-5 fix.
Fail-open on brandalf check (improved beyond what was asked) - isBrandalfEnabled now returns null (instead of false) for missing env, non-2xx responses, non-array payloads, and thrown errors. loadBrandPresenceData short-circuits with isBrandalfOrg === null and skips both PostgREST and the legacy aem.live path. This closes the silent-fallback hole that the item-1 fix would otherwise leave open during a feature-flag endpoint outage.
'all' -> 'chatgpt-paid' mapping documented with a one-line comment.
MAX_EXECUTION_FETCH_PAGES = 50 guard added to fetchExecutionsWithSources with throw on overrun and a corresponding test.
Keyset cursor trust assumption annotated with a comment noting the ISO-date and UUID dependency.
Cross-year boundary explicitly tested with [{w:1,y:2026},{w:52,y:2025}].

Assessment

Ready to merge? Yes.

Each prior item has a matching code change and a matching test. The brandalf null-return change is a particularly nice touch - the prior reviews flagged the fail-open as a separate concern and this PR resolves it tighter than the workaround we discussed.

Rares Cheseli added 2 commits April 15, 2026 15:36

try to read from Postgrest brand presence execution data

16c8fdf

fix coverage

9a1e563

Rares Cheseli added 2 commits April 15, 2026 17:13

add unit tests

f2b19ce

add unit tests for brandalf-utils

dca672e

Rares Cheseli added 5 commits April 16, 2026 11:43

sort descending by date

aa13719

refactor to use existing date-related utils and refactor unit tests

93bf8b0

Merge branch 'main' into rcheseli/read-from-postgrest-then-fallback-t…

d4b0dd6

…o-json

update unit test

b78d170

Merge branch 'main' into rcheseli/read-from-postgrest-then-fallback-t…

610c559

…o-json

rarescheseli marked this pull request as ready for review April 16, 2026 11:58

rarescheseli changed the title ~~feat: read from Postgrest brand presence execution data and fallback to share-point data if there's no data in DB (LLMO-4307)~~ feat: read from PostgREST brand presence execution data and fallback to share-point data if there's no data in DB (LLMO-4307) Apr 16, 2026

rarescheseli changed the title ~~feat: read from PostgREST brand presence execution data and fallback to share-point data if there's no data in DB (LLMO-4307)~~ feat: for offsite audits, read brand presence execution data from PostgREST (LLMO-4307) Apr 16, 2026

solaris007 assigned rarescheseli Apr 17, 2026

solaris007 added the enhancement New feature or request label Apr 17, 2026

solaris007 requested changes Apr 17, 2026

View reviewed changes

Comment thread src/offsite-brand-presence/handler.js Outdated

Comment thread src/utils/offsite-brand-presence-enrichment.js Outdated

Rares Cheseli added 7 commits April 23, 2026 15:11

if isBrandalfOrg do not fallback to fetching json files

22126d1

refactor source fetch from db

a8a3ce4

refactor

5c9207c

pass site to computeTopicsFromBrandPresence

bec3b7d

lint

f23ba67

improve Postgres query

568112c

refactor unit tests

6299726

rarescheseli commented Apr 23, 2026

View reviewed changes

Merge branch 'main' into rcheseli/read-from-postgrest-then-fallback-t…

93400f0

…o-json

rarescheseli requested a review from solaris007 April 23, 2026 15:26

rarescheseli temporarily deployed to dev-branches April 23, 2026 15:31 — with GitHub Actions Inactive

solaris007 requested changes Apr 28, 2026

View reviewed changes

address PR feedback

827daa8

rarescheseli temporarily deployed to dev-branches April 29, 2026 08:18 — with GitHub Actions Inactive

rarescheseli requested a review from solaris007 April 29, 2026 08:40

Merge branch 'main' into rcheseli/read-from-postgrest-then-fallback-t…

e240d58

…o-json

rarescheseli temporarily deployed to dev-branches April 29, 2026 08:45 — with GitHub Actions Inactive

solaris007 approved these changes May 6, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: for offsite audits, read brand presence execution data from PostgREST (LLMO-4307)#2389

feat: for offsite audits, read brand presence execution data from PostgREST (LLMO-4307)#2389
rarescheseli wants to merge 19 commits into
mainfrom
rcheseli/read-from-postgrest-then-fallback-to-json

rarescheseli commented Apr 15, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Apr 15, 2026

Uh oh!

codecov Bot commented Apr 16, 2026

Uh oh!

solaris007 left a comment

Uh oh!

Uh oh!

Uh oh!

rarescheseli Apr 23, 2026

Uh oh!

solaris007 left a comment

Uh oh!

solaris007 Apr 28, 2026

Uh oh!

rarescheseli Apr 29, 2026

Uh oh!

solaris007 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

rarescheseli commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Related Issues

Uh oh!

github-actions Bot commented Apr 15, 2026

Uh oh!

codecov Bot commented Apr 16, 2026

Codecov Report

Uh oh!

solaris007 left a comment

Choose a reason for hiding this comment

Strengths

Issues

Important (Should Fix)

Minor (Nice to Have)

Recommendations

Assessment

Next Steps

Uh oh!

Uh oh!

Uh oh!

rarescheseli Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

solaris007 left a comment

Choose a reason for hiding this comment

Strengths (previously flagged, now resolved)

Issues

Important (Should Fix)

Minor (Nice to Have)

Assessment

Next Steps

Uh oh!

solaris007 Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

rarescheseli Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

solaris007 left a comment

Choose a reason for hiding this comment

Strengths (previously flagged, now resolved)

Assessment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rarescheseli commented Apr 15, 2026 •

edited

Loading