
AI analytics #1339

Open
aadesh18 wants to merge 106 commits into dev from ai-analytics

Conversation

@aadesh18
Collaborator

@aadesh18 aadesh18 commented Apr 15, 2026

Summary

This PR ships AI analytics on top of the existing internal MCP Review tool: a new Unified AI Endpoint Analytics tab over the AI proxy/query path, plus the supporting backend telemetry (ai-proxy-handlers, ai-query-handlers, SpacetimeDB log sync, OpenRouter usage attribution, MCP-call / Q&A reducers). It also reworks several MCP Review components and tightens the in-product AI dashboard builder (vibe-coding chat + sandbox host) with tests around the new handlers and SpacetimeDB client.

Base: dev → Head: ai-analytics · 104 files changed · +7,874 / -1,312

What this PR actually adds vs. what was already there

The MCP Review tool itself (Call Logs / Knowledge Base / Analytics tabs) shipped in #1321 ("LLM MCP Flow") — that's the dev baseline shown in the before-shots. This PR layers on top of that:

| Pre-existing on dev (from #1321) | New in this PR |
| --- | --- |
| MCP Review tool app + MCP Review Tool shell | Tabs renamed: Call Logs → MCP Review, Analytics → Unified AI Endpoint Analytics |
| Analytics.tsx: MCP-only QA-score / flag-types dashboard | Usage.tsx + UsageDetail.tsx: full Unified AI Endpoint Analytics surface (new file) |
| CallLogList.tsx, CallLogDetail.tsx, ConversationReplay.tsx, KnowledgeBase.tsx, AddManualQa.tsx | All five reworked (auth, schema, polish) |
| mcp_call_log SpacetimeDB table (public: true) | mcp_call_log reshaped (sharding, indices, optional fields) + brand-new ai_query_log table + reducers (log_mcp_call_reducer, update_mcp_qa_review_reducer) |
| | New backend: ai-proxy-handlers.ts, ai-query-handlers.ts, ai-proxy-logger.ts, mcp-call-logger.ts (+ tests), QA pipeline (qa-reviewer, reviewer-auth, verified-qa), models/prompts/tools |
| | Dashboard: rewrites of dashboards/[id]/page-client.tsx, vibe-coding/*, assistant-ui/thread.tsx (Grok model swap from #1476 + sandbox host fixes) |

Screenshots are captured against a live local dev env (NEXT_PUBLIC_STACK_PORT_PREFIX=82) with SpacetimeDB bindings published per branch — dev's mcp_call_log-only schema for the before-pass, then this PR's schema (mcp_call_log + ai_query_log) for the after-pass — so the table-renders below are real, not stubs.

MCP Review home (was: "Call Logs")

Tab rename + new empty-state messaging. Same data on both sides (zero logged MCP calls in this seed), but the dev shell is just a filter + table, whereas this PR adds the KPI strip, calls-over-time chart, QA score distribution, top flag types, response-time card, and tool-usage card on top — all visible even with zero calls.

| Before (dev) | After (this PR) |
| --- | --- |
| internal-home-before | internal-home-after |

Unified AI Endpoint Analytics — populated

The headline new surface — Usage.tsx + UsageDetail.tsx did not exist on dev. The before-pass shows dev's old "Analytics" tab, which was scoped to MCP QA review stats. The after-pass shows this PR's tab populated with 11 live AI calls fired through the dashboard's vibe-coding chat during this session.

| Before (dev: MCP "Analytics" tab) | After (Unified AI Endpoint Analytics, 11 live calls) |
| --- | --- |
| internal-analytics-before | internal-analytics-after |

What's wired on the after side:

  • Filters: range (24h / 7d / 30d / all), mode (all / stream / generate), auth (all / authed / anon), status (all / ok / error), free-text search, system-prompt chip row (command-center-ask-ai, create-dashboard, docs-ask-ai, email-assistant family, email-wysiwyg-editor, rewrite-template-source, run-query, wysiwyg-edit), and a model chip (x-ai/grok-build-0.1).
  • KPI strip: total calls (11), errors, input/output tokens, cache hit %, total cost, cache savings, avg + p95 duration (6,019 / 6,036ms). All live aggregates from ai_query_log via SpacetimeDB subscriptions.
  • Breakdowns: calls over time, token volume (input + output), cached vs fresh, cache hit % by system prompt, by-system-prompt + by-model rollups, tool usage (update-dashboard, patch-dashboard, sql-query), latency histogram.
  • Paginated calls table CALLS (11) with sortable columns: time, system prompt, model, mode, in/out tokens, cache read/write, $ cost, duration, status.
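
The KPI strip described above can be sketched as a pure aggregation over log rows. This is a minimal illustration, not the PR's implementation: the `AiQueryLogRow` shape and its field names are hypothetical, the cache hit % definition (cached input tokens over total input tokens) is an assumption, and p95 is computed with the nearest-rank method, which may differ from what the dashboard actually uses.

```typescript
// Hypothetical row shape; the real ai_query_log columns may differ.
interface AiQueryLogRow {
  status: "ok" | "error";
  inputTokens: number;
  cachedInputTokens: number; // assumed: input tokens served from cache
  durationMs: number;
}

interface Kpis {
  totalCalls: number;
  errors: number;
  cacheHitPct: number;
  avgDurationMs: number;
  p95DurationMs: number;
}

// Nearest-rank percentile over an ascending-sorted array (0 for an empty array).
function percentile(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) return 0;
  const rank = Math.ceil((p / 100) * sortedAsc.length);
  return sortedAsc[Math.max(rank, 1) - 1];
}

// Aggregates the rows that survive the active filters into the KPI values.
function computeKpis(rows: AiQueryLogRow[]): Kpis {
  const durations = rows.map((r) => r.durationMs).sort((a, b) => a - b);
  const inputTokens = rows.reduce((sum, r) => sum + r.inputTokens, 0);
  const cachedTokens = rows.reduce((sum, r) => sum + r.cachedInputTokens, 0);
  const totalDuration = durations.reduce((sum, d) => sum + d, 0);
  return {
    totalCalls: rows.length,
    errors: rows.filter((r) => r.status === "error").length,
    cacheHitPct: inputTokens === 0 ? 0 : (cachedTokens / inputTokens) * 100,
    avgDurationMs: rows.length === 0 ? 0 : totalDuration / rows.length,
    p95DurationMs: percentile(durations, 95),
  };
}
```

Keeping the aggregation pure means the same function can be re-run whenever a subscription delivers new rows or a filter changes, without caring where the rows came from.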

Knowledge Base — populated

Q&A management surface, backed by mcp_call_log.qa_* and the new update_mcp_qa_review_reducer. Below, it is seeded with 2 published entries + 1 draft via "+ Add Q&A". The substantive change here is the schema rework (sharding, indices, optional fields) rather than new chrome.

| Before (dev) | After (this PR, 2 published + 1 draft) |
| --- | --- |
| internal-kb-before | internal-kb-after |

Each row carries status (Draft / Published), source (manual), timestamp, author, and per-row Publish / Unpublish / Edit / Delete actions. The "+ Add Q&A" modal:

internal-kb-add-modal

Dashboard view — vibe-coding chat + sandbox host

apps/dashboard/.../dashboards/[dashboardId]/page-client.tsx, the vibe-coding/ chat pieces, and the dashboard sandbox host all change here. The chat now drives a live IDE-style preview while a layout is being assembled, with chat history + Generate / Stop controls.

| Empty state (dark) | Empty state (light) |
| --- | --- |
| dashboard-after-dark | dashboard-after-light |

| Active chat: multiple user turns | Mid-generation: sandbox host streaming dashboard.tsx |
| --- | --- |
| dashboard-chat-active | dashboard-ai-generating |

dev baseline (empty state on dev looks similar; the meaningful diff is the sandbox host wiring + the Grok model swap from #1476):

dashboard-before-dark

Notes for reviewers

  • All 11 calls in the analytics shot show error. That's a model-side issue (Grok build-0.1 default in dev-forwarding mode rejects this PR's tool-call schema), not a logging bug — the proxy still attributes, times, and records every call, which is exactly the property the new dashboard is meant to expose. With a working tool-call response you'd see non-zero Output Tokens + Total Cost.
  • Capture method: the before-pass runs on origin/dev with pnpm --filter @stackframe/internal-tool spacetime:publish:local so that dev's mcp_call_log schema is live (otherwise the page hangs on "Connecting to SpacetimeDB…"). The after-pass switches back to ai-analytics, re-publishes, and re-fires the calls. Re-publishing DESTROYS the dev database (--delete-data=on-conflict), so the dev-side shots show only the chrome / empty state: dev's schema is what gets wiped each cycle.
  • MCP Review home still shows 0 calls because dashboard chats hit the AI query/proxy path, not the MCP server. Populating that surface would need a real MCP client; the visible diff is the new KPI scaffold around the (currently empty) call list.
  • /questions route exists but trips a separate Postgres probe on this local DB seed — not regressed, just not seeded. Not screenshotted.

Test plan

  • pnpm typecheck clean
  • pnpm lint clean
  • Backend test suite passes (new tests around ai-proxy-handlers, ai-query-handlers, spacetimedb-client, spacetimedb-bindings-sync, tools/index, ai-proxy-logger)
  • Internal-tool: MCP Review tab renders KPI scaffold + empty call list (no "Connecting to SpacetimeDB…" stall) on a seeded local env
  • Internal-tool: Unified AI Endpoint Analytics renders KPIs + charts under realistic call volume; filter changes (range, mode, auth, status, system-prompt) refetch correctly
  • Internal-tool: Knowledge Base "+ Add Q&A" → publishes a draft → reviewer mark/unmark/retry flow
  • Dashboard: vibe-coding chat → Generate → sandbox host renders + persists; delete flow works

mantrakp04 and others added 30 commits March 23, 2026 10:48
- Added new internal API endpoint for documentation tools, allowing actions such as listing available docs, searching, and fetching specific documentation by ID.
- Updated environment configuration to support optional internal secret for enhanced security.
- Refactored existing search functionality to utilize the new docs tools API instead of the previous MCP server.
- Improved error handling and response parsing for documentation-related requests.
- Expanded documentation to clarify the relationship between the new tools and existing API functionalities.

This update streamlines the documentation access process and enhances the overall developer experience.
- Introduced error capturing for failed HTTP requests in the docs tools API, improving debugging capabilities.
- Updated the API response for unsupported methods to include an 'Allow' header, clarifying the expected request type.

These changes enhance the robustness of the documentation tools integration and improve developer experience.
- Updated the key name in the capabilities section of the API documentation to follow a consistent naming convention, improving clarity and maintainability.
The .gitmodules was updated in d22593d to point at
apps/backend/src/private/implementation, but the gitlink entry (mode
160000) was never added to the tree. This caused
`git clone --recurse-submodules` to silently skip the private submodule.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…on for docs tools

- Added `STACK_DOCS_INTERNAL_BASE_URL` to backend `.env` and `.env.development` files for AI tool bundle configuration.
- Removed references to `STACK_INTERNAL_DOCS_TOOLS_SECRET` from backend and docs environment files and validation logic from the docs tools API route.
- Introduced a new `.env` file for the docs app with essential configuration variables.
@aadesh18
Collaborator Author

@greptile review

Comment thread apps/backend/src/lib/ai/qa/verified-qa.ts
Comment thread apps/backend/src/lib/ai/ai-query-handlers.ts
…minal callbacks

- Added a new test file for `handleStreamMode` that verifies logging behavior for various terminal callback scenarios.
- Introduced a guard in `handleStreamMode` to ensure that logging occurs at most once per request lifecycle, addressing the issue of double-logging when both `onError` and `onAbort` are triggered.
- Tests include cases for single callbacks and multiple callbacks firing in rapid succession, ensuring correct logging behavior under all conditions.
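
The at-most-once guard described in this commit can be sketched as a small closure. This is an assumption-laden illustration rather than the code in `handleStreamMode`: `createLogOnce` and `TerminalEvent` are hypothetical names, and the real handler may track its state differently.

```typescript
// Hypothetical names; the terminal callbacks in the real handler may differ.
type TerminalEvent = "finish" | "error" | "abort";

// Wraps a logging function so that only the first terminal event is recorded.
// Returns true if this call performed the logging, false if it was suppressed.
function createLogOnce(
  log: (event: TerminalEvent) => void,
): (event: TerminalEvent) => boolean {
  let logged = false;
  return (event) => {
    if (logged) return false;
    // Flip the flag before logging so a re-entrant callback cannot double-log.
    logged = true;
    log(event);
    return true;
  };
}

// Usage: onError and onAbort firing back to back produce a single log entry.
const recordedEvents: TerminalEvent[] = [];
const logOnce = createLogOnce((event) => {
  recordedEvents.push(event);
});
logOnce("error");
logOnce("abort");
```

One guard instance is created per request, so each request lifecycle gets its own flag.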
…rift test

The server schema added `qa_entries.requestId` (4th column) and
`add_manual_qa.requestId` (6th arg) on May 11 but the auto-generated client
bindings were last regenerated on May 5 and never picked up the new field.
Because BSATN is positional, every column after position 3 in
`my_visible_qa_entries` would deserialize one slot off (question bytes read as
answer, etc.) once the new schema is published — corrupting the Knowledge Base
view in the internal-tool.
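
To make the off-by-one failure concrete, here is a toy positional decoder. It is not the BSATN wire format, only an illustration of what happens when values are matched to field names purely by index; the column names other than `requestId` (and its 4th position) are hypothetical.

```typescript
// Toy positional decoder: values are assigned to field names purely by index.
// This is NOT BSATN, only a model of positional decoding.
function decodeRow(fieldOrder: string[], values: string[]): Record<string, string> {
  const row: Record<string, string> = {};
  fieldOrder.forEach((name, i) => {
    row[name] = values[i];
  });
  return row;
}

// Hypothetical columns; only requestId in 4th position comes from the commit message.
const serverOrder = ["id", "author", "status", "requestId", "question", "answer"];
const staleClientOrder = ["id", "author", "status", "question", "answer"];

// A row encoded in server order.
const wireValues = ["1", "alice", "published", "req-42", "What is X?", "X is Y."];

const decodedCorrectly = decodeRow(serverOrder, wireValues);
const decodedShifted = decodeRow(staleClientOrder, wireValues);
// With the stale order, "question" receives the requestId value and
// "answer" receives the question text: every later column is one slot off.
```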

Add `requestId` to `my_visible_qa_entries_table.ts`, `types.ts:QaEntries`, and
`add_manual_qa_reducer.ts` in the same positions the codegen would produce,
plus a structural test that parses both the server schema and the generated
bindings and asserts field/arg orders match. The test fails on the prior
state with the exact missing-`requestId` diff and passes after the patch,
catching any future drift in CI.

Co-authored-by: Cursor <cursoragent@cursor.com>
@mantrakp04
Collaborator

@greptile-ai please review

Addressed the remaining 2 issues from your latest summary in 1a8e38e:

  • my_visible_qa_entries_table.ts / types.ts:QaEntries — added requestId: __t.option(__t.string()) at column 4, matching the server's qa_entries schema. This unshifts every subsequent BSATN-decoded column for the Knowledge Base subscription.
  • add_manual_qa_reducer.ts — added requestId: __t.string() as the 6th arg, matching the server reducer signature. The backend already passes 6 args via callReducerStrict, so this only realigns the binding's documented contract.

Also added apps/backend/src/lib/ai/spacetimedb-bindings-sync.test.ts — a structural drift test that parses the server schema source and the generated bindings, then asserts column/arg orders match. Fails on the prior state with the exact missing-requestId diff, passes after the patch, and will catch any future drift in CI without requiring a live SpacetimeDB instance.
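
The core assertion of such a drift test can be sketched as an index-by-index comparison of two ordered field lists. This is a simplified sketch, not the contents of spacetimedb-bindings-sync.test.ts: it assumes the field names have already been extracted from the server schema and the generated bindings, and `findFieldOrderDrift` is a hypothetical helper name.

```typescript
interface FieldDrift {
  index: number;
  server: string | undefined; // undefined when the server list is shorter
  client: string | undefined; // undefined when the client list is shorter
}

// Compares two ordered field lists position by position and reports every mismatch.
function findFieldOrderDrift(server: string[], client: string[]): FieldDrift[] {
  const drifts: FieldDrift[] = [];
  const length = Math.max(server.length, client.length);
  for (let i = 0; i < length; i++) {
    if (server[i] !== client[i]) {
      drifts.push({ index: i, server: server[i], client: client[i] });
    }
  }
  return drifts;
}
```

Because the encoding is positional, comparing order (not just set membership) is the point: a binding that has all the right fields in the wrong order would still decode incorrectly. A test would assert that the returned list is empty and print the entries on failure.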


@cubic-dev-ai cubic-dev-ai Bot left a comment

4 issues found across 100 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/backend/src/app/api/latest/integrations/ai-proxy/[[...path]]/route.ts">

<violation number="1" location="apps/backend/src/app/api/latest/integrations/ai-proxy/[[...path]]/route.ts:55">
P2: The `passthrough()` fallback will produce an empty response if `observeAndLog` throws after consuming the response body. Once `response.arrayBuffer()` is called inside `observeAndLog`, the original `response.body` stream is exhausted, so the catch-block's `new Response(response.body, ...)` sends nothing to the client.</violation>
</file>
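
The failure mode in this violation can be modelled without any HTTP machinery. The sketch below uses a hypothetical `OneShotBody` class as a stand-in for a streamed response body that can be read only once; it is not the route's code. With the real Fetch API, the analogous remedies are to buffer the body once (for example via `arrayBuffer()`) before entering the try block, or to call `response.clone()` before the first read.

```typescript
// Stand-in for a streamed HTTP body: readable exactly once, then it yields nothing.
class OneShotBody {
  private consumed = false;
  private readonly data: string;

  constructor(data: string) {
    this.data = data;
  }

  read(): string {
    if (this.consumed) return "";
    this.consumed = true;
    return this.data;
  }
}

// Problematic shape: the body is read inside the try block, the logger throws,
// and the fallback re-reads a body that is already exhausted.
function respondNaive(body: OneShotBody, log: (text: string) => void): string {
  try {
    const text = body.read();
    log(text);
    return text;
  } catch {
    return body.read(); // already consumed, so the client receives ""
  }
}

// Safer shape: buffer once outside the try block and reuse the buffer either way.
function respondBuffered(body: OneShotBody, log: (text: string) => void): string {
  const text = body.read();
  try {
    log(text);
  } catch {
    // Logging failed; the client response is unaffected.
  }
  return text;
}
```

The trade-off of buffering is memory: the whole body is held before it is forwarded, which may matter for large or streaming responses.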

Partial review: This PR has more than 50 files, so cubic reviewed the highest-priority files first.

Comment thread apps/backend/src/lib/ai/loggers/ai-proxy-logger.ts Outdated
Comment thread apps/backend/src/lib/ai/tools/index.ts Outdated
Comment thread apps/backend/src/lib/ai/spacetimedb-client.ts Outdated
responseHeaders,
});
} catch (e) {
captureError("ai-proxy-log-pipeline", e);

@cubic-dev-ai cubic-dev-ai Bot May 23, 2026

P2: The passthrough() fallback will produce an empty response if observeAndLog throws after consuming the response body. Once response.arrayBuffer() is called inside observeAndLog, the original response.body stream is exhausted, so the catch-block's new Response(response.body, ...) sends nothing to the client.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/backend/src/app/api/latest/integrations/ai-proxy/[[...path]]/route.ts, line 55:

<comment>The `passthrough()` fallback will produce an empty response if `observeAndLog` throws after consuming the response body. Once `response.arrayBuffer()` is called inside `observeAndLog`, the original `response.body` stream is exhausted, so the catch-block's `new Response(response.body, ...)` sends nothing to the client.</comment>

<file context>
@@ -1,97 +1,60 @@
+      responseHeaders,
     });
+  } catch (e) {
+    captureError("ai-proxy-log-pipeline", e);
+    return passthrough();
   }
</file context>

mantrakp04 and others added 3 commits May 23, 2026 11:31
- Introduced unit tests for `observeAndLog` in `ai-proxy-handlers.test.ts` to ensure correct behavior when logging functions throw errors.
- Added tests for `callSql` in `spacetimedb-client.test.ts` to verify 401 enrollment retry logic and error handling.
- Created tests for `buildProxyLogRow` in `ai-proxy-logger.test.ts` to validate tool-name extraction from parsed logs.
- Implemented validation tests for tool names in `index.test.ts` to ensure consistency with defined tool names.

These additions enhance test coverage and reliability of the AI-related functionalities.

5 participants