FailproofAI · SiddarthAA · May 27, 2026 · May 27, 2026 · May 29, 2026 · May 31, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,10 +1,25 @@
 # Changelog
 
-## 0.0.11-beta.3 — 2026-05-25
+## 0.0.11-beta.3 — 2026-05-31
 
 ### Features
+- Add email-OTP auth wired to the Rust `failproof-api-server` (`/v0/auth/login/request`, `/login/verify`, `/token/refresh`, `/logout`, `/me`). New `failproofai auth --login | --logout | --whoami` CLI subcommand (`src/auth/cli.ts`, dispatched from `bin/failproofai.mjs`) persists tokens to `~/.failproofai/auth.json` at mode `0600` via a shared store (`lib/auth/auth-store.ts` + `lib/auth/api-server-client.ts`); the store auto-refreshes the access token within a 60s leeway window and treats refresh-token reuse / 401 as "wipe local session". Four Next.js API routes (`app/api/auth/{status,login-request,login-verify,logout}/route.ts`) proxy the same flow for the dashboard so the refresh token never reaches the browser — only `{authenticated, user}` does. The "set a reminder" CTA in `/audit`'s `return-section.tsx` now probes `/api/auth/status` on mount and, for un-authed visitors, opens a new `AuthDialog` (`app/audit/_components/auth-dialog.tsx`, styled to match the audit aesthetic: pink corner-glyphs, dashed-frame backdrop, terminal mono inputs, masked OTP entry, live resend countdown, ESC-to-close) that walks email → OTP → "you are <email>" inline; signed-in users get a green "signed in as …" pill under the CTA. Configurable via `FAILPROOF_API_URL` (defaults to `http://localhost:8080`) and `FAILPROOFAI_AUTH_DIR` (defaults to `~/.failproofai`).
+
+- `/audit` polish pass: simplify the "next audit" CTA to `[ install policies ]` copying the bare `failproofai policies --install` command (no longer appends per-slipping-policy short names); fix the `[ share → ]` header button to scroll to the Show-off section reliably by accounting for the sticky in-page `.app-header` height with a manual y-coord scroll + a `scroll-margin-top` fallback on `.showoff`; harden the "make poster" PNG export so the captured archetype frame no longer collides with the sigil / tagline — `show-off-cta.tsx` now `await document.fonts.ready` before capture, applies a `.capturing` class that locks every viewport-clamped font-size and grid column to an absolute value tuned for the 1100px capture width, drops `text-shadow` / `box-shadow` that html2canvas crops unpredictably, and captures with a 12px bleed on each side so the frame's corner accents and box-shadow survive the crop; and expand every archetype in `src/audit/archetypes.ts` from a single hand-written copy block to a multi-variant catalog (4–6 taglines, keyword sets, descriptions, signature blocks, "common in" / "primary risk" / closing lines per archetype, all 8 archetypes covered). A new `pickArchetypeVariant(key, seed)` picker deterministically selects one variant from each list via a djb2-seeded per-field hash mixed with a per-field axis, so the persona blurb stays stable across renders for a given seed but two different projects landing on the same archetype see different copy. `IdentitySection` consumes the resolved variant; the seed flows in from `audit-dashboard.tsx` as the inferred project name.
+
+- Add an in-app `/audit` dashboard that turns the existing `failproofai audit` data into a personality-driven report. The page classifies every audited agent into one of 8 archetypes (`optimist`, `cowboy`, `explorer`, `goldfish`, `paranoid architect`, `precision builder`, `hammer`, `ghost`) via a weighted classifier (`src/audit/archetypes.ts`) that maps every builtin policy + every audit-only detector (47/47 coverage) to an archetype with a tuned weight. A scoring module (`src/audit/scoring.ts`) derives a 0-100 score with S/A/B/C/D/F grade thresholds, a projected-score uplift if every recommended policy were enabled, and a stable synthetic cohort rank. The page composes seven sections — Identity (archetype hero with 8x8 pixel sigil + meta grid), Show-off CTA, Strengths (real numbers derived from the audit), Score + cohort leaderboard with distribution histogram, Findings (per-policy cards with what happened / cost / evidence / fix), Prescribed Policies (with projected-score callout), and a "re-audit in 7 days" return loop. Every audit-only detector is now mapped to its closest real-time builtin policy as the prescribed fix (`findings.ts:DETECTOR_TO_POLICY`) so the report never carries an "audit-only — no real-time policy" framing. New dashboard cache at `~/.failproofai/audit-dashboard.json` (mode `0600`, single slot, helper at `src/audit/dashboard-cache.ts`); `AuditResult` schema bumped to version 2 with new fields `eventsScanned`, `projectsScanned`, `enabledBuiltinNames`. New routes `app/audit/page.tsx`, `app/api/audit/run/route.ts` (POST, in-process `runAudit()` call, module-scoped run lock that 409s on concurrent clicks), `app/api/audit/status/route.ts` (GET, drives client polling), and server action `app/actions/get-audit-result.ts` (cache read, mirrors `getHooksConfigAction`'s read-only contract). "Make poster" downloads a 2x PNG of the archetype frame via html2canvas. Navbar gains an Audit entry between Policies and Projects with a slipping-through count chip. Existing runtime policy enforcement is untouched — `policy-registry.ts` gets two additive exports (`getAllPolicies` / `setAllPolicies`) used only by the new `replay.ts:restoreReplay()` snapshot/restore so embedding `runAudit()` in a long-running process no longer wipes pre-existing registrations. Ports the brand team's design kit verbatim from `assets/audit/styles.css` (1235 lines, JetBrains Mono + VT323 via Google Fonts, Architype Stedelijk shipped locally under `public/audit/fonts/`).
+
 - Stamp `product: "failproofai-oss"` on every PostHog event across all four telemetry channels — hooks/audit (`trackHookEvent`), server (`trackEvent`), web UI (`captureClientEvent`), and npm-lifecycle install/uninstall (`trackInstallEvent`) — so OSS events stay distinguishable from any future hosted surface. The value lives in a single `POSTHOG_PRODUCT` constant in `src/posthog-key.ts`, reused by the three TypeScript channels; the standalone `scripts/install-telemetry.mjs` inlines the same literal because it can't import the TS module at install time. Honors `FAILPROOFAI_TELEMETRY_DISABLED=1` like all other telemetry (#380).
 
+- Polish pass across `/audit`, `/policies`, and `/projects`: bump base font from `13px → 14.5px` and widen `.report` from `1180px → 1380px` (with `40px` side padding) in `globals.css` so default-zoom readability stops requiring a browser zoom-in; restore `.section` vertical padding to `64px` to match the audit reference. Remove the second in-page audit `<header>` (`app/audit/_components/app-header.tsx` deleted) and all three of its mount sites in `audit-dashboard.tsx` — the global navbar plus per-section masts cover the same chrome without the duplicate `failproof_ai / AUDIT [share →]` strip. Rewrite `score-section.tsx` end-to-end: drop the synthetic cohort leaderboard and replace with a single dashed-frame `.panel` (the new `.score-share-card`) split into two columns — left is the audit score (big tier-colored number, tier badge, progress bar to the next grade band, three stat boxes for missing policies / pts-to-next / est. days-to-fix, plus a top-N policy-status chip strip), right is share (pre-written X / Twitter and LinkedIn templates derived from `score + archetype + missing-count`, `[share on X]`, `[share on LinkedIn]`, and `[download audit card]` that html2canvas-captures the whole panel as a PNG named `failproofai-card-<grade>-<score>.png`). `audit-dashboard.tsx` drops the now-unused `syntheticRank` import / `rank` prop and threads `result` into the score section. Replace `empty-state.tsx` and `run-progress.tsx` with audit-pixel-craft versions: a `.empty-panel` with a pixel-grid sigil, Architype Stedelijk headline, and `.btn-press` CTA replaces the shadcn `Button` + `lucide-react` icon center-card; the running view becomes a terminal-style `.running-panel` (`$ failproofai audit --since 30d ▮` header with a blinking pink cursor, stage list with `✓` / `▮▮` / `○` markers and a per-stage braille spinner, and a marquee `audit-bar-fill` progress bar). Persistent **next-audit reminder** added — new `~/.failproofai/next-audit.json` (mode 0600, separate file from `auth.json` so the reminder is independent of token refresh), new `lib/auth/auth-store.ts` helpers (`readReminder` / `writeReminder` / `deleteReminder` / `getReminderFilePath` + `StoredReminder` type), new `app/api/auth/reminder/route.ts` (GET / POST / DELETE, defaults to a 7-day offset, scoped to the active session so a reminder for `a@x.com` is invisible to a CLI-authed `b@x.com`), and `/api/auth/status` now returns `reminder: { next_audit_at, user_email, set_at } | null` alongside the user. `return-section.tsx` flips behavior accordingly: signed in + reminder set → status panel ("next audit set for `<Mon Jun 8> · in 7 days`" + "signed in as `<email>`" + a `[re-audit now]` button next to `[install policies]` and a tiny "clear reminder" link); anon → `[set a reminder]` opens the existing AuthDialog and on successful sign-in writes the reminder automatically; signed in + no reminder → `[set a reminder]` writes it directly with no dialog. The `[re-audit now]` button (also shown to anon users with audit data) reuses the existing `triggerRun` poller and reloads the page once the run completes. No new dependencies; the deleted `app-header.tsx` was a 38-line component with no callers other than the three audit-dashboard mounts.
+
+- Unify the dashboard design system around the brutalist pixel-craft aesthetic that previously lived only in `/audit`. The audit token set (`--bg`, `--ink`, `--accent-pink`, `--accent-green`, `--font-mono` → JetBrains Mono, `--font-display` → Architype Stedelijk / VT323) is now declared once in `app/globals.css`, and every shadcn-style Tailwind alias (`--background`, `--card`, `--foreground`, `--primary`, `--border`, `--radius: 0`, …) is repointed at the audit palette so existing utility classes like `bg-card` / `text-foreground` / `border-border` produce audit visuals across the whole app without rewriting any component markup. The `:root` block, body cross-hatch + grain overlays, JetBrains Mono import, and all canonical chrome classes (`.app-header`, `.h-brand*`, `.btn`, `.btn-press`, `.tabs`, `.tab`, `.section`, `.section-mast`, `.section-h`, `.report`, plus a new reusable `.panel` with pink corner brackets) are promoted to `globals.css`. `app/audit/audit-styles.css` keeps only the audit-page-only widgets (archetype frame, sigil, score grade, leaderboard, findings cards, return hook, auth dialog), so the styles loaded specifically by `/audit` no longer leak into `/policies` or `/projects` on client-side navigation. `app/layout.tsx` drops the `next/font/google` Geist Mono import — fonts now ship via the single CSS `@import url('…JetBrains+Mono…')` in `globals.css`. `components/navbar.tsx` is rewritten around `.app-header` with the pink `▮▮` mark, lowercase Architype wordmark, optional version chip, a current-section eyebrow, and `.tab` links with sharp pink underline on the active route (lucide icons in the bar removed). `app/projects/page.tsx` and its `loading.tsx` are wrapped in the `.report` + `.section` + `.panel` chrome with a green-eyebrow masthead and "your agent footprint." section heading; the inner `ProjectList` component is unchanged and picks up the unified palette automatically. `app/policies/hooks-client.tsx` swaps its outer `<div className="min-h-screen bg-background…">` for a `.report` + `.section` shell with audit masthead copy ("what your agents tried." / "what to stop them doing."), replaces the rounded-pill `TabBar` with the global `.tabs` / `.tab` underline tabs, and drops the now-redundant "Back to /projects" link (the new navbar covers cross-page navigation). No functional changes — all 1701 tests pass and the production `next build` succeeds.
+
+### Docs
+- Extend `docs/cli/auth.mdx` with a "Persistent re-audit reminder" section covering the new `~/.failproofai/next-audit.json` file and the `GET / POST / DELETE /api/auth/reminder` dashboard endpoint that backs the `/audit` `[ set a reminder ]` CTA — including the file shape, the per-email scoping rule, and the 7-day default offset.
+
+- Document the new `failproofai auth --login | --logout | --whoami` subcommand in a dedicated `docs/cli/auth.mdx` page (mirrors the style of `cli/audit.mdx`: usage block, sign-in / sign-out / whoami sections, on-disk `auth.json` shape, env-var table, and a short troubleshooting list for the common `Could not reach the api-server` / `Rate limited` / `Code rejected` cases). Add an Authentication section to `docs/cli/environment-variables.mdx` covering `FAILPROOF_API_URL` (override the api-server base URL) and `FAILPROOFAI_AUTH_DIR` (override where `auth.json` is stored). i18n mirrors left for the translation-sync workflow.
+
 ## 0.0.11-beta.2 — 2026-05-21
 
 ### Features
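
The changelog entry for `pickArchetypeVariant(key, seed)` describes a djb2-seeded hash mixed with a per-field axis. A minimal sketch of that idea follows; it is not the code in `src/audit/archetypes.ts`. The names `djb2`, `pickIndex`, `pickVariant`, and the two-field catalog shape are hypothetical, and the way the seed and axis are combined (string concatenation) is an assumption.

```typescript
// Hypothetical sketch: names and catalog shape are illustrative only.

// Classic djb2 string hash (hash * 33 + charCode), kept in unsigned 32-bit range.
function djb2(input: string): number {
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    hash = ((hash << 5) + hash + input.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Mix the seed with a per-field axis so each field is picked independently.
// Assumes length > 0.
function pickIndex(seed: string, axis: string, length: number): number {
  return djb2(`${seed}:${axis}`) % length;
}

interface ArchetypeCatalog {
  taglines: string[];
  descriptions: string[];
}

interface ArchetypeVariant {
  tagline: string;
  description: string;
}

// Same seed always yields the same variant; different seeds may differ.
function pickVariant(catalog: ArchetypeCatalog, seed: string): ArchetypeVariant {
  return {
    tagline: catalog.taglines[pickIndex(seed, "tagline", catalog.taglines.length)],
    description:
      catalog.descriptions[pickIndex(seed, "description", catalog.descriptions.length)],
  };
}
```

Because the index depends only on the seed and the field name, re-rendering with the same project name returns the same copy, which matches the stability property the changelog claims.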

diff --git a/__tests__/audit/dashboard-cache.test.ts b/__tests__/audit/dashboard-cache.test.ts
@@ -0,0 +1,95 @@
+// @vitest-environment node
+import { describe, it, expect, beforeEach, afterEach } from "vitest";
+import { existsSync, mkdirSync, mkdtempSync, rmSync, writeFileSync, statSync } from "node:fs";
+import { tmpdir } from "node:os";
+import { join } from "node:path";
+import {
+  readDashboardCache,
+  writeDashboardCache,
+  isCacheStale,
+} from "../../src/audit/dashboard-cache";
+import type { AuditResult } from "../../src/audit/types";
+
+const FAKE_RESULT: AuditResult = {
+  version: 2,
+  scannedAt: "2026-05-26T00:00:00.000Z",
+  scope: { cli: ["claude"], projects: "all", since: null },
+  transcripts: { scanned: 5, skipped: 0, errors: 0, durationMs: 100 },
+  results: [],
+  totals: { hits: 0, projectsWithHits: 0 },
+  projectsScanned: ["/home/u/a", "/home/u/b"],
+  eventsScanned: 42,
+  enabledBuiltinNames: ["block-failproofai-commands"],
+};
+
+describe("dashboard cache", () => {
+  let tmpHome: string;
+  let originalHome: string | undefined;
+
+  beforeEach(() => {
+    // Redirect homedir() to a tmp directory by overriding HOME — os.homedir()
+    // reads it on every call on POSIX, so the dashboard-cache module sees
+    // our tmp path without needing module mocks.
+    tmpHome = mkdtempSync(join(tmpdir(), "fpa-audit-cache-test-"));
+    originalHome = process.env.HOME;
+    process.env.HOME = tmpHome;
+  });
+
+  afterEach(() => {
+    if (originalHome === undefined) delete process.env.HOME;
+    else process.env.HOME = originalHome;
+    try { rmSync(tmpHome, { recursive: true, force: true }); } catch { /* ignore */ }
+  });
+
+  it("returns null when no cache file exists", () => {
+    expect(readDashboardCache()).toBeNull();
+  });
+
+  it("round-trips a written entry", () => {
+    writeDashboardCache({ since: "7d" }, FAKE_RESULT);
+    const entry = readDashboardCache();
+    expect(entry).not.toBeNull();
+    expect(entry?.params).toEqual({ since: "7d" });
+    expect(entry?.result.transcripts.scanned).toBe(5);
+    expect(entry?.result.projectsScanned).toEqual(["/home/u/a", "/home/u/b"]);
+    expect(typeof entry?.cachedAt).toBe("string");
+  });
+
+  it("writes mode 0600 on the file", () => {
+    writeDashboardCache({}, FAKE_RESULT);
+    const cachePath = join(tmpHome, ".failproofai", "audit-dashboard.json");
+    expect(existsSync(cachePath)).toBe(true);
+    const mode = statSync(cachePath).mode & 0o777;
+    // Some filesystems (FAT, etc.) can't honor mode bits perfectly — just
+    // assert no world-readable bit is set.
+    expect(mode & 0o004).toBe(0);
+  });
+
+  it("returns null for a corrupt JSON cache file", () => {
+    const dir = join(tmpHome, ".failproofai");
+    mkdirSync(dir, { recursive: true });
+    writeFileSync(join(dir, "audit-dashboard.json"), "{ not json", "utf-8");
+    expect(readDashboardCache()).toBeNull();
+  });
+
+  it("returns null when shape is wrong", () => {
+    const dir = join(tmpHome, ".failproofai");
+    mkdirSync(dir, { recursive: true });
+    writeFileSync(join(dir, "audit-dashboard.json"), JSON.stringify({ foo: 1 }), "utf-8");
+    expect(readDashboardCache()).toBeNull();
+  });
+
+  it("isCacheStale returns true past the threshold", () => {
+    const old = new Date(Date.now() - 60 * 60_000).toISOString(); // 1 hour ago
+    expect(isCacheStale(old, 30)).toBe(true);
+  });
+
+  it("isCacheStale returns false within the threshold", () => {
+    const recent = new Date(Date.now() - 10 * 60_000).toISOString(); // 10 min ago
+    expect(isCacheStale(recent, 30)).toBe(false);
+  });
+
+  it("isCacheStale treats unparseable timestamps as stale", () => {
+    expect(isCacheStale("not-a-date")).toBe(true);
+  });
+});
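
The three `isCacheStale` tests above pin down a small contract: a timestamp older than the threshold is stale, a newer one is not, and an unparseable one is treated as stale. A sketch of a function that satisfies that contract is shown here. The real helper in `src/audit/dashboard-cache.ts` is not part of this excerpt, so the name `isCacheStaleSketch`, the parameter names, the minutes unit, and the 30-minute default are all assumptions inferred from the test inputs.

```typescript
// Hypothetical sketch; the real helper may differ in name, units, and default.
function isCacheStaleSketch(cachedAt: string, maxAgeMinutes: number = 30): boolean {
  const cachedMs = Date.parse(cachedAt);
  // Unparseable timestamps yield NaN; treat them as stale rather than trusting them.
  if (Number.isNaN(cachedMs)) return true;
  return Date.now() - cachedMs > maxAgeMinutes * 60_000;
}
```

Failing closed on a bad timestamp means a corrupted `cachedAt` prompts a fresh audit instead of serving arbitrarily old data.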
diff --git a/__tests__/audit/replay.test.ts b/__tests__/audit/replay.test.ts
@@ -1,6 +1,12 @@
 // @vitest-environment node
 import { describe, it, expect, beforeEach } from "vitest";
-import { resetReplay, replayEvent } from "../../src/audit/replay";
+import { resetReplay, replayEvent, initReplay, restoreReplay } from "../../src/audit/replay";
+import {
+  clearPolicies,
+  getAllPolicies,
+  registerPolicy,
+} from "../../src/hooks/policy-registry";
+import { allow } from "../../src/hooks/policy-helpers";
 import type { NormalizedToolEvent } from "../../src/audit/types";
 
 function bash(command: string): NormalizedToolEvent {
@@ -50,3 +56,48 @@ describe("replay engine", () => {
     expect(hits.some((h) => h.eventType === "PostToolUse")).toBe(true);
   });
 });
+
+describe("replay registry snapshot/restore", () => {
+  beforeEach(() => {
+    resetReplay();
+    clearPolicies();
+  });
+
+  it("restoreReplay puts back the pre-init registry", () => {
+    registerPolicy(
+      "test/custom-marker",
+      "test policy",
+      async () => allow(),
+      { events: ["PreToolUse"] },
+    );
+    const before = getAllPolicies().map((p) => p.name).sort();
+    expect(before).toContain("test/custom-marker");
+
+    initReplay();
+    const duringInit = getAllPolicies().map((p) => p.name);
+    expect(duringInit).not.toContain("test/custom-marker");
+    expect(duringInit.length).toBeGreaterThan(10); // builtins are loaded
+
+    restoreReplay();
+    const after = getAllPolicies().map((p) => p.name).sort();
+    expect(after).toEqual(before);
+  });
+
+  it("restoreReplay is idempotent when called twice", () => {
+    registerPolicy(
+      "test/another-marker",
+      "test policy",
+      async () => allow(),
+      { events: ["PreToolUse"] },
+    );
+    initReplay();
+    restoreReplay();
+    restoreReplay(); // second call should be a no-op
+    expect(getAllPolicies().map((p) => p.name)).toContain("test/another-marker");
+  });
+
+  it("restoreReplay before initReplay is a no-op", () => {
+    expect(() => restoreReplay()).not.toThrow();
+    expect(getAllPolicies()).toEqual([]);
+  });
+});
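
The snapshot/restore behaviour these tests exercise can be illustrated with a small in-memory registry. This is a sketch under stated assumptions, not the code in `src/audit/replay.ts` or `src/hooks/policy-registry.ts`: the `Policy` shape, the `getAll` / `setAll` helpers, and both `*Sketch` functions are hypothetical stand-ins, and taking the snapshot only on the first init is an assumption.

```typescript
// Hypothetical sketch of a snapshot/restore wrapper around a policy registry.
interface Policy {
  name: string;
}

let registry: Policy[] = [];
let snapshot: Policy[] | null = null;

// Return and accept copies so callers cannot mutate the registry in place.
const getAll = (): Policy[] => [...registry];
const setAll = (policies: Policy[]): void => {
  registry = [...policies];
};

function initReplaySketch(builtins: Policy[]): void {
  // Snapshot only once so a repeated init does not overwrite the original state.
  if (snapshot === null) snapshot = getAll();
  setAll(builtins);
}

function restoreReplaySketch(): void {
  // No snapshot means init never ran (or restore already ran): do nothing.
  if (snapshot === null) return;
  setAll(snapshot);
  snapshot = null;
}
```

Clearing the snapshot after restoring is what makes a second `restoreReplaySketch()` call a no-op, and the null check covers the restore-before-init case.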
diff --git a/app/actions/get-audit-result.ts b/app/actions/get-audit-result.ts
@@ -0,0 +1,24 @@
+"use server";
+
+import { readDashboardCache } from "@/src/audit/dashboard-cache";
+import type { AuditResult, RunAuditOptions } from "@/src/audit/types";
+
+export type AuditResultPayload =
+  | { status: "cached"; cachedAt: string; params: RunAuditOptions; result: AuditResult }
+  | { status: "empty" };
+
+/**
+ * Read the dashboard cache. Never triggers a run — `/audit` shows the empty
+ * state when there's no cache and lets the user opt in to scanning. Mirrors
+ * the read-only ergonomics of `getHooksConfigAction()`.
+ */
+export async function getAuditResultAction(): Promise<AuditResultPayload> {
+  const entry = readDashboardCache();
+  if (!entry) return { status: "empty" };
+  return {
+    status: "cached",
+    cachedAt: entry.cachedAt,
+    params: entry.params,
+    result: entry.result,
+  };
+}
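
Because `AuditResultPayload` is a discriminated union on `status`, a caller can narrow it with a plain comparison before touching the cached fields. The sketch below uses a trimmed stand-in type so it is self-contained; `AuditResultPayloadSketch` and `describePayload` are hypothetical names, and the real `result` carries the full `AuditResult`.

```typescript
// Hypothetical, trimmed stand-in for the AuditResultPayload union.
type AuditResultPayloadSketch =
  | { status: "cached"; cachedAt: string; result: { eventsScanned: number } }
  | { status: "empty" };

function describePayload(payload: AuditResultPayloadSketch): string {
  // Checking the discriminant narrows the type; after this branch,
  // TypeScript knows payload has cachedAt and result.
  if (payload.status === "empty") return "no audit yet";
  return `cached at ${payload.cachedAt}: ${payload.result.eventsScanned} events scanned`;
}
```

A page component could use the same check to choose between the empty state and the dashboard without optional chaining on `result`.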