
Add capture event transport and server-side write classification #112

Open
jspahn80134 wants to merge 46 commits into bjones1:main from jspahn80134:main

Conversation

Contributor

@jspahn80134 jspahn80134 commented Apr 22, 2026

Summary:

  • Added privacy-conscious capture configuration with ignored local config, a safe capture_config.example.json, redacted config summaries, PostgreSQL capture writes, and JSONL fallback when database capture is unavailable.
  • Added the canonical capture schema/docs in server/scripts/capture_events_schema.sql and linked it from toc.md.
  • Defined typed capture columns for study analysis: event_id, sequence_number, schema_version, user_id, session_id, event_source, language_id, file_hash, event_type, timestamp, client_tz_offset_min, and event-specific data.
  • Reworked capture setup for students around consent and a status-bar toggle, with an auto-generated pseudonymous participant UUID instead of manual course/group/assignment/task fields.
  • Sends VS Code capture events over the existing IDE/server message path; the extension sends the local path only to the local Rust server, which hashes it before DB/fallback storage so raw paths are not persisted.
  • Added capture status UI/output handling and coordinated status/session shutdown across toggle, stop, and deactivate paths.
  • Added Rust-generated TypeScript capture/status types shared by the client and extension, with enum-backed capture event types and capture states.
  • Captures session/settings changes, saves, compile/run start/end, study lifecycle/handoff events, reflection prompt insertion, code paste markers, and server-generated write classification events.
  • Added server-side CodeChat translation classification for documentation vs. code writes, doc sessions, code/doc switches, observed paste writes, and non-paste external code insert candidates without storing source text.
  • Added the reflection prompt command/UI support while keeping study automation commands out of normal Command Palette contributions.
  • Added and expanded docs/comments for capture globals, data structures, event data keys, schema design, path privacy, and capture classification behavior.
  • Removed the temporary dissertation metrics/export utility from this PR scope and kept analysis/export work separate from the server write contract.
  • Stabilized browser/WebDriver tests, including serialized browser tests, improved timeout diagnostics, optional-message helpers, and a query(...).wait(...) Mocha result wait.
  • Updated dependency/config details needed for audit and CI checks, including the rustls advisory update and logging/config cleanup.
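
To make the path-privacy bullet above concrete, here is a minimal sketch of replacing a raw path with a hash before anything is persisted. All names here (`StoredEvent`, `hash_path`, `to_stored_event`, `to_jsonl_line`) are hypothetical and not taken from the PR. `DefaultHasher` is a non-cryptographic stand-in that keeps the example dependency-free; the summary does not say which hash the server actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stored form of a capture event: only a hash of the path is kept.
#[derive(Debug, PartialEq)]
struct StoredEvent {
    event_type: String,
    file_hash: String,
}

/// Hash a local path to a 16-digit hex string.
/// Assumption: DefaultHasher is an illustrative stand-in; a real deployment
/// would want a stable cryptographic hash instead.
fn hash_path(path: &str) -> String {
    let mut hasher = DefaultHasher::new();
    path.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Build the record that would be persisted; the raw path is dropped here.
fn to_stored_event(event_type: &str, local_path: &str) -> StoredEvent {
    StoredEvent {
        event_type: event_type.to_string(),
        file_hash: hash_path(local_path),
    }
}

/// Serialize one record as a JSONL line.
/// Hand-rolled for brevity; assumes the fields contain nothing that needs escaping.
fn to_jsonl_line(event: &StoredEvent) -> String {
    format!(
        "{{\"event_type\":\"{}\",\"file_hash\":\"{}\"}}",
        event.event_type, event.file_hash
    )
}

fn main() {
    let event = to_stored_event("code_paste", "/home/student/project/main.rs");
    let line = to_jsonl_line(&event);
    // The persisted line carries the hash, never the raw path.
    assert!(!line.contains("/home/student"));
    println!("{line}");
}
```

The point of the sketch is the boundary: the raw path exists only as an argument to `to_stored_event`, so neither a database write nor a JSONL fallback line built from `StoredEvent` can leak it.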

Validation:

  • cargo test export_bindings
  • cargo clippy --all-targets --all-features -- -Dwarnings
  • cargo test --lib capture_
  • cargo test --lib code_external_insert_classifier
  • Client: pnpm exec tsc -noEmit
  • VS Code extension: pnpm exec tsc -noEmit
  • VS Code extension: pnpm exec eslint src
  • Manual paste-capture smoke test confirmed a code_paste event appears in the capture DB from the Extension Development Host.

Update the ignored PostgreSQL integration test to assert the rich events schema columns and fix timestamp/JSONB parameter casts used by the capture insert path. Verified against the AWS PostgreSQL database with event_capture_inserts_rich_schema_event_into_db.
macOS CI occasionally delivered the loadfile acknowledgement just after the old two-second harness timeout. Increase the shared browser message wait to five seconds so test_client does not fail on that timing edge.
The first CI rerun passed the original test_client wait but exposed the same timing issue in test_client_updates while waiting for the autosave content update. Use the client response window as the shared browser test wait budget.
The overall browser tests share one WebDriver endpoint and were running concurrently inside the same test binary. This was causing test_client_updates to miss its autosave content update on CI, especially macOS/Safari. Guard the harness with a shared async mutex so each browser session runs in isolation.
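
The serialization described in the last commit message can be sketched as follows. This is an assumption-laden illustration, not the harness code: it uses `std::sync::Mutex` and OS threads in place of the async mutex and async tests the commit mentions, and `HARNESS_LOCK`, `run_browser_session`, and `peak_concurrency` are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;
use std::time::Duration;

/// Hypothetical process-wide lock standing in for the harness's shared async mutex.
static HARNESS_LOCK: Mutex<()> = Mutex::new(());
/// Number of simulated browser sessions currently running.
static ACTIVE: AtomicUsize = AtomicUsize::new(0);
/// Highest number of sessions ever observed running at once.
static MAX_ACTIVE: AtomicUsize = AtomicUsize::new(0);

/// Run one simulated browser session while holding the harness lock.
fn run_browser_session() {
    // A panicking test poisons the lock; recover the guard so later tests still run.
    let _guard = HARNESS_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    let now = ACTIVE.fetch_add(1, Ordering::SeqCst) + 1;
    MAX_ACTIVE.fetch_max(now, Ordering::SeqCst);
    thread::sleep(Duration::from_millis(10)); // pretend to drive the browser
    ACTIVE.fetch_sub(1, Ordering::SeqCst);
}

/// Start `n` sessions concurrently and return the peak number that overlapped.
fn peak_concurrency(n: usize) -> usize {
    let handles: Vec<_> = (0..n).map(|_| thread::spawn(run_browser_session)).collect();
    for handle in handles {
        handle.join().unwrap();
    }
    MAX_ACTIVE.load(Ordering::SeqCst)
}

fn main() {
    // All four sessions start at once, but the lock lets only one run at a time.
    println!("peak concurrent sessions: {}", peak_concurrency(4));
}
```

Because every session takes the lock before touching the shared endpoint, the observed peak stays at one even though the tests are launched together. In an async test binary, the same shape would use an async-aware mutex so that a waiting test does not block the runtime.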
Owner

@bjones1 bjones1 left a comment

Here are some initial comments on the PR, mainly questions -- I'd like to hear your thoughts. I'll continue to review.

Comment thread extensions/VSCode/src/extension.ts
Comment thread extensions/VSCode/src/extension.ts Outdated
Comment thread extensions/VSCode/src/extension.ts Outdated
Comment thread extensions/VSCode/src/extension.ts
Comment thread extensions/VSCode/src/extension.ts
Comment thread extensions/VSCode/src/extension.ts
Comment thread extensions/VSCode/src/extension.ts
Comment thread extensions/VSCode/src/extension.ts
Use generated Rust-backed capture wire/status types in the VS Code extension.

Restore the explanatory extension comments and the current-file update after LoadFile.

Keep study lifecycle commands available for automation while removing them from the Command Palette.
Resolve conflicts in the VS Code extension, translation capture path, and overall test harness.

Keep upstream CursorPosition/WebDriver updates while preserving capture instrumentation and serialized browser test timing.
Owner

@bjones1 bjones1 left a comment

More review.

Comment thread extensions/VSCode/src/extension.ts
Comment thread extensions/VSCode/src/extension.ts Outdated
Comment thread extensions/VSCode/src/extension.ts Outdated
Comment thread server/src/webserver.rs Outdated
Comment thread extensions/VSCode/src/extension.ts
Comment thread extensions/VSCode/src/extension.ts
Comment thread extensions/VSCode/src/extension.ts Outdated
Comment thread extensions/VSCode/src/extension.ts Outdated
Comment thread extensions/VSCode/src/extension.ts
Comment thread extensions/VSCode/src/extension.ts
Comment thread server/src/webserver.rs Outdated
Comment thread server/tests/overall_common/mod.rs
Comment thread server/tests/overall_common/mod.rs
Comment thread server/src/webserver.rs Outdated
Comment thread server/src/webserver.rs
Comment thread server/src/webserver.rs Outdated
Comment thread server/src/webserver.rs Outdated
Comment thread server/tests/overall_1.rs Outdated
Comment thread server/tests/overall_1.rs
Comment thread server/tests/overall_1.rs
Comment thread .gitignore
Comment thread .gitignore Outdated
Comment thread capture_config.example.json
Comment thread extensions/VSCode/package.json Outdated
Comment thread extensions/VSCode/package.json Outdated
Comment thread client/src/CodeMirror-integration.mts Outdated
Comment thread client/src/CodeMirror-integration.mts Outdated
Owner

@bjones1 bjones1 left a comment

Good progress!

If there's a discussion or a question you answer, don't resolve it -- this helps me find and read your responses. When everything's already resolved, it's hard for me to find/think about discussions.

Comment thread server/scripts/capture_events_schema.sql
Comment thread server/src/webserver.rs Outdated
Comment thread server/src/capture.rs Outdated
Comment thread server/src/capture.rs Outdated
Comment thread server/src/capture.rs Outdated
Comment thread server/src/capture.rs Outdated
Comment thread server/src/capture.rs
Comment thread server/src/capture.rs Outdated
Comment thread server/src/capture.rs Outdated
Comment thread server/src/capture.rs
Add a code_external_insert_candidate capture event for code edits that look non-incremental but were not observed as paste operations.

The classifier records only coarse metadata: basis, confidence, size band, block kind, source, and classification basis. Paste markers continue to take precedence so a single edit is not double-counted as both paste and heuristic external insertion.

Include targeted code comments and unit coverage for multi-line, small single-line, and large-block classifier behavior.
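
A minimal sketch of the precedence rule described in this commit message, where an observed paste wins over the heuristic and only coarse metadata is kept. The types, field names, and size thresholds below are hypothetical choices for illustration; the PR text does not give the classifier's actual bands or rules.

```rust
/// Coarse size band for an inserted block (thresholds are illustrative assumptions).
#[derive(Debug, PartialEq, Clone, Copy)]
enum SizeBand {
    Small,
    Medium,
    Large,
}

/// Hypothetical classification result; note that it carries no source text.
#[derive(Debug, PartialEq)]
enum WriteClass {
    /// A paste marker was observed for this edit.
    Paste,
    /// Looks non-incremental, but no paste was observed.
    ExternalInsertCandidate { size_band: SizeBand, multi_line: bool },
    /// Ordinary incremental typing; no extra event.
    Incremental,
}

/// Hypothetical view of one code edit as seen by the classifier.
struct CodeEdit<'a> {
    inserted: &'a str,
    paste_observed: bool,
}

fn size_band(chars: usize) -> SizeBand {
    match chars {
        0..=39 => SizeBand::Small,
        40..=399 => SizeBand::Medium,
        _ => SizeBand::Large,
    }
}

fn classify(edit: &CodeEdit) -> WriteClass {
    // Paste markers take precedence so one edit is never counted twice.
    if edit.paste_observed {
        return WriteClass::Paste;
    }
    let chars = edit.inserted.chars().count();
    // Count only non-blank lines so Enter plus auto-indent is not "multi-line".
    let non_blank = edit.inserted.lines().filter(|l| !l.trim().is_empty()).count();
    let multi_line = non_blank > 1;
    let band = size_band(chars);
    if multi_line || band != SizeBand::Small {
        WriteClass::ExternalInsertCandidate { size_band: band, multi_line }
    } else {
        WriteClass::Incremental
    }
}

fn main() {
    let snippet = "fn f() {\n    g();\n}";
    let typed = CodeEdit { inserted: "x", paste_observed: false };
    let block = CodeEdit { inserted: snippet, paste_observed: false };
    let pasted = CodeEdit { inserted: snippet, paste_observed: true };
    println!("{:?}", classify(&typed));
    println!("{:?}", classify(&block));
    println!("{:?}", classify(&pasted));
}
```

The same inserted text yields `Paste` when a marker was observed and a candidate event otherwise, which is the double-counting guard the commit message describes.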
Comment thread server/src/webserver.rs Outdated
Comment thread server/src/webserver.rs Outdated
Comment thread server/src/translation.rs Outdated
Comment thread server/src/capture.rs
Comment thread server/scripts/capture_events_schema.sql
Comment thread server/tests/overall_1.rs Outdated
Comment thread server/src/capture.rs
Comment thread server/src/translation.rs Outdated
Comment thread server/src/translation.rs
Comment thread server/src/translation.rs Outdated