feat(collection-browse): tabular collection browse view behind feature flag #1780
Merged
…owse flag
Adds an unranked, paginated listing of a collection alongside the existing search experience.
Backend reuses VespaVectorDB.filter_search() + count(), forces chunk_index = 0 so each source entity shows up as one row, and gates the endpoint on the new COLLECTION_BROWSE org feature flag.
Frontend adds a BrowseTable component (rows + source filter + offset/limit pagination + row drawer) wired into CollectionDetailView as a Tabs sibling to Search, gated on the same flag via the existing organization store hasFeature helper.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
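The backend flow described in this commit (filter to `chunk_index = 0`, then page with offset/limit while also counting) can be sketched as follows. This is a minimal illustration, not the PR's code: `FakeVectorDB`, `Chunk`, `browse`, and the method signatures are hypothetical stand-ins, and the real `VespaVectorDB.filter_search()` and `count()` signatures are not shown in this conversation. Running the two calls concurrently follows the cubic summary, which describes them as issued "in parallel".

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Chunk:
    entity_id: str
    name: str
    chunk_index: int


class FakeVectorDB:
    """Hypothetical in-memory stand-in for the vector DB adapter."""

    def __init__(self, chunks: List[Chunk]) -> None:
        self._chunks = chunks

    def _anchored(self) -> List[Chunk]:
        # Keeping only chunk_index == 0 yields one row per source entity.
        return [c for c in self._chunks if c.chunk_index == 0]

    async def filter_search(self, offset: int, limit: int) -> List[Chunk]:
        # Unranked listing: plain offset/limit slice over the filtered rows.
        return self._anchored()[offset:offset + limit]

    async def count(self) -> int:
        return len(self._anchored())


async def browse(db: FakeVectorDB, offset: int = 0, limit: int = 2) -> Dict:
    # Fetch the page and the total concurrently.
    rows, total = await asyncio.gather(
        db.filter_search(offset, limit),
        db.count(),
    )
    return {"rows": [r.entity_id for r in rows], "total": total}


CHUNKS = [
    Chunk("a", "Doc A", 0),
    Chunk("a", "Doc A", 1),  # second chunk of the same entity, filtered out
    Chunk("b", "Doc B", 0),
    Chunk("c", "Doc C", 0),
]

first_page = asyncio.run(browse(FakeVectorDB(CHUNKS), offset=0, limit=2))
second_page = asyncio.run(browse(FakeVectorDB(CHUNKS), offset=2, limit=2))
```

In this sketch entity `a` has two chunks but appears once, so the total is 3 rather than 4.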
- Debounced (250 ms) name-contains search bar with abort-on-change so a fast typist doesn't backlog requests. Sends a `name contains <q>` FilterGroup to the existing /browse endpoint; no backend change required.
- Export menu: CSV or JSON, scoped to current page or all matching (capped at 1000 rows to stay within the offset/limit window). All-matching mode re-issues a single /browse call with the active filters.
- Visual polish: tighter toolbar layout with right-aligned counts + export, bg-card on the table, sticky drawer header, copy-to-clipboard on entity_id, tabular-nums for counts, fixed column widths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug: typing "Qui" returned 0 results when "Quick process to debug" was in the
collection. Root cause: Vespa's `contains` operator on attribute fields is a
token (whole-word) match, not a substring match — and `name` is indexed as
`attribute | summary` only. So `name contains 'Qui'` wanted the literal
token "Qui", not a substring.
Fix: add a dedicated `name_substring` path on the vector-db protocol that
emits `name matches "(?i).*<escaped>.*"` instead of going through the
shared FilterCondition translator. Special regex chars are escaped via
`re.escape` so a query like "1.5+" doesn't blow up the engine. The browse
request schema gets a corresponding `name_query` field, and the frontend
sends `{ name_query }` instead of building a (broken) `filter` group.
`contains` semantics elsewhere in the system (search filters, agentic
navigate tools) are unchanged — only the browse path uses the new clause.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
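The difference between the two matching modes in this commit message can be illustrated in plain Python. This is only an analogy under stated assumptions: `token_match` approximates a whole-word match by splitting on whitespace, and Python's `re` module stands in for the engine that evaluates `matches`; neither is Vespa's actual behaviour in every detail.

```python
import re


def token_match(name: str, query: str) -> bool:
    # Rough stand-in for a whole-token match: the query must equal a full word.
    return query.lower() in name.lower().split()


def substring_match(name: str, query: str) -> bool:
    # Same pattern shape as the commit message: (?i).*<escaped>.*
    pattern = "(?i).*" + re.escape(query) + ".*"
    return re.fullmatch(pattern, name) is not None


NAME = "Quick process to debug"

token_hit = token_match(NAME, "Qui")          # no whole word equals "qui"
substring_hit = substring_match(NAME, "Qui")  # "Qui" occurs inside "Quick"

# re.escape turns "1.5+" into a literal, so "." and "+" lose their regex meaning.
literal_hit = substring_match("Release 1.5+ notes", "1.5+")
literal_miss = substring_match("build 1755", "1.5+")
```

Without `re.escape`, the pattern `1.5+` would read as "1, any character, one or more 5s" and would match `build 1755`.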
…ixture
The new BrowseService field on Container caused all tests using the test_container fixture to error with "missing 1 required positional argument: 'browse_service'". Adds a FakeBrowseService matching the existing instant/classic/agentic fake pattern and threads it through conftest.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
2 issues found across 16 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/airweave/domains/search/adapters/vector_db/vespa_client.py">
<violation number="1" location="backend/airweave/domains/search/adapters/vector_db/vespa_client.py:316">
P1: Escape double quotes in `name_substring` before interpolating into the double-quoted YQL regex literal.</violation>
</file>
<file name="backend/airweave/api/v1/endpoints/search.py">
<violation number="1" location="backend/airweave/api/v1/endpoints/search.py:86">
P1: The browse endpoint path is missing the `/search` segment, so it won’t be reachable at the documented `/.../search/browse` URL.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
…tring YQL
CI's diff-cover (80% threshold) failed at 52% on this branch. Adds unit tests for the three new code paths:
- BrowseService: happy path, 404 on missing collection, chunk_index=0 anchor, sync_ids/entity_types convenience filters, user-filter combination semantics, name_query trim-to-None behavior.
- browse_collection endpoint: feature-flag gate (returns 404, skips usage check + service), happy path through usage check + service.
- VespaVectorDB._build_name_substring_clause: regex/YQL char escaping, plus presence/absence of the `matches` clause in count() and filter_search() YQL.
Brings diff coverage to 98% (only the prod-only `BrowseService(...)` constructor call in factory.py is still uncovered, which would require running `create_container` end-to-end).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e prefix
Two issues identified by cubic on PR #1780:
1. Move endpoint from `/{readable_id}/browse` to `/{readable_id}/search/browse` for consistency with sibling tiers (instant/classic/agentic, plus the admin `as-user` variants, all of which live under `/search/<tier>`). Frontend call site updated to match.
2. Escape double quotes in `_build_name_substring_clause`. The YQL string literal is double-quoted, and `re.escape` does not touch `"` (it's not a regex metacharacter), so a name_query like `Quick "test"` produced a broken YQL parse (and a potential YQL-injection vector). Switched the second-pass escape from single quote to double quote and added a test that the resulting clause has exactly two unescaped `"` (the literal delimiters).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
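The two-pass escape from item 2 might look roughly like the sketch below. It is a standalone approximation, not the actual `_build_name_substring_clause` method: the function names are hypothetical, and any further escaping the YQL string layer may require (for example of backslashes) is not described in this conversation and is not handled here.

```python
import re


def build_name_substring_clause(name_query: str) -> str:
    # Pass 1: neutralise regex metacharacters. re.escape leaves '"' untouched.
    escaped = re.escape(name_query)
    # Pass 2: escape double quotes so they cannot close the YQL string literal.
    escaped = escaped.replace('"', '\\"')
    return f'name matches "(?i).*{escaped}.*"'


def count_unescaped_quotes(text: str) -> int:
    # A quote counts as unescaped when an even number of backslashes precedes it.
    count = 0
    backslashes = 0
    for ch in text:
        if ch == "\\":
            backslashes += 1
            continue
        if ch == '"' and backslashes % 2 == 0:
            count += 1
        backslashes = 0
    return count


clause = build_name_substring_clause('Quick "test"')
delimiters = count_unescaped_quotes(clause)
```

The check mirrors the test described in the commit message: only the two delimiters of the string literal should remain unescaped, whatever the user typed.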
hiddeco reviewed May 12, 2026
…vel imports, export chunking
- BrowseRequest: cap sync_ids/entity_types at 100, require name_query min length 2 to avoid full-scan triggers
- Move BrowseResponse import to module level in browse service and fake
- Frontend: gate name search on >=2 chars with hint, loop export in 200-row chunks to respect backend limit
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
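The chunked export loop amounts to walking offset/limit windows until either the matching rows or the export cap run out. The actual loop lives in the frontend; the sketch below expresses the same arithmetic in Python for illustration. `export_windows` is a hypothetical name, and the 200-row page limit and 1000-row cap are taken from the commit messages in this PR.

```python
from typing import Iterator, Tuple

PAGE_LIMIT = 200   # backend page limit mentioned in this commit
EXPORT_CAP = 1000  # "all matching" export cap from the earlier commit


def export_windows(
    total_matching: int,
    page_limit: int = PAGE_LIMIT,
    cap: int = EXPORT_CAP,
) -> Iterator[Tuple[int, int]]:
    """Yield (offset, limit) pairs covering at most `cap` rows."""
    remaining = min(total_matching, cap)
    offset = 0
    while remaining > 0:
        limit = min(page_limit, remaining)
        yield offset, limit
        offset += limit
        remaining -= limit


small = list(export_windows(450))
capped = list(export_windows(5000))
```

Each pair would correspond to one browse request with the active filters; a collection with 5000 matches still produces only five requests because of the cap.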
Summary
- … `COLLECTION_BROWSE` org feature flag.
- … `VespaVectorDB.filter_search() + count()` and forces `chunk_index = 0` so each source entity shows up as exactly one row. New `BrowseService` is exposed through the DI container and wired into `/collections/{readable_id}/search/browse`.
- … `BrowseTable` component (rows, sync/entity-type filters, debounced substring name search, offset/limit pagination, row drawer, CSV/JSON export) wired into `CollectionDetailView` as a Tabs sibling to Search.
- … (`name matches "(?i).*<escaped>.*"`) rather than `contains`, because Vespa's `contains` operator on attribute fields is a whole-token match: typing "Qui" wasn't matching "Quick process to debug" via the existing FilterCondition translator.
CI status (verified locally on changed lines)
- … `FakeBrowseService` + threaded it through the `test_container` fixture so existing tests don't break on the new required `Container.browse_service` field)
Test plan
- … `COLLECTION_BROWSE` feature flag on an org and open a collection; verify the Browse tab appears next to Search.
- … `.`, `+`, `(`) don't break the query, and that aborts work when typing fast.
- … `/search/browse` returns 403/feature-disabled.
🤖 Generated with Claude Code
Summary by cubic
Adds a feature-flagged tabular browse view for collections with a new `POST /collections/{readable_id}/search/browse` endpoint and a Browse tab in the UI. It lists entities unranked with pagination, filters, case-insensitive substring name search (properly escaped), and CSV/JSON export.
New Features
- `POST /collections/{readable_id}/search/browse` gated by `FeatureFlag.COLLECTION_BROWSE`; uses `VectorDBProtocol.filter_search()` + `count()` in parallel with `chunk_index = 0`; supports `name_query` via a regex `matches` clause in `VespaVectorDB` (case-insensitive substring with escaping); adds `BrowseService`, DI wiring, and `BrowseRequest`/`BrowseResponse`.
- `BrowseTable` with source filter, debounced name search, offset/limit pagination, row drawer, open-in-source link, and CSV/JSON export (cap 1000 for “all matching”); added Tabs in `CollectionDetailView`; new `FeatureFlags.COLLECTION_BROWSE`.
Bug Fixes
- … `/search/browse` for consistency with other tiers.
- `sync_ids`/`entity_types` capped at 100; `name_query` requires ≥2 chars (frontend hints and only sends when long enough); export fetches in 200-row chunks to respect the backend page limit.
Written for commit bcbf327. Summary will update on new commits.