fix(caches): bound _curation_results and add npz mtime to image-label key #263
Open
lstein wants to merge 1 commit into
… key

Two in-memory caches accumulate per-job/per-image entries on long-running servers:

* ``_curation_results`` in ``routers/curation.py``: one entry per curation job_id, written from the background task and read by the poll endpoint. Fully unbounded.
* ``_IMAGE_LABEL_CACHE`` in ``cluster_labels.py``: already had an inline OrderedDict + lock + handcrafted LRU eviction with a max of 1024, BUT the cache key was ``(embeddings_path, sorted_index, vocab_mtime)``, with no embeddings ``.npz`` mtime. Re-indexing an album (which can reshuffle the raw row → sorted-index mapping) left stale labels in place until the vocab file was touched.

This change:

1. Adds ``BoundedLRU[K, V]`` to ``util.py``: a thread-safe LRU with ``get`` / ``put`` / ``clear``, capacity-bounded. Replaces the ad-hoc OrderedDict pattern.
2. Migrates ``_curation_results`` to ``BoundedLRU(maxsize=64)``. 64 is well above any realistic in-flight + recently-polled working set (curation jobs complete in seconds; the frontend polls each job_id once or twice). The dedicated ``_curation_results_lock`` is gone; BoundedLRU does its own locking.
3. Migrates ``_IMAGE_LABEL_CACHE`` to the new helper, drops the inline helpers, and adds the ``.npz`` mtime to the cache key so re-indexing an album naturally invalidates its image labels.

Behavior preserved end-to-end: the curation polling endpoint still returns the result by job_id (until LRU eviction; previously the entry persisted forever), and the image-label cache still returns the same labels for the same image until either the embeddings .npz changes (new behavior, which fixes stale-after-reindex) or the vocab file changes (previous behavior, unchanged).

The matching ``test_compute_image_label_cache_evicts_past_max`` was patching a now-deleted module-level ``_IMAGE_LABEL_CACHE_MAX`` constant; updated to swap in a tiny ``BoundedLRU(maxsize=3)`` via ``monkeypatch.setattr`` so the cap is observable in a few iterations.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
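Two in-memory caches accumulate per-job / per-image entries on long-running servers: `_curation_results` in `routers/curation.py`, which was fully unbounded, and `_IMAGE_LABEL_CACHE` in `cluster_labels.py`, which was capped at 1024 entries but keyed without the embeddings `.npz` mtime, so re-indexing an album left stale labels in place until the vocab file was touched.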
What changes
1. Adds `BoundedLRU[K, V]` to `util.py`: a thread-safe LRU with `get` / `put` / `clear`, capacity-bounded. Replaces the ad-hoc OrderedDict + lock + popitem pattern.
2. Migrates `_curation_results` to `BoundedLRU(maxsize=64)`. 64 is well above any realistic in-flight + recently-polled working set (curation jobs complete in seconds; the frontend polls each job_id once or twice). The dedicated `_curation_results_lock` is gone; `BoundedLRU` does its own locking.
3. Migrates `_IMAGE_LABEL_CACHE` to the new helper, drops the inline get/put helpers, and adds the `.npz` mtime to the cache key so re-indexing an album naturally invalidates its image labels.
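For reviewers who want a mental model of the helper without opening the diff, here is a minimal sketch of what a thread-safe, capacity-bounded LRU with `get` / `put` / `clear` could look like. It is not the code in `util.py`: the `default` parameter on `get`, the `__len__` method, and the `ValueError` on a non-positive `maxsize` are assumptions made for illustration.

```python
import threading
from collections import OrderedDict
from typing import Generic, Hashable, Optional, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


class BoundedLRU(Generic[K, V]):
    """Illustrative thread-safe LRU; the real helper in util.py may differ."""

    def __init__(self, maxsize: int) -> None:
        if maxsize < 1:
            raise ValueError("maxsize must be >= 1")  # assumed validation
        self._maxsize = maxsize
        self._data: "OrderedDict[K, V]" = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key: K, default: Optional[V] = None) -> Optional[V]:
        with self._lock:
            if key not in self._data:
                return default
            # A hit makes the entry the most recently used one.
            self._data.move_to_end(key)
            return self._data[key]

    def put(self, key: K, value: V) -> None:
        with self._lock:
            self._data[key] = value
            self._data.move_to_end(key)
            # Evict from the least recently used end until within capacity.
            while len(self._data) > self._maxsize:
                self._data.popitem(last=False)

    def clear(self) -> None:
        with self._lock:
            self._data.clear()

    def __len__(self) -> int:  # assumed convenience, handy in tests
        with self._lock:
            return len(self._data)


# Usage: a tiny cap makes eviction observable in a few puts.
results: BoundedLRU[str, dict] = BoundedLRU(maxsize=2)
results.put("job-a", {"status": "done"})
results.put("job-b", {"status": "done"})
results.get("job-a")                      # touching job-a makes job-b the oldest
results.put("job-c", {"status": "done"})  # evicts job-b
```

Keeping the lock inside the helper is what lets call sites drop their own dedicated locks; every read also takes the lock because a hit reorders the underlying `OrderedDict`.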
Behavior preserved
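The one intentional difference is that a changed embeddings `.npz` now invalidates cached image labels. A hedged sketch of why that follows from the key alone is below. The function name `image_label_cache_key`, the `vocab_path` parameter, and the position of the `.npz` mtime inside the tuple are hypothetical; the PR only states that the `.npz` mtime is added to a key that previously held `(embeddings_path, sorted_index, vocab_mtime)`.

```python
import os
import tempfile
from typing import Tuple


def image_label_cache_key(
    embeddings_path: str, sorted_index: int, vocab_path: str
) -> Tuple[str, int, float, float]:
    """Hypothetical key builder: both file mtimes participate in the key."""
    npz_mtime = os.path.getmtime(embeddings_path)
    vocab_mtime = os.path.getmtime(vocab_path)
    return (embeddings_path, sorted_index, npz_mtime, vocab_mtime)


with tempfile.TemporaryDirectory() as tmp:
    npz = os.path.join(tmp, "embeddings.npz")
    vocab = os.path.join(tmp, "vocab.txt")
    for path in (npz, vocab):
        with open(path, "wb") as fh:
            fh.write(b"placeholder")

    before = image_label_cache_key(npz, 0, vocab)

    # Simulate a re-index by moving the .npz mtime forward by ten seconds.
    st = os.stat(npz)
    os.utime(npz, (st.st_atime, st.st_mtime + 10))

    after = image_label_cache_key(npz, 0, vocab)

    # Same image, same vocab file, but a different key after the re-index,
    # so a lookup with the new key misses and the label is recomputed.
    key_changed = before != after
```

Entries stored under the old key are never looked up again and simply age out through LRU eviction, so no explicit invalidation call is needed.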
Test plan
Net +44 lines across 4 files. `util.py` picks up ~54 lines for the helper + JSDoc-style docstring; `cluster_labels.py` and `curation.py` both net negative because the inline LRU plumbing is gone.
🤖 Generated with Claude Code