fix: prevent unnecessary full reindex when Qdrant collection already exists#12147
Draft
roomote-v0[bot] wants to merge 1 commit intomainfrom
Draft
fix: prevent unnecessary full reindex when Qdrant collection already exists#12147roomote-v0[bot] wants to merge 1 commit intomainfrom
roomote-v0[bot] wants to merge 1 commit intomainfrom
Conversation
…exists Root cause: getCollectionInfo() treated ALL errors (including connection failures and timeouts) as "collection not found", causing initialize() to create a new collection even when one already existed with valid data. Additionally, the error handler in the orchestrator aggressively cleared both the collection and cache on any indexing error, destroying existing indexed data. Changes: - getCollectionInfo(): Only return null for 404 (not found) errors; propagate other errors (connection failures, timeouts) so callers can distinguish "missing collection" from "unreachable Qdrant" - hasIndexedData(): Let connection errors propagate instead of silently returning false (which triggered full reindex) - collectionExists(): Same error propagation improvement - Orchestrator error handler: Only clear collection + cache when the collection was just created (no pre-existing data to preserve). For existing collections, flush/persist the cache instead of clearing it so incremental scans can resume on next startup. Fixes #12145
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR attempts to address Issue #12145. Feedback and guidance are welcome.
Problem
Users with persistent Qdrant storage were seeing a full reindex every time they reopened VS Code, even though the collection and its data were intact in Qdrant.
Root Causes
getCollectionInfo()treated ALL errors as "collection not found" - Connection failures, timeouts, and other transient errors were caught and returned asnull, makinginitialize()think the collection did not exist and create a new one.Aggressive error cleanup destroyed valid data - The orchestrator error handler called
clearCollection()+clearCacheFile()on any indexing error, even transient ones on an existing collection with valid data.hasIndexedData()returnedfalseon connection errors - Same issue as Add options to always approve write and execute operations #1, causing the orchestrator to skip the incremental scan path and do a full scan.Changes
src/services/code-index/vector-store/qdrant-client.tsisNotFoundError()helper to distinguish 404 (collection missing) from other errorsgetCollectionInfo()now returnsnullonly for 404 errors; propagates other errorshasIndexedData()lets connection errors propagate instead of silently returningfalsecollectionExists()inherits the same error propagationsrc/services/code-index/orchestrator.tscollectionCreatedflag to decide cleanup behaviorclearCollection()to preserve the user's existing indexTests updated
qdrant-client.spec.ts: Non-404 errors now throw instead of creating a collectionorchestrator.spec.ts: Split error-cleanup test into two cases (new vs existing collection)Fixes #12145
Interactively review PR in Roo Code Cloud