Summary
The ecommerce-catalog-search agent responds successfully but takes 20-94 seconds per request, well above the 10-second target. The 60s upstream timeout is a symptom; the root cause is compounding sequential I/O in the agent's search pipeline.
Root Cause Analysis
Four compounding latency sources were identified in agents.py and ai_search.py:
| # | Issue | Impact | Fix |
|---|-------|--------|-----|
| 1 | Duplicate keyword search: handle() calls _search_products_keyword again after _search_products_intelligent already calls it internally as a baseline | +3-5s wasted | Return baseline as 4th tuple element; remove duplicate call |
| 2 | Sequential CRUD fetches: _resolve_ranked_products fetches each product SKU one-by-one in a for loop | +N × latency (up to 10s for 10 products) | Parallelize with asyncio.gather |
| 3 | Sequential AI Search sub-queries: multi_query_search runs each sub-query sequentially | +N × latency (up to 6s for 3 sub-queries) | Parallelize with asyncio.gather |
| 4 | Two sequential model calls: intent classification (~8s) + response generation (~14s) | ~22s minimum | Deferred (architecture change) |
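Fix 1 from the table could look roughly like the sketch below. Only the function names handle, _search_products_intelligent and _search_products_keyword come from this issue. The function bodies, the first three tuple elements (ranked, intent, sub_queries) and the calls counter are hypothetical placeholders, since the real signatures are not shown here.

```python
import asyncio

# Illustration only: counts how often the keyword search actually runs.
calls = {"keyword": 0}


async def _search_products_keyword(query: str) -> list[str]:
    """Stand-in for the real keyword search (body is hypothetical)."""
    calls["keyword"] += 1
    await asyncio.sleep(0.01)  # simulated I/O latency
    return [f"kw:{query}"]


async def _search_products_intelligent(query: str):
    """Runs the keyword baseline once and hands it back to the caller."""
    baseline = await _search_products_keyword(query)
    ranked = list(baseline)  # placeholder for the real ranking step
    intent = "browse"        # hypothetical tuple element
    sub_queries = [query]    # hypothetical tuple element
    # The baseline travels back as the 4th tuple element.
    return ranked, intent, sub_queries, baseline


async def handle(query: str) -> dict:
    # Reuse the baseline from the tuple instead of searching again.
    ranked, intent, sub_queries, baseline = await _search_products_intelligent(query)
    return {"ranked": ranked, "baseline": baseline}
```

With this shape, one call to handle triggers exactly one keyword search, which is the behavior the first acceptance criterion asks for.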
Fixes 1-3 are code-level optimizations. Fix 4 is deferred as it requires an architectural change.
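Fixes 2 and 3 share one pattern: replace a loop that awaits each call in turn with a single asyncio.gather over all calls. The sketch below is a minimal illustration under assumptions. fetch_product is a hypothetical stand-in for one CRUD fetch, with a sleep simulating network latency, and the helper names are not taken from agents.py or ai_search.py.

```python
import asyncio
import time


async def fetch_product(sku: str) -> dict:
    """Hypothetical stand-in for a single CRUD fetch."""
    await asyncio.sleep(0.05)  # simulated per-request latency
    return {"sku": sku}


async def resolve_sequential(skus: list[str]) -> list[dict]:
    # Current behavior: total time is roughly N * latency.
    return [await fetch_product(sku) for sku in skus]


async def resolve_parallel(skus: list[str]) -> list[dict]:
    # Proposed behavior: all fetches are in flight at once, so total
    # time is roughly that of the slowest single fetch.
    # gather returns results in the same order as its inputs.
    return list(await asyncio.gather(*(fetch_product(sku) for sku in skus)))


def timed(coro_fn, skus: list[str]):
    """Run one of the resolvers and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn(skus))
    return result, time.perf_counter() - start
```

Because gather preserves input order, the ranking produced upstream should survive the change. By default the first exception propagates to the caller; whether one failed fetch should fail the whole request is a design choice, and asyncio.gather accepts return_exceptions=True for the case where partial results are acceptable.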
Reproduction
POST to /invoke with the query "Im traveling to Russia. Which clothes you have?". The response succeeds but takes 20-94 seconds.
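To compare a run against the 10-second target, the reproduction can be timed with a small script. This is a sketch under assumptions: the base URL is supplied by the caller, and the JSON payload shape ({"query": ...}) is a guess, since this issue does not document the request body for /invoke.

```python
import json
import time
import urllib.request

TARGET_SECONDS = 10.0  # latency target stated in the summary


def build_request(base_url: str, query: str) -> urllib.request.Request:
    """Build the POST to /invoke. The payload shape is an assumption."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/invoke",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def timed_call(send):
    """Call send() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = send()
    return result, time.perf_counter() - start


def check_latency(base_url: str, query: str, timeout: float = 120.0):
    """Send the query and report (status, elapsed, within_target)."""
    request = build_request(base_url, query)

    def send():
        # The timeout is set above the worst observed latency (94s)
        # so that slow responses are measured rather than cut off.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status

    status, elapsed = timed_call(send)
    return status, elapsed, elapsed <= TARGET_SECONDS
```

Calling check_latency with the service's base URL and the query above returns the HTTP status, the elapsed seconds, and whether the run met the target.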
Acceptance Criteria
handle() does NOT call _search_products_keyword a second time after _search_products_intelligent
_resolve_ranked_products uses asyncio.gather for parallel CRUD fetches
multi_query_search uses asyncio.gather for parallel AI Search sub-queries
Files Affected
apps/ecommerce-catalog-search/src/ecommerce_catalog_search/agents.py
apps/ecommerce-catalog-search/src/ecommerce_catalog_search/ai_search.py