Files
hyungi_document_server/app/api
hyungi ecd2350c15 feat(search): Phase 2Q Diagnose Phase 2 — multi-query retrieval fusion
phase-2q-query-rewrite-diagnose.md v6 plan §5.5 + §7 Phase 2.
Phase 1B 3e6866b (scaffold + dispatcher) 위 retrieval 합성 wire-up.

신규:
- search_pipeline._rrf_fuse_variants() — N variant ranked list RRF 합성.
  fusion_service.RRFOnly 알고리즘 동일 (k=60), 첫 등장 variant representative 보존.
- search_pipeline.search_with_rewrite() — variant N 별 retrieval+fusion 후
  unified RRF (cap 60) → reranker 1회 (query=원본 q) → diversity+freshness+display.
  · per-variant K = 50//3 = 16 (PHASE2Q_PRODUCTION_TOPK//N, A1 채택)
  · variant 별 retrieval asyncio.gather 병렬
  · chunks_by_doc merge (variant 무관 unified reranker input)
  · production fusion_service.get_strategy() + rerank_chunks() 재사용
- 상수: PHASE2Q_PRODUCTION_TOPK=50, PHASE2Q_UNIFIED_CAP=60, PHASE2Q_RRF_K=60.

수정:
- search_pipeline.run_search() — rewrite_backend param 추가. hybrid + cand_<slug> 시
  search_with_rewrite() 위임. baseline/None 시 기존 single-query path 그대로 (invariant).
- app/api/search.py — Phase 1B scaffold discard call 제거. run_search 에 rewrite_backend
  전달. ValueError → 400 (unknown_rewrite_backend 우선 분기) / RuntimeError → 503
  (rewrite_llm_unavailable).
- tests/test_query_rewriter.py — Phase 2 test 9개 추가:
  · _rrf_fuse_variants 6 (single / overlap accumulation / representative / cap limit /
    empty / rank position)
  · search_pipeline import + run_search rewrite_backend default=None signature 1
  · PHASE2Q_* constants 1
  · DATABASE_URL dummy 주입 (api.search import → SQLAlchemy engine init 회피)

30/30 unit test PASS (Phase 1B 21 + Phase 2 9).

baseline 회귀 0 invariant:
- run_search(rewrite_backend=None) → 기존 path 100% 그대로 (분기 first line guard)
- run_search(rewrite_backend=baseline) → 동일
- mode != hybrid → multi-query path 비활성 (text-only/vector-only/trgm 영향 0)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 22:41:50 +00:00
..