# Phase 2Q Multi-Query Rewrite: ⚠️ DEPRECATED / EXPERIMENTAL (closed 2026-05-24)

## 🛑 Status: closed as evaluated experiment

> Phase 2Q Query Rewrite is closed as an evaluated experiment.
> After result-level dedup correction, true net gain was marginal
> (NDCG +0.019, Recall t≥2 +0.030) while latency cost was high
> (cold +876%, warm +320%). Therefore, multi-query rewrite is not
> recommended for default production rollout. Keep opt-in path as
> experimental/deprecated reference only; do not proceed to
> Cache-Prewarm unless future real-query evidence shows a stronger gain.

**The opt-in flag `?rewrite_backend=cand_multi_query_macmini` stays in the code (as an experiment reference)**, but **production default rollout is not recommended**. PR-2Q-Cache-Prewarm and PR-2Q-Apply-Default-ON-1 are dropped. From the Extended track, only SynonymDict (deterministic, bypasses the LLM) is kept as a separate candidate.

## Overview (historical record)

The Phase 2Q Diagnose results confirmed H1 (both backends show a meaningful net improvement), and once Rerank-Payload-Fix was complete, the Apply opt-in phase began (commit `fef5ddc`).

**However, multi-layer inflation was then found in the measurement chain, and the decision was made on the corrected values: closed as experiment.**

## Measurement correction history (all inflation corrected)

| Layer | Commit | NDCG | Inflation cause |
|---|---|---:|---|
| Phase 3 | `a41adb6` | 0.927 | Duplicate chunk_id hits accumulated |
| Rerank-Fix | `b734fc5` | 0.876 | Residual doc_id duplicates (chunk dedup only) |
| Eval-Dedup | `3553573` | 0.641 | Dedup applied at the eval layer only |
| **Result-Dedup (final)** | **`5e480d6`** | **0.663** | ✅ Dedup audit clean (0/51) |

**True multi-query effect** (against the 0.644 baseline):

- NDCG cold +0.019 / warm +0.015 ← sub-noise
- Recall t≥2 cold +0.030 / warm +0.022 ← small improvement
- Recall t≥3 0.000 (cold) / -0.022 (warm) ← equal to slight regression
- **latency p50 cold +876% (3692ms) / warm +320% (1588ms)** ← clear cost
- Categories: english/standards/mixed slightly ahead, exam/korean slightly regressed

→ **The marginal quality gain from multi-query does not justify the latency cost, the added system complexity, and the LLM dependency.**

## Recommendation (user decision, 2026-05-24)

**Phase 2Q itself was a good experiment, not a failure.** Outcomes:

- Found the chunk_id duplicate inflation (Phase 3 → Rerank-Fix)
- Sorted out the doc_id / result dedup problems (Eval-Dedup → Result-Dedup)
- Quantified the real effect of multi-query (NDCG +0.019)
- Established the conclusion that "LLM rewrite has low ROI as the current default for DS search"
- Three new feedback memory entries (fixture-first call shape / apply prereq structural fix / graded NDCG dedup invariant)

**The feature is deprecated; the lessons and the infrastructure are kept.**

## ~~Rollout policy~~ (historical record)

Previous decision: observe opt-in for one week (until 2026-05-31), then review default ON. **Corrected decision (2026-05-24)**: closed as evaluated experiment; default ON will not proceed.

**Recommended LLM = `cand_multi_query_macmini` (gemma-4-26b-a4b-it-8bit, Mac mini)**. The 4-factor weighted rationale (decision md §4), based on the pre-correction measurements:

1. ⭐ Availability: runs 24/7 (qwen depends on a MacBook laptop)
2. NDCG 0.927 dominant (on par with qwen at 0.919; the gap is at noise level)
3. Cold latency advantage (gemma 2757ms vs qwen 3647ms, cold p50)
4. Category strength in standards/exam/korean (the core domain)

Alternative = `cand_multi_query_macbook` (qwen3.6-27B, strong on mixed/english), usable if the MacBook is kept always-on.

## Usage

### Query parameter (opt-in)

```bash
GET /api/search/?q=
    &mode=hybrid
    &limit=20
    &rewrite_backend=cand_multi_query_macmini   # opt-in; single-query path when omitted
```

- `rewrite_backend` omitted or `baseline` → the existing single-query path, 100% unchanged (zero-baseline-regression invariant, recorded in the Phase 2Q Phase 2 and Rerank-Fix measurements).
- `rewrite_backend=cand_multi_query_macmini` → multi-query (3 variants) + unified RRF + reranker.
- `rewrite_backend=cand_multi_query_macbook` → qwen (only while the MacBook is running).
- Unsupported slug → HTTP 400 `unknown_rewrite_backend`.
- LLM call failure → HTTP 503 `rewrite_llm_unavailable` (no silent fallback).

### SvelteKit / fetch example

```typescript
// `q` (search string) and `token` (bearer token) are supplied by the caller.
const res = await fetch(
  `/api/search/?q=${encodeURIComponent(q)}&mode=hybrid&limit=20&rewrite_backend=cand_multi_query_macmini`,
  { headers: { Authorization: `Bearer ${token}` } }
);
```

## One-week observation metrics (targets)

| Metric | Target | Measurement source | Action on regression |
|---|---|---|---|
| **Rewrite cache hit rate** | ≥ 50% (week 1) | Share of `[rewrite-dispatch]` log lines with `cache_hit=true` | `PR-2Q-Cache-Prewarm` (nightly cron) |
| **LLM latency warm p50** | ≤ 1500ms | `[rewrite-dispatch]` log `llm_latency_ms` | Check that gemma is running; diagnose semaphore contention |
| **LLM latency cold p50** | ≤ 3000ms | Same as above | Consider introducing cache prewarm |
| **503 count** | ≤ 5/day | fastapi responses with status 503 | Review LLM endpoint health / circuit breaker |
| **Recall@10 t≥3** | ≥ 0.74 (production traffic analysis) | Random sampling or a separate dashboard | NDCG regression analysis + category distribution |
| **User negative feedback** | 0 reports | User channel | Immediate rollback or priority fix |

## Decision at the end of the one-week observation (2026-05-31)

- 4 metrics healthy + zero user negative feedback → enter `PR-2Q-Apply-Default-ON-1` (switch to default ON)
- One or more metrics regress → separate fix PR, then one more week of observation
- Catastrophic regression → rollback (keep the `rewrite_backend` default at null permanently)

## Phase 2 QueryAnalyzer sequencing

The Phase 2 QueryAnalyzer (`app/services/search/query_analyzer.py`) is running in production but has zero effect on the retrieval path (debug exposure only; recorded in the comment at `app/api/search.py:156` and confirmed after observing 0 ask_events in operation). It does not conflict with the Phase 2Q multi-query rewrite.

→ On Apply, both layers run and the result-consistency invariant is maintained.

## Follow-up PRs (separate track)

- **PR-2Q-Apply-Telemetry-1**: accumulate the `[rewrite-dispatch]` log in `search_failure_logs` or a separate telemetry table (reusing the search_telemetry.py pattern). This is the source for quantitative analysis of the one-week observation metrics.
- **PR-2Q-Alert-1**: Prometheus + ntfy alert rules (LLM 503 ≥ 10/hour, cache hit < 30% over a 7d window). Belongs to the monitoring stack.
- **PR-2Q-Apply-Default-ON-1**: switch to default ON after the one-week observation ends.
- **PR-2Q-Cache-Prewarm**: nightly cron if a cache hit rate < 50% is observed.
- **PR-2Q-Apply-Category-Analysis**: analyze the category regressions in the Rerank-Fix measurement (standards -0.28, exam -0.19) and record how ranking behavior differs between the RRF fallback and the reranker.

## Related materials

- decision md = `tests/search_eval/baselines/v0_2_phase2q_decision_2026-05-24.md`
- Rerank-Fix measurement = `tests/search_eval/baselines/v0_2_phase2q_rerank_fix_2026-05-24.json`
- Phase 2Q 3 measurement = `tests/search_eval/baselines/v0_2_phase2q_results_2026-05-24.json`
- Plan = `~/.claude/plans/pr-2q-apply-query-rewrite-1-bright-meadow.md` + `~/.claude/plans/follow-up-pr-8-lazy-shore.md` (sequencing)
- Phase 2Q Diagnose plan = `~/.claude/plans/phase-2q-query-rewrite-diagnose.md` v6
- main merge commits:
  - `711d495` Phase 2Q Diagnose 5 commit
  - `0257a5d` Rerank-Payload-Fix
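
## Sketch: unified RRF with result-level dedup (illustrative)

This document describes the opt-in path as "multi-query (3 variants) + unified RRF + reranker" and traces the inflated NDCG values to duplicate chunk_id / doc_id hits. The sketch below shows one way those two ideas can fit together: Reciprocal Rank Fusion over per-variant rankings, where each doc_id counts at most once per ranking. It is a minimal sketch and not the repository's implementation. The function name `rrf_fuse_dedup`, the `(doc_id, chunk_id)` hit shape, and the constant `k=60` (a commonly used RRF default) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical hit shape (doc_id, chunk_id); not taken from the codebase.
Hit = Tuple[str, str]


def rrf_fuse_dedup(
    rankings: List[List[Hit]], k: int = 60, limit: int = 20
) -> List[Tuple[str, float]]:
    """Fuse per-variant rankings with RRF, counting each doc_id once per ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        seen = set()
        rank = 0
        for doc_id, _chunk_id in ranking:
            if doc_id in seen:
                continue  # a later chunk of the same document adds no score
            seen.add(doc_id)
            rank += 1  # rank is assigned after dedup, so duplicates do not push others down
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), then doc_id for a deterministic order.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:limit]


# Three query variants; doc-a appears twice in the first ranking (two chunks).
variants = [
    [("doc-a", "c1"), ("doc-a", "c2"), ("doc-b", "c1")],
    [("doc-b", "c3"), ("doc-c", "c1")],
    [("doc-a", "c1"), ("doc-c", "c2")],
]
fused = rrf_fuse_dedup(variants)
for doc_id, score in fused:
    print(doc_id, round(score, 6))
```

Because the fused list holds one entry per doc_id, a graded NDCG computed over it cannot credit the same relevant document twice. Duplicate credit of that kind is the failure mode the correction history attributes to the earlier measurements.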