c4a40ab18a
User decision (2026-05-24, after the 4-layer measurement chain correction was completed):

> Phase 2Q Query Rewrite is closed as an evaluated experiment.
> After result-level dedup correction, true net gain was marginal
> (NDCG +0.019, Recall t≥2 +0.030) while latency cost was high
> (cold +876%, warm +320%). Therefore, multi-query rewrite is not
> recommended for default production rollout. Keep opt-in path as
> experimental/deprecated reference only; do not proceed to
> Cache-Prewarm unless future real-query evidence shows a stronger gain.

Changes:
- docs/phase_2q_apply_opt_in.md: record 🛑 DEPRECATED / EXPERIMENTAL status.
  Preserve the measurement chain correction history (4-layer), the true
  effect, and the Phase 2Q outcomes.
- app/api/search.py: update the rewrite_backend query param description
  (mark ⚠️ EXPERIMENTAL/DEPRECATED, remove the production recommendation
  wording, state that it is kept only as an opt-in experimental reference).

5 actions recorded (user decision):
1. Keep the opt-in code (recommended=false / experimental)
2. Record the deprecated status in docs/
3. Remove the production recommendation from the search.py description
4. Discard PR-2Q-Cache-Prewarm + PR-2Q-Apply-Default-ON-1
5. Of the 4 Extended items, keep only SynonymDict (deterministic, bypasses
   the LLM) as a separate candidate

New feedback memory: [[feedback_measurement_chain_audit]]. When a Diagnose
measurement is the basis for an Apply/rollout decision, every layer
(retrieval/fusion/rerank/eval) must be audited. It originates from the
Phase 2Q 4-iteration correction chain (0.927→0.876→0.641→0.663).

Phase 2Q outcomes (a good experiment, not a failure):
- Found the chunk_id/doc_id duplicate inflation and established the
  measurement chain audit pattern
- Reached the conclusion that LLM rewrite has low ROI as the default for
  the current DS search
- Preserved the search_pipeline multi-query composition + 3-layer dedup
  infrastructure (reusable for Extended SynonymDict or a future cloud LLM
  scaffold)
- 4 new feedback memories: fixture-first-call-shape /
  apply-prereq-structural-fix / graded-ndcg-dedup-invariant /
  measurement-chain-audit

Committed directly to main (read-only docs / API description only; zero
impact on the retrieval path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>