Feat/ds ai routing policy #23

Merged
hyungi merged 2 commits from feat/ds-ai-routing-policy into main 2026-05-24 12:20:50 +09:00

2 Commits

Author SHA1 Message Date
hyungi 00edd6bff8 feat(ask): backend selector 4 options with device toggle
PR-3 of DS AI routing policy (2026-05-23, see plan
~/.claude/plans/document-server-ai-cheeky-reddy.md +
memory project_document_server_ai_routing_policy).

기존 BackendSelector (PR-DocSrv-Web-Ask-Selector-1, 2 옵션 default
qwen-macbook) 확장 — 4 옵션 + DeviceToggle inline.

UI 변경 (frontend/src/routes/ask/+page.svelte):
- BackendChoice = auto | mac-mini-default | qwen-macbook | claude-cloud
  (기존 default 는 legacy alias, auto 또는 mac-mini-default 로 자동 매핑).
- select 4 옵션 (Auto router / Mac mini default / This device /
  Claude Cloud) + tooltip.
- DeviceToggle (checkbox 'This is M5 Max') inline — localStorage
  ds_device_self_label = macbook-m5-max | null. mount 시 복원.
- This device 옵션 disabled state = !isMacBookM5Max (토글 off 시
  grey-out). 토글 off 시 qwen-macbook 선택돼 있었으면 auto 복귀.
- Claude Cloud 옵션 disabled state = !CLOUD_DEV_ENABLED (build-time
  flag VITE_ENABLE_CLOUD_BACKEND_DEV, default false). 운영 토글
  불가 — 후속 PR DS runtime feature flag API 로 migrate 예정.
- friendlyErrorMessage(reason) — 503 error_reason 매핑
  (macbook_unavailable / provider_not_configured / router_* / upstream_*).
- retryWithDefault → retryWithMacMiniDefault 명명 정정.
- parseBackend backward-compat: default / gemma-macmini →
  mac-mini-default.

source IP 의존 0 (PR-0 round 2 발견: caddy 2-hop + X-Forwarded-For
미설정 → DS 가 보는 source IP = LAN gateway, 신뢰 불가).
사용자 명시 토글 + localStorage 방식 채택 (Q3=C).

Closure (build + bundle string + lint):
- frontend build PASS (SvelteKit/TS syntax + svelte compile 모두 OK).
- 컴파일된 bundle 에 9 핵심 string 박혀있음 (mac-mini-default /
  qwen-macbook / claude-cloud / Auto router / This is M5 Max /
  ds_device_self_label / provider_not_configured / This device /
  Cloud backend not configured).
- lint:tokens 본 PR 변경 위반 0 (기존 62 stale debt 는 별 chore
  PR-DocSrv-Frontend-Token-Cleanup-1).

Backup: ~/.local/share/ds-routing-pr2-backups/20260523/
ask-page.svelte.pre-pr3.

선행: PR-1 (llm-router alias scaffold) + PR-2 (RouterBackend
dispatcher, refactor commit bcf644f) closed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 03:42:39 +00:00
hyungi bcf644f893 refactor(search): /api/search/ask dispatcher route via llm-router
PR-2 of DS AI routing policy (2026-05-23, see plan
~/.claude/plans/document-server-ai-cheeky-reddy.md +
memory project_document_server_ai_routing_policy).

DS 의 모든 backend 호출이 llm-router :8890 단일 경유. 정칙 정합:
- 신규 RouterBackend (services/llm/backends.py) — alias 별 router POST
  + requires_gate 분기 (mac-mini-default 만 llm_gate FOREGROUND 보호).
- 기존 GemmaMacMiniBackend + QwenMacBookBackend = legacy 보존
  (DS_BACKENDS_VIA_ROUTER=false rollback safety only). 1주 후 별
  cleanup PR (PR-DS-Backends-Legacy-Cleanup-1) 로 폐기.
- get_backend factory dual-path (env flag) — backward-compat
  (gemma-macmini alias → mac-mini-default 매핑).
- search.py:457 Query pattern 확장: mac-mini-default|claude-cloud|auto
  추가. /ask/react 의 isinstance(QwenMacBookBackend) → hasattr
  duck-typing (RouterBackend + Legacy 모두 generate_with_tools 구현).
- SearchAskBackendConfig 에 router_url 신규 (env LLM_ROUTER_URL 또는
  hardcoded MVP default http://100.76.254.116:8890).
- docker-compose.yml fastapi env 에 LLM_ROUTER_URL +
  DS_BACKENDS_VIA_ROUTER 추가.

AIClient (_call_chat, call_triage, call_primary, call_fallback) 경유
path 는 별 PR (PR-AIClient-Router-Migration-1) — MVP scope C 채택,
회귀 risk 최소화.

Closure (즉시 fixture/matrix):
- factory smoke 6 alias (None/mac-mini-default/gemma-macmini/
  qwen-macbook/claude-cloud/auto) + 1 invalid (nonsense → ValueError).
- live 3 case: mac-mini-default 200 \"pong! 🏓\" + qwen-macbook cold
  502 upstream_502_primary=ConnectError + claude-cloud 503
  provider_not_configured.
- silent fallback 0 + direct M5/Mac mini socket 0
  (RouterBackend 만 router 호출).

Backup: ~/.local/share/ds-routing-pr2-backups/20260523/
(backends.py + config.py + search.py + docker-compose.yml).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 03:41:29 +00:00