refactor(ai): remove GPU Ollama LLM, unify on Mac mini 26B as the single generation host #20
GPU server identity = specialized backend for embedding/rerank/STT/OCR/marker.
Generative LLMs: 0. Mac mini gemma-4-26B-A4B absorbs triage + primary +
classifier. Fallback is the Claude Sonnet 4 API (triggered automatically,
budget shared with premium).
Follow-up (separate commit):
ssh gpu "ollama rm gemma4:e4b-it-q8_0" (reclaims ~11GB VRAM).
Mac mini single-host risk mitigation = (1) Mac mini uptime verified at 31d without interruption,
(2) Claude Sonnet 4 API stays within daily_budget $5 (call frequency is low, assuming the Mac mini is up),
(3) Beszel siteMonitor :8801 health check + Synology Chat alert.
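
The primary-then-fallback routing and the daily_budget cap described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name `BudgetedFallbackRouter`, the `primary`/`fallback` callables, and the in-memory `spent_today_usd` counter are all hypothetical, and the real budget is presumably tracked in a store shared with the premium path.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class BudgetedFallbackRouter:
    """Hypothetical sketch: try the primary host, fall back under a daily cap."""

    # Assumed to call the Mac mini gemma-4-26B-A4B endpoint and return text.
    primary: Callable[[str], str]
    # Assumed to call the Claude Sonnet 4 API and return (text, cost in USD).
    fallback: Callable[[str], Tuple[str, float]]
    daily_budget_usd: float = 5.0   # daily_budget from the commit message
    spent_today_usd: float = 0.0    # assumed counter, shared with premium calls

    def generate(self, prompt: str) -> str:
        try:
            return self.primary(prompt)
        except (ConnectionError, TimeoutError) as exc:
            # Primary host unreachable: fall back only while budget remains.
            if self.spent_today_usd >= self.daily_budget_usd:
                raise RuntimeError(
                    "primary host down and fallback budget exhausted"
                ) from exc
            text, cost_usd = self.fallback(prompt)
            self.spent_today_usd += cost_usd
            return text
```

Note that the cap is checked before each fallback call, so a single call can overshoot the budget slightly; a stricter variant would need a cost estimate up front.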
plan: ~/.claude/plans/rosy-launching-otter.md §C/§D/§E (7-device LLM placement + operations strategy)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>