feat: mlx-proxy server + split LLM/embedding URLs in n8n workflows

Add an ollama-compatible proxy server based on mlx-vlm (port 11435).
Inject a callLLM wrapper into the 6 n8n GEN nodes (health check + ollama fallback).
Embeddings/reranker stay on ollama, split out via LOCAL_EMBED_URL.
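The callLLM wrapper described above can be sketched as follows. This is a hypothetical reconstruction from the commit message, not the actual injected code: the function names (`pickBaseUrl`, `callLLM`), the `/api/tags` health-check endpoint, and the timeout value are assumptions; only the URLs and the default model name come from the diff below.

```javascript
// Hypothetical sketch of the callLLM wrapper injected into the n8n GEN nodes.
// URLs and default model are taken from the .env diff; everything else is assumed.
const LOCAL_LLM_URL = process.env.LOCAL_LLM_URL || 'http://host.docker.internal:11435';
const GPU_OLLAMA_URL = process.env.GPU_OLLAMA_URL || 'http://192.168.1.186:11434';

// Health check: probe the mlx-proxy's ollama-compatible /api/tags endpoint;
// if it is unreachable or unhealthy, fall back to the ollama GPU server.
async function pickBaseUrl(fetchFn = fetch) {
  try {
    const res = await fetchFn(`${LOCAL_LLM_URL}/api/tags`, {
      signal: AbortSignal.timeout(2000), // assumed 2 s probe timeout
    });
    if (res.ok) return LOCAL_LLM_URL;
  } catch (_) {
    // proxy down: fall through to the GPU server
  }
  return GPU_OLLAMA_URL;
}

async function callLLM(prompt, model = 'qwen3.5:27b', fetchFn = fetch) {
  const base = await pickBaseUrl(fetchFn);
  const res = await fetchFn(`${base}/api/generate`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  if (!res.ok) throw new Error(`LLM call failed: ${res.status}`);
  return (await res.json()).response;
}

module.exports = { pickBaseUrl, callLLM };
```

Because the fetch implementation is injectable, the fallback path can be exercised without either server running.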

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author: Hyungi Ahn
Date: 2026-03-19 10:00:00 +09:00
parent a050f2e7d5
commit 1137754964
5 changed files with 162 additions and 14 deletions


@@ -28,6 +28,12 @@ LOCAL_OLLAMA_URL=http://host.docker.internal:11434
# Ollama (GPU server: RTX 4070Ti Super, default model: id-9b:latest)
GPU_OLLAMA_URL=http://192.168.1.186:11434
# mlx-proxy (Mac mini: LLM generation, ollama-compatible, default model: qwen3.5:27b)
LOCAL_LLM_URL=http://host.docker.internal:11435
# Embeddings only (ollama: bge-m3, bge-reranker)
LOCAL_EMBED_URL=http://host.docker.internal:11434
# Qdrant (accessed from inside Docker)
QDRANT_URL=http://host.docker.internal:6333
@@ -63,3 +69,4 @@ CHAT_BRIDGE_URL=http://host.docker.internal:8091
CALDAV_BRIDGE_URL=http://host.docker.internal:8092
DEVONTHINK_BRIDGE_URL=http://host.docker.internal:8093
MAIL_BRIDGE_URL=http://host.docker.internal:8094
KB_WRITER_URL=http://host.docker.internal:8095