Compare commits

2 Commits

Author SHA1 Message Date
Hyungi Ahn 96d57789bd feat(ai): align primary model with mlx-proxy actually loaded model
mlx-proxy on the Mac mini currently loads
mlx-community/gemma-4-26b-a4b-it-8bit, but config.yaml was still
requesting mlx-community/Qwen3.5-35B-A3B-4bit. The proxy was silently
serving the loaded model regardless, but the mismatch made debugging
and log tracing harder.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 02:56:38 +00:00
Hyungi Ahn 32c96d6191 fix(deploy): primary endpoint -> mlx-proxy 8801
100.76.254.116:8800 -> :8801 to route through mlx-proxy and gain
/status observability (active_jobs / total_requests).
2026-04-08 02:56:08 +00:00
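
The second commit routes traffic through mlx-proxy to gain /status observability. As a rough illustration, the counters it mentions could be read as sketched below. This assumes /status is served at the root of the proxy on port 8801 and returns a JSON object with integer `active_jobs` and `total_requests` fields; neither the exact URL nor the response shape is shown in the commit, and `parse_status` and `fetch_status` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: /status lives at the root of the proxy on port 8801.
STATUS_URL = "http://100.76.254.116:8801/status"


def parse_status(body: str) -> dict:
    """Extract the two counters named in the commit message from a JSON body.

    Assumes the body is a JSON object with `active_jobs` and `total_requests`.
    """
    data = json.loads(body)
    return {
        "active_jobs": int(data["active_jobs"]),
        "total_requests": int(data["total_requests"]),
    }


def fetch_status(url: str = STATUS_URL, timeout: float = 5.0) -> dict:
    """GET the status endpoint and return the parsed counters."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_status(resp.read().decode("utf-8"))
```

Calling `fetch_status()` from a host that can reach the proxy would return the two counters as a dict. Parsing is kept separate from the network call so the response handling can be checked without a live proxy.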
+2 -2
@@ -6,8 +6,8 @@ ai:
   models:
     primary:
-      endpoint: "http://100.76.254.116:8800/v1/chat/completions"
-      model: "mlx-community/Qwen3.5-35B-A3B-4bit"
+      endpoint: "http://100.76.254.116:8801/v1/chat/completions"
+      model: "mlx-community/gemma-4-26b-a4b-it-8bit"
       max_tokens: 4096
       timeout: 60
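
The first commit fixes a mismatch between the model named in config.yaml and the model the proxy has loaded, which the proxy tolerated silently. A minimal sketch of a check that would surface such a mismatch is shown below. It assumes the config has already been parsed into a dict with the `ai.models.primary` layout from the diff, and that the loaded model id is obtained by some means not shown here (for example, from the proxy's logs); `check_primary_model` is a hypothetical helper, not part of the repository.

```python
# Config values as they stand after this change, mirroring the diff above.
CONFIG_AFTER = {
    "ai": {
        "models": {
            "primary": {
                "endpoint": "http://100.76.254.116:8801/v1/chat/completions",
                "model": "mlx-community/gemma-4-26b-a4b-it-8bit",
                "max_tokens": 4096,
                "timeout": 60,
            }
        }
    }
}


def check_primary_model(config: dict, loaded_model: str) -> list:
    """Return warnings; an empty list means config and proxy agree.

    `loaded_model` is whatever the proxy actually has loaded; how that id
    is discovered is outside this sketch.
    """
    requested = config["ai"]["models"]["primary"]["model"]
    warnings = []
    if requested != loaded_model:
        warnings.append(
            f"config requests {requested!r} but proxy has {loaded_model!r} loaded"
        )
    return warnings
```

With the values from this change, `check_primary_model(CONFIG_AFTER, "mlx-community/gemma-4-26b-a4b-it-8bit")` returns an empty list, while passing the previous Qwen model id as the configured value would yield one warning. Running such a check at startup would turn the silent mismatch described in the commit message into an explicit log line.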