fix: add max_tokens (Gemma 16000, EXAONE 4096)

Fixes responses being cut off mid-generation. Adds a max_tokens parameter to ModelAdapter and includes it in both the stream and complete payloads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -58,6 +58,7 @@ class BackendRegistry:
     system_prompt=REASONER_PROMPT,
     temperature=settings.reasoning_temperature,
     timeout=settings.reasoning_timeout,
+    max_tokens=16000,
 )

 def start_health_loop(self, interval: float = 30.0) -> None:
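The commit message says ModelAdapter gained a max_tokens parameter that is forwarded into both the stream and complete request payloads. A minimal sketch of that shape is below; the class layout, field names, and payload keys here are assumptions for illustration, not the actual repository code (only the `max_tokens=16000` kwarg appears in the diff).

```python
# Hypothetical sketch of the change: ModelAdapter stores max_tokens and a
# shared payload builder injects it into both streaming and non-streaming
# requests, so neither path can silently omit it and truncate responses.
class ModelAdapter:
    def __init__(self, system_prompt: str, temperature: float,
                 timeout: float, max_tokens: int) -> None:
        self.system_prompt = system_prompt
        self.temperature = temperature
        self.timeout = timeout
        self.max_tokens = max_tokens  # e.g. 16000 for Gemma, 4096 for EXAONE

    def _payload(self, prompt: str, stream: bool) -> dict:
        # Single builder keeps stream/complete payloads consistent.
        return {
            "prompt": prompt,
            "temperature": self.temperature,
            "max_tokens": self.max_tokens,
            "stream": stream,
        }

    def complete(self, prompt: str) -> dict:
        return self._payload(prompt, stream=False)

    def stream(self, prompt: str) -> dict:
        return self._payload(prompt, stream=True)
```

Using one payload builder for both code paths is what makes a fix like this robust: the bug described (responses cut off mid-generation) typically comes from one of the two paths omitting the limit or relying on a low server-side default.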