feat(ai): config-driven sampling profile — triage T=0, primary T=0.3 top_p=0.9

P1 of family-adaptive-bengio (Mac mini 4-lever bundle).

AIModelConfig: add Optional temperature/top_p fields (None = server default). _request: in the OpenAI/MLX branch, insert the sampling arguments into the payload only when they are set. config.yaml: ai.models.triage.temperature=0.0 (deterministic); primary temperature=0.3, top_p=0.9 (summary creativity). Not applied to the fallback (Anthropic) branch, which is in the scope of a separate plan. Caller code is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 06:37:46 +00:00
parent e4cfd81e15
commit 5cb8d04b50
3 changed files with 18 additions and 6 deletions
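
The behaviour described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not the project's actual code: it assumes `AIModelConfig` is a plain dataclass (the real class may be a Pydantic model and has more fields, such as `endpoint` and `timeout`), `build_payload` is a hypothetical helper standing in for the inline logic in `_request`, and the model names are placeholders.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class AIModelConfig:
    # Hypothetical subset of the real config fields.
    model: str
    max_tokens: int
    temperature: Optional[float] = None  # None = leave it to the server default
    top_p: Optional[float] = None        # None = leave it to the server default


def build_payload(cfg: AIModelConfig, prompt: str) -> dict[str, Any]:
    """Build an OpenAI-style chat payload, adding sampling keys only when set."""
    payload: dict[str, Any] = {
        "model": cfg.model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": cfg.max_tokens,
        "chat_template_kwargs": {"enable_thinking": False},
    }
    # Compare against None rather than truthiness: temperature=0.0 is falsy
    # but is a valid, intentional setting for the triage profile.
    if cfg.temperature is not None:
        payload["temperature"] = cfg.temperature
    if cfg.top_p is not None:
        payload["top_p"] = cfg.top_p
    return payload


# Profiles mirroring the values named in the commit message (model names are placeholders).
triage = AIModelConfig(model="triage-model", max_tokens=256, temperature=0.0)
primary = AIModelConfig(model="primary-model", max_tokens=1024, temperature=0.3, top_p=0.9)
unset = AIModelConfig(model="primary-model", max_tokens=1024)

triage_payload = build_payload(triage, "classify this")
primary_payload = build_payload(primary, "summarize this")
unset_payload = build_payload(unset, "summarize this")
```

With these profiles, the triage payload carries `temperature` but no `top_p`, the primary payload carries both, and a config with neither field set produces the same payload shape as before the change.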
@@ -262,14 +262,19 @@ class AIClient:
            data = response.json()
            return data["content"][0]["text"]
        else:
+            payload = {
+                "model": model_config.model,
+                "messages": [{"role": "user", "content": prompt}],
+                "max_tokens": model_config.max_tokens,
+                "chat_template_kwargs": {"enable_thinking": False},
+            }
+            if model_config.temperature is not None:
+                payload["temperature"] = model_config.temperature
+            if model_config.top_p is not None:
+                payload["top_p"] = model_config.top_p
            response = await self._http.post(
                model_config.endpoint,
-                json={
-                    "model": model_config.model,
-                    "messages": [{"role": "user", "content": prompt}],
-                    "max_tokens": model_config.max_tokens,
-                    "chat_template_kwargs": {"enable_thinking": False},
-                },
+                json=payload,
                timeout=model_config.timeout,
            )
            response.raise_for_status()