- ModelAdapter: generic OpenAI-compatible adapter (stream/complete/health)
- BackendRegistry: health-check loop for the rewriter (EXAONE) and the reasoner (Gemma4)
- Two-stage pipeline: EXAONE rewrite → Gemma reasoning (the rewrite step is exposed as an SSE rewrite event)
- Fallback: EXAONE-only mode when the Mac mini is down; automatic switchover if the stream fails midway
- Cancel-safe: cancellation is checked before and after the rewrite, inside the streaming loop, and on the fallback path
- Rewrite heartbeat: a processing event is sent every 2 seconds while waiting on complete_chat
- JobQueue: concurrency limit based on Semaphore(3), accurate queue position
- GET /chat/{job_id}/status and GET /queue/stats endpoints
- DB: add rewrite_model, reasoning_model, and rewritten_message columns
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
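
The Semaphore(3) concurrency limit and queue position named in the JobQueue bullet could be sketched roughly as follows. This is a minimal illustration under assumptions, not the code from this commit: the class body, the `run` and `position` method names, and the `demo` helper are all hypothetical.

```python
from __future__ import annotations

import asyncio

MAX_CONCURRENT = 3  # mirrors Semaphore(3) from the commit message


class JobQueue:
    """Hypothetical sketch: cap concurrency and report a waiting position."""

    def __init__(self, limit: int = MAX_CONCURRENT) -> None:
        self._sem = asyncio.Semaphore(limit)
        # dicts keep insertion order, so this doubles as a FIFO of waiting job ids
        self._waiting: dict[str, None] = {}

    def position(self, job_id: str) -> int | None:
        """1-based position among waiting jobs; None once running or finished."""
        for index, waiting_id in enumerate(self._waiting, start=1):
            if waiting_id == job_id:
                return index
        return None

    async def run(self, job_id: str, work):
        """Wait for a free slot, then await the zero-argument coroutine function."""
        self._waiting[job_id] = None
        try:
            await self._sem.acquire()
        finally:
            # Leave the waiting list whether we got a slot or were cancelled.
            self._waiting.pop(job_id, None)
        try:
            return await work()
        finally:
            self._sem.release()


async def demo() -> tuple[int, list[int | None]]:
    queue = JobQueue()
    running = 0
    peak = 0
    gate = asyncio.Event()

    async def work() -> None:
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        await gate.wait()  # hold the slot until the demo releases it
        running -= 1

    tasks = [asyncio.create_task(queue.run(f"job-{i}", work)) for i in range(5)]
    await asyncio.sleep(0.01)  # let every task reach its slot or start waiting
    positions = [queue.position(f"job-{i}") for i in range(5)]
    gate.set()
    await asyncio.gather(*tasks)
    return peak, positions


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Tracking waiting job ids separately from the semaphore is one way to report a position, since `asyncio.Semaphore` does not expose its waiters through a public API.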
41 lines
719 B
Python
"""Pydantic models for NanoClaude API."""

from __future__ import annotations

from enum import Enum

from pydantic import BaseModel


class JobStatus(str, Enum):
    queued = "queued"
    processing = "processing"
    completed = "completed"
    failed = "failed"
    cancelled = "cancelled"


class ChatRequest(BaseModel):
    message: str


class ChatResponse(BaseModel):
    job_id: str


class CancelResponse(BaseModel):
    status: str


class JobStatusResponse(BaseModel):
    job_id: str
    status: JobStatus
    created_at: float
    pipeline: bool
    queue_position: int | None = None


class SSEEvent(BaseModel):
    event: str  # ack | processing | rewrite | result | error | done | queued
    data: dict
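
As a rough illustration of how an event shaped like `SSEEvent` might be written to the wire, the sketch below formats one Server-Sent Events frame. It is an assumption, not part of this file: `SSEEventSketch` is a stdlib dataclass standing in for the pydantic model so the example has no third-party dependency, and `to_sse_frame` and `KNOWN_EVENTS` are hypothetical names.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field

# Event names taken from the comment on SSEEvent.event.
KNOWN_EVENTS = {"ack", "processing", "rewrite", "result", "error", "done", "queued"}


@dataclass
class SSEEventSketch:
    """Hypothetical stdlib stand-in for the pydantic SSEEvent model."""

    event: str
    data: dict = field(default_factory=dict)


def to_sse_frame(evt: SSEEventSketch) -> str:
    """Format one event as an SSE frame: an event line, a data line, a blank line."""
    if evt.event not in KNOWN_EVENTS:
        raise ValueError(f"unknown SSE event name: {evt.event!r}")
    # json.dumps without indent emits a single line, so one data: field is enough.
    payload = json.dumps(evt.data, ensure_ascii=False)
    return f"event: {evt.event}\ndata: {payload}\n\n"


if __name__ == "__main__":
    print(to_sse_frame(SSEEventSketch("rewrite", {"text": "hi"})), end="")
```

Validating the event name up front is a design choice of this sketch; the original model only documents the allowed names in a comment and accepts any string.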