feat: AI Gateway Phase 1 - FastAPI 코어 구현

GPU 서버 중앙 AI 라우팅 서비스 초기 구현:
- OpenAI 호환 API (/v1/chat/completions, /v1/models, /v1/embeddings)
- 모델 레지스트리 + 백엔드 헬스체크 (30초 루프)
- Ollama SSE 프록시 (NDJSON → OpenAI SSE 변환)
- JWT 인증 이중 경로 (httpOnly 쿠키 + Bearer 토큰)
- owner/guest 역할 분리, 로그인 rate limiting
- 백엔드별 rate limiting (NanoClaude 대비)
- SQLite 스키마 사전 정의 (aiosqlite + WAL)
- Docker Compose + Caddy 리버스 프록시

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Hyungi Ahn
2026-03-31 13:41:46 +09:00
commit 3794afff95
27 changed files with 1121 additions and 0 deletions

46
hub-api/main.py Normal file
View File

@@ -0,0 +1,46 @@
from contextlib import asynccontextmanager
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from config import settings
from middleware.auth import AuthMiddleware
from routers import auth, chat, embeddings, gpu, health, models
from services.registry import registry
@asynccontextmanager
async def lifespan(app: FastAPI):
await registry.load_backends(settings.backends_config)
registry.start_health_loop()
yield
registry.stop_health_loop()
app = FastAPI(
title="AI Gateway",
version="0.1.0",
lifespan=lifespan,
)
app.add_middleware(
CORSMiddleware,
allow_origins=settings.cors_origins.split(","),
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
app.add_middleware(AuthMiddleware)
app.include_router(auth.router)
app.include_router(chat.router)
app.include_router(models.router)
app.include_router(embeddings.router)
app.include_router(health.router)
app.include_router(gpu.router)
@app.get("/")
async def root():
return {"service": "AI Gateway", "version": "0.1.0"}