gpu-services/hub-api/main.py
Hyungi Ahn 3794afff95 feat: AI Gateway Phase 1 - FastAPI core implementation
Initial implementation of the central AI routing service for the GPU server:
- OpenAI-compatible API (/v1/chat/completions, /v1/models, /v1/embeddings)
- Model registry + backend health checks (30-second loop)
- Ollama SSE proxy (NDJSON → OpenAI SSE conversion)
- Dual-path JWT authentication (httpOnly cookie + Bearer token)
- owner/guest role separation, login rate limiting
- Per-backend rate limiting (in preparation for NanoClaude)
- Predefined SQLite schema (aiosqlite + WAL)
- Docker Compose + Caddy reverse proxy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:41:46 +09:00
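
The "Ollama SSE proxy (NDJSON → OpenAI SSE conversion)" item in the commit message above can be illustrated with a minimal sketch. This is not the code from this commit: the function name `ndjson_to_openai_sse`, the placeholder `chunk_id`, and the fixed `created` value are assumptions made for illustration. It assumes each input line is one JSON object shaped like an Ollama chat stream message (`model`, `message.content`, `done`) and emits OpenAI-style `chat.completion.chunk` events followed by `data: [DONE]`.

```python
import json
from typing import Iterable, Iterator


def ndjson_to_openai_sse(
    lines: Iterable[str],
    chunk_id: str = "chatcmpl-sketch",  # hypothetical placeholder id
    created: int = 0,                   # hypothetical fixed timestamp
) -> Iterator[str]:
    """Convert Ollama-style NDJSON chat lines into OpenAI-style SSE events."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between NDJSON records
        event = json.loads(line)
        done = bool(event.get("done"))
        content = event.get("message", {}).get("content", "")
        chunk = {
            "id": chunk_id,
            "object": "chat.completion.chunk",
            "created": created,
            "model": event.get("model", ""),
            "choices": [
                {
                    "index": 0,
                    # Only include a content delta when there is text to send.
                    "delta": {"content": content} if content else {},
                    "finish_reason": "stop" if done else None,
                }
            ],
        }
        yield f"data: {json.dumps(chunk, ensure_ascii=False)}\n\n"
        if done:
            # OpenAI-style streams end with a literal [DONE] sentinel.
            yield "data: [DONE]\n\n"
            return
```

In a FastAPI route, a generator like this could be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`; how the actual proxy in this repository does it is not shown in this file.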

47 lines
1.1 KiB
Python

from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from config import settings
from middleware.auth import AuthMiddleware
from routers import auth, chat, embeddings, gpu, health, models
from services.registry import registry


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: load backend definitions and begin periodic health checks.
    await registry.load_backends(settings.backends_config)
    registry.start_health_loop()
    try:
        yield
    finally:
        # Shutdown: stop the health loop even if the app exits with an error.
        registry.stop_health_loop()


app = FastAPI(
    title="AI Gateway",
    version="0.1.0",
    lifespan=lifespan,
)

app.add_middleware(
    CORSMiddleware,
    # cors_origins is a comma-separated string; strip stray whitespace
    # and drop empty entries so "a, b" matches both origins.
    allow_origins=[o.strip() for o in settings.cors_origins.split(",") if o.strip()],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
# Starlette runs the most recently added middleware first, so
# AuthMiddleware sees each request before CORSMiddleware does.
app.add_middleware(AuthMiddleware)

app.include_router(auth.router)
app.include_router(chat.router)
app.include_router(models.router)
app.include_router(embeddings.router)
app.include_router(health.router)
app.include_router(gpu.router)


@app.get("/")
async def root():
    return {"service": "AI Gateway", "version": "0.1.0"}
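
The `start_health_loop()` / `stop_health_loop()` pair called from `lifespan` is defined in `services/registry.py`, which is not shown here. As a rough sketch of how such a pair might work, assuming an asyncio background task: the class name `HealthLoop`, the `check` callback, and the `interval` parameter below are all hypothetical, with the 30-second default taken from the commit message.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class HealthLoop:
    """Hypothetical stand-in for the start/stop pair used in lifespan()."""

    def __init__(self, check: Callable[[], Awaitable[None]], interval: float = 30.0):
        self._check = check
        self._interval = interval
        self._task: Optional[asyncio.Task] = None

    def start(self) -> None:
        # Must be called from inside a running event loop (e.g. in lifespan).
        if self._task is None:
            self._task = asyncio.get_running_loop().create_task(self._run())

    def stop(self) -> None:
        # Cancelling the task interrupts the pending sleep or probe.
        if self._task is not None:
            self._task.cancel()
            self._task = None

    async def _run(self) -> None:
        while True:
            try:
                await self._check()
            except Exception:
                pass  # a failing probe should not end the loop
            await asyncio.sleep(self._interval)
```

Because `start()` is synchronous and only schedules a task, it can be called after the awaited `load_backends(...)` step without blocking startup, which matches the call pattern in `lifespan`.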