feat: AI Gateway Phase 1 - FastAPI core implementation

Initial implementation of the central AI routing service for the GPU server:
- OpenAI-compatible API (/v1/chat/completions, /v1/models, /v1/embeddings)
- Model registry + backend health checks (30-second loop)
- Ollama SSE proxy (NDJSON → OpenAI SSE conversion)
- Dual-path JWT authentication (httpOnly cookie + Bearer token)
- owner/guest role separation, login rate limiting
- Per-backend rate limiting (in preparation for NanoClaude)
- SQLite schema defined up front (aiosqlite + WAL)
- Docker Compose + Caddy reverse proxy
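
The NDJSON → OpenAI SSE conversion listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function name ndjson_to_openai_sse, the placeholder chunk id, and the choice to emit a final empty delta with finish_reason "stop" are all assumptions. It relies only on the documented shape of Ollama's streaming chat output (one JSON object per line with "message" and "done" fields) and on the OpenAI "chat.completion.chunk" event format.

```python
import json
import time
from typing import Iterable, Iterator, Optional


def ndjson_to_openai_sse(
    lines: Iterable[str],
    chunk_id: str = "chatcmpl-sketch",  # hypothetical placeholder id
    created: Optional[int] = None,
) -> Iterator[str]:
    """Convert Ollama-style NDJSON chat lines into OpenAI-style SSE events."""
    created = int(time.time()) if created is None else created
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank keep-alive lines
        event = json.loads(raw)
        done = bool(event.get("done"))
        content = event.get("message", {}).get("content", "")
        chunk = {
            "id": chunk_id,
            "object": "chat.completion.chunk",
            "created": created,
            "model": event.get("model", ""),
            "choices": [
                {
                    "index": 0,
                    # Only include "content" when the backend sent text.
                    "delta": {"content": content} if content else {},
                    "finish_reason": "stop" if done else None,
                }
            ],
        }
        yield f"data: {json.dumps(chunk, ensure_ascii=False)}\n\n"
        if done:
            break
    # OpenAI-compatible clients expect this sentinel as the last event.
    yield "data: [DONE]\n\n"
```

In a FastAPI handler, a generator like this would typically be wrapped in a StreamingResponse with media type text/event-stream; an async variant would be needed when the upstream lines come from an async HTTP client.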

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hyungi Ahn
2026-03-31 13:41:46 +09:00
commit 3794afff95
27 changed files with 1121 additions and 0 deletions

docker-compose.yml

@@ -0,0 +1,47 @@
services:
  caddy:
    image: caddy:2-alpine
    container_name: gpu-caddy
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./caddy/Caddyfile:/etc/caddy/Caddyfile
      - caddy_data:/data
    depends_on:
      - hub-api
    networks:
      - gateway-net

  hub-api:
    build: ./hub-api
    container_name: gpu-hub-api
    restart: unless-stopped
    environment:
      - OWNER_PASSWORD=${OWNER_PASSWORD}
      - GUEST_PASSWORD=${GUEST_PASSWORD}
      - JWT_SECRET=${JWT_SECRET}
      - BACKENDS_CONFIG=/app/config/backends.json
      - CORS_ORIGINS=${CORS_ORIGINS:-http://localhost:5173}
      - DB_PATH=/app/data/gateway.db
    volumes:
      - hub_data:/app/data
      - ./backends.json:/app/config/backends.json:ro
    extra_hosts:
      - "host.docker.internal:host-gateway"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 15s
      timeout: 5s
      retries: 3
    networks:
      - gateway-net

volumes:
  caddy_data:
  hub_data:

networks:
  gateway-net:
    name: gpu-gateway-network
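
For illustration, the hub-api service could read the environment variables declared above along these lines. This is a sketch under assumptions, not the code in ./hub-api: the Settings class and load_settings function are hypothetical names, and treating CORS_ORIGINS as a comma-separated list is an assumption the compose file does not confirm. Because Compose substitutes an empty string when ${OWNER_PASSWORD}, ${GUEST_PASSWORD}, or ${JWT_SECRET} is unset on the host, the sketch fails fast on empty secrets.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    owner_password: str
    guest_password: str
    jwt_secret: str
    backends_config: str
    cors_origins: List[str]
    db_path: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the variables named in docker-compose.yml."""
    required = ("OWNER_PASSWORD", "GUEST_PASSWORD", "JWT_SECRET")
    # An unset host variable arrives as "", so treat empty as missing.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))

    # Assumption: multiple origins are separated by commas.
    raw_origins = env.get("CORS_ORIGINS", "http://localhost:5173")
    origins = [o.strip() for o in raw_origins.split(",") if o.strip()]

    return Settings(
        owner_password=env["OWNER_PASSWORD"],
        guest_password=env["GUEST_PASSWORD"],
        jwt_secret=env["JWT_SECRET"],
        # Defaults mirror the values set in the compose file.
        backends_config=env.get("BACKENDS_CONFIG", "/app/config/backends.json"),
        cors_origins=origins,
        db_path=env.get("DB_PATH", "/app/data/gateway.db"),
    )
```

Passing the environment as a mapping keeps the loader easy to exercise in tests without touching the real process environment.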