feat: home-gateway initial setup (full migration from Mac mini to the GPU server)
Consolidate the Mac mini Docker services onto the GPU server after the OrbStack license expired. Migrated nginx to Caddy, enabled automatic HTTPS for 12 subdomains, and wired fail2ban to Caddy's JSON logs.

Key changes:
- home-caddy: Caddy reverse proxy (automatic HTTPS via Let's Encrypt)
- home-fail2ban: security monitoring driven by Caddy JSON logs
- home-ddns: Cloudflare DDNS (API key split out into .env)
- gpu-hub-api/web: AI backend router + web UI (moved over from gpu-services)
- AI runtimes (Ollama) are internal-network only; external traffic goes through the gpu-hub auth gateway

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
.gitignore (vendored, new file, 17 lines)
@@ -0,0 +1,17 @@
# Secrets
.env
ddns/.env

# Runtime data
caddy/logs/
fail2ban/data/
docker-compose.test.yml

# Node
hub-web/node_modules/
hub-web/dist/

# Python
hub-api/__pycache__/
hub-api/*.pyc
hub-api/.venv/
README.md (new file, 68 lines)
@@ -0,0 +1,68 @@
# home-gateway

Unified home network gateway, running on the GPU server (192.168.1.186).

## Components

| Container | Role |
|-----------|------|
| home-caddy | Caddy reverse proxy (80/443, automatic HTTPS) |
| home-fail2ban | Security monitoring based on Caddy JSON logs |
| home-ddns-vpn | Cloudflare DDNS (vpn.hyungi.net) |
| home-ddns-mail | Cloudflare DDNS (mail.hyungi.net) |
| gpu-hub-api | AI backend router (auth gateway) |
| gpu-hub-web | AI hub web UI |

## Routing targets

### GPU server (local)
- `komga.hyungi.net` → :25600
- `document.hyungi.net` → :8080 (Document Server's internal Caddy)
- `ai.hyungi.net` → gpu-hub-api (authenticated external AI access)

### NAS (192.168.1.227)
- `ds1525.hyungi.net` → :5000 (DSM)
- `webdav.hyungi.net` → :5006 (WebDAV)
- `git.hyungi.net` → :10300 (Gitea)
- `vault.hyungi.net` → :8443 (Vaultwarden)
- `link.hyungi.net` → :10002 (Synology Drive)
- `mailplus.hyungi.net` → :21680 (MailPlus)
- `contacts.hyungi.net` → :25555 (Contacts)
- `calendar.hyungi.net` → :20002 (Calendar)
- `note.hyungi.net` → :9350 (Note Station)

### Mac mini (192.168.1.122)
- `jellyfin.hyungi.net` → :8096

## AI access policy

- Ollama/AI runtimes: internal network only (127.0.0.1:11434)
- External AI access: only through the gpu-hub-api auth gateway
- `gpu.hyungi.net`: retired (internal network / Tailscale only)

## Directory layout

```
home-gateway/
├── docker-compose.yml
├── backends.json          # gpu-hub AI backend config
├── caddy/
│   ├── Caddyfile          # reverse proxy config (12 subdomains)
│   └── logs/              # Caddy JSON logs (fed to fail2ban)
├── fail2ban/
│   ├── jail.local
│   └── data/filter.d/     # custom filters for Caddy
├── ddns/
│   └── .env               # Cloudflare API key
├── hub-api/               # GPU Hub FastAPI backend
└── hub-web/               # GPU Hub React frontend
```

## Related standalone services (separate compose files)

- `~/qdrant/` — Qdrant vector DB (127.0.0.1:6333)
- `~/ollama/` — Ollama GPU inference (127.0.0.1:11434)

## Migration history

- 2026-04-05: full migration from Mac mini (OrbStack) to the GPU server
- nginx consolidated into Caddy
- manually managed Let's Encrypt → automatic HTTPS via Caddy
- Cloudflare DDNS API key split out into .env
- fail2ban switched from nginx filters to Caddy JSON filters
backends.json (new file, 22 lines)
@@ -0,0 +1,22 @@
[
  {
    "id": "ollama-gpu",
    "type": "ollama",
    "url": "http://host.docker.internal:11434",
    "models": [
      { "id": "bge-m3", "capabilities": ["embed"], "priority": 1 }
    ],
    "access": "all",
    "rate_limit": null
  },
  {
    "id": "mlx-mac",
    "type": "openai-compat",
    "url": "http://192.168.1.122:8800",
    "models": [
      { "id": "qwen3.5:35b-a3b", "backend_model_id": "mlx-community/Qwen3.5-35B-A3B-4bit", "capabilities": ["chat"], "priority": 1 }
    ],
    "access": "all",
    "rate_limit": null
  }
]
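Each entry in backends.json maps user-facing model IDs to a backend with a priority. A minimal sketch of how a registry might resolve a model ID against this file (a simplified assumption about what hub-api's `registry.resolve_model` does; the real implementation also filters by role and backend health):

```python
import json

# Inline copy of the backends.json shape above (assumption: trimmed to the
# fields that matter for resolution).
BACKENDS = json.loads("""
[
  {"id": "ollama-gpu", "type": "ollama",
   "models": [{"id": "bge-m3", "capabilities": ["embed"], "priority": 1}]},
  {"id": "mlx-mac", "type": "openai-compat",
   "models": [{"id": "qwen3.5:35b-a3b",
               "backend_model_id": "mlx-community/Qwen3.5-35B-A3B-4bit",
               "capabilities": ["chat"], "priority": 1}]}
]
""")

def resolve_model(model_id: str):
    """Return (backend, model) for the best backend serving model_id, or None."""
    candidates = [
        (b, m)
        for b in BACKENDS
        for m in b["models"]
        if m["id"] == model_id
    ]
    # Lower priority number wins, mirroring the "priority" field.
    candidates.sort(key=lambda bm: bm[1]["priority"])
    return candidates[0] if candidates else None

backend, model = resolve_model("bge-m3")
print(backend["id"])  # ollama-gpu
```

The `backend_model_id` field is how `qwen3.5:35b-a3b` gets rewritten to the MLX checkpoint name before the request is proxied.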
caddy/Caddyfile (new file, 168 lines)
@@ -0,0 +1,168 @@
{
    # Global options
    log default {
        output file /var/log/caddy/access.log {
            roll_size 100MiB
            roll_keep 5
        }
        format json
    }
    servers {
        trusted_proxies static 173.245.48.0/20 103.21.244.0/22 103.22.200.0/22 103.31.4.0/22 104.16.0.0/13 104.24.0.0/14 108.162.192.0/18 131.0.72.0/22 141.101.64.0/18 162.158.0.0/15 172.64.0.0/13 188.114.96.0/20 190.93.240.0/20 197.234.240.0/22 198.41.128.0/17 2400:cb00::/32 2606:4700::/32 2803:f800::/32 2405:b500::/32 2405:8100::/32 2a06:98c0::/29 2c0f:f248::/32
    }
}

# ============================================================
# GPU Hub — default route (direct IP access, no HTTPS)
# ============================================================
:80 {
    handle /v1/* {
        reverse_proxy gpu-hub-api:8000 {
            flush_interval -1
        }
    }
    handle /auth/* {
        reverse_proxy gpu-hub-api:8000
    }
    handle /health {
        reverse_proxy gpu-hub-api:8000
    }
    handle /health/* {
        reverse_proxy gpu-hub-api:8000
    }
    handle /gpu {
        reverse_proxy gpu-hub-api:8000
    }
    handle {
        reverse_proxy gpu-hub-web:80
    }
}

# ============================================================
# AI Gateway — authenticated external access
# ============================================================
ai.hyungi.net {
    reverse_proxy gpu-hub-api:8000 {
        flush_interval -1
    }
}

# ============================================================
# Jellyfin — Mac mini (192.168.1.122)
# ============================================================
jellyfin.hyungi.net {
    reverse_proxy 192.168.1.122:8096 {
        transport http {
            read_timeout 300s
            write_timeout 300s
        }
    }
}

# ============================================================
# Komga — GPU local
# ============================================================
komga.hyungi.net {
    reverse_proxy host.docker.internal:25600
}

# ============================================================
# Document Server — GPU local (via internal Caddy; switching to direct routing in Phase 6)
# ============================================================
document.hyungi.net {
    request_body {
        max_size 100MB
    }
    reverse_proxy host.docker.internal:8080
}

# ============================================================
# WebDAV — NAS (192.168.1.227)
# ============================================================
webdav.hyungi.net {
    request_body {
        max_size 2GB
    }
    reverse_proxy https://192.168.1.227:5006 {
        transport http {
            tls_insecure_skip_verify
            read_timeout 600s
            write_timeout 600s
        }
        header_up Host {host}
        header_up X-Real-IP {remote_host}
        header_up X-Forwarded-For {remote_host}
        header_up X-Forwarded-Proto {scheme}
    }
}

# ============================================================
# DSM — NAS
# ============================================================
ds1525.hyungi.net {
    request_body {
        max_size 0
    }
    reverse_proxy 192.168.1.227:5000
}

# ============================================================
# Gitea — NAS
# ============================================================
git.hyungi.net {
    request_body {
        max_size 512MB
    }
    reverse_proxy 192.168.1.227:10300
}

# ============================================================
# Vaultwarden — NAS (WebSocket)
# ============================================================
vault.hyungi.net {
    reverse_proxy 192.168.1.227:8443
}

# ============================================================
# Synology Drive — NAS (WebSocket, unlimited upload)
# ============================================================
link.hyungi.net {
    request_body {
        max_size 0
    }
    reverse_proxy 192.168.1.227:10002
}

# ============================================================
# MailPlus — NAS
# ============================================================
mailplus.hyungi.net {
    request_body {
        max_size 100MB
    }
    reverse_proxy 192.168.1.227:21680
}

# ============================================================
# Contacts — NAS
# ============================================================
contacts.hyungi.net {
    reverse_proxy 192.168.1.227:25555
}

# ============================================================
# Calendar — NAS
# ============================================================
calendar.hyungi.net {
    reverse_proxy 192.168.1.227:20002
}

# ============================================================
# Note Station — NAS (WebSocket, unlimited upload)
# ============================================================
note.hyungi.net {
    request_body {
        max_size 0
    }
    reverse_proxy 192.168.1.227:9350
}
docker-compose.yml (new file, 105 lines)
@@ -0,0 +1,105 @@
services:
  # ============================================================
  # Edge Layer — Reverse Proxy + Security + DDNS
  # ============================================================
  home-caddy:
    image: caddy:2-alpine
    container_name: home-caddy
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp"
    volumes:
      - ./caddy/Caddyfile:/etc/caddy/Caddyfile:ro
      - ./caddy/logs:/var/log/caddy
      - caddy_data:/data
      - caddy_config:/config
    extra_hosts:
      - "host.docker.internal:host-gateway"
    depends_on:
      gpu-hub-api:
        condition: service_healthy
    networks:
      - gateway-net

  home-fail2ban:
    image: crazymax/fail2ban:latest
    container_name: home-fail2ban
    restart: unless-stopped
    network_mode: host
    cap_add:
      - NET_ADMIN
      - NET_RAW
    volumes:
      - ./fail2ban/data:/data
      - ./caddy/logs:/var/log/caddy:ro
      - ./fail2ban/jail.local:/etc/fail2ban/jail.local:ro
    environment:
      - TZ=Asia/Seoul
      - F2B_LOG_LEVEL=INFO

  home-ddns-vpn:
    image: oznu/cloudflare-ddns:latest
    container_name: home-ddns-vpn
    restart: unless-stopped
    env_file:
      - ./ddns/.env
    environment:
      - ZONE=hyungi.net
      - SUBDOMAIN=vpn
      - PROXIED=false

  home-ddns-mail:
    image: oznu/cloudflare-ddns:latest
    container_name: home-ddns-mail
    restart: unless-stopped
    env_file:
      - ./ddns/.env
    environment:
      - ZONE=hyungi.net
      - SUBDOMAIN=mail
      - PROXIED=false

  # ============================================================
  # GPU Hub — AI Backend Router + Web UI
  # ============================================================
  gpu-hub-api:
    build: ./hub-api
    container_name: gpu-hub-api
    restart: unless-stopped
    environment:
      - OWNER_PASSWORD=${OWNER_PASSWORD}
      - GUEST_PASSWORD=${GUEST_PASSWORD}
      - JWT_SECRET=${JWT_SECRET}
      - BACKENDS_CONFIG=/app/config/backends.json
      - CORS_ORIGINS=${CORS_ORIGINS:-http://localhost:5173}
      - DB_PATH=/app/data/gateway.db
    volumes:
      - hub_data:/app/data
      - ./backends.json:/app/config/backends.json:ro
    extra_hosts:
      - "host.docker.internal:host-gateway"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 15s
      timeout: 5s
      retries: 3
    networks:
      - gateway-net

  gpu-hub-web:
    build: ./hub-web
    container_name: gpu-hub-web
    restart: unless-stopped
    networks:
      - gateway-net

volumes:
  caddy_data:
  caddy_config:
  hub_data:

networks:
  gateway-net:
    name: home-gateway-network
fail2ban/filter.d/caddy-auth.conf (new file, 4 lines)
@@ -0,0 +1,4 @@
[Definition]
failregex = ^.*"client_ip":"<HOST>".*"status":\s*401.*$
ignoreregex =
datepattern = "ts":{EPOCH}
fail2ban/filter.d/caddy-botsearch.conf (new file, 4 lines)
@@ -0,0 +1,4 @@
[Definition]
failregex = ^.*"client_ip":"<HOST>".*"status":\s*(403|404|444).*$
ignoreregex =
datepattern = "ts":{EPOCH}
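Both filters match Caddy's single-line JSON access log entries by status code. A quick way to sanity-check the regex idea outside fail2ban (fail2ban substitutes `<HOST>` with its own IP pattern; the capture group and the sample log line below are stand-ins for illustration):

```python
import re

# Stand-in for the caddy-botsearch failregex, with <HOST> replaced by a
# named capture group (assumption: fail2ban's <HOST> expands similarly).
FAILREGEX = re.compile(
    r'^.*"client_ip":"(?P<host>[^"]+)".*"status":\s*(403|404|444).*$'
)

# Illustrative Caddy JSON access-log line (field order matches real logs:
# "request" object before "status").
sample = '{"ts":1733389200.5,"request":{"client_ip":"203.0.113.7"},"status":404}'

m = FAILREGEX.match(sample)
print(m.group("host"))  # 203.0.113.7
```

The `datepattern = "ts":{EPOCH}` line tells fail2ban to read the epoch timestamp from Caddy's `ts` field instead of trying to parse a textual date.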
fail2ban/jail.local (new file, 27 lines)
@@ -0,0 +1,27 @@
[DEFAULT]
bantime = 3600
findtime = 600
maxretry = 5
backend = auto
enabled = false

[sshd]
enabled = false

# Block bots/scanners hitting Caddy (repeated 404/403)
[caddy-botsearch]
enabled = true
port = 80,443
filter = caddy-botsearch
logpath = /var/log/caddy/access.log
maxretry = 2
bantime = 86400

# Block auth failures against Caddy (repeated 401)
[caddy-auth]
enabled = true
port = 80,443
filter = caddy-auth
logpath = /var/log/caddy/access.log
maxretry = 3
bantime = 1800
hub-api/Dockerfile (new file, 16 lines)
@@ -0,0 +1,16 @@
FROM python:3.12-slim

RUN apt-get update && apt-get install -y --no-install-recommends curl && rm -rf /var/lib/apt/lists/*

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

RUN mkdir -p /app/data

EXPOSE 8000

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
hub-api/config.py (new file, 21 lines)
@@ -0,0 +1,21 @@
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    owner_password: str = "changeme"
    guest_password: str = "guest"
    jwt_secret: str = "dev-secret-change-in-production"
    jwt_algorithm: str = "HS256"
    jwt_expire_hours: int = 24

    backends_config: str = "/app/config/backends.json"
    cors_origins: str = "http://localhost:5173"

    nvidia_smi_path: str = "/usr/bin/nvidia-smi"

    db_path: str = "/app/data/gateway.db"

    model_config = {"env_file": ".env", "extra": "ignore"}


settings = Settings()
hub-api/db/__init__.py (new file, empty)
hub-api/db/database.py (new file, 50 lines)
@@ -0,0 +1,50 @@
import aiosqlite

from config import settings

SCHEMA = """
CREATE TABLE IF NOT EXISTS chat_sessions (
    id TEXT PRIMARY KEY,
    title TEXT,
    model TEXT NOT NULL,
    role TEXT NOT NULL DEFAULT 'guest',
    created_at REAL NOT NULL
);

CREATE TABLE IF NOT EXISTS chat_messages (
    id TEXT PRIMARY KEY,
    session_id TEXT NOT NULL REFERENCES chat_sessions(id),
    role TEXT NOT NULL,
    content TEXT NOT NULL,
    created_at REAL NOT NULL
);

CREATE TABLE IF NOT EXISTS usage_logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    backend_id TEXT NOT NULL,
    model TEXT NOT NULL,
    prompt_tokens INTEGER DEFAULT 0,
    completion_tokens INTEGER DEFAULT 0,
    latency_ms REAL DEFAULT 0,
    user_role TEXT NOT NULL DEFAULT 'guest',
    created_at REAL NOT NULL
);

CREATE INDEX IF NOT EXISTS idx_messages_session ON chat_messages(session_id);
CREATE INDEX IF NOT EXISTS idx_usage_created ON usage_logs(created_at);
"""


async def init_db():
    """Initialize SQLite database with WAL mode and schema."""
    async with aiosqlite.connect(settings.db_path) as db:
        await db.execute("PRAGMA journal_mode=WAL")
        await db.executescript(SCHEMA)
        await db.commit()


async def get_db() -> aiosqlite.Connection:
    """Get a database connection."""
    db = await aiosqlite.connect(settings.db_path)
    await db.execute("PRAGMA journal_mode=WAL")
    return db
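The schema is plain SQLite, so it can be exercised with the stdlib `sqlite3` driver even though the service uses `aiosqlite` (both execute the same SQL). A small sketch inserting one session and one message against the sessions/messages tables (the demo values are illustrative):

```python
import sqlite3
import time
import uuid

# Trimmed copy of the chat tables from hub-api/db/database.py.
SCHEMA = """
CREATE TABLE IF NOT EXISTS chat_sessions (
    id TEXT PRIMARY KEY,
    title TEXT,
    model TEXT NOT NULL,
    role TEXT NOT NULL DEFAULT 'guest',
    created_at REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS chat_messages (
    id TEXT PRIMARY KEY,
    session_id TEXT NOT NULL REFERENCES chat_sessions(id),
    role TEXT NOT NULL,
    content TEXT NOT NULL,
    created_at REAL NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_messages_session ON chat_messages(session_id);
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)

sid = str(uuid.uuid4())
db.execute(
    "INSERT INTO chat_sessions (id, title, model, created_at) VALUES (?, ?, ?, ?)",
    (sid, "demo", "qwen3.5:35b-a3b", time.time()),
)
db.execute(
    "INSERT INTO chat_messages (id, session_id, role, content, created_at) VALUES (?, ?, ?, ?, ?)",
    (str(uuid.uuid4()), sid, "user", "hello", time.time()),
)

count, = db.execute(
    "SELECT COUNT(*) FROM chat_messages WHERE session_id = ?", (sid,)
).fetchone()
print(count)  # 1
```

Note that `role` lives on both tables: on `chat_sessions` it records who owns the session (owner/guest), on `chat_messages` it is the chat turn role (user/assistant).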
hub-api/db/models.py (new file, 2 lines)
@@ -0,0 +1,2 @@
# DB model helpers — used in Phase 3 for logging
# Schema defined in database.py
hub-api/main.py (new file, 46 lines)
@@ -0,0 +1,46 @@
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from config import settings
from middleware.auth import AuthMiddleware
from routers import auth, chat, embeddings, gpu, health, models
from services.registry import registry


@asynccontextmanager
async def lifespan(app: FastAPI):
    await registry.load_backends(settings.backends_config)
    registry.start_health_loop()
    yield
    registry.stop_health_loop()


app = FastAPI(
    title="AI Gateway",
    version="0.1.0",
    lifespan=lifespan,
)

app.add_middleware(
    CORSMiddleware,
    allow_origins=settings.cors_origins.split(","),
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

app.add_middleware(AuthMiddleware)

app.include_router(auth.router)
app.include_router(chat.router)
app.include_router(models.router)
app.include_router(embeddings.router)
app.include_router(health.router)
app.include_router(gpu.router)


@app.get("/")
async def root():
    return {"service": "AI Gateway", "version": "0.1.0"}
hub-api/middleware/__init__.py (new file, empty)
hub-api/middleware/auth.py (new file, 96 lines)
@@ -0,0 +1,96 @@
from __future__ import annotations

import time

from jose import JWTError, jwt
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request

from config import settings

# Paths that don't require authentication
PUBLIC_PATHS = {"/", "/health", "/auth/login", "/docs", "/openapi.json"}
PUBLIC_PREFIXES = ("/health/",)


class AuthMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        path = request.url.path

        # Skip auth for public paths
        if path in PUBLIC_PATHS or any(path.startswith(p) for p in PUBLIC_PREFIXES):
            request.state.role = "anonymous"
            return await call_next(request)

        # Skip auth for OPTIONS (CORS preflight)
        if request.method == "OPTIONS":
            return await call_next(request)

        # Try Bearer token first, then cookie
        token = _extract_token(request)
        if not token:
            request.state.role = "anonymous"
            return await call_next(request)

        # Verify JWT
        payload = _verify_token(token)
        if payload:
            request.state.role = payload.get("role", "guest")
        else:
            request.state.role = "anonymous"

        return await call_next(request)


def create_token(role: str) -> str:
    payload = {
        "role": role,
        "exp": time.time() + settings.jwt_expire_hours * 3600,
        "iat": time.time(),
    }
    return jwt.encode(payload, settings.jwt_secret, algorithm=settings.jwt_algorithm)


def _extract_token(request: Request) -> str | None:
    # 1. Authorization: Bearer header
    auth_header = request.headers.get("authorization", "")
    if auth_header.startswith("Bearer "):
        return auth_header[7:]

    # 2. httpOnly cookie
    return request.cookies.get("token")


def _verify_token(token: str) -> dict | None:
    try:
        payload = jwt.decode(
            token, settings.jwt_secret, algorithms=[settings.jwt_algorithm]
        )
        if payload.get("exp", 0) < time.time():
            return None
        return payload
    except JWTError:
        return None


# Login rate limiting (IP-based)
_login_attempts: dict[str, list[float]] = {}
MAX_ATTEMPTS = 5
LOCKOUT_SECONDS = 60


def check_login_rate_limit(ip: str) -> bool:
    """Returns True if login is allowed for this IP."""
    now = time.time()
    attempts = _login_attempts.get(ip, [])
    # Clean old attempts
    attempts = [t for t in attempts if now - t < LOCKOUT_SECONDS]
    _login_attempts[ip] = attempts
    return len(attempts) < MAX_ATTEMPTS


def record_login_attempt(ip: str):
    now = time.time()
    if ip not in _login_attempts:
        _login_attempts[ip] = []
    _login_attempts[ip].append(now)
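The login limiter is a plain in-memory sliding window: at most MAX_ATTEMPTS attempts per IP inside LOCKOUT_SECONDS, with stale timestamps pruned on every check. A self-contained copy of those two functions showing the intended call pattern (check, then record), with an illustrative IP:

```python
import time

# In-memory attempt log, as in middleware/auth.py: IP -> attempt timestamps.
_login_attempts: dict[str, list[float]] = {}
MAX_ATTEMPTS = 5
LOCKOUT_SECONDS = 60


def check_login_rate_limit(ip: str) -> bool:
    """True if a login attempt is currently allowed for this IP."""
    now = time.time()
    # Drop attempts older than the window, then compare against the cap.
    attempts = [t for t in _login_attempts.get(ip, []) if now - t < LOCKOUT_SECONDS]
    _login_attempts[ip] = attempts
    return len(attempts) < MAX_ATTEMPTS


def record_login_attempt(ip: str) -> None:
    _login_attempts.setdefault(ip, []).append(time.time())


# Five attempts are allowed; the sixth within the window is blocked.
for _ in range(5):
    assert check_login_rate_limit("198.51.100.9")
    record_login_attempt("198.51.100.9")

print(check_login_rate_limit("198.51.100.9"))  # False
```

Because the dict lives in process memory, limits reset on container restart and are not shared across workers; that is consistent with running a single uvicorn process, as the Dockerfile does.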
hub-api/middleware/rate_limit.py (new file, 18 lines)
@@ -0,0 +1,18 @@
from fastapi import HTTPException

from services.registry import registry


def check_backend_rate_limit(backend_id: str):
    """Raise 429 if rate limit exceeded for this backend."""
    if not registry.check_rate_limit(backend_id):
        raise HTTPException(
            status_code=429,
            detail={
                "error": {
                    "message": f"Rate limit exceeded for backend '{backend_id}'",
                    "type": "rate_limit_error",
                    "code": "rate_limit_exceeded",
                }
            },
        )
hub-api/requirements.txt (new file, 7 lines)
@@ -0,0 +1,7 @@
fastapi==0.115.0
uvicorn[standard]==0.30.0
httpx==0.27.0
pydantic-settings==2.5.0
python-jose[cryptography]==3.3.0
python-multipart==0.0.9
aiosqlite==0.20.0
hub-api/routers/__init__.py (new file, empty)
hub-api/routers/auth.py (new file, 79 lines)
@@ -0,0 +1,79 @@
from fastapi import APIRouter, Request, Response
from pydantic import BaseModel

from config import settings
from middleware.auth import (
    check_login_rate_limit,
    create_token,
    record_login_attempt,
)

router = APIRouter(prefix="/auth", tags=["auth"])


class LoginRequest(BaseModel):
    password: str


class LoginResponse(BaseModel):
    role: str
    token: str


@router.post("/login")
async def login(body: LoginRequest, request: Request, response: Response):
    ip = request.client.host if request.client else "unknown"

    if not check_login_rate_limit(ip):
        return _error_response(429, "Too many login attempts. Try again in 1 minute.")

    record_login_attempt(ip)

    if body.password == settings.owner_password:
        role = "owner"
    elif body.password == settings.guest_password:
        role = "guest"
    else:
        return _error_response(401, "Invalid password")

    token = create_token(role)

    # Set httpOnly cookie for web UI
    response.set_cookie(
        key="token",
        value=token,
        httponly=True,
        samesite="lax",
        max_age=settings.jwt_expire_hours * 3600,
    )

    return LoginResponse(role=role, token=token)


@router.get("/me")
async def me(request: Request):
    role = getattr(request.state, "role", "anonymous")
    if role == "anonymous":
        return _error_response(401, "Not authenticated")
    return {"role": role}


@router.post("/logout")
async def logout(response: Response):
    response.delete_cookie("token")
    return {"ok": True}


def _error_response(status_code: int, message: str):
    from fastapi.responses import JSONResponse

    return JSONResponse(
        status_code=status_code,
        content={
            "error": {
                "message": message,
                "type": "auth_error",
                "code": f"auth_{status_code}",
            }
        },
    )
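Error responses across the routers use one OpenAI-style envelope: an `error` object with `message`, `type`, and `code`. A minimal sketch of the shape `_error_response` emits (the helper below is illustrative, not the service's code):

```python
import json

def error_body(status_code: int, message: str) -> dict:
    """Build the auth error envelope used by routers/auth.py."""
    return {
        "error": {
            "message": message,
            "type": "auth_error",
            "code": f"auth_{status_code}",
        }
    }

print(json.dumps(error_body(401, "Invalid password")))
```

Keeping this shape across `/auth`, `/v1/chat/completions`, and `/v1/embeddings` lets OpenAI-compatible clients parse gateway errors the same way they parse upstream ones.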
hub-api/routers/chat.py (new file, 112 lines)
@@ -0,0 +1,112 @@
from typing import List, Optional

from fastapi import APIRouter, HTTPException, Request
from fastapi.responses import JSONResponse, StreamingResponse
from pydantic import BaseModel

from middleware.rate_limit import check_backend_rate_limit
from services import proxy_ollama, proxy_openai
from services.registry import registry

router = APIRouter(prefix="/v1", tags=["chat"])


class ChatMessage(BaseModel):
    role: str
    content: str


class ChatRequest(BaseModel):
    model: str
    messages: List[ChatMessage]
    stream: bool = False
    temperature: Optional[float] = None
    max_tokens: Optional[int] = None


@router.post("/chat/completions")
async def chat_completions(body: ChatRequest, request: Request):
    role = getattr(request.state, "role", "anonymous")
    if role == "anonymous":
        raise HTTPException(
            status_code=401,
            detail={"error": {"message": "Authentication required", "type": "auth_error", "code": "unauthorized"}},
        )

    # Resolve model to backend
    result = registry.resolve_model(body.model, role)
    if not result:
        raise HTTPException(
            status_code=404,
            detail={
                "error": {
                    "message": f"Model '{body.model}' not found or not available",
                    "type": "invalid_request_error",
                    "code": "model_not_found",
                }
            },
        )

    backend, model_info = result

    # Check rate limit
    check_backend_rate_limit(backend.id)

    # Record request for rate limiting
    registry.record_request(backend.id)

    messages = [{"role": m.role, "content": m.content} for m in body.messages]
    kwargs = {}
    if body.temperature is not None:
        kwargs["temperature"] = body.temperature

    # Use backend-specific model ID if configured, otherwise use the user-facing ID
    actual_model = model_info.backend_model_id or body.model

    # Route to appropriate proxy
    if backend.type == "ollama":
        if body.stream:
            return StreamingResponse(
                proxy_ollama.stream_chat(
                    backend.url, actual_model, messages, **kwargs
                ),
                media_type="text/event-stream",
                headers={
                    "Cache-Control": "no-cache",
                    "X-Accel-Buffering": "no",
                },
            )
        else:
            result = await proxy_ollama.complete_chat(
                backend.url, actual_model, messages, **kwargs
            )
            return JSONResponse(content=result)

    if backend.type == "openai-compat":
        if body.stream:
            return StreamingResponse(
                proxy_openai.stream_chat(
                    backend.url, actual_model, messages, **kwargs
                ),
                media_type="text/event-stream",
                headers={
                    "Cache-Control": "no-cache",
                    "X-Accel-Buffering": "no",
                },
            )
        else:
            result = await proxy_openai.complete_chat(
                backend.url, actual_model, messages, **kwargs
            )
            return JSONResponse(content=result)

    raise HTTPException(
        status_code=501,
        detail={
            "error": {
                "message": f"Backend type '{backend.type}' not yet implemented",
                "type": "api_error",
                "code": "not_implemented",
            }
        },
    )
67 hub-api/routers/embeddings.py Normal file
@@ -0,0 +1,67 @@

from typing import List, Union

from fastapi import APIRouter, HTTPException, Request
from pydantic import BaseModel

from services import proxy_ollama
from services.registry import registry

router = APIRouter(prefix="/v1", tags=["embeddings"])


class EmbeddingRequest(BaseModel):
    model: str
    input: Union[str, List[str]]


@router.post("/embeddings")
async def create_embedding(body: EmbeddingRequest, request: Request):
    role = getattr(request.state, "role", "anonymous")
    if role == "anonymous":
        raise HTTPException(
            status_code=401,
            detail={"error": {"message": "Authentication required", "type": "auth_error", "code": "unauthorized"}},
        )

    result = registry.resolve_model(body.model, role)
    if not result:
        raise HTTPException(
            status_code=404,
            detail={
                "error": {
                    "message": f"Model '{body.model}' not found or not available",
                    "type": "invalid_request_error",
                    "code": "model_not_found",
                }
            },
        )

    backend, model_info = result

    if "embed" not in model_info.capabilities:
        raise HTTPException(
            status_code=400,
            detail={
                "error": {
                    "message": f"Model '{body.model}' does not support embeddings",
                    "type": "invalid_request_error",
                    "code": "capability_mismatch",
                }
            },
        )

    if backend.type == "ollama":
        return await proxy_ollama.generate_embedding(
            backend.url, body.model, body.input
        )

    raise HTTPException(
        status_code=501,
        detail={
            "error": {
                "message": f"Embedding not supported for backend type '{backend.type}'",
                "type": "api_error",
                "code": "not_implemented",
            }
        },
    )
13 hub-api/routers/gpu.py Normal file
@@ -0,0 +1,13 @@

from fastapi import APIRouter

from services.gpu_monitor import get_gpu_info

router = APIRouter(tags=["gpu"])


@router.get("/gpu")
async def gpu_status():
    info = await get_gpu_info()
    if not info:
        return {"error": {"message": "GPU info unavailable", "type": "api_error", "code": "gpu_unavailable"}}
    return info
31 hub-api/routers/health.py Normal file
@@ -0,0 +1,31 @@

from fastapi import APIRouter

from services.gpu_monitor import get_gpu_info
from services.registry import registry

router = APIRouter(tags=["health"])


@router.get("/health")
async def health():
    gpu = await get_gpu_info()
    return {
        "status": "ok",
        "backends": registry.get_health_summary(),
        "gpu": gpu,
    }


@router.get("/health/{backend_id}")
async def backend_health(backend_id: str):
    backend = registry.backends.get(backend_id)
    if not backend:
        return {"error": {"message": f"Backend '{backend_id}' not found"}}

    return {
        "id": backend.id,
        "type": backend.type,
        "status": "healthy" if backend.healthy else "down",
        "models": [m.id for m in backend.models],
        "latency_ms": backend.latency_ms,
    }
12 hub-api/routers/models.py Normal file
@@ -0,0 +1,12 @@

from fastapi import APIRouter, Request

from services.registry import registry

router = APIRouter(prefix="/v1", tags=["models"])


@router.get("/models")
async def list_models(request: Request):
    role = getattr(request.state, "role", "anonymous")
    models = registry.list_models(role)
    return {"object": "list", "data": models}
0 hub-api/services/__init__.py Normal file
41 hub-api/services/gpu_monitor.py Normal file
@@ -0,0 +1,41 @@

from __future__ import annotations

import asyncio
import logging

from config import settings

logger = logging.getLogger(__name__)


async def get_gpu_info() -> dict | None:
    """Run nvidia-smi and parse GPU info."""
    try:
        proc = await asyncio.create_subprocess_exec(
            settings.nvidia_smi_path,
            "--query-gpu=utilization.gpu,temperature.gpu,memory.used,memory.total,power.draw,name",
            "--format=csv,noheader,nounits",
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=5.0)

        if proc.returncode != 0:
            logger.debug("nvidia-smi failed: %s", stderr.decode())
            return None

        line = stdout.decode().strip().split("\n")[0]
        parts = [p.strip() for p in line.split(",")]
        if len(parts) < 6:
            return None

        return {
            "utilization": int(parts[0]),
            "temperature": int(parts[1]),
            "vram_used": int(parts[2]),
            "vram_total": int(parts[3]),
            "power_draw": float(parts[4]),
            "name": parts[5],
        }
    except (FileNotFoundError, asyncio.TimeoutError):
        return None
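The CSV parsing step of `get_gpu_info()` can be exercised without a GPU by feeding it a canned line. A minimal standalone sketch; the sample nvidia-smi output below is made up, not captured from the real server:

```python
from __future__ import annotations


def parse_smi_line(line: str) -> dict | None:
    # Same field order as the --query-gpu list in gpu_monitor.py:
    # utilization, temperature, memory.used, memory.total, power.draw, name
    parts = [p.strip() for p in line.split(",")]
    if len(parts) < 6:
        return None
    return {
        "utilization": int(parts[0]),
        "temperature": int(parts[1]),
        "vram_used": int(parts[2]),
        "vram_total": int(parts[3]),
        "power_draw": float(parts[4]),
        "name": parts[5],
    }


# Hypothetical sample line in csv,noheader,nounits format
sample = "42, 61, 8192, 24576, 180.5, NVIDIA GeForce RTX 4090"
print(parse_smi_line(sample))
```

A short or malformed line returns `None`, which the `/gpu` router surfaces as a `gpu_unavailable` error.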
156 hub-api/services/proxy_ollama.py Normal file
@@ -0,0 +1,156 @@

from __future__ import annotations

import json
import logging
from collections.abc import AsyncGenerator

import httpx

logger = logging.getLogger(__name__)


async def stream_chat(
    base_url: str,
    model: str,
    messages: list[dict],
    **kwargs,
) -> AsyncGenerator[str, None]:
    """Proxy Ollama chat streaming, converting NDJSON to OpenAI SSE format."""
    payload = {
        "model": model,
        "messages": messages,
        "stream": True,
        **{k: v for k, v in kwargs.items() if v is not None},
    }

    async with httpx.AsyncClient(timeout=120.0) as client:
        async with client.stream(
            "POST",
            f"{base_url}/api/chat",
            json=payload,
        ) as resp:
            if resp.status_code != 200:
                body = await resp.aread()
                error_msg = body.decode("utf-8", errors="replace")
                yield _error_event(f"Ollama error: {error_msg}")
                return

            async for line in resp.aiter_lines():
                if not line.strip():
                    continue
                try:
                    chunk = json.loads(line)
                except json.JSONDecodeError:
                    continue

                if chunk.get("done"):
                    # Final chunk — send [DONE]
                    yield "data: [DONE]\n\n"
                    return

                content = chunk.get("message", {}).get("content", "")
                if content:
                    openai_chunk = {
                        "id": "chatcmpl-gateway",
                        "object": "chat.completion.chunk",
                        "model": model,
                        "choices": [
                            {
                                "index": 0,
                                "delta": {"content": content},
                                "finish_reason": None,
                            }
                        ],
                    }
                    yield f"data: {json.dumps(openai_chunk)}\n\n"


async def complete_chat(
    base_url: str,
    model: str,
    messages: list[dict],
    **kwargs,
) -> dict:
    """Non-streaming Ollama chat, returns OpenAI-compatible response."""
    payload = {
        "model": model,
        "messages": messages,
        "stream": False,
        **{k: v for k, v in kwargs.items() if v is not None},
    }

    async with httpx.AsyncClient(timeout=120.0) as client:
        resp = await client.post(f"{base_url}/api/chat", json=payload)
        resp.raise_for_status()
        data = resp.json()

    return {
        "id": "chatcmpl-gateway",
        "object": "chat.completion",
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": data.get("message", {}).get("content", ""),
                },
                "finish_reason": "stop",
            }
        ],
        "usage": {
            "prompt_tokens": data.get("prompt_eval_count", 0),
            "completion_tokens": data.get("eval_count", 0),
            "total_tokens": data.get("prompt_eval_count", 0)
            + data.get("eval_count", 0),
        },
    }


async def generate_embedding(
    base_url: str,
    model: str,
    input_text: str | list[str],
) -> dict:
    """Ollama embedding, returns OpenAI-compatible response."""
    texts = [input_text] if isinstance(input_text, str) else input_text

    async with httpx.AsyncClient(timeout=60.0) as client:
        resp = await client.post(
            f"{base_url}/api/embed",
            json={"model": model, "input": texts},
        )
        resp.raise_for_status()
        data = resp.json()

    embeddings_data = []
    raw_embeddings = data.get("embeddings", [])
    for i, emb in enumerate(raw_embeddings):
        embeddings_data.append({
            "object": "embedding",
            "embedding": emb,
            "index": i,
        })

    return {
        "object": "list",
        "data": embeddings_data,
        "model": model,
        "usage": {"prompt_tokens": 1, "total_tokens": 1},
    }


def _error_event(message: str) -> str:
    error = {
        "id": "chatcmpl-gateway",
        "object": "chat.completion.chunk",
        "model": "error",
        "choices": [
            {
                "index": 0,
                "delta": {"content": f"[Error] {message}"},
                "finish_reason": "stop",
            }
        ],
    }
    return f"data: {json.dumps(error)}\n\ndata: [DONE]\n\n"
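The per-chunk conversion inside `stream_chat()` can be isolated for a non-empty chunk: one Ollama NDJSON line becomes one OpenAI-style SSE event. A sketch; the input chunk below is a hypothetical example, not a recorded Ollama response:

```python
import json


def ndjson_to_sse(line: str, model: str) -> str:
    # Mirrors the conversion in proxy_ollama.stream_chat for one line
    chunk = json.loads(line)
    if chunk.get("done"):
        return "data: [DONE]\n\n"
    content = chunk.get("message", {}).get("content", "")
    openai_chunk = {
        "id": "chatcmpl-gateway",
        "object": "chat.completion.chunk",
        "model": model,
        "choices": [
            {"index": 0, "delta": {"content": content}, "finish_reason": None}
        ],
    }
    return f"data: {json.dumps(openai_chunk)}\n\n"


event = ndjson_to_sse('{"message": {"content": "Hello"}, "done": false}', "llama3")
print(event)
```

Unlike the generator above, this sketch emits a chunk even when `content` is empty; `stream_chat()` skips those.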
83 hub-api/services/proxy_openai.py Normal file
@@ -0,0 +1,83 @@

"""OpenAI-compatible proxy (MLX server, vLLM, etc.) — SSE passthrough."""

from __future__ import annotations

import json
import logging
from collections.abc import AsyncGenerator

import httpx

logger = logging.getLogger(__name__)


async def stream_chat(
    base_url: str,
    model: str,
    messages: list[dict],
    **kwargs,
) -> AsyncGenerator[str, None]:
    """Proxy OpenAI-compatible chat streaming. SSE passthrough with model field override."""
    payload = {
        "model": model,
        "messages": messages,
        "stream": True,
        **{k: v for k, v in kwargs.items() if v is not None},
    }

    async with httpx.AsyncClient(timeout=120.0) as client:
        async with client.stream(
            "POST",
            f"{base_url}/v1/chat/completions",
            json=payload,
        ) as resp:
            if resp.status_code != 200:
                body = await resp.aread()
                error_msg = body.decode("utf-8", errors="replace")
                yield _error_event(f"Backend error ({resp.status_code}): {error_msg}")
                return

            async for line in resp.aiter_lines():
                if not line.strip():
                    continue
                # Pass through SSE lines as-is (already in OpenAI format)
                if line.startswith("data: "):
                    yield f"{line}\n\n"
                elif line == "data: [DONE]":
                    yield "data: [DONE]\n\n"


async def complete_chat(
    base_url: str,
    model: str,
    messages: list[dict],
    **kwargs,
) -> dict:
    """Non-streaming OpenAI-compatible chat."""
    payload = {
        "model": model,
        "messages": messages,
        "stream": False,
        **{k: v for k, v in kwargs.items() if v is not None},
    }

    async with httpx.AsyncClient(timeout=120.0) as client:
        resp = await client.post(f"{base_url}/v1/chat/completions", json=payload)
        resp.raise_for_status()
        return resp.json()


def _error_event(message: str) -> str:
    error = {
        "id": "chatcmpl-gateway",
        "object": "chat.completion.chunk",
        "model": "error",
        "choices": [
            {
                "index": 0,
                "delta": {"content": f"[Error] {message}"},
                "finish_reason": "stop",
            }
        ],
    }
    return f"data: {json.dumps(error)}\n\ndata: [DONE]\n\n"
227 hub-api/services/registry.py Normal file
@@ -0,0 +1,227 @@

from __future__ import annotations

import asyncio
import json
import logging
import time
from dataclasses import dataclass, field
from pathlib import Path

import httpx

logger = logging.getLogger(__name__)


@dataclass
class ModelInfo:
    id: str
    capabilities: list[str]
    priority: int = 1
    backend_model_id: str = ""  # actual model ID sent to backend (if different from id)


@dataclass
class RateLimitConfig:
    rpm: int = 0
    rph: int = 0
    scope: str = "global"


@dataclass
class BackendInfo:
    id: str
    type: str  # "ollama", "openai-compat", "anthropic"
    url: str
    models: list[ModelInfo]
    access: str = "all"  # "all" or "owner"
    rate_limit: RateLimitConfig | None = None

    # runtime state
    healthy: bool = False
    last_check: float = 0
    latency_ms: float = 0


@dataclass
class RateLimitState:
    minute_timestamps: list[float] = field(default_factory=list)
    hour_timestamps: list[float] = field(default_factory=list)


class Registry:
    def __init__(self):
        self.backends: dict[str, BackendInfo] = {}
        self._health_task: asyncio.Task | None = None
        self._rate_limits: dict[str, RateLimitState] = {}

    async def load_backends(self, config_path: str):
        path = Path(config_path)
        if not path.exists():
            logger.warning("Backends config not found: %s", config_path)
            return

        with open(path) as f:
            data = json.load(f)

        for entry in data:
            models = [
                ModelInfo(
                    id=m["id"],
                    capabilities=m.get("capabilities", ["chat"]),
                    priority=m.get("priority", 1),
                    backend_model_id=m.get("backend_model_id", ""),
                )
                for m in entry.get("models", [])
            ]
            rl_data = entry.get("rate_limit")
            rate_limit = (
                RateLimitConfig(
                    rpm=rl_data.get("rpm", 0),
                    rph=rl_data.get("rph", 0),
                    scope=rl_data.get("scope", "global"),
                )
                if rl_data
                else None
            )
            backend = BackendInfo(
                id=entry["id"],
                type=entry["type"],
                url=entry["url"].rstrip("/"),
                models=models,
                access=entry.get("access", "all"),
                rate_limit=rate_limit,
            )
            self.backends[backend.id] = backend
            if rate_limit:
                self._rate_limits[backend.id] = RateLimitState()

        logger.info("Loaded %d backends", len(self.backends))

    def start_health_loop(self, interval: float = 30.0):
        self._health_task = asyncio.create_task(self._health_loop(interval))

    def stop_health_loop(self):
        if self._health_task:
            self._health_task.cancel()

    async def _health_loop(self, interval: float):
        while True:
            await self._check_all_backends()
            await asyncio.sleep(interval)

    async def _check_all_backends(self):
        async with httpx.AsyncClient(timeout=5.0) as client:
            tasks = [
                self._check_backend(client, backend)
                for backend in self.backends.values()
            ]
            await asyncio.gather(*tasks, return_exceptions=True)

    async def _check_backend(self, client: httpx.AsyncClient, backend: BackendInfo):
        try:
            start = time.monotonic()
            if backend.type == "ollama":
                resp = await client.get(f"{backend.url}/api/tags")
            elif backend.type in ("openai-compat", "anthropic"):
                resp = await client.get(f"{backend.url}/v1/models")
            else:
                resp = await client.get(f"{backend.url}/health")
            elapsed = (time.monotonic() - start) * 1000

            backend.healthy = resp.status_code < 500
            backend.latency_ms = round(elapsed, 1)
            backend.last_check = time.time()
        except Exception:
            backend.healthy = False
            backend.latency_ms = 0
            backend.last_check = time.time()
            logger.debug("Health check failed for %s", backend.id)

    def resolve_model(self, model_id: str, role: str) -> tuple[BackendInfo, ModelInfo] | None:
        """Find the best backend for a given model ID. Returns (backend, model) or None."""
        candidates: list[tuple[BackendInfo, ModelInfo, int]] = []

        for backend in self.backends.values():
            if not backend.healthy:
                continue
            if backend.access == "owner" and role != "owner":
                continue
            for model in backend.models:
                if model.id == model_id:
                    candidates.append((backend, model, model.priority))

        if not candidates:
            return None

        candidates.sort(key=lambda x: x[2])
        return candidates[0][0], candidates[0][1]

    def list_models(self, role: str) -> list[dict]:
        """List all available models for a given role."""
        result = []
        for backend in self.backends.values():
            if not backend.healthy:
                continue
            if backend.access == "owner" and role != "owner":
                continue
            for model in backend.models:
                result.append({
                    "id": model.id,
                    "object": "model",
                    "owned_by": backend.id,
                    "capabilities": model.capabilities,
                    "backend_id": backend.id,
                    "backend_status": "healthy" if backend.healthy else "down",
                })
        return result

    def check_rate_limit(self, backend_id: str) -> bool:
        """Check if a request to this backend is within rate limits. Returns True if allowed."""
        backend = self.backends.get(backend_id)
        if not backend or not backend.rate_limit:
            return True

        state = self._rate_limits.get(backend_id)
        if not state:
            return True

        now = time.time()
        rl = backend.rate_limit

        # Clean old timestamps
        if rl.rpm > 0:
            state.minute_timestamps = [t for t in state.minute_timestamps if now - t < 60]
            if len(state.minute_timestamps) >= rl.rpm:
                return False

        if rl.rph > 0:
            state.hour_timestamps = [t for t in state.hour_timestamps if now - t < 3600]
            if len(state.hour_timestamps) >= rl.rph:
                return False

        return True

    def record_request(self, backend_id: str):
        """Record a request timestamp for rate limiting."""
        state = self._rate_limits.get(backend_id)
        if not state:
            return
        now = time.time()
        state.minute_timestamps.append(now)
        state.hour_timestamps.append(now)

    def get_health_summary(self) -> list[dict]:
        return [
            {
                "id": b.id,
                "type": b.type,
                "status": "healthy" if b.healthy else "down",
                "models": [m.id for m in b.models],
                "latency_ms": b.latency_ms,
                "last_check": b.last_check,
            }
            for b in self.backends.values()
        ]


registry = Registry()
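The sliding-window check in `Registry.check_rate_limit()` is easy to demonstrate standalone. A sketch that, for brevity, merges the check and `record_request()` into one `allow()` call (the Registry keeps them separate, and also tracks an hourly window the same way); the `rpm` values here are made-up examples:

```python
from __future__ import annotations

import time


class MinuteWindow:
    def __init__(self, rpm: int):
        self.rpm = rpm
        self.timestamps: list[float] = []

    def allow(self, now: float | None = None) -> bool:
        # Keep only timestamps from the last 60 seconds, refuse once
        # `rpm` requests are already in the window, otherwise record.
        now = time.time() if now is None else now
        self.timestamps = [t for t in self.timestamps if now - t < 60]
        if len(self.timestamps) >= self.rpm:
            return False
        self.timestamps.append(now)
        return True


w = MinuteWindow(rpm=2)
print(w.allow(now=0.0), w.allow(now=1.0), w.allow(now=2.0), w.allow(now=61.0))
# prints: True True False True
```

The third call is refused because two requests already sit inside the 60-second window; by t=61 both have aged out.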
24 hub-web/.gitignore vendored Normal file
@@ -0,0 +1,24 @@

# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*

node_modules
dist
dist-ssr
*.local

# Editor directories and files
.vscode/*
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?
12 hub-web/Dockerfile Normal file
@@ -0,0 +1,12 @@

FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM nginx:alpine
COPY --from=build /app/dist /usr/share/nginx/html
COPY nginx.conf /etc/nginx/conf.d/default.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
73 hub-web/README.md Normal file
@@ -0,0 +1,73 @@

# React + TypeScript + Vite

This template provides a minimal setup to get React working in Vite with HMR and some ESLint rules.

Currently, two official plugins are available:

- [@vitejs/plugin-react](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react) uses [Oxc](https://oxc.rs)
- [@vitejs/plugin-react-swc](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react-swc) uses [SWC](https://swc.rs/)

## React Compiler

The React Compiler is not enabled on this template because of its impact on dev & build performance. To add it, see [this documentation](https://react.dev/learn/react-compiler/installation).

## Expanding the ESLint configuration

If you are developing a production application, we recommend updating the configuration to enable type-aware lint rules:

```js
export default defineConfig([
  globalIgnores(['dist']),
  {
    files: ['**/*.{ts,tsx}'],
    extends: [
      // Other configs...

      // Remove tseslint.configs.recommended and replace with this
      tseslint.configs.recommendedTypeChecked,
      // Alternatively, use this for stricter rules
      tseslint.configs.strictTypeChecked,
      // Optionally, add this for stylistic rules
      tseslint.configs.stylisticTypeChecked,

      // Other configs...
    ],
    languageOptions: {
      parserOptions: {
        project: ['./tsconfig.node.json', './tsconfig.app.json'],
        tsconfigRootDir: import.meta.dirname,
      },
      // other options...
    },
  },
])
```

You can also install [eslint-plugin-react-x](https://github.com/Rel1cx/eslint-react/tree/main/packages/plugins/eslint-plugin-react-x) and [eslint-plugin-react-dom](https://github.com/Rel1cx/eslint-react/tree/main/packages/plugins/eslint-plugin-react-dom) for React-specific lint rules:

```js
// eslint.config.js
import reactX from 'eslint-plugin-react-x'
import reactDom from 'eslint-plugin-react-dom'

export default defineConfig([
  globalIgnores(['dist']),
  {
    files: ['**/*.{ts,tsx}'],
    extends: [
      // Other configs...
      // Enable lint rules for React
      reactX.configs['recommended-typescript'],
      // Enable lint rules for React DOM
      reactDom.configs.recommended,
    ],
    languageOptions: {
      parserOptions: {
        project: ['./tsconfig.node.json', './tsconfig.app.json'],
        tsconfigRootDir: import.meta.dirname,
      },
      // other options...
    },
  },
])
```
23 hub-web/eslint.config.js Normal file
@@ -0,0 +1,23 @@

import js from '@eslint/js'
import globals from 'globals'
import reactHooks from 'eslint-plugin-react-hooks'
import reactRefresh from 'eslint-plugin-react-refresh'
import tseslint from 'typescript-eslint'
import { defineConfig, globalIgnores } from 'eslint/config'

export default defineConfig([
  globalIgnores(['dist']),
  {
    files: ['**/*.{ts,tsx}'],
    extends: [
      js.configs.recommended,
      tseslint.configs.recommended,
      reactHooks.configs.flat.recommended,
      reactRefresh.configs.vite,
    ],
    languageOptions: {
      ecmaVersion: 2020,
      globals: globals.browser,
    },
  },
])
13 hub-web/index.html Normal file
@@ -0,0 +1,13 @@

<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>hub-web</title>
  </head>
  <body>
    <div id="root"></div>
    <script type="module" src="/src/main.tsx"></script>
  </body>
</html>
9 hub-web/nginx.conf Normal file
@@ -0,0 +1,9 @@

server {
    listen 80;
    root /usr/share/nginx/html;
    index index.html;

    location / {
        try_files $uri $uri/ /index.html;
    }
}
4531 hub-web/package-lock.json generated Normal file
File diff suppressed because it is too large.
34 hub-web/package.json Normal file
@@ -0,0 +1,34 @@

{
  "name": "hub-web",
  "private": true,
  "version": "0.0.0",
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "tsc -b && vite build",
    "lint": "eslint .",
    "preview": "vite preview"
  },
  "dependencies": {
    "react": "^19.2.4",
    "react-dom": "^19.2.4",
    "react-markdown": "^10.1.0",
    "react-router-dom": "^7.13.2"
  },
  "devDependencies": {
    "@eslint/js": "^9.39.4",
    "@tailwindcss/vite": "^4.2.2",
    "@types/node": "^24.12.0",
    "@types/react": "^19.2.14",
    "@types/react-dom": "^19.2.3",
    "@vitejs/plugin-react": "^6.0.1",
    "eslint": "^9.39.4",
    "eslint-plugin-react-hooks": "^7.0.1",
    "eslint-plugin-react-refresh": "^0.5.2",
    "globals": "^17.4.0",
    "tailwindcss": "^4.2.2",
    "typescript": "~5.9.3",
    "typescript-eslint": "^8.57.0",
    "vite": "^8.0.1"
  }
}
1 hub-web/public/favicon.svg Normal file
File diff suppressed because one or more lines are too long.
After Width: | Height: | Size: 9.3 KiB
24
hub-web/public/icons.svg
Normal file
24
hub-web/public/icons.svg
Normal file
@@ -0,0 +1,24 @@
|
|||||||
|
<svg xmlns="http://www.w3.org/2000/svg">
|
||||||
|
<symbol id="bluesky-icon" viewBox="0 0 16 17">
|
||||||
|
<g clip-path="url(#bluesky-clip)"><path fill="#08060d" d="M7.75 7.735c-.693-1.348-2.58-3.86-4.334-5.097-1.68-1.187-2.32-.981-2.74-.79C.188 2.065.1 2.812.1 3.251s.241 3.602.398 4.13c.52 1.744 2.367 2.333 4.07 2.145-2.495.37-4.71 1.278-1.805 4.512 3.196 3.309 4.38-.71 4.987-2.746.608 2.036 1.307 5.91 4.93 2.746 2.72-2.746.747-4.143-1.747-4.512 1.702.189 3.55-.4 4.07-2.145.156-.528.397-3.691.397-4.13s-.088-1.186-.575-1.406c-.42-.19-1.06-.395-2.741.79-1.755 1.24-3.64 3.752-4.334 5.099"/></g>
|
||||||
|
<defs><clipPath id="bluesky-clip"><path fill="#fff" d="M.1.85h15.3v15.3H.1z"/></clipPath></defs>
|
||||||
|
</symbol>
|
||||||
|
<symbol id="discord-icon" viewBox="0 0 20 19">
|
||||||
|
<path fill="#08060d" d="M16.224 3.768a14.5 14.5 0 0 0-3.67-1.153c-.158.286-.343.67-.47.976a13.5 13.5 0 0 0-4.067 0c-.128-.306-.317-.69-.476-.976A14.4 14.4 0 0 0 3.868 3.77C1.546 7.28.916 10.703 1.231 14.077a14.7 14.7 0 0 0 4.5 2.306q.545-.748.965-1.587a9.5 9.5 0 0 1-1.518-.74q.191-.14.372-.293c2.927 1.369 6.107 1.369 8.999 0q.183.152.372.294-.723.437-1.52.74.418.838.963 1.588a14.6 14.6 0 0 0 4.504-2.308c.37-3.911-.63-7.302-2.644-10.309m-9.13 8.234c-.878 0-1.599-.82-1.599-1.82 0-.998.705-1.82 1.6-1.82.894 0 1.614.82 1.599 1.82.001 1-.705 1.82-1.6 1.82m5.91 0c-.878 0-1.599-.82-1.599-1.82 0-.998.705-1.82 1.6-1.82.893 0 1.614.82 1.599 1.82 0 1-.706 1.82-1.6 1.82"/>
|
||||||
|
</symbol>
|
||||||
|
<symbol id="documentation-icon" viewBox="0 0 21 20">
|
||||||
|
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="m15.5 13.333 1.533 1.322c.645.555.967.833.967 1.178s-.322.623-.967 1.179L15.5 18.333m-3.333-5-1.534 1.322c-.644.555-.966.833-.966 1.178s.322.623.966 1.179l1.534 1.321"/>
|
||||||
|
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M17.167 10.836v-4.32c0-1.41 0-2.117-.224-2.68-.359-.906-1.118-1.621-2.08-1.96-.599-.21-1.349-.21-2.848-.21-2.623 0-3.935 0-4.983.369-1.684.591-3.013 1.842-3.641 3.428C3 6.449 3 7.684 3 10.154v2.122c0 2.558 0 3.838.706 4.726q.306.383.713.671c.76.536 1.79.64 3.581.66"/>
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M3 10a2.78 2.78 0 0 1 2.778-2.778c.555 0 1.209.097 1.748-.047.48-.129.854-.503.982-.982.145-.54.048-1.194.048-1.749a2.78 2.78 0 0 1 2.777-2.777"/>
</symbol>
<symbol id="github-icon" viewBox="0 0 19 19">
<path fill="#08060d" fill-rule="evenodd" d="M9.356 1.85C5.05 1.85 1.57 5.356 1.57 9.694a7.84 7.84 0 0 0 5.324 7.44c.387.079.528-.168.528-.376 0-.182-.013-.805-.013-1.454-2.165.467-2.616-.935-2.616-.935-.349-.91-.864-1.143-.864-1.143-.71-.48.051-.48.051-.48.787.051 1.2.805 1.2.805.695 1.194 1.817.857 2.268.649.064-.507.27-.857.49-1.052-1.728-.182-3.545-.857-3.545-3.87 0-.857.31-1.558.8-2.104-.078-.195-.349-1 .077-2.078 0 0 .657-.208 2.14.805a7.5 7.5 0 0 1 1.946-.26c.657 0 1.328.092 1.946.26 1.483-1.013 2.14-.805 2.14-.805.426 1.078.155 1.883.078 2.078.502.546.799 1.247.799 2.104 0 3.013-1.818 3.675-3.558 3.87.284.247.528.714.528 1.454 0 1.052-.012 1.896-.012 2.156 0 .208.142.455.528.377a7.84 7.84 0 0 0 5.324-7.441c.013-4.338-3.48-7.844-7.773-7.844" clip-rule="evenodd"/>
</symbol>
<symbol id="social-icon" viewBox="0 0 20 20">
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M12.5 6.667a4.167 4.167 0 1 0-8.334 0 4.167 4.167 0 0 0 8.334 0"/>
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M2.5 16.667a5.833 5.833 0 0 1 8.75-5.053m3.837.474.513 1.035c.07.144.257.282.414.309l.93.155c.596.1.736.536.307.965l-.723.73a.64.64 0 0 0-.152.531l.207.903c.164.715-.213.991-.84.618l-.872-.52a.63.63 0 0 0-.577 0l-.872.52c-.624.373-1.003.094-.84-.618l.207-.903a.64.64 0 0 0-.152-.532l-.723-.729c-.426-.43-.289-.864.306-.964l.93-.156a.64.64 0 0 0 .412-.31l.513-1.034c.28-.562.735-.562 1.012 0"/>
</symbol>
<symbol id="x-icon" viewBox="0 0 19 19">
<path fill="#08060d" fill-rule="evenodd" d="M1.893 1.98c.052.072 1.245 1.769 2.653 3.77l2.892 4.114c.183.261.333.48.333.486s-.068.089-.152.183l-.522.593-.765.867-3.597 4.087c-.375.426-.734.834-.798.905a1 1 0 0 0-.118.148c0 .01.236.017.664.017h.663l.729-.83c.4-.457.796-.906.879-.999a692 692 0 0 0 1.794-2.038c.034-.037.301-.34.594-.675l.551-.624.345-.392a7 7 0 0 1 .34-.374c.006 0 .93 1.306 2.052 2.903l2.084 2.965.045.063h2.275c1.87 0 2.273-.003 2.266-.021-.008-.02-1.098-1.572-3.894-5.547-2.013-2.862-2.28-3.246-2.273-3.266.008-.019.282-.332 2.085-2.38l2-2.274 1.567-1.782c.022-.028-.016-.03-.65-.03h-.674l-.3.342a871 871 0 0 1-1.782 2.025c-.067.075-.405.458-.75.852a100 100 0 0 1-.803.91c-.148.172-.299.344-.99 1.127-.304.343-.32.358-.345.327-.015-.019-.904-1.282-1.976-2.808L6.365 1.85H1.8zm1.782.91 8.078 11.294c.772 1.08 1.413 1.973 1.425 1.984.016.017.241.02 1.05.017l1.03-.004-2.694-3.766L7.796 5.75 5.722 2.852l-1.039-.004-1.039-.004z" clip-rule="evenodd"/>
</symbol>
</svg>
After Width: | Height: | Size: 4.9 KiB |
61
hub-web/src/App.tsx
Normal file
@@ -0,0 +1,61 @@
import { useState, useEffect } from 'react';
import { BrowserRouter, Routes, Route, Navigate, NavLink } from 'react-router-dom';
import { AuthCtx } from './lib/auth';
import { getMe, logout } from './lib/api';
import Login from './pages/Login';
import Dashboard from './pages/Dashboard';
import Chat from './pages/Chat';

export default function App() {
  const [role, setRole] = useState<string | null>(null);
  const [loading, setLoading] = useState(true);

  useEffect(() => {
    getMe().then(me => {
      setRole(me?.role ?? null);
      setLoading(false);
    });
  }, []);

  if (loading) {
    return <div className="flex items-center justify-center h-screen text-[hsl(var(--muted-foreground))]">Loading...</div>;
  }

  if (!role) {
    return (
      <AuthCtx.Provider value={{ role, setRole }}>
        <Login />
      </AuthCtx.Provider>
    );
  }

  return (
    <AuthCtx.Provider value={{ role, setRole }}>
      <BrowserRouter>
        <div className="min-h-screen flex flex-col dark">
          <nav className="border-b border-[hsl(var(--border))] px-6 py-3 flex items-center gap-6">
            <span className="font-semibold text-lg">AI Gateway</span>
            <NavLink to="/" className={({ isActive }) => isActive ? 'text-[hsl(var(--foreground))]' : 'text-[hsl(var(--muted-foreground))] hover:text-[hsl(var(--foreground))]'}>Dashboard</NavLink>
            <NavLink to="/chat" className={({ isActive }) => isActive ? 'text-[hsl(var(--foreground))]' : 'text-[hsl(var(--muted-foreground))] hover:text-[hsl(var(--foreground))]'}>Chat</NavLink>
            <div className="ml-auto flex items-center gap-3">
              <span className="text-sm text-[hsl(var(--muted-foreground))]">{role}</span>
              <button
                onClick={async () => { await logout(); setRole(null); }}
                className="text-sm text-[hsl(var(--muted-foreground))] hover:text-[hsl(var(--foreground))]"
              >
                Logout
              </button>
            </div>
          </nav>
          <main className="flex-1">
            <Routes>
              <Route path="/" element={<Dashboard />} />
              <Route path="/chat" element={<Chat />} />
              <Route path="*" element={<Navigate to="/" />} />
            </Routes>
          </main>
        </div>
      </BrowserRouter>
    </AuthCtx.Provider>
  );
}
41
hub-web/src/index.css
Normal file
@@ -0,0 +1,41 @@
@import "tailwindcss";

:root {
  --background: 0 0% 100%;
  --foreground: 240 10% 3.9%;
  --card: 0 0% 100%;
  --card-foreground: 240 10% 3.9%;
  --muted: 240 4.8% 95.9%;
  --muted-foreground: 240 3.8% 46.1%;
  --border: 240 5.9% 90%;
  --primary: 240 5.9% 10%;
  --primary-foreground: 0 0% 98%;
  --destructive: 0 84.2% 60.2%;
  --ring: 240 5.9% 10%;
  --radius: 0.5rem;
}

.dark {
  --background: 240 10% 3.9%;
  --foreground: 0 0% 98%;
  --card: 240 10% 3.9%;
  --card-foreground: 0 0% 98%;
  --muted: 240 3.7% 15.9%;
  --muted-foreground: 240 5% 64.9%;
  --border: 240 3.7% 15.9%;
  --primary: 0 0% 98%;
  --primary-foreground: 240 5.9% 10%;
  --destructive: 0 62.8% 30.6%;
  --ring: 240 4.9% 83.9%;
}

* {
  border-color: hsl(var(--border));
}

body {
  margin: 0;
  background-color: hsl(var(--background));
  color: hsl(var(--foreground));
  font-family: system-ui, -apple-system, sans-serif;
}
127
hub-web/src/lib/api.ts
Normal file
@@ -0,0 +1,127 @@
const BASE = '';

// Store token in memory for Bearer auth (more reliable than cookies through proxies)
let _token: string | null = null;

export function setToken(token: string | null) {
  _token = token;
}

function authHeaders(): Record<string, string> {
  const h: Record<string, string> = { 'Content-Type': 'application/json' };
  if (_token) h['Authorization'] = `Bearer ${_token}`;
  return h;
}

export async function login(password: string): Promise<{ role: string; token: string }> {
  const res = await fetch(`${BASE}/auth/login`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ password }),
  });
  if (!res.ok) {
    const err = await res.json().catch(() => null);
    throw new Error(err?.error?.message || 'Login failed');
  }
  const data = await res.json();
  _token = data.token;
  return data;
}

export async function getMe(): Promise<{ role: string } | null> {
  if (!_token) return null;
  const res = await fetch(`${BASE}/auth/me`, { headers: authHeaders() });
  if (!res.ok) { _token = null; return null; }
  return res.json();
}

export async function logout(): Promise<void> {
  await fetch(`${BASE}/auth/logout`, { method: 'POST', headers: authHeaders() });
  _token = null;
}

export interface Model {
  id: string;
  owned_by: string;
  capabilities: string[];
  backend_id: string;
  backend_status: string;
}

export async function getModels(): Promise<Model[]> {
  const res = await fetch(`${BASE}/v1/models`, { headers: authHeaders() });
  if (!res.ok) return [];
  const data = await res.json();
  return data.data || [];
}

export interface BackendHealth {
  id: string;
  type: string;
  status: string;
  models: string[];
  latency_ms: number;
}

export interface GpuInfo {
  utilization: number;
  temperature: number;
  vram_used: number;
  vram_total: number;
  power_draw: number;
  name: string;
}

export async function getHealth(): Promise<{ backends: BackendHealth[]; gpu: GpuInfo | null }> {
  const res = await fetch(`${BASE}/health`);
  if (!res.ok) return { backends: [], gpu: null };
  return res.json();
}

export interface ChatMessage {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

export async function* streamChat(
  model: string,
  messages: ChatMessage[],
): AsyncGenerator<string, void> {
  const res = await fetch(`${BASE}/v1/chat/completions`, {
    method: 'POST',
    headers: authHeaders(),
    body: JSON.stringify({ model, messages, stream: true }),
  });

  if (!res.ok) {
    const err = await res.json().catch(() => null);
    throw new Error(err?.error?.message || `Chat failed: ${res.status}`);
  }

  const reader = res.body?.getReader();
  if (!reader) throw new Error('No response body');
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() || '';

    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      const data = line.slice(6).trim();
      if (data === '[DONE]') return;
      try {
        const parsed = JSON.parse(data);
        const content = parsed.choices?.[0]?.delta?.content;
        if (content) yield content;
      } catch {
        // skip malformed chunks
      }
    }
  }
}
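The SSE handling in streamChat above can be exercised in isolation. The sketch below uses a hypothetical `extractDeltas` helper (not part of this commit) that applies the same line-splitting and delta extraction to an already-buffered chunk:

```typescript
// Hypothetical helper mirroring streamChat's parsing loop: split an SSE
// buffer into "data: ..." lines and collect the delta content tokens.
function extractDeltas(buffer: string): string[] {
  const out: string[] = [];
  for (const line of buffer.split('\n')) {
    if (!line.startsWith('data: ')) continue;
    const data = line.slice(6).trim();
    if (data === '[DONE]') break; // end-of-stream sentinel
    try {
      const parsed = JSON.parse(data);
      const content = parsed.choices?.[0]?.delta?.content;
      if (content) out.push(content); // ignore role-only / empty deltas
    } catch {
      // skip malformed chunks, as streamChat does
    }
  }
  return out;
}

const sample =
  'data: {"choices":[{"delta":{"content":"Hel"}}]}\n' +
  'data: {"choices":[{"delta":{"content":"lo"}}]}\n' +
  'data: [DONE]\n';
console.log(extractDeltas(sample).join('')); // "Hello"
```

Keeping the trailing partial line in `buffer` (via `lines.pop()`) is what makes the real generator safe when a `data:` payload is split across two network reads.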
12
hub-web/src/lib/auth.ts
Normal file
@@ -0,0 +1,12 @@
import { createContext, useContext } from 'react';

export interface AuthContext {
  role: string | null;
  setRole: (role: string | null) => void;
}

export const AuthCtx = createContext<AuthContext>({ role: null, setRole: () => {} });

export function useAuth() {
  return useContext(AuthCtx);
}
10
hub-web/src/main.tsx
Normal file
@@ -0,0 +1,10 @@
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import './index.css'
import App from './App.tsx'

createRoot(document.getElementById('root')!).render(
  <StrictMode>
    <App />
  </StrictMode>,
)
130
hub-web/src/pages/Chat.tsx
Normal file
@@ -0,0 +1,130 @@
import { useState, useEffect, useRef } from 'react';
import ReactMarkdown from 'react-markdown';
import type { Model, ChatMessage } from '../lib/api';
import { getModels, streamChat } from '../lib/api';

export default function Chat() {
  const [models, setModels] = useState<Model[]>([]);
  const [selectedModel, setSelectedModel] = useState('');
  const [messages, setMessages] = useState<ChatMessage[]>([]);
  const [input, setInput] = useState('');
  const [streaming, setStreaming] = useState(false);
  const bottomRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    getModels().then(mdls => {
      const chatModels = mdls.filter(m => m.capabilities.includes('chat'));
      setModels(chatModels);
      if (chatModels.length > 0 && !selectedModel) {
        setSelectedModel(chatModels[0].id);
      }
    });
  }, []);

  useEffect(() => {
    bottomRef.current?.scrollIntoView({ behavior: 'smooth' });
  }, [messages]);

  const handleSend = async () => {
    if (!input.trim() || !selectedModel || streaming) return;

    const userMsg: ChatMessage = { role: 'user', content: input.trim() };
    const newMessages = [...messages, userMsg];
    setMessages(newMessages);
    setInput('');
    setStreaming(true);

    const assistantMsg: ChatMessage = { role: 'assistant', content: '' };
    setMessages([...newMessages, assistantMsg]);

    try {
      for await (const chunk of streamChat(selectedModel, newMessages)) {
        assistantMsg.content += chunk;
        setMessages(prev => [...prev.slice(0, -1), { ...assistantMsg }]);
      }
    } catch (err) {
      assistantMsg.content += `\n\n[Error: ${err instanceof Error ? err.message : 'Unknown error'}]`;
      setMessages(prev => [...prev.slice(0, -1), { ...assistantMsg }]);
    } finally {
      setStreaming(false);
    }
  };

  const handleKeyDown = (e: React.KeyboardEvent) => {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      handleSend();
    }
  };

  return (
    <div className="flex flex-col h-[calc(100vh-57px)]">
      {/* Header */}
      <div className="border-b border-[hsl(var(--border))] px-6 py-3 flex items-center gap-4">
        <select
          value={selectedModel}
          onChange={e => setSelectedModel(e.target.value)}
          className="px-3 py-1.5 rounded-md border border-[hsl(var(--border))] bg-[hsl(var(--background))] text-sm"
        >
          {models.map(m => (
            <option key={m.id} value={m.id}>{m.id} ({m.owned_by})</option>
          ))}
        </select>
        <button
          onClick={() => setMessages([])}
          className="text-sm text-[hsl(var(--muted-foreground))] hover:text-[hsl(var(--foreground))]"
        >
          Clear
        </button>
      </div>

      {/* Messages */}
      <div className="flex-1 overflow-y-auto px-6 py-4 space-y-4">
        {messages.length === 0 && (
          <div className="flex items-center justify-center h-full text-[hsl(var(--muted-foreground))]">
            Send a message to start
          </div>
        )}
        {messages.map((msg, i) => (
          <div key={i} className={`flex ${msg.role === 'user' ? 'justify-end' : 'justify-start'}`}>
            <div className={`max-w-[80%] rounded-lg px-4 py-2 ${
              msg.role === 'user'
                ? 'bg-[hsl(var(--primary))] text-[hsl(var(--primary-foreground))]'
                : 'bg-[hsl(var(--muted))]'
            }`}>
              {msg.role === 'assistant' ? (
                <div className="prose prose-sm prose-invert max-w-none">
                  <ReactMarkdown>{msg.content || '...'}</ReactMarkdown>
                </div>
              ) : (
                <p className="text-sm whitespace-pre-wrap">{msg.content}</p>
              )}
            </div>
          </div>
        ))}
        <div ref={bottomRef} />
      </div>

      {/* Input */}
      <div className="border-t border-[hsl(var(--border))] px-6 py-4">
        <div className="flex gap-3">
          <textarea
            value={input}
            onChange={e => setInput(e.target.value)}
            onKeyDown={handleKeyDown}
            placeholder="Type a message... (Enter to send, Shift+Enter for newline)"
            rows={1}
            className="flex-1 px-3 py-2 rounded-md border border-[hsl(var(--border))] bg-[hsl(var(--background))] text-[hsl(var(--foreground))] resize-none focus:outline-none focus:ring-2 focus:ring-[hsl(var(--ring))]"
          />
          <button
            onClick={handleSend}
            disabled={streaming || !input.trim()}
            className="px-4 py-2 rounded-md bg-[hsl(var(--primary))] text-[hsl(var(--primary-foreground))] hover:opacity-90 disabled:opacity-50"
          >
            {streaming ? '...' : 'Send'}
          </button>
        </div>
      </div>
    </div>
  );
}
96
hub-web/src/pages/Dashboard.tsx
Normal file
@@ -0,0 +1,96 @@
import { useState, useEffect } from 'react';
import type { BackendHealth, GpuInfo, Model } from '../lib/api';
import { getHealth, getModels } from '../lib/api';

export default function Dashboard() {
  const [backends, setBackends] = useState<BackendHealth[]>([]);
  const [gpu, setGpu] = useState<GpuInfo | null>(null);
  const [models, setModels] = useState<Model[]>([]);

  const refresh = async () => {
    const [health, mdls] = await Promise.all([getHealth(), getModels()]);
    setBackends(health.backends);
    setGpu(health.gpu);
    setModels(mdls);
  };

  useEffect(() => {
    refresh();
    const id = setInterval(refresh, 15000);
    return () => clearInterval(id);
  }, []);

  return (
    <div className="p-6 space-y-6">
      <div className="flex items-center justify-between">
        <h2 className="text-xl font-semibold">Backends</h2>
        <button onClick={refresh} className="text-sm text-[hsl(var(--muted-foreground))] hover:text-[hsl(var(--foreground))]">Refresh</button>
      </div>

      <div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4">
        {backends.map(b => (
          <div key={b.id} className="rounded-lg border border-[hsl(var(--border))] p-4 space-y-2">
            <div className="flex items-center justify-between">
              <span className="font-medium">{b.id}</span>
              <span className={`text-xs px-2 py-0.5 rounded-full ${b.status === 'healthy' ? 'bg-green-500/20 text-green-400' : 'bg-red-500/20 text-red-400'}`}>
                {b.status}
              </span>
            </div>
            <div className="text-sm text-[hsl(var(--muted-foreground))]">{b.type}</div>
            <div className="text-sm">{b.models.join(', ')}</div>
            {b.latency_ms > 0 && <div className="text-xs text-[hsl(var(--muted-foreground))]">{b.latency_ms}ms</div>}
          </div>
        ))}
      </div>

      {gpu && (
        <>
          <h2 className="text-xl font-semibold">GPU</h2>
          <div className="rounded-lg border border-[hsl(var(--border))] p-4 grid grid-cols-2 md:grid-cols-4 gap-4">
            <Stat label="Utilization" value={`${gpu.utilization}%`} />
            <Stat label="Temperature" value={`${gpu.temperature}C`} />
            <Stat label="VRAM" value={`${gpu.vram_used}/${gpu.vram_total} MB`} />
            <Stat label="Power" value={`${gpu.power_draw}W`} />
          </div>
        </>
      )}

      <h2 className="text-xl font-semibold">Models</h2>
      <div className="rounded-lg border border-[hsl(var(--border))] overflow-hidden">
        <table className="w-full text-sm">
          <thead className="bg-[hsl(var(--muted))]">
            <tr>
              <th className="text-left px-4 py-2">Model</th>
              <th className="text-left px-4 py-2">Backend</th>
              <th className="text-left px-4 py-2">Capabilities</th>
              <th className="text-left px-4 py-2">Status</th>
            </tr>
          </thead>
          <tbody>
            {models.map(m => (
              <tr key={`${m.backend_id}-${m.id}`} className="border-t border-[hsl(var(--border))]">
                <td className="px-4 py-2 font-mono">{m.id}</td>
                <td className="px-4 py-2">{m.owned_by}</td>
                <td className="px-4 py-2">{m.capabilities.join(', ')}</td>
                <td className="px-4 py-2">
                  <span className={`text-xs px-2 py-0.5 rounded-full ${m.backend_status === 'healthy' ? 'bg-green-500/20 text-green-400' : 'bg-red-500/20 text-red-400'}`}>
                    {m.backend_status}
                  </span>
                </td>
              </tr>
            ))}
          </tbody>
        </table>
      </div>
    </div>
  );
}

function Stat({ label, value }: { label: string; value: string }) {
  return (
    <div>
      <div className="text-xs text-[hsl(var(--muted-foreground))]">{label}</div>
      <div className="text-lg font-semibold">{value}</div>
    </div>
  );
}
48
hub-web/src/pages/Login.tsx
Normal file
@@ -0,0 +1,48 @@
import { useState } from 'react';
import { login } from '../lib/api';
import { useAuth } from '../lib/auth';

export default function Login() {
  const { setRole } = useAuth();
  const [password, setPassword] = useState('');
  const [error, setError] = useState('');
  const [loading, setLoading] = useState(false);

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    setError('');
    setLoading(true);
    try {
      const { role } = await login(password);
      setRole(role);
    } catch (err) {
      setError(err instanceof Error ? err.message : 'Login failed');
    } finally {
      setLoading(false);
    }
  };

  return (
    <div className="dark flex items-center justify-center min-h-screen bg-[hsl(var(--background))]">
      <form onSubmit={handleSubmit} className="w-80 space-y-4">
        <h1 className="text-2xl font-semibold text-center text-[hsl(var(--foreground))]">AI Gateway</h1>
        <input
          type="password"
          value={password}
          onChange={e => setPassword(e.target.value)}
          placeholder="Password"
          autoFocus
          className="w-full px-3 py-2 rounded-md border border-[hsl(var(--border))] bg-[hsl(var(--background))] text-[hsl(var(--foreground))] focus:outline-none focus:ring-2 focus:ring-[hsl(var(--ring))]"
        />
        <button
          type="submit"
          disabled={loading}
          className="w-full px-3 py-2 rounded-md bg-[hsl(var(--primary))] text-[hsl(var(--primary-foreground))] hover:opacity-90 disabled:opacity-50"
        >
          {loading ? 'Logging in...' : 'Login'}
        </button>
        {error && <p className="text-sm text-[hsl(var(--destructive))] text-center">{error}</p>}
      </form>
    </div>
  );
}
32
hub-web/tsconfig.app.json
Normal file
@@ -0,0 +1,32 @@
{
  "compilerOptions": {
    "tsBuildInfoFile": "./node_modules/.tmp/tsconfig.app.tsbuildinfo",
    "target": "ES2023",
    "useDefineForClassFields": true,
    "lib": ["ES2023", "DOM", "DOM.Iterable"],
    "module": "ESNext",
    "types": ["vite/client"],
    "skipLibCheck": true,

    /* Bundler mode */
    "moduleResolution": "bundler",
    "allowImportingTsExtensions": true,
    "verbatimModuleSyntax": true,
    "moduleDetection": "force",
    "noEmit": true,
    "jsx": "react-jsx",

    /* Linting */
    "strict": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "erasableSyntaxOnly": true,
    "noFallthroughCasesInSwitch": true,
    "noUncheckedSideEffectImports": true,
    "baseUrl": ".",
    "paths": {
      "@/*": ["./src/*"]
    }
  },
  "include": ["src"]
}
7
hub-web/tsconfig.json
Normal file
@@ -0,0 +1,7 @@
{
  "files": [],
  "references": [
    { "path": "./tsconfig.app.json" },
    { "path": "./tsconfig.node.json" }
  ]
}
26
hub-web/tsconfig.node.json
Normal file
@@ -0,0 +1,26 @@
{
  "compilerOptions": {
    "tsBuildInfoFile": "./node_modules/.tmp/tsconfig.node.tsbuildinfo",
    "target": "ES2023",
    "lib": ["ES2023"],
    "module": "ESNext",
    "types": ["node"],
    "skipLibCheck": true,

    /* Bundler mode */
    "moduleResolution": "bundler",
    "allowImportingTsExtensions": true,
    "verbatimModuleSyntax": true,
    "moduleDetection": "force",
    "noEmit": true,

    /* Linting */
    "strict": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "erasableSyntaxOnly": true,
    "noFallthroughCasesInSwitch": true,
    "noUncheckedSideEffectImports": true
  },
  "include": ["vite.config.ts"]
}
21
hub-web/vite.config.ts
Normal file
@@ -0,0 +1,21 @@
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'
import path from 'path'

export default defineConfig({
  plugins: [react(), tailwindcss()],
  resolve: {
    alias: {
      '@': path.resolve(__dirname, './src'),
    },
  },
  server: {
    proxy: {
      '/v1': 'http://localhost:8000',
      '/auth': 'http://localhost:8000',
      '/health': 'http://localhost:8000',
      '/gpu': 'http://localhost:8000',
    },
  },
})