Hyungi Ahn 09883d0358 feat(ask): Phase 3.5 A0 — ask_events source/eval_case_id + eval auth boundary
- migrations 138~142: add source TEXT DEFAULT 'document_server' + eval_case_id TEXT,
  two indexes, backfill, then NOT NULL after one week of observation (140 applied separately)
- app/models/ask_event.py: source / eval_case_id ORM fields (nullable during stages 138~141)
- app/services/search_telemetry.py: add source / eval_case_id to the record_ask_event signature
- app/core/config.py: settings.eval_runner_token + load from the EVAL_RUNNER_TOKEN env var
- app/api/search.py:
  - accept the X-Source / X-Eval-Case-Id / X-Eval-Token headers
  - _resolve_eval_identity(): verify the token with hmac.compare_digest; on failure,
    downgrade source to 'document_server' + warning log + eval_case_id=None
  - pass the verified source/eval_case_id to both record_ask_event calls
- credentials.env.example: EVAL_RUNNER_TOKEN= (empty default = reject all eval claims)
- tests/test_ask_eval_auth.py: 9 cases: token missing/wrong/matching, env unset,
  case_id only, non-eval source forces case_id None

trust boundary: attempts by ordinary clients to send X-Source=eval / X-Eval-Case-Id are
ignored, so calibration telemetry cannot be polluted. Only the eval runner authenticates,
using EVAL_RUNNER_TOKEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 08:11:06 +09:00

49 lines
2.4 KiB
Python

"""ORM for the ask_events table: observes /ask calls (Phase 3.5a migration 102, Phase 3.5b wiring).
Data for threshold calibration + verifier FP analysis + defense layer debugging.
"""

from datetime import datetime
from typing import Any

from sqlalchemy import BigInteger, Boolean, DateTime, Float, ForeignKey, Integer, String, Text
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import Mapped, mapped_column

from core.database import Base


class AskEvent(Base):
    __tablename__ = "ask_events"

    id: Mapped[int] = mapped_column(BigInteger, primary_key=True)
    query: Mapped[str] = mapped_column(Text, nullable=False)
    user_id: Mapped[int | None] = mapped_column(
        BigInteger, ForeignKey("users.id", ondelete="SET NULL")
    )
    completeness: Mapped[str | None] = mapped_column(Text)  # full / partial / insufficient
    synthesis_status: Mapped[str | None] = mapped_column(Text)
    confidence: Mapped[str | None] = mapped_column(Text)  # high / medium / low
    refused: Mapped[bool] = mapped_column(Boolean, default=False, nullable=False)
    classifier_verdict: Mapped[str | None] = mapped_column(Text)  # sufficient / insufficient
    max_rerank_score: Mapped[float | None] = mapped_column(Float)
    aggregate_score: Mapped[float | None] = mapped_column(Float)
    hallucination_flags: Mapped[list[Any] | None] = mapped_column(JSONB, default=list)
    evidence_count: Mapped[int | None] = mapped_column(Integer)
    citation_count: Mapped[int | None] = mapped_column(Integer)
    defense_layers: Mapped[dict[str, Any] | None] = mapped_column(JSONB)
    total_ms: Mapped[int | None] = mapped_column(Integer)

    # Phase E.1: extended measurement fields (answer_length is key to the E.3 400→600 character comparison)
    answer_length: Mapped[int | None] = mapped_column(Integer)
    covered_aspects: Mapped[list[Any] | None] = mapped_column(JSONB)
    missing_aspects: Mapped[list[Any] | None] = mapped_column(JSONB)
    model_name: Mapped[str | None] = mapped_column(Text)
    prompt_version: Mapped[str | None] = mapped_column(Text)

    # Phase 3.5 calibration: eval/production separation + golden join key
    # Stages 138~141: nullable. After 142 is applied, source is NOT NULL (enforced by the DB; the app always fills it).
    source: Mapped[str | None] = mapped_column(Text)
    eval_case_id: Mapped[str | None] = mapped_column(Text)
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), default=datetime.now, nullable=False
    )
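
The staged rollout noted in the source comment (nullable first, NOT NULL only after backfill and observation) implies a check that no NULL rows remain before the final constraint is applied. The sketch below illustrates that check with an in-memory sqlite3 database as a stand-in for PostgreSQL. The helper names, sample rows, and the explicit NULL insert are hypothetical, and the actual migration files are not shown in this document.

```python
import sqlite3


def count_null_sources(conn: sqlite3.Connection) -> int:
    """Count rows that would violate a NOT NULL constraint on source."""
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM ask_events WHERE source IS NULL"
    ).fetchone()
    return n


def backfill_source(conn: sqlite3.Connection, default: str = "document_server") -> int:
    """Fill missing source values and return the number of rows updated."""
    cur = conn.execute(
        "UPDATE ask_events SET source = ? WHERE source IS NULL", (default,)
    )
    conn.commit()
    return cur.rowcount


def demo() -> tuple[int, int, int]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ask_events (id INTEGER PRIMARY KEY, query TEXT NOT NULL)")
    conn.executemany("INSERT INTO ask_events (query) VALUES (?)", [("q1",), ("q2",)])

    # Early stage: add the columns as nullable, source with a default.
    conn.execute("ALTER TABLE ask_events ADD COLUMN source TEXT DEFAULT 'document_server'")
    conn.execute("ALTER TABLE ask_events ADD COLUMN eval_case_id TEXT")

    # Hypothetical writer that still sends an explicit NULL for source.
    conn.execute("INSERT INTO ask_events (query, source) VALUES ('q3', NULL)")

    before = count_null_sources(conn)
    updated = backfill_source(conn)
    after = count_null_sources(conn)
    conn.close()
    return before, updated, after


print(demo())
```

Only when the final count is zero would it be safe to apply the NOT NULL step. On PostgreSQL that step would be an ALTER TABLE ... ALTER COLUMN ... SET NOT NULL statement, which sqlite3 does not support and which is therefore left out of the sketch.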