hyungi_document_server/app/services/document_telemetry.py
Hyungi Ahn 6fdc48e5b6 feat(ai): B-1 summary tier split: triage (4B) + deep_summary (26B)
Reuses the PR-A policy layer to add a tier triage path to classify_worker.
The legacy ai_summary / ai_domain / ai_suggestion fields are kept (zero
regressions); tldr / bullets / detail / inconsistencies are split out into
separate fields.

Migrations (156-160):
- 156 documents: 5 columns (ai_tldr, ai_bullets, ai_detail_summary,
  ai_inconsistencies, ai_analysis_tier)
- 157 process_stage: ADD VALUE 'deep_summary' alone in its own migration (works
  around the Postgres same-transaction restriction on new enum values)
- 158 processing_queue.payload JSONB (carries the envelope)
- 159 analyze_events: tier + suppressed_reason
- 160 partial index on suppressed_reason

Models/ORM:
- Document: 5 Mapped columns added
- ProcessingQueue: enum extended with deep_summary + payload field; enqueue_stage
  gains a payload option
- AnalyzeEvent: 6 PR-A shadow columns + PR-B tier/suppressed_reason

Workers:
- classify_worker: adds _run_tier_triage after the existing legacy path.
  - _match_subject_domain(doc, text): determines the PR-A policy subject_domain
    name from source_channel + body keywords + ai_domain prefix (matching on
    category is forbidden).
  - R1: TriageOutput pydantic model + fallback for broken JSON
    (triage_json_invalid).
  - R2: _check_backlog_guard(): suppresses soft escalation when the 30-minute
    window ratio > threshold OR the pending count exceeds its limit. Hard
    escalation passes through.
  - R3: _slice_text_ranges(): above 260k, splits the text into 3 pieces: head
    120k + mid 20k + tail 120k.
  - On escalation, builds an EscalationEnvelope and enqueues deep_summary with a
    {envelope, subject_domain} payload.
- deep_summary_worker (new): reads envelope + subject_domain from the queue
  payload → render_26b("p3c_deep_summary", subject_domain) + MLX call (via
  llm_gate Semaphore(1)) → stores ai_detail_summary + ai_inconsistencies and sets
  ai_analysis_tier='deep'. _filter_inconsistencies lets only the allowed kinds
  (version_drift / procedure_conflict / source_conflict / missing_basis)
  through; purchase/contract kinds are dropped.
- queue_consumer: adds deep_summary to the workers dict with BATCH_SIZE=1.
  next_stages is untouched: classify → embed/chunk stays as is, and deep_summary
  is an independent chain.
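
The R3 slicing rule above can be sketched as follows. This is a minimal
illustration, not the code in classify_worker: the commit message gives only the
sizes, so the unit (characters), the constant names, and the choice to centre
the middle slice are all assumptions.

```python
from __future__ import annotations

# Hypothetical constants; the source states the sizes but not the unit.
MAX_LEN = 260_000
HEAD_LEN = 120_000
MID_LEN = 20_000
TAIL_LEN = 120_000


def slice_text_ranges(text: str) -> list[str]:
    """Return [text] if it fits, else three pieces: head, middle, tail."""
    if len(text) <= MAX_LEN:
        return [text]
    # Assumption: the middle slice is centred in the document.
    mid_start = (len(text) - MID_LEN) // 2
    return [
        text[:HEAD_LEN],
        text[mid_start:mid_start + MID_LEN],
        text[-TAIL_LEN:],
    ]
```

Because the three sizes sum to exactly 260k, the pieces never overlap for any
input longer than the limit.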

Telemetry:
- record_analyze_event: extended with the subject_domain / risk_flags /
  escalation_reasons / confidence / policy_version / shadow_would_route_to /
  tier / escalated_to_26b / suppressed_reason parameters. The classify/deep
  workers record events with mode="summary_triage" or "summary_deep".
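
As a rough sketch of where a suppressed_reason value could come from, the R2
backlog guard can be written as a pure function. The function name, the
"soft"/"hard" labels, the default thresholds, and the reason strings below are
all hypothetical; the commit message specifies only the rule itself (ratio over
threshold OR pending over limit suppresses soft escalation, hard escalation
passes).

```python
from __future__ import annotations


def check_backlog_guard(
    escalation_kind: str,          # "soft" or "hard" (hypothetical labels)
    escalated_ratio: float,        # escalation ratio over the 30-minute window
    pending: int,                  # deep_summary jobs currently pending
    ratio_threshold: float = 0.3,  # placeholder value, not from the source
    max_pending: int = 20,         # placeholder value, not from the source
) -> str | None:
    """Return a suppressed_reason string, or None if escalation may proceed."""
    if escalation_kind == "hard":
        return None  # hard escalation always passes the guard
    if escalated_ratio > ratio_threshold:
        return "backlog_ratio_exceeded"    # hypothetical reason string
    if pending > max_pending:
        return "backlog_pending_exceeded"  # hypothetical reason string
    return None
```

A non-None result would be passed to record_analyze_event as suppressed_reason
instead of enqueueing deep_summary.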

API:
- DocumentResponse exposes 5 fields: ai_tldr / ai_bullets / ai_detail_summary /
  ai_inconsistencies / ai_analysis_tier.

Prompts:
- classify.txt: only a DEPRECATED comment added (file kept to preserve the
  rollback path).
- Uses PR-A's app/prompts/policy/p3a_short_summary.txt (4B) and
  p3c_deep_summary.txt (26B) as is. My own summary_triage.txt /
  summary_deep.txt were duplicates, so they were dropped before ever being
  created, rather than removed in a separate commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 10:22:40 +09:00

104 lines
3.5 KiB
Python

"""Document-related telemetry: Phase E.2 (analyze_events).

Records /documents/{id}/analyze calls to the DB from a background task.
Same pattern as search_telemetry.py (dedicated session + errors absorbed).
"""

from __future__ import annotations

import logging
from typing import Any

from sqlalchemy.exc import SQLAlchemyError

from core.database import async_session
from models.analyze_event import AnalyzeEvent

logger = logging.getLogger("document_telemetry")

# source enum validation: the server enforces a fallback
VALID_SOURCES: set[str] = {
    "document_server",
    "synology_chat",
    "ui_search",
    "ui_detail",
    "eval",
    "unknown",
}
DEFAULT_SOURCE = "document_server"


def sanitize_source(raw: str | None) -> str:
    """Enforce the source value server-side.

    Values outside the enum become unknown; None becomes document_server.
    """
    if raw is None:
        return DEFAULT_SOURCE
    lowered = raw.strip().lower()
    if lowered in VALID_SOURCES:
        return lowered
    return "unknown"


async def record_analyze_event(
    doc_id: int,
    user_id: int | None,
    mode: str,
    text_limit: int | None,
    truncated: bool,
    layers_returned: list[str],
    cached: bool,
    latency_ms: int,
    model_name: str | None,
    prompt_version: str | None,
    error_code: str | None,
    source: str,
    # PR-A shadow observability: the 6 fields below are set only when routing is involved; otherwise they stay None.
    subject_domain: str | None = None,
    risk_flags: list[str] | None = None,
    high_impact_task: bool | None = None,
    escalation_reasons: list[str] | None = None,
    confidence: float | None = None,
    policy_version: str | None = None,
    shadow_would_route_to: str | None = None,
    # PR-B B-1: the tier actually called and the R2 backlog guard
    tier: str | None = None,
    escalated_to_26b: bool | None = None,
    suppressed_reason: str | None = None,
) -> None:
    """INSERT into analyze_events. Called from a background task; DB errors are swallowed.

    layers_returned: list of layer strings such as ["evidence","summary"] on success; [] on failure.
    error_code: None (success) | "timeout" | "llm" | "parse" | "missing_summary" | "no_text" | "not_found"
    tier: 'triage' | 'primary' | 'fallback', the tier actually called (PR-B B-0~B-2).
    suppressed_reason: reason string when the R2 backlog guard suppressed a soft escalation.
    """
    try:
        async with async_session() as session:
            row = AnalyzeEvent(
                doc_id=doc_id,
                user_id=user_id,
                mode=mode,
                text_limit=text_limit,
                truncated=truncated,
                layers_returned=layers_returned,
                cached=cached,
                latency_ms=latency_ms,
                model_name=model_name,
                prompt_version=prompt_version,
                error_code=error_code,
                source=source,
                subject_domain=subject_domain,
                risk_flags=risk_flags,
                high_impact_task=high_impact_task,
                escalated_to_26b=escalated_to_26b,
                escalation_reasons=escalation_reasons,
                confidence=confidence,
                policy_version=policy_version,
                shadow_would_route_to=shadow_would_route_to,
                tier=tier,
                suppressed_reason=suppressed_reason,
            )
            session.add(row)
            await session.commit()
    except SQLAlchemyError as exc:
        # Telemetry must never break the caller: log and move on.
        logger.warning("analyze_event insert failed: %s", exc)
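
The normalisation rules of sanitize_source can be exercised in isolation. The
snippet below repeats the constants and the function from this file so that it
runs without the app's database imports; only the three example inputs at the
bottom are additions, chosen for illustration.

```python
from __future__ import annotations

VALID_SOURCES: set[str] = {
    "document_server",
    "synology_chat",
    "ui_search",
    "ui_detail",
    "eval",
    "unknown",
}
DEFAULT_SOURCE = "document_server"


def sanitize_source(raw: str | None) -> str:
    """Map None to the default, known values to themselves, the rest to unknown."""
    if raw is None:
        return DEFAULT_SOURCE
    lowered = raw.strip().lower()
    if lowered in VALID_SOURCES:
        return lowered
    return "unknown"


print(sanitize_source(None))            # document_server
print(sanitize_source("  UI_Search "))  # ui_search (trimmed and lowercased)
print(sanitize_source("slack"))         # unknown (not in the enum)
```

A caller would typically pass the sanitized value as the source argument of
record_analyze_event, so that analyze_events only ever stores values from the
enum.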