feat(policy): domain_policy.yaml v1 (safety_health + news)

Single Source of Truth for PR-A.

- 9 subject_domains (safety_reference / safety_operational / msds / hazard_specific / incident_report / health_record / safety_video / news_item / news_digest_request)
- fallback_domain
- 10 risk_flags
- forbidden_for_4b: 6 categories
- escalation thresholds
- observability

Axis principle (feedback_category_vs_ai_domain_axis.md):
- subject_domain matching keys = source_channel / keywords / tags / ai_domain
- documents.category is the UI axis (must not be used as a matching key)
- suggested_ui_category is an OUTPUT mapping (for classification suggestions)

Scope: safety_health + news only. Fiction is split out into a separate policy.

plan: ~/.claude/plans/wise-gliding-hippo.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 09:30:25 +09:00
parent ddfcdbb68a
commit fad73ba88c
1 changed files with 242 additions and 0 deletions
@@ -0,0 +1,242 @@
+# domain_policy.yaml
+# ============================================================================
+# Single Source of Truth for AI routing/escalation/forbidden rules.
+#
+# - Code loads this file (app/policy/loader.py).
+# - Prompts receive excerpts rendered from this yaml at runtime.
+# - Rules are never hardcoded directly into prompts (prevents drift).
+# - On change, policy_version (sha256(yaml+template)[:12]) is bumped automatically.
+#
+# Axis separation (feedback_category_vs_ai_domain_axis.md):
+#   - subject_domain matching keys = source_channel / keywords (body text) / tags / ai_domain
+#     (documents.category **must not be used as a matching key**; it is the UI axis)
+#   - suggested_ui_category = the UI category that classify_worker **suggests** to the user
+#     (actual storage and transition are decided in PR-B's ai_suggestion approval flow)
+#
+# Scope v1: safety_health + news. Fiction is split out into a separate policy (domain not yet finalized).
+# ============================================================================
+
+version: 1
+last_updated: "2026-04-24"
+scope: [safety_health, news]
+self_declare_semantics: additive_trigger_only  # 4B self-declare can only ADD triggers, never turn them OFF
+
+# ---------------------------------------------------------------------------
+# subject_domains: routing and policy axis
+# ---------------------------------------------------------------------------
+# Matching method (decided upstream in routing.py):
+#   - keywords: keyword match within the first N chars of the body (case-insensitive)
+#   - source_channel: external ingestion path (`law_monitor`, `news_collector`, etc.)
+#   - tags: ai_domain tags / user tags
+# **Matching on category (UI axis) is forbidden**; see feedback_category_vs_ai_domain_axis.md
+#
+# suggested_ui_category: doc_category enum as observed in practice ∈ {document,library,news,memo,audio,video,law}
+# ---------------------------------------------------------------------------
+subject_domains:
+
+  safety_reference:
+    description: "Statutes and standards documents related to the Occupational Safety and Health Act and serious accidents (user uploads)"
+    suggested_ui_category: document
+    high_impact: true
+    default_risk_flags: [safety_legal_interpretation]
+    keywords: [산업안전보건법, 중대재해, 안전보건관리체계, 유해위험방지계획서]
+    note: "Statutes ingested automatically by law_monitor are handled separately via source_channel=law_monitor (classify skip); this domain covers user-uploaded PDF/DOC files"
+
+  safety_operational:
+    description: "Field operations documents: risk assessment, work permits, JSA, SOP"
+    suggested_ui_category: document
+    high_impact: true
+    default_risk_flags: [safety_operational_decision]
+    keywords: [위험성평가, 작업허가서, JSA, 안전작업지침, 보호구, SOP]
+
+  msds:
+    description: "MSDS / SDS / chemical safety data"
+    suggested_ui_category: document
+    high_impact: true
+    default_risk_flags: [chemical_hazard, safety_legal_interpretation]
+    keywords: [MSDS, SDS, 물질안전보건자료, 화학물질, 유해화학물질]
+
+  hazard_specific:
+    description: "Materials by hazard type (falls, caught-in, electric shock, confined spaces, fire and explosion)"
+    suggested_ui_category: document
+    high_impact: true
+    default_risk_flags: [safety_operational_decision]
+    keywords: [밀폐공간, 추락, 끼임, 감전, 화재, 폭발, 중독, 질식]
+
+  incident_report:
+    description: "Accident and incident reports"
+    suggested_ui_category: document
+    high_impact: true
+    default_risk_flags: [incident_causation]
+    keywords: [사고보고, 재해조사, 원인분석, 재발방지, 중대재해]
+
+  health_record:
+    description: "Health checkups, work environment measurement, health management"
+    suggested_ui_category: document
+    high_impact: true
+    default_risk_flags: [medical_health_judgment]
+    keywords: [건강검진, 작업환경측정, 보건관리, 특수건강진단]
+
+  safety_video:
+    description: "Safety and health training / on-site videos (STT targets)"
+    suggested_ui_category: video
+    high_impact: false  # 4B can handle classification and chapter splitting. Escalated at the deep-summary stage
+    deep_summary_risk_flags: [safety_operational_decision, safety_legal_interpretation]
+    keywords: []  # No body text. Matched on filename/source_channel
+
+  news_item:
+    description: "Single news article"
+    suggested_ui_category: news
+    high_impact: false
+    default_risk_flags: []
+    keywords: []  # Matched via source_channel=news_collector
+
+  news_digest_request:
+    description: "Multi-source, multi-country news synthesis request (extension of the Phase 4 digest)"
+    suggested_ui_category: news
+    high_impact: true
+    default_risk_flags: [news_cross_source, multi_reference_synthesis]
+    keywords: []  # Determined from the user's request intent (upstream)
+
+# ---------------------------------------------------------------------------
+# fallback_domain: safe landing point when matching fails (INV-6)
+# ---------------------------------------------------------------------------
+fallback_domain:
+  name: generic
+  description: "Default domain when matching fails; routes to the human review queue"
+  suggested_ui_category: document
+  high_impact: false
+  default_risk_flags: [low_confidence_reasoning]  # already an "I don't know" signal
+  requires_human_review: true
+
+# ---------------------------------------------------------------------------
+# risk_flags: 26B escalation conditions + synthesis directive
+# ---------------------------------------------------------------------------
+risk_flags:
+
+  safety_legal_interpretation:
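+    description: "Interpretation of how a statute or standard provision applies to a specific situation"
+    requires_26b: true
+    synthesis_directive: "Quoting the original provision text is mandatory. Interpret only at the level of 'general intent'. Do not assert how it applies to a specific situation."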
+
+  safety_operational_decision:
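+    description: "Decisions on on-site safety measures and judgments of their adequacy"
+    requires_26b: true
+    synthesis_directive: "Listing measures is OK. Do not assert 'sufficient/lawful' ('충분/적법'). Phrase as 'items to review'."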
+
+  chemical_hazard:
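+    description: "MSDS / chemical handling and exposure"
+    requires_26b: true
+    synthesis_directive: "Prefer quoting the original MSDS. For substitute substances and handling methods, go no further than listing references."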
+
+  medical_health_judgment:
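+    description: "Interpreting health checkup results, estimating exposure effects, causes of symptoms"
+    requires_26b: true
+    synthesis_directive: "Refuse medical judgments. Include a 'consulting a specialist is recommended' ('전문의 상담 권장') statement."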
+
+  incident_causation:
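+    description: "Attribution of accident or incident causes"
+    requires_26b: true
+    synthesis_directive: "Never write 'the cause is ~' ('원인은 ~'). Use only the passive form '~ is recorded as a related factor' ('관련 요인으로 ~가 기록됨')."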
+
+  multi_reference_synthesis:
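+    description: "Writing a guide that synthesizes multiple references"
+    requires_26b: true
+    synthesis_directive: "Describe each reference's position separately. A comparison table is recommended. No final recommendation."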
+
+  news_cross_source:
+    description: "Multi-source, multi-country news synthesis"
+    requires_26b: true
+    synthesis_directive: "State the differences in coverage between sources. Neutral narration."
+
+  multi_doc_dependency:
+    description: "Requires cross-referencing 3 or more documents (INV-4 derived)"
+    requires_26b: true
+
+  low_confidence_reasoning:
+    description: "4B self-reported confidence < 0.7 (derived)"
+    requires_26b: true
+
+  pii_present:
+    description: "Contains personal information (resident registration number, bank account, etc.)"
+    requires_26b: false  # not a judgment risk; only output masking is required
+    output_mask_required: true
+
+# ---------------------------------------------------------------------------
+# forbidden_for_4b: tasks 4B must never perform
+# ---------------------------------------------------------------------------
+# detection_patterns: evaluated with the Python re.search() engine, not Postgres regex.
+# ---------------------------------------------------------------------------
+forbidden_for_4b:
+
+  - id: legal_interpretation
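+    description: "Interpretation of how statutes, public notices, or safety and health standards apply to a specific situation"
+    applies_when_subject_in: [safety_reference, safety_operational, msds]
+    detection_patterns: []  # structural detection (article number + assertive phrasing), separate heuristic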
+
+  - id: safety_sufficiency_assertion
+    description: "Judgments on the sufficiency or legality of safety measures"
+    applies_when_subject_in: [safety_operational, hazard_specific, msds]
+    detection_patterns:
+      - '(이대로|이렇게)\s*하면\s*(됩니다|된다|괜찮)'
+      - '(충분합니다|적법합니다|문제\s*없)'
+      - '걱정\s*(없|안)'
+
+  - id: medical_health_judgment
+    description: "Health and medical judgments"
+    applies_when_subject_in: [health_record]
+    detection_patterns:
+      - '(증상|노출)[은는이가]\s+[가-힣]+\s*(입니다|이다)'
+      - '(건강상|의학적으로)\s+.*(우려|문제)\s*(없|있)'
+
+  - id: incident_causation_assertion
+    description: "Asserting the cause of an accident"
+    applies_when_subject_in: [incident_report]
+    detection_patterns:
+      # "원인은 ~ 입니다/이다/임" ("the cause is ~"; noun phrases containing spaces are allowed)
+      - '원인은\s+[가-힣A-Za-z0-9\s]+?(입니다|이다|임)'
+      # "~ 때문에 발생" / "~ 으로 인해 발생" ("occurred because of ~"; a space is allowed after the preceding noun)
+      - '[가-힣]+\s*(때문에|으로\s*인해)\s+발생'
+
+  - id: multi_reference_synthesis
+    description: "Writing an official guide that synthesizes multiple references"
+    applies_when_subject_in: [safety_reference, safety_operational, hazard_specific]
+    detection_patterns: []  # structural (evidence docs >= 3 AND a conclusion-style paragraph)
+
+  - id: news_multi_source_synthesis
+    description: "Final synthesis of news from 2 or more sources"
+    applies_when_subject_in: [news_item]
+    detection_patterns: []
+
+# ---------------------------------------------------------------------------
+# escalation: deterministic thresholds (INV-3/INV-4)
+# ---------------------------------------------------------------------------
+escalation:
+  confidence_threshold: 0.7            # escalate if confidence < this
+  context_char_cap_4b: 120000          # INV-3: force escalate when exceeded
+  context_char_cap_26b: 260000
+  escalate_on_multi_doc_count: 3       # INV-4: escalate if count >= this
+
+# ---------------------------------------------------------------------------
+# observability: required analyze_events fields + healthy ranges
+# ---------------------------------------------------------------------------
+observability:
+  required_event_fields:
+    - prompt_version
+    - model_name
+    - subject_domain
+    - risk_flags
+    - high_impact_task
+    - confidence
+    - escalated_to_26b
+    - escalation_reasons
+    - policy_violation
+    - policy_version
+
+  health_ranges:
+    escalation_ratio_24h:          {min: 0.10, max: 0.35}
+    high_impact_26b_coverage:      {min: 1.00, max: 1.00}   # fixed at 100%
+    over_escalation_ratio:         {max: 0.15}
+    summary_quality_low_ratio:     {max: 0.03}
+    answerability_insufficient:    {max: 0.20}
+    policy_violation_24h:          {max: 0}
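
The `escalation` thresholds and `forbidden_for_4b` detection patterns above could be consumed roughly as in the following sketch. It is not the code in app/policy/loader.py or routing.py, which this commit does not include: the function names `escalation_reasons` and `violates_forbidden`, the reason labels, and the inlined constants are assumptions made for illustration. Only the threshold values, the comparison directions, and the two regexes are copied from the policy file.

```python
import re

# Values copied from the `escalation` section of domain_policy.yaml.
# A real loader would presumably parse the YAML; they are inlined here
# to keep the sketch self-contained.
ESCALATION = {
    "confidence_threshold": 0.7,
    "context_char_cap_4b": 120000,
    "escalate_on_multi_doc_count": 3,
}

# Patterns copied from forbidden_for_4b (id: incident_causation_assertion).
INCIDENT_PATTERNS = [
    r'원인은\s+[가-힣A-Za-z0-9\s]+?(입니다|이다|임)',
    r'[가-힣]+\s*(때문에|으로\s*인해)\s+발생',
]


def escalation_reasons(confidence, context_chars, doc_count):
    """Return reasons to escalate to 26B; an empty list means stay on 4B.

    The reason labels are hypothetical, chosen to echo names in the policy.
    """
    reasons = []
    # "escalate if confidence < threshold"
    if confidence < ESCALATION["confidence_threshold"]:
        reasons.append("low_confidence_reasoning")
    # INV-3: force escalate when the 4B context cap is exceeded
    if context_chars > ESCALATION["context_char_cap_4b"]:
        reasons.append("context_char_cap_4b")
    # INV-4: escalate when the document count reaches the limit
    if doc_count >= ESCALATION["escalate_on_multi_doc_count"]:
        reasons.append("multi_doc_dependency")
    return reasons


def violates_forbidden(text, patterns):
    """True if any detection pattern matches, using re.search as the policy specifies."""
    return any(re.search(p, text) for p in patterns)
```

Under these assumptions, a 4B output such as "원인은 안전대 미착용 입니다" would be flagged, while the passive phrasing required by the `incident_causation` synthesis directive ("관련 요인으로 안전대 미착용이 기록됨") would pass.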