feat(ask): Phase 3.5a guardrails (classifier + refusal gate + grounding + partial)
New files:
- classifier_service.py: exaone binary classifier (sufficient/insufficient), runs in parallel with evidence, circuit breaker, 5s timeout
- refusal_gate.py: multi-signal fusion (score + classifier) with an AND condition; conservative 3-tier fallback (when the classifier is unavailable)
- grounding_check.py: separates strong and weak flags
    strong: fabricated_number + intent_misalignment (important keywords)
    weak: uncited_claim + low_overlap + intent_misalignment (generic)
    re-gate: 2+ strong → refuse, 1 strong → partial
- sentence_splitter.py: regex-based (to be upgraded to KSS in Phase 3.5b)
- classifier.txt: exaone Y+ prompt (includes calibration examples)
- search_synthesis_partial.txt: prompt dedicated to partial answers
- 102_ask_events.sql: /ask observability table (completeness metric split three ways)
- queries.yaml: Phase 3.5 smoke-test evaluation set (10 queries)

Modified files:
- search.py /ask: parallel classifier + refusal gate + grounding re-gate + defense_layers logging + AskResponse completeness/aspects/confirmed_items
- config.yaml: classifier model section (exaone3.5:7.8b GPU Ollama)
- config.py: optional parsing of the classifier section
- AskAnswer.svelte: 4-branch render (full/partial/insufficient/loading)
- ask.ts: Completeness + ConfirmedItem types

P1 measurement: exaone ternary output was unstable → reduced to a binary gate. Partial answers are handled by grounding. Finalized after 9 rounds of discussion.

plan: quiet-meandering-nova.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
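The grounding re-gate rule listed under grounding_check.py (2+ strong → refuse, 1 strong → partial) could look roughly like the sketch below. This is an illustration only, not the code in this commit. The names `regate`, `RegateResult`, and the `_important` / `_generic` suffixes are hypothetical. It also assumes that a refusal is reported as completeness `insufficient` and that weak flags are recorded without changing the outcome.

```python
from dataclasses import dataclass, field

# Flag names come from the commit message. The "_important" / "_generic"
# suffixes are hypothetical: they model the two intent_misalignment variants.
STRONG_FLAGS = {"fabricated_number", "intent_misalignment_important"}
WEAK_FLAGS = {"uncited_claim", "low_overlap", "intent_misalignment_generic"}


@dataclass
class RegateResult:
    completeness: str  # "full" | "partial" | "insufficient"
    refused: bool
    strong: list = field(default_factory=list)
    weak: list = field(default_factory=list)


def regate(flags):
    """Apply the re-gate rule: 2+ strong -> refuse, 1 strong -> partial."""
    strong = [f for f in flags if f in STRONG_FLAGS]
    weak = [f for f in flags if f in WEAK_FLAGS]
    if len(strong) >= 2:
        # Assumption: a refusal surfaces as completeness "insufficient".
        return RegateResult("insufficient", True, strong, weak)
    if len(strong) == 1:
        return RegateResult("partial", False, strong, weak)
    # Assumption: weak flags alone are logged but do not downgrade the answer.
    return RegateResult("full", False, strong, weak)
```

Keeping the strong and weak lists on the result makes it straightforward to copy them into a per-layer log such as the `defense_layers` column.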
26
migrations/102_ask_events.sql
Normal file
@@ -0,0 +1,26 @@
-- Phase 3.5a: observability table for /ask calls
-- Measures refusal rate, splits the completeness metric three ways (full/partial/insufficient), and supports defense-layer debugging

CREATE TABLE IF NOT EXISTS ask_events (
    id BIGSERIAL PRIMARY KEY,
    query TEXT NOT NULL,
    user_id BIGINT REFERENCES users(id),
    completeness TEXT,  -- full / partial / insufficient
    synthesis_status TEXT,
    confidence TEXT,
    refused BOOLEAN DEFAULT false,
    classifier_verdict TEXT,  -- sufficient / insufficient / null (skipped)
    max_rerank_score REAL,
    aggregate_score REAL,
    hallucination_flags JSONB DEFAULT '[]',
    evidence_count INT,
    citation_count INT,
    defense_layers JSONB,  -- per-layer flag snapshot (score_gate, classifier, grounding)
    total_ms INT,
    created_at TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX IF NOT EXISTS idx_ask_events_created ON ask_events(created_at);
CREATE INDEX IF NOT EXISTS idx_ask_events_completeness ON ask_events(completeness);

INSERT INTO schema_migrations (version) VALUES (102);
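To illustrate how the `ask_events` table might be used for the refusal-rate and completeness metrics, here is a minimal sketch. It is not part of this commit. SQLite stands in for PostgreSQL so the example runs without a server, which means the schema is a simplified subset (JSONB stored as TEXT, BOOLEAN as INTEGER). The helper names `log_event`, `completeness_breakdown`, and `refusal_rate` are hypothetical.

```python
import json
import sqlite3

# Simplified SQLite stand-in for the PostgreSQL table (assumption: a subset
# of columns is enough to show the metric queries).
SCHEMA = """
CREATE TABLE ask_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    query TEXT NOT NULL,
    completeness TEXT,
    refused INTEGER DEFAULT 0,
    classifier_verdict TEXT,
    defense_layers TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
)
"""


def log_event(conn, query, completeness, refused, verdict, layers):
    """Insert one /ask observation; layers is serialized as JSON text."""
    conn.execute(
        "INSERT INTO ask_events "
        "(query, completeness, refused, classifier_verdict, defense_layers) "
        "VALUES (?, ?, ?, ?, ?)",
        (query, completeness, int(refused), verdict, json.dumps(layers)),
    )


def completeness_breakdown(conn):
    """Return {completeness: count}, i.e. the three-way split."""
    rows = conn.execute(
        "SELECT completeness, COUNT(*) FROM ask_events GROUP BY completeness"
    ).fetchall()
    return dict(rows)


def refusal_rate(conn):
    """Fraction of logged calls that were refused (0.0 when the table is empty)."""
    total, refused = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(refused), 0) FROM ask_events"
    ).fetchone()
    return refused / total if total else 0.0


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    log_event(conn, "q1", "full", False, "sufficient", {"grounding": []})
    log_event(conn, "q2", "insufficient", True, "insufficient", {"score_gate": True})
    print(completeness_breakdown(conn), refusal_rate(conn))
```

Against the real PostgreSQL table the same `GROUP BY completeness` query would apply, and the `idx_ask_events_created` index suggests these metrics are meant to be filtered by time window.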