hyungi_document_server

Author	SHA1	Message	Date
hyungi	c2077b3108	feat(summarize): presegment PR2 — deep_summary 분기 + HOLD 배선 (TIER1 로컬 map-reduce) plan ds-presegment-mapreduce-2. TRIGGER(25K tok) 이하 = 기존 단일콜 byte-불변 무회귀. 초과 시 3-way over% 게이트: auto=유닛별 map(26B)→reduce(26B, p3c_deep_summary_reduce 변형) → ai_detail_summary 동일 기록(불일치=reduce+map 합본 dedup) / hybrid·whole= HOLD(payload.presegment.awaiting_split + StageDeferred 24h, 맥미니 미전송 — 알람· 클로드 유인 분할은 PR3). - 유닛 단위 멱등 재개: 성공 유닛 즉시 payload.map_results commit — 502/defer/재시작 후 완료 유닛 skip, 실패 유닛만 raise→기존 attempts/백오프 재사용 - 모든 LLM 콜 캡(12K tok) 이하 — map=greedy-pack 보장, reduce=build_reduce_units_block 비례 절단 보장, est_tokens 로그로 단정 가능 - 콜 사이 gate 해제 → 짧은 인터랙티브 요청 interleave (허브 굶김 해소 본체) - fix: summarize_units 의 `from app.services...` 절대 import — 컨테이너(빌드 컨텍스트 ./app)에 app 패키지가 없어 배선 시 ModuleNotFoundError 나는 PR1 잠복 버그 → 상대 import 로 수정 (컨테이너/repo-root 테스트 양쪽 동작) - tests: 헬퍼 6 + worker seam 5 (map-reduce e2e·재개·유닛실패·drain 보류·HOLD) — PR1 15 포함 26 passed, 인접 policy/hier_decomp/fair_share 123 passed Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-02 09:14:22 +09:00
hyungi	4cdd30950c	refactor(classify): summarize 콜을 tier triage 에 병합 (3콜→2콜) B-1 3→2: p3a_short_summary triage 가 ai_summary 도 생산 → 별도 summarize 콜 제거. classify(domain/type)은 분리 유지(shadow probe 결과 결합 시 domain 노이즈 → 안전하게 요약만 병합). 본문 prefill 3회→2회 = Mac mini 부하 절감. >120K long_context·triage 파싱실패 시 summarize fallback 보존. shadow probe(Industrial_Safety 5문서) 검증: triage ai_summary 품질 legacy summarize 동급. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 16:55:12 +09:00
hyungi	495e1c786f	refactor(search)!: /ask 고아 service·테스트·프롬프트 정리 (검색 단일화 Phase 2) /ask 삭제로 0-consumer 된 자산 제거(3-gate 실증): search.py /ask 섹션(Citation/ConfirmedItem/AskDebug/AskResponse 모델 + 헬퍼 + _resolve_eval_identity) + 죽은 import 13개. service 4(classifier/verifier/refusal_gate/grounding_check). AIClient.call_classifier/call_verifier(고아). 프롬프트 2(classifier/verifier.txt). broken test 6. evidence/synthesis 는 공유(documents.py 등)라 유지. 실 pyflakes 클린(이전 세션 pyflakes 미설치로 검증 누락 → 설치 후 실검증). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 14:39:53 +09:00
hyungi	33ee81bf1d	feat(presegment): G2 PR-3 — LLM 경계 폴백 (flag-gated, 기본 OFF, scaffold-first) ToC 없는/게이트 미달 대형 PDF(>=60p)에 한해 off-card Qwen(맥북, call_deep_or_defer, StageDeferred-safe) 경계 제안 → 동일 검증게이트(_is_clear_bundle) 통과 시에만 deterministic 과 공유하는 _create_children 로 분할. is_bundle=false/파싱·검증 실패=단일문서(오늘과 동일)+로깅. - env PRESEGMENT_LLM_FALLBACK 기본 false → 배포 동작 무변(LLM 미호출, 검증=unit test) - 자식생성 _create_children 공유 헬퍼로 리팩터(deterministic+LLM 단일 경로, 동작 동일) - SegmentationOutput Pydantic + parse_json_response(house 패턴) + per-page heading 샘플(본문 미전송) - prompt app/prompts/presegment_boundaries.txt + tests/test_presegment_llm.py(14, fitz/DB/LLM mock) no direct HTTP·no silent fallback. 활성=flag ON + 실 router fixture 검증 후. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-18 17:53:28 +09:00
hyungi	6a85087b83	feat(eid): 이드 persona substrate W2~W4 — DS compose·약점진단·egress 코드층 박탈 전 로컬 LLM 관통 '이드' persona substrate 의 Document Server 측 빌드(W2~W4). 설계 = PKM eid-persona-substrate(r1~r3 수렴) / impl = eid-persona-impl. W2 — compose + 표면 배선: - app/eid/compose.py: persona→rules→overlay→task 단일 system 문자열 + 정적 ROUTE_MAP (런타임 sniffing 아님) + rules 부재 fail-loud · persona 부재 quiet · overflow fail-loud. - 자유-prose 3 표면(react_ask·study_subject_note·study_question_explanation) 중복 정체성· generic 정책 trim + compose 배선(AIClient 에 additive system 파라미터). 도메인 calibration 보존. - STRICT JSON 기계류(briefing_comparative·digest_topic)는 persona-ZERO 동결(불변식 #3). - app/prompts/substrate/: persona(외부 컴파일 산출물 vendor) + rules(생성 가드 서브셋) + overlay 5. W3 — migration + 워커 + study_diagnosis: - migration 301~305: eid_* append-only 원장(약점/복습초안/회고) + approval_requests(가변 큐) + 일정 파생뷰 2. - app/workers/study_weakness.py: study_question_progress.pattern_state 집계로 약점 derived 산출 (LLM 0) + bounded tier(watch/review/focus). nightly cron. - study_diagnosis 표면: 최신 스냅샷을 코치 언어로 번역(약점 판정은 코드, LLM 은 블록 값만 인용). W4-1 — egress 코드층 박탈: - app/eid/ai.py EidAIClient: 이드 표면 = call_primary(내부 MLX) only. 외부 LLM fallback 경로 구조적 봉쇄(call_fallback raise · 자동 fallback 제거 · 외부 endpoint 차단). egress 워커는 분리 유지. load-bearing 정정 3(환경 grounding 강제, 설계 회귀 아님): - rules = 운영 ruleset 전체 → 생성 가드 서브셋(HTML 산출물 룰이 study task 와 충돌). - append-only = REVOKE → CREATE RULE DO INSTEAD NOTHING(단일 owner role 은 REVOKE 무효 + migration 검증기가 plpgsql BEGIN 거부) + actor/source_* NOT NULL 스탬프. - 이드 LLM 봉쇄 = path discipline → EidAIClient 구조화. 검증: eid 순수 단위테스트 30 통과 + py_compile + migration 검증기 모사 + egress 적대감사 COMPLETE. DB/LLM/httpx 의존 테스트(append-only RULE·EidAIClient·E2E)는 staging(Docker) 가동. W4-2 네트워크 belt 은 조건부 보류(코드층 1차 충분, P0-3② 원격 실측 후 hard-gate 시 승격). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 15:13:20 +09:00
hyungi	0a7402b327	feat(study): 공부 암기노트 Phase 1 — card_extract 추출 파이프라인 (순수 additive) study_memo_cards 추출 파이프라인 + 버전키 폴러 + needs_review 컬럼. 운영 SR 코드(session_finalize/quiz_selection) 무수정. - migrations 287~298: study_memo_cards/_evidence/_jobs/_progress(P1 휴면)·study_reminders·study_topics.focused_at·study_questions needs_review 3컬럼. dedup PARTIAL UNIQUE(deleted_at IS NULL). - 워커: in-process RAG gather → MLX {cards} → 카드 가드(정량=evidence 원문 등장·cue/cloze 누출·dedup) → supersede 구버전 retire → append. 별 consumer 로 기존 study_queue 격리. - 폴러 study_card_enqueue: 버전키 NOT EXISTS(source_version) 멱등 + ai_explanation_generated_at NOT NULL 가드 + per-poll LIMIT(thundering-herd). - 검증: 실 prod 스키마 덤프 위 12 마이그 적용 OK + dedup/supersede/active-unique 기능 7/7 PASS + 정규화 util 15/15. plan: PKM plans/2026-06-05-study-memo-card-p1-plan.html Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 21:33:12 +09:00
hyungi	446ba82c91	feat(eval): Phase 2Q Diagnose Phase 1A — fixture (4 카테고리 × 2 LLM) + prompt v1 phase-2q-query-rewrite-diagnose.md v6 plan 의 Phase 1 fixture 박제 (G0-1 + G0-2). 산출물: - app/prompts/query_rewrite.txt — multi-query rewrite prompt v1 (3 variants: 원본 + 한국어 rephrase + 영어 번역) - tests/fixtures/macmini_gemma4_query_rewrite_response.json — 4 카테고리 (korean_only/mixed/english_only/exam) - tests/fixtures/macbook_qwen_query_rewrite_response.json — 4 카테고리 동일 inspect 9 결과 (2026-05-24): - Mac mini gemma-4-26B-A4B :8801 = response_format json_object 지원 - MacBook qwen3.6-27B-8bit :8810 = response_format json_object 미지원 (120s hang) — prompt rule only - prompt rule \"no markdown, no code fence\" 강제 시 둘 다 strict JSON (gemma 도 fence wrap 없음) - parser fallback (markdown fence regex) 유지 — 첫 호출 prompt 없을 때 wrap 관찰 사례 8 호출 측정: - gemma 1.16~1.36s / qwen 1.93~2.24s (warm) - variants 의미 일관 + 도메인 용어 (ASME/Section VIII/압력용기/가스기사) verbatim preserve - 한국어→영어 cross-lingual translation 자연 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 22:09:29 +00:00
hyungi	51c3f6df10	feat(search): /ask/react endpoint with Qwen native tool calling ReAct loop PR-DocSrv-Ask-ToolCalling-ReAct-1 — Qwen3.6-27B-8bit 의 native tool calling 으로 ReAct loop 도입. 기존 /api/search/ask 무수정. 트랙 B (frontend /ask SSE) 와 파일 단위 충돌 0 (search.py 의 ask() 함수 line diff = 0, 순수 추가). 핵심 invariant: - 별 endpoint /api/search/ask/react (qwen-macbook only, implicit opt-in) - MacBook unavailable 시 HTTP 503 + error_reason=macbook_unavailable. Gemma 자동 fallback X (정정 4 의 연장) G0 (구현 전 hard gate, plan b-velvety-hare.md): - G0-1 fixture (tests/fixtures/qwen_tool_call_response.json): 실제 mlx-vlm 응답 박제. shape = OpenAI 표준 호환 (choices[0].message.tool_calls + function.arguments JSON string). generate_with_tools() 가 본 shape 기준 구현. - G0-2 counter semantics: max_tool_rounds=2 + max_llm_calls=3 + search_exec_max=2. 마지막 LLM 호출은 tool_choice="none" + system instruction 으로 final 강제. - G0-3 trace exposure: default response 의 debug_trace=null. debug=true 시만 채움. server log 에는 항상 round 기록. backends.py (193 → 261줄): - QwenMacBookBackend.generate_with_tools(messages, tools, tool_choice) 신규 method. 기존 generate() 무수정. BackendUnavailable 처리 동일. react_loop.py 신규 (275줄): - agentic_ask_loop(session, query, *, backend, max_tool_rounds, debug) - tool round 안에서 run_search 호출, results dedup by id, final round 강제, partial=True 조건 (final content 빈 경우) search.py (+82줄): - POST /api/search/ask/react + AskReactRequest/Response schema - BackendUnavailable → JSONResponse(503, error_reason=macbook_unavailable) config.yaml + config.py: - search.ask.react: { enabled, max_tool_rounds=2, search_tool_limit=5, search_tool_mode=hybrid } tests (566줄, 18 신규 + 23 회귀 모두 PASS): - test_react_loop.py 13건: G0-1 fixture shape / G0-2 counter cap / G0-3 trace exposure / BackendUnavailable propagation / sources dedup - test_search_ask_react_endpoint.py 5건: 503 + run_search 호출 0 / 정상 200 / debug=true trace 노출 / max rounds partial - 회귀 (test_ask_eval_auth 9 + test_search_ask_macbook_503 5 + test_backend_dispatcher 9) 모두 PASS Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 13:43:47 +00:00
Hyungi Ahn	431d4fe010	feat(briefing): add morning briefing schema + services + api (historical off) 야간 수집 뉴스 (KST 00:00~05:00) topic×country 비교 분석 1페이지 카드. Phase 4 Global Digest 와 코드/로직/테이블 분리, 알고리즘만 services/clustering_common 공유. Backend 신규: - migrations/255_morning_briefings.sql: morning_briefings + briefing_topics (briefing_date UNIQUE, UNIQUE(briefing_id,topic_rank), FK CASCADE, historical_* 3컬럼 nullable, cluster_members JSONB, country_perspectives JSONB, status 4-state success\|partial\|failed\|empty) - app/models/briefing.py: SQLAlchemy ORM - app/services/briefing/loader.py: KST 5h 윈도우 + news_sources prefix fallback (Phase 4 패턴 미러) + historical candidate pool 로더 - app/services/briefing/clustering.py: cluster_global topic-first (LAMBDA=ln(2)/2h, MIN_COUNTRIES_PER_TOPIC=2, MAX_TOPICS=7) - app/services/briefing/comparator.py: call_primary 26B + JSON envelope sanitize (cap perspectives 10 / divergences 3 / convergences 2 / quotes 5) + fallback row 고정 형태 + retrieve_historical cosine top-K - app/services/briefing/pipeline.py: load→cluster→select(K=7,λ=0.6) →historical→compare→status 4-state→delete+insert transaction - app/workers/briefing_worker.py: APScheduler/수동 호출 공용 진입점, 600s hard cap - app/prompts/briefing_comparative.txt: 한국어 비교 분석 JSON 프롬프트, {articles_block} + {historical_block} 2섹션, 인용 금지 라벨 - app/api/briefing.py: GET /latest, GET ?date=, POST /regenerate?date= (admin, sync delete+insert tx, regenerated:true) Backend 수정: - app/main.py: briefing_router 등록 (/api/briefing prefix). scheduler 등록은 PR-3 에서. - app/services/digest/selection.py: select_for_llm 매개변수화 (K, λ caller 주입). Phase 4 동작은 default 값으로 보존. Historical 정책: - BRIEFING_HISTORICAL_ENABLED env flag, default off. - flag off → historical_* 컬럼 모두 NULL, prompt {historical_block} 빈 라벨, retrieval 호출 안 함. - flag on (PR-1b 에서 enable) → cluster centroid 와 과거 30일 doc embedding cosine top-K 5 (sim≥0.70), prompt 에 주입. Country canonical (실측 확인 후): - documents.country 컬럼 부재 확정 - document_chunks.country 매칭률 0% (chunks 자체가 뉴스에 안 만들어짐) - 유일 country 신호 = news_sources prefix 매핑 (Phase 4 와 동일) Tests: - tests/test_briefing_historical.py: 3 경로 회귀 (flag off/on with fixture/on zero match) + sanitize cap + fallback row 형태. Verification: PR-1.8 에서 GPU 컨테이너 pytest + 수동 regenerate.	2026-05-12 12:58:50 +09:00
Hyungi Ahn	63990ac632	feat(memos): add AI event-kind triage fields PR-2B (Memo Inbox Triage) backend 1/2. plan: beszel-tingly-sloth.md 라운드 13. 사용자 비전 = 메모는 inbox, AI 는 triage assistant. AI worker 는 events row 직접 생성 X. Migrations 250–253 (실측 N=250): - 250 CREATE TYPE event_kind_hint AS ENUM (note\|task\|calendar_event\|activity_log\|reference) - 251 ALTER TABLE documents ADD ai_event_kind event_kind_hint - 252 ALTER TABLE documents ADD ai_event_confidence NUMERIC(3,2) + CHECK 0–1 - 253 CREATE INDEX idx_documents_ai_event_kind partial WHERE ai_event_kind IS NOT NULL ORM: - Document.ai_event_kind / ai_event_confidence 컬럼 추가 (Enum SQLAlchemy 동기) - source_channel enum 에 'voice' 추가 (PR-2C 와 호환) Worker: - classify_worker Phase 3 (Gemma 4B triage) 확장 · TriageOutput 에 event_kind_hint + event_kind_confidence 필드 추가 · 4B 응답에 hint 가 있을 때만 Document 에 저장 (enum 외 값은 무시) - prompt p3a_short_summary.txt 확장 — note/task/calendar_event/activity_log/reference 분류 기준 + confidence + default='note' 명시 원칙: AI worker 는 hint 만 제공. events 생성은 다음 commit 의 promote endpoint 에서만.	2026-05-11 12:04:21 +09:00
Hyungi Ahn	6b52d57bac	feat(study): Phase 4-A explanation_md 길이 cap + prompt 강화 운영 데이터에서 ready 박힌 풀이가 793/838/866자 — 권장 200~400 대비 큰 편. 1차 운영 후 결과 화면 가독성 + 토큰 사용량 통제 위해 prompt 강화 + 저장 전 cap. Prompt (study_explanation_envelope.txt): - explanation_md 권장 300~600자, 최대 900자 명시 - 핵심 개념 + 정답 근거 + 헷갈리는 1~2개 오답만 — 모든 오답 풀이 X - explanation_md 안 줄바꿈 최소화 (parse_json fix 와 결합 — invalid escape 줄임) - LaTeX 수식 자제 — \\circ/\\text/\\, 매크로 가능하면 평문 ('0°C', 'C') - 출력은 raw JSON 한 객체만 — 코드 펜스/thinking/메타 X 강조 Worker (study_explanation_worker.py): - _cap_explanation_md(text, max_chars=1200) 헬퍼 신규 · 1200자 이하 passthrough · 초과 시 마지막 200자 안에서 \\n\\n / \\n / '. ' / '다.' / '요.' 경계 탐색 · 경계에서 자르기 + '…' (단어 중간 자르기 회피) · 경계 못 찾으면 단순 자르기 + '…' - save 전 cap 적용. ai_explanation_status='ready' 유지 (cap 됐다고 failed X) - payload 에 운영 분석 metadata: explanation_len_original / _saved / capped 플래그 검증: - tests/test_explanation_cap.py (6 케이스) · short passthrough / exact at limit / paragraph boundary / sentence boundary · no boundary fallback / empty input - scripts/phase4_health.sql 섹션 8/9 추가 · ai_explanation 길이 p50/p95/max (study_questions.ready) · cap 작동 빈도 (job.payload 의 explanation_capped/_original/_saved) cap 1200 = 800 (4-B summary_md) 보다 여유 — 기사시험 풀이는 공식+오답+개념 묶이면 800 빡빡함. 운영 후 800~1000 으로 조정 검토. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 08:33:18 +09:00
Hyungi Ahn	6785d53d3d	feat(study): Phase 4-B v1 세션 단위 종합 분석 (자유 마크다운) Phase 4-A 가 wrong/unsure 한 문제씩 풀이 캐시. 4-B 는 세션 전체 wrong/unsure 5~30건을 묶어 200~400자 자연어 요약 1건 생성. 결과 화면 헤더 카드. 큐 인프라는 4-A study_question_jobs 와 분리 — FK 단일 의미 + 운영 SQL 명확성 + 4-A/4-B 가드/payload/재시도 정책 차이. 신규 study_quiz_session_jobs (큐) + study_quiz_session_analysis (결과 캐시 PK=session_id, UPSERT) + 전용 consumer. Backend: - migrations/233 — study_quiz_session_jobs (FK study_quiz_sessions NOT NULL, status pending/processing/completed/failed/skipped, max_attempts=2) - migrations/234 — partial unique idx (session_id) WHERE pending/processing - migrations/235 — study_quiz_session_analysis (session_id PK, summary_md, confidence, model_name, generated_at, is_stale) - models/study_quiz_session_job — ORM + enqueue_session_analysis_job() (멱등) - models/study_quiz_session_analysis — ORM (PK = session_id) - services/study/session_summary_guard — GUARD_PATTERN (정규식) + normalize_confidence() 단일 source, worker + tests 가 import 공유 - services/study/session_summary_rag — gather_session_summary_context() documents 만 (PR-3 _gather_document_evidence 재사용). evidence 없어도 호출 허용 (4-A 와 다른 정책 — 세션 기록 자체가 evidence) - services/study/session_analysis_enqueue — auto (finalize/fallback) + request_session_analysis_regenerate (manual). manual 은 wrong/unsure < 5 즉시 차단, active job 차단, 기존 analysis 있으면 is_stale=true 박기 - prompts/study_session_summary_envelope.txt — envelope JSON {summary_md, confidence}. 정량 정수만 인용 가능, 비율/추세/범위/날짜 금지 - workers/study_session_analysis_worker — terminal status 분기: · wrong/unsure < 5 → status=skipped, error_code=insufficient_attempts · question_text/outcome 부족 → skipped, evidence_missing · GUARD_PATTERN match → failed, guard_fail · 800자 hard cap + confidence normalize · timeout/parse/unknown → 재시도 후보 · UPSERT study_quiz_session_analysis ON CONFLICT DO UPDATE (PK session_id) - workers/study_session_queue_consumer — 4-A consumer 패턴 복제. BATCH_SIZE=1 + STALE_MINUTES=10. MLX gate 4-A 와 공유 (Semaphore(1)) - main.py — APScheduler add_job(consume_study_session_queue, ..., 1분 주기) - session_finalize — 끝에서 enqueue_session_analysis_auto (best-effort) - api/study_topics: · QuizSessionAnalysisOut + ai_session_analysis 응답 필드 (analysis row + 최신 job status/error_code) · GET fallback enqueue (기존 analysis 또는 active job 없으면만, non-blocking) · POST /quiz-sessions/{sid}/regenerate-summary — manual 트리거 Frontend (quiz-sessions/[sid]/+page.svelte): - 결과 헤더에 세션 요약 카드 (AI 풀이 indicator 직후, 바로 할 일 직전) - summary_md 박혔으면 markdown 렌더, 없으면 job_status / error_code 분기: · pending/processing → "AI 가 세션 분석 중" · insufficient_attempts → "오답·모르겠음 5건 미만" · evidence_missing → "자료 부족" · guard_fail → "환각 검증 차단" + 재생성 링크 - confidence='low' 배지 + is_stale "재생성 중" 배지 - 재생성 버튼 + regenerateSummary() — reason 별 toast 분기 ship gate: - tests/test_session_summary_guard_pattern.py — 허용 5 + 차단 7 케이스 + normalize_confidence 표준/비표준 검증. python3 직접 실행 패스. Plan: ~/.claude/plans/nifty-sparking-spindle.md (Phase 4-B v1) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 07:20:29 +09:00
Hyungi Ahn	43097e6fd9	fix(study): Phase 4-A envelope 프롬프트 — answer_choice 사용자 정답 강제 검증 결과 모델이 envelope 안에서 자료 근거로 정답 번호를 재판단해서 거의 매번 guard_fail (answer_choice != correct_choice). 환각 가드는 정확히 작동했지만 caching 효율 0%. PR-3 의 free-form 풀이는 "사용자 정답 우선, 충돌 명시" 라 정상 ready 박혔지만 envelope.txt 가 "자료 근거 우선" 으로 충돌. 환각 가드의 본질 — 모델이 envelope 형식을 어겨 임의로 다른 번호를 박는 케이스 차단 — 을 유지하되, answer_choice 값은 사용자 정답 (correct_choice) 을 그대로 박도록 명시. 자료 근거와 사용자 정답이 다를 경우 explanation_md 안에 짧게 명시만 하고 answer_choice 는 보존. 정답 자체를 바꾸는 게 환각 가드의 차단 대상이라고 강조. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 11:47:44 +09:00
Hyungi Ahn	e8da53490c	feat(study): Phase 4-A wrong/unsure AI 풀이 prefetch batch PR-3 의 결과 화면 [AI 해설 보기] 실시간 호출이 클릭 시 8~30초 대기. 풀이 직후 백그라운드 batch 로 미리 생성해 캐시 hit. 환각 가드는 PR-3 보다 강화 — envelope JSON {answer_choice, explanation_md, confidence} + answer_choice == correct_choice 검증 + evidence 의무. processing_queue 가 documents.id FK 라 study_questions 에 직접 재사용 불가 → 별도 study_question_jobs 테이블 + 별도 consumer. Backend: - migrations/231 — study_question_jobs CREATE TABLE (13컬럼, kind 권장값 'explanation' / 'session_summary' 예약, status pending/processing/completed/ failed/skipped, max_attempts=2) - migrations/232 — partial unique idx (qid, kind) WHERE status IN (pending, processing) — active 행 중복 차단, terminal 이력 누적 허용 - models/study_question_job — ORM + enqueue_study_question_job() 헬퍼 (on_conflict_do_nothing 멱등) - prompts/study_explanation_envelope.txt — envelope 형식 프롬프트 (answer_choice 1~4 강제, confidence high/medium/low) - workers/study_explanation_worker — terminal status 분기: · evidence 둘 다 빈 리스트 → job/question 모두 skipped (LLM 호출 X) · answer_choice != correct_choice → guard_fail / failed (재시도 X) · timeout/parse → 재시도 후보 (max_attempts=2) · catch-all except → unknown 명시 + retryable 분기 · question.ai_explanation_status='ready' 이미 박혀있으면 즉시 completed · confidence 는 job.payload 에 보존 (운영 분석) - workers/study_queue_consumer — APScheduler 1분 주기, BATCH_SIZE=1, MLX gate Semaphore(1) 공유. STALE_MINUTES=10 자체 복구 - main.py — scheduler.add_job(consume_study_queue, ..., id='study_queue_consumer') - services/study/explanation_enqueue — finalize + GET fallback 공유 헬퍼: filter_needs_explanation (study_questions status + 최신 job error_code 필터, guard_fail/evidence_missing 인 마지막 job 은 자동 재enqueue 제외) + enqueue_explanation_for_qids (max_count cap) - session_finalize — 끝에서 wrong/unsure qid prefetch enqueue (best-effort, 실패해도 finalize 자체 안 깨짐) - api/study_topics get_quiz_session — done 세션에서 backfill enqueue (max=30, non-blocking, debug 로그) 대상 조건: ai_explanation_status IN ('none', 'failed') OR ai_explanation IS NULL. stale / skipped / pending / ready 는 자동 enqueue 대상 X. stale 재생성은 PR-3 명시 [다시 생성] 또는 후속 Phase 에서. Plan: ~/.claude/plans/nifty-sparking-spindle.md (Phase 4-A) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 11:42:08 +09:00
Hyungi Ahn	d968b2d901	feat(study): 문제풀이 모드 개편 + 결과 분류 + 분야 설명 (PR-9) - 라벨 "복습 시작" → "문제풀이" - attempts.outcome 컬럼 + selected_choice nullable (correct/wrong/unsure) - 풀이 중 정답·해설·AI·비슷한 문제 모두 비노출, 답 클릭 시 자동 진행 - "모르겠음" 5번째 옵션 추가 - 결과 화면 = 정답/틀린/모르겠음 3 카테고리 탭, 카드 클릭 expand - 틀린 → PR-3 AI 해설 (RAG) - 모르겠음 → 분야(subject+scope) 설명 AI 즉석 생성 + 캐시 (PR-9 신규) - 분야 설명 RAG: 매핑 documents 청크 + 같은 분야 다른 문제·해설 → bge-reranker - 마이그레이션 200~205 (single-statement, asyncpg 호환) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 15:58:35 +09:00
Hyungi Ahn	e1a2cdc677	feat(study): AI 풀이 생성 — 수동 트리거 + RAG (PR-3) 복습 답 제출 후 또는 편집 화면에서 사용자가 명시적으로 누를 때만 AI 가 4지선다 풀이 생성. 자동 일괄 생성 금지 (하루 100문제 입력 시 MLX 부하· 잘못 입력 문제 해설 위험). 데이터 모델 (migrations 191~192): - study_questions 4 컬럼 추가: ai_explanation TEXT, ai_explanation_status VARCHAR(20) DEFAULT 'none' (none/pending/ready/failed/stale), ai_explanation_generated_at, ai_explanation_model - partial idx (study_topic_id, ai_explanation_status) WHERE status != 'none' PATCH stale 자동 전이: question_text/choice_*/correct_choice 변경 시 status='ready' 만 'stale' 로. 본문은 보존, UI 배지 + "다시 생성" 동선. 신규 엔드포인트: POST /api/study-questions/{id}/ai-explanation - regenerate=false + ready/stale → 캐시 즉시 (MLX 호출 없음, is_stale 플래그) - pending → 409 (race-safe 조건부 UPDATE 로 동시 호출 차단) - 그 외 → 새 생성 RAG 입력 풀: - 1순위: study_topic 매핑 documents 청크 + ai_summary, bge-reranker top-5 - 2순위: 같은 토픽 다른 questions (자기 자신 제외, ai_explanation 은 ready 상태만 포함 — 재귀적 hallucination 방지), reranker top-3 - 제외: 필기 OCR / 외부 웹 / Premium 모델 모델: Mac mini MLX gemma-4-26b primary 단독. get_mlx_gate() Semaphore(1) 경유, 30s timeout. 실패 시 status='failed' + 직전 본문 보존. 프롬프트 (app/prompts/study_question_explanation.txt): 자료 우선순위·인용 형식·할루시네이션 방지 절대 규칙 (법령명·조항·수치·표준 번호 단정 금지, "자료에서 확인되지 않음" 명시). 프론트: - 복습 화면 답 제출 후 인라인 expand. status별 버튼 분기 (ready 캐시 / stale "이전 풀이"+"다시 생성" / failed "다시 시도") - 편집 화면 별도 카드. 상태 배지 + "이전 풀이 보기" / "다시 생성" 분리 - 참고 근거 토글 (source_type 별 아이콘 📄/❓ + 제목 + snippet) 후속 PR 보류: 오답노트/통계, AI 일괄 백그라운드 생성, 필기 OCR RAG, Premium/Claude 재생성, /api/search/ask retrieval scope 통합. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 08:41:46 +09:00
Hyungi Ahn	6fdc48e5b6	feat(ai): B-1 summary tier 분할 — triage(4B) + deep_summary(26B) PR-A policy 레이어를 재사용하여 classify_worker 에 tier triage 경로를 추가. Legacy ai_summary / ai_domain / ai_suggestion 은 유지 (회귀 0), tldr/bullets/ detail/inconsistencies 는 별도 필드로 분리. Migrations (156~160): - 156 documents: ai_tldr, ai_bullets, ai_detail_summary, ai_inconsistencies, ai_analysis_tier 5컬럼 - 157 process_stage 에 'deep_summary' ADD VALUE 단독 (Postgres 동일 트랜잭션 제약 회피) - 158 processing_queue.payload JSONB (envelope 전달) - 159 analyze_events 에 tier + suppressed_reason - 160 suppressed_reason partial index Models/ORM: - Document: 5컬럼 Mapped 추가 - ProcessingQueue: deep_summary enum 확장 + payload 필드, enqueue_stage 에 payload 옵션 - AnalyzeEvent: PR-A shadow 6컬럼 + PR-B tier/suppressed_reason Workers: - classify_worker: 기존 legacy 경로 뒤에 _run_tier_triage 추가. - _match_subject_domain(doc, text): source_channel + 본문 keywords + ai_domain prefix 로 PR-A policy 의 subject_domain 이름 결정 (category 매칭 금지). - R1 TriageOutput pydantic + JSON 깨짐 fallback (triage_json_invalid). - R2 _check_backlog_guard(): 30분 window ratio > threshold OR pending 초과면 soft escalate suppress. hard escalate 는 통과. - R3 _slice_text_ranges(): 260k 초과 시 head 120k + mid 20k + tail 120k 3조각. - escalate 시 EscalationEnvelope 구성 + {envelope, subject_domain} payload 로 deep_summary enqueue. - deep_summary_worker (신규): queue payload 에서 envelope + subject_domain 읽기 → render_26b("p3c_deep_summary", subject_domain) + MLX 호출 (llm_gate Semaphore(1) 경유) → ai_detail_summary + ai_inconsistencies 저장 + ai_analysis_tier='deep'. _filter_inconsistencies 로 허용 kind (version_drift / procedure_conflict / source_conflict / missing_basis) 만 통과 — 구매/계약 kind drop. - queue_consumer: workers dict 에 deep_summary 추가 + BATCH_SIZE=1. next_stages 는 건드리지 않음 — classify → embed/chunk 는 그대로, deep_summary 는 독립 체인. Telemetry: - record_analyze_event: subject_domain / risk_flags / escalation_reasons / confidence / policy_version / shadow_would_route_to / tier / escalated_to_26b / suppressed_reason 파라미터 확장. classify/deep worker 가 mode="summary_triage" 또는 "summary_deep" 로 기록. API: - DocumentResponse 에 ai_tldr / ai_bullets / ai_detail_summary / ai_inconsistencies / ai_analysis_tier 5필드 노출. Prompts: - classify.txt 에 DEPRECATED 주석만 추가 (파일 유지 — rollback 경로 보존). - PR-A 의 app/prompts/policy/p3a_short_summary.txt (4B) 와 p3c_deep_summary.txt (26B) 를 그대로 사용. 내 소유의 summary_triage.txt / summary_deep.txt 는 중복 이라 별도 커밋에서 제거하지 않고 바로 생성 전 삭제. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 10:22:40 +09:00
Hyungi Ahn	23b8a555c2	feat(prompts): policy templates (p1~p6, 9 files) app/prompts/policy/*.txt — 4B/26B 정책 템플릿. {forbidden_block} / {subject_description} / {confidence_threshold} / {context_cap} placeholder 포함. 금지 규칙 하드코딩 0 건. 4B (7): p1_triage, p2_nas_rule, p3a_short_summary, p3b_entities, p4a_advice_trigger, p4b_retrieval, p6_night_sweep 26B (2): p3c_deep_summary, p4b_synthesis 각 템플릿 공통 구조: - [System] 역할 선언 + subject_description - forbidden_block (yaml 에서 도메인별 렌더) - 작업 규칙 - 출력 형식 (JSON only, escalate_to_26b 포함) - 에스컬레이션 기준 - [User] 실행시 치환 placeholder (이중 중괄호) render 호출은 PR-A 에서 아무도 하지 않음 — 자산 배치만. PR-B escalation_service 가 실제 worker 에서 render. plan: ~/.claude/plans/wise-gliding-hippo.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 09:34:48 +09:00
Hyungi Ahn	8fdea88676	feat(documents): §1 category enum + ai_suggestion 승인 파이프 plan: ~/.claude/plans/luminous-sprouting-hamster.md §1 - migrations/143_category.sql: doc_category enum (6 활성 + 3 유보) + documents.category + documents.ai_suggestion JSONB + 2 idx. - app/models/document.py: category (Enum, create_type=False), ai_suggestion (JSONB). - app/prompts/classify.txt: document_type enum 에 7 실무 doctype 추가 (발주서/세금계산서/명세표/도면/증명서/계획서/시방서) + facet_doctype 필드 directive. - config.yaml: document_types 에 7 항목 추가 (worker 검증 통과). - app/workers/classify_worker.py: FACET_DOCTYPES / LIBRARY_SUGGESTION_DOCTYPES 상수, facet_doctype 파싱(기존값 미덮어씀), 발주서/세금계산서/명세표 감지 시 ai_suggestion={proposed_category=library, proposed_path=@library/ 거래/{YYYY}/{doctype}, source_updated_at=doc.updated_at.isoformat(), ...}. category / user_tags 자동 전이 금지 (suggestion-only). - app/api/documents.py: · DocumentResponse 에 category / ai_suggestion 노출 · GET /documents ?category=<cat> / ?has_suggestion / ?proposed_category (category 지정 시 기본 news/memo 제외 해제 — §2 승인 UI 계약) · GET /documents/library 를 Document.category=='library' 기반으로 재구현 (path subquery 는 user_tags 유지 — 분류 내부 서가 경로) · POST /documents/{id}/accept-suggestion — FOR UPDATE + idempotent no-op + dual 409 stale (payload source_updated_at / documents.updated_at) + user_tags idempotent append · DELETE /documents/{id}/suggestion — idempotent, stale 검사 없음 - scripts/backfill_category.py: dry-run / apply. 매핑(news/memo/@library/else) + 3-way 상대 검증 (all_rows==categorized, uncategorized==0, cat_library==has_library_tag — 자동 전이 금지 정책 검증). 남은 DoD (원격 배포 후): docker compose up → migration 143 적용 → backfill apply → smoke (drive_sync 발주서 업로드 suggestion 생성 / category 유지, accept-suggestion idempotency + 409 stale 두 벡터, /documents?category=library == /documents/library 건수 일치). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:32:01 +09:00
Hyungi Ahn	eb9dc94604	feat(search): E.3 — ask synthesis prompt v2-600char bump 한도 400 → 600 자. baseline 관찰(partial avg 168자 / full 10%)에서 길이 제약이 실제 출력 제약이 되는 현상 확인, 절차·비교 카테고리 답변 깊이 확보 목적. 변경 4 라인: - search_synthesis.txt:17 answer 400→600 characters max - prompt_versions.py:20 v1-400char → v2-600char (telemetry) - synthesis_service.py:42 PROMPT_VERSION v1→v2 (cache key 의미론 동기화) - synthesis_service.py:46 MAX_ANSWER_CHARS 400→600 (hard clip 동기화) v1 post-tier0 baseline: 225 rows, partial 51% / insufficient 49% / full 0% (Tier 0 fix 로 full+refused=True 모순 0 건). E.6 는 이 clean baseline 을 compare-against 로 사용. 향후 티켓: PROMPT_VERSION 과 ASK_PROMPT_VERSION 단일 소스 통합. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 12:02:51 +09:00
Hyungi Ahn	5bfbb79641	feat(verifier): Phase 3.5 B2 — numeric_conflict promote (env flag) + Tier 4 VERIFIER_NUMERIC_PROMOTE 환경변수로 numeric_conflict severity 승격 실험. verifier_service.py: - _NUMERIC_PROMOTE = os.getenv('VERIFIER_NUMERIC_PROMOTE', '0') == '1' (import time 평가 — env 변경 시 process restart 필수) - _SEVERITY_MAP['numeric_conflict']: env=1 → critical=strong / minor=medium, env=0 (기본) → 둘 다 medium (기존 동작 유지) - direct_negation 은 env 무관 항상 strong (안전장치) verifier.txt: - numeric_conflict 정의에 critical/minor 분리 명시 (core quantity vs peripheral) - "Range values satisfy any answer within range" rule 추가 - severity mapping 갱신: numeric_conflict 분기 명시 search.py re-gate (Tier 1~7 재번호, B2 신규 Tier 4): - v_strong_numeric = sum(1 for f in v_strong if f.startswith('verifier_numeric_conflict')) - Tier 4 (신규): g_strong + v_strong_numeric >= 1 + low_conf → refuse re_gate value: 'refuse(grounding+verifier_numeric)' - 원칙 유지: verifier strong 단독 refuse 금지 — g_strong 교차 필수 - 호환성: 기존 re_gate string literals 그대로 유지, 신규 1개만 추가 credentials.env.example: VERIFIER_NUMERIC_PROMOTE=0 (off, B3 통과 후 production 전환) tests/test_verifier_numeric_promote.py: 4 케이스 (env off / on / explicit 0 / direct_negation invariant). monkeypatch.setenv + importlib.reload 패턴. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 08:11:06 +09:00
Hyungi Ahn	d9caf075e5	feat(api): Phase D.5 — POST /documents/{id}/analyze 문서 분석 엔드포인트 전문 15,000자 → Gemma 4 구조화 분석 (근거/해설/사례/요약 4층). - MLX gate + 20초 timeout (gate 안쪽) - 인메모리 캐시 TTL 30분, 키 = doc_id + updated_at(fallback: created_at) - 층별 최소 50자 + 억지 채움 문구 제거 - summary 필수 (없으면 422) - 에러: 404 text 없음 / 504 timeout / 502 llm / 422 parse Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 12:32:44 +09:00
Hyungi Ahn	5c58778a41	feat(library): doc_purpose 필드 + 자료실 업로드 기능 지식/업무 문서 1차 구분을 위한 doc_purpose(business\|knowledge) 추가. - 마이그레이션: document_purpose enum + 컬럼 - AI 분류: docPurpose 자동 추론 (빈 값만 채움) - 업로드 API: doc_purpose + library_path Form 파라미터 - 자료실 업로드: business 기본값 + 선택 경로 자동 태깅 - FileInfoView: 용도 select (수동 변경, 실패 롤백) - DocumentCard: 업무/참조 배지 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 15:26:59 +09:00
Hyungi Ahn	e405ed3414	fix(ask): evidence sparse 문제 해결 — 프롬프트 + supplement + source 분리 근본 원인: evidence 프롬프트가 "<0.5 = 탈락" 명시 → LLM 하향 편향 → candidates 5개 중 4개 탈락 → synthesis 자체 거부. Change 2: evidence_extract.txt - relevance 스케일 재정의: "탈락" 라벨 제거 - 0.3~0.5 약한 부분 연관 / 0.5~0.7 명확한 부분 연관 구간 세분화 - "directly answer" → "no connection at all" 완화 Change 3: search_synthesis.txt - refused 조건: "직접 답 아니면 거부" → "완전 무관일 때만 거부" - "covered only" 제한: partial evidence로 missing part 추론 금지 - supplement evidence weight 지시 추가 (보조 취급) Change 1: evidence_service.py - sparse evidence supplement: kept 1~2 + candidates 3+ → rule-only 보충 - substring + critical token 필터 (recall+precision) - critical token: 길이 3자+ OR 의미 기반 suffix (조건/기준/처벌 등) - EvidenceItem.source 필드 ("llm"\|"supplement"\|"rule_fallback") Change 4: search.py - defense_log["evidence"] 추가 (skip_reason, kept_count) synthesis_service.py - supplement evidence [n] (보충) 마킹 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 16:11:57 +09:00
Hyungi Ahn	b2306c3afd	feat(ask): Phase 3.5b guardrails — verifier + telemetry + grounding 강화 Phase 3.5a(classifier+refusal gate+grounding) 위에 4개 Item 추가: Item 0: ask_events telemetry 배선 - AskEvent ORM 모델 + record_ask_event() — ask_events INSERT 완성 - defense_layers에 input_snapshot(query, chunks, answer) 저장 - refused/normal 두 경로 모두 telemetry 호출 Item 3: evidence 간 numeric conflict detection - 동일 단위 다른 숫자 → weak flag - "이상/이하/초과/미만" threshold 표현 → skip (FP 방지) Item 4: fabricated_number normalization 개선 - 단위 접미사 건/원 추가, 범위 표현(10~20%) 양쪽 추출 - bare number 2자리 이상만 (1자리 FP 제거) Item 1: exaone semantic verifier (판단권 잠금 배선) - verifier_service.py — 3s timeout, circuit breaker, severity 3단계 - direct_negation만 strong, numeric/intent→medium, 나머지→weak - verifier strong 단독 refuse 금지 — grounding과 교차 필수 - 6-tier re-gate (4라운드 리뷰 확정) - grounding strong 2+ OR max_score<0.2 → verifier skip Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 09:49:56 +09:00
Hyungi Ahn	06443947bf	feat(ask): Phase 3.5a guardrails (classifier + refusal gate + grounding + partial) 신규 파일: - classifier_service.py: exaone binary classifier (sufficient/insufficient) parallel with evidence, circuit breaker, timeout 5s - refusal_gate.py: multi-signal fusion (score + classifier) AND 조건, conservative fallback 3-tier (classifier 부재 시) - grounding_check.py: strong/weak flag 분리 strong: fabricated_number + intent_misalignment(important keywords) weak: uncited_claim + low_overlap + intent_misalignment(generic) re-gate: 2+ strong → refuse, 1 strong → partial - sentence_splitter.py: regex 기반 (Phase 3.5b KSS 업그레이드) - classifier.txt: exaone Y+ prompt (calibration examples 포함) - search_synthesis_partial.txt: partial answer 전용 프롬프트 - 102_ask_events.sql: /ask 관측 테이블 (completeness 3-분리 지표) - queries.yaml: Phase 3.5 smoke test 평가셋 10개 수정 파일: - search.py /ask: classifier parallel + refusal gate + grounding re-gate + defense_layers 로깅 + AskResponse completeness/aspects/confirmed_items - config.yaml: classifier model 섹션 (exaone3.5:7.8b GPU Ollama) - config.py: classifier optional 파싱 - AskAnswer.svelte: 4분기 렌더 (full/partial/insufficient/loading) - ask.ts: Completeness + ConfirmedItem 타입 P1 실측: exaone ternary 불안정 → binary gate 축소. partial은 grounding이 담당. 토론 9라운드 확정. plan: quiet-meandering-nova.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 08:49:11 +09:00
Hyungi Ahn	75a1919342	feat(digest): Phase 4 Global News Digest (cluster-level batch summarization) 7일 rolling window 뉴스를 country × topic 2-level로 묶어 매일 04:00 KST 배치 생성. search 파이프라인 미사용. documents → clustering → cluster-level LLM summarization → digest. 핵심 결정: - adaptive threshold (0.75/0.78/0.80) + EMA centroid (α=0.7) + time-decay (λ=ln(2)/3) - min_articles=3, max_topics=10/country, top-5 MMR diversity, ai_summary[:300] truncate - cluster-level LLM only, drop금지 fallback (topic_label="주요 뉴스 묶음" + top member ai_summary[:200]) - importance_score country별 0~1 normalize + raw_weight_sum 별도 보존, max(score, 0.01) floor - per-call timeout 25s + pipeline hard cap 600s - DELETE+INSERT idempotent (UNIQUE digest_date), AIClient._call_chat 직접 호출 (client.py 수정 없음) 신규: - migrations/101_global_digests.sql (2테이블 정규화) - app/models/digest.py (GlobalDigest + DigestTopic ORM) - app/services/digest/{loader,clustering,selection,summarizer,pipeline}.py - app/workers/digest_worker.py (PIPELINE_HARD_CAP + CLI 진입점) - app/api/digest.py (/latest, ?date\|country, /regenerate, inline Pydantic) - app/prompts/digest_topic.txt (JSON-only + 절대 금지 블록) main.py 4줄: import 2 + scheduler add_job 1 + include_router 1. plan: ~/.claude/plans/quiet-herding-tome.md	2026-04-09 07:45:11 +09:00
Hyungi Ahn	64322e4f6f	feat(search): Phase 3 Ask pipeline (evidence + synthesis + /api/search/ask) - llm_gate.py: MLX single-inference 전역 semaphore (analyzer/evidence/synthesis 공유) - search_pipeline.py: run_search() 추출, /search 와 /ask 단일 진실 소스 - evidence_service.py: Rule + LLM span select (EV-A), doc-group ordering, span too-short 자동 확장(<80자→120자), fallback 은 query 중심 window 강제 - synthesis_service.py: grounded answer + citation 검증 + LRU 캐시(1h/300), refused 처리, span_text ONLY 룰 (full_snippet 프롬프트 금지) - /api/search/ask: 15s timeout, 9가지 failure mode + 한국어 no_results_reason - rerank_service: rerank_score raw 보존 (display drift 방지) - query_analyzer: _get_llm_semaphore 를 llm_gate.get_mlx_gate 로 위임 - prompts: evidence_extract.txt, search_synthesis.txt (JSON-only, example 포함) config.yaml / docker / ollama / infra_inventory 변경 없음. plan: ~/.claude/plans/quiet-meandering-nova.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 07:34:08 +09:00
Hyungi Ahn	c81b728ddf	refactor(search): Phase 2.1 QueryAnalyzer를 async-only 구조로 전환 ## 철학 수정 (실측 기반) gemma-4-26b-a4b-it-8bit MLX 실측: - full query_analyze.txt (prompt_tok=2406) → 10.5초 - max_tokens 축소 무효 (모델 자연 EOS 조기 종료) - 쿼리 길이 영향 거의 없음 (프롬프트 자체가 지배) → 800ms timeout 가정은 13배 초과. 동기 호출 완전히 불가능. 따라서 QueryAnalyzer는 "즉시 실행하는 기능" → "미리 준비해두는 기능"으로 포지셔닝 변경. retrieval 경로에서 analyzer 동기 호출 금지. ## 구조 ``` query → retrieval (항상 즉시) ↘ trigger_background_analysis (fire-and-forget) → analyze() [5초+] → cache 저장 다음 호출 (동일 쿼리) → get_cached() 히트 → Phase 2 파이프라인 활성화 ``` ## 변경 사항 ### app/prompts/query_analyze.txt - 5971 chars → 2403 chars (40%) - 예시 4개 → 1개, 규칙 설명 축약 - 목표 prompt_tok 2406 → ~600 (1/4) ### app/services/search/query_analyzer.py - LLM_TIMEOUT_MS 800 → 5000 (background이므로 여유 OK) - PROMPT_VERSION v1 → v2 (cache auto-invalidate) - get_cached / set_cached 유지 — retrieval 경로 O(1) 조회 - trigger_background_analysis(query) 신규 — 동기 함수, 즉시 반환, task 생성 - _PENDING set으로 task 참조 유지 (premature GC 방지) - _INFLIGHT set으로 동일 쿼리 중복 실행 방지 - prewarm_analyzer() 신규 — startup에서 15~20 쿼리 미리 분석 - DEFAULT_PREWARM_QUERIES: 평가셋 fixed 7 + 법령 3 + 뉴스 2 + 실무 3 ### app/api/search.py - 기존 sync analyzer 호출 완전 제거 - analyze=True → get_cached(q) 조회만 O(1) - hit: query_analysis 활용 (Phase 2.2/2.3 파이프라인 조건부 활성화) - miss: trigger_background_analysis(q) + 기존 경로 그대로 - timing["analyze_ms"] 제거 (경로에 LLM 호출 없음) - notes에 analyzer cache_hit/cache_miss 상태 기록 - debug.query_analysis는 cache hit 시에만 채워짐 ### app/main.py - lifespan startup에 prewarm_analyzer() background task 추가 - 논블로킹 — 앱 시작 막지 않음 - delay_between=0.5로 MLX 부하 완화 ## 기대 효과 - cold 요청 latency: 기존 Phase 1.3 그대로 (회귀 0) - warm 요청 + prewarmed: cache hit → query_analysis 활용 - 예상 cache hit rate: 초기 70~80% (prewarm) + 사용 누적 - Phase 2.2/2.3 multilingual/filter 기능은 cache hit 시에만 동작 ## 참조 - memory: feedback_analyzer_async_only.md (영구 룰 저장) - plan: ~/.claude/plans/zesty-painting-kahan.md ("철학 수정" 섹션) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 14:47:09 +09:00
Hyungi Ahn	d28ef2fca0	feat(search): Phase 2.1 QueryAnalyzer + LRU cache + confidence 3-tier QueryAnalyzer 스켈레톤 구현. 자연어 쿼리를 구조화된 분석 결과로 변환. Phase 2.1은 debug 노출 + tier 판정까지만 — retrieval 경로는 변경 X (회귀 0 목표). multilingual/filter 실제 분기는 2.2/2.3에서 이 분석 결과를 활용. app/prompts/query_analyze.txt - gemma-4 JSON-only 응답 규약 - intent/query_type/domain_hint/language_scope/normalized_queries/ hard_filters/soft_filters/expanded_terms/analyzer_confidence - 4가지 예시 (자연어 법령, 정확 조항, 뉴스 다국어, 의미 불명) - classify.txt 구조 참고 app/services/search/query_analyzer.py - LLM_TIMEOUT_MS=800 (MLX 멈춤 시 검색 전체 멈춤 방지, 절대 늘리지 말 것) - MAX_NORMALIZED_QUERIES=3 (multilingual explosion 방지) - in-memory FIFO LRU (maxsize=1000, TTL=86400) - cache key = sha256(query + PROMPT_VERSION + primary.model) → 모델/프롬프트 변경 시 자동 invalidate - 저신뢰(<0.5) / 실패 결과 캐시 금지 - weight 합=1.0 정규화 (fusion 왜곡 방지) - 실패 시 analyzer_confidence=float 0.0 (None 금지, TypeError 방지) app/api/search.py - ?analyze=true\|false 파라미터 (default False — 회귀 영향 0) - query_analyzer.analyze() 호출 + timing["analyze_ms"] 기록 - _analyzer_tier(conf) → "ignore" \| "original_fallback" \| "merge" \| "analyzed" (tier 게이트: 0.5 / 0.7 / 0.85) - debug.query_analysis 필드 채움 + notes에 tier/fallback_reason - logger 라인에 analyzer conf/tier 병기 app/services/search_telemetry.py - record_search_event(analyzer_confidence=None) 추가 - base_ctx에 analyzer_confidence 기록 (다층 confidence 시드) - result confidence와 분리된 축 — Phase 2.2+에서 failure 분류에 활용 검증: - python3 -m py_compile 통과 - 런타임 검증은 GPU 재배포 후 수행 (fixed 7 query + 평가셋) 참조: ~/.claude/plans/zesty-painting-kahan.md (Phase 2.1 섹션) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 14:21:37 +09:00
Hyungi Ahn	6d73e7ee12	feat: 분류 체계 전면 개편 — taxonomy + document_type + confidence - config.yaml: 6개 domain × 3단계 taxonomy + 13개 document_types 정의 - classify.txt: 영문 프롬프트, taxonomy 경로 기반 분류 + 분류 규칙 주입 - classify_worker: taxonomy 검증, confidence 기반 분류, document_type 저장 - migration 008: document_type, importance, ai_confidence 컬럼 - API: DocumentResponse에 document_type, importance, ai_confidence 추가 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 13:32:20 +09:00
Hyungi Ahn	131dbd7b7c	feat: scaffold v2 project structure with Docker, FastAPI, and config 동작하는 최소 코드 수준의 v2 스캐폴딩: - docker-compose.yml: postgres, fastapi, kordoc, frontend, caddy - app/: FastAPI 백엔드 (main, core, models, ai, prompts) - services/kordoc/: Node.js 문서 파싱 마이크로서비스 - gpu-server/: AI Gateway + GPU docker-compose - frontend/: SvelteKit 기본 구조 - migrations/: PostgreSQL 초기 스키마 (documents, tasks, processing_queue) - tests/: pytest conftest 기본 설정 - config.yaml, Caddyfile, credentials.env.example 갱신 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 10:20:15 +09:00

32 Commits