hyungi_document_server

Author	SHA1	Message	Date
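hyungi	b73a5cc601	feat(infra): two-node migration P1-4: rerank protocol switch (tei|llamacpp), explicit OCR/STT gates, 413 re-homing - Add the AIModelConfig.protocol discriminator (default tei = no regression); llamacpp = normalize the /v1/rerank request/response schema (pure functions in ai/rerank_protocol.py + 4 unit tests) - Explicit OCR_ENABLED/STT_ENABLED gates: handles the retirement of the GPU CUDA services (Surya/faster-whisper); not silent (warning log + terminal record in extract_meta) - DS Caddyfile request_body 100MB: re-home the 413 policy from the edge (home-caddy) to the inside (in preparation for the switch to the DSM reverse proxy; consistent with upload.max_bytes) - SSE X-Accel-Buffering is unchanged, since a prior check found it already implemented (eid_chat) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-02 13:11:06 +09:00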
hyungi	495e1c786f	refactor(search)!: clean up orphaned /ask services, tests, and prompts (search unification Phase 2) Remove assets left with 0 consumers after the /ask deletion (verified through 3 gates): the /ask section of search.py (Citation/ConfirmedItem/AskDebug/AskResponse models + helpers + _resolve_eval_identity) + 13 dead imports. 4 services (classifier/verifier/refusal_gate/grounding_check). AIClient.call_classifier/call_verifier (orphaned). 2 prompts (classifier/verifier.txt). 6 broken tests. evidence/synthesis are kept because they are shared (documents.py, etc.). Clean under a real pyflakes run (verification was skipped in the previous session because pyflakes was not installed → verified for real after installing it). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 14:39:53 +09:00
hyungi	2d86683636	refactor(ai): AIClient PR-B: seal ungated paths + shared httpx + public classifier/verifier Code-review AIClient cleanup, PR-B (#2 gate, #3 httpx, #4 public). #2 gate structure (call-site convention: the gate is caller-managed; AIClient must not self-gate): · classify_worker consumer call_triage: previously hit the Mac mini directly without a gate → acquire_mlx_gate(BACKGROUND). (The drain path call_deep_or_defer uses the MacBook deep slot, so the mini gate is irrelevant and not applied.) · verifier_service: previously called _request(verifier) without a gate → acquire_mlx_gate(FOREGROUND) + call_verifier. Sharing the same gate as classifier/evidence guards against thundering herd (the 22-timeout incident). ★Re-entrancy safety check: 0 self-gates inside AIClient methods (all at call sites) + evidence/classifier already hold independent gates + the api/search orchestrator holds no gate → a double-acquire deadlock is impossible. #4 public methods: add call_classifier/call_verifier → seal direct calls to the private _request from classifier/verifier_service (egress guard applied consistently). The gate stays caller-managed (same contract as call_primary). #3 shared httpx: replace per-call AsyncClient creation (30+ sites) with a single pool via _get_shared_http(), reusing keep-alive connections. The client is bound to the event loop, so it is recreated when the loop changes (tests); close() is a no-op. py_compile PASS. (Remaining #4: direct _request/_call_chat calls in query_analyzer/digest/backends are gated and therefore safe; follow-up sweep.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-26 20:07:30 +09:00
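hyungi	fb82a69c02	feat(ai): add mlx sampling fields (repetition_penalty/top_k) to AIModelConfig + inject them in _request Code-review AIClient cleanup, PR-A. Neither config nor code had a knob for suppressing code-switching (CJK/Latin leakage) and repetition loops in long-form Korean output from Qwen3 (only temperature/top_p existed). The defaults are None, so behavior is unchanged; activation happens only when values are set in config.yaml (separately). Applied only to the OpenAI-compatible (mlx) branch. PR-B (enforced gate structure, shared httpx, public call_classifier/verifier) follows. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-26 19:24:42 +09:00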
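hyungi	456dfaa9f2	fix(ai): remove the non-consensual automatic Claude egress fallback in _call_chat (R6) On a primary (Mac mini) Timeout/ConnectError, the client switched automatically to ai.fallback (Claude API) with no consent or billing control → closes the privacy defect in which personal documents/queries/memos were silently egressed to Anthropic. Failures now propagate: batch workers retry or raise StageDeferred (R3), and interactive paths surface a 5xx to the caller (documents.analyze already returns 502/504). Cloud is reachable only through the premium explicit trigger or an explicit call_fallback call (automatic entry forbidden). Note: for uncoordinated-mlx-semaphores, digest/briefing already use acquire_mlx_gate on the latest gitea/main (a false positive because the audit was 20 commits stale); no change needed. The rerank_skipped notes flag for rerank silent-identity requires a signature change, so it is a separate follow-up (Low). Verification: py_compile passes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 13:38:46 +09:00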
hyungi	151c1ee518	fix(search): improve search latency by truncating text-leg body scoring to 2000 chars + bge-m3 keep_alive After the corpus grew ~52x (765 docs assumed in code → 40k actual), the search_text ORDER BY recomputed similarity() + to_tsvector() re-tokenization over the full extracted_text (avg 3.7KB, max 1.6MB) for every candidate row → text_ms up to 4960ms on broad/English queries. Truncate extracted_text in scoring/match_reason with left(...,2000) (FTS matching in the candidate CTE still uses the full body → recall unchanged). Add keep_alive:-1 to embed() requests so ollama keeps bge-m3 resident on the GPU → removes the cold reload (~6s) on sparse search. Verification (snapshot freeze docs 43958/chunks 195671, 51 cases, eval-version both): - graded NDCG 0.575 → 0.575 (±0.000, byte-identical in every category) - Recall g>=2 0.691 / g>=3 0.739 unchanged; v0.1 NDCG/Recall/Top-3 unchanged - latency p50 760→586ms (-23%) / p95 5230→832ms (-84%) - EXPLAIN single query: V0 4917ms → left(2000) 285ms (17x) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-14 04:34:24 +00:00
hyungi	d667545185	fix(classify): address adversarial review: use_deep threading (B1), StageDeferred propagation (B2), legacy calls routed through deep (M3) - Thread use_deep into _run_tier_triage(use_deep): fixes an unwired NameError (broke all classify) - In process, put except StageDeferred: raise first in the triage try (restores drain deferral semantics) - Add a cfg parameter to legacy classify()/summarize(): when use_deep is set, route through the deep slot + convert is_deferrable_error → StageDeferred (defer at the first call, the lowest-cost point; 0 doc writes) - ai_model_version = the model of the actual processing path (drain is attributed to qwen-macbook) - Thread model_name into analyze_event + carry top_p in the deep triage cfg Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-12 07:12:40 +09:00
hyungi	235bbf9881	ops(pipeline): fair-share bundle: classify joins drain + deep Mac mini fallback + mlx gate concurrency 2 Two leftover asymmetries raised by the user's 'the same work, shared fairly' remark + the previously announced batching lever: - queue_drain --stage classify (use_deep: deep slot endpoint + triage sampling; on completion, enqueue_next_stage chains embed/chunk/markdown, preventing a break in the DAG) - deep_summary consumer = MacBook first; if unavailable, process immediately on the Mac mini primary (same model, not a downgrade). drain keeps the existing defer-and-exit behavior via defer_on_deep_unavailable=True - Generalize llm_gate capacity (config pipeline.mlx_gate_concurrency, default 1, production 2); the docstring now records that the premise of the permanent 'MLX_CONCURRENCY=1 fixed' rule (a single-inference server) no longer holds - Fix the analyze_events FK(users) INSERT failure in CLI context (explicit import of models.user) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-12 06:56:02 +09:00
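hyungi	e7c7a2091f	fix(workers): add router 502/504 to the deferral classification; upstream disconnects surface as 502 through the router Measured in llm_router.py: upstream connection failure/disconnect during generation = HTTPException 502 (4 places). This is how MacBook sleep disconnects actually surface, so classifying on 503 alone missed deferrals → expanded to 502/503/504. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 13:00:55 +09:00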
hyungi	88e5893041	feat(workers): wire in MacBook M5 Max offload: deep slot + deferral semantics + queue_drain CLI plan ds-macbook-offload-1 P2 (Soft Lock exception recorded in ds-macbook-offload-exec-20260611.md): - Optional config slot ai.models.deep (qwen-macbook via the router at :8890; the existing path is used when absent) - AIClient.call_deep + is_deferrable_error + call_deep_or_defer (0 automatic cloud/Mac mini fallbacks) - deep_summary_worker: goes through the MacBook when the deep slot is set (does not occupy the Mac mini mlx gate) + records the actual model - StageDeferred deferral semantics: 503/connect/read-timeout (sleep disconnect) = attempts not consumed + 30-minute backoff via payload.deferred_until; doc writes happen in a single commit after the call completes and parses (0 partial writes) - queue_consumer: deferred filter in claim + StageDeferred branch - workers.queue_drain: manual burst-drain CLI (summarize/deep_summary, single-item claim with SKIP LOCKED, per-item commit, run ends on deferral, guard requiring the deep slot) - 20 tests + a recorded fixture of a real Qwen response via the router (13.2s live) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 12:55:16 +09:00
hyungi	6a85087b83	feat(eid): Eid persona substrate W2~W4: DS compose, weakness diagnosis, code-layer egress removal Document Server side build (W2~W4) of the 'Eid' persona substrate that runs through every local LLM. Design = PKM eid-persona-substrate (converged over r1~r3) / impl = eid-persona-impl. W2: compose + surface wiring: - app/eid/compose.py: a single system string built as persona→rules→overlay→task + a static ROUTE_MAP (no runtime sniffing) + fail-loud when rules are missing, quiet when persona is missing, fail-loud on overflow. - The 3 free-prose surfaces (react_ask, study_subject_note, study_question_explanation): trim duplicated identity and generic policy + wire in compose (additive system parameter on AIClient). Domain calibration preserved. - STRICT JSON machinery (briefing_comparative, digest_topic) is frozen at persona-ZERO (invariant #3). - app/prompts/substrate/: persona (vendored external compiled artifact) + rules (generation-guard subset) + 5 overlays. W3: migration + worker + study_diagnosis: - migrations 301~305: eid_* append-only ledgers (weaknesses/review drafts/retrospectives) + approval_requests (mutable queue) + 2 schedule-derived views. - app/workers/study_weakness.py: derives weaknesses by aggregating study_question_progress.pattern_state (0 LLM calls) + bounded tiers (watch/review/focus). Nightly cron. - study_diagnosis surface: translates the latest snapshot into coach language (weakness decisions are made in code; the LLM only quotes block values). W4-1: code-layer egress removal: - app/eid/ai.py EidAIClient: Eid surfaces = call_primary (internal MLX) only. External LLM fallback paths are structurally sealed (call_fallback raises, automatic fallback removed, external endpoints blocked). Egress workers stay separate. 3 load-bearing corrections (forced by environment grounding, not design regressions): - rules = the full production ruleset → the generation-guard subset (HTML-output rules conflict with study tasks). - append-only = REVOKE → CREATE RULE DO INSTEAD NOTHING (REVOKE has no effect with a single owner role + the migration validator rejects plpgsql BEGIN) + NOT NULL stamps on actor/source_*. - Eid LLM containment = path discipline → structured as EidAIClient. Verification: 30 pure eid unit tests pass + py_compile + simulated migration validator + adversarial egress audit COMPLETE. Tests that depend on DB/LLM/httpx (append-only RULE, EidAIClient, E2E) run on staging (Docker). The W4-2 network belt is conditionally on hold (the code layer is sufficient as a first line; promoted if a hard gate is required after the P0-3② remote measurement). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 15:13:20 +09:00
hyungi	5cb8d04b50	feat(ai): config-driven sampling profile: triage T=0, primary T=0.3 top_p=0.9 P1 of family-adaptive-bengio (Mac mini 4-lever bundle). AIModelConfig: temperature/top_p Optional fields (None = server default). _request conditionally inserts sampling arguments into the payload on the OpenAI/MLX branch. config.yaml ai.models.triage.temperature=0.0 (deterministic) / primary temperature=0.3 top_p=0.9 (summary creativity). Not applied to the fallback (Anthropic) branch; that is in scope for a separate plan. Caller code unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 06:37:46 +00:00
hyungi	118f32f9b1	refactor(ai): PR #20 reframe cleanup: fix leftover Ollama LLM comments Clean up wording left behind because the PR #20 swap (2026-05-14, GPU LLM removal + absorption into the Mac mini 26B MLX) did not carry through to backends.json and code comments/docstrings. - app/ai/client.py: in the AIClient docstring and the call_triage / call_fallback docstrings, "4B Ollama" → "Mac mini 26B MLX" / "currently the same endpoint as triage" → "Claude Sonnet 4 API (PR #20 swap complete)" - app/core/config.py: consolidate the triage/primary/fallback comments + state the PR #20 endpoint in the Phase 3.5 classifier/verifier comments (history preserved) - app/services/search/{llm_gate,classifier_service,verifier_service, evidence_service}.py: correct the "fallback(Ollama)" / "Ollama concurrent OK" / "triage(4B Ollama)" wording to reflect the Mac mini 26B MLX endpoint + add a marker for a separate concurrency-safety review - app/services/digest/summarizer.py: "guard against MLX hang/Ollama stall" → "guard against MLX hang / fallback Claude API stall" - app/services/prompt_versions.py: "4B Ollama" / "4B gemma Ollama" in the SUMMARY_TRIAGE_TASK + ASK_PROMPT_VERSION comments → Mac mini 26B MLX - app/workers/classify_worker.py: correct the B-1 tier triage docstring 0 code behavior changes (comments/docstrings only). The "Ollama bge-m3" wording in embed_worker / study_question_embed_worker is factually accurate and kept. Verification: - ollama list → bge-m3:latest still present (embedding owner) - /api/embeddings probe → 1024-dim 200 OK - fastapi embed/ollama errors 0 (last 10min) - document.hyungi.net 200 plan: ~/.claude/plans/4-stateless-dongarra.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 12:09:15 +00:00
Hyungi Ahn	b3dbf1a11e	fix(ai): parse_json_response: stateful walker that only fixes inside string literals The previous fallback's indiscriminate newline replace also escaped raw newlines outside strings (in the object structure), so the JSON was rejected. In addition, invalid backslashes from LaTeX math (\circ, \text, \, etc) are a separate issue from the newline problem and need their own fix. State machine: toggle in_string (on encountering `\"`). Inside string literals only: - raw LF/CR/TAB → converted to \\n/\\r/\\t - a backslash followed by a valid escape char (\"\\/bfnrtu) is left as-is - a backslash followed by an invalid char (\\c, \\,) has the backslash itself escaped as \\\\ - raw newlines outside strings are JSON whitespace and are preserved The 940-char raw text of production record id=243 contains many LaTeX commands (\\circ \\text \\, \\approx \\times, etc.) + markdown line breaks → the new walker fixes both cases. Other workers (classify/triage/ study_explanation/evidence/study_session_analysis) benefit automatically. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 08:00:20 +09:00
Hyungi Ahn	95b127fd8d	fix(ai): parse_json_response: raw newline escape fallback (stage 5) Analysis of the raw previews of the 33 study_question_jobs.parse_fail cases from Phase 4-A debugging: - The model embeds raw newlines (LF) as-is inside explanation_md ('### [풀이]\n\n**자료...') - The JSON standard forbids raw control characters inside string literals → json.loads rejects it - The stage-4 fallback (greedy slice) fails for the same reason Add a stage-5 fallback: replace \r\n/\n/\r in the candidate with ``\\n``/``\\r`` escapes and retry. An already-escaped ``\\n`` (in a Python str, the two characters backslash+n) is not a raw newline and is unaffected. Other workers (classify/triage/study_explanation/evidence/study_session_analysis) all share the same parser, so they benefit automatically. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 07:56:01 +09:00
Hyungi Ahn	ff41feb3e3	fix(study): Phase 4-A parse_fail debugging: parser fallback + raw storage In production data, 33/114 of the 4-A study_question_jobs ended with 'envelope JSON parse failed'. Presumably many of these are cases the balanced regex in parse_json_response cannot catch. To classify the causes: 1. Parser hardening (app/ai/client.py) - Keep the existing 4-stage parsing (fenced / balanced finditer / whole cleaned text) - Add a stage-5 fallback: greedy slice from the first '{' to the last '}' → json.loads - Guards against cases where balanced matching fails because of inner quotes/newlines/escapes in the envelope JSON. Extracts only the body even when the model mixes free text before and after the JSON. - Additive only, with low regression risk (returns immediately when an earlier stage succeeds) 2. Store a raw preview on parse_fail (study_explanation_worker) - All 3 inline parse_fail branches (not_dict / invalid_answer_choice / empty_explanation_md) call the _save_raw_preview() helper - job.payload.debug_raw_preview = raw_text[:1000] - job.payload.parse_fail_reason = classification key - Lets future analysis of parse_fail row payloads classify causes precisely Next steps: track recurrence after deployment + analyze raw previews → strengthen the prompt further or harden the parser further. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 07:48:10 +09:00
Hyungi Ahn	490bef1136	feat(ai): B-0 3-tier routing: triage/primary/fallback slots + AIClient - config.yaml: add triage (gemma4:e4b-it-q8_0, GPU Ollama, context_char_limit=120k, timeout 30s) to ai.models. State that primary (MLX gemma-4-26b) is escalation-only. Unify fallback on gemma4:e4b (exaone removal already reflected). classifier/verifier stay optional; vision is relaxed to optional (in preparation for removing it as unused). - core/config.py: add a triage field to AIConfig and make vision Optional. New schemas: AIModelConfig.context_char_limit + DeepSummaryBacklogConfig (R2 backlog guard thresholds: ratio 0.3 / pending 5 / window 30min). load_settings handles models.get("vision") gracefully. - ai/client.py: new 3-tier entry points call_triage / call_primary / call_fallback. The docstring states the contract that callers must invoke primary inside a get_mlx_gate() block. classify/summarize only gain a DEPRECATED comment and are kept for existing call sites (eval runner, etc.). PR-B B-0 Day 1. No change to the existing primary path; 0 regressions expected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 10:05:24 +09:00
Hyungi Ahn	b401085518	feat(ai): EscalationEnvelope contract (4B→26B handoff) frozen dataclass with from_stage / escalation_reasons / risk_flags / distilled_context / original_pointers / synthesis_directives / user_intent / draft_hint. JSON round-trip (to_json/from_json). to_system_injection() builds the text block to inject into the 26B system prompt (in the order risk_flags + directives + distilled_context). from_stage is validated against a whitelist (triage/classify/summarize_short/advice_trigger/ night_sweep/ask_pre/unknown). tuple types are enforced (to prevent mutability). The escalation_service in PR-B uses this contract. PR-A only defines the contract. plan: ~/.claude/plans/wise-gliding-hippo.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 09:34:48 +09:00
Hyungi Ahn	f8f72ceae2	fix(ocr): Surya 0.17 API + NFC/NFD path normalize - services/ocr/server.py: rewritten on top of the surya 0.17.x predictors (the old `from surya.ocr import run_ocr` was removed → import error → empty text returned) - Correct the Hangul normalization mismatch between NFC (DB paths) and NFD (NFS filesystem) - Pin surya-ocr to version 0.17.1 (the 0.6~1.0 range exposes breaking changes) - Remove the AIClient.ocr() NotImplementedError (0 callers; extract_worker calls the ocr-service over HTTP directly) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 13:52:19 +09:00
Hyungi Ahn	5070ac45ff	fix(extract): remove LibreOffice extraction truncation and enlarge summary input - extract_worker: remove the 15000-char truncation of LibreOffice output (principle: store the full text) - classify_worker/summarize_worker: enlarge summary input from 15000 to 50000 chars - client.py: remove the length-based automatic switch to Claude (complies with the require_explicit_trigger policy); the primary→fallback (exaone3.5) chain in _call_chat is kept Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 14:00:23 +09:00
Hyungi Ahn	76e723cdb1	feat(search): Phase 1.3 TEI reranker integration (code skeleton) Data-flow principle: fusion = per doc / reranker = per chunk; never mix the two. New/changed: - ai/client.py: add a rerank() method (TEI POST /rerank API) - services/search/rerank_service.py: - rerank_chunks(): asyncio.Semaphore(2) + 5s soft timeout + RRF fallback - _make_snippet/_extract_window: title + 200~400 tokens centered on the query (falls back to the first 800 chars when no keyword matches) - apply_diversity(): max_per_doc=2, top score>=0.90 unlimited - warmup_reranker(): 10 retries at 3-second intervals (waits for the TEI model to load) - hard caps MAX_RERANK_INPUT=200, MAX_CHUNKS_PER_DOC=2 - services/search_telemetry.py: compute_confidence_reranked(): sigmoid score thresholds - api/search.py: - ?rerank=true|false parameter (default true, hybrid mode only) - Flow: fused_docs(limit*5) → fetch chunks_by_doc → rerank_chunks → apply_diversity - For text-only matches, the doc itself is wrapped as a chunk (fallback) - With rerank enabled, confidence is based on the reranker score - tests/search_eval/run_eval.py: --rerank true|false flag Deferred until applied on the GPU: - Add the TEI container (docker-compose.yml): separate task - Update config.yaml rerank.endpoint: directly on the GPU (no commit) - After re-indexing completes: build + warmup + measure on the eval set	2026-04-08 12:41:47 +09:00
Hyungi Ahn	63f75de89d	fix: disable Qwen3.5 thinking mode (enable_thinking: false) Resolves Thinking Process text getting mixed into JSON responses. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 13:38:10 +09:00
Hyungi Ahn	47e9981660	fix: strip Qwen3.5 Thinking Process text to improve JSON parsing Remove all non-JSON text before the first { so that JSON can be extracted even when a thinking/reasoning preamble is present. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 11:44:21 +09:00
Hyungi Ahn	d93e50b55c	security: fix 5 review findings (2 high, 3 medium) HIGH: - Lock setup TOTP/NAS endpoints behind _require_setup() guard (prevented unauthenticated admin 2FA takeover after setup) - Sanitize upload filename with Path().name + resolve() validation (prevented path traversal writing outside Inbox) MEDIUM: - Add score > 0.01 filter to hybrid search via subquery (prevented returning irrelevant documents with zero score) - Implement Inbox → Knowledge file move after classification (classify_worker now moves files based on ai_domain) - Add Anthropic Messages API support in _request() (premium/Claude path now sends correct format and parses content[0].text instead of choices[0].message.content) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 15:33:31 +09:00
Hyungi Ahn	299fac3904	feat: implement Phase 1 data pipeline and migration - Implement kordoc /parse endpoint (HWP/HWPX/PDF via kordoc lib, text files direct read, images flagged for OCR) - Add queue consumer with APScheduler (1min interval, stage chaining extract→classify→embed, stale item recovery, retry logic) - Add extract worker (kordoc HTTP call + direct text read) - Add classify worker (Qwen3.5 AI classification with think-tag stripping and robust JSON extraction from AI responses) - Add embed worker (GPU server nomic-embed-text, graceful failure) - Add DEVONthink migration script with folder mapping for 16 DBs, dry-run mode, batch commits, and idempotent file_path UNIQUE - Enhance ai/client.py with strip_thinking() and parse_json_response() Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 14:35:36 +09:00
Hyungi Ahn	131dbd7b7c	feat: scaffold v2 project structure with Docker, FastAPI, and config v2 scaffolding at the level of minimal working code: - docker-compose.yml: postgres, fastapi, kordoc, frontend, caddy - app/: FastAPI backend (main, core, models, ai, prompts) - services/kordoc/: Node.js document-parsing microservice - gpu-server/: AI Gateway + GPU docker-compose - frontend/: basic SvelteKit structure - migrations/: initial PostgreSQL schema (documents, tasks, processing_queue) - tests/: basic pytest conftest setup - Update config.yaml, Caddyfile, credentials.env.example Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 10:20:15 +09:00

26 Commits
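
Note on commit b3dbf1a11e: the string-literal walker described in that message can be sketched roughly as follows. This is a minimal illustration written from the rules listed in the commit message, not the code in app/ai/client.py; the function name repair_json_strings and the two module-level constants are hypothetical.

```python
import json

# Characters that may legally follow a backslash inside a JSON string.
VALID_ESCAPES = set('"\\/bfnrtu')
# Raw control characters that must be escaped inside a JSON string.
CONTROL_ESCAPES = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}


def repair_json_strings(raw: str) -> str:
    """Hypothetical helper: repair only the inside of string literals."""
    out = []
    in_string = False
    i = 0
    while i < len(raw):
        ch = raw[i]
        if not in_string:
            # Outside strings, raw newlines are JSON whitespace: copy as-is.
            if ch == '"':
                in_string = True
            out.append(ch)
            i += 1
        elif ch == "\\":
            nxt = raw[i + 1] if i + 1 < len(raw) else ""
            if nxt in VALID_ESCAPES:
                # Valid escape pair (including \" and \\): keep both chars.
                out.append(ch + nxt)
                i += 2
            else:
                # Invalid escape such as \c or \, : escape the backslash itself.
                out.append("\\\\")
                i += 1
        elif ch == '"':
            in_string = False
            out.append(ch)
            i += 1
        else:
            # Raw LF/CR/TAB inside a string become \n, \r, \t escapes.
            out.append(CONTROL_ESCAPES.get(ch, ch))
            i += 1
    return "".join(out)


# A model response with a raw newline and LaTeX backslashes inside a string.
broken = '{\n  "md": "line1\nangle 90^\\circ, a\\,b"\n}'
print(json.loads(repair_json_strings(broken)))
```

Under these rules a sequence such as \text still parses, but as a tab escape followed by "ext", because t is a valid escape character; the sketch does not try to detect that case.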