- config.yaml: embedding model → bge-m3
- document.py: Vector(768) → Vector(1024)
- embed_worker.py: 모델 버전 업데이트
- migration 011: 벡터 컬럼 재생성 (기존 임베딩 초기화)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Implement kordoc /parse endpoint (HWP/HWPX/PDF via kordoc lib,
text files direct read, images flagged for OCR)
- Add queue consumer with APScheduler (1min interval, stage chaining
extract→classify→embed, stale item recovery, retry logic)
- Add extract worker (kordoc HTTP call + direct text read)
- Add classify worker (Qwen3.5 AI classification with think-tag
stripping and robust JSON extraction from AI responses)
- Add embed worker (GPU server nomic-embed-text, graceful failure)
- Add DEVONthink migration script with folder mapping for 16 DBs,
dry-run mode, batch commits, and idempotent file_path UNIQUE
- Enhance ai/client.py with strip_thinking() and parse_json_response()
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>