hyungi/hyungi_document_server

Commit Graph

Author

SHA1

Message

Date

Hyungi Ahn

a842c650d8

fix(migration): split multiple statements for asyncpg

asyncpg cannot run multiple SQL statements in one prepared statement.
Remove the COMMENT and keep only the ALTER TABLE.
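
The commit fixes this by dropping the COMMENT statement. Another way to handle the same limitation is to run each statement in its own call. The following is a minimal sketch of that alternative, not the repository's migration runner. `split_statements`, `apply_migration`, and the `documents` table name in the usage note are hypothetical, and `conn` is assumed to be any object with an awaitable `execute(sql)` method, such as an asyncpg connection.

```python
def split_statements(sql: str) -> list[str]:
    """Split a SQL script on semicolons that sit outside single-quoted strings.

    Deliberately simple: it does not handle dollar-quoted bodies or SQL comments.
    """
    statements: list[str] = []
    current: list[str] = []
    in_quote = False
    for ch in sql:
        if ch == "'":
            # An escaped quote ('') toggles twice, so the state is preserved.
            in_quote = not in_quote
        if ch == ";" and not in_quote:
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


async def apply_migration(conn, sql: str) -> int:
    """Run each statement in its own execute() call and return how many ran."""
    statements = split_statements(sql)
    for stmt in statements:
        await conn.execute(stmt)
    return len(statements)
```

With a script such as `ALTER TABLE documents ADD COLUMN extract_meta JSONB; COMMENT ON COLUMN documents.extract_meta IS '...';`, `apply_migration` would issue two separate `execute` calls instead of one combined statement.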

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-15 15:10:01 +09:00

Hyungi Ahn

088966bf78

feat(extract): OCR trigger rules + extract_meta JSONB

Automatic OCR trigger for scanned PDFs/images + result quality validation + one-attempt limit.

- Add extract_meta JSONB column (migration 134)
  ocr_attempted, ocr_reason, ocr_skip_reason, ocr_terminal, ocr_chars
- PDF OCR trigger: total_chars < 300, or avg < 80 && total < 3000
- Automatic OCR for images: jpg/png/tiff/webp, etc.
- Tiered quality thresholds: 50 characters for images; 200 characters or 30 characters per page for PDFs
- Upper limits: pages > 200 or file_size > 150MB → skip
- One-attempt OCR limit: extract_meta.ocr_attempted prevents retries
- extractor_version holds only the tool name (surya_ocr/pymupdf/kordoc)
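
The rules listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this commit: `PdfStats`, `should_ocr_pdf`, `ocr_result_ok`, and the reason strings are hypothetical names. It also assumes that "avg" means characters per page, that "150MB" means 150 * 1024 * 1024 bytes, that the size limits are checked before the trigger thresholds, and that a PDF result passes if either quality threshold is met.

```python
from dataclasses import dataclass

MAX_PAGES = 200
MAX_FILE_SIZE = 150 * 1024 * 1024  # assumption: "150MB" read as 150 MiB


@dataclass
class PdfStats:
    total_chars: int  # characters found by plain text extraction
    pages: int
    file_size: int    # bytes


def should_ocr_pdf(stats: PdfStats, extract_meta: dict) -> tuple[bool, str]:
    """Return (run_ocr, reason) for a PDF; the reason strings are made up."""
    if extract_meta.get("ocr_attempted"):
        return False, "already_attempted"      # one-attempt limit
    if stats.pages > MAX_PAGES or stats.file_size > MAX_FILE_SIZE:
        return False, "too_large"              # upper limits: skip
    avg = stats.total_chars / max(stats.pages, 1)  # assumed: chars per page
    if stats.total_chars < 300:
        return True, "low_total_chars"
    if avg < 80 and stats.total_chars < 3000:
        return True, "low_avg_chars"
    return False, "text_layer_ok"


def ocr_result_ok(kind: str, ocr_chars: int, pages: int = 1) -> bool:
    """Tiered quality check: images need 50 chars, PDFs 200 chars or 30 per page."""
    if kind == "image":
        return ocr_chars >= 50
    return ocr_chars >= 200 or ocr_chars / max(pages, 1) >= 30
```

In this sketch a caller would store the returned reason in `extract_meta` (for example under `ocr_reason` or `ocr_skip_reason`) and set `ocr_attempted` after the first OCR run, so a later pass over the same document does not retry.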

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-15 15:04:13 +09:00

2 Commits