feat: 분류 체계 전면 개편 — taxonomy + document_type + confidence

- config.yaml: 6개 domain × 3단계 taxonomy + 13개 document_types 정의
- classify.txt: 영문 프롬프트, taxonomy 경로 기반 분류 + 분류 규칙 주입
- classify_worker: taxonomy 검증, confidence 기반 분류, document_type 저장
- migration 008: document_type, importance, ai_confidence 컬럼
- API: DocumentResponse에 document_type, importance, ai_confidence 추가

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Hyungi Ahn
2026-04-03 13:32:20 +09:00
parent 770d38b72c
commit 6d73e7ee12
6 changed files with 227 additions and 67 deletions

View File

@@ -40,6 +40,66 @@ nas:
mount_path: "/documents"
pkm_root: "/documents/PKM"
# ─── 문서 분류 체계 ───
taxonomy:
Philosophy:
Ethics: []
Metaphysics: []
Epistemology: []
Logic: []
Aesthetics: []
Eastern_Philosophy: []
Western_Philosophy: []
Language:
Korean: []
English: []
Japanese: []
Translation: []
Linguistics: []
Engineering:
Mechanical: [Piping, HVAC, Equipment]
Electrical: [Power, Instrumentation]
Chemical: [Process, Material]
Civil: []
Network: [Server, Security, Infrastructure]
Industrial_Safety:
Legislation: [Act, Decree, Foreign_Law, Korea_Law_Archive, Enforcement_Rule, Public_Notice, SAPA]
Theory: [Industrial_Safety_General, Safety_Health_Fundamentals]
Academic_Papers: [Safety_General, Risk_Assessment_Research]
Cases: [Domestic, International]
Practice: [Checklist, Contractor_Management, Safety_Education, Emergency_Plan, Patrol_Inspection, Permit_to_Work, PPE, Safety_Plan]
Risk_Assessment: [KRAS, JSA, Checklist_Method]
Safety_Manager: [Appointment, Duty_Record, Improvement, Inspection, Meeting]
Health_Manager: [Appointment, Duty_Record, Ergonomics, Health_Checkup, Mental_Health, MSDS, Work_Environment]
Programming:
Programming_Language: [Python, JavaScript, Go, Rust]
Framework: [FastAPI, SvelteKit, React]
DevOps: [Docker, CI_CD, Linux_Administration]
AI_ML: [Large_Language_Model, Computer_Vision, Data_Science]
Database: []
Software_Architecture: []
General:
Reading_Notes: []
Self_Development: []
Business: []
Science: []
History: []
document_types:
- Reference
- Standard
- Manual
- Drawing
- Template
- Note
- Academic_Paper
- Law_Document
- Report
- Memo
- Checklist
- Meeting_Minutes
- Specification
schedule:
law_monitor: "07:00"
mailplus_archive: ["07:00", "18:00"]