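refactor(board): rebuild the processing-machine board for two nodes (NAS + Mac mini)

Reflects the 2026-07-02 cutover: the GPU server is retired and the MacBook night-drain is on hold (decided 06-29).
- Two lanes: NAS (DS core Docker stages such as extract/markdown/chunk and embedding) and Mac mini (classify/summarize/deep analysis; the single generative LLM hub plus bge-m3/rerank)
- Removed the summarize pool split (summarize_by_machine and the ai_model_version join SQL); dropped it from the response schema after confirming the FE was its only consumer (5 queries -> 4 queries)
- Removed UI that assumed a MacBook: the summarize offload share bar, the summarize join chip, the burndown join inflection marker, the sleep wording, and the MacBook chip in the global strip (replaced by a Mac mini chip)
- deferred_pending is now attributed to the Mac mini card as an LLM backoff signal (feature preserved)
- Machine-independent features preserved: burndown chart, honest ETA, failure drawer, background jobs
- background_jobs default machine attribution changed from gpu to nas
- Unit tests rewritten for two nodes (27 passed)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>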
Merge pull request 'Feat/two node endpoints' (#51) from feat/two-node-endpoints into main
2026-07-02 16:51:32 +09:00 · 2026-07-02 14:31:27 +09:00 · 2026-07-02 13:30:04 +09:00 · 2026-07-02 13:11:33 +09:00 · 2026-07-02 13:11:06 +09:00 · 2026-07-02 09:47:57 +09:00
41 changed files with 2463 additions and 382 deletions
@@ -19,6 +19,14 @@ http://document.hyungi.net {
        Referrer-Policy strict-origin-when-cross-origin
        -Server
    }
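    # Two-node migration (2026-07-02): enforcement of the 100MB upload limit is re-homed from the edge (home-caddy) into DS.
    # Keeps a single source for 413 even when ingress changes to the DSM reverse proxy (whose GUI does not expose the limit).
    # Consistent with config.yaml upload.max_bytes (100000000).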
    request_body {
        max_size 100MB
    }
    encode {
        gzip
        match {
@@ -11,8 +11,8 @@ RUN apt-get update && \
      ffmpeg && \
    apt-get clean && rm -rf /var/lib/apt/lists/*
-COPY requirements.txt .
+COPY requirements.txt requirements.lock ./
-RUN pip install --no-cache-dir -r requirements.txt
+RUN pip install --no-cache-dir -r requirements.lock
 COPY . .
@@ -290,23 +290,43 @@ class AIClient:
        return response.json()["embedding"]
    async def rerank(self, query: str, texts: list[str]) -> list[dict]:
-        """Call TEI bge-reranker-v2-m3 (Phase 1.3).
+        """Call the reranker; the backend is selected by ai.models.rerank.protocol (two-node migration, 2026-07-02).
-        TEI POST /rerank API:
+        Common return contract: [{"index": int, "score": float}, ...] (score descending)
+        "tei" (default, no regression): TEI POST /rerank:
            request:  {"query": str, "texts": [str, ...]}
            response: [{"index": int, "score": float}, ...] (sorted)
+        "llamacpp": llama.cpp POST /v1/rerank (bge-reranker GGUF, Mac mini :8807):
+            request:  {"model": str, "query": str, "documents": [str, ...]}
+            response: {"results": [{"index": int, "relevance_score": float}, ...]}
+            → normalized to the TEI shape by normalize_llamacpp_rerank.
+        Unsupported protocol = ValueError (explicit failure; silent fallback is forbidden).
        timeout comes from self.ai.rerank.timeout (config.yaml).
        The caller (rerank_service) wraps this in asyncio.Semaphore + try/except.
        """
+        protocol = getattr(self.ai.rerank, "protocol", "tei") or "tei"
        timeout = float(self.ai.rerank.timeout) if self.ai.rerank.timeout else 5.0
-        response = await self._http.post(
-            self.ai.rerank.endpoint,
-            json={"query": query, "texts": texts},
-            timeout=timeout,
-        )
-        response.raise_for_status()
-        return response.json()
+        if protocol == "tei":
+            response = await self._http.post(
+                self.ai.rerank.endpoint,
+                json={"query": query, "texts": texts},
+                timeout=timeout,
+            )
+            response.raise_for_status()
+            return response.json()
+        if protocol == "llamacpp":
+            from ai.rerank_protocol import normalize_llamacpp_rerank
+            response = await self._http.post(
+                self.ai.rerank.endpoint,
+                json={"model": self.ai.rerank.model, "query": query, "documents": texts},
+                timeout=timeout,
+            )
+            response.raise_for_status()
+            return normalize_llamacpp_rerank(response.json())
+        raise ValueError(f"unknown rerank protocol: {protocol}")
    async def _call_chat(self, model_config, prompt: str) -> str:
        """OpenAI-compatible API call (R6: removed the cloud fallback that ran without consent).
@@ -0,0 +1,24 @@
 """Normalization of rerank backend responses: two-node migration (2026-07-02, main-server-retirement-1 P1-4).
 TEI (/rerank) and llama.cpp (/v1/rerank) use different request/response schemas.
 The consumer (rerank_service) expects the TEI shape [{"index": int, "score": float}],
 so llama.cpp responses are normalized here. Pure function (stdlib only), intended for unit tests.
 """
 def normalize_llamacpp_rerank(payload: dict) -> list[dict]:
    """Normalize a llama.cpp /v1/rerank response to the TEI shape.
    Input:   {"results": [{"index": int, "relevance_score": float}, ...], ...}
    Returns: [{"index": int, "score": float}, ...] (score descending; keeps TEI's 'sorted' contract)
    Items missing index/relevance_score are dropped (same defense as the consumer-side idx/sc None guard).
    """
    results = payload.get("results") or []
    normalized = [
        {"index": r["index"], "score": float(r["relevance_score"])}
        for r in results
        if r.get("index") is not None and r.get("relevance_score") is not None
    ]
    normalized.sort(key=lambda r: -r["score"])
    return normalized
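As a rough illustration of the two request shapes and the normalization step described in the docstrings above, the sketch below builds the JSON body for each protocol and normalizes a llama.cpp-style response. `build_rerank_body` and the sample values (model name, scores) are hypothetical and not part of this PR; `normalize_llamacpp_rerank` repeats the logic from the diff only so that the block runs on its own.

```python
def build_rerank_body(protocol: str, model: str, query: str, texts: list[str]) -> dict:
    """Hypothetical helper: JSON body per protocol, following the docstring in the diff."""
    if protocol == "tei":
        return {"query": query, "texts": texts}
    if protocol == "llamacpp":
        return {"model": model, "query": query, "documents": texts}
    # Explicit failure instead of silently falling back to another backend.
    raise ValueError(f"unknown rerank protocol: {protocol}")


def normalize_llamacpp_rerank(payload: dict) -> list[dict]:
    """Same logic as the function in the diff, repeated to keep the sketch self-contained."""
    results = payload.get("results") or []
    normalized = [
        {"index": r["index"], "score": float(r["relevance_score"])}
        for r in results
        if r.get("index") is not None and r.get("relevance_score") is not None
    ]
    normalized.sort(key=lambda r: -r["score"])
    return normalized


# Invented sample payload in the llama.cpp response shape.
sample = {
    "results": [
        {"index": 0, "relevance_score": 0.12},
        {"index": 1, "relevance_score": 0.87},
        {"index": 2},  # no relevance_score: dropped by the guard
    ]
}
ranked = normalize_llamacpp_rerank(sample)
tei_body = build_rerank_body("tei", "unused", "q", ["a", "b"])
llama_body = build_rerank_body("llamacpp", "example-model", "q", ["a", "b"])
```

Because both branches end in the same `[{"index", "score"}]` list, a consumer written against the TEI shape should not need to know which backend answered.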
@@ -37,8 +37,8 @@ class CurrentItem(BaseModel):
 class MachineCard(BaseModel):
-    """Machine card: per-stage attributed totals + completed counts (summarize is split as a pool) + state."""
+    """Machine card: per-stage attributed totals + completed counts + state (two nodes, NAS/Mac mini)."""
-    key: Literal["gpu", "macmini", "macbook"]
+    key: Literal["nas", "macmini"]
    label: str
    state: Literal["active", "deferred", "idle"]
    stages: list[str]
@@ -59,20 +59,6 @@ class SummarizeEta(BaseModel):
    eta_minutes: int | None
-class MachineDone(BaseModel):
-    """Completed summarize counts for one machine (for the share display)."""
-    done_1h: int
-    done_today: int
-class SummarizeByMachine(BaseModel):
-    """Per-machine share of completed summarize work in the summarize pool, used to make the
-    'Mac mini vs MacBook' offload visible in the board lanes. Exposes values that
-    rows_to_summarize_split already computed (ds-board-merged A-1, zero new collection SQL)."""
-    macmini: MachineDone
-    macbook: MachineDone
 class TrendBucket(BaseModel):
    """summarize 24h trend bucket; hour is a KST "HH:00" label."""
    hour: str
@@ -122,7 +108,6 @@ class QueueOverviewResponse(BaseModel):
    machines: list[MachineCard]
    stages: list[StageRow]
    summarize_eta: SummarizeEta
-    summarize_by_machine: SummarizeByMachine
    trend_24h: list[TrendBucket]
    totals: Totals
    background_jobs: list[BackgroundJobItem] = []
@@ -8,13 +8,14 @@ from __future__ import annotations
 from typing import Annotated
-from fastapi import APIRouter, Depends
+from fastapi import APIRouter, Depends, HTTPException
 from sqlalchemy.ext.asyncio import AsyncSession
 from core.auth import get_current_user
 from core.database import get_session
 from models.user import User
 from services.study import concept_curriculum as cc
 from services.study import concept_links as cl
 router = APIRouter()
@@ -43,6 +44,45 @@ async def get_today_concepts(
    return await cc.today_concepts(session, user.id, topic_id, limit)
@router.get("/concepts/weakness-map")
 async def get_weakness_map(
    user: Annotated[User, Depends(get_current_user)],
    session: Annotated[AsyncSession, Depends(get_session)],
    topic_id: int = DEFAULT_TOPIC_ID,
    limit: int = 12,
 ):
    """Concept weakness map: weak concepts (accuracy < 60%) first, based on accuracy on the linked past-exam questions (theory↔question)."""
    name = await cc._topic_name(session, topic_id)
    if not name:
        return {"weak": [], "weak_total": 0, "evaluated_total": 0}
    return await cl.weakness_map(session, user.id, name, limit)
@router.get("/concepts/{doc_id}")
 async def get_concept_detail(
    doc_id: int,
    user: Annotated[User, Depends(get_current_user)],
    session: Annotated[AsyncSession, Depends(get_session)],
    topic_id: int = DEFAULT_TOPIC_ID,
 ):
    """Concept reader material: structure parsing (summary/body/frequently tested/related) + backlink resolution + read/SR state + previous/next."""
    detail = await cc.concept_detail(session, user.id, topic_id, doc_id)
    if detail is None:
        raise HTTPException(status_code=404, detail="concept not found")
    return detail
@router.get("/concepts/{doc_id}/questions")
 async def get_concept_questions(
    doc_id: int,
    user: Annotated[User, Depends(get_current_user)],
    session: Annotated[AsyncSession, Depends(get_session)],
    limit: int = 20,
 ):
    """Past-exam questions related to the concept + my accuracy (theory↔question bridge)."""
    return await cl.related_questions(session, user.id, doc_id, limit)
@router.post("/concepts/{doc_id}/read")
 async def post_concept_read(
    doc_id: int,
@@ -35,6 +35,12 @@ class AIModelConfig(BaseModel):
    # Applies only to the OpenAI-compatible branch (mlx); not applied to the Anthropic branch (separate scope).
    repetition_penalty: float | None = None
    top_k: int | None = None
    # Two-node migration (2026-07-02): discriminator for the rerank backend protocol.
    # "tei" = TEI POST /rerank {"query","texts"} → [{"index","score"}] (default, no regression)
    # "llamacpp" = llama.cpp POST /v1/rerank {"model","query","documents"}
    #              → {"results":[{"index","relevance_score"}]} (Mac mini :8807)
    # Unsupported value = client.rerank raises ValueError (silent fallback is forbidden). Ignored outside the rerank block.
    protocol: str = "tei"
 class DeepSummaryBacklogConfig(BaseModel):
@@ -145,6 +151,12 @@ class Settings(BaseModel):
    # STT (faster-whisper, §3)
    stt_endpoint: str = "http://stt-service:3300"
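    # Two-node migration (2026-07-02): explicit gates for the retirement of the GPU CUDA services (Surya OCR / faster-whisper).
    # false = the path is explicitly disabled. For OCR, _call_ocr logs a warning and returns None (existing soft-fail semantics);
    # for STT, a terminal skip is recorded in extract_meta. This is not a silent low-quality fallback (it is visible in logs/meta).
    ocr_enabled: bool = True
    stt_enabled: bool = True
    # §3 file_watcher: Roon music library path (skipped by prefix match).
    # An empty string disables the skip. Example: "/documents/PKM/../Music/roon-library" or
    # a Roon library mounted separately over NFS.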
@@ -224,6 +236,8 @@ def load_settings() -> Settings:
    kordoc_endpoint = os.getenv("KORDOC_ENDPOINT", "http://kordoc-service:3100")
    ocr_endpoint = os.getenv("OCR_ENDPOINT", "http://ocr-service:3200")
    stt_endpoint = os.getenv("STT_ENDPOINT", "http://stt-service:3300")
    ocr_enabled = os.getenv("OCR_ENABLED", "true").lower() in ("1", "true", "yes")
    stt_enabled = os.getenv("STT_ENABLED", "true").lower() in ("1", "true", "yes")
    roon_library_path = os.getenv("ROON_LIBRARY_PATH", "")
    # ADDITIONAL_WATCH_TARGETS: comma-separated (whitespace stripped)
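The `OCR_ENABLED` / `STT_ENABLED` parsing above accepts only three truthy spellings. A minimal sketch of that rule, assuming a hypothetical helper named `env_flag` (the PR inlines the expression instead):

```python
import os

# Truthy spellings accepted by the expression in the diff.
_TRUTHY = ("1", "true", "yes")


def env_flag(name: str, default: str = "true") -> bool:
    """Hypothetical helper mirroring `os.getenv(NAME, "true").lower() in ("1", "true", "yes")`."""
    return os.getenv(name, default).lower() in _TRUTHY


os.environ["OCR_ENABLED"] = "Yes"   # case-insensitive match: enabled
os.environ["STT_ENABLED"] = "on"    # not in the accepted set: disabled
os.environ.pop("DEMO_UNSET_FLAG", None)

ocr_enabled = env_flag("OCR_ENABLED")
stt_enabled = env_flag("STT_ENABLED")
unset_enabled = env_flag("DEMO_UNSET_FLAG")
```

One consequence worth noting: any value outside the accepted set, including `on` or a value with surrounding whitespace, disables the feature rather than raising an error.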
@@ -343,6 +357,8 @@ def load_settings() -> Settings:
        kordoc_endpoint=kordoc_endpoint,
        ocr_endpoint=ocr_endpoint,
        stt_endpoint=stt_endpoint,
        ocr_enabled=ocr_enabled,
        stt_enabled=stt_enabled,
        roon_library_path=roon_library_path,
        additional_watch_targets=additional_watch_targets,
        taxonomy=taxonomy,
@@ -36,6 +36,8 @@ KNOWN_4B_TASKS = {
 }
 KNOWN_26B_TASKS = {
    "p3c_deep_summary",
    # presegment PR2: the reduce stage of map-reduce for huge documents (a summary of summaries)
    "p3c_deep_summary_reduce",
    "p4b_synthesis",
 }
@@ -0,0 +1,44 @@
 [System]
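 You are an analyst of long documents and document bundles. This document is too large to process in one pass, so the source was split into units in order and each unit was summarized first (map stage). The "unit summaries" below follow the source order and cover the whole document without gaps. Synthesize them into the final analysis of the entire document (reduce stage).
 subject_description: {subject_description}
 {forbidden_block}
 Order for reading the envelope:
 1. Look at risk_flags first. Identify which risk caused the escalation.
 2. Treat synthesis_directives as system instructions and follow them strictly.
 3. distilled_context is only a "reference gist"; re-verify the evidence in the unit summaries.
 Writing rules:
 - TL;DR (1 sentence, at most 60 characters)
 - Key points (5 bullets, 30~80 characters each)
 - Detail (2~4 paragraphs, 3~5 sentences each): a narrative that runs through the whole document while preserving the logical flow of the unit (section) order. Do not favor only certain units.
 - No information that is absent from the unit summaries (no hallucination). Use only the numbers, clauses, and quotations that appear in the unit summaries.
 - Deduplicate the "불일치(...)" (inconsistency) lines from the unit summaries and preserve them as inconsistencies; do not discard them arbitrarily.
 - Strictly follow the wording rules in synthesis_directives (such as the ban on "원인은 ~", i.e. "the cause is ~").
 - If the multi_reference_synthesis flag is present, describe each reference's position separately and give no overall recommendation.
 Output (JSON only):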
 {{
  "mode": "single|bundle",
  "tldr": "...",
  "bullets": ["..."],
  "detail": "...\\n\\n...",
  "bundle_flow": ["..."] | null,
  "inconsistencies": ["..."] | null,
  "entities_confirmed": {{
    "people": [{{"name": "...", "evidence": "..."}}],
    "orgs": [...],
    "projects": [...]
  }},
  "directives_applied": ["..."],
  "confidence": 0.0~1.0
 }}
 [User]
 Envelope:
 {{escalation_envelope_json}}
 Unit summaries ({{unit_count}} in total, in source order; each block = the summary of one span of the source):
 {{unit_summaries}}
@@ -0,0 +1,104 @@
 # requirements.lock: pip freeze snapshot of the live fastapi container (2026-07-02, 101 pkgs, CVE-clear known-good)
 # Regenerate: docker exec hyungi_document_server-fastapi-1 pip freeze > app/requirements.lock (then reattach this header)
 # requirements.txt = the human-edited floor spec (>=) / this lock = the canonical set the Dockerfile actually installs (==)
 annotated-doc==0.0.4
 annotated-types==0.7.0
 anthropic==0.109.1
 anyio==4.13.0
 APScheduler==3.11.2
 asyncpg==0.31.0
 babel==2.18.0
 bcrypt==5.0.0
 beautifulsoup4==4.15.0
 caldav==3.2.1
 certifi==2026.5.20
 cffi==2.0.0
 chardet==7.4.3
 charset-normalizer==3.4.7
 click==8.4.1
 cobble==0.1.4
 courlan==1.4.0
 cryptography==48.0.1
 cssselect==1.4.0
 dateparser==1.4.0
 defusedxml==0.7.1
 distro==1.9.0
 dnspython==2.8.0
 docstring_parser==0.18.0
 ecdsa==0.19.2
 et_xmlfile==2.0.0
 fastapi==0.136.3
 feedparser==6.0.12
 flatbuffers==25.12.19
 greenlet==3.5.1
 h11==0.16.0
 htmldate==1.10.0
 httpcore==1.0.9
 httptools==0.8.0
 httpx==0.28.1
 icalendar==7.1.2
 icalendar-searcher==1.0.6
 idna==3.18
 jh2==5.0.13
 Jinja2==3.1.6
 jiter==0.15.0
 jusText==3.0.2
 lxml==6.1.1
 lxml_html_clean==0.4.5
 magika==0.6.3
 mammoth==1.11.0
 Markdown==3.10.2
 markdownify==1.2.2
 markitdown==0.1.6
 MarkupSafe==3.0.3
 niquests==3.19.1
 numpy==2.4.6
 olefile==0.47
 onnxruntime==1.26.0
 openpyxl==3.1.5
 packaging==26.2
 pandas==3.0.3
 pgvector==0.4.2
 pillow==12.2.0
 protobuf==7.35.0
 pyasn1==0.6.3
 pycparser==3.0
 pydantic==2.13.4
 pydantic_core==2.46.4
 pyhwp==0.1b15
 PyMuPDF==1.27.2.3
 pyotp==2.9.0
 python-dateutil==2.9.0.post0
 python-dotenv==1.2.2
 python-jose==3.5.0
 python-multipart==0.0.32
 python-pptx==1.0.2
 pytz==2026.2
 PyYAML==6.0.3
 qh3==1.9.2
 readability-lxml==0.8.4.1
 recurring-ical-events==3.8.2
 regex==2026.5.9
 requests==2.34.2
 rsa==4.9.1
 sgmllib3k==1.0.0
 six==1.17.0
 sniffio==1.3.1
 soupsieve==2.8.4
 SQLAlchemy==2.0.50
 starlette==1.2.1
 tld==0.13.2
 trafilatura==2.1.0
 typing-inspection==0.4.2
 typing_extensions==4.15.0
 tzdata==2026.2
 tzlocal==5.3.1
 urllib3==2.7.0
 urllib3-future==2.21.902
 uvicorn==0.49.0
 uvloop==0.22.1
 wassima==2.1.1
 watchfiles==1.2.0
 websockets==16.0
 x-wr-timezone==2.0.1
 xlsxwriter==3.2.9
@@ -3,19 +3,16 @@
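 Aggregation logic for GET /api/queue/overview. Every figure is computed live from existing
 processing_queue / documents columns: zero new tables or migrations (HARD constraint).
-Structure: the SQL collection part (5 queries inside build_overview) is separated from the decision part (pure functions).
+Structure: the SQL collection part (4 queries inside build_overview) is separated from the decision part (pure functions).
 The decision part (rows_to_* / build_machines / build_summarize_eta / build_trend /
 build_totals / compute_eta_minutes) can be unit-tested without a DB.
-Attribution rules (single source of truth):
-- static stage→machine map: gpu = extract/embed/chunk/markdown/preview/thumbnail/
-  fulltext/stt · macmini = classify/summarize · macbook = deep_summary
-  (but when settings.ai.deep is absent, deep_summary is also attributed to macmini).
-- summarize is a pool: pending/processing/failed are attributed to macmini, while completed
-  counts (done_*) are split via a documents.ai_model_version join: 'qwen-macbook'
-  counts for macbook, anything else for macmini.
-- deferred_pending (payload.deferred_until in the future) is attributed to the macbook card
-  (deferred = signal that the MacBook is unavailable).
+Attribution rules (single source of truth; two nodes, NAS + Mac mini, after the 2026-07-02 cutover):
+- static stage→machine map: nas = extract/embed/chunk/markdown/preview/thumbnail/
+  fulltext/stt (DS core Docker; embedding and rerank model calls go out to the Mac mini) ·
+  macmini = classify/summarize/deep_summary (single generative LLM hub).
+- deferred_pending (payload.deferred_until in the future) is an LLM backoff signal,
+  attributed to the macmini card, which owns summarize/deep_summary.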
 """
 from datetime import datetime, timedelta
@@ -25,42 +22,33 @@ from zoneinfo import ZoneInfo
 from sqlalchemy import bindparam, text
 from sqlalchemy.ext.asyncio import AsyncSession
 from core.config import settings
 KST = ZoneInfo("Asia/Seoul")
-# Alias for internal classification only: never expose raw model names in responses, only machine labels.
-_MACBOOK_MODEL_ALIAS = "qwen-macbook"
 # Ingredients of the static stage→machine map (declaration order = display order of a card's stages)
-_GPU_STAGES = (
+_NAS_STAGES = (
     "extract", "embed", "chunk", "markdown",
     "preview", "thumbnail", "fulltext", "stt",
 )
-_MACMINI_STAGES = ("classify", "summarize")
-_MACBOOK_STAGES = ("deep_summary",)
-_STAGE_ORDER = _GPU_STAGES + _MACMINI_STAGES + _MACBOOK_STAGES
+_MACMINI_STAGES = ("classify", "summarize", "deep_summary")
+_STAGE_ORDER = _NAS_STAGES + _MACMINI_STAGES
-_MACHINE_KEYS = ("gpu", "macmini", "macbook")
+_MACHINE_KEYS = ("nas", "macmini")
 _MACHINE_LABELS = {
-    "gpu": "GPU 서버",
+    "nas": "나스",
     "macmini": "맥미니",
-    "macbook": "맥북 M5 Max",
 }
 # Upper bound on the `current` items shown per machine card
 _CURRENT_LIMIT = 2
-def stage_machine_map(deep_enabled: bool) -> dict[str, str]:
+def stage_machine_map() -> dict[str, str]:
-    """stage → machine key map. When the deep slot is absent, deep_summary is attributed to macmini."""
+    """stage → machine key map (static; two nodes, NAS/Mac mini)."""
     mapping: dict[str, str] = {}
-    for s in _GPU_STAGES:
-        mapping[s] = "gpu"
+    for s in _NAS_STAGES:
+        mapping[s] = "nas"
     for s in _MACMINI_STAGES:
         mapping[s] = "macmini"
-    for s in _MACBOOK_STAGES:
-        mapping[s] = "macbook" if deep_enabled else "macmini"
     return mapping
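To make the static two-node attribution concrete, here is a small standalone sketch of the new map. The stage tuples are copied from the diff; `stages_by_machine` is a hypothetical helper added only to show the inverse view a card would display.

```python
# Stage tuples as declared in the diff (declaration order = display order).
NAS_STAGES = (
    "extract", "embed", "chunk", "markdown",
    "preview", "thumbnail", "fulltext", "stt",
)
MACMINI_STAGES = ("classify", "summarize", "deep_summary")


def stage_machine_map() -> dict[str, str]:
    """Static stage → machine key map for the NAS / Mac mini layout."""
    mapping = {s: "nas" for s in NAS_STAGES}
    mapping.update({s: "macmini" for s in MACMINI_STAGES})
    return mapping


def stages_by_machine(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Hypothetical inverse view: machine key → stages, preserving declaration order."""
    grouped: dict[str, list[str]] = {}
    for stage, machine in mapping.items():
        grouped.setdefault(machine, []).append(stage)
    return grouped


smap = stage_machine_map()
cards = stages_by_machine(smap)
```

With the `deep_enabled` switch gone, the map no longer depends on runtime configuration, so the same stage always lands on the same card.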
@@ -90,23 +78,6 @@ def rows_to_stage_stats(rows) -> dict[str, dict]:
    return stats
-def rows_to_summarize_split(rows) -> dict[str, dict]:
-    """Rows of the summarize completed-count split query → {"macbook"|"macmini": {done_*}}.
-    is_macbook = whether documents.ai_model_version is 'qwen-macbook' (internal classification only).
-    """
-    split = {
-        "macbook": {"done_1h": 0, "done_today": 0, "done_15m": 0},
-        "macmini": {"done_1h": 0, "done_today": 0, "done_15m": 0},
-    }
-    for row in rows:
-        key = "macbook" if row[0] else "macmini"
-        split[key]["done_1h"] += int(row[1] or 0)
-        split[key]["done_today"] += int(row[2] or 0)
-        split[key]["done_15m"] += int(row[3] or 0)
-    return split
 def display_title(row: dict) -> str:
    """Display title: title > original_filename > file_path basename > document id."""
    if row.get("title"):
@@ -120,13 +91,10 @@ def display_title(row: dict) -> str:
 def build_machines(
     stage_stats: dict[str, dict],
-    summarize_split: dict[str, dict],
     current_rows: list[dict],
-    *,
-    deep_enabled: bool,
 ) -> list[dict]:
-    """Build the 3 machine cards (gpu / macmini / macbook): the decision part of the attribution rules."""
+    """Build the 2 machine cards (nas / macmini): the decision part of the attribution rules."""
-    smap = stage_machine_map(deep_enabled)
+    smap = stage_machine_map()
    def g(stage: str, field: str) -> int:
        return stage_stats.get(stage, {}).get(field, 0)
@@ -149,29 +117,23 @@ def build_machines(
        pending = sum(g(s, "pending") for s in stages)
        processing = sum(g(s, "processing") for s in stages)
        failed = sum(g(s, "failed") for s in stages)
-        # Completed counts: summarize is a pool, so exclude it from the stage sum and attribute it via split
-        done_1h = sum(g(s, "done_1h") for s in stages if s != "summarize")
-        done_today = sum(g(s, "done_today") for s in stages if s != "summarize")
-        done_15m = sum(g(s, "done_15m") for s in stages if s != "summarize")
-        if key in summarize_split:
-            done_1h += summarize_split[key]["done_1h"]
-            done_today += summarize_split[key]["done_today"]
-            done_15m += summarize_split[key]["done_15m"]
-        # Deferred backoff = signal that the MacBook is unavailable → attributed to the macbook card (regardless of the deep slot)
+        done_1h = sum(g(s, "done_1h") for s in stages)
+        done_today = sum(g(s, "done_today") for s in stages)
+        done_15m = sum(g(s, "done_15m") for s in stages)
+        # Deferred backoff = signal that the LLM is unavailable → attributed to the macmini card, which owns the LLM stages
        deferred_pending = (
            g("summarize", "deferred_pending") + g("deep_summary", "deferred_pending")
-            if key == "macbook" else 0
+            if key == "macmini" else 0
        )
        # State decision, priority: active > deferred > idle (user feedback 2026-06-11).
        # While the machine is working (processing, or a completion within the last 15 minutes) it is "active" even if
        # backoff remains; the deferred count is shown separately on the card's deferred_pending line. The "deferred" chip
-        # appears only when work has actually stopped and only backoff is piling up (sleep / sustained unavailability).
+        # appears only when work has actually stopped and only backoff is piling up (LLM hub persistently unavailable).
        if processing > 0 or done_15m > 0:
            state = "active"
-        elif key == "macbook" and deferred_pending > 0:
+        elif deferred_pending > 0:
            state = "deferred"
        else:
            state = "idle"
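The state priority in the hunk above (active > deferred > idle) can be isolated as a pure function. This is a minimal sketch; `card_state` is a hypothetical name, since the PR keeps the logic inline in `build_machines`.

```python
def card_state(processing: int, done_15m: int, deferred_pending: int) -> str:
    """Hypothetical extraction of the state rule: active > deferred > idle."""
    # Working now, or finished something in the last 15 minutes: active,
    # even if backoff items remain.
    if processing > 0 or done_15m > 0:
        return "active"
    # Nothing moving and only backoff piling up: deferred.
    if deferred_pending > 0:
        return "deferred"
    return "idle"


busy_with_backoff = card_state(processing=1, done_15m=0, deferred_pending=5)
stalled = card_state(processing=0, done_15m=0, deferred_pending=5)
quiet = card_state(processing=0, done_15m=0, deferred_pending=0)
```

Since `deferred_pending` is computed as 0 for every card except `macmini`, dropping the old `key == "macbook"` guard should not let the NAS card enter the deferred state.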
@@ -213,16 +175,6 @@ def build_summarize_eta(stage_stats: dict[str, dict]) -> dict:
    }
-def build_summarize_by_machine(summarize_split: dict[str, dict]) -> dict:
-    """Per-machine share of completed summarize work (macmini vs macbook), used to make the
-    offload visible in the board lanes. Projects the values rows_to_summarize_split already produced
-    into the response shape (done_1h/done_today only; done_15m is excluded because it is used only for the internal state decision)."""
-    def m(key: str) -> dict:
-        s = summarize_split.get(key, {})
-        return {"done_1h": int(s.get("done_1h", 0)), "done_today": int(s.get("done_today", 0))}
-    return {"macmini": m("macmini"), "macbook": m("macbook")}
 def build_trend(
    inflow_buckets: dict[str, int],
    done_buckets: dict[str, int],
@@ -287,28 +239,23 @@ def build_totals(stage_stats: dict[str, dict]) -> dict:
 def compose_overview(
     stage_stats: dict[str, dict],
-    summarize_split: dict[str, dict],
     inflow_buckets: dict[str, int],
     done_buckets: dict[str, int],
     current_rows: list[dict],
     *,
-    deep_enabled: bool,
     now_kst: datetime,
 ) -> dict:
     """Collected stats → response dict (contract shape). Pure function, no DB required."""
     return {
-        "machines": build_machines(
-            stage_stats, summarize_split, current_rows, deep_enabled=deep_enabled
-        ),
+        "machines": build_machines(stage_stats, current_rows),
         "stages": build_stages(stage_stats),
         "summarize_eta": build_summarize_eta(stage_stats),
-        "summarize_by_machine": build_summarize_by_machine(summarize_split),
         "trend_24h": build_trend(inflow_buckets, done_buckets, now_kst),
         "totals": build_totals(stage_stats),
     }
-# ─── SQL collection part (5 queries total) ────────────────────────────────────
+# ─── SQL collection part (4 queries total) ────────────────────────────────────
 # 1) stage×status aggregation + time-window completions/inflow + deferred (1 query)
 _STAGE_STATS_SQL = """
@@ -333,23 +280,7 @@ _STAGE_STATS_SQL = """
    GROUP BY stage
 """
-# 2) summarize pool completed-count split (documents.ai_model_version join, 1 query)
-#    Scan lower bound = the earlier of today 00:00 (KST) and 1h ago (preserves the 1h window right after midnight).
-_SUMMARIZE_SPLIT_SQL = """
-    SELECT
-        COALESCE(d.ai_model_version = :macbook_alias, false)                 AS is_macbook,
-        COUNT(*) FILTER (WHERE q.completed_at > NOW() - INTERVAL '1 hour')   AS done_1h,
-        COUNT(*) FILTER (WHERE q.completed_at > :kst_midnight)               AS done_today,
-        COUNT(*) FILTER (WHERE q.completed_at > NOW() - INTERVAL '15 minutes') AS done_15m
-    FROM processing_queue q
-    JOIN documents d ON d.id = q.document_id
-    WHERE q.stage = 'summarize'
-      AND q.status = 'completed'
-      AND q.completed_at > LEAST(:kst_midnight, NOW() - INTERVAL '1 hour')
-    GROUP BY 1
-"""
-# 3/4) summarize 24h trend: KST hour buckets (inflow/done, 1 query each)
+# 2/3) summarize 24h trend: KST hour buckets (inflow/done, 1 query each)
 _TREND_INFLOW_SQL = """
    SELECT to_char(date_trunc('hour', created_at AT TIME ZONE 'Asia/Seoul'),
                   'YYYY-MM-DD HH24:00')                                     AS bucket,
@@ -371,7 +302,7 @@ _TREND_DONE_SQL = """
    GROUP BY 1
 """
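-# 5) processing rows + display-title ingredients (1 query; the 2-per-machine slice happens in the decision part)
+# 4) processing rows + display-title ingredients (1 query; the 2-per-machine slice happens in the decision part)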
 _CURRENT_SQL = """
    SELECT q.stage, q.document_id, d.title, d.original_filename, d.file_path
    FROM processing_queue q
@@ -383,20 +314,13 @@ _CURRENT_SQL = """
 async def build_overview(session: AsyncSession) -> dict:
-    """Collect with 5 queries → decide via compose_overview → response dict."""
+    """Collect with 4 queries → decide via compose_overview → response dict."""
     now_kst = datetime.now(KST)
     kst_midnight = now_kst.replace(hour=0, minute=0, second=0, microsecond=0)
-    deep_enabled = settings.ai is not None and settings.ai.deep is not None
     stage_rows = (
         await session.execute(text(_STAGE_STATS_SQL), {"kst_midnight": kst_midnight})
     ).all()
-    split_rows = (
-        await session.execute(
-            text(_SUMMARIZE_SPLIT_SQL),
-            {"kst_midnight": kst_midnight, "macbook_alias": _MACBOOK_MODEL_ALIAS},
-        )
-    ).all()
     inflow_rows = (await session.execute(text(_TREND_INFLOW_SQL))).all()
     done_rows = (await session.execute(text(_TREND_DONE_SQL))).all()
     current_result = (await session.execute(text(_CURRENT_SQL))).all()
@@ -414,11 +338,9 @@ async def build_overview(session: AsyncSession) -> dict:
     result = compose_overview(
         rows_to_stage_stats(stage_rows),
-        rows_to_summarize_split(split_rows),
         {row[0]: int(row[1]) for row in inflow_rows},
         {row[0]: int(row[1]) for row in done_rows},
         current_rows,
-        deep_enabled=deep_enabled,
         now_kst=now_kst,
     )
     # Management scripts outside the queue (backfills etc.) = background_jobs (migration 357). Degrades gracefully ([]) when the table is absent.
@@ -426,13 +348,13 @@ async def build_overview(session: AsyncSession) -> dict:
    return result
-# kind -> processing machine (for board machine-card attribution). Unknown kind = gpu (orchestration host).
+# kind -> processing machine (for board machine-card attribution). Unknown kind = nas (orchestration host).
 _BG_JOB_MACHINE = {
    "global_digest": "macmini",
    "morning_briefing": "macmini",
    "section_summary": "macmini",
-    "hier_backfill": "gpu",
+    "hier_backfill": "nas",
-    "hier_redecompose": "gpu",
+    "hier_redecompose": "nas",
 }
@@ -466,7 +388,7 @@ async def _fetch_background_jobs(session: AsyncSession) -> list[dict]:
            "processed": int(r["processed"] or 0), "total": r["total"],
            "elapsed_sec": int(r["elapsed_sec"] or 0), "stale": bool(r["stale"]),
            "error": r["error"],
-            "machine": _BG_JOB_MACHINE.get(r["kind"], "gpu"),
+            "machine": _BG_JOB_MACHINE.get(r["kind"], "nas"),
        }
        for r in rows
    ]
@@ -17,6 +17,7 @@ snippet 생성:
 from __future__ import annotations
 import asyncio
+import os
 import re
 from typing import TYPE_CHECKING
@@ -33,8 +34,11 @@ logger = setup_logger("rerank")
 # Limit on concurrent rerank calls (prevents GPU saturation)
 RERANK_SEMAPHORE = asyncio.Semaphore(2)
-# rerank input size limit (latency / VRAM hard cap)
-MAX_RERANK_INPUT = 200
+# rerank input size limit (latency / VRAM hard cap).
+# Two-node migration (2026-07-02): tunable via env MAX_RERANK_INPUT. Mac mini llama.cpp rerank latency grows
+# linearly with the candidate count (measured from the NAS: 50=0.60s / 100=0.95s / 200=1.89s), so 50 is recommended for NAS deployments.
+# Default 200 = no regression for the current setup (GPU TEI).
+MAX_RERANK_INPUT = int(os.getenv("MAX_RERANK_INPUT", "200"))
 MAX_CHUNKS_PER_DOC = 2
 # Soft timeout (seconds)
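The comment above calls the llama.cpp rerank latency linear in the candidate count. As a rough check, the sketch below fits a least-squares line through the three measurements quoted in the diff and derives the largest input size that would fit a latency budget. `fit_line`, `largest_input_within`, and the 1.0 s budget are illustrative assumptions, and three points are too few to treat the fit as more than an estimate.

```python
# Measurements quoted in the diff comment: (candidate count, seconds), taken from the NAS.
MEASURED = [(50, 0.60), (100, 0.95), (200, 1.89)]


def fit_line(points: list[tuple[float, float]]) -> tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def largest_input_within(budget_sec: float, slope: float, intercept: float) -> int:
    """Largest candidate count whose predicted latency stays within the budget."""
    return max(0, int((budget_sec - intercept) / slope))


slope, intercept = fit_line(MEASURED)          # seconds per candidate, fixed overhead
cap_for_1s = largest_input_within(1.0, slope, intercept)  # hypothetical 1.0 s budget
```

Under this fit each extra candidate costs a little under 9 ms on top of a fixed overhead of roughly 0.13 s, which is consistent with the recommendation to lower `MAX_RERANK_INPUT` for NAS deployments.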
@@ -18,6 +18,7 @@ from models.document_read import DocumentRead
 from models.study_concept_progress import StudyConceptProgress
 from models.study_question_progress import StudyQuestionProgress
 from models.study_topic import StudyTopic
 from services.study.concept_parser import parse_concept, resolve_related
 from services.study.sr_schedule import advance, first_due
 # Concept row lookup: filter concept documents by tag + LEFT JOIN read progress. md_content is not transferred and
@@ -205,3 +206,79 @@ async def mark_read(
    await session.commit()
    await session.refresh(prog)
    return {"ok": True, "review_stage": prog.review_stage, "due_at": prog.due_at}
 _CONCEPT_ONE_SQL = text(
    """
    SELECT d.id AS doc_id, d.title AS title, d.md_content AS md_content,
           split_part(replace(d.user_tags::text, '"', ''), '/', 3) AS subject,
           (d.md_content LIKE '%★★★%') AS f3,
           (d.md_content LIKE '%★★%')  AS f2,
           EXISTS (
             SELECT 1 FROM document_reads r
             WHERE r.document_id = d.id AND r.user_id = :uid
           ) AS is_read,
           p.review_stage AS review_stage,
           p.due_at AS due_at
    FROM documents d
    LEFT JOIN study_concept_progress p ON p.concept_doc_id = d.id AND p.user_id = :uid
    WHERE d.id = :doc_id AND d.deleted_at IS NULL AND d.user_tags::text LIKE :like
    """
 )
 async def concept_detail(
    session: AsyncSession, user_id: int, topic_id: int, doc_id: int
 ) -> dict | None:
    """Concept reader material: md structure parsing + related-concept backlink resolution + read/SR state + previous/next within the same subject."""
    name = await _topic_name(session, topic_id)
    if not name:
        return None
    like = f"%@library/{name}/%"
    row = (
        await session.execute(
            _CONCEPT_ONE_SQL, {"uid": user_id, "doc_id": doc_id, "like": like}
        )
    ).mappings().first()
    if row is None:
        return None
    parsed = parse_concept(row["md_content"] or "")
    # Backlink resolution + previous/next = title index of the same topic's concepts (reuses the read-progress rows)
    idx = await _concept_rows(session, user_id, name)
    title_index = [(r["doc_id"], r["title"], r["subject"]) for r in idx]
    resolved = resolve_related(parsed["related"], title_index)
    # previous/next = same subject, ordered by title
    same = sorted(
        [(r["doc_id"], r["title"]) for r in idx if r["subject"] == row["subject"]],
        key=lambda x: (x[1] or "", x[0]),
    )
    ids = [d for d, _ in same]
    prev_id = next_id = None
    if doc_id in ids:
        pos = ids.index(doc_id)
        if pos > 0:
            prev_id = ids[pos - 1]
        if pos < len(ids) - 1:
            next_id = ids[pos + 1]
    freq = 3 if row["f3"] else (2 if row["f2"] else 1)
    return {
        "doc_id": row["doc_id"],
        "db_title": row["title"],
        "title": parsed["title"] or row["title"],
        "subject": row["subject"],
        "freq": freq,
        "summary": parsed["summary"],
        "body": parsed["body"],
        "bincheol": parsed["bincheol"],
        "related": resolved,
        "is_read": row["is_read"],
        "review_stage": row["review_stage"],
        "due_at": row["due_at"],
        "prev_id": prev_id,
        "next_id": next_id,
    }
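The previous/next lookup inside `concept_detail` is a pure list operation, so it can be sketched without a database. `neighbors` is a hypothetical name and the sample rows are invented; the sort key and index arithmetic follow the diff.

```python
from __future__ import annotations


def neighbors(rows: list[dict], doc_id: int, subject: str) -> tuple[int | None, int | None]:
    """Hypothetical helper: (prev_id, next_id) among same-subject concepts, ordered by title."""
    same = sorted(
        [(r["doc_id"], r["title"]) for r in rows if r["subject"] == subject],
        key=lambda x: (x[1] or "", x[0]),  # title first, doc_id as tie-breaker
    )
    ids = [d for d, _ in same]
    if doc_id not in ids:
        return None, None
    pos = ids.index(doc_id)
    prev_id = ids[pos - 1] if pos > 0 else None
    next_id = ids[pos + 1] if pos < len(ids) - 1 else None
    return prev_id, next_id


# Invented rows: three concepts in subject "s1", one in "s2".
rows = [
    {"doc_id": 3, "title": "B", "subject": "s1"},
    {"doc_id": 1, "title": "A", "subject": "s1"},
    {"doc_id": 2, "title": "C", "subject": "s1"},
    {"doc_id": 9, "title": "A", "subject": "s2"},
]
middle = neighbors(rows, 3, "s1")
first = neighbors(rows, 1, "s1")
alone = neighbors(rows, 9, "s2")
```

Including `doc_id` in the sort key keeps the order stable when two concepts share a title or have none.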
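@@ -0,0 +1,139 @@
 """concept_links: theory↔question bridge rollup (Stage B).
 Joins study_concept_links (concept doc ↔ past-exam question, embedding cosine) with study_question_progress (my solving state)
 to produce (a) past-exam questions related to each concept plus my accuracy (related_questions) and (b) the concept weakness map (weakness_map).
 Read-only aggregation, zero LLM calls. Links are loaded in batch by scripts/concept_links_backfill.sql (embedding).
 Accuracy = based on progress.last_outcome over the linked questions (attempted = has a solving history, correct = most recent answer was correct).
 """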
 from __future__ import annotations
 from sqlalchemy import text
 from sqlalchemy.ext.asyncio import AsyncSession
 _ACCURACY_WEAK_PCT = 60  # accuracy < 60% = weak (only when attempted > 0)
 _AGG_SQL = text(
    """
    SELECT count(*) AS linked,
           count(pr.study_question_id) FILTER (WHERE pr.last_outcome IS NOT NULL) AS attempted,
           count(*) FILTER (WHERE pr.last_outcome = 'correct') AS correct
    FROM study_concept_links l
    LEFT JOIN study_question_progress pr
      ON pr.study_question_id = l.question_id AND pr.user_id = :uid
    WHERE l.concept_doc_id = :doc_id AND l.link_source = 'embedding'
    """
 )
 _QROWS_SQL = text(
    """
    SELECT q.id AS id, q.subject AS subject, q.exam_round AS exam_round,
           q.exam_question_number AS qnum, l.score AS score,
           pr.last_outcome AS last_outcome, pr.review_stage AS review_stage
    FROM study_concept_links l
    JOIN study_questions q ON q.id = l.question_id AND q.deleted_at IS NULL AND q.is_active
    LEFT JOIN study_question_progress pr
      ON pr.study_question_id = q.id AND pr.user_id = :uid
    WHERE l.concept_doc_id = :doc_id AND l.link_source = 'embedding'
    ORDER BY l.score DESC
    LIMIT :limit
    """
 )
 _WEAKNESS_SQL = text(
    """
    SELECT d.id AS doc_id, d.title AS title,
           split_part(replace(d.user_tags::text, '"', ''), '/', 3) AS subject,
           count(l.id) AS linked,
           count(pr.study_question_id) FILTER (WHERE pr.last_outcome IS NOT NULL) AS attempted,
           count(*) FILTER (WHERE pr.last_outcome = 'correct') AS correct
    FROM documents d
    JOIN study_concept_links l ON l.concept_doc_id = d.id AND l.link_source = 'embedding'
    LEFT JOIN study_question_progress pr
      ON pr.study_question_id = l.question_id AND pr.user_id = :uid
    WHERE d.user_tags::text LIKE :like AND d.deleted_at IS NULL
    GROUP BY d.id, d.title, subject
    """
 )
 async def related_questions(
    session: AsyncSession, user_id: int, doc_id: int, limit: int = 20
 ) -> dict:
    """Past-exam questions related to a concept doc + my accuracy (aggregated over all links; the top N rows are for display)."""
    agg = (
        await session.execute(_AGG_SQL, {"uid": user_id, "doc_id": doc_id})
    ).mappings().first()
    rows = (
        await session.execute(
            _QROWS_SQL, {"uid": user_id, "doc_id": doc_id, "limit": limit}
        )
    ).mappings().all()
    linked = (agg["linked"] if agg else 0) or 0
    attempted = (agg["attempted"] if agg else 0) or 0
    correct = (agg["correct"] if agg else 0) or 0
    accuracy = round(100 * correct / attempted) if attempted else None
    return {
        "linked": linked,
        "attempted": attempted,
        "correct": correct,
        "accuracy": accuracy,
        "questions": [
            {
                "id": r["id"],
                "subject": r["subject"],
                "exam_round": r["exam_round"],
                "qnum": r["qnum"],
                "score": round(r["score"], 3) if r["score"] is not None else None,
                "last_outcome": r["last_outcome"],
                "review_stage": r["review_stage"],
            }
            for r in rows
        ],
    }
 async def weakness_map(
    session: AsyncSession, user_id: int, topic_name: str, limit: int = 12
 ) -> dict:
    """Concept weakness map: colors each concept by the accuracy of its linked past-exam questions. Weak concepts (attempted > 0 and accuracy < 60%) are sorted first."""
    like = f"%@library/{topic_name}/%"
    rows = (
        await session.execute(_WEAKNESS_SQL, {"uid": user_id, "like": like})
    ).mappings().all()
    concepts = []
    for r in rows:
        attempted = r["attempted"] or 0
        correct = r["correct"] or 0
        accuracy = round(100 * correct / attempted) if attempted else None
        if accuracy is None:
            state = "unattempted"
        elif accuracy < _ACCURACY_WEAK_PCT:
            state = "weak"
        else:
            state = "ok"
        concepts.append(
            {
                "doc_id": r["doc_id"],
                "title": r["title"],
                "subject": r["subject"],
                "linked": r["linked"] or 0,
                "attempted": attempted,
                "accuracy": accuracy,
                "state": state,
            }
        )
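    # Weak concepts only, lowest accuracy first (ties: more attempts, then doc_id). Unattempted and ok concepts are not listed. Top N for the home widget.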
    weak = sorted(
        [c for c in concepts if c["state"] == "weak"],
        key=lambda c: (c["accuracy"], -c["attempted"], c["doc_id"]),
    )
    return {
        "weak": weak[:limit],
        "weak_total": len(weak),
        "evaluated_total": sum(1 for c in concepts if c["state"] != "unattempted"),
    }
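The accuracy-to-state rule in `weakness_map` can be read in isolation. The following is a minimal standalone sketch, not the module's code: `classify_concept` is a hypothetical helper, and the threshold of 60 is an assumption taken from the docstring (the module reads it from `_ACCURACY_WEAK_PCT`, whose value is not shown in this diff).

```python
def classify_concept(attempted: int, correct: int, weak_pct: int = 60):
    """Return (accuracy, state) following the rule used by weakness_map.

    weak_pct=60 is an assumption taken from the docstring, not from the
    actual _ACCURACY_WEAK_PCT constant.
    """
    # No attempts means there is nothing to grade yet.
    if not attempted:
        return None, "unattempted"
    accuracy = round(100 * correct / attempted)
    state = "weak" if accuracy < weak_pct else "ok"
    return accuracy, state
```

Under these assumptions a concept sitting exactly on the threshold counts as `ok`, because the comparison is strict.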
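@@ -0,0 +1,175 @@
 """concept_parser: structural parser for concept-note markdown, plus backlink resolution for related concepts (used by the theory reader).
 Invariant measured during reconnaissance (273/273): every concept note follows a fixed skeleton:
    # {H1 title}                    (first line; a display title that differs from the DB title)
    > **한 줄 요약**: {summary}       (blockquote with a fixed label, "one-line summary")
    ## {body label}  ...            (BODY: zero or more free-label H2s, may carry trailing ★)
    ## 빈출 포인트                    ("frequently tested points"; always present, right before related concepts)
    ## 관련 개념                      ("related concepts"; always present, last section of the document)
 ## and - inside code fences (``` ASCII diagrams) are ignored. Trailing ★ on headings is stripped (label normalization).
 Only the '빈출 포인트' and '관련 개념' anchors are matched by name; the rest of BODY is handled by order and position (no label whitelist).
 Pure functions, zero LLM calls.
 """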
 from __future__ import annotations
 import re
 _FENCE = re.compile(r"^\s*```")
 _H1 = re.compile(r"^#\s+(.+?)\s*$")
 _H2 = re.compile(r"^##\s+(.+?)\s*$")  # does not match ### (\s is required right after ##)
 _SUMMARY = re.compile(r"^>\s*\*\*한 줄 요약\*\*:\s*(.+)$")
 _STAR_SUFFIX = re.compile(r"\s*★+\s*$")
 _TRAIL_STARS = re.compile(r"★+\s*$")
 _BINCHEOL_ITEM = re.compile(r"^\s*-\s+(★*)\s*(.+)$")
 _RELATED_ITEM = re.compile(r"^\s*-\s+(.+)$")
 _PAREN = re.compile(r"\s*\(.*$")  # from the first "(" to end of line (cuts off the clarifier hint)
 _NUM_PREFIX = re.compile(r"^\d+_")
 _STRIP_SYM = re.compile(r"[\s_·,./()\-]")
 _ANCHOR_BINCHEOL = "빈출 포인트"
 _ANCHOR_RELATED = "관련 개념"
 def parse_concept(md: str) -> dict:
    """Concept-note md → {title, summary, body[{label,stars,md}], bincheol[{tier,text}], related[{raw,phrase,hint}]}."""
    lines = (md or "").split("\n")
    title: str | None = None
    summary: str | None = None
    body: list[dict] = []
    bincheol_lines: list[str] = []
    related_lines: list[str] = []
    in_fence = False
    zone = "pre"  # pre | body | bincheol | related
    body_cur: dict | None = None
    def emit(line: str) -> None:
        if body_cur is not None:
            body_cur["_lines"].append(line)
        elif zone == "bincheol":
            bincheol_lines.append(line)
        elif zone == "related":
            related_lines.append(line)
        # pre-zone content (noise before the summary) is discarded
    for ln in lines:
        if _FENCE.match(ln):
            in_fence = not in_fence
            emit(ln)
            continue
        if in_fence:
            emit(ln)
            continue
        if title is None:
            m = _H1.match(ln)
            if m:
                title = m.group(1).strip()
                continue
        if summary is None:
            m = _SUMMARY.match(ln)
            if m:
                summary = m.group(1).strip()
                continue
        m2 = _H2.match(ln)
        if m2:
            raw_label = m2.group(1).strip()
            star_m = _TRAIL_STARS.search(raw_label)
            stars = len(star_m.group(0).strip()) if star_m else 0
            label = _STAR_SUFFIX.sub("", raw_label).strip()
            if label == _ANCHOR_BINCHEOL:
                zone = "bincheol"
                body_cur = None
                continue
            if label == _ANCHOR_RELATED:
                zone = "related"
                body_cur = None
                continue
            body_cur = {"label": label, "stars": stars, "_lines": []}
            body.append(body_cur)
            zone = "body"
            continue
        emit(ln)
    body_out = []
    for s in body:
        text = "\n".join(s["_lines"]).strip()
        if text or s["label"]:
            body_out.append({"label": s["label"], "stars": s["stars"], "md": text})
    bincheol = []
    for ln in bincheol_lines:
        m = _BINCHEOL_ITEM.match(ln)
        if m:
            bincheol.append({"tier": len(m.group(1)), "text": m.group(2).strip()})
    related = []
    for ln in related_lines:
        m = _RELATED_ITEM.match(ln)
        if m:
            raw = m.group(1).strip()
            phrase = _PAREN.sub("", raw).strip()
            hint = raw[len(phrase):].strip() if len(raw) > len(phrase) else ""
            if phrase:
                related.append({"raw": raw, "phrase": phrase, "hint": hint})
    return {
        "title": title,
        "summary": summary,
        "body": body_out,
        "bincheol": bincheol,
        "related": related,
    }
 def _normalize(s: str) -> str:
    """Normalization for resolution: drop the NN_ prefix → lowercase → remove whitespace and symbols. Latin text is kept, lowercased."""
    s = _NUM_PREFIX.sub("", s or "")
    s = s.lower()
    s = _STRIP_SYM.sub("", s)
    return s
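 def resolve_related(related: list[dict], title_index: list[tuple]) -> list[dict]:
    """Resolve related-concept phrases to concept docs. title_index = [(doc_id, title, subject), ...].
    Multi-stage fallback (reconnaissance: ~79%): normalized exact match → one-directional substring (normalized title inside normalized phrase, both ≥2 chars) → unresolved = dangling (doc_id None).
    """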
    norm_exact: dict[str, int] = {}
    norm_list: list[tuple[str, int, str]] = []
    for did, ttl, _subj in title_index:
        n = _normalize(ttl)
        if n:
            norm_exact.setdefault(n, did)
            norm_list.append((n, did, ttl))
    out = []
    for it in related:
        pn = _normalize(it["phrase"])
        did: int | None = None
        rtitle: str | None = None
        if pn and len(pn) >= 2:
            if pn in norm_exact:
                did = norm_exact[pn]
            else:
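                # Substring fallback: only the direction title-norm ⊆ phrase-norm (prevents mis-linking where a
                # short phrase swallows a longer title, e.g. '염산' → '염산나트륨' is rejected), smallest length
                # difference wins (most specific), doc_id breaks ties (order-independent determinism). No candidate = dangling (doc_id None).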
                cands = [
                    (abs(len(n) - len(pn)), cand, ttl)
                    for n, cand, ttl in norm_list
                    if len(n) >= 2 and n in pn
                ]
                if cands:
                    cands.sort(key=lambda c: (c[0], c[1]))
                    _, did, rtitle = cands[0]
        if did is not None and rtitle is None:
            rtitle = next((t for d, t, _ in title_index if d == did), None)
        out.append(
            {"phrase": it["phrase"], "hint": it["hint"], "doc_id": did, "title": rtitle}
        )
    return out
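The two-stage resolution described in `resolve_related` (normalized exact match, then title-inside-phrase substring) can be sketched on its own. This is a simplified illustration under stated assumptions: `normalize` and `resolve` are hypothetical stand-ins that return only a doc id, the titles are invented English examples, and the two regexes are copied from the module.

```python
from __future__ import annotations

import re

_NUM_PREFIX = re.compile(r"^\d+_")
_STRIP_SYM = re.compile(r"[\s_·,./()\-]")


def normalize(s: str) -> str:
    # Drop an "NN_" prefix, lowercase, then remove whitespace and symbols.
    return _STRIP_SYM.sub("", _NUM_PREFIX.sub("", s or "").lower())


def resolve(phrase: str, titles: list[tuple[int, str]]) -> int | None:
    """Exact normalized match first, then title-inside-phrase substring."""
    pn = normalize(phrase)
    if len(pn) < 2:
        return None
    normed = [(normalize(t), did) for did, t in titles]
    for n, did in normed:
        if n == pn:
            return did
    # One direction only: the title must be contained in the phrase.
    # Smallest length difference wins; doc id breaks ties.
    cands = [(len(pn) - len(n), did) for n, did in normed if len(n) >= 2 and n in pn]
    return min(cands)[1] if cands else None


# Hypothetical title index: (doc_id, title)
titles = [(1, "01_Heat Transfer"), (2, "Heat"), (3, "Heat Exchanger Design")]
```

With this index, `"Heat transfer basics"` resolves to doc 1 because `heattransfer` is the closest contained title, while `"Heat exchanger"` never resolves to doc 3: a phrase shorter than a title cannot match it in this direction.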
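@@ -0,0 +1,224 @@
 """summarize_units: pure functions that split giant documents into summarization-only map-reduce units (presegment PR1).
 plan ds-presegment-mapreduce-2 (design agreed 2026-06-29, sealed with PR0 measurements):
  - CAP_TOKENS = 12,000 tok/unit: greedy-pack upper bound (PR0: calibrated on 236 measured giant documents)
  - TRIGGER_TOKENS = 25,000 tok: at or below this the single call is kept; above it, map-reduce
  - 3-way over% gate (token share of sections that exceed CAP on their own; heading count is meaningless, e.g. ASME has 1,494):
      over% == 0        → 'auto'   (TIER1: automatic local split, 78% in PR0 measurements)
      0 < over% <= 40   → 'hybrid' (packed part local + only oversized sections to Claude, 8%)
      over% > 40        → 'whole'  (TIER2: Claude splits the whole document, 14%)
  - Token estimate = PR0 calibration against the real Qwen tokenizer: Hangul 0.529 tok/char, other 0.217.
    The old heuristic (0.625/0.25) overestimated by ~15% and was dropped.
 Invariants:
  - Pure functions: no DB/network/file access. A split is a summarization-only artifact (not a document; never enters search/embedding).
  - Leaf extraction reuses hier_decomp.builder with leaf_hard_max=∞ to suppress window-splitting
    (heading leaves only, same as the PR0 measurement environment). Only adjacent sections are greedy-packed (order preserved, nothing dropped in the middle;
    full coverage replaces the old deep_summary head/mid/tail bug that discarded the middle).
  - Wiring (deep_summary branch, HOLD, Claude alert) belongs to PR2/PR3; this module only produces the plan.
 Call: plan_summarize_units(md_text) -> UnitPlan
 """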
 from __future__ import annotations
 import sys
 from dataclasses import dataclass, field
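 # Relative import: works both in the container (services.*) and in repo-root tests (app.services.*).
 # (The old absolute import `from app.services...` raised ModuleNotFoundError because the container has no app package.
 #  The bug stayed latent while PR1 had zero consumers and was fixed when PR2 wired the module in.)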
 from .hier_decomp.builder import HierNode, build_hier_tree
 CAP_TOKENS = 12_000
 TRIGGER_TOKENS = 25_000
 HYBRID_MAX_OVER_PCT = 40.0
 # PR0 calibration against the real Qwen tokenizer (tok/char)
 KO_TOK_PER_CHAR = 0.529
 OTHER_TOK_PER_CHAR = 0.217
 _HANGUL_RANGES = (
    (0xAC00, 0xD7A3),  # precomposed syllables
    (0x1100, 0x11FF),  # Jamo
    (0x3130, 0x318F),  # compatibility Jamo
 )
 def _is_hangul(ch: str) -> bool:
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in _HANGUL_RANGES)
 def estimate_tokens(text: str) -> int:
    """Token estimate based on the PR0 calibration (Hangul 0.529, other 0.217 tok/char)."""
    if not text:
        return 0
    ko = sum(1 for ch in text if _is_hangul(ch))
    other = len(text) - ko
    return round(ko * KO_TOK_PER_CHAR + other * OTHER_TOK_PER_CHAR)
@dataclass
 class SummarizeUnit:
    """One map-reduce unit: a greedy pack of adjacent leaf sections (summarization only, not a document)."""
    index: int
    section_titles: list[str | None] = field(default_factory=list)
    text: str = ""
    est_tokens: int = 0
    over_cap: bool = False  # a single section exceeds CAP on its own (sent to Claude in hybrid)
@dataclass
 class UnitPlan:
    mode: str                    # 'single' | 'map_reduce'
    tier: str | None             # when map_reduce: 'auto' | 'hybrid' | 'whole'
    total_est_tokens: int = 0
    over_pct: float = 0.0
    units: list[SummarizeUnit] = field(default_factory=list)
 def extract_leaves(md_text: str) -> list[HierNode]:
    """Extract heading leaves only; leaf_hard_max=∞ suppresses window-splitting (same as the PR0 measurement environment)."""
    nodes = build_hier_tree(
        md_text,
        leaf_target_max=sys.maxsize,
        leaf_hard_max=sys.maxsize,
    )
    return [n for n in nodes if n.is_leaf]
 def greedy_pack(leaves: list[HierNode], cap: int = CAP_TOKENS) -> list[SummarizeUnit]:
    """Pack adjacent leaves in order so that est_tokens <= cap. A leaf that exceeds cap on its own becomes a dedicated unit (over_cap)."""
    units: list[SummarizeUnit] = []
    cur_titles: list[str | None] = []
    cur_texts: list[str] = []
    cur_tokens = 0
    def _flush() -> None:
        nonlocal cur_titles, cur_texts, cur_tokens
        if cur_texts:
            units.append(SummarizeUnit(
                index=len(units),
                section_titles=cur_titles,
                text="\n\n".join(cur_texts),
                est_tokens=cur_tokens,
            ))
            cur_titles, cur_texts, cur_tokens = [], [], 0
    for leaf in leaves:
        t = estimate_tokens(leaf.text)
        if t > cap:
            _flush()
            units.append(SummarizeUnit(
                index=len(units),
                section_titles=[leaf.section_title],
                text=leaf.text,
                est_tokens=t,
                over_cap=True,
            ))
            continue
        if cur_tokens + t > cap:
            _flush()
        cur_titles.append(leaf.section_title)
        cur_texts.append(leaf.text)
        cur_tokens += t
    _flush()
    return units
 def over_pct(leaves: list[HierNode], cap: int = CAP_TOKENS) -> float:
    """Token share (%) of sections that exceed CAP on their own; input to the 3-way gate."""
    total = 0
    over = 0
    for leaf in leaves:
        t = estimate_tokens(leaf.text)
        total += t
        if t > cap:
            over += t
    if total == 0:
        return 0.0
    return over * 100.0 / total
 def gate(over: float) -> str:
    """over% → tier. 0 = auto / (0, 40] = hybrid / > 40 = whole. Also reused to re-validate Claude's result."""
    if over <= 0.0:
        return "auto"
    if over <= HYBRID_MAX_OVER_PCT:
        return "hybrid"
    return "whole"
 def plan_summarize_units(
    md_text: str, *,
    cap: int = CAP_TOKENS,
    trigger: int = TRIGGER_TOKENS,
 ) -> UnitPlan:
    """Document → summarization plan. At or below trigger = single (the existing single call); above = map_reduce (tier + units)."""
    total = estimate_tokens(md_text)
    if total <= trigger:
        return UnitPlan(mode="single", tier=None, total_est_tokens=total)
    leaves = extract_leaves(md_text)
    pct = over_pct(leaves, cap)
    return UnitPlan(
        mode="map_reduce",
        tier=gate(pct),
        total_est_tokens=total,
        over_pct=round(pct, 2),
        units=greedy_pack(leaves, cap),
    )
 # ─── PR2: pure functions that assemble map/reduce prompts (consumed by deep_summary_worker) ───
 def render_map_slice(unit: SummarizeUnit, total_units: int) -> str:
    """Replacement for {original_text_slices} in a map call: unit position and section labels, then the body."""
    titles = " · ".join(t for t in unit.section_titles if t) or "(무제 구간)"
    return f"[유닛 {unit.index + 1}/{total_units} — 섹션: {titles}]\n{unit.text}"
 def _format_unit_summary(res: dict, total_units: int) -> str:
    """One map result → reduce input block. res keys = index/titles/tldr/detail/inconsistencies."""
    titles = " · ".join(t for t in (res.get("titles") or []) if t) or "(무제 구간)"
    lines = [f"[유닛 {int(res.get('index', 0)) + 1}/{total_units} — 섹션: {titles}]"]
    if res.get("tldr"):
        lines.append(f"TLDR: {res['tldr']}")
    if res.get("detail"):
        lines.append(str(res["detail"]))
    for inc in res.get("inconsistencies") or []:
        if isinstance(inc, dict):
            lines.append(f"불일치({inc.get('kind', '')}): {inc.get('desc', '')}")
    return "\n".join(lines)
 def build_reduce_units_block(
    results: list[dict],
    budget_tokens: int,
    *,
    min_detail_chars: int = 200,
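 ) -> tuple[str, bool]:
    """Assemble the reduce input block and keep it at or below budget_tokens (the reduce side of the "zero calls over cap" verification gate).
    When over budget, only detail is cut proportionally (labels, TLDR and inconsistencies are kept, original order preserved). Returns (block, truncated).
    """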
    total_units = len(results)
    work = [dict(r) for r in results]
    truncated = False
    for _ in range(4):
        block = "\n\n".join(_format_unit_summary(r, total_units) for r in work)
        est = estimate_tokens(block)
        if est <= budget_tokens:
            return block, truncated
        ratio = budget_tokens / est
        for r in work:
            detail = str(r.get("detail") or "")
            keep = max(min_detail_chars, int(len(detail) * ratio * 0.9))
            if len(detail) > keep:
                r["detail"] = detail[:keep] + "…(절단)"
                truncated = True
    # Last resort: if proportional truncation is stopped by the floor (min_detail_chars), hard-cut by characters (assuming the worst-case Hangul ratio)
    block = "\n\n".join(_format_unit_summary(r, total_units) for r in work)
    if estimate_tokens(block) > budget_tokens:
        block = block[: max(1, int(budget_tokens / KO_TOK_PER_CHAR))]
        truncated = True
    return block, truncated
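The planning pipeline above (estimate tokens, measure the oversized share, pick a tier, greedy-pack) can be traced with plain numbers. The sketch below is a simplified re-statement under stated assumptions, not the module itself: it works on precomputed per-section token counts instead of `HierNode` leaves, counts only precomposed Hangul syllables as Korean, and the function names mirror the module purely for readability.

```python
KO_TOK_PER_CHAR = 0.529
OTHER_TOK_PER_CHAR = 0.217
CAP = 12_000


def estimate_tokens(text: str) -> int:
    # Sketch assumption: only precomposed Hangul syllables count as Korean.
    ko = sum(1 for ch in text if 0xAC00 <= ord(ch) <= 0xD7A3)
    return round(ko * KO_TOK_PER_CHAR + (len(text) - ko) * OTHER_TOK_PER_CHAR)


def over_pct(tokens, cap=CAP):
    # Share of tokens that live in sections exceeding cap on their own.
    total = sum(tokens)
    return sum(t for t in tokens if t > cap) * 100.0 / total if total else 0.0


def gate(over):
    if over <= 0.0:
        return "auto"
    return "hybrid" if over <= 40.0 else "whole"


def pack(tokens, cap=CAP):
    """Greedy-pack adjacent sections; returns [(section_tokens, over_cap), ...]."""
    units, cur = [], []
    for t in tokens:
        if t > cap:
            # An oversized section closes the current unit and stands alone.
            if cur:
                units.append((cur, False))
                cur = []
            units.append(([t], True))
            continue
        if sum(cur) + t > cap:
            units.append((cur, False))
            cur = []
        cur.append(t)
    if cur:
        units.append((cur, False))
    return units


# Hypothetical document: five sections with these estimated token counts.
sections = [5000, 5000, 5000, 13000, 1000]
```

For this hypothetical document the single 13,000-token section holds 13,000 of 29,000 tokens (about 44.8%), so the gate returns `whole`. The packer keeps the original order and never merges sections across the oversized one.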
@@ -10,7 +10,9 @@ EscalationEnvelope + subject_domain 을 읽어, PR-A policy 템플릿 `p3c_deep_
 from __future__ import annotations
 import asyncio
 import json
 import os
 import time
 from datetime import datetime, timezone
@@ -29,10 +31,25 @@ from models.queue import ProcessingQueue, StageDeferred
 from policy.prompt_render import render_26b, policy_version as compute_policy_version
 from services.document_telemetry import record_analyze_event
 from services.search.llm_gate import Priority, acquire_mlx_gate
 from services.summarize_units import (
    CAP_TOKENS,
    UnitPlan,
    build_reduce_units_block,
    estimate_tokens,
    plan_summarize_units,
    render_map_slice,
 )
 logger = setup_logger("deep_summary_worker")
 DEEP_SUMMARY_TASK = "p3c_deep_summary"
 # presegment PR2 (plan ds-presegment-mapreduce-2): map-reduce for giant documents
 REDUCE_TASK = "p3c_deep_summary_reduce"
 # HOLD recheck interval for HYBRID/TIER2 (which need a human-attended Claude split). Until PR3 (alert and boundary
 # injection) the worker only re-plans at this interval. StageDeferred does not consume attempts, so the row never becomes permanently failed.
 HOLD_RETRY_MINUTES = int(os.getenv("DEEP_SUMMARY_HOLD_RETRY_MINUTES", "1440"))
 # Even if the reduce prompt overhead is abnormally large, the unit-block budget never drops below this (defensive).
 REDUCE_BUDGET_FLOOR_TOKENS = 1_000
 # Allowed inconsistency kinds (feedback_document_server_domain_scope.md; purchasing/contracts excluded)
 ALLOWED_INCONSISTENCY_KINDS = {
@@ -94,6 +111,25 @@ async def process(
    envelope = EscalationEnvelope.from_json(json.dumps(envelope_raw))
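    # ─── presegment PR2 gate (plan ds-presegment-mapreduce-2) ───
    # At or below TRIGGER (25K tok) = the existing single-call path below, unchanged (no regression). Above it, 3-way:
    #   auto(over%==0)   → local map-reduce (26B per unit → reduce)
    #   hybrid/whole     → HOLD(awaiting_split): nothing is sent to the Mac mini; the human-attended Claude split is PR3
    # Gate and units are computed over the full extracted_text, replacing the single call's head/mid/tail
    # "drop the middle" behavior with coverage of every unit. build_hier_tree takes seconds of CPU on a giant md,
    # so it runs in to_thread to avoid blocking the event loop (same pattern as presegment_worker._read_toc).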
    unit_plan = await asyncio.to_thread(plan_summarize_units, doc.extracted_text or "")
    if unit_plan.mode == "map_reduce":
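        # An auto plan with empty units should be impossible (non-empty text = at least one leaf), but it is
        # sent to HOLD defensively so it never flows into an empty reduce call (hallucination risk).
        # _hold_awaiting_split always raises StageDeferred, so the call below runs only for auto plans with units.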
        if unit_plan.tier != "auto" or not unit_plan.units:
            await _hold_awaiting_split(session, queue_row, unit_plan, document_id)
        await _process_map_reduce(
            doc, queue_row, envelope, subject_domain, unit_plan, session,
            defer_on_deep_unavailable=defer_on_deep_unavailable,
        )
        return
    # 원문 슬라이스 추출 (envelope.original_pointers.text_ranges 기반)
    slices = _build_text_slices(doc.extracted_text or "", envelope.original_pointers)
@@ -214,6 +250,267 @@ async def process(
    )
 async def _hold_awaiting_split(
    session: AsyncSession, queue_row: ProcessingQueue, plan: UnitPlan, document_id: int
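 ) -> None:
    """HYBRID/TIER2: wait for a human-attended Claude split (HOLD). Nothing is sent to the Mac mini; the stage is deferred via StageDeferred.
    The payload.presegment.awaiting_split mark is committed first. The StageDeferred handler
    (queue_consumer) re-reads the row in a new session and merges only deferred_until, so the mark is not lost.
    Alerting (ntfy) and Claude boundary injection belong to PR3; until then the worker only re-plans every HOLD_RETRY_MINUTES.
    Complies with the rule against unattended automatic cloud calls (the Claude path is always behind a human gate).
    """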
    payload = dict(queue_row.payload or {})
    preseg = dict(payload.get("presegment") or {})
    preseg.update({
        "awaiting_split": True,
        "tier": plan.tier,
        "over_pct": plan.over_pct,
        "total_est_tokens": plan.total_est_tokens,
        "units": len(plan.units),
        # Sample of oversized sections that Claude must split (for the PR3 alert body)
        "oversized_sections": [
            (u.section_titles[0] if u.section_titles else None)
            for u in plan.units if u.over_cap
        ][:20],
    })
    payload["presegment"] = preseg
    queue_row.payload = payload  # reassignment = JSONB change detection
    await session.commit()
    logger.info(
        f"[deep] id={document_id} awaiting_split tier={plan.tier} over_pct={plan.over_pct} "
        f"total_est_tokens={plan.total_est_tokens} units={len(plan.units)} "
        f"→ HOLD ({HOLD_RETRY_MINUTES}분 후 재확인, 클로드 분할=PR3 유인)"
    )
    raise StageDeferred(
        f"awaiting_split:{plan.tier}", retry_after_minutes=HOLD_RETRY_MINUTES
    )
 async def _call_26b(
    client: AIClient, prompt: str, *, defer_on_deep_unavailable: bool, document_id: int
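 ):
    """Shared 26B call for map and reduce: deep slot first, then fair-share fallback, same as the single-call path.
    Returns (raw, used_cfg). When the MacBook (deep) is unavailable, the consumer path processes immediately on the
    Mac mini primary (same model, not a downgrade), while the drain path propagates StageDeferred (MacBook lever semantics).
    """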
    deep_cfg = client.ai.deep
    if deep_cfg is not None:
        try:
            return await call_deep_or_defer(client, prompt), deep_cfg
        except StageDeferred:
            if defer_on_deep_unavailable:
                raise
            logger.info(f"[deep] id={document_id} 맥북 불가 → 맥미니 primary 처리 (fair-share)")
    async with acquire_mlx_gate(Priority.BACKGROUND):
        return await client.call_primary(prompt), settings.ai.primary
 def _parse_deep_output(raw: str) -> tuple[DeepSummaryOutput | None, str | None]:
    """raw → DeepSummaryOutput. Same 3-stage parser as the single-call path. On failure returns (None, parse_error)."""
    try:
        parsed = _parse_outermost_json(raw) or parse_json_response(raw)
        if not parsed:
            parsed = _regex_extract_fields(raw)
        return DeepSummaryOutput.model_validate(parsed or {}), None
    except (ValidationError, ValueError, TypeError) as exc:
        return None, f"parse:{type(exc).__name__}"
 async def _process_map_reduce(
    doc: Document,
    queue_row: ProcessingQueue,
    envelope: EscalationEnvelope,
    subject_domain: str,
    plan: UnitPlan,
    session: AsyncSession,
    *,
    defer_on_deep_unavailable: bool,
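 ) -> None:
    """TIER1 automatic: per-unit map (26B) → reduce (26B) → writes the same fields as the single call.
    Idempotent resume: each successful unit is committed immediately to payload.presegment.map_results.
    After a 502, a defer, or a restart, a re-claim skips completed units. Unit indices are stable across
    attempts because plan_summarize_units is deterministic for the same extracted_text.
    If any unit fails to parse, this raises so queue_consumer's existing attempts/backoff are reused
    (only failed units are called again, so the retry cost is just the remaining units).
    """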
    document_id = doc.id
    units = plan.units
    n = len(units)
    payload = dict(queue_row.payload or {})
    preseg = dict(payload.get("presegment") or {})
    preseg.pop("awaiting_split", None)  # 재계획으로 auto 가 된 경우 HOLD 마킹 해제
    map_results: dict = dict(preseg.get("map_results") or {})
    logger.info(
        f"[deep] id={document_id} map_reduce 시작 units={n} over_pct={plan.over_pct} "
        f"total_est_tokens={plan.total_est_tokens} resume={len(map_results)}/{n}"
    )
    rendered = render_26b(DEEP_SUMMARY_TASK, subject_domain)
    envelope_injection = envelope.to_system_injection()
    client = AIClient()
    start = time.perf_counter()
    used_cfg = client.ai.deep or settings.ai.primary
    failed_units: list[int] = []
    try:
        # ── map: 26B per unit (the gate is released between calls so short interactive requests can cut in) ──
        for unit in units:
            key = str(unit.index)
            if key in map_results:
                continue
            prompt = (
                rendered
                .replace("{escalation_envelope_json}", envelope_injection)
                .replace("{original_text_slices}", render_map_slice(unit, n))
            )
            # Logged so the verification gate "zero LLM calls over cap" can be asserted from the logs.
            logger.info(
                f"[deep] id={document_id} map {unit.index + 1}/{n} "
                f"unit_tokens={unit.est_tokens} prompt_est_tokens={estimate_tokens(prompt)} "
                f"cap={CAP_TOKENS}"
            )
            raw, used_cfg = await _call_26b(
                client, prompt,
                defer_on_deep_unavailable=defer_on_deep_unavailable,
                document_id=document_id,
            )
            out, perr = _parse_deep_output(raw)
            if out is None or not (out.detail or out.tldr):
                # Failed units are not persisted, so a retry calls only these units again.
                failed_units.append(unit.index)
                logger.warning(
                    f"[deep] id={document_id} map {unit.index + 1}/{n} 결과 비었음/파싱 실패"
                    f"({perr}) — 유닛 재시도 대상"
                )
                continue
            # ★ Rebuild a new dict for every unit (no in-place mutation). If the committed snapshot from the
            # previous commit references the same nested object, old == new and SQLAlchemy skips the UPDATE
            # (fix for the aliasing bug seen live on 60254, where only unit 0 was persisted).
            map_results = {
                **map_results,
                key: {
                    "index": unit.index,
                    "titles": [t for t in unit.section_titles if t][:8],
                    "tldr": out.tldr,
                    "detail": out.detail,
                    "inconsistencies": _filter_inconsistencies(out.inconsistencies or []),
                },
            }
            preseg = {
                **preseg,
                "tier": plan.tier,
                "over_pct": plan.over_pct,
                "total_est_tokens": plan.total_est_tokens,
                "units": n,
                "map_results": map_results,
            }
            payload = {**payload, "presegment": preseg}
            queue_row.payload = payload  # reassignment = JSONB change detection
            await session.commit()  # per-unit idempotent resume point
        if failed_units:
            raise ValueError(
                f"map 유닛 {len(failed_units)}/{n}건 결과 없음 — 재시도 대상: {failed_units[:10]}"
            )
        # ── reduce: one call that summarizes the summaries (the unit block is also truncated to stay under the cap) ──
        reduce_rendered = render_26b(REDUCE_TASK, subject_domain)
        base_prompt = (
            reduce_rendered
            .replace("{escalation_envelope_json}", envelope_injection)
            .replace("{unit_count}", str(n))
        )
        budget = max(
            REDUCE_BUDGET_FLOOR_TOKENS, CAP_TOKENS - estimate_tokens(base_prompt)
        )
        ordered = [map_results[str(u.index)] for u in units]
        block, reduce_truncated = build_reduce_units_block(ordered, budget)
        reduce_prompt = base_prompt.replace("{unit_summaries}", block)
        logger.info(
            f"[deep] id={document_id} reduce units={n} "
            f"prompt_est_tokens={estimate_tokens(reduce_prompt)} cap={CAP_TOKENS} "
            f"truncated={reduce_truncated}"
        )
        raw, used_cfg = await _call_26b(
            client, reduce_prompt,
            defer_on_deep_unavailable=defer_on_deep_unavailable,
            document_id=document_id,
        )
    except StageDeferred:
        logger.info(
            f"[deep] id={document_id} map_reduce 보류 — 완료 유닛 {len(map_results)}/{n} 보존"
        )
        raise
    except Exception as exc:
        # Same as the single-call path: call failures propagate so queue_consumer handles retry/dead-letter.
        logger.warning(f"[deep] id={document_id} map_reduce 실패: {exc}")
        raise
    finally:
        await client.close()
    latency_ms = int((time.perf_counter() - start) * 1000)
    deep_out, parse_error = _parse_deep_output(raw)
    if deep_out is None:
        # Same semantics as the single-call path: the doc is not written (legacy result kept); the event makes the failure visible.
        deep_out = DeepSummaryOutput()
        logger.warning(f"[deep] id={document_id} reduce 파싱 실패 ({parse_error}) — doc 미기록")
    if not parse_error:
        doc.ai_detail_summary = (deep_out.detail or "").strip() or None
        # Inconsistencies = reduce output + per-unit map results, merged and deduplicated, so unit findings survive even if reduce drops them.
        merged = _filter_inconsistencies(deep_out.inconsistencies or [])
        seen = {(i["kind"], i["desc"]) for i in merged}
        for res in ordered:
            for inc in res.get("inconsistencies") or []:
                k = (inc.get("kind"), inc.get("desc"))
                if k not in seen:
                    seen.add(k)
                    merged.append(inc)
        doc.ai_inconsistencies = merged
        doc.ai_analysis_tier = "deep"
        doc.ai_processed_at = datetime.now(timezone.utc)
    try:
        pv = compute_policy_version(REDUCE_TASK)
    except Exception:
        pv = None
    await record_analyze_event(
        doc_id=document_id,
        user_id=None,
        mode="summary_deep",
        text_limit=used_cfg.context_char_limit or 260000,
        truncated=reduce_truncated,
        layers_returned=["detail_summary", "inconsistencies"] if not parse_error else [],
        cached=False,
        latency_ms=latency_ms,
        model_name=used_cfg.model,
        prompt_version=(f"{REDUCE_TASK}@{pv}" if pv else REDUCE_TASK),
        error_code=parse_error,
        source="document_server",
        subject_domain=subject_domain,
        risk_flags=list(envelope.risk_flags),
        high_impact_task=None,
        escalation_reasons=list(envelope.escalation_reasons),
        confidence=deep_out.confidence,
        policy_version=pv,
        shadow_would_route_to="primary",
        tier="primary",
        escalated_to_26b=True,
        suppressed_reason=None,
    )
    logger.info(
        f"[deep] id={document_id} map_reduce 완료 units={n} "
        f"detail_len={len(doc.ai_detail_summary or '')} inc={len(doc.ai_inconsistencies or [])} "
        f"latency_ms={latency_ms} parse_error={parse_error}"
    )
 def _build_text_slices(text: str, pointers: dict) -> str:
    """Join the [{start, end}] ranges in original_pointers.text_ranges into actual body-text slices.
@@ -110,6 +110,11 @@ def _get_pdf_page_count(
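 async def _call_ocr(file_path: Path, is_image: bool, max_pages: int = 200) -> str | None:
    """Call the OCR service; the timeout scales with the page count."""
    if not settings.ocr_enabled:
        # 2-node migration (2026-07-02): GPU Surya retired, so OCR is explicitly disabled. Returning None keeps the existing
        # soft-fail semantics (the caller records ocr_attempted/skip_reason meta). Scanned documents go through the separate vision batch path.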
        logger.warning("[ocr] OCR_ENABLED=false — skip (스캔·이미지 추출은 비전 배치 경로)")
        return None
    container_path = f"/documents/{file_path.relative_to(Path(settings.nas_mount_path))}"
    timeout = 60 if is_image else min(600, max(120, max_pages * 3))
    try:
@@ -42,6 +42,14 @@ async def process(document_id: int, session: AsyncSession) -> None:
        logger.warning(f"[stt] id={document_id} file_path 없음 — skip")
        return
    if not settings.stt_enabled:
        # 2-node migration (2026-07-02): GPU stt-service retired, so STT is explicitly disabled. Never silent:
        # warning log + terminal record in extract_meta (no retry, state stays visible).
        doc.extract_meta = {**(doc.extract_meta or {}), "stt_skip_reason": "disabled", "stt_terminal": True}
        await session.commit()
        logger.warning(f"[stt] id={document_id} STT_ENABLED=false — 터미널 skip (전사 없음)")
        return
    # Make the path absolute under the NAS mount (the services/stt container bind-mounts the same path)
    container_path = str(Path(settings.nas_mount_path) / doc.file_path)
@@ -60,6 +60,9 @@ ai:
    rerank:
      endpoint: "http://reranker:80/rerank"
      model: "bge-reranker-v2-m3"
      # 2-node migration: "tei" (GPU TEI /rerank, default) | "llamacpp" (Mac mini llama.cpp,
      # e.g. endpoint http://100.76.254.116:8807/v1/rerank). An unsupported value raises ValueError at startup.
      protocol: "tei"
    # Phase 3.5a answerability classifier. Swapped to the Mac mini 26B after the GPU LLM was removed on 2026-05-14.
    # classifier_service treats this section as optional via a hasattr check, so removing it makes the classifier gate skip automatically (score-only).
@@ -1,6 +1,7 @@
 <script lang="ts">
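-  // Processing machine board v3: merged plan (plan ds-board-merged: C2 machine lanes + C3 burndown/honest ETA).
+  // Processing machine board v4: two nodes (NAS + Mac mini) after the 2026-07-02 cutover.
-  //   · 3 machine lanes (GPU/Mac mini/MacBook) = "who is working" + summarize offload (MacBook joining) made visible
+  //   · 2 machine lanes (NAS/Mac mini) = "who is working". NAS = DS core Docker stages (extract/markdown/
  //     chunk, embed, etc.), Mac mini = single generation LLM hub (classify/summarize/deep analysis + bge-m3/rerank)
  //   · Dominant-backlog burndown panel = "when will it finish" + honest ETA net of inflow (summarize_eta)
  //   · Freshness "updated N s ago" + stale warning / failure drawer and detail panel reuse v2 assets.
  // Data = GET /api/queue/overview (store polled every 60s) + GET /api/queue/failed (drawer).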
@@ -193,7 +194,7 @@
  const machineByKey = $derived(
    new Map<FlowMachine, MachineOverview>(overview.machines.map((m) => [m.key as FlowMachine, m])),
  );
-  const LANE_ORDER: FlowMachine[] = ['gpu', 'macmini', 'macbook'];
+  const LANE_ORDER: FlowMachine[] = ['nas', 'macmini'];
  const lanes = $derived(
    LANE_ORDER.map((key) => ({
      key,
@@ -203,13 +204,6 @@
    })),
  );
  // Summarize offload split: Mac mini vs MacBook (A-1 summarize_by_machine)
  const split = $derived(overview.summarize_by_machine);
  const splitTotal1h = $derived(Math.max(1, split.macmini.done_1h + split.macbook.done_1h));
  const macbookSharePct = $derived(Math.round((split.macbook.done_1h / splitTotal1h) * 100));
  // Is the MacBook actually taking summaries right now (gate for the join marker)
  const offloadActive = $derived(split.macbook.done_1h > 0);
  // ─── Background jobs (out-of-queue backfill scripts): exposes the processing_queue blind spot ───
  const bgJobs = $derived(overview.background_jobs ?? []);
  const runningBg = $derived(bgJobs.filter((j) => j.state === 'running'));
@@ -266,7 +260,7 @@
        : `갱신 ${Math.round(ageSec / 60)}분 전`,
  );
-  // ─── 24h burndown (C3): summarize inflow vs. completed + MacBook join inflection marker ───
+  // ─── 24h burndown (C3): summarize inflow vs. completed ───
  const burn = $derived.by(() => {
    const t = overview.trend_24h;
    if (!t || t.length === 0) return null;
@@ -279,20 +273,12 @@
      t.map((b, i) => `${(i * step).toFixed(1)},${y(sel(b))}`).join(' ');
    const doneLine = line((b) => b.done);
    const area = `0,${h} ${doneLine} ${w.toFixed(1)},${h}`;
    // Join inflection point = bucket with the max done (estimated MacBook night-drain join)
    let mi = 0;
    t.forEach((b, i) => {
      if (b.done > t[mi].done) mi = i;
    });
    return {
      w,
      h,
      area,
      doneLine,
      inflowLine: line((b) => b.inflow),
-      markX: (mi * step).toFixed(1),
-      markHour: t[mi].hour,
-      markDone: t[mi].done,
      peak: max,
    };
  });
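The burn-down geometry in this hunk (one polyline per series plus a closed area under the done line) can be sketched as a pure function. The hunk does not show how `w`, `h`, `step`, `y`, or `max` are defined, so the definitions below are assumptions chosen for illustration, and `burndown` is a hypothetical name.

```typescript
interface TrendPoint {
  hour: string;
  inflow: number;
  done: number;
}

// Assumed scaling: x is spread evenly over the width, y is inverted so that
// the series maximum touches the top of the viewBox.
function burndown(t: TrendPoint[], w = 240, h = 64) {
  if (t.length === 0) return null;
  const max = Math.max(1, ...t.map((b) => Math.max(b.inflow, b.done)));
  const step = t.length > 1 ? w / (t.length - 1) : 0;
  const y = (v: number) => (h - (v / max) * h).toFixed(1);
  const line = (sel: (b: TrendPoint) => number) =>
    t.map((b, i) => `${(i * step).toFixed(1)},${y(sel(b))}`).join(' ');
  const doneLine = line((b) => b.done);
  // Close the done line down to the baseline so it can be filled as a polygon.
  const area = `0,${h} ${doneLine} ${w.toFixed(1)},${h}`;
  return { w, h, area, doneLine, inflowLine: line((b) => b.inflow), peak: max };
}

const sample = burndown(
  [
    { hour: '00', inflow: 0, done: 0 },
    { hour: '01', inflow: 4, done: 2 },
    { hour: '02', inflow: 2, done: 4 },
  ],
  100,
  10,
);
```

With the marker fields removed, the returned object carries only what the `<polygon>` and the two `<polyline>` elements consume.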
@@ -332,7 +318,7 @@
    </span>
  </div>
-  <!-- Machine lanes (who is working + summarize offload) -->
+  <!-- Machine lanes (who is working) -->
  <div class="grid gap-2 mb-3">
    {#each lanes as lane (lane.key)}
      <div class="bg-surface border border-default rounded-card px-3.5 py-2.5">
@@ -342,11 +328,8 @@
          <span class="text-[10px] text-faint font-mono">{lane.meta.model}</span>
          <span class="text-[11px] text-dim tabular-nums ml-1">{formatRate(lane.card?.done_1h ?? 0)}/h</span>
          {#each bgForMachine(lane.key) as j (j.id)}<span class="text-[10px] font-semibold text-success tabular-nums ml-1">생성 중: {j.label ?? j.kind}{#if j.total} {j.processed}/{j.total}{/if}</span>{/each}
-          {#if lane.key === 'macbook' && (lane.card?.deferred_pending ?? 0) > 0}
+          {#if (lane.card?.deferred_pending ?? 0) > 0}
-            <span class="text-[10px] font-semibold text-warning tabular-nums">보류 {lane.card?.deferred_pending}</span>
+            <span class="text-[10px] font-semibold text-warning tabular-nums" title="LLM 백오프 — 자동 재개 대기">보류 {lane.card?.deferred_pending}</span>
          {/if}
-          {#if lane.card?.state === 'deferred'}
-            <span class="text-[9px] text-warning">잠듦 — 요약은 맥미니로 복귀</span>
-          {/if}
        </div>
        <div class="flex items-stretch gap-1.5 flex-wrap">
@@ -368,26 +351,8 @@
              </div>
              <div class="text-sm font-extrabold tabular-nums leading-tight text-text">{n.pending.toLocaleString()}<span class="text-[9px] text-faint font-normal ml-0.5">대기</span></div>
              <div class="text-[9px] text-dim tabular-nums whitespace-nowrap">{formatRate(n.done1h)}/h · 오늘 {n.doneToday.toLocaleString()}</div>
-              {#if n.def.key === 'summarize'}
-                <div class="mt-1 h-1 w-full rounded-full overflow-hidden flex" title="맥미니 {split.macmini.done_1h}/h · 맥북 {split.macbook.done_1h}/h">
-                  <span class="block h-full mtag-macmini-bar" style="width:{100 - macbookSharePct}%"></span>
-                  <span class="block h-full mtag-macbook-bar" style="width:{macbookSharePct}%"></span>
-                </div>
-                <div class="text-[9px] text-faint tabular-nums whitespace-nowrap mt-0.5">맥미니 {split.macmini.done_1h} · 맥북 {split.macbook.done_1h}/h</div>
-              {/if}
            </button>
          {/each}
-          {#if lane.key === 'macbook' && offloadActive}
-            <button
-              class="text-left rounded-lg border border-dashed border-warning/50 px-2.5 py-1.5 cursor-pointer hover:bg-surface-hover min-w-[96px]"
-              onclick={() => toggleNode('summarize')}
-              title="맥북이 요약을 맥미니에서 가져와 처리 중"
-            >
-              <div class="flex items-center gap-1 text-[11px] font-semibold text-text whitespace-nowrap">요약 합류 <span class="text-[8px] font-bold text-warning">OFFLOAD</span></div>
-              <div class="text-sm font-extrabold tabular-nums leading-tight text-text">{split.macbook.done_1h}<span class="text-[9px] text-faint font-normal ml-0.5">/h</span></div>
-              <div class="text-[9px] text-dim tabular-nums whitespace-nowrap">요약의 {macbookSharePct}% 담당</div>
-            </button>
-          {/if}
        </div>
      </div>
    {/each}
@@ -399,15 +364,11 @@
      <div class="flex items-center gap-2 mb-2">
        <span class="text-[11px] font-bold text-text">요약 백로그 24시간</span>
        <span class="text-[9px] text-faint">유입(회색) vs 소화(녹색)</span>
-        {#if offloadActive}<span class="text-[9px] text-warning ml-auto">맥북 합류 {burn.markHour} — 소화 급증</span>{/if}
      </div>
      <svg viewBox="0 0 {burn.w} {burn.h}" class="block w-full" style="height:64px" preserveAspectRatio="none" role="img" aria-label="요약 백로그 24시간 번다운">
        <polygon points={burn.area} fill="currentColor" class="text-success" opacity="0.12" />
        <polyline points={burn.inflowLine} fill="none" stroke="currentColor" stroke-width="1.2" class="text-faint" />
        <polyline points={burn.doneLine} fill="none" stroke="currentColor" stroke-width="1.6" class="text-success" />
-        {#if offloadActive}
-          <line x1={burn.markX} y1="0" x2={burn.markX} y2={burn.h} stroke="currentColor" stroke-width="1" stroke-dasharray="2 2" class="text-warning" opacity="0.7" />
-        {/if}
      </svg>
      <div class="flex flex-wrap gap-x-4 gap-y-1 mt-2 pt-2 border-t border-default text-[10px] text-dim tabular-nums">
        {#each mainNodes.filter((n) => n.pending > 0 && n.def.key !== 'summarize') as n (n.def.key)}
@@ -558,13 +519,9 @@
 </div>
 <style>
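-  /* Machine colors: 3 colors outside the design tokens (gpu blue / macmini purple / macbook amber), scoped to this component */
+  /* Machine colors: 2 colors outside the design tokens (nas blue / macmini purple), scoped to this component */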
-  .mtag-gpu { background: #e7eef6; color: #3b6ea5; }
+  .mtag-nas { background: #e7eef6; color: #3b6ea5; }
  .mtag-macmini { background: #efe9f7; color: #8a5fbf; }
-  .mtag-macbook { background: #f7eedd; color: #b07a10; }
-  /* Summarize offload split bar fill (Mac mini purple / MacBook amber) */
-  .mtag-macmini-bar { background: #8a5fbf; }
-  .mtag-macbook-bar { background: #b07a10; }
  .node-sel { outline: 2px solid #3b6ea5; outline-offset: 1px; }
  .detail-frame { border-color: #3b6ea5; }
  .detail-head { background: #e7eef6; }
@@ -1,6 +1,6 @@
 <script lang="ts">
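  // Processing-status drawer (option 6, lite): opens from the right when the status strip shown on every page is clicked.
-  // Condensed view: 3 machine mini-cards + one-line ETA + failure total + home link. The home board handles details.
+  // Condensed view: 2 machine mini-cards (NAS/Mac mini) + one-line ETA + failure total + home link. The home board handles details.
  // Data = shared queueOverview store (60 s polling; on failure it is null and the drawer degrades to a notice).
  // Open state uses the single uiState drawer slot ('queue'), which blocks opening it alongside the sidebar drawer.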
  import { X } from 'lucide-svelte';
@@ -51,7 +51,7 @@
      <div class="p-4 space-y-3">
        {#if data}
-          <!-- 3 machine mini-cards -->
+          <!-- Machine mini-cards (NAS/Mac mini) -->
          {#each data.machines as m (m.key)}
            <div class="bg-surface border border-default rounded-lg px-3.5 py-2.5">
              <div class="flex items-center justify-between gap-2">
@@ -2,7 +2,7 @@
  import { page } from '$app/stores';
  import { goto } from '$app/navigation';
  import { api } from '$lib/api';
-  import { ChevronRight, ChevronDown, FolderOpen, FolderTree, Inbox, Clock, Mail, Scale, StickyNote, GraduationCap, CalendarCheck, MessageCircle, Hash } from 'lucide-svelte';
+  import { ChevronRight, ChevronDown, FolderOpen, FolderTree, Inbox, Clock, Mail, Scale, StickyNote, GraduationCap, CalendarCheck, MessageCircle, Hash, HardHat } from 'lucide-svelte';
  let tree = $state([]);
  let loading = $state(true);
@@ -195,6 +195,13 @@
    >
      <FolderTree size={14} /> 자료실
    </a>
+    <a
+      href="/safety"
+      class="w-full flex items-center gap-2 px-3 py-1.5 rounded-md text-sm transition-colors
+        {$page.url.pathname.startsWith('/safety') ? 'bg-accent/15 text-accent' : 'text-dim hover:bg-surface hover:text-text'}"
+    >
+      <HardHat size={14} /> 안전 자료실
+    </a>
    <a
      href="/clause"
      class="w-full flex items-center gap-2 px-3 py-1.5 rounded-md text-sm transition-colors
@@ -5,7 +5,7 @@
 * When a field changes, both sides must be updated together.
 */
-export type MachineKey = 'gpu' | 'macmini' | 'macbook';
+export type MachineKey = 'nas' | 'macmini';
 /** Machine state: active (running) / deferred (on hold) / idle (waiting) */
 export type MachineState = 'active' | 'deferred' | 'idle';
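Because `MachineKey` now admits only `'nas'` and `'macmini'`, a consumer may want to drop rows for retired machines at runtime as well. The guard below is a minimal sketch under that assumption: `MACHINE_KEYS` and `isMachineKey` are hypothetical helpers that are not part of this change, and nothing in the diff says the backend still emits `gpu` or `macbook` rows.

```typescript
const MACHINE_KEYS = ['nas', 'macmini'] as const;
type MachineKey = (typeof MACHINE_KEYS)[number];

// Hypothetical helper: true only for keys the two-node board knows about.
function isMachineKey(key: string): key is MachineKey {
  return (MACHINE_KEYS as readonly string[]).indexOf(key) !== -1;
}

// Example payload that still carries rows for retired machines.
const rows = [{ key: 'nas' }, { key: 'gpu' }, { key: 'macmini' }, { key: 'macbook' }];
const known = rows.filter((r) => isMachineKey(r.key)).map((r) => r.key);
```

Deriving the type from the array keeps the compile-time union and the runtime check in one place.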
@@ -29,7 +29,7 @@ export interface MachineOverview {
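  /** Completions in the last hour (shown as throughput N/h) */
  done_1h: number;
  done_today: number;
-  /** Deferred count: waiting for automatic resume, e.g. while the MacBook sleeps */
+  /** Deferred count: waiting for automatic resume, e.g. during LLM hub backoff */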
  deferred_pending: number;
  current: MachineCurrentItem[];
 }
@@ -50,12 +50,6 @@ export interface TrendPoint {
  done: number;
 }
-/** Per-machine split of completed summarize work (offload visibility, ds-board-merged A-1) */
-export interface SummarizeByMachine {
-  macmini: { done_1h: number; done_today: number };
-  macbook: { done_1h: number; done_today: number };
-}
 export interface QueueTotals {
  pending: number;
  processing: number;
@@ -93,7 +87,6 @@ export interface BackgroundJob {
 export interface QueueOverview {
  machines: MachineOverview[];
  summarize_eta: SummarizeEta;
-  summarize_by_machine: SummarizeByMachine;
  trend_24h: TrendPoint[];
  stages: QueueStageRow[];
  totals: QueueTotals;
@@ -62,7 +62,7 @@ export function formatAgeSec(sec: number): string {
 * ★ When swapping a model/engine, edit only this one block (e.g. a Mac mini model swap).
 */
-export type FlowMachine = 'gpu' | 'macmini' | 'macbook';
+export type FlowMachine = 'nas' | 'macmini';
 export interface FlowNodeDef {
  key: string;
@@ -79,26 +79,25 @@ export interface FlowNodeDef {
 /** Main flow (document processing order). Per-source skip paths such as news are not drawn; this is a limit of the simplification. */
 export const FLOW_NODES: FlowNodeDef[] = [
-  { key: 'extract', label: '추출', stages: ['extract'], machine: 'gpu', engine: 'Surya OCR', sub: 'ocr-service' },
+  { key: 'extract', label: '추출', stages: ['extract'], machine: 'nas', engine: 'kordoc', sub: 'kordoc' },
-  { key: 'markdown', label: '마크다운', stages: ['markdown'], machine: 'gpu', engine: 'Marker', sub: 'marker-service' },
+  { key: 'markdown', label: '마크다운', stages: ['markdown'], machine: 'nas', engine: 'Marker', sub: 'marker-service' },
  { key: 'classify', label: '분류', stages: ['classify'], machine: 'macmini', engine: 'Qwen3.6-27B', sub: 'classify + triage' },
  { key: 'summarize', label: '요약', stages: ['summarize'], machine: 'macmini', engine: 'Qwen3.6-27B', sub: 'summarize' },
-  { key: 'chunkembed', label: '청크 · 임베딩', stages: ['chunk', 'embed'], machine: 'gpu', engine: 'TEI bge-m3', sub: 'text-embeddings-inference' },
+  { key: 'chunkembed', label: '청크 · 임베딩', stages: ['chunk', 'embed'], machine: 'nas', engine: 'bge-m3 (맥미니 콜)', sub: 'embed worker' },
-  { key: 'deep', label: '심층분석', stages: ['deep_summary'], machine: 'macbook', engine: 'Qwen3.6-27B', sub: 'deep_summary' },
+  { key: 'deep', label: '심층분석', stages: ['deep_summary'], machine: 'macmini', engine: 'Qwen3.6-27B', sub: 'deep_summary' },
 ];
 /** Auxiliary nodes: outside the main flow (shown on the auxiliary line only when active) */
 export const AUX_NODES: FlowNodeDef[] = [
-  { key: 'fulltext', label: '전문 수집', stages: ['fulltext'], machine: 'gpu', engine: 'Playwright', sub: 'playwright-fetcher' },
+  { key: 'fulltext', label: '전문 수집', stages: ['fulltext'], machine: 'nas', engine: 'Playwright', sub: 'playwright-fetcher' },
-  { key: 'stt', label: '전사', stages: ['stt'], machine: 'gpu', engine: 'Whisper', sub: 'stt-service' },
+  { key: 'stt', label: '전사', stages: ['stt'], machine: 'nas', engine: 'Whisper', sub: 'stt-service' },
-  { key: 'util', label: '미리보기 · 썸네일', stages: ['preview', 'thumbnail'], machine: 'gpu', engine: '유틸', sub: 'ffmpeg' },
+  { key: 'util', label: '미리보기 · 썸네일', stages: ['preview', 'thumbnail'], machine: 'nas', engine: '유틸', sub: 'ffmpeg' },
 ];
-/** Machine strip metadata: the single place for model labels */
+/** Machine strip metadata: the single place for model labels (2026-07-02 cutover: two nodes, NAS + Mac mini) */
 export const MACHINE_META: Record<FlowMachine, { label: string; model: string }> = {
-  gpu: { label: 'GPU 서버', model: '특화 엔진' },
+  nas: { label: '나스', model: 'DS 본체 Docker · 특화 엔진' },
-  macmini: { label: '맥미니', model: 'Qwen3.6-27B-6bit · 24/7' },
+  macmini: { label: '맥미니', model: 'Qwen3.6-27B-6bit · bge-m3 · 24/7' },
-  macbook: { label: '맥북 M5 Max', model: 'Qwen3.6-27B · 야간 drain' },
 };
 /** Flow-board stage labels (drawer/detail row labels) */
@@ -72,7 +72,7 @@
  // Processing-status strip (option 6, lite): shares the 60 s polling store. On fetch failure/401
  // the store is null, so the strip itself is hidden (silent and non-blocking, same on the login page).
  let queue = $derived($queueOverview);
-  let queueMacbook = $derived(queue?.machines?.find((m) => m.key === 'macbook') ?? null);
+  let queueMacmini = $derived(queue?.machines?.find((m) => m.key === 'macmini') ?? null);
  function toggleQueueDrawer() {
    if (ui.isDrawerOpen('queue')) ui.closeDrawer();
    else ui.openDrawer('queue');
@@ -189,8 +189,8 @@
          </span>
          <span class="tabular-nums shrink-0">대기 <strong class="text-text">{queue.totals.pending.toLocaleString()}</strong></span>
          <span class="tabular-nums shrink-0 {queue.totals.failed > 0 ? 'text-error font-semibold' : ''}">실패 <strong class={queue.totals.failed > 0 ? '' : 'text-text'}>{queue.totals.failed.toLocaleString()}</strong></span>
-          {#if queueMacbook}
+          {#if queueMacmini}
-            <span class="text-[10px] font-bold rounded-full px-2 py-0.5 shrink-0 {machineChipClass(queueMacbook.state)}">맥북 {MACHINE_STATE_LABEL[queueMacbook.state]}</span>
+            <span class="text-[10px] font-bold rounded-full px-2 py-0.5 shrink-0 {machineChipClass(queueMacmini.state)}">맥미니 {MACHINE_STATE_LABEL[queueMacmini.state]}</span>
          {/if}
          <span class="ml-auto flex items-center gap-0.5 text-faint shrink-0">자세히 <ChevronDown size={11} /></span>
        </button>
@@ -0,0 +1,34 @@
 <script>
  // Safety library (safety-library-1 Phase 3): 3 tabs for incidents / laws and guidelines / books, standards and manuals.
  import { page } from '$app/stores';
  const TABS = [
    { href: '/safety/incidents', label: '재해사례' },
    { href: '/safety/laws', label: '법령·지침' },
    { href: '/safety/materials', label: '서적·표준·매뉴얼' },
  ];
 </script>
 <div class="max-w-5xl mx-auto px-4 py-5 flex flex-col gap-4">
  <header>
    <h1 class="text-lg font-bold text-text">안전 자료실</h1>
    <p class="text-xs text-dim mt-0.5">재해사례·법령·지침·표준 — 자료유형(material_type) 축 기반</p>
  </header>
  <nav class="flex gap-1 border-b border-default" aria-label="안전 자료실 탭">
    {#each TABS as tab}
      <a
        href={tab.href}
        aria-current={$page.url.pathname === tab.href ? 'page' : undefined}
        class="px-3 py-2 text-sm font-medium border-b-2 -mb-px transition-colors
          {$page.url.pathname === tab.href
            ? 'border-accent text-accent'
            : 'border-transparent text-dim hover:text-text'}"
      >
        {tab.label}
      </a>
    {/each}
  </nav>
  <slot />
 </div>
@@ -0,0 +1,9 @@
 <script>
  // Entering /safety redirects to the incidents tab (plan: +page = redirect to the incidents tab)
  import { onMount } from 'svelte';
  import { goto } from '$app/navigation';
  onMount(() => {
    goto('/safety/incidents', { replaceState: true });
  });
 </script>
@@ -0,0 +1,75 @@
 <script>
  // Shared safety-library list: queries GET /documents/ filtered by material_type + jurisdiction.
  // C-1 contract: specifying material_type lifts the default exclude (news, law_monitor, note) (documents.py list_documents).
  import { api } from '$lib/api';
  import { addToast } from '$lib/stores/toast';
  import DocumentCard from '$lib/components/DocumentCard.svelte';
  let { materialType, jurisdiction = '' } = $props();
  const PAGE_SIZE = 20;
  let docs = $state([]);
  let total = $state(0);
  let nextPage = $state(1);
  let loading = $state(false);
  async function load(reset = false) {
    loading = true;
    const pageToLoad = reset ? 1 : nextPage;
    try {
      const params = new URLSearchParams();
      params.set('material_type', materialType);
      if (jurisdiction) params.set('jurisdiction', jurisdiction);
      params.set('page', String(pageToLoad));
      params.set('page_size', String(PAGE_SIZE));
      const result = await api(`/documents/?${params}`);
      docs = reset ? result.items : [...docs, ...result.items];
      total = result.total;
      nextPage = pageToLoad + 1;
    } catch {
      addToast('error', '안전 자료 로딩 실패');
    } finally {
      loading = false;
    }
  }
  $effect(() => {
    // On filter change, refetch from page 1 (reading materialType/jurisdiction is the reactive trigger)
    void materialType;
    void jurisdiction;
    docs = [];
    load(true);
  });
  let hasMore = $derived(docs.length < total);
 </script>
 <div class="flex flex-col gap-2">
  {#if !loading || docs.length > 0}
    <p class="text-xs text-dim tabular-nums">총 {total.toLocaleString()}건</p>
  {/if}
  {#if docs.length > 0}
    <div class="flex flex-col gap-2">
      {#each docs as doc (doc.id)}
        <DocumentCard {doc} />
      {/each}
    </div>
  {:else if !loading}
    <div class="py-12 text-center text-sm text-dim">
      해당 조건의 자료가 없습니다.
    </div>
  {/if}
  {#if loading}
    <div class="py-6 text-center text-sm text-dim">불러오는 중…</div>
  {:else if hasMore}
    <button
      type="button"
      onclick={() => load(false)}
      class="self-center px-4 py-1.5 rounded-md text-sm text-dim border border-default hover:bg-surface hover:text-text transition-colors"
    >
      더 보기 ({docs.length}/{total.toLocaleString()})
    </button>
  {/if}
 </div>
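The paging logic of this list component can be separated from Svelte state as two pure functions: one builds the query string, the other merges a fetched page into the list. This is a minimal sketch; `buildQuery`, `applyPage`, and `ListState` are hypothetical names, and the `{ items, total }` response shape is taken from how the component reads `result`.

```typescript
interface PageResult<T> {
  items: T[];
  total: number;
}

interface ListState<T> {
  docs: T[];
  total: number;
  nextPage: number;
}

// Same parameter order as the component: material_type, optional jurisdiction, page, page_size.
function buildQuery(materialType: string, jurisdiction: string, page: number, pageSize = 20): string {
  const params = new URLSearchParams();
  params.set('material_type', materialType);
  if (jurisdiction) params.set('jurisdiction', jurisdiction);
  params.set('page', String(page));
  params.set('page_size', String(pageSize));
  return `/documents/?${params.toString()}`;
}

// Replace the list on reset, append otherwise, and advance the page cursor.
function applyPage<T>(state: ListState<T>, result: PageResult<T>, pageLoaded: number, reset: boolean): ListState<T> {
  return {
    docs: reset ? result.items : [...state.docs, ...result.items],
    total: result.total,
    nextPage: pageLoaded + 1,
  };
}

const hasMore = <T>(s: ListState<T>): boolean => s.docs.length < s.total;

const afterSecondPage = applyPage({ docs: [1, 2], total: 5, nextPage: 2 }, { items: [3], total: 5 }, 2, false);
```

Keeping the merge pure makes the "더 보기" (load more) button state easy to check without a network call.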
@@ -0,0 +1,29 @@
 <script>
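  // Incidents tab: material_type=incident (KOSHA fatal accidents, incident cases, CSB, etc.).
  // Case grouping (boardno body + attachments as one card) needs an API extension, so it is deferred (no backend changes under the DS freeze).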
  import SafetyDocList from '../SafetyDocList.svelte';
  const JURISDICTIONS = [
    { value: '', label: '전체' },
    { value: 'KR', label: 'KR' },
    { value: 'US', label: 'US' },
  ];
  let jurisdiction = $state('');
 </script>
 <div class="flex flex-col gap-3">
  <div class="flex items-center gap-1.5" role="group" aria-label="관할 필터">
    {#each JURISDICTIONS as j}
      <button
        type="button"
        onclick={() => (jurisdiction = j.value)}
        class="px-2.5 py-1 rounded-full text-xs font-medium transition-colors
          {jurisdiction === j.value ? 'bg-accent/15 text-accent' : 'text-dim hover:bg-surface hover:text-text'}"
      >
        {j.label}
      </button>
    {/each}
  </div>
  <SafetyDocList materialType="incident" {jurisdiction} />
 </div>
@@ -0,0 +1,48 @@
 <script>
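  // Laws and guidelines tab: laws (law; only the current version of a version chain is exposed in the corpus) / guidelines (guide; KOSHA GUIDE, etc.).
  // Default jurisdiction for laws = KR (plan: a missing country is normalized to KR). The version_status badge follows after an API extension.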
  import SafetyDocList from '../SafetyDocList.svelte';
  const KINDS = [
    { value: 'law', label: '법령' },
    { value: 'guide', label: '지침' },
  ];
  const JURISDICTIONS = [
    { value: 'KR', label: 'KR' },
    { value: 'US', label: 'US' },
    { value: '', label: '전체' },
  ];
  let kind = $state('law');
  let jurisdiction = $state('KR');
 </script>
 <div class="flex flex-col gap-3">
  <div class="flex items-center justify-between flex-wrap gap-2">
    <div class="flex items-center gap-1" role="group" aria-label="자료유형">
      {#each KINDS as k}
        <button
          type="button"
          onclick={() => (kind = k.value)}
          class="px-3 py-1 rounded-md text-sm font-medium transition-colors
            {kind === k.value ? 'bg-accent/15 text-accent' : 'text-dim hover:bg-surface hover:text-text'}"
        >
          {k.label}
        </button>
      {/each}
    </div>
    <div class="flex items-center gap-1.5" role="group" aria-label="관할 필터">
      {#each JURISDICTIONS as j}
        <button
          type="button"
          onclick={() => (jurisdiction = j.value)}
          class="px-2.5 py-1 rounded-full text-xs font-medium transition-colors
            {jurisdiction === j.value ? 'bg-accent/15 text-accent' : 'text-dim hover:bg-surface hover:text-text'}"
        >
          {j.label}
        </button>
      {/each}
    </div>
  </div>
  <SafetyDocList materialType={kind} {jurisdiction} />
 </div>
@@ -0,0 +1,29 @@
 <script>
  // Books, standards and manuals tab: filter presets (a dedicated view sits behind the 50+ items gate, plan Phase 3).
  import SafetyDocList from '../SafetyDocList.svelte';
  const KINDS = [
    { value: 'standard', label: '표준 (NB 등)' },
    { value: 'book', label: '서적' },
    { value: 'manual', label: '매뉴얼' },
    { value: 'paper', label: '논문' },
  ];
  let kind = $state('standard');
 </script>
 <div class="flex flex-col gap-3">
  <div class="flex items-center gap-1" role="group" aria-label="자료유형">
    {#each KINDS as k}
      <button
        type="button"
        onclick={() => (kind = k.value)}
        class="px-3 py-1 rounded-md text-sm font-medium transition-colors
          {kind === k.value ? 'bg-accent/15 text-accent' : 'text-dim hover:bg-surface hover:text-text'}"
      >
        {k.label}
      </button>
    {/each}
  </div>
  <SafetyDocList materialType={kind} />
 </div>
@@ -4,7 +4,7 @@
  import { onMount } from 'svelte';
  import { api } from '$lib/api';
  import { addToast } from '$lib/stores/toast';
-  import { BookOpen, PenLine, GraduationCap, FolderKanban, Layers, Repeat, Flag, Inbox, Activity, CalendarCheck } from 'lucide-svelte';
+  import { BookOpen, PenLine, GraduationCap, FolderKanban, Layers, Repeat, Flag, Inbox, Activity, CalendarCheck, Target } from 'lucide-svelte';
  let cardReviewCount = $state(0);
  let questionFlagCount = $state(0);
@@ -12,6 +12,7 @@
  // Today's study (theory home)
  let curriculum = $state(null);
  let todayConcepts = $state([]);
+  let weakConcepts = $state([]);        // weak concepts (low accuracy on related past-exam questions)
  let dashLoading = $state(true);
  let readPct = $derived(
@@ -28,10 +29,15 @@
      curriculum = cur;
      todayConcepts = today?.concepts ?? [];
    } catch {
-      // Even if the dashboard fails, the rest of the hub keeps working (quietly)
+      // Even if the core dashboard fails, the rest of the hub keeps working (quietly)
    } finally {
      dashLoading = false;
    }
+    // Weak concepts = non-blocking (a failure of the new endpoint must not black out the core dashboard)
+    try {
+      const weak = await api('/study/concepts/weakness-map?limit=5');
+      weakConcepts = weak?.weak ?? [];
+    } catch {}
  }
  async function markRead(doc) {
@@ -110,7 +116,7 @@
          {#each todayConcepts as c (c.doc_id)}
            <li class="flex items-center gap-2 rounded border border-default px-3 py-2">
              <span class="text-accent shrink-0 text-xs" title="빈출">{#each Array(c.freq) as _}★{/each}</span>
-              <a href="/documents/{c.doc_id}" class="text-sm text-text hover:text-accent truncate flex-1">{c.title}</a>
+              <a href="/study/read/{c.doc_id}" class="text-sm text-text hover:text-accent truncate flex-1">{c.title}</a>
              <span class="shrink-0 text-[10px] rounded-full px-2 py-0.5 {c.reason === '재복습' ? 'bg-accent/15 text-accent' : 'bg-surface border border-default text-dim'}">{c.reason}</span>
              <button
                type="button"
@@ -121,6 +127,22 @@
          {/each}
        </ul>
      {/if}
+      {#if weakConcepts.length > 0}
+        <div class="mt-4 pt-3 border-t border-default">
+          <div class="text-xs text-dim mb-2 flex items-center gap-1.5">
+            <Target size={13} class="text-error" /> 약점 개념 <span class="text-faint">(관련 기출 정답률 낮음)</span>
+          </div>
+          <div class="flex flex-wrap gap-2">
+            {#each weakConcepts as w (w.doc_id)}
+              <a href="/study/read/{w.doc_id}"
+                class="text-xs rounded-full border border-error/40 bg-error/10 text-error px-3 py-1 hover:bg-error/20 transition-colors">
+                {w.title.replace(/^\d+_/, '')} <span class="font-semibold">{w.accuracy}%</span>
+              </a>
+            {/each}
+          </div>
+        </div>
+      {/if}
    {/if}
  </section>
@@ -0,0 +1,254 @@
 <script>
  /**
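   * /study/read/[docId]: concept study reader.
   * Renders a concept note (Gas Engineer, 가스기사, documents) by structure (summary/body/frequently tested ★/related concepts), plus
   * a 'recall' (떠올리기) active-recall toggle, read-through SR (POST read), related-concept backlinks, and previous/next.
   * Body rendering = MarkdownDoc (KaTeX + docimg built in). Server-side parsing = /api/study/concepts/{id}.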
   */
  import { page } from '$app/stores';
  import { api } from '$lib/api';
  import { addToast } from '$lib/stores/toast';
  import { renderMathMarkdownInline } from '$lib/utils/mathMarkdown';
  import MarkdownDoc from '$lib/components/MarkdownDoc.svelte';
  import Button from '$lib/components/ui/Button.svelte';
  import EmptyState from '$lib/components/ui/EmptyState.svelte';
  import Skeleton from '$lib/components/ui/Skeleton.svelte';
  import { BookOpen, ArrowLeft, Eye, EyeOff, Check, ChevronLeft, ChevronRight, FileQuestion } from 'lucide-svelte';
  let docId = $derived($page.params.docId);
  let concept = $state(null);
  let relatedQ = $state(null);          // related past-exam questions (theory↔question, non-blocking)
  let loading = $state(true);
  let notFound = $state(false);
  let mode = $state('read');            // 'read' | 'recall' (active recall)
  let revealed = $state({});            // {sectionIndex: true}
  let marking = $state(false);
  const STAGE_LABEL = { 0: '복습 시작', 1: '복습 1단계', 2: '복습 2단계', 3: '복습 3단계', 4: '학습 완료' };
  const OUTCOME_MARK = { correct: '○', wrong: '✕', unsure: '?' };
  const OUTCOME_CLASS = { correct: 'text-success', wrong: 'text-error', unsure: 'text-warning' };
  const outcomeMark = (o) => OUTCOME_MARK[o] ?? '–';
  const outcomeClass = (o) => OUTCOME_CLASS[o] ?? 'text-faint';
  async function load() {
    const reqId = docId; // in-flight guard: ignore stale responses when backlinks are clicked in quick succession
    loading = true;
    notFound = false;
    concept = null;
    relatedQ = null;
    revealed = {};
    mode = 'read';
    try {
      const data = await api(`/study/concepts/${reqId}`);
      if (reqId !== docId) return; // navigated to another concept in the meantime, so discard
      concept = data;
    } catch (e) {
      if (reqId !== docId) return;
      if (e?.status === 404) notFound = true;
      else addToast('error', '개념을 불러오지 못했습니다');
      return; // body failed, so skip related questions
    } finally {
      if (reqId === docId) loading = false;
    }
    // Related past-exam questions (non-blocking: a failure does not affect showing the body)
    try {
      const rq = await api(`/study/concepts/${reqId}/questions?limit=6`);
      if (reqId === docId) relatedQ = rq;
    } catch {}
  }
  // $effect covers both the single run on mount and reloads on docId change (backlink/previous/next), so onMount is unnecessary
  $effect(() => {
    void docId;
    load();
  });
  function toggleMode() {
    mode = mode === 'read' ? 'recall' : 'read';
    revealed = {};
  }
  function reveal(i) {
    revealed = { ...revealed, [i]: true };
  }
  function shown(i) {
    return mode === 'read' || revealed[i];
  }
  async function markRead() {
    marking = true;
    try {
      const r = await api(`/study/concepts/${docId}/read`, { method: 'POST' });
      if (concept) {
        concept.is_read = true;
        concept.review_stage = r?.review_stage ?? concept.review_stage;
        concept.due_at = r?.due_at ?? concept.due_at;
      }
      addToast('success', '회독 완료 — 다음 복습에 다시 나옵니다');
    } catch {
      addToast('error', '회독 처리 실패');
    } finally {
      marking = false;
    }
  }
 </script>
 <svelte:head><title>{concept?.title ?? '개념'} — 공부</title></svelte:head>
 <div class="p-4 md:p-6 max-w-3xl mx-auto">
  <!-- Top navigation -->
  <div class="flex items-center gap-2 text-xs md:text-sm mb-4 min-w-0">
    <a href="/study" class="text-dim hover:text-text flex items-center gap-1 shrink-0">
      <ArrowLeft size={14} /> 공부
    </a>
    {#if concept?.subject}
      <span class="text-faint shrink-0">/</span>
      <span class="text-dim truncate">{concept.subject}</span>
    {/if}
  </div>
  {#if loading}
    <Skeleton h="h-10" rounded="card" />
    <div class="mt-3 space-y-2">
      {#each Array(4) as _}<Skeleton h="h-24" rounded="card" />{/each}
    </div>
  {:else if notFound}
    <EmptyState icon={BookOpen} title="개념을 찾을 수 없습니다" description="삭제되었거나 잘못된 주소입니다." />
  {:else if concept}
    <!-- Title + frequency tier -->
    <header class="mb-3">
      <div class="flex items-start gap-2">
        <h1 class="text-xl md:text-2xl font-semibold text-text flex-1">{concept.title}</h1>
        <span class="text-accent text-sm shrink-0 mt-1" title="빈출도">
          {#each Array(concept.freq) as _}★{/each}
        </span>
      </div>
      {#if concept.is_read || (concept.review_stage !== null && concept.review_stage !== undefined)}
        <div class="mt-1 text-xs text-dim">
          {#if concept.review_stage !== null && concept.review_stage !== undefined}
            {STAGE_LABEL[concept.review_stage] ?? '복습 중'}
          {:else}회독함{/if}
        </div>
      {/if}
    </header>
    <!-- One-line summary (always shown) -->
    {#if concept.summary}
      <div class="mb-4 rounded-lg border-l-4 border-accent bg-accent/10 px-4 py-3 markdown-body text-sm text-text">
        {@html renderMathMarkdownInline(concept.summary)}
      </div>
    {/if}
    <!-- Mode toggle -->
    <div class="flex items-center gap-2 mb-4">
      <Button variant={mode === 'recall' ? 'primary' : 'secondary'} size="sm" icon={mode === 'recall' ? EyeOff : Eye} onclick={toggleMode}>
        {mode === 'recall' ? '떠올리기 모드' : '읽기 모드'}
      </Button>
      {#if mode === 'recall'}
        <span class="text-xs text-dim">각 섹션을 떠올린 뒤 확인하세요</span>
      {/if}
    </div>
    <!-- Body sections -->
    {#if concept.body.length > 0}
      <div class="space-y-3 mb-5">
        {#each concept.body as sec, i (i)}
          <section class="rounded-lg border border-default bg-surface overflow-hidden">
            <div class="flex items-center gap-2 px-4 py-2.5 border-b border-default bg-surface-hover">
              <h2 class="text-sm font-semibold text-text flex-1">{sec.label}</h2>
              {#if sec.stars > 0}
                <span class="text-accent text-xs shrink-0">{#each Array(sec.stars) as _}★{/each}</span>
              {/if}
            </div>
            {#if shown(i)}
              <div class="px-4 py-3">
                <MarkdownDoc documentId={concept.doc_id} mdContent={sec.md} mdStatus={null}
                  class="markdown-body max-w-none text-text" />
              </div>
            {:else}
              <button type="button" onclick={() => reveal(i)}
                class="w-full px-4 py-6 text-center text-sm text-dim hover:text-accent hover:bg-accent/5 transition-colors">
                <Eye size={16} class="inline mr-1" /> 떠올린 뒤 확인
              </button>
            {/if}
          </section>
        {/each}
      </div>
    {/if}
    <!-- Frequently tested points -->
    {#if concept.bincheol.length > 0}
      <section class="mb-5 rounded-lg border border-default bg-surface p-4">
        <h2 class="text-sm font-semibold text-text mb-2 flex items-center gap-1.5">
          <span class="text-accent">★</span> 빈출 포인트
        </h2>
        <ul class="space-y-1.5">
          {#each concept.bincheol as item}
            <li class="flex gap-2 text-sm text-text">
              <span class="text-accent shrink-0 text-xs mt-0.5">{#each Array(item.tier || 1) as _}★{/each}</span>
              <span class="markdown-body flex-1">{@html renderMathMarkdownInline(item.text)}</span>
            </li>
          {/each}
        </ul>
      </section>
    {/if}
    <!-- Related concepts (backlinks) -->
    {#if concept.related.length > 0}
      <section class="mb-5">
        <h2 class="text-xs text-dim mb-2">관련 개념</h2>
        <div class="flex flex-wrap gap-2">
          {#each concept.related as rel}
            {#if rel.doc_id}
              <a href="/study/read/{rel.doc_id}"
                class="text-xs rounded-full border border-accent/40 bg-accent/10 text-accent px-3 py-1 hover:bg-accent/20 transition-colors">
                {rel.phrase}
              </a>
            {:else}
              <span class="text-xs rounded-full border border-default bg-surface text-faint px-3 py-1" title="아직 없는 개념">
                {rel.phrase}
              </span>
            {/if}
          {/each}
        </div>
      </section>
    {/if}
    <!-- Related past-exam questions (theory↔question bridge) -->
    {#if relatedQ && relatedQ.linked > 0}
      <section class="mb-5 rounded-lg border border-default bg-surface p-4">
        <h2 class="text-sm font-semibold text-text mb-2 flex items-center gap-1.5">
          <FileQuestion size={15} class="text-accent" /> 관련 기출
          <span class="ml-1 text-xs font-normal text-dim">
            {relatedQ.linked}문항{#if relatedQ.accuracy !== null} · 정답률 <span class="{relatedQ.accuracy < 60 ? 'text-error' : 'text-text'} font-medium">{relatedQ.accuracy}%</span>{:else} · 아직 안 풂{/if}
          </span>
        </h2>
        <ul class="space-y-0.5">
          {#each relatedQ.questions as q (q.id)}
            <li>
              <a href="/study/topics/4/questions/{q.id}"
                class="flex items-center gap-2 text-xs py-1 text-dim hover:text-accent transition-colors">
                <span class="{outcomeClass(q.last_outcome)} shrink-0 w-4 text-center font-bold">{outcomeMark(q.last_outcome)}</span>
                <span class="truncate">{q.subject ?? '기출'}{#if q.exam_round} · {q.exam_round}{/if}</span>
              </a>
            </li>
          {/each}
        </ul>
      </section>
    {/if}
    <!-- Action bar -->
    <div class="flex items-center gap-2 border-t border-default pt-4 mt-2">
      {#if concept.prev_id}
        <Button variant="ghost" size="sm" icon={ChevronLeft} href="/study/read/{concept.prev_id}">이전</Button>
      {/if}
      <div class="flex-1"></div>
      <Button variant="primary" size="sm" icon={Check} onclick={markRead} loading={marking}>
        {concept.is_read ? '다시 회독' : '회독 완료'}
      </Button>
      {#if concept.next_id}
        <Button variant="secondary" size="sm" icon={ChevronRight} href="/study/read/{concept.next_id}">다음 개념</Button>
      {/if}
    </div>
  {/if}
 </div>
@@ -0,0 +1,15 @@
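 -- 382_study_concept_links.sql: concept doc ↔ past-exam question links (theory↔question bridge, Stage B).
 -- concept_doc_id = documents.id, question_id = study_questions.id. No FKs (avoids locks on hot tables; follows precedent).
 -- link_source: 'embedding' (bge-m3 cosine top-k, primary) | 'ref' (references found in explanation .md files, later enrichment).
 -- score = cosine similarity (0~1). UNIQUE(doc, question, source) lets links from different sources coexist (re-tuning = delete all rows of one source, then re-insert).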
 CREATE TABLE IF NOT EXISTS study_concept_links (
  id             bigserial PRIMARY KEY,
  concept_doc_id bigint NOT NULL,
  question_id    bigint NOT NULL,
  link_source    text   NOT NULL,
  score          double precision,
  created_at     timestamptz NOT NULL DEFAULT now(),
  CONSTRAINT uq_concept_link UNIQUE (concept_doc_id, question_id, link_source)
 );
 CREATE INDEX IF NOT EXISTS idx_concept_links_doc ON study_concept_links(concept_doc_id);
 CREATE INDEX IF NOT EXISTS idx_concept_links_q ON study_concept_links(question_id);
@@ -0,0 +1,23 @@
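 -- concept_links_backfill.sql: regenerates concept↔question embedding links (Stage B; idempotent, safe to re-run).
 -- Settled during recon: bge-m3 1024-d cosine, per-concept top-k=10, threshold 0.62 → ~2362 links, 284/289 concepts, 964 questions.
 -- Re-tuning deletes only the 'embedding' source and re-inserts, so 'ref' links (added later) are untouched. Concept docs = documents tagged 가스기사.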
 DELETE FROM study_concept_links WHERE link_source = 'embedding';
 INSERT INTO study_concept_links (concept_doc_id, question_id, link_source, score)
 WITH cd AS (
  SELECT id, embedding FROM documents
  WHERE user_tags::text LIKE '%@library/가스기사/%'
    AND deleted_at IS NULL AND embedding IS NOT NULL
 ),
 ranked AS (
  SELECT cd.id AS concept_doc_id, q.id AS question_id,
         1 - (q.embedding <=> cd.embedding) AS score,
         row_number() OVER (PARTITION BY cd.id ORDER BY q.embedding <=> cd.embedding) AS rn
  FROM cd
  JOIN study_questions q
    ON q.study_topic_id = 4 AND q.embedding IS NOT NULL
   AND q.deleted_at IS NULL AND q.is_active
 )
 SELECT concept_doc_id, question_id, 'embedding', score
 FROM ranked
 WHERE rn <= 10 AND score >= 0.62
 ON CONFLICT (concept_doc_id, question_id, link_source) DO NOTHING;
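Review note: the selection rule in this backfill (rank questions per concept by cosine similarity, keep rank <= 10, then keep score >= 0.62) can be sketched outside the database. The following is a minimal illustration in plain Python, not the project's code; `cosine` and `link_concepts` are hypothetical names, and the vectors are toy 2-d values rather than bge-m3 embeddings.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_concepts(concepts, questions, top_k=10, threshold=0.62):
    """Return (concept_id, question_id, score) for each concept's top_k questions at or above threshold.

    Hypothetical helper that mirrors the SQL: rank per concept first, then apply the score filter.
    """
    links = []
    for cid, cvec in concepts.items():
        scored = sorted(
            ((cosine(cvec, qvec), qid) for qid, qvec in questions.items()),
            key=lambda pair: pair[0],
            reverse=True,
        )
        for score, qid in scored[:top_k]:   # corresponds to rn <= 10
            if score >= threshold:          # corresponds to score >= 0.62
                links.append((cid, qid, score))
    return links


if __name__ == "__main__":
    concepts = {1: [1.0, 0.0]}
    questions = {10: [1.0, 0.0], 11: [0.0, 1.0], 12: [1.0, 1.0]}
    for link in link_concepts(concepts, questions):
        print(link)
```

Because the threshold is applied after ranking, a concept can end up with fewer than `top_k` links, or none at all, which is consistent with the 284/289 concept coverage quoted in the header comment.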
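@@ -0,0 +1,80 @@
 """Unit tests for the summarize_units PR2 helpers: pure functions that assemble map/reduce prompts.
 Core invariants:
  - render_map_slice: unit position (1-based) / section label + body verbatim (zero loss).
  - build_reduce_units_block: for any input the returned block has est_tokens <= budget (the
    reduce side of the zero-cap-overrun verification gate). Only detail is truncated; labels,
    TLDR, inconsistencies, and order are preserved.
 Runs under pytest and standalone:
  PYTHONPATH=. pytest tests/summarize_units/ -q
 """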
 from __future__ import annotations
 from app.services.summarize_units import (
    SummarizeUnit,
    build_reduce_units_block,
    estimate_tokens,
    render_map_slice,
 )
 def _result(idx: int, detail: str, *, tldr: str = "요약", inc: list | None = None) -> dict:
    return {
        "index": idx,
        "titles": [f"섹션{idx}"],
        "tldr": tldr,
        "detail": detail,
        "inconsistencies": inc or [],
    }
 # ---------- render_map_slice ----------
 def test_render_map_slice_label_and_body():
    unit = SummarizeUnit(index=2, section_titles=["개요", None, "본론"], text="본문입니다")
    out = render_map_slice(unit, total_units=5)
    assert out.startswith("[유닛 3/5 — 섹션: 개요 · 본론]\n")
    assert out.endswith("본문입니다")
 def test_render_map_slice_untitled():
    unit = SummarizeUnit(index=0, section_titles=[None], text="x")
    assert "(무제 구간)" in render_map_slice(unit, total_units=1)
 # ---------- build_reduce_units_block ----------
 def test_reduce_block_within_budget_untouched():
    results = [_result(i, "가" * 100) for i in range(3)]
    block, truncated = build_reduce_units_block(results, budget_tokens=11_000)
    assert not truncated
    # order, labels, and TLDR are preserved
    assert block.index("[유닛 1/3") < block.index("[유닛 2/3") < block.index("[유닛 3/3")
    assert "TLDR: 요약" in block
    assert "가" * 100 in block
 def test_reduce_block_truncates_to_budget():
    # 8 units × 5,000 chars of Hangul detail ≈ 21K tok; a budget of 5,000 forces truncation
    results = [_result(i, "가" * 5_000) for i in range(8)]
    block, truncated = build_reduce_units_block(results, budget_tokens=5_000)
    assert truncated
    assert estimate_tokens(block) <= 5_000
    # labels (unit order) survive truncation
    assert "[유닛 1/8" in block
 def test_reduce_block_hard_cut_floor():
    # extreme case: the min_detail_chars floor makes proportional truncation insufficient, so the hard cut kicks in
    results = [_result(i, "가" * 300) for i in range(50)]
    block, truncated = build_reduce_units_block(results, budget_tokens=500)
    assert truncated
    assert estimate_tokens(block) <= 500
 def test_reduce_block_preserves_inconsistencies():
    results = [
        _result(0, "가" * 50, inc=[{"kind": "version_drift", "desc": "개정판 차이"}]),
    ]
    block, _ = build_reduce_units_block(results, budget_tokens=10_000)
    assert "불일치(version_drift): 개정판 차이" in block
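Review note: the budget invariant these tests pin down (block est_tokens <= budget for any input, with truncation limited to `detail`) can be illustrated with a two-step sketch. This is not the real `build_reduce_units_block`. `build_block` is a hypothetical stand-in, the label format is simplified, the Hangul range U+AC00..U+D7A3 is an assumption, and it halves a shared per-unit limit instead of the proportional truncation the test comment mentions.

```python
KO, OTHER = 0.529, 0.217  # tok/char, the calibration quoted in these tests


def estimate_tokens(text: str) -> int:
    # Assumption: "Hangul" means precomposed syllables U+AC00..U+D7A3.
    hangul = sum(1 for ch in text if "\uac00" <= ch <= "\ud7a3")
    return round(hangul * KO + (len(text) - hangul) * OTHER)


def build_block(results: list[dict], budget_tokens: int, min_detail_chars: int = 40):
    """Hypothetical stand-in for build_reduce_units_block; returns (block, truncated)."""
    total = len(results)

    def render(limit):
        parts = []
        for r in results:
            detail = r["detail"] if limit is None else r["detail"][:limit]
            parts.append(f"[unit {r['index'] + 1}/{total}]\nTLDR: {r['tldr']}\n{detail}")
        return "\n\n".join(parts)

    block = render(None)
    if estimate_tokens(block) <= budget_tokens:
        return block, False  # within budget: returned untouched

    # Step 1: shrink every detail to a shared character limit, never below the floor.
    limit = max(len(r["detail"]) for r in results)
    while limit > min_detail_chars and estimate_tokens(render(limit)) > budget_tokens:
        limit = max(min_detail_chars, limit // 2)
    block = render(limit)

    # Step 2: hard cut. No character costs more than KO tokens, so this length always fits.
    if estimate_tokens(block) > budget_tokens:
        block = block[: int(budget_tokens / KO)]
    return block, True
```

The hard cut is what makes the bound unconditional: when labels and TLDR lines alone exceed the budget, shrinking `detail` cannot help, so the final string is clipped to a length whose worst-case estimate still fits.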
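@@ -0,0 +1,180 @@
 """Unit tests for summarize_units (presegment PR1: pure functions and fixtures).
 Core invariants:
  - estimate_tokens reproduces the PR0 calibration exactly (Hangul 0.529, other 0.217 tok/char).
  - greedy_pack: preserves order, merges adjacent leaves only, respects cap, gives a leaf that
    alone exceeds cap its own over_cap unit, and loses no text (the opposite of the old
    deep_summary head/mid/tail bug that discarded the middle).
  - gate is 3-way: 0 = auto / (0, 40] = hybrid / > 40 = whole (boundaries inclusive).
  - plan_summarize_units: at or below trigger = single (keeps the current single call, so no
    regression) / above trigger = map_reduce.
 Runs under pytest and standalone:
  PYTHONPATH=. .venv/bin/pytest tests/summarize_units/ -q
 """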
 from __future__ import annotations
 from app.services.hier_decomp.builder import HierNode
 from app.services.summarize_units import (
    CAP_TOKENS,
    TRIGGER_TOKENS,
    SummarizeUnit,
    estimate_tokens,
    extract_leaves,
    gate,
    greedy_pack,
    over_pct,
    plan_summarize_units,
 )
 def _leaf(idx: int, text: str, title: str | None = None) -> HierNode:
    return HierNode(idx=idx, parent_idx=None, level=1, node_type=None,
                    section_title=title, heading_path=title, text=text)
 # ---------- estimate_tokens ----------
 def test_estimate_tokens_korean_calibration():
    # 1000 Hangul chars → 529 tok (PR0: 0.529 tok/char)
    assert estimate_tokens("가" * 1000) == 529
 def test_estimate_tokens_english_calibration():
    # 1000 non-Hangul chars → 217 tok (PR0: 0.217 tok/char)
    assert estimate_tokens("a" * 1000) == 217
 def test_estimate_tokens_mixed_and_empty():
    assert estimate_tokens("") == 0
    mixed = "가" * 100 + "a" * 100
    assert estimate_tokens(mixed) == round(100 * 0.529 + 100 * 0.217)
 # ---------- greedy_pack ----------
 def test_greedy_pack_adjacency_and_cap():
    # Four Hangul leaves of ≈4000 tok each (4000/0.529 ≈ 7562 chars). With cap 12000,
    # 4000*3 = 12000 lands exactly on the cap (<= cap is allowed) → [1,2,3] + [4]
    body = "가" * 7562  # ≈ 4000 tok
    leaves = [_leaf(i, body, f"s{i}") for i in range(4)]
    units = greedy_pack(leaves, cap=12_000)
    assert len(units) == 2
    assert [len(u.section_titles) for u in units] == [3, 1]
    # order preserved
    assert units[0].section_titles == ["s0", "s1", "s2"]
    assert units[1].section_titles == ["s3"]
    # cap respected
    assert all(u.est_tokens <= 12_000 for u in units)
 def test_greedy_pack_oversized_leaf_gets_own_unit():
    small = "가" * 1000            # ≈ 529 tok
    big = "가" * 30_000            # ≈ 15,870 tok > CAP
    leaves = [_leaf(0, small, "a"), _leaf(1, big, "mega"), _leaf(2, small, "b")]
    units = greedy_pack(leaves, cap=CAP_TOKENS)
    assert len(units) == 3
    assert units[1].over_cap and units[1].section_titles == ["mega"]
    assert not units[0].over_cap and not units[2].over_cap
    # adjacency: the packs before and after the oversized leaf are not merged across it
    assert units[0].section_titles == ["a"] and units[2].section_titles == ["b"]
 def test_greedy_pack_no_text_loss():
    leaves = [_leaf(i, f"본문{i} " + "가" * 500, f"s{i}") for i in range(7)]
    units = greedy_pack(leaves, cap=1_000)
    joined = "\n\n".join(u.text for u in units)
    for leaf in leaves:
        assert leaf.text in joined  # coverage: nothing is dropped from the middle
 def test_greedy_pack_empty():
    assert greedy_pack([]) == []
 # ---------- over_pct + gate ----------
 def test_over_pct_and_gate_boundaries():
    assert gate(0.0) == "auto"
    assert gate(0.01) == "hybrid"
    assert gate(40.0) == "hybrid"
    assert gate(40.01) == "whole"
    assert gate(100.0) == "whole"
 def test_over_pct_computation():
    # leaves: 6000 tok + 18000 tok (over cap) → over% = 18000/24000 = 75%
    l_small = _leaf(0, "가" * round(6000 / 0.529), "a")
    l_big = _leaf(1, "가" * round(18000 / 0.529), "b")
    pct = over_pct([l_small, l_big], cap=CAP_TOKENS)
    assert 74.0 < pct < 76.0
    assert over_pct([], cap=CAP_TOKENS) == 0.0
    assert over_pct([l_small], cap=CAP_TOKENS) == 0.0
 # ---------- plan_summarize_units (fixture md) ----------
 def _md_doc(sections: int, chars_per_section: int, ch: str = "가") -> str:
    parts = []
    for i in range(sections):
        parts.append(f"# 제{i+1}장 섹션{i}\n\n" + ch * chars_per_section)
    return "\n\n".join(parts)
 def test_plan_small_doc_stays_single():
    md = _md_doc(3, 1000)  # ≈ 3×529 tok ≪ trigger
    plan = plan_summarize_units(md)
    assert plan.mode == "single" and plan.tier is None and plan.units == []
    assert plan.total_est_tokens <= TRIGGER_TOKENS
 def test_plan_large_doc_auto_tier():
    # 20 sections × ≈4000 tok = ≈80K tok > trigger, every section < cap → auto
    md = _md_doc(20, 7562)
    plan = plan_summarize_units(md)
    assert plan.mode == "map_reduce"
    assert plan.tier == "auto" and plan.over_pct == 0.0
    assert len(plan.units) >= 2
    assert all(u.est_tokens <= CAP_TOKENS for u in plan.units)
 def test_plan_mega_section_whole_tier():
    # 2 small sections + 1 huge one (≈53K tok, more than 40% of the total) → whole
    md = (_md_doc(2, 7562)
          + "\n\n# 메가섹션\n\n" + "가" * 100_000)
    plan = plan_summarize_units(md)
    assert plan.mode == "map_reduce"
    assert plan.tier == "whole" and plan.over_pct > 40.0
    assert any(u.over_cap for u in plan.units)
 def test_plan_hybrid_tier():
    # 15 normal sections (≈60K tok) + 1 oversized section (≈15.9K tok) → over% ≈ 21% → hybrid
    md = _md_doc(15, 7562) + "\n\n# 초과섹션\n\n" + "가" * 30_000
    plan = plan_summarize_units(md)
    assert plan.mode == "map_reduce"
    assert plan.tier == "hybrid"
    assert 0.0 < plan.over_pct <= 40.0
    over_units = [u for u in plan.units if u.over_cap]
    assert len(over_units) == 1  # in hybrid, only these units are sent to Claude
 def test_plan_headingless_giant_is_whole():
    # Heading-less giant English document: one leaf, entirely over cap → over% 100 → whole (PR0: many EN books)
    md = "x" * 200_000  # ≈ 43K tok > trigger, 단일 leaf > cap
    plan = plan_summarize_units(md)
    assert plan.mode == "map_reduce" and plan.tier == "whole"
 def test_plan_deterministic():
    md = _md_doc(10, 7562)
    p1, p2 = plan_summarize_units(md), plan_summarize_units(md)
    assert p1 == p2
 if __name__ == "__main__":
    import sys
    fns = [v for k, v in sorted(globals().items()) if k.startswith("test_")]
    for fn in fns:
        fn()
        print(f"ok {fn.__name__}")
    print(f"{len(fns)} passed (standalone)")
    sys.exit(0)
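Review note: the invariants exercised in this file (calibrated token estimate, 3-way gate, order-preserving greedy packing with dedicated over-cap units) fit in a short sketch. This is an illustration under stated assumptions, not the module under test. Leaves are plain `(title, text)` tuples instead of `HierNode`, `Unit` is a hypothetical stand-in for `SummarizeUnit`, a unit's `est_tokens` is assumed to be the sum of its leaves' estimates, and the Hangul range U+AC00..U+D7A3 is assumed.

```python
from dataclasses import dataclass

KO_TOK_PER_CHAR = 0.529      # PR0 calibration quoted in the docstring
OTHER_TOK_PER_CHAR = 0.217


def estimate_tokens(text: str) -> int:
    hangul = sum(1 for ch in text if "\uac00" <= ch <= "\ud7a3")
    return round(hangul * KO_TOK_PER_CHAR + (len(text) - hangul) * OTHER_TOK_PER_CHAR)


def gate(over_pct: float) -> str:
    """0 = auto, (0, 40] = hybrid, > 40 = whole."""
    if over_pct <= 0:
        return "auto"
    return "hybrid" if over_pct <= 40 else "whole"


def over_pct(leaves, cap: int) -> float:
    """Share (in %) of estimated tokens that sit in leaves exceeding cap on their own."""
    sizes = [estimate_tokens(text) for _, text in leaves]
    total = sum(sizes)
    return 0.0 if total == 0 else 100.0 * sum(s for s in sizes if s > cap) / total


@dataclass
class Unit:  # hypothetical stand-in for SummarizeUnit
    titles: list[str]
    text: str
    est_tokens: int
    over_cap: bool = False


def greedy_pack(leaves, cap: int) -> list[Unit]:
    units: list[Unit] = []
    titles, texts, tokens = [], [], 0

    def flush():
        nonlocal titles, texts, tokens
        if texts:
            units.append(Unit(titles, "\n\n".join(texts), tokens))
        titles, texts, tokens = [], [], 0

    for title, text in leaves:
        size = estimate_tokens(text)
        if size > cap:                 # oversized leaf: close the open pack, emit its own unit
            flush()
            units.append(Unit([title], text, size, over_cap=True))
            continue
        if tokens + size > cap:        # would overflow: start a new pack (== cap still fits)
            flush()
        titles.append(title)
        texts.append(text)
        tokens += size
    flush()
    return units
```

Packing only ever appends the next leaf to the currently open unit, which is what gives the order-preserving, adjacent-only behaviour; an oversized leaf flushes the open unit first so that neighbours on either side are never merged across it.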
@@ -0,0 +1,266 @@
 """presegment PR2: unit tests for the deep_summary_worker map-reduce/HOLD wiring.
 Queue state transitions at the worker-process level (which need a DB) are verified by live E2E;
 here the seams of the new mechanism are unit-tested (following test_fair_share.py):
  - _hold_awaiting_split: commits the payload marking, then raises StageDeferred(HOLD_RETRY_MINUTES).
  - _process_map_reduce: per-unit map → reduce → doc fields written / every call respects the cap /
    payload.presegment.map_results persisted per unit (idempotent resume) / a failed unit raises /
    completed units are kept when a drain hold (StageDeferred) occurs.
 """
 from __future__ import annotations
 import os
 import sys
 from types import SimpleNamespace
 import pytest
 sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "app"))
 from ai.envelope import EscalationEnvelope  # noqa: E402
 from models.queue import StageDeferred  # noqa: E402
 from services.summarize_units import (  # noqa: E402
    CAP_TOKENS,
    estimate_tokens,
    plan_summarize_units,
 )
 import workers.deep_summary_worker as dsw  # noqa: E402
 # ─── fixtures ────────────────────────────────────────────────────────────────
 # 30 sections × 2,000 Hangul chars ≈ 31.7K tok (> TRIGGER 25K), ≈ 1,060 tok per section (< CAP) → auto
 GIANT_AUTO_MD = "\n".join(f"# 절 {i}\n" + ("가" * 2_000) for i in range(30))
 # 1 heading + a single 60,000-char Hangul section ≈ 31.7K tok (> CAP) → over% 100 → whole
 GIANT_WHOLE_MD = "# 통짜\n" + ("가" * 60_000)
 MAP_JSON = (
    '{"mode": "single", "tldr": "유닛 요약", "detail": "유닛 상세.",'
    ' "inconsistencies": [{"kind": "version_drift", "desc": "개정판 차이"}],'
    ' "confidence": 0.9}'
 )
 REDUCE_JSON = (
    '{"mode": "single", "tldr": "전체 요약", "detail": "최종 상세.",'
    ' "inconsistencies": [], "confidence": 0.8}'
 )
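 class FakeSession:
    """Records queue_row.payload at commit time **by object reference**. Like SQLAlchemy's
    committed snapshot, later in-place mutation shows up retroactively in previously committed
    objects; serializing at verification time detects that aliasing (the bug where only unit 0
    was persisted in the 60254 live run)."""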
    def __init__(self, row=None):
        self.commits = 0
        self._row = row
        self.snapshots: list = []
    async def commit(self):
        self.commits += 1
        if self._row is not None:
            self.snapshots.append(self._row.payload)  # keep the reference; deliberately no copy
 class FakeClient:
    """Client that has a deep slot, so call_deep_or_defer takes the call_deep path."""
    def __init__(self, responses=None, fail_indexes=frozenset(), defer_from=None):
        self.ai = SimpleNamespace(
            deep=SimpleNamespace(model="qwen-macbook", context_char_limit=260_000)
        )
        self.prompts: list[str] = []
        self._fail_indexes = fail_indexes  # calls at these (0-based) positions return an unparseable response
        self._defer_from = defer_from  # from this position on the connection fails (converted to StageDeferred)
    async def call_deep(self, prompt: str, system=None) -> str:
        import httpx
        idx = len(self.prompts)
        if self._defer_from is not None and idx >= self._defer_from:
            raise httpx.ConnectError("macbook down")
        self.prompts.append(prompt)
        if idx in self._fail_indexes:
            return "정상 JSON 아님"
        if "유닛 요약 (총" in prompt:  # marker of the reduce prompt
            return REDUCE_JSON
        return MAP_JSON
    async def close(self):
        pass
 def _doc():
    return SimpleNamespace(
        id=999,
        extracted_text=GIANT_AUTO_MD,
        ai_detail_summary=None,
        ai_inconsistencies=None,
        ai_analysis_tier="triage",
        ai_processed_at=None,
    )
 def _envelope():
    return EscalationEnvelope(
        from_stage="classify",
        escalation_reasons=("long_context",),
        risk_flags=(),
        distilled_context="4B 요지",
        original_pointers={"doc_ids": [999]},
    )
 @pytest.fixture
 def _patch_telemetry(monkeypatch):
    events: list[dict] = []
    async def fake_record(**kwargs):
        events.append(kwargs)
    monkeypatch.setattr(dsw, "record_analyze_event", fake_record)
    return events
 # ─── _hold_awaiting_split ────────────────────────────────────────────────────
 @pytest.mark.asyncio
 async def test_hold_marks_payload_and_defers():
    plan = plan_summarize_units(GIANT_WHOLE_MD)
    assert plan.mode == "map_reduce" and plan.tier == "whole"
    session, row = FakeSession(), SimpleNamespace(payload={"envelope": {"x": 1}})
    with pytest.raises(StageDeferred) as ei:
        await dsw._hold_awaiting_split(session, row, plan, document_id=999)
    assert ei.value.retry_after_minutes == dsw.HOLD_RETRY_MINUTES
    assert session.commits == 1  # the marking is committed before the defer, so it survives the consumer's re-read
    preseg = row.payload["presegment"]
    assert preseg["awaiting_split"] is True
    assert preseg["tier"] == "whole"
    assert preseg["units"] == len(plan.units)
    assert row.payload["envelope"] == {"x": 1}  # existing payload is merged, not overwritten
 # ─── _process_map_reduce: happy path ─────────────────────────────────────────
 @pytest.mark.asyncio
 async def test_map_reduce_end_to_end(monkeypatch, _patch_telemetry):
    plan = plan_summarize_units(GIANT_AUTO_MD)
    assert plan.mode == "map_reduce" and plan.tier == "auto"
    n = len(plan.units)
    assert n >= 2  # greedy-pack actually split the document into units
    client = FakeClient()
    monkeypatch.setattr(dsw, "AIClient", lambda: client)
    doc = _doc()
    row = SimpleNamespace(payload={"envelope": {"x": 1}})
    session = FakeSession(row)
    await dsw._process_map_reduce(
        doc, row, _envelope(), "generic", plan, session,
        defer_on_deep_unavailable=False,
    )
    # call count = n unit maps + 1 reduce
    assert len(client.prompts) == n + 1
    # verification gate: every call has est_tokens <= CAP + overhead (policy template + envelope, ~3K)
    for p in client.prompts:
        assert estimate_tokens(p) <= CAP_TOKENS + 3_000
    # doc fields = reduce output; inconsistencies = deduplicated union over the map units
    assert doc.ai_detail_summary == "최종 상세."
    assert doc.ai_analysis_tier == "deep"
    assert doc.ai_inconsistencies == [{"kind": "version_drift", "desc": "개정판 차이"}]
    # per-unit persist: one commit per unit
    assert row.payload["presegment"]["units"] == n
    assert len(row.payload["presegment"]["map_results"]) == n
    assert session.commits == n
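    # ★ Aliasing regression guard: inspected afterwards, the payload objects recorded by each
    # commit must show map_results growing monotonically as 1, 2, ..., n. With in-place mutation
    # (the old bug) every snapshot shares the same dict and reads [n, n, ..., n], which is
    # equivalent to SQLAlchemy deciding the committed snapshot equals the new value and skipping the UPDATE.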
    per_commit_units = [
        len(s["presegment"]["map_results"]) for s in session.snapshots
    ]
    assert per_commit_units == list(range(1, n + 1))
    # exactly one telemetry event (for the reduce)
    events = _patch_telemetry
    assert len(events) == 1 and events[0]["error_code"] is None
 # ─── idempotent resume ───────────────────────────────────────────────────────
 @pytest.mark.asyncio
 async def test_map_reduce_resume_skips_done_units(monkeypatch, _patch_telemetry):
    plan = plan_summarize_units(GIANT_AUTO_MD)
    n = len(plan.units)
    client = FakeClient()
    monkeypatch.setattr(dsw, "AIClient", lambda: client)
    done_unit = {
        "index": 0, "titles": ["절 0"], "tldr": "이전 요약", "detail": "이전 상세.",
        "inconsistencies": [],
    }
    row = SimpleNamespace(payload={
        "envelope": {"x": 1},
        "presegment": {"map_results": {"0": done_unit}},
    })
    doc, session = _doc(), FakeSession()
    await dsw._process_map_reduce(
        doc, row, _envelope(), "generic", plan, session,
        defer_on_deep_unavailable=False,
    )
    # unit 0 is not called again: (n-1) maps + 1 reduce
    assert len(client.prompts) == n
    assert row.payload["presegment"]["map_results"]["0"]["detail"] == "이전 상세."
    assert doc.ai_detail_summary == "최종 상세."
 # ─── map unit failure → raise (successful units persisted) ───────────────────
 @pytest.mark.asyncio
 async def test_map_unit_parse_failure_raises_but_persists_good_units(
    monkeypatch, _patch_telemetry
 ):
    plan = plan_summarize_units(GIANT_AUTO_MD)
    n = len(plan.units)
    client = FakeClient(fail_indexes={1})  # only the second map call returns an unparseable response
    monkeypatch.setattr(dsw, "AIClient", lambda: client)
    doc, session = _doc(), FakeSession()
    row = SimpleNamespace(payload={"envelope": {"x": 1}})
    with pytest.raises(ValueError, match="map 유닛"):
        await dsw._process_map_reduce(
            doc, row, _envelope(), "generic", plan, session,
            defer_on_deep_unavailable=False,
        )
    # the n-1 successful units are persisted, so a retry re-calls only the one failure
    assert len(row.payload["presegment"]["map_results"]) == n - 1
    assert "1" not in row.payload["presegment"]["map_results"]
    assert doc.ai_detail_summary is None  # doc is not written
    assert _patch_telemetry == []  # no bogus completion event
 # ─── drain hold: completed units kept + StageDeferred propagated ─────────────
 @pytest.mark.asyncio
 async def test_map_defer_propagates_and_keeps_progress(monkeypatch, _patch_telemetry):
    plan = plan_summarize_units(GIANT_AUTO_MD)
    client = FakeClient(defer_from=1)  # the MacBook drops out after the first unit succeeds
    monkeypatch.setattr(dsw, "AIClient", lambda: client)
    doc, session = _doc(), FakeSession()
    row = SimpleNamespace(payload={"envelope": {"x": 1}})
    with pytest.raises(StageDeferred):
        await dsw._process_map_reduce(
            doc, row, _envelope(), "generic", plan, session,
            defer_on_deep_unavailable=True,  # drain semantics: propagate the hold
        )
    assert len(row.payload["presegment"]["map_results"]) == 1
    assert doc.ai_detail_summary is None
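Review note: the aliasing failure that `FakeSession` is built to catch can be reproduced without a database. The sketch below is an illustration in plain Python, assuming nothing about how the worker actually persists its payload; `persist_in_place`, `persist_fresh`, and `run` are hypothetical names. It contrasts mutating one shared dict with building a fresh dict per commit, and records object references the way `FakeSession.snapshots` does.

```python
import copy


def persist_in_place(payload: dict, key: str, result: dict, committed: list) -> dict:
    """Mutates the existing dict, so every recorded reference sees the same contents."""
    payload.setdefault("presegment", {}).setdefault("map_results", {})[key] = result
    committed.append(payload)  # reference, not a copy
    return payload


def persist_fresh(payload: dict, key: str, result: dict, committed: list) -> dict:
    """Builds a new dict per commit, so earlier references keep their own contents."""
    fresh = copy.deepcopy(payload)
    fresh.setdefault("presegment", {}).setdefault("map_results", {})[key] = result
    committed.append(fresh)
    return fresh


def run(persist, n: int = 3) -> list[int]:
    """Persist n unit results and report how many results each recorded commit holds."""
    payload, committed = {"envelope": {"x": 1}}, []
    for i in range(n):
        payload = persist(payload, str(i), {"index": i}, committed)
    return [len(p["presegment"]["map_results"]) for p in committed]


if __name__ == "__main__":
    print(run(persist_in_place))  # every entry is n: all references alias one dict
    print(run(persist_fresh))     # 1, 2, ..., n: each commit owns its state
```

With a plain JSON column, SQLAlchemy does not detect in-place mutation of the stored dict unless mutation tracking such as `sqlalchemy.ext.mutable` is used, which is why assigning a new object per commit (or using a mutable-aware type) matters for the per-unit persist the tests require.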
@@ -4,6 +4,8 @@ services/queue_overview 의 SQL 수집부와 분리된 순수 판정 함수
 (stage_machine_map / build_machines / build_summarize_eta / build_trend /
 build_totals / compute_eta_minutes / rows_to_* / display_title) are
 verified against mock rows. Integration (real SQL) is checked by a live smoke test after deploy.
 As of the 2026-07-02 cutover this targets 2 nodes (NAS + Mac mini); the old 3-node lanes are removed.
 """
 from datetime import datetime
@@ -18,7 +20,6 @@ from services.queue_overview import (
    compute_eta_minutes,
    display_title,
    rows_to_stage_stats,
    rows_to_summarize_split,
    stage_machine_map,
 )
@@ -36,186 +37,115 @@ def _stage(**kw) -> dict:
    return base
 def _split(macbook: dict | None = None, macmini: dict | None = None) -> dict:
    """summarize 풀 완료 실적 split — 미지정 0."""
    zero = {"done_1h": 0, "done_today": 0, "done_15m": 0}
    return {
        "macbook": {**zero, **(macbook or {})},
        "macmini": {**zero, **(macmini or {})},
    }
 def _machine(machines: list[dict], key: str) -> dict:
    return next(m for m in machines if m["key"] == key)
 # ─── stage→machine 귀속 맵 ────────────────────────────────────────────────────
-def test_stage_machine_map_deep_enabled():
+def test_stage_machine_map_two_nodes():
-    smap = stage_machine_map(deep_enabled=True)
+    smap = stage_machine_map()
    for s in ("extract", "embed", "chunk", "markdown", "preview", "thumbnail", "fulltext", "stt"):
-        assert smap[s] == "gpu"
+        assert smap[s] == "nas"
    assert smap["classify"] == "macmini"
    assert smap["summarize"] == "macmini"
    assert smap["deep_summary"] == "macbook"
 def test_stage_machine_map_deep_disabled():
    """deep 슬롯 부재 시 deep_summary 도 macmini 귀속."""
    smap = stage_machine_map(deep_enabled=False)
    assert smap["deep_summary"] == "macmini"
 # ─── 머신 카드 귀속 합산 ──────────────────────────────────────────────────────
-def test_gpu_stage_counts_attribution():
+def test_nas_stage_counts_attribution():
    stats = {
        "extract": _stage(pending=3, processing=1, done_1h=5, done_today=9, done_15m=1),
        "stt": _stage(failed=2, done_1h=1, done_today=2),
    }
-    machines = build_machines(stats, _split(), [], deep_enabled=True)
+    machines = build_machines(stats, [])
-    gpu = _machine(machines, "gpu")
+    nas = _machine(machines, "nas")
-    assert (gpu["pending"], gpu["processing"], gpu["failed"]) == (3, 1, 2)
+    assert (nas["pending"], nas["processing"], nas["failed"]) == (3, 1, 2)
-    assert (gpu["done_1h"], gpu["done_today"]) == (6, 11)
+    assert (nas["done_1h"], nas["done_today"]) == (6, 11)
-    # gpu 의 stages 는 정적 8종 전부 (집계 0 이어도 표시)
+    # nas 의 stages 는 정적 8종 전부 (집계 0 이어도 표시)
-    assert gpu["stages"] == [
+    assert nas["stages"] == [
        "extract", "embed", "chunk", "markdown",
        "preview", "thumbnail", "fulltext", "stt",
    ]
-def test_summarize_pool_split_attribution():
+def test_macmini_llm_stages_attribution():
-    """summarize pending/failed = macmini 귀속, 완료 실적은 split 로 분리 —
+    """classify/summarize/deep_summary 전부 macmini 귀속 (단일 생성 LLM 허브)."""
    stage-level summarize done 수치는 카드에 이중 합산되지 않는다."""
    stats = {
        "classify": _stage(done_1h=2, done_today=3),
        "summarize": _stage(pending=7, failed=1, done_1h=10, done_today=20),
        "deep_summary": _stage(pending=2, processing=1, done_1h=3, done_today=4),
    }
-    split = _split(macbook={"done_1h": 4, "done_today": 8}, macmini={"done_1h": 6, "done_today": 12})
+    machines = build_machines(stats, [])
    machines = build_machines(stats, split, [], deep_enabled=True)
    macmini = _machine(machines, "macmini")
-    macbook = _machine(machines, "macbook")
+    assert macmini["pending"] == 9 and macmini["failed"] == 1
-
+    assert macmini["processing"] == 1
-    assert macmini["pending"] == 7 and macmini["failed"] == 1
+    assert macmini["done_1h"] == 2 + 10 + 3
-    assert macmini["done_1h"] == 2 + 6          # classify + macmini 몫 (10 아님)
+    assert macmini["done_today"] == 3 + 20 + 4
-    assert macmini["done_today"] == 3 + 12
+    assert macmini["stages"] == ["classify", "summarize", "deep_summary"]
-    assert macbook["done_1h"] == 4 and macbook["done_today"] == 8
+    assert _machine(machines, "nas")["pending"] == 0
    assert macbook["pending"] == 0              # 풀 pending 은 macmini 만
-def test_summarize_by_machine_projection():
+def test_deferred_pending_on_macmini_card():
-    """build_summarize_by_machine = split 의 done_1h/done_today 를 머신별로 투영
+    """보류(deferred_until 미래)는 summarize+deep_summary 합산으로 macmini 카드 귀속
-    (done_15m 은 제외 — 내부 state 판정 전용)."""
+    (보류 = LLM 백오프 신호)."""
    from services.queue_overview import build_summarize_by_machine
    split = _split(
        macbook={"done_1h": 226, "done_today": 312, "done_15m": 60},
        macmini={"done_1h": 37, "done_today": 94, "done_15m": 9},
    )
    sbm = build_summarize_by_machine(split)
    assert sbm == {
        "macmini": {"done_1h": 37, "done_today": 94},
        "macbook": {"done_1h": 226, "done_today": 312},
    }
    assert "done_15m" not in sbm["macbook"]
 def test_compose_overview_includes_summarize_by_machine():
    """compose_overview 응답 계약에 summarize_by_machine 포함 (FE 레인 분담 재료)."""
    now_kst = datetime(2026, 6, 13, 13, 0, tzinfo=KST)
    stats = {"summarize": _stage(pending=1317, done_1h=264)}
    split = _split(macbook={"done_1h": 226, "done_today": 312}, macmini={"done_1h": 37, "done_today": 94})
    ov = compose_overview(stats, split, {}, {}, [], deep_enabled=True, now_kst=now_kst)
    assert ov["summarize_by_machine"]["macbook"]["done_1h"] == 226
    assert ov["summarize_by_machine"]["macmini"]["done_today"] == 94
 def test_deep_disabled_deep_summary_counts_to_macmini():
    stats = {"deep_summary": _stage(pending=2, processing=1, done_1h=3, done_today=4)}
    machines = build_machines(stats, _split(), [], deep_enabled=False)
    macmini = _machine(machines, "macmini")
    macbook = _machine(machines, "macbook")
    assert macmini["pending"] == 2 and macmini["processing"] == 1
    assert macmini["done_1h"] == 3 and macmini["done_today"] == 4
    assert macbook["stages"] == [] and macbook["pending"] == 0
    assert _machine(machines, "macmini")["stages"] == ["classify", "summarize", "deep_summary"]
 def test_deferred_pending_always_on_macbook_card():
    """보류(deferred_until 미래)는 summarize+deep_summary 합산으로 macbook 카드 귀속.
    deep 슬롯 유무와 무관 (보류 = 맥북 불가 신호)."""
    stats = {
        "summarize": _stage(pending=5, deferred_pending=2),
        "deep_summary": _stage(pending=1, deferred_pending=1),
    }
-    for deep_enabled in (True, False):
+    machines = build_machines(stats, [])
-        machines = build_machines(stats, _split(), [], deep_enabled=deep_enabled)
+    assert _machine(machines, "macmini")["deferred_pending"] == 3
-        assert _machine(machines, "macbook")["deferred_pending"] == 3
+    assert _machine(machines, "nas")["deferred_pending"] == 0
        assert _machine(machines, "gpu")["deferred_pending"] == 0
        assert _machine(machines, "macmini")["deferred_pending"] == 0
 # ─── state 판정 ───────────────────────────────────────────────────────────────
-def test_macbook_state_active_wins_over_deferred_while_working():
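+def test_macmini_state_active_wins_over_deferred_while_working():
    """Active beats deferred (user feedback 2026-06-11): while it is working, the state is 'active' even if backoff remains.
    The deferred count is conveyed separately in the deferred_pending field, which the card line displays.
    """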
-    stats = {"summarize": _stage(pending=1, deferred_pending=1)}
+    stats = {"summarize": _stage(pending=1, deferred_pending=1, done_15m=3)}
-    split = _split(macbook={"done_15m": 3})
+    machines = build_machines(stats, [])
-    machines = build_machines(stats, split, [], deep_enabled=True)
+    mm = _machine(machines, "macmini")
-    mb = _machine(machines, "macbook")
+    assert mm["state"] == "active"
-    assert mb["state"] == "active"
+    assert mm["deferred_pending"] == 1
    assert mb["deferred_pending"] == 1
-def test_macbook_state_deferred_only_when_not_working():
+def test_macmini_state_deferred_only_when_not_working():
    """'Deferred' only when work has stopped (0 processing, 0 recent completions) and only backoff has piled up."""
    stats = {"summarize": _stage(pending=1, deferred_pending=1)}
-    machines = build_machines(stats, _split(), [], deep_enabled=True)
+    machines = build_machines(stats, [])
-    assert _machine(machines, "macbook")["state"] == "deferred"
+    assert _machine(machines, "macmini")["state"] == "deferred"
-def test_macbook_state_active_on_recent_qwen_done():
+def test_macmini_state_idle():
-    split = _split(macbook={"done_15m": 1})
+    machines = build_machines({}, [])
    machines = build_machines({}, split, [], deep_enabled=True)
    assert _machine(machines, "macbook")["state"] == "active"
 def test_macbook_state_idle():
    machines = build_machines({}, _split(), [], deep_enabled=True)
    assert _machine(machines, "macbook")["state"] == "idle"
 def test_gpu_state_active_on_processing():
    stats = {"extract": _stage(processing=1)}
    machines = build_machines(stats, _split(), [], deep_enabled=True)
    assert _machine(machines, "gpu")["state"] == "active"
 def test_gpu_state_active_on_recent_done():
    stats = {"embed": _stage(done_15m=2)}
    machines = build_machines(stats, _split(), [], deep_enabled=True)
    assert _machine(machines, "gpu")["state"] == "active"
 def test_gpu_state_idle_when_old_done_only():
    stats = {"embed": _stage(done_1h=5, done_today=9)}     # 15분 내 완료 없음
    machines = build_machines(stats, _split(), [], deep_enabled=True)
    assert _machine(machines, "gpu")["state"] == "idle"
 def test_macmini_state_not_active_on_macbook_pool_done():
    """summarize 풀 완료가 전부 macbook 몫이면 macmini 는 active 아님 (귀속 기준)."""
    stats = {"summarize": _stage(done_15m=1)}
    split = _split(macbook={"done_15m": 1})
    machines = build_machines(stats, split, [], deep_enabled=True)
    assert _machine(machines, "macmini")["state"] == "idle"
 def test_nas_state_active_on_processing():
    stats = {"extract": _stage(processing=1)}
    machines = build_machines(stats, [])
    assert _machine(machines, "nas")["state"] == "active"
 def test_nas_state_active_on_recent_done():
    stats = {"embed": _stage(done_15m=2)}
    machines = build_machines(stats, [])
    assert _machine(machines, "nas")["state"] == "active"
 def test_nas_state_idle_when_old_done_only():
    stats = {"embed": _stage(done_1h=5, done_today=9)}     # no completions in the last 15 minutes
    machines = build_machines(stats, [])
    assert _machine(machines, "nas")["state"] == "idle"
 def test_macmini_state_active_on_summarize_processing():
    stats = {"summarize": _stage(processing=1)}
-    machines = build_machines(stats, _split(), [], deep_enabled=True)
+    machines = build_machines(stats, [])
    assert _machine(machines, "macmini")["state"] == "active"
@@ -228,21 +158,18 @@ def test_current_summarize_to_macmini_max_two():
        {"stage": "summarize", "document_id": 3, "title": "문서C", "original_filename": None, "file_path": None},
        {"stage": "extract", "document_id": 4, "title": "문서D", "original_filename": None, "file_path": None},
    ]
-    machines = build_machines({}, _split(), rows, deep_enabled=True)
+    machines = build_machines({}, rows)
    macmini = _machine(machines, "macmini")
-    gpu = _machine(machines, "gpu")
+    nas = _machine(machines, "nas")
    assert [c["document_id"] for c in macmini["current"]] == [1, 2]    # 최대 2건
    assert macmini["current"][0] == {"document_id": 1, "title": "문서A", "stage": "summarize"}
-    assert [c["document_id"] for c in gpu["current"]] == [4]
+    assert [c["document_id"] for c in nas["current"]] == [4]
    assert _machine(machines, "macbook")["current"] == []
-def test_current_deep_summary_follows_deep_slot():
+def test_current_deep_summary_to_macmini():
    rows = [{"stage": "deep_summary", "document_id": 9, "title": "심층", "original_filename": None, "file_path": None}]
-    enabled = build_machines({}, _split(), rows, deep_enabled=True)
+    machines = build_machines({}, rows)
-    disabled = build_machines({}, _split(), rows, deep_enabled=False)
+    assert _machine(machines, "macmini")["current"][0]["document_id"] == 9
    assert _machine(enabled, "macbook")["current"][0]["document_id"] == 9
    assert _machine(disabled, "macmini")["current"][0]["document_id"] == 9
 def test_display_title_fallback_chain():
@@ -344,32 +271,15 @@ def test_rows_to_stage_stats_conversion():
    assert stats["summarize"]["deferred_pending"] == 2
-def test_rows_to_summarize_split_conversion():
-    rows = [
-        (True, 4, 8, 1),       # is_macbook
-        (False, 6, 12, 0),
-    ]
-    split = rows_to_summarize_split(rows)
-    assert split["macbook"] == {"done_1h": 4, "done_today": 8, "done_15m": 1}
-    assert split["macmini"] == {"done_1h": 6, "done_today": 12, "done_15m": 0}
-def test_rows_to_summarize_split_empty():
-    split = rows_to_summarize_split([])
-    assert split["macbook"]["done_1h"] == 0 and split["macmini"]["done_today"] == 0
 def test_compose_overview_contract_shape():
    """Pin that the keys of the response dict exactly match the FE contract shape."""
    out = compose_overview(
        {"summarize": _stage(pending=1)},
-        _split(),
        {}, {}, [],
-        deep_enabled=True,
        now_kst=datetime(2026, 6, 11, 14, 30, tzinfo=KST),
    )
    assert set(out.keys()) == {"machines", "stages", "summarize_eta", "trend_24h", "totals"}
-    assert [m["key"] for m in out["machines"]] == ["gpu", "macmini", "macbook"]
+    assert [m["key"] for m in out["machines"]] == ["nas", "macmini"]
    for m in out["machines"]:
        assert set(m.keys()) == {
            "key", "label", "state", "stages", "pending", "processing", "failed",
@@ -381,7 +291,7 @@ def test_compose_overview_contract_shape():
    assert set(out["trend_24h"][0].keys()) == {"hour", "inflow", "done"}
    assert set(out["totals"].keys()) == {"pending", "processing", "failed"}
    # Pin the machine labels (never expose raw model names; labels only)
-    assert [m["label"] for m in out["machines"]] == ["GPU 서버", "맥미니", "맥북 M5 Max"]
+    assert [m["label"] for m in out["machines"]] == ["나스", "맥미니"]
 # ─── build_stages (per-stage status; 2026-06-11 user feedback: make completions visible) ──────
@@ -0,0 +1,54 @@
 """Unit tests for rerank protocol normalization: 2-node migration P1-4 (llama.cpp /v1/rerank).
 Covers only the pure functions (ai/rerank_protocol.py); no HTTP/DB dependency.
 Run: PYTHONPATH=app pytest tests/test_rerank_protocol.py
 """
 import json
 from pathlib import Path
 from ai.rerank_protocol import normalize_llamacpp_rerank
 FIXTURES = Path(__file__).parent / "fixtures"
 def test_normalize_llamacpp_shape_and_desc_sort():
    payload = {
        "model": "bge-reranker-v2-m3",
        "results": [
            {"index": 0, "relevance_score": 0.12},
            {"index": 1, "relevance_score": 2.21},
            {"index": 2, "relevance_score": -1.5},
        ],
    }
    out = normalize_llamacpp_rerank(payload)
    # TEI contract: [{"index","score"}], sorted by score descending
    assert [r["index"] for r in out] == [1, 0, 2]
    assert all(set(r) == {"index", "score"} for r in out)
    assert out[0]["score"] == 2.21
 def test_normalize_llamacpp_missing_fields_skipped():
    payload = {
        "results": [
            {"index": 0},  # no relevance_score → dropped
            {"relevance_score": 1.0},  # no index → dropped
            {"index": 3, "relevance_score": 0.5},
        ]
    }
    assert normalize_llamacpp_rerank(payload) == [{"index": 3, "score": 0.5}]
 def test_normalize_llamacpp_empty_and_absent_results():
    assert normalize_llamacpp_rerank({}) == []
    assert normalize_llamacpp_rerank({"results": []}) == []
 def test_tei_fixture_shape_is_already_contract():
    """Check that the real response in the captured TEI fixture (pinned from the Phase 2B G0-1 spec) already has the contract shape without normalization."""
    doc = json.loads((FIXTURES / "tei_rerank_response.json").read_text())
    captured = doc["captured_responses"]["baseline_bge_v2_m3"]["raw"]
    assert isinstance(captured, list) and captured
    assert {"index", "score"} <= set(captured[0])
    # the spec string also matches the contract (including the score desc sort)
    assert "index" in doc["response_shape"] and "score" in doc["response_shape"]
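The tests above pin down the normalization contract closely enough to sketch a function that would satisfy them. This is a minimal sketch, not the actual contents of ai/rerank_protocol.py, which this diff does not show. The name `normalize_llamacpp_rerank_sketch` is hypothetical, and the `int`/`float` coercion is an assumption added for illustration.

```python
from typing import Any


def normalize_llamacpp_rerank_sketch(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Hypothetical sketch: map a llama.cpp /v1/rerank payload to the
    shared contract [{"index": int, "score": float}, ...], score descending."""
    normalized: list[dict[str, Any]] = []
    # A missing or empty "results" key yields an empty list.
    for item in payload.get("results") or []:
        # Drop entries that lack either field instead of guessing a default.
        if "index" not in item or "relevance_score" not in item:
            continue
        normalized.append(
            {"index": int(item["index"]), "score": float(item["relevance_score"])}
        )
    # The contract requires descending score order.
    normalized.sort(key=lambda r: r["score"], reverse=True)
    return normalized


if __name__ == "__main__":
    sample = {
        "results": [
            {"index": 0, "relevance_score": 0.12},
            {"index": 1, "relevance_score": 2.21},
            {"index": 2, "relevance_score": -1.5},
        ]
    }
    print(normalize_llamacpp_rerank_sketch(sample))
```

Keeping the function pure (no HTTP, no DB) is what lets the test file run without the service stack, as its docstring notes.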
Author	SHA1	Message	Date
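hyungi	b91b05e889	refactor(board): rebuild the processing-machine board for 2 nodes (NAS + Mac mini). Reflects the 2026-07-02 cutover: GPU server retired, MacBook night-drain on hold (decided 06-29). - Two lanes: NAS (the Docker stages of the DS core: extract/markdown/chunk, embedding, etc.) and Mac mini (classify/summarize/deep analysis; the single generative-LLM hub plus bge-m3/rerank) - Removed the summarize pool split (summarize_by_machine, the ai_model_version join SQL); dropped from the response schema after confirming the FE was its only consumer (5 queries -> 4 queries) - Removed MacBook-dependent UI: summarize offload share bar, summarize join chip, burndown join inflection marker, sleeping text, global-strip MacBook chip (replaced by a Mac mini chip) - deferred_pending = LLM backoff signal, attributed to the Mac mini card (behavior preserved) - Machine-independent features preserved: burndown chart, honest ETA, failure drawer, background jobs, etc. - background_jobs default machine attribution gpu -> nas - Unit tests rewritten for 2 nodes (27 passed) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-02 16:51:32 +09:00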
hyungi	304a2b9c0f	Merge pull request 'Feat/two node endpoints' (#51 ) from feat/two-node-endpoints into main Reviewed-on: #51	2026-07-02 14:31:27 +09:00
hyungi	d53fcc2b36	feat(search): make MAX_RERANK_INPUT tunable via env to address 2-node rerank latency. Mac mini llama.cpp rerank time is linear in the candidate count (measured: 50 = 0.60s, 200 = 1.89s); setting MAX_RERANK_INPUT=50 on the NAS deployment trims tail latency. Default 200 = current behavior, no regression. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-02 13:30:04 +09:00
hyungi	43594620b1	fix(tests): correct the rerank fixture path; captured_responses.*.raw holds the actual response list Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-02 13:11:33 +09:00
hyungi	b73a5cc601	feat(infra): 2-node migration P1-4: rerank protocol switch (tei\|llamacpp), explicit OCR/STT gates, 413 re-homing - New AIModelConfig.protocol discriminator (default tei = no regression); llamacpp = /v1/rerank request/response schema normalization (pure function in ai/rerank_protocol.py + 4 unit tests) - Explicit OCR_ENABLED/STT_ENABLED gates for the retirement of the GPU CUDA services (Surya/faster-whisper); not silent (warning log + terminal record in extract_meta) - DS Caddyfile request_body 100MB: re-homes the 413 policy from the edge (home-caddy) to the inside (in preparation for the switch to the DSM reverse proxy; consistent with upload.max_bytes) - SSE X-Accel-Buffering left unchanged: a prior check found it already implemented (eid_chat) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-02 13:11:06 +09:00
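hyungi	3b7fd900e4	fix(summarize): map_results persist aliasing: UPDATE skipped because unit snapshots were retroactively contaminated. Found in the 60254 live E2E: the run completed, but only unit 0 was persisted in payload.presegment.map_results. Cause = the map_results dict was mutated in place → the SQLAlchemy committed snapshot from the previous commit references the same nested object → judged old==new → UPDATE skipped from the 2nd commit onward. An idempotent resume then pays to re-call completed units (no impact on correctness). fix = rebuild map_results/preseg/payload as fresh dicts for every unit (zero shared references). test = FakeSession pins the payload object reference at commit time, and after-the-fact serialization asserts that the snapshot unit count increases monotonically 1..n; negative verification done (FAILED against the old code). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-02 09:47:57 +09:00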
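hyungi	c2077b3108	feat(summarize): presegment PR2: deep_summary branch + HOLD wiring (TIER1 local map-reduce). plan ds-presegment-mapreduce-2. At or below TRIGGER (25K tok) = the existing single call, byte-identical, no regression. Above it, a 3-way over% gate: auto = per-unit map (26B) → reduce (26B, p3c_deep_summary_reduce variant) → written to ai_detail_summary as before (mismatch = dedup of the merged reduce+map) / hybrid, whole = HOLD (payload.presegment.awaiting_split + StageDeferred 24h, not sent to the Mac mini; alerting and Claude-attended splitting are PR3). - Per-unit idempotent resume: each successful unit is committed to payload.map_results immediately, so completed units are skipped after a 502/defer/restart and only the failed unit raises → reuses the existing attempts/backoff - Every LLM call stays at or below the cap (12K tok): map is guaranteed by greedy-pack, reduce by proportional truncation in build_reduce_units_block, and this can be asserted from the est_tokens log - The gate is released between calls → short interactive requests interleave (the core fix for hub starvation) - fix: the absolute import `from app.services...` in summarize_units; the container (build context ./app) has no app package, so this latent PR1 bug raises ModuleNotFoundError once wired in → changed to a relative import (works in both the container and repo-root tests) - tests: 6 helper + 5 worker seam (map-reduce e2e, resume, unit failure, drain hold, HOLD); 26 passed including the 15 from PR1, and 123 passed in the adjacent policy/hier_decomp/fair_share suites Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-02 09:14:22 +09:00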
hyungi	51e8034759	feat(safety): safety library UI Phase 3: /safety with 3 tabs (incidents, laws and guidelines, books and standards). safety-library-1 Phase 3 slice. /safety redirects to incidents; tabs = incident / law, guide segment (laws default to KR) / standard, book, manual, paper presets. Shared SafetyDocList (reuses the GET /documents/ material_type C-1 contract; no backend change, consistent with the freeze) + 1 Sidebar nav entry. Case grouping and version_status badges need an API extension, so they are follow-ups. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-01 23:13:12 +00:00
hyungi	61e70864e4	feat(summarize): presegment PR1: summarize_units pure functions (greedy-pack + 3-way gate). plan ds-presegment-mapreduce-2 PR1. CAP 12K tok/unit, TRIGGER 25K, over% gate (0=auto / <=40=hybrid / >40=whole). Token estimate = PR0 calibration against real Qwen (KO 0.529 / other 0.217 tok/char). leaf = reuses hier_decomp.builder (leaf_hard_max=inf suppresses window-split). Pure functions, zero DB/IO; wiring is PR2. tests/summarize_units 15 passed. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-01 23:07:40 +00:00
hyungi	a182def9e6	ops(deps): introduce requirements.lock: all 101 packages from the live pip freeze fully pinned. Closes the leftover priority-6 item (lockfile) from the DS security-audit remediation. requirements.txt (floor specs) is kept; the Dockerfile install source switches to requirements.lock (== pins), removing the risk of dependency drift on rebuild. lock = known-good freeze snapshot of the live container. Verified: new image freeze matches the lock, import smoke, clean boot, health 200. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-01 22:28:27 +00:00
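hyungi	6d447f9cba	feat(study): theory↔question bridge (Stage B): per-concept accuracy and weak-concept map. The B of theory study B→A→C. Links theory to the finished question-solving feature (weakness-driven). - Migration 382 study_concept_links (concept doc↔past exam questions, no FK) + backfill SQL (embedding cosine top-k=10, threshold 0.62 → 2362 links, 284 concepts, 964 questions) - concept_links service (related_questions, weakness_map rollup) + GET /concepts/{id}/questions and /concepts/weakness-map (route order = weakness-map first) - Related past-questions section in the reader (accuracy, question stub → question detail) + weak-concept widget on home - Adversarial-review fixes: Promise.all isolation (prevents a weakness-map failure from blacking out the core dashboard), q.subject null fallback. Backfill = run wrapped in a transaction after deploy. Question solving untouched Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-01 12:05:09 +09:00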
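hyungi	f38ec177d7	feat(study): concept study reader (Stage A): structure parsing, recall, backlinks. The A of theory-study improvements B→A→C. Renders concept notes by structure (summary / body / frequently tested ★ / related concepts) + active recall + related-concept backlinks + previous/next. - concept_parser: md skeleton parser (273/273 invariant) + related-concept backlink resolution (exact → title⊆phrase substring, with an over-match guard) - concept_curriculum.concept_detail + GET /api/study/concepts/{id} (scoped to the concept-document tag) - /study/read/[docId] reader (reuses MarkdownDoc KaTeX+docimg; read/recall modes) + wired the home "today's concept" link - 5 adversarial-review findings addressed (double load, substring mis-wiring, endpoint scope, prev/next determinism, in-flight guard). No migration; question solving untouched Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-01 11:51:40 +09:00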
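The aliasing failure described in commit 3b7fd900e4 can be reproduced without a database. The sketch below is a toy model under stated assumptions, not SQLAlchemy's actual change tracking: `FakeRow`, `persist_in_place`, and `persist_rebuilt` are hypothetical names, and the "UPDATE" is just a counter. It only illustrates why a snapshot that shares a nested dict with the live value compares equal after an in-place mutation.

```python
import copy


class FakeRow:
    """Toy stand-in for a row whose commit() skips the write when old == new."""

    def __init__(self):
        self.payload = {}
        self._committed = {}   # snapshot of the last committed value
        self.update_count = 0
        self.persisted = {}    # what a real database would hold

    def commit(self):
        # Emit an "UPDATE" only if the value differs from the snapshot.
        if self.payload != self._committed:
            self.update_count += 1
            self.persisted = copy.deepcopy(self.payload)
        # The snapshot keeps references to nested objects (no deep copy).
        self._committed = self.payload


def persist_in_place(row, units):
    """Buggy pattern: one nested dict is mutated and shared across commits."""
    map_results = {}
    for i, unit in enumerate(units):
        map_results[str(i)] = unit  # also mutates the previous snapshot
        row.payload = {"presegment": {"map_results": map_results}}
        row.commit()


def persist_rebuilt(row, units):
    """Fixed pattern: every level is a fresh dict for each unit."""
    map_results = {}
    for i, unit in enumerate(units):
        map_results = {**map_results, str(i): unit}  # new dict, nothing shared
        row.payload = {"presegment": {"map_results": map_results}}
        row.commit()


if __name__ == "__main__":
    buggy, fixed = FakeRow(), FakeRow()
    persist_in_place(buggy, ["a", "b", "c"])
    persist_rebuilt(fixed, ["a", "b", "c"])
    print(buggy.update_count, len(buggy.persisted["presegment"]["map_results"]))
    print(fixed.update_count, len(fixed.persisted["presegment"]["map_results"]))
```

In this model the in-place variant writes once and leaves only unit 0 persisted, which matches the symptom the commit message reports, while the rebuilt variant writes once per unit.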