hyungi_document_server/migrations/318_document_chunks_char_start.sql
hyungi aeb9290cbd feat(documents): char_start offset for hier sections (Path B): builder-computed offset for md_content jumps
Plan ds-outline-anchor-b5 (code for g1 through g6). Converts the 0% jump rate of the core ASME/statute windowed sections
into 100% deterministic jumps by using a server-computed char_start (builder offset).

- g1 migration 318: document_chunks.char_start INTEGER NULL (single statement, idempotent)
- g2 builder: emits char_start by mirroring the FE line/offset model (split('\n') + UTF-16 code units + code-fence skip).
  window-child=NULL, split-parent=heading offset, preamble=NULL, CR not stripped, NFC=telemetry.
  node.text is preserved (the line model is hash-neutral), so hash_stable docs are preserved. 7 unit tests.
- g3 persist + backfill hybrid:
  * persist: INSERT includes char_start
  * update-char-start (g3-tU): non-destructive for hash_stable docs. 100% jump-target VERIFY (NEW-1) +
    position-aligned PK UPDATE (NEW-2); docs that fall short are DEMOTEd and join re-decompose (NEW-4)
  * --reprocess (g3-t2): md_content source (g0-t1) + jump-target-set completion marker (B1) + B_jumptarget>=1 (B3);
    --doc is required, else REFUSE. self-heal sweep (g3-t3).
- g4 /sections: char_start in inner+outer SELECT + split-parent exposed (is_leaf OR %_split)
- g5 FE: resolveAnchorMap (BE-first, NEW-5 jump-target-candidate-scoped fallback, C1 OR-exclude),
  per-render-site basis guard (C3), endsWith('_split') fix + collapseWindows absorbs split-parent (C2).
  25 unit tests (including NEW-5/B4/C1/C2).
- g6 hier_outline_quality_gate.py: read-only g-measure(verdict/B_jumptarget/hash_stable/dup/fence)
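
A minimal sketch of the g2 offset model described above, assuming a Python builder. It is not the actual builder code: `utf16_len`, `heading_offsets`, and both regexes are hypothetical names introduced here, and the fence/heading detection is deliberately simplified. It only illustrates why offsets must be counted in UTF-16 code units (what the FE's `raw.length` reports) rather than Python code points, with CR left in place.

```python
import re

# Hypothetical, simplified detectors; the real builder rules may differ.
FENCE_RE = re.compile(r"^ {0,3}(`{3,}|~{3,})")
HEADING_RE = re.compile(r"^ {0,3}#{1,6}(\s|$)")


def utf16_len(s: str) -> int:
    """Length of s in UTF-16 code units, i.e. what JavaScript's String.length returns."""
    return len(s.encode("utf-16-le", errors="surrogatepass")) // 2


def heading_offsets(md: str) -> list[tuple[int, str]]:
    """Return (offset, raw_line) for each heading line outside code fences.

    Mirrors the FE loop: split on '\n' only (CR is not stripped) and
    advance by `raw.length + 1` per line.
    """
    found = []
    off = 0
    in_fence = False
    for raw in md.split("\n"):
        if FENCE_RE.match(raw):
            in_fence = not in_fence  # naive toggle, assumed for illustration
        elif not in_fence and HEADING_RE.match(raw):
            found.append((off, raw))
        off += utf16_len(raw) + 1  # +1 for the '\n' removed by split
    return found


sample = "# A\n\U0001F600 text\n```\n# not heading\n```\n## B\r\n"
print(heading_offsets(sample))
```

The emoji on the second line occupies two UTF-16 code units but one Python code point, so a plain `len()`-based counter would place the final heading one position too early and the FE `slice(off)` would land off target.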

Deployment (g7: --no-deps, snapshot, UPDATE-only 32 + re-decompose 230∪demote, accuracy gate) is a separate ops step.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 10:12:26 +09:00

16 lines
1.0 KiB
SQL

-- 318_document_chunks_char_start.sql
-- Plan ds-outline-anchor-b5 (Path B, g1-t1): offset column for jumping from a hier section to the md_content body.
--
-- char_start = offset of the start of the heading line within md_content, measured in **UTF-16 code units**
-- (same unit as FE outlineAnchors.ts:64 `off += raw.length + 1` / MarkdownDoc.svelte:63 `out.slice(off)`).
-- NULL allowed = (a) no md_content (legacy/news/Path A) (b) window-child (node_type='window') (c) preamble (title NULL).
-- → only jump-targets (non-window leaf OR %_split parent) receive a NOT NULL value (BY DESIGN, per the B1/B3 completion-marker criteria).
--
-- Common prereq for both backfill paths:
-- - UPDATE-only path (g3-tU, hash_stable): UPDATE only char_start on stored hier rows (zero DELETE/CASCADE/re-embedding).
-- - re-decompose path (g3-t2, hash_changed): char_start is included in the persist INSERT.
--
-- Idempotent: ADD COLUMN IF NOT EXISTS + init_db version-skip + pg_advisory_xact_lock. No BEGIN/COMMIT (single statement).
ALTER TABLE document_chunks ADD COLUMN IF NOT EXISTS char_start INTEGER NULL;