feat(search): expose hier section outline & summaries in document detail

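PR-DocSrv-Hier-Section-UI-1 Phase 1 (code and commit only; deployment waits until the Phase 2 backfill completes).

- backend: GET /documents/{id}/sections returns the hier leaf outline plus
  chunk_section_analysis summaries. It queries document_chunks directly (this is
  outline display, not retrieval, so the corpus_chunks view is bypassed on
  purpose; stated in the docstring). DISTINCT ON keeps the latest analysis row per chunk.
- frontend: SectionOutline.svelte (left-hand outline, per-doc dynamic group/flat,
  window dedupe, inline summary/breadcrumb on click) and headingPath.ts pure
  utilities (+ 8 node:test unit test cases). [id]/+page.svelte gets a 3-zone layout,
  and the right-hand metadata moves into Tabs [정보|AI|관리] (Info | AI | Manage)
  to resolve the card sprawl.
- Documents without sections, and 404 responses, hide the outline (graceful).
  Jumping to the body text is a follow-up.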

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyungi
2026-05-25 00:22:34 +00:00
parent ec174fc1e7
commit f7198d9d68
5 changed files with 537 additions and 49 deletions
+75
@@ -537,6 +537,81 @@ async def get_document(
return DocumentDetailResponse.model_validate(doc)
# ─── Hier section outline + summaries (PR-DocSrv-Hier-Section-UI-1) ───


class SectionItem(BaseModel):
    chunk_id: int
    section_title: str | None = None  # raw, may contain markdown; cleaned on the frontend (headingPath.ts)
    heading_path: str | None = None  # raw
    level: int | None = None
    node_type: str | None = None  # window | section_split | null
    is_leaf: bool
    section_type: str | None = None
    summary: str | None = None  # set only for analysis rows with status='summarized'; otherwise None
    confidence: float | None = None


class DocumentSectionsResponse(BaseModel):
    doc_id: int
    sections: list[SectionItem]
@router.get("/{doc_id}/sections", response_model=DocumentSectionsResponse)
async def get_document_sections(
    doc_id: int,
    user: Annotated[User, Depends(get_current_user)],
    session: Annotated[AsyncSession, Depends(get_session)],
):
    """Hier section (leaf) outline of a document + section-level summaries (chunk_section_analysis).

    ⚠ View bypass, an intentional exception (do not change):
    The retrieval path (retrieval_service / *_rag) must read only the corpus_chunks view,
    to prevent leaking in_corpus=false rows. This endpoint, however, is not retrieval:
    it displays the full leaf outline of a document, so sections with in_corpus=false
    (disabled for search) must appear as well, and it therefore queries document_chunks
    directly. Switching to corpus_chunks would cause a regression where disabled sections
    vanish from the outline, so never change this.
    (Explicit exception to the Hier-Decomp corpus isolation rule.)

    DISTINCT ON (c.id) + ORDER BY a.created_at/a.id DESC: only the latest analysis row
    per chunk (avoids duplicate JOIN rows when there are multiple prompt_version values).
    Documents without sections (legacy/news) return sections=[].
    """
    from sqlalchemy import text as sql_text

    doc = await session.get(Document, doc_id)
    if not doc or doc.deleted_at is not None:
        # detail text means "Document not found"
        raise HTTPException(status_code=404, detail="문서를 찾을 수 없습니다")

    rows = (
        await session.execute(
            sql_text(
                """
                SELECT chunk_id, section_title, heading_path, level, node_type, is_leaf,
                       section_type, summary, confidence
                FROM (
                    SELECT DISTINCT ON (c.id)
                        c.id AS chunk_id, c.chunk_index, c.section_title, c.heading_path,
                        c.level, c.node_type, c.is_leaf,
                        a.section_type,
                        CASE WHEN a.status = 'summarized' THEN a.summary ELSE NULL END AS summary,
                        a.confidence
                    FROM document_chunks c
                    LEFT JOIN chunk_section_analysis a
                        ON a.chunk_id = c.id AND a.status = 'summarized'
                    WHERE c.doc_id = :doc_id
                      AND c.source_type = 'hier_section'
                      AND c.is_leaf = true
                    ORDER BY c.id, a.created_at DESC, a.id DESC
                ) t
                ORDER BY t.chunk_index
                """
            ).bindparams(doc_id=doc_id)
        )
    ).mappings().all()

    return DocumentSectionsResponse(
        doc_id=doc_id,
        sections=[SectionItem(**dict(r)) for r in rows],
    )
# ─── Library neighbor items (previous/next) ───
# Study flow: after finishing one item, move naturally to the next item in the same chapter.
# Ordered by title ascending within library_path (exact match + child prefixes).
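The "latest analysis row per chunk" rule that the endpoint expresses with DISTINCT ON can be pictured in application code. The following is a minimal sketch, not code from this commit: `AnalysisRowSketch` and `latestPerChunkSketch` are hypothetical names, the rows are assumed to be pre-filtered to status='summarized', and timestamps are assumed to be ISO-8601 strings in one timezone so that string comparison matches time order.

```typescript
// Hypothetical row shape: one analysis row, already filtered to status='summarized'.
interface AnalysisRowSketch {
  chunkId: number;
  analysisId: number;
  createdAt: string; // assumed ISO-8601, same timezone for all rows
  summary: string;
}

// Keep one row per chunk: the newest createdAt wins, ties go to the larger analysisId.
// This mirrors ORDER BY c.id, a.created_at DESC, a.id DESC under DISTINCT ON (c.id).
function latestPerChunkSketch(rows: AnalysisRowSketch[]): Record<number, AnalysisRowSketch> {
  const latest: Record<number, AnalysisRowSketch> = {};
  for (const row of rows) {
    const current: AnalysisRowSketch | undefined = latest[row.chunkId];
    const isNewer =
      current === undefined ||
      row.createdAt > current.createdAt ||
      (row.createdAt === current.createdAt && row.analysisId > current.analysisId);
    if (isNewer) latest[row.chunkId] = row;
  }
  return latest;
}

const sketchRows: AnalysisRowSketch[] = [
  { chunkId: 1, analysisId: 10, createdAt: '2026-05-01T00:00:00Z', summary: 'older prompt version' },
  { chunkId: 1, analysisId: 11, createdAt: '2026-05-20T00:00:00Z', summary: 'newer prompt version' },
  { chunkId: 2, analysisId: 12, createdAt: '2026-05-20T00:00:00Z', summary: 'only row' },
];
const latestRowsSketch = latestPerChunkSketch(sketchRows);
```

Doing this in SQL, as the endpoint does, avoids shipping the duplicate rows to the application at all; the sketch only illustrates the selection rule.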
@@ -0,0 +1,117 @@
<script lang="ts">
// Left-hand section outline for document detail (PR-DocSrv-Hier-Section-UI-1).
// - groupOrFlat decides per document (one-level top-segment groups vs flat).
// - Clicking an item opens an inline accordion with the summary, section_type, and heading_path breadcrumb.
// - No scroll jump into the body (§Q2; deep links are a follow-up). summary=NULL shows the "요약 없음" (no summary) text.
import Badge from '$lib/components/ui/Badge.svelte';
import {
cleanHeading,
pathSegments,
groupOrFlat,
sectionTypeLabel,
type DocumentSection,
type OutlineItem,
} from '$lib/utils/headingPath';
interface Props {
sections: DocumentSection[];
}
let { sections }: Props = $props();
let layout = $derived(groupOrFlat(sections));
let total = $derived(sections.length);
let selectedId = $state<number | null>(null);
function toggle(item: OutlineItem) {
const id = item.section.chunk_id;
selectedId = selectedId === id ? null : id;
}
function title(s: DocumentSection): string {
return cleanHeading(s.section_title) || pathSegments(s.heading_path).at(-1) || '(제목 없음)';
}
function isLowConf(s: DocumentSection): boolean {
return s.confidence != null && s.confidence < 0.5;
}
</script>
{#snippet itemRow(item: OutlineItem)}
{@const s = item.section}
{@const open = selectedId === s.chunk_id}
{@const typeLabel = sectionTypeLabel(s.section_type)}
<li>
<button
type="button"
onclick={() => toggle(item)}
aria-expanded={open}
class={[
'w-full text-left px-2 py-1.5 rounded-md text-xs flex items-start gap-1.5 transition-colors',
open ? 'bg-surface-active text-text' : 'text-dim hover:bg-surface hover:text-text',
].join(' ')}
>
<span class="flex-1 min-w-0 leading-snug break-words">{title(s)}</span>
<span class="flex items-center gap-1 shrink-0">
{#if item.fragmentCount > 1}
<Badge tone="neutral" size="sm">{item.fragmentCount}조각</Badge>
{/if}
{#if typeLabel}
<Badge tone="accent" size="sm">{typeLabel}</Badge>
{/if}
</span>
</button>
{#if open}
<div class="px-2 pb-2 pt-1 text-xs">
{#if pathSegments(s.heading_path).length}
<div class="text-faint mb-1 leading-snug break-words">
{pathSegments(s.heading_path).join(' ')}
</div>
{/if}
{#if s.summary}
<p class="text-text leading-relaxed whitespace-pre-line">{s.summary}</p>
{#if isLowConf(s)}
<div class="mt-1.5">
<Badge tone="warning" size="sm">저신뢰 — 표 추출이 불완전할 수 있음</Badge>
</div>
{/if}
{:else}
<p class="text-faint italic">요약 없음 — 짧은 절이거나 아직 분석되지 않았습니다.</p>
{/if}
</div>
{/if}
</li>
{/snippet}
<div class="text-xs">
<h3 class="text-xs font-semibold text-dim uppercase mb-2 flex items-center justify-between">
<span>절 목차</span>
<span class="text-faint font-normal">{total}</span>
</h3>
{#if layout.mode === 'group'}
<div class="space-y-3">
{#each layout.groups as g (g.key)}
<div>
<div
class={[
'px-2 mb-1 text-[11px] font-semibold uppercase tracking-wide',
g.isOther ? 'text-faint' : 'text-dim',
].join(' ')}
>
{g.key}
<span class="font-normal text-faint">({g.items.length})</span>
</div>
<ul class="space-y-0.5">
{#each g.items as item (item.section.chunk_id)}
{@render itemRow(item)}
{/each}
</ul>
</div>
{/each}
</div>
{:else}
<ul class="space-y-0.5">
{#each layout.items as item (item.section.chunk_id)}
{@render itemRow(item)}
{/each}
</ul>
{/if}
</div>
+110
@@ -0,0 +1,110 @@
// Pure-function regression tests. Run locally with zero dependencies: node --test src/lib/utils/headingPath.test.ts
// (Node ≥23, or 22.6+ with --experimental-strip-types, which strips TS types natively.)
import { test } from 'node:test';
import assert from 'node:assert/strict';
import {
cleanHeading,
pathSegments,
collapseWindows,
groupOrFlat,
sectionTypeLabel,
type DocumentSection,
} from './headingPath.ts';
let _id = 0;
function sec(p: Partial<DocumentSection>): DocumentSection {
return {
chunk_id: ++_id,
section_title: null,
heading_path: null,
level: null,
node_type: null,
is_leaf: true,
section_type: null,
summary: null,
confidence: null,
...p,
};
}
test('cleanHeading: strips markdown/HTML remnants', () => {
assert.equal(cleanHeading('**UG-5 PLATE**<sup>2</sup>'), 'UG-5 PLATE');
assert.equal(cleanHeading(' **DESIGN** '), 'DESIGN');
assert.equal(cleanHeading('a b\tc'), 'a b c');
assert.equal(cleanHeading(null), '');
assert.equal(cleanHeading(''), '');
});
test('pathSegments: split on > + cleaning', () => {
assert.deepEqual(pathSegments('**A** > **B**<sup>1</sup> > C'), ['A', 'B', 'C']);
assert.deepEqual(pathSegments(null), []);
assert.deepEqual(pathSegments(' '), []);
});
test('sectionTypeLabel: Korean mapping + passthrough', () => {
assert.equal(sectionTypeLabel('requirement'), '요건');
assert.equal(sectionTypeLabel('unknown_type'), 'unknown_type');
assert.equal(sectionTypeLabel(null), null);
});
test('collapseWindows: dedupes only consecutive same-heading windows, keeps order', () => {
const input = [
sec({ heading_path: 'Intro', node_type: null }),
sec({ heading_path: 'Pearson', node_type: 'window' }),
sec({ heading_path: 'Pearson', node_type: 'window' }),
sec({ heading_path: 'Pearson', node_type: 'window' }),
sec({ heading_path: 'Conf', node_type: null }),
sec({ heading_path: 'Pearson', node_type: 'window' }), // non-consecutive → new item
];
const out = collapseWindows(input);
assert.equal(out.length, 4);
assert.equal(out[0].fragmentCount, 1); // Intro
assert.equal(out[1].fragmentCount, 3); // Pearson ×3 merged
assert.equal(out[2].fragmentCount, 1); // Conf
assert.equal(out[3].fragmentCount, 1); // non-consecutive Pearson
// order preserved
assert.deepEqual(
out.map((o) => cleanHeading(o.section.heading_path)),
['Intro', 'Pearson', 'Conf', 'Pearson'],
);
});
test('groupOrFlat: few groups + low other% → group (5140-like)', () => {
// 3 top segments × 4 = 12 sections, no windows → group_count 3, other 0%
const sections: DocumentSection[] = [];
for (const top of ['장1', '장2', '장3']) {
for (let i = 0; i < 4; i++) sections.push(sec({ heading_path: `${top} > 절${i}` }));
}
const layout = groupOrFlat(sections);
assert.equal(layout.mode, 'group');
assert.equal(layout.groups.length, 3);
assert.deepEqual(layout.groups.map((g) => g.key), ['장1', '장2', '장3']); // order of appearance
assert.equal(layout.groups[0].items.length, 4);
});
test('groupOrFlat: other% ≥ 50 → demoted to flat (5186/5225-like)', () => {
const sections: DocumentSection[] = [
sec({ heading_path: 'A > a1' }),
sec({ heading_path: 'B > b1' }),
sec({ node_type: 'window', heading_path: 'W1' }),
sec({ node_type: 'window', heading_path: 'W2' }),
sec({ node_type: 'section_split', heading_path: 'S1' }),
sec({ node_type: 'window', heading_path: 'W3' }), // other 4/6 = 66.7%
];
const layout = groupOrFlat(sections);
assert.equal(layout.mode, 'flat');
assert.ok(layout.items.length > 0);
});
test('groupOrFlat: group_count > 30 → demoted to flat', () => {
const sections: DocumentSection[] = [];
for (let i = 0; i < 31; i++) sections.push(sec({ heading_path: `seg${i} > x` }));
const layout = groupOrFlat(sections);
assert.equal(layout.mode, 'flat');
});
test('groupOrFlat: empty input → flat, 0 items', () => {
const layout = groupOrFlat([]);
assert.equal(layout.mode, 'flat');
assert.equal(layout.items.length, 0);
});
+154
@@ -0,0 +1,154 @@
// Pure utilities for displaying the hier section outline (PR-DocSrv-Hier-Section-UI-1).
// Zero SvelteKit/Svelte dependencies → verifiable with Node's built-in test runner (`node --test`).
//
// Responsibilities:
// - cleanHeading: strips raw markdown/HTML remnants from section_title/heading_path.
// - pathSegments: turns heading_path ("A > B > C") into an array of cleaned segments.
// - collapseWindows: dedupes consecutive same-heading node_type='window' entries (artificial splits of oversized body text).
// - groupOrFlat: per-document dynamic decision, one-level top-segment groups vs flat (thresholds based on measurements).
export interface DocumentSection {
chunk_id: number;
section_title: string | null;
heading_path: string | null;
level: number | null;
node_type: string | null; // 'window' | 'section_split' | null
is_leaf: boolean;
section_type: string | null;
summary: string | null;
confidence: number | null;
}
/** One outline entry after window dedupe (representative section + number of merged fragments). */
export interface OutlineItem {
section: DocumentSection;
fragmentCount: number; // >1 shows an "(n조각)" (n fragments) badge
}
export interface OutlineGroup {
key: string; // top segment (OTHER → '기타')
isOther: boolean;
items: OutlineItem[];
}
export interface OutlineLayout {
mode: 'group' | 'flat';
items: OutlineItem[]; // populated in flat mode
groups: OutlineGroup[]; // populated in group mode
}
const OTHER = '__OTHER__';
// Thresholds for the dynamic grouping decision (checked against 3 measured pilot documents: 5140 → group / 5186, 5225 → flat).
const GROUP_MIN = 2;
const GROUP_MAX = 30;
const OTHER_PCT_MAX = 50;
/** section_type → Korean label (loose enum; unmapped values are shown as-is). */
export const SECTION_TYPE_LABEL: Record<string, string> = {
definition: '정의',
requirement: '요건',
procedure: '절차',
formula: '수식',
data_table: '표·데이터',
example: '예시',
case_study: '사례',
question: '문제',
reference: '참조',
overview: '개요',
other: '기타',
};
export function sectionTypeLabel(t: string | null | undefined): string | null {
if (!t) return null;
return SECTION_TYPE_LABEL[t] ?? t;
}
export function cleanHeading(raw: string | null | undefined): string {
if (!raw) return '';
return raw
.replace(/<sup>.*?<\/sup>/gi, '') // footnote superscripts
.replace(/<sub>.*?<\/sub>/gi, '')
.replace(/<[^>]+>/g, '') // remaining HTML tags
.replace(/\*\*/g, '') // **bold**
.replace(/[*_`]/g, '') // remaining markdown markers
.replace(/\s+/g, ' ')
.trim();
}
export function pathSegments(hp: string | null | undefined): string[] {
if (!hp) return [];
// ⚠ Strip first, then split: heading_path can contain the '>' of raw HTML such as <sup>2</sup>,
// so splitting on a bare '>' first would cut tags apart (found by the unit tests). cleanHeading
// removes the HTML tags, so only the ' > ' separators (bare '>') remain by the time we split.
return cleanHeading(hp)
.split('>')
.map((s) => s.trim())
.filter(Boolean);
}
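The ordering concern in the comment above (strip markup before splitting on '>') can be seen on the same input the unit tests use. This is a minimal sketch: `stripMarkupSketch` and `splitSegmentsSketch` are hypothetical, simplified stand-ins for cleanHeading and the split step, not the functions from this file.

```typescript
// Simplified stand-in for cleanHeading: drops <sup>…</sup> footnotes, other tags, and ** markers.
function stripMarkupSketch(raw: string): string {
  return raw
    .replace(/<sup>.*?<\/sup>/gi, '')
    .replace(/<[^>]+>/g, '')
    .replace(/\*\*/g, '')
    .replace(/\s+/g, ' ')
    .trim();
}

// Split on a bare '>' and discard empty pieces.
function splitSegmentsSketch(text: string): string[] {
  return text
    .split('>')
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

const rawPathSketch = '**A** > **B**<sup>1</sup> > C';

// Splitting first cuts the <sup> tag apart and leaves markup in the segments.
const splitFirstSketch = splitSegmentsSketch(rawPathSketch);

// Stripping first leaves only the ' > ' separators, so the split is clean.
const stripFirstSketch = splitSegmentsSketch(stripMarkupSketch(rawPathSketch));
```

With this input, splitting first produces four pieces, two of which are tag fragments, while stripping first produces the three intended segments.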
/** Group key: window/section_split (artificial fragments) or a missing/broken path → OTHER. */
function topSegment(s: DocumentSection): string {
if (s.node_type === 'window' || s.node_type === 'section_split') return OTHER;
const segs = pathSegments(s.heading_path);
return segs.length === 0 ? OTHER : segs[0];
}
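/**
 * Keeps the input order (chunk_index order). Consecutive node_type='window' sections with the same
 * cleaned heading_path are deduped into 1 item. Representative = first fragment (in input order);
 * fragmentCount counts the merged fragments.
 */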
export function collapseWindows(sections: DocumentSection[]): OutlineItem[] {
const out: OutlineItem[] = [];
for (const s of sections) {
const prev = out[out.length - 1];
const h = cleanHeading(s.heading_path);
if (
s.node_type === 'window' &&
prev &&
prev.section.node_type === 'window' &&
h !== '' &&
cleanHeading(prev.section.heading_path) === h
) {
prev.fragmentCount += 1;
} else {
out.push({ section: s, fragmentCount: 1 });
}
}
return out;
}
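/**
 * Per-document decision: one-level top-segment grouping vs flat.
 * The decision is made on the raw sections (before dedupe); collapseWindows is applied afterwards.
 * - Group when: GROUP_MIN ≤ distinct top-segment keys ≤ GROUP_MAX AND other% < OTHER_PCT_MAX.
 * - Otherwise flat.
 */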
export function groupOrFlat(sections: DocumentSection[]): OutlineLayout {
const total = sections.length;
const order: string[] = [];
const map = new Map<string, DocumentSection[]>();
let otherCount = 0;
for (const s of sections) {
const key = topSegment(s);
if (key === OTHER) otherCount += 1;
if (!map.has(key)) {
map.set(key, []);
order.push(key);
}
map.get(key)!.push(s);
}
const groupCount = map.size;
const otherPct = total === 0 ? 0 : (otherCount / total) * 100;
const useGroup = groupCount >= GROUP_MIN && groupCount <= GROUP_MAX && otherPct < OTHER_PCT_MAX;
if (!useGroup) {
return { mode: 'flat', items: collapseWindows(sections), groups: [] };
}
const groups: OutlineGroup[] = order.map((key) => ({
key: key === OTHER ? '기타' : key,
isOther: key === OTHER,
items: collapseWindows(map.get(key)!),
}));
return { mode: 'group', items: [], groups };
}
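The grouping thresholds can be checked by hand on a few small inputs. The sketch below reproduces only the decision arithmetic of groupOrFlat on a list of already-computed group keys; `decideModeSketch` and the `*_SKETCH` constants are hypothetical names that copy the values above, and the OTHER bucket is counted as a group key, as it is in the Map used by groupOrFlat.

```typescript
const OTHER_KEY_SKETCH = '__OTHER__';
const GROUP_MIN_SKETCH = 2;
const GROUP_MAX_SKETCH = 30;
const OTHER_PCT_MAX_SKETCH = 50;

// Decide 'group' vs 'flat' from one group key per section.
function decideModeSketch(topKeys: string[]): 'group' | 'flat' {
  const distinctKeys: Record<string, boolean> = {};
  let otherCount = 0;
  for (const key of topKeys) {
    distinctKeys[key] = true;
    if (key === OTHER_KEY_SKETCH) otherCount += 1;
  }
  const groupCount = Object.keys(distinctKeys).length;
  const otherPct = topKeys.length === 0 ? 0 : (otherCount / topKeys.length) * 100;
  const useGroup =
    groupCount >= GROUP_MIN_SKETCH &&
    groupCount <= GROUP_MAX_SKETCH &&
    otherPct < OTHER_PCT_MAX_SKETCH; // strict: exactly 50% falls back to flat
  return useGroup ? 'group' : 'flat';
}

// 3 distinct keys, other = 1/4 = 25% → within all thresholds.
const modeMixedSketch = decideModeSketch(['A', 'A', 'B', OTHER_KEY_SKETCH]);
// 2 distinct keys, other = 1/2 = 50% → not below OTHER_PCT_MAX.
const modeHalfOtherSketch = decideModeSketch(['A', OTHER_KEY_SKETCH]);
// 1 distinct key → below GROUP_MIN.
const modeSingleSketch = decideModeSketch(['A', 'A', 'A']);
```

The second case shows the boundary: because the comparison is strict, a document whose sections are exactly half "other" is rendered flat.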
+81 -49
@@ -27,6 +27,8 @@
import DocumentDangerZone from '$lib/components/editors/DocumentDangerZone.svelte';
import AnalysisPanel from '$lib/components/AnalysisPanel.svelte';
import ReadCounter from '$lib/components/ReadCounter.svelte';
import SectionOutline from '$lib/components/SectionOutline.svelte';
import Tabs from '$lib/components/ui/Tabs.svelte';
marked.use({ mangle: false, headerIds: false });
function renderMd(text) {
@@ -84,6 +86,20 @@
}
}
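// Section (hier section) outline: loaded independently of the body; failures (including 404) are harmless.
// reqId guard: on a document-switch race, keeps a stale result from attaching to the new document.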
let sections = $state([]);
let hasSections = $derived(sections.length > 0);
async function loadSections() {
const reqId = docId;
try {
const r = await api(`/documents/${reqId}/sections`);
if (reqId === docId) sections = r?.sections ?? [];
} catch {
if (reqId === docId) sections = []; // before Phase 1 is deployed this is a 404 → hide the outline (graceful)
}
}
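The reqId guard in loadSections is an instance of a general "ignore stale responses" pattern. The following is a minimal sketch of that pattern using a request counter; `createStaleGuardSketch` is a hypothetical helper, and it differs from the commit, which compares the captured docId against the current one instead of counting requests.

```typescript
// Hypothetical helper: each request takes a token, and only the newest token may apply its result.
function createStaleGuardSketch() {
  let latestToken = 0;
  return {
    // Call when a request starts; returns the token to capture in that request's closure.
    begin(): number {
      latestToken += 1;
      return latestToken;
    },
    // Call when the response arrives; false means a newer request has started since.
    isCurrent(token: number): boolean {
      return token === latestToken;
    },
  };
}

const guardSketch = createStaleGuardSketch();
const firstTokenSketch = guardSketch.begin(); // request for document A starts
const secondTokenSketch = guardSketch.begin(); // user navigates; request for document B starts

// If A's response arrives late, it must not overwrite B's outline.
const applyFirstSketch = guardSketch.isCurrent(firstTokenSketch);
const applySecondSketch = guardSketch.isCurrent(secondTokenSketch);
```

Comparing document ids, as the commit does, is sufficient when at most one request per document is in flight; a counter also covers repeated requests for the same document.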
// "Mark one read-through complete + go to the next item" in a single action
async function readAndGoNext() {
try {
@@ -117,6 +133,7 @@
}
// For library items, prefetch neighboring items (study-flow navigation)
if (doc && doc.category === 'library') loadNeighbors();
if (doc) loadSections();
});
let viewerType = $derived(
@@ -206,9 +223,25 @@
<Button variant="secondary" size="sm" onclick={() => location.reload()}>다시 시도</Button>
</EmptyState>
{:else if doc}
<div class="max-w-6xl mx-auto grid grid-cols-1 lg:grid-cols-3 gap-6">
<!-- Left (2/3): viewer + affordance row -->
<div class="lg:col-span-2 space-y-4">
<div class="mx-auto grid grid-cols-1 gap-6 {hasSections ? 'max-w-7xl xl:grid-cols-[18rem_minmax(0,1fr)_20rem]' : 'max-w-6xl lg:grid-cols-3'}">
{#if hasSections}
<!-- Left section outline: sticky rail at xl+ (narrower viewports get a collapsible block at the top of the body) -->
<aside class="hidden xl:block xl:sticky xl:top-6 xl:self-start xl:max-h-[calc(100vh-3rem)] xl:overflow-y-auto">
<Card>
<SectionOutline {sections} />
</Card>
</aside>
{/if}
<!-- Body (lg 2/3 when there is no left outline) -->
<div class="{hasSections ? '' : 'lg:col-span-2'} space-y-4">
{#if hasSections}
<!-- Below xl: collapsible section outline -->
<details class="xl:hidden">
<summary class="cursor-pointer text-sm text-dim px-1 py-2 select-none">절 목차 ({sections.length})</summary>
<Card class="mt-2"><SectionOutline {sections} /></Card>
</details>
{/if}
<!-- Affordance row -->
<div class="flex flex-wrap items-center gap-2">
{#if doc.edit_url}
@@ -382,53 +415,52 @@
{/if}
</div>
<!-- Right (1/3): editors stack -->
<aside class="space-y-4">
{#if doc.category === 'library'}
<ReadCounter
documentId={doc.id}
initialCount={doc.read_count ?? 0}
initialLastReadAt={doc.last_read_at ?? null}
/>
{/if}
<!-- Right: metadata Tabs [정보 | AI | 관리] (resolves the vertical sprawl of 11 cards) -->
<aside class="min-w-0">
<Card>
<LibraryPathEditor {doc} />
</Card>
<Card>
<NoteEditor {doc} />
</Card>
<Card>
<EditUrlEditor {doc} />
</Card>
<Card>
<TagsEditor {doc} />
</Card>
<Card>
<AIClassificationEditor {doc} />
</Card>
<Card>
<AnalysisPanel docId={doc.id} doc={doc} />
</Card>
<Card>
<FileInfoView {doc} />
</Card>
<Card>
<ProcessingStatusView {doc} />
</Card>
<!-- E.3 related documents stub -->
<Card>
<h4 class="text-xs font-semibold text-dim uppercase mb-1.5">관련 문서</h4>
<!-- TODO(backend): GET /documents/{id}/related?limit=10 (vector similarity) -->
<EmptyState
icon={FileText}
title="추후 지원"
description="관련 문서 추천은 backend 연동 후 제공됩니다."
/>
</Card>
<Card>
<DocumentDangerZone {doc} ondelete={handleDocDelete} />
<Tabs
tabs={[
{ id: 'info', label: '정보' },
{ id: 'ai', label: 'AI' },
{ id: 'manage', label: '관리' },
]}
>
{#snippet children(tab)}
<div class="pt-3 space-y-4">
{#if tab === 'info'}
{#if doc.category === 'library'}
<ReadCounter
documentId={doc.id}
initialCount={doc.read_count ?? 0}
initialLastReadAt={doc.last_read_at ?? null}
/>
{/if}
<FileInfoView {doc} />
<ProcessingStatusView {doc} />
{:else if tab === 'ai'}
<AnalysisPanel docId={doc.id} doc={doc} />
<AIClassificationEditor {doc} />
<div>
<h4 class="text-xs font-semibold text-dim uppercase mb-1.5">관련 문서</h4>
<!-- TODO(backend): GET /documents/{id}/related?limit=10 (vector similarity) -->
<EmptyState
icon={FileText}
title="추후 지원"
description="관련 문서 추천은 backend 연동 후 제공됩니다."
/>
</div>
{:else}
<LibraryPathEditor {doc} />
<NoteEditor {doc} />
<EditUrlEditor {doc} />
<TagsEditor {doc} />
<div class="pt-2 border-t border-default">
<DocumentDangerZone {doc} ondelete={handleDocDelete} />
</div>
{/if}
</div>
{/snippet}
</Tabs>
</Card>
</aside>
</div>