feat(worker-pool): Registry-1C cap 1MB + deterministic compaction

User decision 2026-05-19: the 100KB cap is too small for the 1.36MB of
7-day production data, but raising the cap alone risks raw payload bloat.
Pair the 1MB cap with payload compaction.

fetch_recap_context() changes:
- memo payload items reduced to 5 fields: id/title/ai_tldr/ai_event_kind/created_at
  (ai_bullets/file_type/source_channel/category/extracted_text etc. are excluded)
- memo top-N = RECAP_MEMO_TOP_N env (default 200); memos beyond it go into the aggregate
- aggregate = memos_by_day + memos_by_kind + omitted_memos
- payload_compacted flag = whether the aggregate fallback was triggered
- events stay raw (typically 0 to a handful in 7-day production data)
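
A minimal sketch of the compaction described above, not the committed code. The five kept fields, the RECAP_MEMO_TOP_N default, and the summary_stats keys come from this commit. The helper name compact_memos, the assumption that created_at is an ISO-8601 string, the choice to aggregate over all memos, and the "unknown" fallback kind are hypothetical.

```python
import os
from collections import Counter

# Fields kept per memo item, as listed in the commit message.
KEPT_FIELDS = ("id", "title", "ai_tldr", "ai_event_kind", "created_at")


def compact_memos(memos, top_n=None):
    """Trim each memo to five fields, keep the first top_n, aggregate the rest.

    Hypothetical helper: the real fetch_recap_context() may order, select,
    and aggregate memos differently.
    """
    if top_n is None:
        top_n = int(os.environ.get("RECAP_MEMO_TOP_N", "200"))
    top_n = max(0, top_n)

    kept = [{k: m.get(k) for k in KEPT_FIELDS} for m in memos[:top_n]]
    omitted = len(memos) - len(kept)

    # Assumption: created_at is an ISO-8601 string, so the first 10 chars are the day.
    by_day = Counter(str(m.get("created_at") or "")[:10] for m in memos)
    by_kind = Counter(m.get("ai_event_kind") or "unknown" for m in memos)

    return {
        "memos": kept,
        "summary_stats": {
            "total_memos": len(memos),
            "memos_kept": len(kept),
            "omitted_memos": omitted,
            "top_n": top_n,
            # Sorted keys keep the serialized output deterministic.
            "memos_by_day": dict(sorted(by_day.items())),
            "memos_by_kind": dict(sorted(by_kind.items())),
        },
        "payload_compacted": omitted > 0,
    }
```

Which memos survive depends on the order in which the caller passes them; the commit message does not state the ordering, so this sketch simply keeps the first top_n.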

internal_worker.py:
- PAYLOAD_MAX_BYTES → _payload_max_bytes() with env override
  (WORKER_RECAP_PAYLOAD_MAX_BYTES, default 1_000_000)
- JobsRecapResponse exposes payload_compacted / omitted_memos
- 413 detail now states "after compaction" and suggests adjusting RECAP_MEMO_TOP_N

3 new tests + updated existing endpoint 413 test:
- 700 memos → 200 kept + 500 omitted + compacted=true + < 1MB
- 10 memos → compacted=false + omitted=0
- abnormally large titles (still over the cap after compaction) → 413 retained
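
The 413 case depends on memos that stay oversized even after field trimming, because title is one of the kept fields. A quick size check of the fixtures used in the endpoint test, assuming the payload is measured as UTF-8 JSON (the actual serialization in internal_worker.py is not shown in this commit; payload_size is a hypothetical helper):

```python
import json

OLD_CAP_BYTES = 100_000
NEW_CAP_BYTES = 1_000_000


def payload_size(memos):
    """Size in bytes of the memos list once serialized as UTF-8 JSON."""
    return len(json.dumps(memos, ensure_ascii=False).encode("utf-8"))


# Fixture from the previous test: 120 memos with 1000-char titles.
old_fixture = [{"id": i, "title": "x" * 1000} for i in range(120)]
# Fixture from the updated test: 200 memos with 6000-char titles.
new_fixture = [{"id": i, "title": "x" * 6000} for i in range(200)]

old_size = payload_size(old_fixture)
new_size = payload_size(new_fixture)

# The old fixture only exceeded the old cap; the new one exceeds 1MB
# while staying within top-N = 200, so compaction cannot shrink it.
print(old_size > OLD_CAP_BYTES, old_size < NEW_CAP_BYTES)
print(new_size > NEW_CAP_BYTES)
```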

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hyungi Ahn
2026-05-19 12:55:51 +09:00
parent 0ea72c1aa6
commit eae1f48d62
4 changed files with 296 additions and 23 deletions
+16 -3
@@ -83,6 +83,8 @@ async def test_recap_endpoint_creates_worker_job(env_setup):
assert js["memo_count"] >= 0
assert js["event_count"] >= 0
assert js["payload_bytes"] > 0
+assert "payload_compacted" in js
+assert "omitted_memos" in js
# DB verify
job = await fetch_worker_job(js["job_id"])
assert job is not None
@@ -93,7 +95,7 @@ async def test_recap_endpoint_creates_worker_job(env_setup):
@pytest.mark.asyncio
async def test_recap_payload_413_when_oversize(env_setup, monkeypatch):
-"""413 when the payload exceeds 100KB."""
+"""413 when the payload exceeds 1MB (user decision 2026-05-19: cap 1MB)."""
from api import internal_worker as iw_mod
from main import app
@@ -107,10 +109,20 @@ async def test_recap_payload_413_when_oversize(env_setup, monkeypatch):
"period_start": "2026-05-12T00:00:00+09:00",
"period_end": "2026-05-19T00:00:00+09:00",
"timezone": "Asia/Seoul",
-"memos": [{"id": i, "title": "x" * 1000} for i in range(120)], # ~120KB
+# ~1.2MB raw payload (assumed to exceed the cap even after compaction)
+"memos": [{"id": i, "title": "x" * 6000} for i in range(200)],
"events": [],
-"memo_count": 120,
+"memo_count": 200,
"event_count": 0,
+"summary_stats": {
+"total_memos": 200,
+"memos_kept": 200,
+"omitted_memos": 0,
+"top_n": 200,
+"memos_by_day": {},
+"memos_by_kind": {},
+},
+"payload_compacted": False,
}
monkeypatch.setattr(iw_mod, "fetch_recap_context", fake_fetch)
@@ -125,3 +137,4 @@ async def test_recap_payload_413_when_oversize(env_setup, monkeypatch):
)
assert r.status_code == 413, r.text
assert "bytes" in r.json()["detail"]
+assert "after compaction" in r.json()["detail"]