fix(news): single-source the LLM timeout gate for digest/briefing generation + split out the deep_summary consumer

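The timeout-raising sweep for the 2026-06-11 Mac mini model swap (Gemma4 26B → Qwen3.6-27B-6bit, roughly 90-300s per call) updated only config.yaml/synthesis and missed the hard-coded LLM_CALL_TIMEOUT=25 (tuned for the fast Gemma) in the digest/briefing code. As a result, digest exceeded its 600s hard cap and has not been generated since 06-10, and briefing hit the LLM fallback on 4/4 (status=failed).

(Blockers corrected after adversarial review: with a private concurrency=1 semaphore, digest's 44-68 clusters would still hit the hard cap, and the semaphore would violate the permanent llm_gate rule.)

- Move timeout, retry, and hard cap to config.pipeline as the single source (digest_llm_timeout_s=300, attempts=2, pipeline_hard_cap_s=3000). This prevents a recurrence at the next model swap.
- Remove the private Semaphore from digest/briefing LLM calls and route them through the global MLX gate (BACKGROUND). This complies with the permanent llm_gate rule (one gate per endpoint, no new Semaphores) and coordinates with ask/eid (FOREGROUND). The concurrency lever is the existing mlx_gate_concurrency, raised 2→4 (continuous batching as measured: 3 concurrent calls took 121s wall time ≈ a single call, about 3x faster than serial).
- Switch the digest/briefing pipeline cluster loop to concurrent execution with asyncio.gather (actual concurrency is limited by the gate; rank/order is preserved).
- Split deep_summary (70-300s) out of the main consume_queue into a new consume_deep_queue (following the markdown/fast split precedent). This removes the problem where a single deep call overran the 1-minute tick and left the main queue permanently coalescing.
- Remove the dead PIPELINE_HARD_CAP=600 (briefing/pipeline.py), update the summarizer docstring, and add disjoint/hold tests for the deep consumer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>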
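
The gather-through-one-gate change described in the message can be sketched roughly as follows. This is a minimal illustration, not the project's code: `fake_llm_call`, `gated_call`, and `summarize_clusters` are hypothetical names, and a plain `asyncio.Semaphore` stands in for the single global MLX gate (the real gate also distinguishes BACKGROUND from FOREGROUND callers, which is omitted here). The constants mirror the values quoted in the message.

```python
import asyncio

# Values quoted in the commit message; in the real code they live in config.pipeline.
GATE_CONCURRENCY = 4   # mlx_gate_concurrency
LLM_TIMEOUT_S = 300    # digest_llm_timeout_s
ATTEMPTS = 2           # attempts


async def fake_llm_call(cluster_id: int) -> str:
    """Hypothetical stand-in for one LLM call on one cluster."""
    await asyncio.sleep(0.01)
    return f"summary-{cluster_id}"


async def gated_call(gate: asyncio.Semaphore, cluster_id: int, state: dict):
    """Run one call through the shared gate, with a timeout and retries."""
    for _ in range(ATTEMPTS):
        try:
            async with gate:  # every caller shares this one gate
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
                try:
                    return await asyncio.wait_for(
                        fake_llm_call(cluster_id), LLM_TIMEOUT_S
                    )
                finally:
                    state["active"] -= 1
        except asyncio.TimeoutError:
            continue  # retry until attempts are exhausted
    return None  # caller decides how to fall back


async def summarize_clusters(cluster_ids):
    """Launch all clusters at once; the gate limits real concurrency."""
    gate = asyncio.Semaphore(GATE_CONCURRENCY)  # stand-in for the global gate
    state = {"active": 0, "peak": 0}
    # gather returns results in argument order, so rank/order is preserved.
    results = await asyncio.gather(
        *(gated_call(gate, c, state) for c in cluster_ids)
    )
    return results, state["peak"]


if __name__ == "__main__":
    out, peak = asyncio.run(summarize_clusters(range(10)))
    print(out[:3], "peak concurrency:", peak)
```

Because `asyncio.gather` returns results in the order its awaitables were passed, the loop can fan out every cluster without re-sorting afterwards, while the gate alone decides how many calls are in flight.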
2026-06-14 23:02:39 +00:00
parent b2949d26ff
commit a82b0724df
11 changed files with 151 additions and 34 deletions
@@ -26,7 +26,8 @@ def _fake_consumer_env(monkeypatch, held):
        lambda: {
            s: object()
            for s in (queue_consumer.MAIN_QUEUE_STAGES
-                      + queue_consumer.FAST_QUEUE_STAGES + ["markdown"])
+                      + queue_consumer.FAST_QUEUE_STAGES
+                      + queue_consumer.DEEP_QUEUE_STAGES + ["markdown"])
        },
    )
    monkeypatch.setattr(queue_consumer, "_hold_logged", False)
@@ -83,13 +84,37 @@ async def test_fast_consumer_respects_hold(monkeypatch):
    assert processed == ["chunk"]


+@pytest.mark.asyncio
+async def test_deep_consumer_processes_deep_only(monkeypatch):
+    """Deep consumer (split 2026-06-15) handles deep_summary only (decoupled from the main loop)."""
+    processed = _fake_consumer_env(monkeypatch, [])
+
+    await queue_consumer.consume_deep_queue()
+
+    assert processed == ["deep_summary"]
+
+
+@pytest.mark.asyncio
+async def test_deep_consumer_respects_hold(monkeypatch):
+    """When deep_summary is held, the deep consumer claims nothing."""
+    processed = _fake_consumer_env(monkeypatch, ["deep_summary"])
+
+    await queue_consumer.consume_deep_queue()
+
+    assert processed == []
+
+
 def test_fast_split_invariants():
-    """Regression guard: the three consumers' stage sets are disjoint + embed/chunk batch sizes raised."""
+    """Regression guard: the four consumers' stage sets are disjoint + embed/chunk batch sizes raised + deep split."""
    main = set(queue_consumer.MAIN_QUEUE_STAGES)
    fast = set(queue_consumer.FAST_QUEUE_STAGES)
    md = set(queue_consumer.MARKDOWN_QUEUE_STAGES)
+    deep = set(queue_consumer.DEEP_QUEUE_STAGES)
    assert not (main & fast) and not (main & md) and not (fast & md)
+    assert not (main & deep) and not (fast & deep) and not (md & deep)
    assert fast == {"embed", "chunk"}
+    assert deep == {"deep_summary"}
+    assert "deep_summary" not in main  # regression guard for the 2026-06-15 split
    assert queue_consumer.BATCH_SIZE["embed"] >= 10
    assert queue_consumer.BATCH_SIZE["chunk"] >= 10
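
The behavior these tests pin down (each consumer touches only its own stages and skips held ones) can be sketched as below. This is an illustrative assumption, not the actual `queue_consumer` implementation: `consume`, `make_workers`, and the contents of `MAIN_QUEUE_STAGES` are hypothetical, and only the fast and deep stage names come from the diff.

```python
import asyncio

# Only FAST and DEEP contents appear in the diff; MAIN is a made-up placeholder.
MAIN_QUEUE_STAGES = ["extract", "classify"]  # hypothetical
FAST_QUEUE_STAGES = ["embed", "chunk"]
DEEP_QUEUE_STAGES = ["deep_summary"]


def make_workers(stages, log):
    """Build one fake async worker per stage that records its stage name."""
    def make(stage):
        async def worker():
            log.append(stage)
        return worker
    return {s: make(s) for s in stages}


async def consume(stages, held, workers):
    """Process only this consumer's stages, skipping any stage on hold."""
    for stage in stages:
        if stage in held:
            continue  # held stages are not claimed
        await workers[stage]()


if __name__ == "__main__":
    log = []
    all_stages = MAIN_QUEUE_STAGES + FAST_QUEUE_STAGES + DEEP_QUEUE_STAGES
    workers = make_workers(all_stages, log)
    asyncio.run(consume(DEEP_QUEUE_STAGES, set(), workers))
    print(log)
```

Keeping the stage lists disjoint means a slow deep_summary call can only delay the deep consumer's own loop, which is the decoupling the split is after.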