431d4fe010
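One-page card of topic×country comparative analysis for news collected overnight (KST 00:00~05:00).
Code, logic, and tables are kept separate from the Phase 4 Global Digest; only the algorithms are shared, via services/clustering_common.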
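As an illustration of the collection window described above, the following is a minimal sketch of how a KST 00:00~05:00 window could be derived from a briefing date. The helper name `briefing_window` and the half-open interval convention are assumptions made for this example and are not taken from the loader itself.

```python
from datetime import date, datetime, time, timedelta, timezone

# KST is UTC+9 with no daylight saving time, so a fixed offset is sufficient.
KST = timezone(timedelta(hours=9), name="KST")


def briefing_window(briefing_date: date) -> tuple[datetime, datetime]:
    """Return (start, end) of the 5-hour KST window for a briefing date.

    Hypothetical helper: treats the window as [00:00, 05:00) KST.
    """
    start = datetime.combine(briefing_date, time(0, 0), tzinfo=KST)
    return start, start + timedelta(hours=5)


start, end = briefing_window(date(2025, 1, 15))
print(start.isoformat(), end.isoformat())
# The same instant expressed in UTC falls on the previous calendar day.
print(start.astimezone(timezone.utc).isoformat())
```

Keeping the window timezone-aware avoids off-by-one-day errors when timestamps are stored in UTC.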
Backend (new):
- migrations/255_morning_briefings.sql: morning_briefings + briefing_topics
  (briefing_date UNIQUE, UNIQUE(briefing_id,topic_rank), FK CASCADE,
  3 nullable historical_* columns, cluster_members JSONB, country_perspectives
  JSONB, status 4-state success|partial|failed|empty)
- app/models/briefing.py: SQLAlchemy ORM
- app/services/briefing/loader.py: KST 5h window + news_sources prefix
  fallback (mirrors the Phase 4 pattern) + historical candidate pool loader
- app/services/briefing/clustering.py: cluster_global topic-first
  (LAMBDA=ln(2)/2h, MIN_COUNTRIES_PER_TOPIC=2, MAX_TOPICS=7)
- app/services/briefing/comparator.py: call_primary 26B + JSON envelope
  sanitize (cap perspectives 10 / divergences 3 / convergences 2 /
  quotes 5) + fixed-shape fallback row + retrieve_historical cosine top-K
- app/services/briefing/pipeline.py: load→cluster→select(K=7,λ=0.6)
  →historical→compare→status 4-state→delete+insert transaction
- app/workers/briefing_worker.py: shared entry point for APScheduler and
  manual invocation, 600s hard cap
- app/prompts/briefing_comparative.txt: Korean-language comparative-analysis
  JSON prompt, two sections {articles_block} + {historical_block}, do-not-quote label
- app/api/briefing.py: GET /latest, GET ?date=, POST /regenerate?date=
  (admin, sync delete+insert tx, regenerated:true)
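The clustering constant LAMBDA=ln(2)/2h listed above reads as an exponential decay rate with a 2-hour half-life. A minimal sketch of such a recency weight follows; the function name `decay_weight` and the clamping of negative ages are assumptions for illustration, not the actual clustering code.

```python
import math

HALF_LIFE_HOURS = 2.0
# Decay rate per hour: after one half-life the weight drops to 0.5.
LAMBDA = math.log(2) / HALF_LIFE_HOURS


def decay_weight(age_hours: float) -> float:
    """Exponential recency weight exp(-LAMBDA * age); hypothetical helper.

    Negative ages (timestamps in the future) are clamped to zero.
    """
    return math.exp(-LAMBDA * max(age_hours, 0.0))


for age in (0.0, 2.0, 4.0):
    print(age, round(decay_weight(age), 3))
```

Under this reading, an article published at the start of a 5-hour window carries roughly 18% of the weight of one published at the end of it.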
Backend (modified):
- app/main.py: register briefing_router (/api/briefing prefix). Scheduler
  registration follows in PR-3.
- app/services/digest/selection.py: select_for_llm parameterized (K and λ
  injected by the caller). Phase 4 behavior is preserved through default values.
Historical policy:
- BRIEFING_HISTORICAL_ENABLED env flag, default off.
- flag off → all historical_* columns are NULL, the prompt {historical_block}
  gets an empty label, and retrieval is not called.
- flag on (enabled in PR-1b) → cosine top-K 5 (sim≥0.70) between the cluster
  centroid and doc embeddings from the past 30 days, injected into the prompt.
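The flag-on retrieval rule above (cosine similarity against the cluster centroid, threshold 0.70, at most 5 results) can be sketched in pure Python as follows. The function `top_k_by_cosine` and the dict-shaped candidate pool are hypothetical stand-ins; the real retrieve_historical may differ in signature and storage.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_by_cosine(
    centroid: list[float],
    candidates: dict[int, list[float]],
    k: int = 5,
    min_sim: float = 0.70,
) -> list[tuple[int, float]]:
    """Keep candidates with sim >= min_sim, best first, at most k of them."""
    scored = ((doc_id, cosine(centroid, vec)) for doc_id, vec in candidates.items())
    kept = [(doc_id, sim) for doc_id, sim in scored if sim >= min_sim]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]


# Toy 2-D embeddings; real document embeddings would be much larger.
pool = {1: [1.0, 0.0], 2: [1.0, 1.0], 3: [0.0, 1.0], 4: [1.0, 1.1]}
hits = top_k_by_cosine([1.0, 0.0], pool)
print([doc_id for doc_id, _ in hits])  # [1, 2]
```

An empty result corresponds to the "on, zero match" path, in which case the historical_* columns would presumably stay NULL just as with the flag off.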
Country canonical (after checking actual data):
- documents.country column confirmed absent
- document_chunks.country match rate is 0% (no chunks are created for news at all)
- the only country signal is the news_sources prefix mapping (same as Phase 4)
Tests:
- tests/test_briefing_historical.py: regression for 3 paths (flag off / on with
  fixture / on with zero matches) + sanitize cap + fallback row shape.
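For reference, the sanitize caps exercised by these tests (perspectives 10 / divergences 3 / convergences 2 / quotes 5) could be enforced roughly as below. This is a sketch under assumptions: the key names are borrowed from the API response model, and `sanitize_envelope` is a hypothetical name rather than the comparator's actual function.

```python
# Maximum list lengths per field, following the caps named in the commit message.
CAPS = {
    "country_perspectives": 10,
    "divergences": 3,
    "convergences": 2,
    "key_quotes": 5,
}


def sanitize_envelope(raw: dict) -> dict:
    """Truncate capped list fields; non-list values become empty lists.

    Hypothetical helper: other keys (for example "headline") pass through unchanged.
    """
    clean = dict(raw)
    for key, cap in CAPS.items():
        value = raw.get(key)
        clean[key] = list(value)[:cap] if isinstance(value, list) else []
    return clean


raw = {
    "headline": "example",
    "country_perspectives": [{"country": "KR", "summary": "s"}] * 12,
    "divergences": ["a", "b", "c", "d"],
    "convergences": "not a list",
    "key_quotes": [{"quote": "q"}] * 6,
}
clean = sanitize_envelope(raw)
print({key: len(clean[key]) for key in CAPS})
```

Coercing malformed fields to empty lists keeps the stored row in a fixed shape even when the LLM output drifts from the requested JSON.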
Verification: pytest in the GPU container + manual regenerate, in PR-1.8.
204 lines
6.4 KiB
Python
"""Morning Briefing API: read-only + manual regenerate.

Endpoints:
- GET /api/briefing/latest : most recent briefing
- GET /api/briefing?date=YYYY-MM-DD : briefing for a specific date
- POST /api/briefing/regenerate?date=... : synchronous worker trigger (admin), DELETE+INSERT tx

The response is a flat list of topics (the axis is inverted: unlike Phase 4, there is no grouping by country).
Each topic carries a country_perspectives JSONB that expresses the cross-country comparative analysis.
"""

from datetime import date as date_type
from datetime import datetime
from typing import Annotated

from fastapi import APIRouter, Depends, HTTPException, Query
from pydantic import BaseModel
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import selectinload

from core.auth import get_current_user, require_admin
from core.database import get_session
from models.briefing import BriefingTopic, MorningBriefing
from models.user import User

router = APIRouter()


# ─── Pydantic response models ───


class CountryPerspective(BaseModel):
    country: str
    summary: str
    article_ids: list[int] = []


class KeyQuote(BaseModel):
    country: str = ""
    source: str = ""
    quote: str


class TopicResponse(BaseModel):
    topic_rank: int
    topic_label: str
    headline: str
    country_perspectives: list[CountryPerspective]
    divergences: list[str]
    convergences: list[str]
    key_quotes: list[KeyQuote]
    historical_context: str | None = None
    cluster_members: list[int] = []
    article_count: int
    country_count: int
    importance_score: float
    llm_fallback_used: bool


class BriefingResponse(BaseModel):
    briefing_date: date_type
    window_start: datetime
    window_end: datetime
    decay_lambda: float
    total_articles: int
    total_countries: int
    total_topics: int
    generation_ms: int | None
    llm_calls: int
    llm_failures: int
    status: str
    headline_oneliner: str | None = None
    topics: list[TopicResponse]


class RegenerateResponse(BaseModel):
    status: str
    briefing_id: int | None
    briefing_date: date_type
    total_topics: int
    total_articles: int
    llm_calls: int
    llm_failures: int
    generation_ms: int
    regenerated: bool


# ─── helpers ───


def _build_response(b: MorningBriefing) -> BriefingResponse:
    topics = []
    for t in sorted(b.topics, key=lambda x: x.topic_rank):
        topics.append(
            TopicResponse(
                topic_rank=t.topic_rank,
                topic_label=t.topic_label,
                headline=t.headline,
                country_perspectives=[
                    CountryPerspective(**cp) for cp in (t.country_perspectives or [])
                ],
                divergences=list(t.divergences or []),
                convergences=list(t.convergences or []),
                key_quotes=[KeyQuote(**q) for q in (t.key_quotes or [])],
                historical_context=t.historical_context,
                cluster_members=list(t.cluster_members or []),
                article_count=t.article_count,
                country_count=t.country_count,
                importance_score=t.importance_score,
                llm_fallback_used=t.llm_fallback_used,
            )
        )

    return BriefingResponse(
        briefing_date=b.briefing_date,
        window_start=b.window_start,
        window_end=b.window_end,
        decay_lambda=b.decay_lambda,
        total_articles=b.total_articles,
        total_countries=b.total_countries,
        total_topics=b.total_topics,
        generation_ms=b.generation_ms,
        llm_calls=b.llm_calls,
        llm_failures=b.llm_failures,
        status=b.status,
        headline_oneliner=b.headline_oneliner,
        topics=topics,
    )


async def _load_briefing(
    session: AsyncSession,
    target_date: date_type | None,
) -> MorningBriefing | None:
    query = select(MorningBriefing).options(selectinload(MorningBriefing.topics))
    if target_date is not None:
        query = query.where(MorningBriefing.briefing_date == target_date)
    else:
        query = query.order_by(MorningBriefing.briefing_date.desc())
    query = query.limit(1)
    result = await session.execute(query)
    return result.scalar_one_or_none()


# ─── Routes ───


@router.get("/latest", response_model=BriefingResponse)
async def get_latest(
    user: Annotated[User, Depends(get_current_user)],
    session: Annotated[AsyncSession, Depends(get_session)],
):
    """Most recent morning briefing."""
    b = await _load_briefing(session, target_date=None)
    if b is None:
        raise HTTPException(status_code=404, detail="no briefing has been generated yet")
    return _build_response(b)


@router.get("", response_model=BriefingResponse)
async def get_briefing(
    user: Annotated[User, Depends(get_current_user)],
    session: Annotated[AsyncSession, Depends(get_session)],
    date: date_type | None = Query(default=None, description="YYYY-MM-DD (KST briefing_date)"),
):
    """Briefing for a specific date (latest when date is omitted)."""
    b = await _load_briefing(session, target_date=date)
    if b is None:
        raise HTTPException(
            status_code=404,
            detail=f"briefing not found (date={date})" if date else "no briefing has been generated yet",
        )
    return _build_response(b)


@router.post("/regenerate", response_model=RegenerateResponse)
async def regenerate(
    user: Annotated[User, Depends(require_admin)],
    date: date_type | None = Query(default=None, description="YYYY-MM-DD, briefing_date in KST"),
):
    """Manual trigger (admin). Runs synchronously as a delete+insert transaction.

    Defaults to today in KST when date is omitted. If a row already exists for that day,
    it is deleted and recreated inside the same transaction.
    Response status is 'success' | 'partial' | 'failed' | 'empty'.
    """
    from datetime import timedelta, timezone

    from workers.briefing_worker import run

    result = await run(target_date=date)
    if result is None:
        raise HTTPException(status_code=500, detail="briefing worker run failed (check the logs)")

    # KST is a fixed UTC+9 offset. Computing "today" in KST keeps the default date
    # consistent with the docstring even when the server clock uses another timezone.
    kst_today = datetime.now(timezone(timedelta(hours=9))).date()

    return RegenerateResponse(
        status=result["status"],
        briefing_id=result.get("briefing_id"),
        briefing_date=date or kst_today,
        total_topics=result["total_topics"],
        total_articles=result["total_articles"],
        llm_calls=result["llm_calls"],
        llm_failures=result["llm_failures"],
        generation_ms=result["generation_ms"],
        regenerated=result.get("regenerated", True),
    )