fix(grounding): exclude citation markers [n] from fabricated_number

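Critical bug: the digits inside citation markers such as [1][2][4] were
judged to be missing from the evidence, so every legitimate answer was
refused (2+ strong).
Fix: strip \[\d+\] from the answer before extracting numbers.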

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hyungi Ahn
2026-04-10 08:59:29 +09:00
parent 1beba3402b
commit a0e1717206
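
The false positive described in the commit message can be reproduced with a minimal sketch. `extract_number_literals` below is a hypothetical stand-in for `_extract_number_literals`, whose real implementation is not part of this commit, and the sample answer and evidence strings are invented for illustration.

```python
import re

# Hypothetical stand-in for _extract_number_literals (real implementation not shown).
def extract_number_literals(text: str) -> list[str]:
    # Matches integers and simple grouped/decimal literals such as 1,200 or 3.5
    return re.findall(r"\d+(?:[.,]\d+)*", text)

CITATION_MARKER = re.compile(r"\[\d+\]")

answer = "Revenue grew 12% in 2023 [1][2]."      # invented sample answer
evidence_text = "In 2023 revenue grew by 12%."   # invented sample evidence

evidence_digits = set(extract_number_literals(evidence_text))

# Without stripping, the citation indices are extracted as if they were claims.
raw_nums = extract_number_literals(answer)
flagged_raw = [n for n in raw_nums if n not in evidence_digits]

# With stripping, only the numbers the answer actually asserts remain.
clean_nums = extract_number_literals(CITATION_MARKER.sub("", answer))
flagged_clean = [n for n in clean_nums if n not in evidence_digits]

print(raw_nums, flagged_raw)      # ['12', '2023', '1', '2'] ['1', '2']
print(clean_nums, flagged_clean)  # ['12', '2023'] []
```

In this sketch the two citation indices alone produce two flags, which is the situation the commit message says led to a refusal.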


@@ -76,14 +76,16 @@ def check(
     evidence_text = " ".join(e.span_text for e in evidence)
-    # ── Strong 1: fabricated number ──
-    answer_nums = _extract_number_literals(answer)
+    # ── Strong 1: fabricated number (equality, not substring) ──
+    # ⚠ Strip citation markers [n] before extracting numbers (otherwise [1][2][3] are falsely flagged as fabricated)
+    answer_clean = re.sub(r'\[\d+\]', '', answer)
+    answer_nums = _extract_number_literals(answer_clean)
     evidence_nums = _extract_number_literals(evidence_text)
+    evidence_digits = {re.sub(r'[^\d]', '', en) for en in evidence_nums}
+    evidence_digits.discard('')
     for num in answer_nums:
         digits_only = re.sub(r'[^\d]', '', num)
-        if digits_only and not any(
-            digits_only in re.sub(r'[^\d]', '', en) for en in evidence_nums
-        ):
+        if digits_only and digits_only not in evidence_digits:
             strong.append(f"fabricated_number:{num}")
     # ── Strong/Weak 2: query-answer intent alignment ──
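
Besides stripping citation markers, the hunk above also switches the evidence lookup from substring matching to set equality. The sketch below contrasts the two checks on invented inputs. The helper names `digits`, `fabricated_substring`, and `fabricated_equality` are hypothetical and only mirror the logic visible in the diff.

```python
import re

def digits(s: str) -> str:
    """Keep only the digit characters of a number literal."""
    return re.sub(r"[^\d]", "", s)

def fabricated_substring(answer_nums, evidence_nums):
    # Earlier check: a number passes if its digits occur anywhere
    # inside the digits of some evidence number.
    return [
        n for n in answer_nums
        if digits(n) and not any(digits(n) in digits(en) for en in evidence_nums)
    ]

def fabricated_equality(answer_nums, evidence_nums):
    # Later check: a number passes only if its digits equal
    # the digits of some evidence number.
    evidence_digits = {digits(en) for en in evidence_nums}
    evidence_digits.discard("")
    return [n for n in answer_nums if digits(n) and digits(n) not in evidence_digits]

answer_nums = ["5", "1,200"]      # invented sample values
evidence_nums = ["2025", "1200"]

old_flags = fabricated_substring(answer_nums, evidence_nums)
new_flags = fabricated_equality(answer_nums, evidence_nums)
print(old_flags)  # []
print(new_flags)  # ['5']
```

Under substring matching, "5" is treated as supported because it occurs inside "2025". Under equality it is flagged, while "1,200" still matches "1200" once separators are removed. Precomputing the set also replaces the scan over all evidence numbers for each answer number with a single membership test.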