Files
hyungi_document_server/tests/fixtures/macmini_gemma4_query_rewrite_response.json
hyungi 446ba82c91 feat(eval): Phase 2Q Diagnose Phase 1A — fixture (4 카테고리 × 2 LLM) + prompt v1
phase-2q-query-rewrite-diagnose.md v6 plan 의 Phase 1 fixture 박제 (G0-1 + G0-2).

산출물:
- app/prompts/query_rewrite.txt — multi-query rewrite prompt v1 (3 variants: 원본 + 한국어 rephrase + 영어 번역)
- tests/fixtures/macmini_gemma4_query_rewrite_response.json — 4 카테고리 (korean_only/mixed/english_only/exam)
- tests/fixtures/macbook_qwen_query_rewrite_response.json — 4 카테고리 동일

inspect 9 결과 (2026-05-24):
- Mac mini gemma-4-26B-A4B :8801 = response_format json_object 지원
- MacBook qwen3.6-27B-8bit :8810 = response_format json_object 미지원 (120s hang) — prompt rule only
- prompt rule \"no markdown, no code fence\" 강제 시 둘 다 strict JSON (gemma 도 fence wrap 없음)
- parser fallback (markdown fence regex) 유지 — 첫 호출 prompt 없을 때 wrap 관찰 사례

8 호출 측정:
- gemma 1.16~1.36s / qwen 1.93~2.24s (warm)
- variants 의미 일관 + 도메인 용어 (ASME/Section VIII/압력용기/가스기사) verbatim preserve
- 한국어→영어 cross-lingual translation 자연

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 22:09:29 +00:00

209 lines
9.0 KiB
JSON

{
"_meta": {
"description": "Phase 2Q G0-1 Mac mini gemma-4 query rewrite response fixture (4 카테고리)",
"captured_at": "2026-05-24",
"endpoint": "http://100.76.254.116:8801/v1/chat/completions",
"model": "gemma-4-26b-a4b-it-8bit",
"sampling": {
"temperature": 0.3,
"max_tokens": 256,
"response_format": {
"type": "json_object"
}
},
"prompt_version": "v1",
"prompt_path": "app/prompts/query_rewrite.txt",
"note": "response_format=json_object 지원. prompt rule 'no markdown' 강제 시 strict JSON. parser fallback (markdown fence regex) 유지 — 첫 시도 prompt 없이 호출 시 ```json...``` wrap 관찰 (2026-05-24 inspect 9)."
},
"cases": {
"korean_only": {
"query": "압력용기 설계 기준",
"request_body": {
"model": "gemma-4-26b-a4b-it-8bit",
"messages": [
{
"role": "system",
"content": "You are a search query rewriter for a multilingual document search system (Korean primary, English/mixed secondary).\n\nTask: given the user's search query, produce 3 search-friendly variants:\n- variant 0 = original query (verbatim, no change)\n- variant 1 = Korean rephrase with different phrasing (synonyms / 명사구 변형 / 조사 변형)\n- variant 2 = English translation OR cross-lingual rephrase (if Korean → English term; if English → Korean term)\n\nRules:\n- Each variant ≤ 80 chars.\n- Preserve domain-specific terms (ASME, KGS, 가스기사, 압력용기) verbatim — no abbreviation/transliteration.\n- Do not invent new entities.\n- Output STRICT JSON only (no prose, no markdown, no code fence): {\"variants\": [\"...\", \"...\", \"...\"]}"
},
{
"role": "user",
"content": "압력용기 설계 기준"
}
],
"temperature": 0.3,
"max_tokens": 256,
"response_format": {
"type": "json_object"
}
},
"response": {
"id": "chatcmpl-1779574028",
"object": "chat.completion",
"created": 1779574028,
"model": "mlx-community/gemma-4-26b-a4b-it-8bit",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "{\"variants\": [\"압력용기 설계 기준\", \"압력용기 설계 규격 및 기준\", \"Pressure vessel design standards\"]}"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 214,
"completion_tokens": 29,
"total_tokens": 243
}
},
"parsed_variants": [
"압력용기 설계 기준",
"압력용기 설계 규격 및 기준",
"Pressure vessel design standards"
]
},
"mixed": {
"query": "ASME Section VIII 적용 기준",
"request_body": {
"model": "gemma-4-26b-a4b-it-8bit",
"messages": [
{
"role": "system",
"content": "You are a search query rewriter for a multilingual document search system (Korean primary, English/mixed secondary).\n\nTask: given the user's search query, produce 3 search-friendly variants:\n- variant 0 = original query (verbatim, no change)\n- variant 1 = Korean rephrase with different phrasing (synonyms / 명사구 변형 / 조사 변형)\n- variant 2 = English translation OR cross-lingual rephrase (if Korean → English term; if English → Korean term)\n\nRules:\n- Each variant ≤ 80 chars.\n- Preserve domain-specific terms (ASME, KGS, 가스기사, 압력용기) verbatim — no abbreviation/transliteration.\n- Do not invent new entities.\n- Output STRICT JSON only (no prose, no markdown, no code fence): {\"variants\": [\"...\", \"...\", \"...\"]}"
},
{
"role": "user",
"content": "ASME Section VIII 적용 기준"
}
],
"temperature": 0.3,
"max_tokens": 256,
"response_format": {
"type": "json_object"
}
},
"response": {
"id": "chatcmpl-1779574029",
"object": "chat.completion",
"created": 1779574029,
"model": "mlx-community/gemma-4-26b-a4b-it-8bit",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "{\"variants\": [\"ASME Section VIII 적용 기준\", \"ASME Section VIII 적용 규정 및 기준\", \"Application standards for ASME Section VIII\"]}"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 214,
"completion_tokens": 31,
"total_tokens": 245
}
},
"parsed_variants": [
"ASME Section VIII 적용 기준",
"ASME Section VIII 적용 규정 및 기준",
"Application standards for ASME Section VIII"
]
},
"english_only": {
"query": "pressure vessel design code",
"request_body": {
"model": "gemma-4-26b-a4b-it-8bit",
"messages": [
{
"role": "system",
"content": "You are a search query rewriter for a multilingual document search system (Korean primary, English/mixed secondary).\n\nTask: given the user's search query, produce 3 search-friendly variants:\n- variant 0 = original query (verbatim, no change)\n- variant 1 = Korean rephrase with different phrasing (synonyms / 명사구 변형 / 조사 변형)\n- variant 2 = English translation OR cross-lingual rephrase (if Korean → English term; if English → Korean term)\n\nRules:\n- Each variant ≤ 80 chars.\n- Preserve domain-specific terms (ASME, KGS, 가스기사, 압력용기) verbatim — no abbreviation/transliteration.\n- Do not invent new entities.\n- Output STRICT JSON only (no prose, no markdown, no code fence): {\"variants\": [\"...\", \"...\", \"...\"]}"
},
{
"role": "user",
"content": "pressure vessel design code"
}
],
"temperature": 0.3,
"max_tokens": 256,
"response_format": {
"type": "json_object"
}
},
"response": {
"id": "chatcmpl-1779574026",
"object": "chat.completion",
"created": 1779574026,
"model": "mlx-community/gemma-4-26b-a4b-it-8bit",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "{\"variants\": [\"pressure vessel design code\", \"압력용기 설계 코드\", \"pressure vessel design standard\"]}"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 212,
"completion_tokens": 24,
"total_tokens": 236
}
},
"parsed_variants": [
"pressure vessel design code",
"압력용기 설계 코드",
"pressure vessel design standard"
]
},
"exam": {
"query": "가스기사 필기 출제기준",
"request_body": {
"model": "gemma-4-26b-a4b-it-8bit",
"messages": [
{
"role": "system",
"content": "You are a search query rewriter for a multilingual document search system (Korean primary, English/mixed secondary).\n\nTask: given the user's search query, produce 3 search-friendly variants:\n- variant 0 = original query (verbatim, no change)\n- variant 1 = Korean rephrase with different phrasing (synonyms / 명사구 변형 / 조사 변형)\n- variant 2 = English translation OR cross-lingual rephrase (if Korean → English term; if English → Korean term)\n\nRules:\n- Each variant ≤ 80 chars.\n- Preserve domain-specific terms (ASME, KGS, 가스기사, 압력용기) verbatim — no abbreviation/transliteration.\n- Do not invent new entities.\n- Output STRICT JSON only (no prose, no markdown, no code fence): {\"variants\": [\"...\", \"...\", \"...\"]}"
},
{
"role": "user",
"content": "가스기사 필기 출제기준"
}
],
"temperature": 0.3,
"max_tokens": 256,
"response_format": {
"type": "json_object"
}
},
"response": {
"id": "chatcmpl-1779574027",
"object": "chat.completion",
"created": 1779574027,
"model": "mlx-community/gemma-4-26b-a4b-it-8bit",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "{\"variants\": [\"가스기사 필기 출제기준\", \"가스기사 필기 시험 출제 범위 및 기준\", \"Gas Engineer written exam syllabus\"]}"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 217,
"completion_tokens": 36,
"total_tokens": 253
}
},
"parsed_variants": [
"가스기사 필기 출제기준",
"가스기사 필기 시험 출제 범위 및 기준",
"Gas Engineer written exam syllabus"
]
}
}
}