test(search): Phase 0.2 baseline measurement results

Baseline for the current search (FTS + ILIKE + vector hybrid) over 23 queries.
Kept as the reference point for comparing Phase 1+ improvements.

Overall: Recall@10 0.788 / NDCG@10 0.705 / Top-3 0.95 / p95 1695ms
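
The overall figures can be re-derived from the per-query rows in the CSV added by this commit. The sketch below is an assumption about how they were aggregated, not the actual evaluation script: the quality metrics are averaged over the 20 queries outside failure_expected, and p95 is taken over all 23 latencies with linear interpolation. The helper names `percentile` and `summarize` are hypothetical.

```python
from statistics import mean

# (id, latency_ms, recall_at_10, ndcg_at_10, top3_hit), copied from the CSV rows.
ROWS = [
    ("kw_001", 1696.5, 1.000, 1.000, 1),
    ("kw_002", 567.9, 1.000, 1.000, 1),
    ("kw_003", 1780.5, 1.000, 1.000, 1),
    ("kw_004", 1686.7, 1.000, 1.000, 1),
    ("kw_005", 539.6, 1.000, 1.000, 1),
    ("nl_001", 545.0, 0.750, 0.832, 1),
    ("nl_002", 552.1, 1.000, 1.000, 1),
    ("nl_003", 534.2, 0.667, 0.383, 1),
    ("nl_004", 529.4, 0.750, 0.565, 1),
    ("nl_005", 543.4, 0.500, 0.613, 1),
    ("cl_001", 532.2, 1.000, 0.605, 1),
    ("cl_002", 1446.3, 0.750, 0.703, 1),
    ("cl_003", 1364.3, 1.000, 0.920, 1),
    ("news_001", 535.4, 0.125, 0.160, 1),
    ("news_002", 538.4, 1.000, 0.576, 0),
    ("news_003", 533.7, 0.333, 0.235, 1),
    ("news_004", 523.8, 0.750, 0.822, 1),
    ("news_005", 1483.2, 0.143, 0.079, 1),
    ("misc_001", 568.0, 1.000, 0.605, 1),
    ("misc_002", 552.0, 1.000, 1.000, 1),
    ("fail_001", 543.8, 0.000, 0.000, 1),
    ("fail_002", 527.0, 0.000, 0.000, 1),
    ("fail_003", 533.0, 0.000, 0.000, 1),
]


def percentile(values, pct):
    """Percentile with linear interpolation between the two closest ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def summarize(rows):
    # Assumption: failure_expected queries are excluded from the quality means
    # but still count toward latency.
    scored = [r for r in rows if not r[0].startswith("fail_")]
    return {
        "recall_at_10": mean(r[2] for r in scored),
        "ndcg_at_10": mean(r[3] for r in scored),
        "top3": mean(r[4] for r in scored),
        "p95_ms": percentile([r[1] for r in rows], 95),
    }


summary = summarize(ROWS)
print({name: round(value, 3) for name, value in summary.items()})
```

Under these assumptions the output rounds to the figures quoted above. Averaging the quality metrics over all 23 rows instead would pull Recall@10 and NDCG@10 down, because the three failure_expected rows score 0.000 by construction.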

Key weaknesses (Phase 1+ targets):
- news_crosslingual is catastrophic (Recall@10 0.14) → domain-aware handling required
- failure-case precision 0/3 → no confidence threshold
- p95 1695ms (about 3.4x the 500ms target) → trigram/parallel retrieval
- weak top-3 ordering on nl queries → chunk-level + reranker
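
To illustrate the missing confidence threshold: all three failure_expected queries return ten results even though nothing relevant exists. A minimal sketch of one possible gate follows. The `Hit` type, the `filter_by_confidence` helper, the 0.5 cutoff, and every score value are hypothetical; the baseline CSV records no scores, and the real cutoff would have to be tuned on measured data.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: int
    score: float  # hypothetical relevance score in [0, 1]; not present in the CSV


def filter_by_confidence(hits, min_top_score=0.5):
    """Return an empty list when even the best hit is below the cutoff."""
    if not hits or max(h.score for h in hits) < min_top_score:
        return []
    return hits


# Doc ids are taken from the fail_001 and kw_001 rows; the scores are invented.
out_of_domain = [Hit(4069, 0.21), Hit(3789, 0.19)]
in_domain = [Hit(3856, 0.93), Hit(3868, 0.88)]

print(filter_by_confidence(out_of_domain))
print([h.doc_id for h in filter_by_confidence(in_domain)])
```

Gating on the best score alone keeps the change small: in-domain queries are returned untouched, and only queries where nothing scores well come back empty.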

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hyungi Ahn
2026-04-07 08:22:53 +09:00
parent 8490cfed10
commit ec36ea3d6d
2 changed files with 98 additions and 0 deletions


@@ -0,0 +1,24 @@
label,id,category,intent,domain_hint,query,relevant_ids,returned_ids_top10,latency_ms,recall_at_10,mrr_at_10,ndcg_at_10,top3_hit,error
single,kw_001,exact_keyword,fact_lookup,document,산업안전보건법 제6장,3856;3868;3879,3856;3868;3879;3912;3950;3908;3915;3911;3858;3853,1696.5,1.000,1.000,1.000,1,
single,kw_002,exact_keyword,fact_lookup,document,중대재해 처벌 등에 관한 법률 제2장 중대산업재해,3917;3921,3921;3917;3916;3922;3920;3918;3919;3923;3874;3946,567.9,1.000,1.000,1.000,1,
single,kw_003,exact_keyword,fact_lookup,document,화학물질관리법 유해화학물질 영업자,3981,3981;3980;3857;3988;3869;3880;3978;3986;3985;3979,1780.5,1.000,1.000,1.000,1,
single,kw_004,exact_keyword,fact_lookup,document,근로기준법 안전과 보건,4041,4041;3853;3860;3883;3858;3865;4036;3885;3901;3888,1686.7,1.000,1.000,1.000,1,
single,kw_005,exact_keyword,fact_lookup,document,산업안전보건기준에 관한 규칙 보호구,3888,3888;3905;3908;3909;3897;3911;3907;3912;3903;3906,539.6,1.000,1.000,1.000,1,
single,nl_001,natural_language_ko,semantic_search,document,기계로 인한 산업재해 관련 법령,3856;3868;3879;3854,3868;3856;3879;3895;3915;3872;3851;3867;3897;3863,545.0,0.750,1.000,0.832,1,
single,nl_002,natural_language_ko,semantic_search,document,사업주가 도급을 줄 때 산업재해를 예방하기 위해 해야 할 일,3855;3867;3878,3855;3867;3878;3863;3917;3872;3854;3896;3861;3886,552.1,1.000,1.000,1.000,1,
single,nl_003,natural_language_ko,semantic_search,document,유해화학물질을 다루는 회사가 지켜야 할 안전 의무,3980;3981;3982,3857;3869;3980;3880;3896;3903;3854;3981;3909;3904,534.2,0.667,0.333,0.383,1,
single,nl_004,natural_language_ko,semantic_search,document,중대재해가 발생했을 때 경영책임자가 처벌받는 기준,3916;3917;3920;3921,3918;3917;3919;3921;3916;3923;3867;3922;3877;3984,529.4,0.750,0.500,0.565,1,
single,nl_005,natural_language_ko,semantic_search,document,안전보건교육은 누가 받아야 하고 어떤 내용을 다루는가,3853;3865,3853;3876;3965;3871;3958;3875;3861;3866;3877;3856,543.4,0.500,1.000,0.613,1,
single,cl_001,crosslingual_ko_en,semantic_search,document,기계 안전 가드 설계 원리,3770;3856,3895;3770;3762;3773;3879;3856;3767;3788;3868;3897,532.2,1.000,0.500,0.605,1,
single,cl_002,crosslingual_ko_en,semantic_search,document,산업 안전 입문서,3755;3775;3776;3777,3755;3816;3775;3851;3896;3853;3876;3871;3776;3863,1446.3,0.750,1.000,0.703,1,
single,cl_003,crosslingual_ko_en,semantic_search,document,전기 안전 위험,3772;3790,3772;3897;3790;4024;4018;4020;4023;4022;4013;4019,1364.3,1.000,1.000,0.920,1,
single,news_001,news_ko,semantic_search,news,이란과 미국의 군사 충돌,4303;4304;4307;4316;4322;4323;4327;4335,4452;4307;4317;4321;4339;4331;4329;4418;4446;4459,535.4,0.125,0.500,0.160,1,
single,news_002,news_ko,semantic_search,news,호르무즈 해협 봉쇄,4316;4320;4322;4327,4349;4199;4346;4320;4322;4327;4340;4304;4316;4260,538.4,1.000,0.250,0.576,0,
single,news_003,news_en,semantic_search,news,Trump Iran ultimatum,4258;4260;4262,4519;4202;4258;4321;4333;4515;4313;4445;4418;4314,533.7,0.333,0.333,0.235,1,
single,news_004,news_fr,semantic_search,news,guerre en Iran,4199;4202;4210;4361;4363;4507;4519;4521,4199;4507;4521;4363;4519;4211;4258;4324;4210;4536,523.8,0.750,1.000,0.822,1,
single,news_005,news_crosslingual,semantic_search,news,이란 미국 전쟁 글로벌 반응,4202;4258;4262;4536;4303;4304;4316,4329;4457;4307;4345;4324;4452;4443;4444;4450;4262,1483.2,0.143,0.100,0.079,1,
single,misc_001,other_domain,fact_lookup,document,강체의 평면 운동학,4063;4065,4071;4063;4064;4058;4066;4065;4068;4060;4062;4059,568.0,1.000,0.500,0.605,1,
single,misc_002,other_domain,semantic_search,document,질점의 운동역학,4060;4061;4062,4062;4060;4061;4058;4070;4059;4069;4071;4063;4067,552.0,1.000,1.000,1.000,1,
single,fail_001,failure_expected,semantic_search,document,Rust async runtime tokio scheduler 내부 구조,,4069;3789;4067;4070;4060;4061;4071;4062;3807;4433,543.8,0.000,0.000,0.000,1,
single,fail_002,failure_expected,semantic_search,document,양자컴퓨터 큐비트 디코히어런스,,4068;4058;4064;4060;4065;4063;4061;3899;4067;4196,527.0,0.000,0.000,0.000,1,
single,fail_003,failure_expected,semantic_search,news,재즈 보컬리스트 빌리 홀리데이,,4289;4281;4205;4116;4100;4077;4316;4343;4235;4504,533.0,0.000,0.000,0.000,1,
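
The per-query columns can be recomputed from relevant_ids and returned_ids_top10. The sketch below assumes binary relevance and the standard log2 discount for NDCG. The evaluation script itself is not part of this view, so the formulas are an assumption, although they reproduce the nl_001 and nl_003 rows to three decimals. The helpers `parse_ids` and `query_metrics` are hypothetical names, and top3_hit is left out because its definition is not stated here.

```python
import math


def parse_ids(field):
    """Split a ';'-separated id field; an empty field yields an empty list."""
    return [int(x) for x in field.split(";") if x]


def query_metrics(relevant, returned, k=10):
    """Recall@k, MRR@k and NDCG@k with binary relevance (assumed definitions)."""
    rel = set(relevant)
    if not rel:
        # failure_expected rows have no relevant ids; the CSV reports 0.000 for them.
        return {"recall": 0.0, "mrr": 0.0, "ndcg": 0.0}
    hit_ranks = [rank for rank, doc in enumerate(returned[:k], start=1) if doc in rel]
    dcg = sum(1.0 / math.log2(rank + 1) for rank in hit_ranks)
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, min(len(rel), k) + 1))
    return {
        "recall": len(hit_ranks) / len(rel),
        "mrr": 1.0 / hit_ranks[0] if hit_ranks else 0.0,
        "ndcg": dcg / ideal,
    }


# Row nl_001 from the CSV above.
nl_001 = query_metrics(
    parse_ids("3856;3868;3879;3854"),
    parse_ids("3868;3856;3879;3895;3915;3872;3851;3867;3897;3863"),
)
# Row nl_003 from the CSV above.
nl_003 = query_metrics(
    parse_ids("3980;3981;3982"),
    parse_ids("3857;3869;3980;3880;3896;3903;3854;3981;3909;3904"),
)
print({name: round(value, 3) for name, value in nl_001.items()})
print({name: round(value, 3) for name, value in nl_003.items()})
```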