gpu-services

Author	SHA1	Message	Date
Hyungi Ahn	74f8df48fc	fix: Synology UX, show "🤔 생각 중..." ("Thinking...") first, then "📝 더 깊이 살펴볼게요..." ("Let me take a closer look...") on route Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 12:47:02 +09:00
Hyungi Ahn	53d3e8e056	fix: raise Synology response length limit from 1500 to 4000 characters (to accommodate the morning briefing) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 12:45:21 +09:00
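Hyungi Ahn	21f6869898	feat: EXAONE classifier with direct/route/clarify routing + conversation memory - EXAONE: classifier + prompt engineer + direct responder (JSON output) - simple questions are answered directly by EXAONE (pipeline skipped) - complex questions are passed to Gemma with an AI-optimized prompt - ambiguous questions trigger a follow-up question to the user (clarify) - per-user recent conversation memory (up to 10 entries, 1-hour TTL) - ModelAdapter: add option to pass messages directly Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 12:40:39 +09:00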
Hyungi Ahn	1ac4832bdc	fix: prompt tuning v2 (self-awareness + preventing over-rewriting) - reasoner: add self-awareness of the EXAONE+Gemma4 pipeline - rewriter: pass simple questions and greetings through unchanged, restructure only complex ones Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 12:36:43 +09:00
Hyungi Ahn	9b8059ca38	fix: system prompt tuning for a friendly, concise conversational style - reasoner: "이드" persona, concise + friendly, no unnecessary structuring - rewriter: pass greetings and small talk through unchanged Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 12:33:39 +09:00
Hyungi Ahn	193c3249fc	fix: add python-multipart (form parsing dependency) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 12:27:23 +09:00
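Hyungi Ahn	a44f6446cf	feat: NanoClaude Phase 3: Synology Chat integration - POST /webhook/synology: receive outgoing webhook + verify token - automatically send the response via incoming webhook when the pipeline completes - send a "분석 중..." ("Analyzing...") typing message first - limit response length to 1500 characters (to accommodate the Synology Chat limit) - send a notification message to the user on error/failure as well - prevent duplicate requests (30-second TTL dedup) - hide rewrite events on Synology (exposed only over SSE) - callback structure allows extension (Slack, Discord, etc.) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 12:25:48 +09:00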
Hyungi Ahn	9f0c527442	fix: health_check timeout 3→5 seconds to handle MLX cold start Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 12:08:28 +09:00
Hyungi Ahn	2b4d182b24	fix: change how the job_queue module is imported to resolve a None reference. Importing the module-level variable directly keeps the None reference even after init_queue(). Import the module itself and access the variable through it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 12:06:48 +09:00
Hyungi Ahn	c4c32170f1	feat: NanoClaude Phase 2: EXAONE→Gemma pipeline, queue, status API - ModelAdapter: general-purpose OpenAI-compat adapter (stream/complete/health) - BackendRegistry: rewriter (EXAONE) + reasoner (Gemma4) health-check loop - two-stage pipeline: EXAONE rewrite → Gemma reasoning (rewrite event exposed over SSE) - Fallback: EXAONE-only mode when the Mac mini is down, automatic switchover on mid-stream failure - Cancel-safe: checks before/after rewrite, inside the streaming loop, and on the fallback path - Rewrite heartbeat: processing events every 2 seconds while waiting on complete_chat - JobQueue: Semaphore(3)-based concurrency limit, accurate queue position - GET /chat/{job_id}/status and GET /queue/stats endpoints - DB: add rewrite_model, reasoning_model, rewritten_message columns Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 12:04:15 +09:00
Hyungi Ahn	8c41a5dead	fix: add DB log_completion to the CancelledError/Exception paths as well. A CancelledError raised by task.cancel() is caught outside the streaming loop, so a separate log_completion call is needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 11:24:29 +09:00
Hyungi Ahn	72c488d85d	fix: record status in the DB for cancelled jobs too Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 11:21:07 +09:00
Hyungi Ahn	e970ebdbea	feat: NanoClaude production integration: Docker, Caddy, aiosqlite logging - add nanoclaude service to docker-compose (port 8100) - Caddy /nano/* → nanoclaude reverse proxy (SSE flush) - aiosqlite request/response logging (request_logs table) - update .env.example and CLAUDE.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 11:19:15 +09:00
Hyungi Ahn	1e427bc98a	fix: correct EXAONE model ID to exaone3.5:7.8b-instruct-q8_0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 11:12:51 +09:00
Hyungi Ahn	d946b769e5	feat: NanoClaude Phase 1: implement async job-based AI Gateway core. POST /chat → job_id ACK, GET /chat/{job_id}/stream → SSE streaming, EXAONE Ollama adapter, JobManager, StateStream, Worker structure Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 11:12:04 +09:00

15 Commits
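
Commit 2b4d182b24 says that importing a module-level variable directly keeps a stale None reference even after init_queue() has run. The sketch below reproduces that Python behaviour in a single file. It is not the repository's code: the in-memory `job_queue` module, its `queue` variable, and the `['ready']` value are hypothetical stand-ins, and only the names `job_queue` and `init_queue()` come from the commit message.

```python
import sys
import types

# Hypothetical stand-in for the job_queue module: a module-level variable
# that starts as None and is rebound by init_queue().
job_queue = types.ModuleType("job_queue")
exec(
    "queue = None\n"
    "def init_queue():\n"
    "    global queue\n"
    "    queue = ['ready']\n",
    job_queue.__dict__,
)
sys.modules["job_queue"] = job_queue

# Pattern 1: copy the name at import time. This binds the value the
# variable has right now (None); later rebinding in the module is not seen.
from job_queue import queue as imported_queue

# Pattern 2: import the module itself and look the attribute up at use time.
import job_queue as jq

jq.init_queue()

print(imported_queue)  # None
print(jq.queue)        # ['ready']
```

`from module import name` copies the current binding into the importing namespace, so rebinding the name inside the module afterwards does not update the copy. Accessing the attribute through the module object performs the lookup on each use, which is the approach the commit describes.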