feat: kb_writer 마이크로서비스 + mail_bridge 추가

- kb_writer.py: DEVONthink AppleScript 브릿지 → 마크다운 파일 기반 전환 (포트 8095)
- knowledge-base/ 디렉토리 구조 (note, chat-memory, news)
- Handle Note: kb_writer 파일 저장 + Qdrant 임베딩 추가
- Embed & Save Memory: DEVONthink → kb_writer 교체
- mail_bridge.py: IMAP 날짜 기반 메일 조회 (포트 8094)
- mail-processing-pipeline: IMAP Trigger → Schedule + mail_bridge + dedup
- docker-compose, manage_services, LaunchAgent plist 업데이트

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Hyungi Ahn
2026-03-18 14:46:57 +09:00
parent 66a8b63cad
commit 852f5cb648
16 changed files with 445 additions and 38 deletions

View File

@@ -48,6 +48,7 @@ IMAP_HOST=192.168.1.227
IMAP_PORT=21680
IMAP_USER=hyungi
IMAP_PASSWORD=changeme
IMAP_SSL=true
# DEVONthink (devonthink_bridge.py — 지식 저장소)
DEVONTHINK_APP_NAME=DEVONthink
@@ -61,3 +62,4 @@ HEIC_CONVERTER_URL=http://host.docker.internal:8090
CHAT_BRIDGE_URL=http://host.docker.internal:8091
CALDAV_BRIDGE_URL=http://host.docker.internal:8092
DEVONTHINK_BRIDGE_URL=http://host.docker.internal:8093
MAIL_BRIDGE_URL=http://host.docker.internal:8094

View File

@@ -42,13 +42,14 @@ bot-n8n (맥미니 Docker) — 51노드 파이프라인
⑦ [비동기] 선택적 메모리 (Qwen 판단 → 가치 있으면 벡터화 + DEVONthink 저장)
별도 워크플로우:
Mail Processing Pipeline (7노드) — MailPlus IMAP 폴링 → 분류 → mail_logs
Mail Processing Pipeline (9노드) — mail_bridge 날짜 기반 조회 → dedup → 분류 → mail_logs
네이티브 서비스 (맥미니):
heic_converter (:8090) — HEIC→JPEG 변환 (macOS sips)
chat_bridge (:8091) — DSM Chat API 브릿지 (사진 폴링/다운로드)
caldav_bridge (:8092) — CalDAV REST 래퍼 (Synology Calendar)
devonthink_bridge (:8093) — DEVONthink AppleScript 래퍼
mail_bridge (:8094) — IMAP 날짜 기반 메일 조회 (MailPlus)
inbox_processor (5분) — OmniFocus Inbox 폴링 (LaunchAgent)
news_digest (매일 07:00) — 뉴스 번역·요약 (LaunchAgent)
@@ -73,6 +74,7 @@ DEVONthink 4 (맥미니):
| chat_bridge | 네이티브 (맥미니) | 8091 | DSM Chat API 브릿지 (사진 폴링/다운로드) |
| caldav_bridge | 네이티브 (맥미니) | 8092 | CalDAV REST 래퍼 (Synology Calendar) |
| devonthink_bridge | 네이티브 (맥미니) | 8093 | DEVONthink AppleScript 래퍼 |
| mail_bridge | 네이티브 (맥미니) | 8094 | IMAP 날짜 기반 메일 조회 (MailPlus) |
| inbox_processor | 네이티브 (맥미니) | — | OmniFocus Inbox 폴링 (LaunchAgent, 5분) |
| news_digest | 네이티브 (맥미니) | — | 뉴스 번역·요약 (LaunchAgent, 매일 07:00) |
| Synology Chat | NAS (192.168.1.227) | — | 사용자 인터페이스 |

View File

@@ -6,12 +6,10 @@
<string>com.syn-chat-bot.heic-converter</string>
<key>ProgramArguments</key>
<array>
<string>/Users/hyungi/Documents/code/syn-chat-bot/.venv/bin/uvicorn</string>
<string>heic_converter:app</string>
<string>--host</string>
<string>127.0.0.1</string>
<string>--port</string>
<string>8090</string>
<string>/opt/homebrew/opt/python@3.14/bin/python3.14</string>
<string>-S</string>
<string>-c</string>
<string>import sys; sys.path.insert(0,'/Users/hyungi/Documents/code/syn-chat-bot/.venv/lib/python3.14/site-packages'); sys.path.insert(0,'/Users/hyungi/Documents/code/syn-chat-bot'); import uvicorn; uvicorn.run('heic_converter:app',host='127.0.0.1',port=8090)</string>
</array>
<key>WorkingDirectory</key>
<string>/Users/hyungi/Documents/code/syn-chat-bot</string>

View File

@@ -0,0 +1,25 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.syn-chat-bot.kb-writer</string>
<key>ProgramArguments</key>
<array>
<string>/opt/homebrew/opt/python@3.14/bin/python3.14</string>
<string>-S</string>
<string>-c</string>
<string>import sys; sys.path.insert(0,'/Users/hyungi/Documents/code/syn-chat-bot/.venv/lib/python3.14/site-packages'); sys.path.insert(0,'/Users/hyungi/Documents/code/syn-chat-bot'); import uvicorn; uvicorn.run('kb_writer:app',host='127.0.0.1',port=8095)</string>
</array>
<key>WorkingDirectory</key>
<string>/Users/hyungi/Documents/code/syn-chat-bot</string>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/kb-writer.log</string>
<key>StandardErrorPath</key>
<string>/tmp/kb-writer.err</string>
</dict>
</plist>

View File

@@ -0,0 +1,25 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.syn-chat-bot.mail-bridge</string>
<key>ProgramArguments</key>
<array>
<string>/opt/homebrew/opt/python@3.14/bin/python3.14</string>
<string>-S</string>
<string>-c</string>
<string>import sys; sys.path.insert(0,'/Users/hyungi/Documents/code/syn-chat-bot/.venv/lib/python3.14/site-packages'); sys.path.insert(0,'/Users/hyungi/Documents/code/syn-chat-bot'); import uvicorn; uvicorn.run('mail_bridge:app',host='127.0.0.1',port=8094)</string>
</array>
<key>WorkingDirectory</key>
<string>/Users/hyungi/Documents/code/syn-chat-bot</string>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/mail-bridge.log</string>
<key>StandardErrorPath</key>
<string>/tmp/mail-bridge.err</string>
</dict>
</plist>

View File

@@ -32,6 +32,8 @@ services:
- CHAT_BRIDGE_URL=http://host.docker.internal:8091
- CALDAV_BRIDGE_URL=http://host.docker.internal:8092
- DEVONTHINK_BRIDGE_URL=http://host.docker.internal:8093
- MAIL_BRIDGE_URL=http://host.docker.internal:8094
- KB_WRITER_URL=http://host.docker.internal:8095
- NODE_FUNCTION_ALLOW_BUILTIN=crypto,http,https,url
volumes:
- ./n8n/data:/home/node/.n8n

5
init/migrate-v4.sql Normal file
View File

@@ -0,0 +1,5 @@
-- migrate-v4.sql: 메일 파이프라인 dedup — message_id 컬럼 추가
-- 실행: docker exec -i bot-postgres psql -U bot -d chatbot < init/migrate-v4.sql
ALTER TABLE mail_logs ADD COLUMN IF NOT EXISTS message_id VARCHAR(500);
CREATE UNIQUE INDEX IF NOT EXISTS idx_mail_message_id ON mail_logs(message_id);

106
kb_writer.py Normal file
View File

@@ -0,0 +1,106 @@
"""Knowledge Base Writer — 마크다운 파일 저장 마이크로서비스 (port 8095)
DEVONthink AppleScript 브릿지 대체. 순수 파일 I/O로 knowledge-base/ 에 마크다운 저장.
DEVONthink에서는 인덱스 그룹으로 읽기만 하면 됨.
"""
import logging
import re
import unicodedata
from datetime import datetime, timezone, timedelta
from pathlib import Path
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("kb_writer")
KST = timezone(timedelta(hours=9))
BASE_DIR = Path(__file__).resolve().parent / "knowledge-base"
app = FastAPI()
def _slugify(text: str, max_len: int = 50) -> str:
"""한글 + 영문 친화적 슬러그 생성."""
text = unicodedata.normalize("NFC", text)
text = re.sub(r"[^\w가-힣\s-]", "", text)
text = re.sub(r"[\s_]+", "-", text).strip("-")
return text[:max_len] or "untitled"
@app.post("/save")
async def save(request: Request):
body = await request.json()
title = body.get("title", "Untitled")
content = body.get("content", "")
doc_type = body.get("type", "note") # note | chat-memory | news
tags = body.get("tags", [])
username = body.get("username", "unknown")
topic = body.get("topic", "general")
source = body.get("source", "synology-chat")
now = datetime.now(KST)
month_dir = now.strftime("%Y-%m")
date_str = now.strftime("%Y-%m-%d")
iso_str = now.isoformat()
# 파일명 생성
slug = _slugify(title)
if doc_type == "chat-memory":
time_str = now.strftime("%H%M")
filename = f"{date_str}T{time_str}-{_slugify(topic)}.md"
elif doc_type == "news":
filename = f"{date_str}-{_slugify(source)}-{slug}.md"
else:
filename = f"{date_str}-{slug}.md"
# 디렉토리 생성
type_dir = BASE_DIR / doc_type / month_dir
type_dir.mkdir(parents=True, exist_ok=True)
# 중복 파일명 처리
filepath = type_dir / filename
counter = 1
while filepath.exists():
stem = filename.rsplit(".", 1)[0]
filepath = type_dir / f"{stem}-{counter}.md"
counter += 1
# YAML frontmatter + 본문
tags_yaml = ", ".join(f'"{t}"' for t in tags)
qdrant_id = int(now.timestamp() * 1000)
md_content = f"""---
title: "{title}"
date: "{iso_str}"
source: {source}
type: {doc_type}
tags: [{tags_yaml}]
username: {username}
topic: {topic}
qdrant_id: {qdrant_id}
---
{content}
"""
try:
filepath.write_text(md_content, encoding="utf-8")
logger.info(f"Saved: {filepath.relative_to(BASE_DIR)}")
return JSONResponse({
"success": True,
"path": str(filepath.relative_to(BASE_DIR)),
"filename": filepath.name,
"qdrant_id": qdrant_id,
})
except Exception as e:
logger.error(f"Save failed: {e}")
return JSONResponse({"success": False, "error": str(e)}, status_code=500)
@app.get("/health")
async def health():
kb_exists = BASE_DIR.is_dir()
return {"status": "ok", "knowledge_base_exists": kb_exists}

6
knowledge-base/.gitignore vendored Normal file
View File

@@ -0,0 +1,6 @@
# 마크다운 콘텐츠는 로컬 생성물 — git에 포함하지 않음
# rsync로 DS1525+에 백업
*/*.md
*/*/*.md
!.gitignore
!*/.gitkeep

View File

View File

View File

180
mail_bridge.py Normal file
View File

@@ -0,0 +1,180 @@
"""Mail Bridge — IMAP 날짜 기반 메일 조회 서비스 (port 8094)"""
import email
import email.header
import email.utils
import imaplib
import logging
import os
from datetime import datetime, timedelta
from dotenv import load_dotenv
from fastapi import FastAPI, Query
from fastapi.responses import JSONResponse
load_dotenv()
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("mail_bridge")
IMAP_HOST = os.getenv("IMAP_HOST", "192.168.1.227")
IMAP_PORT = int(os.getenv("IMAP_PORT", "993"))
IMAP_USER = os.getenv("IMAP_USER", "")
IMAP_PASSWORD = os.getenv("IMAP_PASSWORD", "")
IMAP_SSL = os.getenv("IMAP_SSL", "true").lower() == "true"
IMAP_FOLDERS = [f.strip() for f in os.getenv("IMAP_FOLDERS", "INBOX,Gmail,Technicalkorea").split(",") if f.strip()]
app = FastAPI()
def _connect() -> imaplib.IMAP4:
if IMAP_SSL:
conn = imaplib.IMAP4_SSL(IMAP_HOST, IMAP_PORT)
else:
conn = imaplib.IMAP4(IMAP_HOST, IMAP_PORT)
conn.login(IMAP_USER, IMAP_PASSWORD)
return conn
def _decode_header(raw: str | None) -> str:
if not raw:
return ""
parts = email.header.decode_header(raw)
decoded = []
for data, charset in parts:
if isinstance(data, bytes):
decoded.append(data.decode(charset or "utf-8", errors="replace"))
else:
decoded.append(data)
return " ".join(decoded)
def _get_text(msg: email.message.Message) -> str:
if msg.is_multipart():
for part in msg.walk():
ct = part.get_content_type()
if ct == "text/plain":
payload = part.get_payload(decode=True)
if payload:
charset = part.get_content_charset() or "utf-8"
return payload.decode(charset, errors="replace")
# fallback: text/html
for part in msg.walk():
ct = part.get_content_type()
if ct == "text/html":
payload = part.get_payload(decode=True)
if payload:
charset = part.get_content_charset() or "utf-8"
return payload.decode(charset, errors="replace")
return ""
payload = msg.get_payload(decode=True)
if payload:
charset = msg.get_content_charset() or "utf-8"
return payload.decode(charset, errors="replace")
return ""
def _dedup_key(mail: dict) -> str:
if mail["messageId"]:
return mail["messageId"]
return f"{mail['subject']}|{mail['date']}|{mail['from']}"
@app.get("/recent")
def recent_mails(days: int = Query(default=1, ge=1, le=30)):
try:
conn = _connect()
except Exception as e:
logger.error(f"IMAP connection failed: {e}")
return JSONResponse({"success": False, "error": f"IMAP connection failed: {e}"}, status_code=502)
try:
since = (datetime.now() - timedelta(days=days)).strftime("%d-%b-%Y")
seen_keys: set[str] = set()
mails: list[dict] = []
for folder in IMAP_FOLDERS:
try:
status, _ = conn.select(folder, readonly=True)
if status != "OK":
logger.warning(f"Cannot select folder '{folder}', skipping")
continue
except Exception as e:
logger.warning(f"Failed to select folder '{folder}': {e}")
continue
_, msg_nums = conn.search(None, f"SINCE {since}")
nums = msg_nums[0].split() if msg_nums[0] else []
for num in nums:
_, data = conn.fetch(num, "(RFC822)")
if not data or not data[0]:
continue
raw = data[0][1]
msg = email.message_from_bytes(raw)
message_id = msg.get("Message-ID", "").strip()
from_addr = _decode_header(msg.get("From", ""))
subject = _decode_header(msg.get("Subject", ""))
date_str = msg.get("Date", "")
text = _get_text(msg)
parsed_date = email.utils.parsedate_to_datetime(date_str).isoformat() if date_str else ""
mail_entry = {
"messageId": message_id,
"from": from_addr,
"subject": subject,
"text": text,
"date": parsed_date,
"folder": folder,
}
key = _dedup_key(mail_entry)
if key not in seen_keys:
seen_keys.add(key)
mails.append(mail_entry)
logger.info(f"Folder '{folder}': {len(nums)} messages since {since}")
mails.sort(key=lambda m: m["date"], reverse=True)
logger.info(f"Total {len(mails)} unique mails across {len(IMAP_FOLDERS)} folders (SINCE {since})")
return {"success": True, "count": len(mails), "mails": mails}
except Exception as e:
logger.error(f"IMAP fetch error: {e}")
return JSONResponse({"success": False, "error": str(e)}, status_code=500)
finally:
try:
conn.logout()
except Exception:
pass
@app.get("/health")
def health():
folders_status: dict[str, bool] = {}
try:
conn = _connect()
for folder in IMAP_FOLDERS:
try:
status, _ = conn.select(folder, readonly=True)
folders_status[folder] = status == "OK"
except Exception:
folders_status[folder] = False
conn.logout()
except Exception as e:
logger.warning(f"IMAP health check failed: {e}")
for folder in IMAP_FOLDERS:
folders_status.setdefault(folder, False)
any_ok = any(folders_status.values())
all_ok = all(folders_status.values())
if all_ok:
status = "ok"
elif any_ok:
status = "degraded"
else:
status = "error"
return {"status": status, "imap_reachable": any_ok, "folders": folders_status}

View File

@@ -7,6 +7,8 @@ SERVICES=(
"com.syn-chat-bot.heic-converter"
"com.syn-chat-bot.caldav-bridge"
"com.syn-chat-bot.devonthink-bridge"
"com.syn-chat-bot.kb-writer"
"com.syn-chat-bot.mail-bridge"
"com.syn-chat-bot.inbox-processor"
"com.syn-chat-bot.news-digest"
)

View File

@@ -4,25 +4,19 @@
"nodes": [
{
"parameters": {
"mailbox": "INBOX",
"postProcessAction": "read",
"options": {
"customEmailConfig": "{ \"host\": \"{{$env.IMAP_HOST || '192.168.1.227'}}\", \"port\": {{$env.IMAP_PORT || 993}}, \"secure\": true, \"auth\": { \"user\": \"{{$env.IMAP_USER}}\", \"pass\": \"{{$env.IMAP_PASSWORD}}\" } }"
},
"pollTimes": {
"item": [
"rule": {
"interval": [
{
"mode": "everyX",
"value": 15,
"unit": "minutes"
"field": "minutes",
"minutesInterval": 15
}
]
}
},
"id": "m1000001-0000-0000-0000-000000000001",
"name": "IMAP Trigger",
"type": "n8n-nodes-base.imapEmail",
"typeVersion": 2,
"id": "m1000001-0000-0000-0000-000000000010",
"name": "Schedule Trigger",
"type": "n8n-nodes-base.scheduleTrigger",
"typeVersion": 1.2,
"position": [
0,
300
@@ -30,14 +24,52 @@
},
{
"parameters": {
"jsCode": "const items = $input.all();\nconst results = [];\nfor (const item of items) {\n const j = item.json;\n const from = j.from?.text || j.from || '';\n const subject = (j.subject || '').substring(0, 500);\n const body = (j.text || j.textPlain || j.html || '').substring(0, 5000)\n .replace(/<[^>]*>/g, '').replace(/\\s+/g, ' ').trim();\n const mailDate = j.date || new Date().toISOString();\n results.push({ json: { from, subject, body, mailDate, messageId: j.messageId || '' } });\n}\nreturn results;"
"method": "GET",
"url": "={{ $env.MAIL_BRIDGE_URL }}/recent?days=1",
"options": {
"timeout": 15000
}
},
"id": "m1000001-0000-0000-0000-000000000011",
"name": "Fetch Mails",
"type": "n8n-nodes-base.httpRequest",
"typeVersion": 4.2,
"position": [
220,
300
]
},
{
"parameters": {
"operation": "executeQuery",
"query": "SELECT message_id FROM mail_logs WHERE mail_date >= NOW() - INTERVAL '2 days'",
"options": {}
},
"id": "m1000001-0000-0000-0000-000000000012",
"name": "Get Existing IDs",
"type": "n8n-nodes-base.postgres",
"typeVersion": 2.5,
"position": [
440,
300
],
"credentials": {
"postgres": {
"id": "KaxU8iKtraFfsrTF",
"name": "bot-postgres"
}
}
},
{
"parameters": {
"jsCode": "const mailsResponse = $('Fetch Mails').first().json;\nconst mails = mailsResponse.mails || [];\nconst existingRows = $input.all();\nconst existingIds = new Set(existingRows.map(r => r.json.message_id).filter(Boolean));\n\nconst results = [];\nfor (const mail of mails) {\n const mid = mail.messageId || '';\n if (!mid || existingIds.has(mid)) continue;\n results.push({ json: {\n from: mail.from || '',\n subject: (mail.subject || '').substring(0, 500),\n body: (mail.text || '').substring(0, 5000).replace(/<[^>]*>/g, '').replace(/\\s+/g, ' ').trim(),\n mailDate: mail.date || new Date().toISOString(),\n messageId: mid\n }});\n}\nreturn results;"
},
"id": "m1000001-0000-0000-0000-000000000002",
"name": "Parse Mail",
"name": "Parse & Filter New",
"type": "n8n-nodes-base.code",
"typeVersion": 1,
"position": [
220,
660,
300
]
},
@@ -50,14 +82,14 @@
"type": "n8n-nodes-base.code",
"typeVersion": 1,
"position": [
440,
880,
300
]
},
{
"parameters": {
"operation": "executeQuery",
"query": "=INSERT INTO mail_logs (from_address,subject,summary,label,has_events,has_tasks,mail_date) VALUES ('{{ ($json.from||'').replace(/'/g,\"''\").substring(0,255) }}','{{ ($json.subject||'').replace(/'/g,\"''\").substring(0,500) }}','{{ ($json.summary||'').replace(/'/g,\"''\").substring(0,2000) }}','{{ $json.label }}',{{ $json.has_events }},{{ $json.has_tasks }},'{{ $json.mailDate }}')",
"query": "=INSERT INTO mail_logs (from_address,subject,summary,label,has_events,has_tasks,mail_date,message_id) VALUES ('{{ ($json.from||'').replace(/'/g,\"''\").substring(0,255) }}','{{ ($json.subject||'').replace(/'/g,\"''\").substring(0,500) }}','{{ ($json.summary||'').replace(/'/g,\"''\").substring(0,2000) }}','{{ $json.label }}',{{ $json.has_events }},{{ $json.has_tasks }},'{{ $json.mailDate }}','{{ ($json.messageId||'').replace(/'/g,\"''\").substring(0,500) }}') ON CONFLICT (message_id) DO NOTHING",
"options": {}
},
"id": "m1000001-0000-0000-0000-000000000004",
@@ -65,7 +97,7 @@
"type": "n8n-nodes-base.postgres",
"typeVersion": 2.5,
"position": [
660,
1100,
300
],
"credentials": {
@@ -84,7 +116,7 @@
"type": "n8n-nodes-base.code",
"typeVersion": 1,
"position": [
880,
1320,
300
]
},
@@ -125,7 +157,7 @@
"type": "n8n-nodes-base.if",
"typeVersion": 2.2,
"position": [
1100,
1540,
300
]
},
@@ -145,24 +177,46 @@
"type": "n8n-nodes-base.httpRequest",
"typeVersion": 4.2,
"position": [
1320,
1760,
200
]
}
],
"connections": {
"IMAP Trigger": {
"Schedule Trigger": {
"main": [
[
{
"node": "Parse Mail",
"node": "Fetch Mails",
"type": "main",
"index": 0
}
]
]
},
"Parse Mail": {
"Fetch Mails": {
"main": [
[
{
"node": "Get Existing IDs",
"type": "main",
"index": 0
}
]
]
},
"Get Existing IDs": {
"main": [
[
{
"node": "Parse & Filter New",
"type": "main",
"index": 0
}
]
]
},
"Parse & Filter New": {
"main": [
[
{
@@ -222,4 +276,4 @@
"settings": {
"executionOrder": "v1"
}
}
}

File diff suppressed because one or more lines are too long