# Agent Deployment Best Practices

**Version:** 1.0  
**Created:** 2026-02-23  
**Author:** Eason (陈医生) 👨‍⚕️  
**Based on:** lessons learned from deploying 张大师 (life)

---

## 📋 Pre-Deployment Checklist

### 1. Architecture Planning

- [ ] **Choose the Agent type**: standalone Gateway vs. routing mode
  - Standalone Gateway: strong isolation, but every Skill must be configured separately
  - Routing mode: shared configuration, lower resource usage
- [ ] **Port planning**: make sure ports do not conflict (main Gateway 18789, 张大师 18790)
- [ ] **Database isolation**: give each agent its own Mem0 collection name (e.g. `mem0_v4_life`)

### 2. Configuration File Layout

```
Deployment layout for a new Agent:
├── ~/.openclaw-{agent-id}/              # Dedicated config directory
│   ├── openclaw.json                    # Gateway config
│   ├── agents/                          # Agent config
│   ├── credentials/                     # Credential files
│   └── telegram/                        # Telegram state
├── ~/.config/systemd/user/              # systemd services
│   └── openclaw-gateway-{agent-id}.service
└── ~/.openclaw/workspace/               # Shared workspace
    ├── agents/{agent-id}-agent.json     # Agent definition
    ├── skills/                          # Skills (shared)
    └── logs/agents/{agent-id}/          # Log directory
```

---

## ⚠️ Common Errors and Fixes

### Error 1: Invalid Skill Configuration Field

**Problem:**

```json
// ❌ Wrong - openclaw.json does not support a description field
"skills": {
  "entries": {
    "chinese-almanac": {
      "enabled": true,
      "description": "黄历查询"  // Not supported!
    }
  }
}
```

**Error message:**

```
skills.entries.xxx: Unrecognized key: "description"
Config invalid
```

**Correct configuration:**

```json
// ✅ Correct - use only supported fields
"skills": {
  "entries": {
    "chinese-almanac": {
      "enabled": true,
      "config": {
        // Skill-specific settings go inside config
        "tavily_api_key": "tvly-xxx"
      }
    }
  }
}
```

**Lessons:**

- Skill entries in `openclaw.json` accept only recognized keys such as `enabled` and `config`; unknown keys like `description` fail validation
- Metadata such as `description` and `name` belongs in `skill.json`
- A config validation failure prevents the Gateway from starting

---

### Error 2: Calling a Python Skill from the Node.js Environment

**Problem:**

```json
// ❌ Wrong - a Python script cannot be called directly from the Node.js Gateway
{
  "name": "google-calendar",  // Python implementation
  "handler": "google_calendar.handle_calendar_command"
}
```

**Symptoms:**

- The Skill fails to load
- The Agent reports "feature not configured" or "MCP connection required"
- Command-line tests pass, but the Skill fails inside the Gateway

**Solution A: create a Node.js wrapper (recommended)**

```
skills/google-calendar-node/
├── calendar.js          // Node.js interface
│   └── spawn('python3', ['google_calendar.py', command])
└── skill.json
    └── "handler": "calendar.getCalendarInfo"  // Node.js module
```

**Solution B: pure Node.js implementation**

```javascript
// Use the googleapis npm package
const { google } = require('googleapis');
```

**Lessons:**

- The OpenClaw Gateway is a Node.js environment
- Python Skills need a Node.js wrapper before they can be integrated
- Do not test only the Python script; test the Gateway integration as well

---

### Error 3: Systemd Watchdog Configuration

**Problem:**

```ini
# ❌ Wrong - OpenClaw does not support systemd watchdog notifications
[Service]
WatchdogSec=60s
```

**Symptoms:**

```
Watchdog timeout (limit 1min)!
Killing process with signal SIGABRT
Main process exited, code=dumped, status=6/ABRT
```

**Correct configuration:**

```ini
# ✅ Correct - remove WatchdogSec
[Service]
Restart=always
RestartSec=10s
MemoryMax=1G
CPUQuota=50%
# Do not set WatchdogSec
```

**Lessons:**

- The OpenClaw Gateway does not send systemd watchdog notifications
- Setting `WatchdogSec` causes systemd to kill a healthy service
- Use `Restart=always` for automatic recovery

---

### Error 4: Gateway Bind Address

**Problem:**

```json
// ❌ Wrong - binding to loopback makes Telegram pairing fail
"gateway": {
  "bind": "loopback"
}
```

**Error message:**

```
Error: Gateway is only bound to loopback.
Set gateway.bind=lan, enable tailscale serve, or configure plugins.entries.device-pair.config.publicUrl.
```

**Correct configuration:**

```json
// ✅ Correct - binding to LAN supports Telegram pairing
"gateway": {
  "bind": "lan",
  "port": 18790,
  "auth": {
    "mode": "token",
    "token": "your-token"
  }
}
```

**Security considerations:**

- After binding to LAN, make sure the firewall restricts access
- Expose only ports 80/443 (through an Nginx reverse proxy)
- Use token authentication

---

### Error 5: Agent Config and Gateway Config Out of Sync

**Problem:**

```json
// life-agent.json
{
  "name": "google-calendar",  // ❌ Python version
  "enabled": true
}

// openclaw-life.json
{
  "skills": {
    "entries": {
      "google-calendar-node": {  // ✅ Node.js version
        "enabled": true
      }
    }
  }
}
```

**Symptoms:**

- The Agent believes the feature is not configured
- The system prompt does not match the tools that are actually available

**Solution:**

```json
// ✅ Keep them consistent
// life-agent.json
{
  "name": "google-calendar-node",
  "enabled": true
}

// openclaw-life.json
{
  "skills": {
    "entries": {
      "google-calendar-node": {
        "enabled": true
      }
    }
  }
}

// State it explicitly in the system prompt
"## 可用工具\n\n### Google Calendar\n- 使用 google-calendar-node skill\n- 已配置完成,无需 MCP 连接"
```

**Lessons:**

- The skills list in `agent.json` must match `openclaw.json`
- The system prompt should describe the available tools accurately
- Restart the Gateway after changing the configuration

---

### Error 6: Hardcoded Data vs. Dynamic Computation

**Problem:**

```javascript
// ❌ Wrong - hardcoded lunar date
const query = `2026 年 2 月 24 日 农历黄历 宜忌 正月初八`;
```

**Symptoms:**

- The data becomes wrong as soon as the date changes
- Different data sources return different results

**Correct approach:**

```javascript
// ✅ Better - compute the lunar day from the target date
// Caveat: still pins Spring Festival 2026 and only covers 初一 to 初十 of the first lunar month
const springFestival = new Date('2026-02-17'); // Spring Festival (正月初一)
const targetDate = new Date('2026-02-24');     // the date being queried
const lunarDay = Math.floor((targetDate - springFestival) / (1000*60*60*24)) + 1; // 8
const numerals = ['一', '二', '三', '四', '五', '六', '七', '八', '九', '十'];
const lunarDateStr = `农历正月初${numerals[lunarDay - 1]}`;
```

**Lessons:**

- Avoid hardcoding dynamic data such as dates and times
- Prefer an authoritative data source (API) over in-house calculation
- Emphasize in the system prompt that the agent should query through tools

---

## 📝 Standard Deployment Procedure

### Step 1: Create the Config Directories

```bash
mkdir -p ~/.openclaw-{agent-id}/{agents,credentials,telegram}
mkdir -p ~/.openclaw/workspace/logs/agents/{agent-id}/
```

### Step 2: Copy and Edit the Gateway Config

```bash
cp ~/.openclaw/openclaw.json ~/.openclaw-{agent-id}/openclaw.json
# Edit:
# - gateway.port
# - gateway.bind (lan for Telegram)
# - channels.telegram.botToken
# - skills.entries (add/remove skills)
```

### Step 3: Create the systemd Service

```bash
cat > ~/.config/systemd/user/openclaw-gateway-{agent-id}.service << EOF
[Unit]
Description=OpenClaw Gateway - {Agent Name}
After=network.target openclaw-gateway.service

[Service]
Type=simple
User=root
Environment="XDG_RUNTIME_DIR=/run/user/0"
Environment="DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/0/bus"
Environment="NODE_ENV=production"
Environment="TZ=Asia/Shanghai"
WorkingDirectory=/root/.openclaw-{agent-id}
ExecStart=/www/server/nodejs/v24.13.1/bin/openclaw --profile {agent-id} gateway
Restart=always
RestartSec=10s
MemoryMax=1G
CPUQuota=50%
TimeoutStopSec=30s
StandardOutput=journal
StandardError=journal
SyslogIdentifier=openclaw-gateway-{agent-id}

[Install]
WantedBy=default.target
EOF
```

**Note:** do not set `WatchdogSec`!

### Step 4: Create the Agent Definition

```bash
cat > ~/.openclaw/workspace/agents/{agent-id}-agent.json << EOF
{
  "id": "{agent-id}",
  "name": "{Agent Name}",
  "role": "{Agent Role}",
  "system_prompt": "你是{Agent Name},...",
  "skills": [
    {
      "name": "skill-name",
      "enabled": true,
      "config": { ...
      }
    }
  ]
}
EOF
```

### Step 5: Enable and Start the Service

```bash
# Enable linger (lets user services keep running in the background)
loginctl enable-linger $(whoami)

# Set environment variables (paths assume root, uid 0)
export XDG_RUNTIME_DIR=/run/user/0
export DBUS_SESSION_BUS_ADDRESS="unix:path=/run/user/0/bus"

# Enable and start the service
systemctl --user daemon-reload
systemctl --user enable openclaw-gateway-{agent-id}.service
systemctl --user start openclaw-gateway-{agent-id}.service

# Verify status
systemctl --user status openclaw-gateway-{agent-id}.service
journalctl --user -u openclaw-gateway-{agent-id}.service -f
```

### Step 6: Configure Telegram Pairing

```bash
# Send the pairing command
curl -X POST https://api.telegram.org/bot{BOT_TOKEN}/sendMessage \
  -d "chat_id={USER_CHAT_ID}" \
  -d "text=/pair {PAIRING_CODE}"

# Verify pairing status
cat ~/.openclaw-{agent-id}/credentials/telegram-default-allowFrom.json
```

### Step 7: Update the Registry

```bash
# Update agents/registry.md
# Add the new Agent's details
```

### Step 8: Commit to Git

```bash
cd ~/.openclaw/workspace
git add agents/{agent-id}-agent.json agents/registry.md
git commit -m "feat: deploy {Agent Name} - {agent-id}"
```

---

## 🔧 Troubleshooting

### Gateway Fails to Start

```bash
# Check the config
openclaw --profile {agent-id} doctor

# View logs
journalctl --user -u openclaw-gateway-{agent-id}.service --since "10 minutes ago"

# Check the port
ss -tlnp | grep {port}

# Check the process
ps aux | grep openclaw | grep {agent-id}
```

### Skill Fails to Load

```bash
# Check that skill.json exists
ls -la ~/.openclaw/workspace/skills/{skill-name}/

# Check the openclaw.json config
cat ~/.openclaw-{agent-id}/openclaw.json | python3 -m json.tool

# View Gateway logs
journalctl --user -u openclaw-gateway-{agent-id}.service | grep -i skill
```

### Telegram Does Not Reply

```bash
# Check pairing status
cat ~/.openclaw-{agent-id}/credentials/telegram-default-allowFrom.json

# Check the Bot Token
curl -X POST https://api.telegram.org/bot{BOT_TOKEN}/getMe

# Check the Gateway bind address
cat ~/.openclaw-{agent-id}/openclaw.json | grep bind
```

---

## 📊 Configuration Templates

### openclaw.json Template

```json
{
  "meta": {
    "lastTouchedVersion": "2026.2.22-2",
    "lastTouchedAt": "2026-02-23T00:00:00.000Z"
  },
  "env": {
    "TAVILY_API_KEY": "tvly-xxx"
  },
  "models": {
    "mode": "merge",
    "providers": {
      "bailian": {
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "apiKey": "sk-sp-xxx",
        "api": "openai-completions"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "bailian/qwen3.5-plus"
      },
      "workspace": "/root/.openclaw/workspace/agents/{agent-id}-workspace"
    },
    "list": [
      {
        "id": "{agent-id}",
        "name": "{Agent Name}",
        "workspace": "/root/.openclaw/workspace/agents/{agent-id}-workspace"
      }
    ]
  },
  "channels": {
    "telegram": {
      "enabled": true,
      "dmPolicy": "pairing",
      "botToken": "{BOT_TOKEN}",
      "groupPolicy": "allowlist",
      "streaming": "partial"
    }
  },
  "gateway": {
    "port": 18790,
    "mode": "local",
    "bind": "lan",
    "auth": {
      "mode": "token",
      "token": "{GATEWAY_TOKEN}"
    },
    "trustedProxies": ["127.0.0.1", "::1"]
  },
  "memory": {
    "backend": "qmd",
    "citations": "auto",
    "qmd": {
      "includeDefaultMemory": true,
      "update": {
        "interval": "5m",
        "debounceMs": 15000
      }
    }
  },
  "skills": {
    "install": {
      "nodeManager": "npm"
    },
    "entries": {
      "tavily": {
        "enabled": true,
        "apiKey": "tvly-xxx"
      },
      "mem0-integration": {
        "enabled": true,
        "config": {
          "agent_id": "{agent-id}",
          "user_id": "{user-id}",
          "collection_name": "mem0_v4_{agent-id}"
        }
      }
    }
  },
  "plugins": {
    "entries": {
      "telegram": {
        "enabled": true
      }
    }
  }
}
```

### systemd Service Template

```ini
[Unit]
Description=OpenClaw Gateway - {Agent Name}
Documentation=https://docs.openclaw.ai
After=network.target openclaw-gateway.service

[Service]
Type=simple
User=root
Environment="XDG_RUNTIME_DIR=/run/user/0"
Environment="DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/0/bus"
Environment="NODE_ENV=production"
Environment="TZ=Asia/Shanghai"
WorkingDirectory=/root/.openclaw-{agent-id}
ExecStart=/www/server/nodejs/v24.13.1/bin/openclaw --profile {agent-id} gateway
Restart=always
RestartSec=10s
MemoryMax=1G
CPUQuota=50%
TimeoutStopSec=30s
StandardOutput=journal
StandardError=journal
SyslogIdentifier=openclaw-gateway-{agent-id}

[Install]
WantedBy=default.target
```

---

## ⚠️ Known Risks of the QMD Memory Backend

OpenClaw uses `qmd` as the memory indexing backend for agent workspaces. This component has a **known installation compatibility issue** that is easy to trigger during a migration or upgrade.

### Root Cause

`qmd` is downloaded by OpenClaw from GitHub into a cache directory (`/www/server/nodejs/v24.13.1/cache/@GH@tobi-qmd-*/`); it is **not** installed globally as a standard npm package.

There are two failure modes:

| Case | Error | Cause |
|------|-------|-------|
| qmd installed via bun | `better-sqlite3 bindings.node` error | The native addon was compiled for bun and is incompatible with node v24 |
| Cache copy not built | `spawn qmd ENOENT` or `dist/qmd.js not found` | The TypeScript sources were never compiled into dist/ |

### When It Triggers

- ✓ After migrating to a new server (the cache directory has no dist/)
- ✓ After `openclaw update` (the cache hash changes and the old symlink breaks)
- ✓ After a Node.js version upgrade (the path changes)

### Quick Diagnosis

```bash
# 1. Check that the symlink is intact
ls -la /www/server/nodejs/v24.13.1/bin/qmd

# 2. Run it for real (must print "Usage:")
/www/server/nodejs/v24.13.1/bin/qmd --help 2>&1 | head -2

# 3. Check the gateway logs
journalctl --user -u openclaw-gateway-{agent-id} -n 20 | grep qmd
```

### Fix (standard steps after a migration or upgrade)

```bash
QMD_CACHE=$(ls -dt /www/server/nodejs/v24.13.1/cache/@GH@tobi-qmd-*/ | head -1)
cd "$QMD_CACHE" && npm install && npm run build
ln -sf "$QMD_CACHE/qmd" /www/server/nodejs/v24.13.1/bin/qmd
/www/server/nodejs/v24.13.1/bin/qmd collection list  # verify
```

> For detailed steps, see `SERVER_MIGRATION_GUIDE.md § Step 4.5`

### Model Configuration Note (MiniMax-M2.5)

If MiniMax-M2.5 is configured in OpenClaw with `"reasoning": true`, or reasoning is not explicitly disabled, the model enters extended thinking mode. The result is that **responses contain only a thinking block and the user receives no reply at all**.

```json
// In openclaw.json, default_llm/MiniMax-M2.5 must include:
{ "id": "MiniMax-M2.5", ..., "reasoning": false }
```

---

## 🎯 Post-Deployment Checklist

- [ ] Gateway service is running normally (`systemctl --user status`)
- [ ] Port is listening correctly (`ss -tlnp | grep {port}`)
- [ ] Telegram Bot is connected (logs show `starting provider`)
- [ ] Telegram pairing is complete (`allowFrom` contains the user ID)
- [ ] Skills loaded successfully (no errors in the logs)
- [ ] **QMD works**: `/www/server/nodejs/v24.13.1/bin/qmd collection list` reports no errors
- [ ] **No qmd ENOENT in the Gateway logs**: `journalctl --user -u ...
  | grep qmd`
- [ ] Mem0 collection is created (dedicated collection name)
- [ ] Log directory is created (`/logs/agents/{agent-id}/`)
- [ ] Registry is updated (`agents/registry.md`)
- [ ] Changes are committed to Git (config backup)
- [ ] Functional test passes (send a real message to verify)

---

## 📚 Related Documents

- [OpenClaw official documentation](https://docs.openclaw.ai)
- [张大师 deployment log](../logs/agents/life/2026-02-23-deployment-check.log)
- [张大师 issue fix report](../logs/agents/life/2026-02-23-issue-fixes.md)
- [Agent Registry](../agents/registry.md)

---

**Last updated:** 2026-02-23  
**Maintainer:** Eason (陈医生) 👨‍⚕️
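
---

## 🧪 Appendix: Skill Consistency Check (Sketch)

The rule from Error 5, that the skills listed in the agent definition must match `skills.entries` in the Gateway config, can be checked mechanically before restarting the Gateway. The following is a minimal Node.js sketch and is not part of OpenClaw. `findSkillMismatches` is a hypothetical helper, and it assumes the field layout used in this guide's examples: a `skills` array of `{ name, enabled }` objects in the agent definition, and `skills.entries` keyed by skill name in `openclaw.json`.

```javascript
// Hypothetical helper (not an OpenClaw API): compares the skills an agent
// definition expects with the skills enabled in the Gateway config.
// Assumes the JSON layout shown in this guide's examples.
function findSkillMismatches(agentDef, gatewayConfig) {
  // Names of skills the agent definition lists as enabled.
  const wanted = (agentDef.skills || [])
    .filter((skill) => skill.enabled !== false)
    .map((skill) => skill.name);

  // Names of skills enabled under skills.entries in the Gateway config.
  const entries = (gatewayConfig.skills && gatewayConfig.skills.entries) || {};
  const enabled = Object.keys(entries).filter(
    (name) => entries[name].enabled !== false
  );

  return {
    // Listed by the agent but not enabled in the Gateway.
    missingInGateway: wanted.filter((name) => !enabled.includes(name)),
    // Enabled in the Gateway but not listed by the agent.
    missingInAgent: enabled.filter((name) => !wanted.includes(name)),
  };
}

// Example input reproducing the mismatch from Error 5.
const agentDef = {
  skills: [{ name: 'google-calendar', enabled: true }],
};
const gatewayConfig = {
  skills: { entries: { 'google-calendar-node': { enabled: true } } },
};

console.log(findSkillMismatches(agentDef, gatewayConfig));
```

For the Error 5 input, the result lists `google-calendar` under `missingInGateway` and `google-calendar-node` under `missingInAgent`; both arrays are empty once the names match. To run it against real files, load each one with `JSON.parse(fs.readFileSync(path, 'utf8'))`, assuming the files on disk are plain JSON without comments.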