# Server Migration Guide

> Last updated: 2026-03-26
> Covers: `~/.openclaw`, `~/.openclaw-tongge`, `~/.mem0`, Docker stack (`qdrant-master`, `dozzle`)

---
## Overview

This guide documents how to migrate the full OpenClaw stack (main gateway, Tongge sub-gateway, and Mem0 memory service) to a new Ubuntu VPS while preserving structure, configuration, and memories.

### Systemd layout (read this before Step 3)

On the reference deployment, **two gateways run as user systemd units** (root's `~/.config/systemd/user/`), while the **agent monitor runs as a system unit**:

| Unit | Scope | Role |
|------|--------|------|
| `openclaw-gateway.service` | **user** (`systemctl --user`) | Main gateway (port 18789) |
| `openclaw-gateway-tongge.service` | **user** (`systemctl --user`) | Tongge gateway (port 18790) |
| `openclaw-agent-monitor.service` | **system** (`systemctl`) | Health monitor |

`/etc/systemd/system/openclaw-gateway.service` may exist as a legacy or alternate template; the **running** main gateway is typically the **user** unit above. Migrating only `/etc/systemd/system/` and running `systemctl start openclaw-gateway` **does not** restore the active user-managed gateways unless you intentionally switch to system units.

For user services to start at boot (with no interactive login), **root linger** must be enabled once:

```bash
loginctl enable-linger root
```
### When to stop the two gateways on the **old** server

The guide assumes **two user-scoped gateways**: `openclaw-gateway` (main) and `openclaw-gateway-tongge` (Tongge).

| Strategy | When to stop them on the old box | Trade-off |
|----------|----------------------------------|-----------|
| **A — Default (recommended)** | Only **after** the new server is verified (Telegram, health, logs). See **Stopping the Old Server After Migration** at the end of this doc. | Old and new may both run briefly during migration → **risk of duplicate Telegram bot replies** if both are online. |
| **B — Cutover window** | Stop them **immediately before** you run **Step 6** `enable --now` / `./deploy.sh start` on the **new** server. | Short downtime; minimizes duplicate bots. |

**Old-server commands (equivalent):**

```bash
# Option 1 — matches this repo's deploy script (stops agents in agents.yaml + monitor)
cd /root/.openclaw/workspace && ./deploy.sh stop

# Option 2 — gateways only (monitor keeps running unless you stop it separately)
systemctl --user stop openclaw-gateway openclaw-gateway-tongge
```

Use strategy **B** if duplicate bots are unacceptable; use **A** if you want a safe rollback window while testing the new machine.

---
## Step 1 — Prepare the New Server

Install Node.js v24 (must match the current version: **v24.13.1**).

**aaPanel method** (recommended — matches the `/www/server/nodejs/v24.13.1/` path):
Install Node.js 24.x from the aaPanel software store.

**Manual method:**

```bash
curl -fsSL https://deb.nodesource.com/setup_24.x | bash -
apt install -y nodejs
```

Install the global npm packages (versions must match the old server):

```bash
npm install -g openclaw clawhub mcporter pnpm
# Optional (present on the original server):
npm install -g @steipete/oracle@0.8.6 bun@1.3.9
```

Verify:

```bash
/www/server/nodejs/v24.13.1/bin/npm list -g --depth=0
```

---
## Step 2 — Copy the Three Data Directories

Run on the **old server**. Replace `NEW_SERVER_IP`:

```bash
rsync -avz --progress -e 'ssh -p 3322' \
  /root/.openclaw \
  /root/.openclaw-tongge \
  /root/.mem0 \
  root@NEW_SERVER_IP:/root/
```

Alternative (if rsync is unavailable; note the same ssh port as above):

```bash
tar czf - /root/.openclaw /root/.openclaw-tongge /root/.mem0 \
  | ssh -p 3322 root@NEW_SERVER_IP 'tar xzf - -C /'
```
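Before moving on, it can help to sanity-check that the copy is complete. A minimal sketch, assuming GNU coreutils on both hosts: run the same report on the old and new server and compare the output (`dir_bytes` and `report` are hypothetical helpers, not part of the repo).

```bash
# Sketch: print "<dir><TAB><total bytes>" for each migrated tree, so the
# same command can be run on old and new servers and the outputs diffed.
dir_bytes() {                 # apparent size of a directory, in bytes (GNU du)
  du -sb "$1" | awk '{print $1}'
}

report() {                    # one summary line per directory argument
  for d in "$@"; do
    printf '%s\t%s\n' "$d" "$(dir_bytes "$d")"
  done
}

# On each server:
#   report /root/.openclaw /root/.openclaw-tongge /root/.mem0
```

For a stricter check, `rsync -avzc --dry-run` over the same paths reports any file whose checksum differs.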
---
## Step 3 — Copy systemd Service Files

Run on the **old server**. Copy **both** the system units and the **user** units (gateways).

### 3a — System units (agent monitor)

```bash
scp -P 3322 /etc/systemd/system/openclaw-agent-monitor.service \
  root@NEW_SERVER_IP:/etc/systemd/system/

# Optional: legacy/alternate main gateway system unit (if you use it instead of the user unit)
scp -P 3322 /etc/systemd/system/openclaw-gateway.service \
  root@NEW_SERVER_IP:/etc/systemd/system/
```

### 3b — User units (main + Tongge gateways) — **required** for the standard layout

On the **new** server, ensure the directory exists (scp does not create parent paths):

```bash
ssh -p 3322 root@NEW_SERVER_IP 'mkdir -p /root/.config/systemd/user'
```

Then from the **old** server:

```bash
scp -P 3322 /root/.config/systemd/user/openclaw-gateway.service \
  /root/.config/systemd/user/openclaw-gateway-tongge.service \
  root@NEW_SERVER_IP:/root/.config/systemd/user/
```

Canonical copies also live under the repo (if you need to recreate the units without scp):

- `/root/.openclaw/workspace/systemd/openclaw-gateway-user.service` — reference for the main gateway (compare with `~/.config/systemd/user/openclaw-gateway.service`)
- `/root/.openclaw/workspace/systemd/openclaw-gateway-tongge.service` — Tongge gateway
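Before editing anything in Step 4, it can be worth seeing how far the live units drifted from the repo templates. A small sketch (`unit_drift` is a hypothetical helper, not part of the repo):

```bash
# Sketch: show the diff between a repo template and the installed unit;
# prints "identical" when there is no drift (diff exits non-zero on drift).
unit_drift() {      # usage: unit_drift <template> <installed-unit>
  diff -u "$1" "$2" && echo "identical"
}

# unit_drift /root/.openclaw/workspace/systemd/openclaw-gateway-user.service \
#            /root/.config/systemd/user/openclaw-gateway.service
# unit_drift /root/.openclaw/workspace/systemd/openclaw-gateway-tongge.service \
#            /root/.config/systemd/user/openclaw-gateway-tongge.service
```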
**`openclaw-gateway.service` (user)** — main gateway

- `ExecStart`: typically `node` + `openclaw/dist/index.js` `gateway --port 18789` (see the actual unit on disk)
- `EnvironmentFile=-/root/.openclaw/workspace/systemd/gateway.env`

**`openclaw-gateway-tongge.service` (user)** — Tongge gateway

- `WorkingDirectory=/root/.openclaw-tongge`
- `ExecStart`: `/www/server/nodejs/v24.13.1/bin/openclaw --profile tongge gateway`
- `EnvironmentFile=-/root/.openclaw/workspace/systemd/tongge-gateway.env` (a path under **`~/.openclaw`**, migrated with the `.openclaw` tree)

**`openclaw-agent-monitor.service` (system)** — agent health monitor

- `WorkingDirectory`: `/root/.openclaw/workspace`
- `ExecStart`: `/usr/bin/node /root/.openclaw/workspace/agent-monitor.js`
- `ReadWritePaths`: `/root/.openclaw/workspace/logs`
- `MemoryMax`: 512M, `CPUQuota`: 20%

---
## Step 4 — Fix Node.js Paths on the New Server

If the new server uses a different Node.js install path, update **all** units that reference `/www/server/nodejs/v24.13.1/`:

```bash
# Check the actual paths
which node
which openclaw

# Edit if needed
nano /root/.config/systemd/user/openclaw-gateway.service
nano /root/.config/systemd/user/openclaw-gateway-tongge.service
nano /etc/systemd/system/openclaw-agent-monitor.service
# Optional system gateway template:
nano /etc/systemd/system/openclaw-gateway.service
```

Key fields to verify:

- `ExecStart=` — correct paths to the `openclaw` and `node` binaries
- `Environment=PATH=` (if present) — must include the Node.js `bin/` directory

Also update **`memory.qmd.command`** in `/root/.openclaw-tongge/openclaw.json` if the absolute path to `qmd` changes (see Step 4.5).
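The edits above can also be scripted. A hedged sketch — `rewrite_prefix` is a hypothetical helper and the example prefixes are assumptions; review the `.bak` diff before reloading units:

```bash
# Sketch: replace every occurrence of the old Node prefix with the new one
# inside a unit file, keeping a .bak copy for rollback (GNU sed).
rewrite_prefix() {   # usage: rewrite_prefix <old-prefix> <new-prefix> <file>
  sed -i.bak "s|$1|$2|g" "$3"
}

# Example (substitute your real paths):
# rewrite_prefix /www/server/nodejs/v24.13.1 /usr/local/node-v24 \
#   /root/.config/systemd/user/openclaw-gateway.service
```

Using `|` as the sed delimiter avoids escaping the slashes in the paths.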
---
## Step 4.5 — Install the QMD Memory Backend (critical)

> **Why this step:** OpenClaw uses `qmd` (Quick Markdown Database) as the agent workspace memory backend.
> qmd must be installed independently via npm; it must not come from the openclaw cache and does not depend on bun.
> The `better-sqlite3` bundled in openclaw's cache was compiled with the bun runtime; it is ABI-incompatible with Node v24 and causes `bindings.node` errors.

### Install (one command)

```bash
/www/server/nodejs/v24.13.1/bin/npm install -g @tobilu/qmd
```

After installation, npm automatically creates the global symlink:

```
/www/server/nodejs/v24.13.1/bin/qmd → ../lib/node_modules/@tobilu/qmd/bin/qmd
```

### Verify

```bash
# Check that the symlink points at the global npm package (not a cache directory)
ls -la /www/server/nodejs/v24.13.1/bin/qmd

# Check that qmd runs
/www/server/nodejs/v24.13.1/bin/qmd --help 2>&1 | head -3

# Check the collection command works (before real use)
/www/server/nodejs/v24.13.1/bin/qmd collection list
```

### Common errors (install / command-line stage)

At this stage the gateway has not started yet, so there are **no** gateway logs in systemd/journal; go by the terminal output of **`qmd collection list`** and **`qmd --help`**.

Error signatures and causes (may appear on the command line or later in gateway logs):

- `spawn .../bin/qmd ENOENT` — broken symlink or the npm package is not installed; re-run `npm install -g @tobilu/qmd`
- `bindings.node` / `better-sqlite3` errors — a bun-compiled copy from the openclaw cache is being used; reinstall the npm package over it
- `Cannot find module '.../dist/qmd.js'` — dist was not built; the npm package version is too old

### After an openclaw upgrade

`openclaw update` **does not affect** the globally npm-installed qmd (the two are fully independent). No qmd work is needed after an upgrade.

If qmd problems do appear after an openclaw upgrade, check whether openclaw overwrote the symlink:

```bash
ls -la /www/server/nodejs/v24.13.1/bin/qmd
# Should point at ../lib/node_modules/@tobilu/qmd/bin/qmd
# If it points at a cache/ directory, re-run:
/www/server/nodejs/v24.13.1/bin/npm install -g @tobilu/qmd
```
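The symlink check above can be made mechanical. A sketch (`qmd_is_npm_copy` is a hypothetical helper; it only inspects where the given path resolves):

```bash
# Sketch: succeed only if the given qmd resolves into the npm global
# package tree rather than an openclaw cache directory.
qmd_is_npm_copy() {   # usage: qmd_is_npm_copy <path-to-qmd>
  case "$(readlink -f "$1")" in
    */node_modules/@tobilu/qmd/*) echo "ok: npm copy" ;;
    *) echo "suspect target: $(readlink -f "$1")"; return 1 ;;
  esac
}

# qmd_is_npm_copy /www/server/nodejs/v24.13.1/bin/qmd
```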
### Key facts

| Item | Value |
|------|-----|
| Install command | `/www/server/nodejs/v24.13.1/bin/npm install -g @tobilu/qmd` |
| npm package name | `@tobilu/qmd` |
| Current version | `2.0.1` |
| Symlink location | `/www/server/nodejs/v24.13.1/bin/qmd` |
| Symlink target | `../lib/node_modules/@tobilu/qmd/bin/qmd` |
| Tongge gateway config | `memory.qmd.command: "/www/server/nodejs/v24.13.1/bin/qmd"` (absolute path) |
| Why not the openclaw cache copy | the cached `better-sqlite3` is compiled for the bun ABI, incompatible with Node v24 |
| Effect of openclaw upgrades | none (the global npm package is independent of the openclaw cache) |

---
## Attachment — Environment Dependencies

This appendix collects the infrastructure and environment dependencies that must be confirmed after migrating and before starting anything, so that you are not debugging the moment a gateway starts.

### A. OneAPI LLM gateway (infrastructure, not mem0)

OneAPI gateway deployment directory:

- `/root/.openclaw/workspace/infrastructure/oneapi/`

Key points:

- In `docker-compose.yml` the container port is bound as `TAILSCALE_IP:3000:3000`
- In `/root/.openclaw/workspace/infrastructure/oneapi/.env`, only `TAILSCALE_IP` needs to be changed to the new machine's real value
- After startup, the OneAPI admin console should be reachable at `http://<TAILSCALE_IP>:3000` (default `root / 123456`)

Alignment checks against OpenClaw (change/verify both):

- Main gateway: `LLM_BASE_URL` and `LLM_API_KEY` in `/root/.openclaw/.env` (plus the default provider references in the corresponding `openclaw.json`)
- Tongge gateway: `LLM_BASE_URL` and `LLM_API_KEY` in `/root/.openclaw-tongge/.env`

Mind the `/v1` convention when migrating: if `LLM_BASE_URL` already ends in `/v1`, the client appends only `/chat/completions`; if it does not, the client appends `/v1/chat/completions`. Keep the URL consistent with whichever convention your client follows, or requests will 404 on a doubled `/v1/v1/...` path.
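The `/v1` joining rule above can be sketched as a tiny helper (hypothetical, for illustration only; it mirrors the common OpenAI-compatible convention):

```bash
# Sketch: join LLM_BASE_URL with the chat-completions path without
# producing a doubled /v1/v1/... segment.
chat_url() {   # usage: chat_url <base-url>
  case "${1%/}" in                       # tolerate a trailing slash
    */v1) echo "${1%/}/chat/completions" ;;
    *)    echo "${1%/}/v1/chat/completions" ;;
  esac
}

# chat_url http://100.115.94.1:3000      -> http://100.115.94.1:3000/v1/chat/completions
# chat_url http://100.115.94.1:3000/v1   -> http://100.115.94.1:3000/v1/chat/completions
```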
### B. Control UI access dependencies (Tailscale Serve + allowedOrigins)

A common migration failure: the Control UI will not open, usually because the new machine's Tailscale hostname/IP does not match the old configuration.

Check both `openclaw.json` files:

- `gateway.controlUi.allowedOrigins`: add the new machine's HTTPS origin (including the port, if not 443)
- `gateway.trustedProxies`: add the new machine's Tailscale IP (or whatever proxy set matches your current access path)

For reference, see `/root/.openclaw/workspace/docs/CONTROL_UI_ACCESS_AND_SECURITY.md`

### C. Telegram plugin dependencies (each gateway needs its own token)

`plugins.allow` in `openclaw.json` includes `telegram`; each gateway's token comes from its own env file:

- Main gateway env: `TELEGRAM_BOT_TOKEN` in `/root/.openclaw/workspace/systemd/gateway.env`
- Tongge env: `TELEGRAM_BOT_TOKEN` in `/root/.openclaw/workspace/systemd/tongge-gateway.env`

If the old and new servers are both online during the cutover window, duplicate replies are possible; see the "Stopping the Old Server" strategy at the end of this doc.

### D. Python3 dependency (mem0-integration)

Tongge's `openclaw.json` configures mem0-integration with:

- `pythonPath: /usr/bin/python3`

So when migrating, confirm that `/usr/bin/python3` exists and is executable on the new machine.

### E. qmd command path consistency (avoid ENOENT / `better-sqlite3` `bindings.node` errors)

qmd is not installed together with openclaw; you must:

- Install it under the Node prefix: `npm install -g @tobilu/qmd`
- Point `memory.qmd.command` in `openclaw-tongge/openclaw.json` at the absolute path of the real `qmd` on the new machine

Also make sure the systemd units' PATH/ExecStart prefix can find the matching qmd and its dependencies (especially the `better-sqlite3` ABI).

---
## Step 5 — Restore the Cron Jobs |
||||||
|
|
||||||
|
Cron jobs for the tongge gateway (daily fortune, active learning) are stored in `/root/.openclaw-tongge/cron/jobs.json` and managed by OpenClaw's built-in scheduler. They are migrated automatically as part of the `.openclaw-tongge/` directory copy — **no manual crontab setup required**. |
||||||
|
|
||||||
|
Verify the jobs loaded after gateway start: |
||||||
|
```bash |
||||||
|
# Gateway 启动后约 30s 检查(jobs.json 会被 Gateway 写入 nextRunAtMs) |
||||||
|
cat /root/.openclaw-tongge/cron/jobs.json | python3 -m json.tool | grep -E "name|nextRunAt|enabled" |
||||||
|
``` |
||||||
|
|
||||||
|
Expected output shows `tongge-daily-fortune` and `tongge-active-learning` with `nextRunAtMs` values set. |
||||||
|
|
||||||
|
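For a pass/fail version of the check above, a sketch — it assumes `jobs.json` holds the jobs either as a top-level array or under a `jobs` key, with the `name`/`enabled`/`nextRunAtMs` fields shown above; verify against your actual file:

```bash
# Sketch: assert both expected Tongge jobs are enabled and scheduled.
python3 - /root/.openclaw-tongge/cron/jobs.json <<'PY'
import json, sys

data = json.load(open(sys.argv[1]))
items = data["jobs"] if isinstance(data, dict) else data   # tolerate either shape
jobs = {j["name"]: j for j in items}
for name in ("tongge-daily-fortune", "tongge-active-learning"):
    job = jobs[name]                      # KeyError -> job did not migrate
    assert job.get("enabled", True), f"{name} is disabled"
    assert job.get("nextRunAtMs"), f"{name} has no nextRunAtMs"
print("cron jobs OK")
PY
```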
### mem0 cleanup — `/etc/cron.d/mem0-cleanup`

This file is **not** under `~/.mem0`; copy it from the old server.

From the **old** server:

```bash
scp -P 3322 /etc/cron.d/mem0-cleanup root@NEW_SERVER_IP:/etc/cron.d/mem0-cleanup
```

On the **new** server, ensure ownership and mode are valid for `cron.d` (typical: root, `0644`):

```bash
chown root:root /etc/cron.d/mem0-cleanup
chmod 0644 /etc/cron.d/mem0-cleanup
```

The job runs `memory_cleanup.py` under `/root/.openclaw/workspace/skills/mem0-integration/` and logs to `/root/.openclaw/workspace/logs/security/cleanup-cron.log` — those paths come over with the `.openclaw` rsync.

---
## Step 5.5 — `deploy.sh` and `agents.yaml` (how this repo manages gateways)

The production control plane lives at **`/root/.openclaw/workspace/deploy.sh`**. It reads **`/root/.openclaw/workspace/agents.yaml`** (via `scripts/parse_agents.py`) to know which gateways exist:

- **main** — `local-cli`: start/stop uses `openclaw gateway start` / `gateway status` (paths are **hardcoded in `agents.yaml`**).
- **tongge** — `local-systemd`: unit `openclaw-gateway-tongge.service`, installed from **`workspace/systemd/openclaw-gateway-tongge.service`**.

| Command | What it does |
|---------|----------------|
| `./deploy.sh install` | Enables `loginctl` linger, copies **`openclaw-gateway-user.service` → `~/.config/systemd/user/openclaw-gateway.service`**, installs the tongge unit from its template, installs the **system** `openclaw-agent-monitor`, runs `fix-service`, and **starts** all services. |
| `./deploy.sh start` / `stop` / `restart` | Start/stop/restart all agents from `agents.yaml` + the monitor. |
| `./deploy.sh health` | Health check (user units + monitor + disk/memory/linger). |
| `./deploy.sh fix-service` | Re-injects `EnvironmentFile=` into units after an OpenClaw UI upgrade (see the script header). |

**Migration implications:**

1. **If you `scp`'d live units from the old server** — they may differ from the templates. Either keep using **manual** `systemctl --user enable --now` (Step 6), **or** merge your path edits into `workspace/systemd/*.service` and **`agents.yaml`**, then run `./deploy.sh install` on a **clean** tree (backup first). **`install` overwrites** `~/.config/systemd/user/openclaw-gateway.service` from `openclaw-gateway-user.service`.
2. **Always update `agents.yaml`** after changing the Node prefix: fields like `service.check_cmd` and `service.start_cmd` under **`main`** must point at the same `node`/`openclaw` paths as on disk (e.g. `/www/server/nodejs/v24.14.0/bin/openclaw`). Otherwise `./deploy.sh stop` / `health` / `start` will call the wrong binaries.
3. **Tongge** does not use `start_cmd` in the yaml; it uses the **systemd unit** only — edit **`ExecStart`** in the installed user unit (or template) for new Node paths.

**Suggested workflow on the new server:** finish Steps 1–5 and the **pre-flight checklist** below; then either **Step 6** (explicit `systemctl`) **or** `cd /root/.openclaw/workspace && ./deploy.sh start` (after units exist and the `agents.yaml` paths match). For a full reinstall of units from the repo templates, use `./deploy.sh install` **only** when you intend to replace the user units with the templates.

---
## Step 5.9 — Pre-flight checklist (before starting gateways)

Complete this **before** `systemctl --user enable --now …` or `./deploy.sh start` / `install` to reduce boot-loop and missing-binary errors:

**Environment & secrets**

- [ ] `/root/.openclaw/workspace/systemd/gateway.env` and `tongge-gateway.env` exist (they came with the rsync).
- [ ] `/root/.openclaw/.env` and `/root/.openclaw-tongge/.env` are present if your setup expects them.

**Node / OpenClaw / qmd paths (must be consistent)**

- [ ] `which node`, `which openclaw`, `which qmd` on the new server — **one** prefix (e.g. all under `/www/server/nodejs/v24.x.x/`).
- [ ] User systemd units: `ExecStart` paths updated (Step 4).
- [ ] `/root/.openclaw-tongge/openclaw.json` → `memory.qmd.command` matches the real `qmd` binary (Step 4.5).
- [ ] If using **`deploy.sh`**: `/root/.openclaw/workspace/agents.yaml` → **`main.service.check_cmd` / `main.service.start_cmd`** updated to the same `openclaw` path as above.

**Network & dependencies**

- [ ] If the new host has a **new Tailscale IP** or the LLM moved: edit both **`openclaw.json`** files (Step 6.5).
- [ ] If agents use Mem0 + local Qdrant: the Docker stack in `/opt/mem0-center/` is **up** and `localhost:6333` is reachable **before** expecting mem0 in the gateway (order: the Docker section can run **before** Step 6 if needed).
- [ ] `python3` available for `parse_agents.py`, the cron `memory_cleanup.py`, and optional `json.tool` checks.

**systemd user session**

- [ ] `loginctl enable-linger root` will be run (Step 6) so user gateways survive reboot.
- [ ] `/root/.config/systemd/user/` contains the two gateway units (or you will run `./deploy.sh install` to generate them).

**Optional: old server**

- [ ] If using the **cutover strategy** (see Overview): stop the old server's gateways **now**, then proceed to Step 6 on the new server immediately.
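The "one prefix" item above can be checked mechanically. A sketch (`same_prefix` is a hypothetical helper):

```bash
# Sketch: succeed only when every named tool resolves into the same bin/
# directory on PATH.
same_prefix() {   # usage: same_prefix <tool>...
  local ref="" p
  for tool in "$@"; do
    p=$(command -v "$tool") || { echo "missing: $tool"; return 1; }
    if [ -z "$ref" ]; then ref=$(dirname "$p"); fi
    [ "$(dirname "$p")" = "$ref" ] || { echo "prefix mismatch: $p"; return 1; }
  done
  echo "all under $ref"
}

# same_prefix node openclaw qmd
```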
### Pre-run confirmation (environment dependencies and config alignment)

Before actually running Step 6 (enable/start), do one more cross-component alignment pass to confirm that the key dependencies from the Attachment (environment dependencies) are in place:

- OneAPI: the `openclaw-llm-gateway` container is up and the admin console is reachable at `http://<TAILSCALE_IP>:3000`; `LLM_BASE_URL` and `LLM_API_KEY` for both the main and Tongge gateways match a key created in the OneAPI console
- Telegram: `TELEGRAM_BOT_TOKEN` is present in both the main gateway env (`workspace/systemd/gateway.env`) and the Tongge env (`workspace/systemd/tongge-gateway.env`)
- Control UI: if the new machine's Tailscale IP/hostname changed, `gateway.controlUi.allowedOrigins` and `gateway.trustedProxies` have been updated in both `openclaw.json` files
- Python3: `/usr/bin/python3` exists and is executable (required by mem0-integration)
- qmd: `memory.qmd.command` in `openclaw-tongge/openclaw.json` points at the absolute path of the real `qmd` on the new machine

---
## Step 6 — Enable and Start Services

Complete **Step 5.9** first. Below is the **manual** `systemctl` path; the equivalent using **`./deploy.sh`**: after units exist and the **`agents.yaml`** paths match, run `cd /root/.openclaw/workspace && ./deploy.sh start` (or `./deploy.sh install` only when intentionally (re)installing units from templates — see Step 5.5).

```bash
# User linger (once per machine): required for user-scoped gateways at boot
loginctl enable-linger root

systemctl daemon-reload
systemctl enable --now openclaw-agent-monitor

systemctl --user daemon-reload
systemctl --user enable --now openclaw-gateway openclaw-gateway-tongge
```

Verify:

```bash
systemctl status openclaw-agent-monitor
systemctl --user status openclaw-gateway
systemctl --user status openclaw-gateway-tongge

# Logs (pick one)
journalctl --user -u openclaw-gateway -f
journalctl --user -u openclaw-gateway-tongge -f
```

**qmd-related (only after the gateway has started):** if the command-line checks in Step 4.5 passed but you still suspect a qmd problem at runtime, filter the journal (no output means no recent matching lines, which is normal):

```bash
journalctl --user -u openclaw-gateway -n 80 | grep -i qmd
journalctl --user -u openclaw-gateway-tongge -n 80 | grep -i qmd
# If you still use the system-level openclaw-gateway:
journalctl -u openclaw-gateway -n 80 | grep -i qmd
```

If you intentionally use the **system** `openclaw-gateway.service` instead of the user unit, enable that unit and **avoid** running two main gateways at once.

---
## Step 6.5 — Update IPs and bind lists (if the new host differs)

`openclaw.json` in both **`~/.openclaw`** and **`~/.openclaw-tongge`** may hardcode the previous machine's Tailscale IP (e.g. `100.115.94.1`) under:

- `models.providers.*.baseUrl` (e.g. local LLM / gateway at `:3000`)
- `gateway.bind` / allowlists (e.g. `:18789`, `:18790`)

If the new VPS gets a **different Tailscale address** or you change where the LLM API runs, search and update those URLs consistently in **both** config files. Skip this if the new server reuses the same Tailscale IP and service topology.
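One way to find every spot that still needs editing is a sketch like this (`100.115.94.1` is this guide's example old address; extend the include patterns to match your config layout):

```bash
# Sketch: list every config file still mentioning the old Tailscale IP.
OLD_IP='100.115.94.1'
grep -rn --include='*.json' --include='.env' --include='*.env' \
  --include='*.yml' --include='*.yaml' -e "$OLD_IP" \
  /root/.openclaw /root/.openclaw-tongge /opt/mem0-center 2>/dev/null \
  || echo "no stale references to $OLD_IP"
```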
**Mem0 skill** (`/root/.openclaw/workspace/skills/mem0-integration/config.yaml`) uses `localhost:6333` for Qdrant; keep Qdrant on the same host as OpenClaw or change `host`/`port` to match your Docker/Qdrant deployment.

---
## Step 7 — Verify Telegram Bots

Both bots should respond within a minute of the gateways starting. Bot tokens are stored in `.env` files and systemd `EnvironmentFile`s — no changes are needed as long as the paths stay under `/root/...`. Check:

- Main gateway bot: `/root/.openclaw/.env` and `/root/.openclaw/workspace/systemd/gateway.env`
- Tongge bot: `/root/.openclaw-tongge/.env` and `/root/.openclaw/workspace/systemd/tongge-gateway.env`
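A quick presence check for both tokens, without echoing the secrets (`token_defined` is a hypothetical helper):

```bash
# Sketch: report whether each env file defines TELEGRAM_BOT_TOKEN,
# without printing the token value.
token_defined() {   # usage: token_defined <env-file>...
  for f in "$@"; do
    if grep -q '^TELEGRAM_BOT_TOKEN=' "$f"; then
      echo "ok: $f"
    else
      echo "MISSING: $f"
    fi
  done
}

# token_defined /root/.openclaw/workspace/systemd/gateway.env \
#               /root/.openclaw/workspace/systemd/tongge-gateway.env
```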
---
## What Is Portable (No Changes Needed)

| Item | Location |
|------|----------|
| Main gateway config & agents | `/root/.openclaw/openclaw.json` |
| Tongge gateway config & agents | `/root/.openclaw-tongge/openclaw.json` |
| Main gateway secrets (systemd) | `/root/.openclaw/workspace/systemd/gateway.env` |
| Tongge gateway secrets (systemd) | `/root/.openclaw/workspace/systemd/tongge-gateway.env` |
| Bot tokens in tree | `/root/.openclaw/.env`, `/root/.openclaw-tongge/.env` |
| Mem0 SQLite memory history | `/root/.mem0/history.db` |
| Mem0 user identity | `/root/.mem0/config.json` |
| Qdrant migration metadata (if present) | `/root/.mem0/migrations_qdrant/` |
| Workspace scripts, skills, agents | `/root/.openclaw/workspace/` |
| Tongge workspace | `/root/.openclaw-tongge/workspace/` |
| Credentials & delivery queue | `credentials/`, `delivery-queue/` |

---
## What May Need Updating

| Item | Action |
|------|--------|
| Node.js binary paths in **user + system** `.service` files | Update if the install path differs from `/www/server/nodejs/v24.13.1/` |
| `openclaw-gateway.service` (user or system) | Re-apply custom `EnvironmentFile` / `ExecStart` after some `openclaw update` flows; `gateway.env` survives if referenced |
| Tailscale IPs / LLM base URLs in `openclaw.json` | See Step 6.5 when the host or upstream API address changes |
| Qdrant vector store (if used) | See the Docker stack section below |
| `node_modules` in `~/.openclaw-tongge/` | Re-run `npm install` in that directory if any native modules break |

---
## Docker Stack Migration (Qdrant + Dozzle)

Install Docker Engine and the Compose plugin (`docker compose` v2) on the new server before bringing the stack up. Both services are managed by `/opt/mem0-center/docker-compose.yml`.

### Step A — Copy the compose stack

```bash
rsync -avz --progress -e 'ssh -p 3322' /opt/mem0-center/ root@NEW_SERVER_IP:/opt/mem0-center/
```

This includes `docker-compose.yml`, `qdrant_storage/` (vector data), and `snapshots/`.

### Step B — Update the Dozzle port binding on the new server

Dozzle **must** bind to the new server's Tailscale IP. Edit the compose file after copying:

```bash
# Get the new server's Tailscale IP
tailscale ip -4

# Update the port binding (replace OLD_IP with the value above)
nano /opt/mem0-center/docker-compose.yml
# Change: "100.115.94.1:9999:8080"
# To:     "<NEW_TAILSCALE_IP>:9999:8080"
```

> Binding to `127.0.0.1` breaks remote access. Binding to `0.0.0.0` exposes the port publicly.
> Always bind to the Tailscale IP specifically. See `DOZZLE_LOG_OBSERVABILITY.md` for details.
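The manual edit can also be scripted. A sketch (`rebind_dozzle` is a hypothetical helper; `100.115.94.1` is the old IP from this guide, and a `.bak` copy is kept for rollback):

```bash
# Sketch: swap the old Tailscale IP in the Dozzle port binding for the
# new one, keeping a .bak copy (GNU sed, BRE backreference).
rebind_dozzle() {   # usage: rebind_dozzle <new-ip> <compose-file>
  sed -i.bak "s/100\.115\.94\.1\(:9999:8080\)/$1\1/" "$2"
}

# rebind_dozzle "$(tailscale ip -4)" /opt/mem0-center/docker-compose.yml
# grep -n '9999:8080' /opt/mem0-center/docker-compose.yml   # confirm
```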
### Step C — Start the stack

```bash
cd /opt/mem0-center
docker compose up -d
```

Verify both containers are healthy:

```bash
docker ps --format "table {{.Names}}\t{{.Status}}"
# Expected: qdrant-master   Up ... (healthy)
#           dozzle          Up ... (healthy)
```

Access Dozzle at `http://<NEW_TAILSCALE_IP>:9999`.

### Healthcheck note

Both containers use non-standard healthchecks (there is no `wget`/`curl` in the images):

- **Dozzle**: `["CMD", "/dozzle", "healthcheck"]` — built-in binary, no shell needed
- **Qdrant**: `CMD-SHELL` with a bash `/dev/tcp` TCP probe

If you upgrade either image and its healthcheck breaks, refer to `DOZZLE_LOG_OBSERVABILITY.md` §4.

---
## Post-Migration Checklist

- [ ] **Step 5.9 pre-flight** completed (paths, env files, optional Docker Qdrant, `agents.yaml` if using `deploy.sh`)
- [ ] Node.js v24 installed and at the correct path
- [ ] Global npm packages installed (`openclaw`, `clawhub`, `mcporter`, `pnpm` — match Step 1)
- [ ] `~/.openclaw` copied and permissions intact (`chmod 700`)
- [ ] `~/.openclaw-tongge` copied and permissions intact (`chmod 700`)
- [ ] `~/.mem0` copied (including `history.db` and `migrations_qdrant/` if present)
- [ ] **`agents.yaml`** `main.service.*` paths match the real `openclaw` if using `./deploy.sh` (Step 5.5)
- [ ] Systemd: `openclaw-agent-monitor.service` + user units `openclaw-gateway.service` and `openclaw-gateway-tongge.service` installed; paths verified
- [ ] `loginctl enable-linger root` (for user-scoped gateways at boot)
- [ ] `systemctl daemon-reload` and `systemctl --user daemon-reload` run
- [ ] Gateways + monitor started (**Step 6** manual **or** `./deploy.sh start` / `install` per Step 5.5)
- [ ] **QMD memory backend installed and verified** (Step 4.5):
  - [ ] Ran `/www/server/nodejs/v24.13.1/bin/npm install -g @tobilu/qmd`
  - [ ] `ls -la /www/server/nodejs/v24.13.1/bin/qmd` — points at `../lib/node_modules/@tobilu/qmd/bin/qmd` (not a cache directory)
  - [ ] `/www/server/nodejs/v24.13.1/bin/qmd collection list` — runs without errors
  - [ ] No `qmd ENOENT` or `better-sqlite3` errors in the gateway logs
- [ ] Tongge cron jobs loaded (check `nextRunAtMs` in `/root/.openclaw-tongge/cron/jobs.json`)
- [ ] mem0 cleanup cron restored (`/etc/cron.d/mem0-cleanup`)
- [ ] Telegram bots responding
- [ ] `/opt/mem0-center/` copied to the new server
- [ ] Dozzle port binding updated to the new server's Tailscale IP
- [ ] `docker compose up -d` run in `/opt/mem0-center/`
- [ ] `qdrant-master` and `dozzle` both show `(healthy)`
- [ ] Dozzle accessible at `http://<NEW_TAILSCALE_IP>:9999`
- [ ] Old server services stopped (to avoid duplicate bot responses)

---
## Stopping the Old Server After Migration

**When:** After the **new** server is confirmed working (strategy **A** in the Overview). If you already stopped the gateways on the old box before Step 6 (strategy **B**), only disable what is still running and skip the duplicate `stop` commands.

Stop the old services to avoid duplicate Telegram bot responses:

```bash
# On the OLD server — preferred: deploy script stops agents.yaml + monitor
cd /root/.openclaw/workspace && ./deploy.sh stop

# Or stop user units + monitor manually:
systemctl --user stop openclaw-gateway openclaw-gateway-tongge
systemctl --user disable openclaw-gateway openclaw-gateway-tongge

systemctl stop openclaw-agent-monitor
systemctl disable openclaw-agent-monitor

# If you still had a system-level main gateway enabled:
systemctl stop openclaw-gateway 2>/dev/null || true
systemctl disable openclaw-gateway 2>/dev/null || true
```