# MEMORY.md - Long-term Memory

This file contains curated long-term memories and important context.

## Memory Management Strategy

- **MEMORY.md**: Curated long-term memories, important decisions, security templates, and key configurations
- **QMD System**: Automated memory backend with semantic search, auto-updates every 5 minutes
- **Usage**: Write significant learnings to MEMORY.md; rely on QMD for daily context and automation
- **Access**: MEMORY.md loaded only in main sessions (direct chats) for security

## QMD Configuration

- Backend: qmd
- Auto-update: every 5 minutes
- Include default memory: true
- Last verified: 2026-02-20

## Server Security Hardening Template (2026-02-20)

### Environment

- **Server**: Ubuntu 24.04 LTS VPS (KVM)
- **Panel**: 宝塔面板 (BT-Panel) on port 888
- **Public IP**: 204.12.203.203

### Security Configuration Applied

1. **Port Exposure Minimization**:
   - Only ports 80 (HTTP) and 443 (HTTPS) publicly accessible
   - SSH (port 22) restricted to internal/network access only
   - OpenClaw gateway (port 18789) bound to localhost only
   - All other services (MySQL, custom apps) internal-only
2. **OpenClaw Secure Deployment**:
   - Gateway configured with `bind: "localhost"` instead of `"lan"`
   - Access exclusively through Nginx reverse proxy with HTTPS
   - Token-based authentication enabled
   - WebSocket support properly configured in Nginx
3. **Firewall Management**:
   - Use 宝塔面板 (BT-Panel) built-in firewall for port management
   - Alternative: system-level firewall (ufw/iptables) if no panel available
   - Regular external port scanning to verify exposure
4. **Critical Security Principles**:
   - Never expose sensitive services directly to public internet
   - Always use reverse proxy with TLS termination for web services
   - Implement defense in depth (firewall + service binding + authentication)
   - Regular security audits using `openclaw security audit --deep`

### Migration Checklist for New Servers

- [ ] Install and configure 宝塔面板 or equivalent server management panel
- [ ] Set up Nginx reverse proxy with proper WebSocket support
- [ ] Configure OpenClaw with localhost binding only
- [ ] Restrict public ports to 80/443 only via firewall
- [ ] Enable automatic security updates
- [ ] Run initial security audit and document baseline
- [ ] Schedule periodic security audits via OpenClaw cron

### Lessons Learned

- Panel-based firewalls (宝塔/aapanel) must be verified with external port scans
- Direct service exposure (like OpenClaw on 0.0.0.0) creates critical security risks
- Nginx reverse proxy configuration is essential for secure OpenClaw deployment

## Agent Operations Logging Practice (2026-02-20)

### Log Directory Structure

- `/root/.openclaw/workspace/logs/operations/` - Manual operations and important changes
- `/root/.openclaw/workspace/logs/system/` - System-generated logs
- `/root/.openclaw/workspace/logs/agents/` - Individual agent logs
- `/root/.openclaw/workspace/logs/security/` - Security operations and audits

### Automatic Logging Triggers

1. **Configuration Changes**: Any modification to config files (.json, .yaml, etc.)
2. **Security Modifications**: Firewall rules, authentication changes, port modifications
3. **Agent Lifecycle**: Deployment, updates, removal of agents
4. **System Optimizations**: Performance tuning, resource allocation changes
5. **Troubleshooting**: Error diagnosis and resolution procedures
6. **Memory Updates**: Significant changes to MEMORY.md or memory management

### Log Format Standard

- **Filename**: `YYYY-MM-DD-HH-MM-SS-description.log`
- **Timestamp**: UTC time format
- **Content**: `[TIMESTAMP] [OPERATION_TYPE] [AGENT/USER] Description with before/after state`

### Implementation Guidelines

- Always log before making changes (capture current state)
- Include rollback instructions when applicable
- Redact sensitive information (passwords, tokens, private keys)
- Reference related MEMORY.md entries for context
- Use QMD for routine operational context, MEMORY.md for strategic decisions

## Agent Health Monitoring & Alerting System (2026-02-20)

### Features Implemented

1. **Crash Detection**: Monitors uncaught exceptions and unhandled rejections
2. **Health Checks**: Periodic service health verification (every 30 seconds)
3. **Multi-Channel Notifications**: Telegram alerts for critical events
4. **Automatic Logging**: All alerts logged to `/logs/agents/health-YYYY-MM-DD.log`
5. **Extensible Design**: Easy to add new notification channels

### Components Created

- **Skill**: `agent-monitor/SKILL.md` - Documentation and usage guide
- **Monitor Script**: `agent-monitor.js` - Core monitoring logic
- **Startup Script**: `start-agent-monitor.sh` - Easy deployment
- **Log Directory**: `/logs/agents/` - Dedicated logging location

### Alert Severity Levels

- **CRITICAL**: Process crashes, uncaught exceptions
- **ERROR**: Unhandled rejections, failed operations
- **WARNING**: Health check failures, performance issues
- **INFO**: Service status updates, recovery notifications

### Integration Points

- Automatically integrated with existing Telegram channel
- Compatible with OpenClaw's agent architecture
- Works alongside existing logging and memory systems
- Can monitor any Node.js-based agent process

### Usage Instructions

1. Source the startup script: `source /root/.openclaw/workspace/start-agent-monitor.sh`
2. Call `startAgentMonitor("agent-name", healthCheckFunction)`
3. Monitor automatically sends alerts on errors/crashes
4. Check logs in `/logs/agents/` for detailed information
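
### Example Alert Entry (sketch)

When reading the files in `/logs/agents/`, it may help to see how the daily filename and the severity levels fit together. The Python sketch below is illustrative only and is not taken from `agent-monitor.js`, whose internals are not recorded in this file. The event-type keys, the function names, and the ISO 8601 timestamp style are assumptions made for this example; it also assumes alerts reuse the bracketed layout from "Log Format Standard", with the severity level in the operation-type slot.

```python
from datetime import datetime, timezone

# Severity for each event type, following the "Alert Severity Levels" list.
# The event-type keys are hypothetical names chosen for this sketch.
SEVERITY_BY_EVENT = {
    "process_crash": "CRITICAL",
    "uncaught_exception": "CRITICAL",
    "unhandled_rejection": "ERROR",
    "failed_operation": "ERROR",
    "health_check_failure": "WARNING",
    "performance_issue": "WARNING",
    "status_update": "INFO",
    "recovery": "INFO",
}


def health_log_path(now: datetime) -> str:
    """Return the daily health log path, /logs/agents/health-YYYY-MM-DD.log (UTC date)."""
    return f"/logs/agents/health-{now.astimezone(timezone.utc):%Y-%m-%d}.log"


def format_alert(now: datetime, event: str, agent: str, description: str) -> str:
    """Build one entry as [TIMESTAMP] [SEVERITY] [AGENT] description.

    Assumption: the timestamp is rendered as ISO 8601 in UTC; the source
    only specifies "UTC time format".
    """
    severity = SEVERITY_BY_EVENT[event]
    timestamp = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"[{timestamp}] [{severity}] [{agent}] {description}"


if __name__ == "__main__":
    ts = datetime(2026, 2, 20, 12, 30, 0, tzinfo=timezone.utc)
    print(health_log_path(ts))
    # /logs/agents/health-2026-02-20.log
    print(format_alert(ts, "health_check_failure", "agent-name", "Health check failed"))
    # [2026-02-20T12:30:00Z] [WARNING] [agent-name] Health check failed
```

Keeping the event-to-severity mapping in one table makes it straightforward to filter a day's log by level, for example by searching for `[CRITICAL]`.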