# MEMORY.md - Long-term Memory

This file contains curated long-term memories and important context.

## Memory Management Strategy
- **MEMORY.md**: Curated long-term memories, important decisions, security templates, and key configurations
- **QMD System**: Automated memory backend with semantic search, auto-updates every 5 minutes
- **Usage**: Write significant learnings to MEMORY.md; rely on QMD for daily context and automation
- **Access**: MEMORY.md loaded only in main sessions (direct chats) for security

## QMD Configuration
- Backend: qmd
- Auto-update: every 5 minutes
- Include default memory: true
- Last verified: 2026-02-20

## Server Security Hardening Template (2026-02-20)

### Environment
- **Server**: Ubuntu 24.04 LTS VPS (KVM)
- **Panel**: 宝塔面板 (BT-Panel) on port 888
- **Public IP**: 204.12.203.203

### Security Configuration Applied
1. **Port Exposure Minimization**:
   - Only ports 80 (HTTP) and 443 (HTTPS) publicly accessible
   - SSH (port 22) restricted to internal network access only
   - OpenClaw gateway (port 18789) bound to localhost only
   - All other services (MySQL, custom apps) internal-only

2. **OpenClaw Secure Deployment**:
   - Gateway configured with `bind: "localhost"` instead of `"lan"`
   - Access exclusively through Nginx reverse proxy with HTTPS
   - Token-based authentication enabled
   - WebSocket support properly configured in Nginx

3. **Firewall Management**:
   - Use 宝塔面板 (BT-Panel) built-in firewall for port management
   - Alternative: system-level firewall (ufw/iptables) if no panel available
   - Regular external port scanning to verify exposure

4. **Critical Security Principles**:
   - Never expose sensitive services directly to public internet
   - Always use reverse proxy with TLS termination for web services
   - Implement defense in depth (firewall + service binding + authentication)
   - Regular security audits using `openclaw security audit --deep`
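
The external port check in item 3 can be sketched as a short script run from a machine outside the server's network. This is a minimal illustration, not part of OpenClaw or BT-Panel: the function names and the `EXPECTED_PUBLIC` set are assumptions made for the example, and a dedicated scanner will be more thorough.

```python
import socket

# Ports expected to be reachable from outside, per the template above.
EXPECTED_PUBLIC = {80, 443}


def open_ports(host, ports, timeout=1.0):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = set()
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising.
            if sock.connect_ex((host, port)) == 0:
                found.add(port)
    return found


def unexpected_exposure(host, ports, expected=EXPECTED_PUBLIC, timeout=1.0):
    """Ports that are open but not in the expected public set."""
    return open_ports(host, ports, timeout) - set(expected)
```

Calling `unexpected_exposure("server.example", [22, 80, 443, 888, 3306, 18789])` from an external host should return an empty set if the firewall and service bindings match this template; `server.example` is a placeholder hostname.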

### Migration Checklist for New Servers
- [ ] Install and configure 宝塔面板 or equivalent server management panel
- [ ] Set up Nginx reverse proxy with proper WebSocket support
- [ ] Configure OpenClaw with localhost binding only
- [ ] Restrict public ports to 80/443 only via firewall
- [ ] Enable automatic security updates
- [ ] Run initial security audit and document baseline
- [ ] Schedule periodic security audits via OpenClaw cron

### Lessons Learned
- Panel-based firewalls (宝塔/aaPanel) must be verified with external port scans
- Direct service exposure (like OpenClaw on 0.0.0.0) creates critical security risks
- Nginx reverse proxy configuration is essential for secure OpenClaw deployment
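
To make the binding lesson concrete, a config value can be checked before deployment. The sketch below is a hypothetical helper, not an OpenClaw API; it assumes the bind setting is either the keyword `localhost`, an IP address literal, or another keyword such as `lan`.

```python
import ipaddress


def is_loopback_bind(address):
    """True if `address` only accepts connections from the local machine."""
    if address.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        # Unknown keywords or hostnames are treated as not loopback, to be safe.
        return False
```

With this helper, `0.0.0.0` and `lan` are rejected, while `localhost`, `127.0.0.1`, and `::1` pass.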

## Agent Operations Logging Practice (2026-02-20)

### Log Directory Structure
- `/root/.openclaw/workspace/logs/operations/` - Manual operations and important changes
- `/root/.openclaw/workspace/logs/system/` - System-generated logs  
- `/root/.openclaw/workspace/logs/agents/` - Individual agent logs
- `/root/.openclaw/workspace/logs/security/` - Security operations and audits

### Automatic Logging Triggers
1. **Configuration Changes**: Any modification to config files (.json, .yaml, etc.)
2. **Security Modifications**: Firewall rules, authentication changes, port modifications
3. **Agent Lifecycle**: Deployment, updates, removal of agents
4. **System Optimizations**: Performance tuning, resource allocation changes
5. **Troubleshooting**: Error diagnosis and resolution procedures
6. **Memory Updates**: Significant changes to MEMORY.md or memory management

### Log Format Standard
- **Filename**: `YYYY-MM-DD-HH-MM-SS-description.log`
- **Timestamp**: Always in UTC
- **Content**: `[TIMESTAMP] [OPERATION_TYPE] [AGENT/USER] Description with before/after state`
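
A minimal sketch of the format standard above. The helper names are hypothetical, and the ISO 8601 style used for `[TIMESTAMP]` is an assumption, since the standard only requires UTC.

```python
from datetime import datetime, timezone
import re


def log_filename(description, when=None):
    """Build `YYYY-MM-DD-HH-MM-SS-description.log` from a UTC timestamp."""
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # Collapse anything that is not a letter or digit into single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{when:%Y-%m-%d-%H-%M-%S}-{slug}.log"


def log_line(operation_type, actor, description, when=None):
    """Build `[TIMESTAMP] [OPERATION_TYPE] [AGENT/USER] Description`."""
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = when.strftime("%Y-%m-%dT%H:%M:%SZ")  # assumed timestamp style
    return f"[{stamp}] [{operation_type}] [{actor}] {description}"
```

Pass a timezone-aware `datetime` as `when`; a naive value would be interpreted as local time before conversion to UTC.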

### Implementation Guidelines
- Always log before making changes (capture current state)
- Include rollback instructions when applicable
- Redact sensitive information (passwords, tokens, private keys)
- Reference related MEMORY.md entries for context
- Use QMD for routine operational context, MEMORY.md for strategic decisions
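
The redaction guideline can be approximated with a small filter applied before a log entry is written. This is a sketch under stated assumptions: the key names and patterns are illustrative, not an exhaustive list, and should be extended to match the secrets actually present in the configs being logged.

```python
import re

# Hypothetical patterns: key names followed by "=" or ":" and a value.
_KEY_VALUE = re.compile(
    r"\b(password|passwd|token|secret|api[_-]?key)"
    r"([\"']?\s*[=:]\s*)"
    r"(\"[^\"]*\"|'[^']*'|\S+)",
    re.IGNORECASE,
)
_PRIVATE_KEY = re.compile(
    r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
    re.DOTALL,
)


def redact(text):
    """Replace secret values with a placeholder, keeping the key names visible."""
    text = _PRIVATE_KEY.sub("[REDACTED PRIVATE KEY]", text)
    return _KEY_VALUE.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
```

Keeping the key name in the output preserves the before/after context of a change while hiding the value itself.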

## Agent Health Monitoring & Alerting System (2026-02-20)

### Features Implemented
1. **Crash Detection**: Monitors uncaught exceptions and unhandled rejections
2. **Health Checks**: Periodic service health verification (every 30 seconds)
3. **Multi-Channel Notifications**: Telegram alerts for critical events
4. **Automatic Logging**: All alerts logged to `/logs/agents/health-YYYY-MM-DD.log`
5. **Extensible Design**: Easy to add new notification channels
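
The periodic health check in item 2 follows a simple loop. The sketch below illustrates that pattern in Python; it is not the code in `agent-monitor.js`, and `run_health_checks`, `on_failure`, `max_checks`, and the injectable `sleep` are names assumed for the example.

```python
import time


def run_health_checks(check, on_failure, interval=30.0, max_checks=None, sleep=time.sleep):
    """Call `check()` every `interval` seconds; report failures via `on_failure`.

    `check` returns a truthy value when healthy. An exception counts as a failure.
    `max_checks` bounds the loop (None means run forever).
    Returns the number of failed checks.
    """
    failures = 0
    done = 0
    while max_checks is None or done < max_checks:
        try:
            healthy = bool(check())
            detail = "health check returned a falsy value"
        except Exception as exc:  # a crashing check is itself a failure
            healthy = False
            detail = f"health check raised {exc!r}"
        if not healthy:
            failures += 1
            on_failure(detail)
        done += 1
        # Wait only if another check will follow.
        if max_checks is None or done < max_checks:
            sleep(interval)
    return failures
```

The `on_failure` callback is where a notification channel would plug in, which keeps the loop independent of any one channel such as Telegram.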

### Components Created
- **Skill**: `agent-monitor/SKILL.md` - Documentation and usage guide
- **Monitor Script**: `agent-monitor.js` - Core monitoring logic
- **Startup Script**: `start-agent-monitor.sh` - Easy deployment
- **Log Directory**: `/logs/agents/` - Dedicated logging location

### Alert Severity Levels
- **CRITICAL**: Process crashes, uncaught exceptions
- **ERROR**: Unhandled rejections, failed operations  
- **WARNING**: Health check failures, performance issues
- **INFO**: Service status updates, recovery notifications
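
One way to encode these levels is an ordered enum plus a lookup table. The event names and the default notification threshold below are assumptions for illustration; the monitor's real identifiers may differ.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFO = 10
    WARNING = 20
    ERROR = 30
    CRITICAL = 40


# Event names are hypothetical labels for the cases listed above.
EVENT_SEVERITY = {
    "process_crash": Severity.CRITICAL,
    "uncaught_exception": Severity.CRITICAL,
    "unhandled_rejection": Severity.ERROR,
    "failed_operation": Severity.ERROR,
    "health_check_failure": Severity.WARNING,
    "performance_issue": Severity.WARNING,
    "status_update": Severity.INFO,
    "recovery": Severity.INFO,
}


def should_notify(event, threshold=Severity.ERROR):
    """True if `event` is at or above `threshold`; unknown events count as ERROR."""
    return EVENT_SEVERITY.get(event, Severity.ERROR) >= threshold
```

Using an `IntEnum` makes the levels comparable, so a single threshold decides which events reach a notification channel and which are only logged.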

### Integration Points
- Automatically integrated with existing Telegram channel
- Compatible with OpenClaw's agent architecture
- Works alongside existing logging and memory systems
- Can monitor any Node.js-based agent process

### Usage Instructions
1. Source the startup script: `source /root/.openclaw/workspace/start-agent-monitor.sh`
2. Call `startAgentMonitor("agent-name", healthCheckFunction)` 
3. Monitor automatically sends alerts on errors/crashes
4. Check logs in `/logs/agents/` for detailed information