OpenClaw’s "infinite memory" is a double-edged sword. It remembers every conversation, which is great for context and terrible if an attacker injects something nasty that stays there forever. The core mitigation is vaccine memory—a 400-plus-line constitutional markdown file (openclaw_memory.md) that acts as a standing order the model must follow. This guide walks through installing it, customizing it, understanding how it works under the hood, and—equally important—where it still fails.
Why Vaccine Memory Exists (and Where It Fits in the Stack)
OpenClaw glues together a gateway (the web UI) and a daemon (the long-running process). The daemon streams user messages into the LLM with a prompt wrapper that stitches together:
- System prompt (claw_system.txt) – role and capabilities
- Vaccine memory – persistent security directive
- Short-term chat history
- Tool instructions
Because vaccine memory is loaded before any user input, it lets you override or ignore malicious instructions that appear later. Think of it as SELinux policy for prompts: not perfect, but a useful guard rail.
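That ordering can be sketched as a plain message-array build. The function and field names below are illustrative, not OpenClaw's actual internals; the point is only that the vaccine sits above everything user-controlled:

```javascript
// Illustrative sketch of the daemon's prompt assembly order.
// buildPrompt and its parameter names are hypothetical, not OpenClaw's API.
function buildPrompt({ system, vaccine, history, toolInstructions, userMessage }) {
  // The vaccine is placed above all user-controlled text, so later
  // injection attempts cannot redefine the standing order.
  return [
    { role: "system", content: system },            // claw_system.txt
    { role: "system", content: vaccine },           // openclaw_memory.md
    ...history,                                     // short-term chat history
    { role: "system", content: toolInstructions },  // tool instructions
    { role: "user", content: userMessage },
  ];
}
```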
Installing OpenClaw with Vaccine Memory Enabled
1. Prerequisites
- Node.js v22.0.0+
- npm or pnpm
- An OpenAI, Mistral, or local model endpoint
2. Install the CLI
npm install -g openclaw@latest
3. Scaffold a new agent
claw init my-secure-bot
cd my-secure-bot
This generates:
- openclaw.config.json
- openclaw_memory.md
- claw_system.txt
4. Verify vaccine memory is wired
Open openclaw.config.json and look for:
{
"prompt": {
"vaccineMemory": "./openclaw_memory.md"
}
}
If you remove or misspell this path, the daemon runs without the security directive, and no warning is printed. Add a lint step in CI to catch that.
5. Run locally
claw daemon --config openclaw.config.json
You should see a log line similar to:
[vaccine] Loaded 417 lines from openclaw_memory.md (size: 8.7 KB)
Anatomy of openclaw_memory.md
The file is long but follows a predictable structure.
- Preamble – identity, disallowed behaviors
- Allowed tools – browser, shell, Composio connectors
- Safe completion rules – must not reveal hidden prompts, keys, stack traces
- Data handling rules – GDPR, no PII logging, no unsolicited uploads
- Self-diagnostics – ask for human review when uncertain
By default, it is conservative. For example, the shell section says:
Never write to disk locations outside /tmp
Never call destructive commands such as rm -rf, shred, dd if=/dev/zero
If you run your agent in a read-only container, you can loosen these.
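As belt and braces, the same denylist can be enforced outside the model, so a jailbroken completion still cannot execute a destructive command. This pre-execution hook is a sketch of that idea, not an OpenClaw feature:

```javascript
// Hypothetical pre-execution guard mirroring the vaccine's shell rules.
// Not part of OpenClaw; shown as an example of defense in depth.
const DENYLIST = [/\brm\s+-rf\b/, /\bshred\b/, /\bdd\s+if=\/dev\/zero\b/];
const ALLOWED_WRITE_PREFIX = "/tmp/";

function shellCommandAllowed(cmd) {
  if (DENYLIST.some((re) => re.test(cmd))) return false;
  // Crude check: shell redirections must target /tmp.
  const redirect = cmd.match(/>\s*(\S+)/);
  if (redirect && !redirect[1].startsWith(ALLOWED_WRITE_PREFIX)) return false;
  return true;
}
```

Pattern matching on shell strings is easy to bypass (quoting, aliases, `$IFS` tricks), so treat this as a tripwire on top of the vaccine, not a sandbox.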
Customizing Vaccine Memory for Your Threat Model
The template keeps 90% of users safe enough, but edge cases abound. Below are common edits the community has reported:
Locking Down Domain-Specific APIs
### Finance Tooling
The agent may call the internal billing GraphQL API read-only endpoints. Mutations must be escalated to a human reviewer.
Rate-Limiting External Requests
If more than 5 HTTP requests per minute to the same host are required, pause and ask: "High request volume detected, continue?"
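The same rule can be checked in code at the point where outbound HTTP is issued. This sliding-window counter is a sketch; the names are ours:

```javascript
// Sliding-window rate limiter sketch for the "5 requests per minute
// per host" rule. Returns true when the agent should pause and ask.
const WINDOW_MS = 60_000;
const MAX_PER_WINDOW = 5;
const hits = new Map(); // host -> timestamps of recent requests

function shouldPauseForConfirmation(host, now = Date.now()) {
  // Keep only timestamps inside the current window, then record this one.
  const recent = (hits.get(host) || []).filter((t) => now - t < WINDOW_MS);
  recent.push(now);
  hits.set(host, recent);
  return recent.length > MAX_PER_WINDOW;
}
```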
Embedding Business Logic Checks
Several teams embed a shortened SOX/PCI policy section directly in the memory to prevent data exfiltration—faster than hooking in another service.
Version Control and Review
Place openclaw_memory.md under ./security/ and use a CODEOWNERS rule so at least two people review changes. Diff size alone is a good signal; any sudden mass deletions are suspicious.
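A minimal CODEOWNERS fragment to that effect might look like the following (the team handles are placeholders):

```
/security/openclaw_memory.md @org/security-team @org/platform-leads
```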
Preventing Prompt Injection and Memory Poisoning
1. Sandwich Technique (LLM-Level)
OpenClaw already adds the vaccine before user input (System ➜ Vaccine ➜ User). Keep it that way. Avoid dynamic insertion of user-supplied text above the vaccine.
2. Input Sanitization (Gateway-Level)
- Strip zero-width spaces that hide prompt keywords.
- Reject Markdown images with js: URIs—some models will fetch them.
- Apply a maxTokens cap; huge payloads can "overflow" the prefix.
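A gateway-side sanitizer covering those three checks might look like this sketch. The zero-width character list and the tokens-per-character heuristic are assumptions, not OpenClaw defaults:

```javascript
// Gateway-side input sanitizer sketch for the three checks above.
const ZERO_WIDTH = /[\u200B\u200C\u200D\uFEFF]/g;  // common invisible characters
const BAD_IMAGE = /!\[[^\]]*\]\(\s*js:/i;          // Markdown image with a js: URI
const MAX_TOKENS = 4000;                           // illustrative cap

function sanitizeInput(text) {
  const cleaned = text.replace(ZERO_WIDTH, "");
  if (BAD_IMAGE.test(cleaned)) throw new Error("rejected: js: image URI");
  // Crude token estimate: roughly 4 characters per token.
  if (cleaned.length / 4 > MAX_TOKENS) throw new Error("rejected: payload too large");
  return cleaned;
}
```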
3. Output Filtering (Daemon-Level)
Even with vaccine, a jailbreak might slip through. Pipe completions through a thin validator:
if (/openai\.com|sk-/.test(output)) throw new Error("Leaked key");
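A slightly fuller version of that validator is sketched below. The leak patterns are illustrative assumptions; adjust them to your providers and secret formats:

```javascript
// Output validator sketch: block completions that appear to leak
// secrets or the hidden prompt. Patterns are examples, not exhaustive.
const LEAK_PATTERNS = [
  /sk-[A-Za-z0-9]{20,}/,                   // OpenAI-style API key
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/,    // PEM private key material
  /openclaw_memory\.md/i,                  // model quoting its own vaccine file
];

function validateOutput(output) {
  const hit = LEAK_PATTERNS.find((re) => re.test(output));
  if (hit) throw new Error(`blocked completion: matched ${hit}`);
  return output;
}
```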
4. Rotating Memory Keys (Storage-Level)
If you persist memory to Redis or SQLite, use an encr_key_v2 and rotate weekly. Nothing stops an attacker from dumping raw storage if they pop your box.
Operational Playbook: Updating, Auditing, Rotating
Weekly
- npm audit – yes, still matters even in 2024.
- Run claw doctor --memory-lint (0.8.4+)
- Review the diff of conversations merged into long-term memory.
Monthly
- Rotate OpenAI keys, invalidate the old ones.
- Pull upstream vaccine updates: curl -O https://raw.githubusercontent.com/openclaw/openclaw/main/templates/openclaw_memory.md
- Merge carefully; upstream might loosen settings that you rely on.
Incident Response
- Flip the --memory read-only flag (0.9.0+) to freeze state.
- Dump recent interactions: claw memory export --since 24h
- Search for suspicious directives, credentials, or mass requests.
- Purge with: claw memory delete --id <uuid>
- Restart the daemon with a clean snapshot.
Known Limitations and Open Problems
- Model Ignorance – Some LLMs happily ignore earlier instructions if a later message is forceful enough. OpenAI GPT-4o behaves better than GPT-3.5, but nothing is guaranteed.
- Token Budget – Vaccine memory eats ~2.5 k tokens. On GPT-3.5-Turbo-16K you start burning context after a 20-minute chat. Solutions: summarization or shorter vaccine variants.
- Covert Channels – An attacker can encode data in whitespace. The memory cannot detect what it cannot see.
- Dynamic Tool Loading – OpenClaw can install Composio integrations at runtime. Unless vaccine memory forbids npm install or restricts require(), you have a supply-chain issue.
- Untrusted Frontends – If you expose the gateway publicly without auth, vaccine memory won’t save you from a DDoS that racks up token bills.
Quick Checklist: Secure Your Agent Today
- [ ] OpenClaw v0.9.2 or newer running on Node 22+
- [ ] openclaw_memory.md present and path correct
- [ ] Custom rules for your internal APIs and data policy
- [ ] Input sanitization middleware enabled
- [ ] Weekly memory lint + key rotation scheduled
- [ ] CODEOWNERS on /security/openclaw_memory.md
- [ ] Context summarization if using GPT-3.5
If you tick every box above, you are in the top decile of OpenClaw installations we’ve audited. It’s not bulletproof, but you’ve raised the cost of a persistent attack from "paste a jailbreak" to "breach multiple layers." Good trade.