OpenClaw’s "infinite memory" is a double-edged sword. It remembers every conversation, which is great for context and terrible if an attacker injects something nasty that stays there forever. The core mitigation is vaccine memory—a 400-plus-line constitutional markdown file (openclaw_memory.md) that acts as a standing order the model must follow. This guide walks through installing it, customizing it, understanding how it works under the hood, and—equally important—where it still fails.
Why Vaccine Memory Exists (and Where It Fits in the Stack)
OpenClaw glues together a gateway (the web UI) and a daemon (the long-running process). The daemon streams user messages into the LLM with a prompt wrapper that stitches together:
- System prompt (claw_system.txt) – role and capabilities
- Vaccine memory – persistent security directive
- Short-term chat history
- Tool instructions
Because vaccine memory is loaded before any user input, it lets you override or ignore malicious instructions that appear later. Think of it as SELinux policy for prompts: not perfect, but a useful guard rail.
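That ordering can be sketched as a plain message-array build. The function and field names below are illustrative, not OpenClaw's actual internals; the point is only that the vaccine sits above everything user-controlled:

```javascript
// Illustrative sketch of the daemon's prompt assembly order.
// buildPrompt and its parameter names are hypothetical, not OpenClaw's API.
function buildPrompt({ system, vaccine, history, toolInstructions, userMessage }) {
  // The vaccine is placed above all user-controlled text, so later
  // injection attempts cannot redefine the standing order.
  return [
    { role: "system", content: system },            // claw_system.txt
    { role: "system", content: vaccine },           // openclaw_memory.md
    ...history,                                     // short-term chat history
    { role: "system", content: toolInstructions },  // tool instructions
    { role: "user", content: userMessage },
  ];
}
```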
Installing OpenClaw with Vaccine Memory Enabled
1. Prerequisites
- Node.js v22.0.0+
- npm or pnpm
- An OpenAI, Mistral, or local model endpoint
2. Install the CLI
npm install -g openclaw@latest
3. Scaffold a new agent
claw init my-secure-bot
cd my-secure-bot
This generates:
- openclaw.config.json
- openclaw_memory.md
- claw_system.txt
4. Verify vaccine memory is wired
Open openclaw.config.json and look for:
{
"prompt": {
"vaccineMemory": "./openclaw_memory.md"
}
}
If you remove or misspell this path, the daemon runs without the security directive, and no warning is printed. Add a lint step in CI to catch that.
5. Run locally
claw daemon --config openclaw.config.json
You should see a log line similar to:
[vaccine] Loaded 417 lines from openclaw_memory.md (size: 8.7 KB)
Anatomy of openclaw_memory.md
The file is long but follows a predictable structure.
- Preamble – identity, disallowed behaviors
- Allowed tools – browser, shell, Composio connectors
- Safe completion rules – must not reveal hidden prompts, keys, stack traces
- Data handling rules – GDPR, no PII logging, no unsolicited uploads
- Self-diagnostics – ask for human review when uncertain
By default, it is conservative. For example, the shell section says:
Never write to disk locations outside /tmp
Never call destructive commands such as rm -rf, shred, dd if=/dev/zero
If you run your agent in a read-only container, you can loosen these.
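As belt and braces, the same denylist can be enforced outside the model, so a jailbroken completion still cannot execute a destructive command. This pre-execution hook is a sketch of that idea, not an OpenClaw feature:

```javascript
// Hypothetical pre-execution guard mirroring the vaccine's shell rules.
// Not part of OpenClaw; shown as an example of defense in depth.
const DENYLIST = [/\brm\s+-rf\b/, /\bshred\b/, /\bdd\s+if=\/dev\/zero\b/];
const ALLOWED_WRITE_PREFIX = "/tmp/";

function shellCommandAllowed(cmd) {
  if (DENYLIST.some((re) => re.test(cmd))) return false;
  // Crude check: shell redirections must target /tmp.
  const redirect = cmd.match(/>\s*(\S+)/);
  if (redirect && !redirect[1].startsWith(ALLOWED_WRITE_PREFIX)) return false;
  return true;
}
```

Pattern matching on shell strings is easy to bypass (quoting, aliases, `$IFS` tricks), so treat this as a tripwire on top of the vaccine, not a sandbox.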
Customizing Vaccine Memory for Your Threat Model
The template keeps 90% of users safe enough, but edge cases abound. Below are common edits the community has reported:
Locking Down Domain-Specific APIs
### Finance Tooling
The agent may call the internal billing GraphQL API read-only endpoints. Mutations must be escalated to a human reviewer.
Rate-Limiting External Requests
If more than 5 HTTP requests per minute to the same host are required, pause and ask: "High request volume detected, continue?"
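The same rule can be checked in code at the point where outbound HTTP is issued. This sliding-window counter is a sketch; the names are ours:

```javascript
// Sliding-window rate limiter sketch for the "5 requests per minute
// per host" rule. Returns true when the agent should pause and ask.
const WINDOW_MS = 60_000;
const MAX_PER_WINDOW = 5;
const hits = new Map(); // host -> timestamps of recent requests

function shouldPauseForConfirmation(host, now = Date.now()) {
  // Keep only timestamps inside the current window, then record this one.
  const recent = (hits.get(host) || []).filter((t) => now - t < WINDOW_MS);
  recent.push(now);
  hits.set(host, recent);
  return recent.length > MAX_PER_WINDOW;
}
```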
Embedding Business Logic Checks
Several teams embed a shortened SOX/PCI policy section directly in the memory to prevent data exfiltration—faster than hooking in another service.
Version Control and Review
Place openclaw_memory.md under ./security/ and use a CODEOWNERS rule so at least two people review changes. Diff size alone is a good signal; any sudden mass deletions are suspicious.
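A minimal CODEOWNERS fragment to that effect might look like the following (the team handles are placeholders):

```
/security/openclaw_memory.md @org/security-team @org/platform-leads
```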
Preventing Prompt Injection and Memory Poisoning
1. Sandwich Technique (LLM-Level)
OpenClaw already adds the vaccine before user input (System ➜ Vaccine ➜ User). Keep it that way. Avoid dynamic insertion of user-supplied text above the vaccine.
2. Input Sanitization (Gateway-Level)
- Strip zero-width spaces that hide prompt keywords.
- Reject Markdown images with js: URIs—some models will fetch them.
- Apply a maxTokens cap; huge payloads can "overflow" the prefix.
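A gateway-side sanitizer covering those three checks might look like this sketch. The zero-width character list and the tokens-per-character heuristic are assumptions, not OpenClaw defaults:

```javascript
// Gateway-side input sanitizer sketch for the three checks above.
const ZERO_WIDTH = /[\u200B\u200C\u200D\uFEFF]/g;  // common invisible characters
const BAD_IMAGE = /!\[[^\]]*\]\(\s*js:/i;          // Markdown image with a js: URI
const MAX_TOKENS = 4000;                           // illustrative cap

function sanitizeInput(text) {
  const cleaned = text.replace(ZERO_WIDTH, "");
  if (BAD_IMAGE.test(cleaned)) throw new Error("rejected: js: image URI");
  // Crude token estimate: roughly 4 characters per token.
  if (cleaned.length / 4 > MAX_TOKENS) throw new Error("rejected: payload too large");
  return cleaned;
}
```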
3. Output Filtering (Daemon-Level)
Even with vaccine, a jailbreak might slip through. Pipe completions through a thin validator:
if (/openai\.com|sk-/.test(output)) throw new Error("Leaked key");
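A slightly fuller version of that validator is sketched below. The leak patterns are illustrative assumptions; adjust them to your providers and secret formats:

```javascript
// Output validator sketch: block completions that appear to leak
// secrets or the hidden prompt. Patterns are examples, not exhaustive.
const LEAK_PATTERNS = [
  /sk-[A-Za-z0-9]{20,}/,                   // OpenAI-style API key
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/,    // PEM private key material
  /openclaw_memory\.md/i,                  // model quoting its own vaccine file
];

function validateOutput(output) {
  const hit = LEAK_PATTERNS.find((re) => re.test(output));
  if (hit) throw new Error(`blocked completion: matched ${hit}`);
  return output;
}
```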
4. Rotating Memory Keys (Storage-Level)
If you persist memory to Redis or SQLite, use an encr_key_v2 and rotate weekly. Nothing stops an attacker from dumping raw storage if they pop your box.
Operational Playbook: Updating, Auditing, Rotating
Weekly
- npm audit – yes, still matters even in 2024.
- Run claw doctor --memory-lint (0.8.4+)
- Review the diff of conversations merged into long-term memory.
Monthly
- Rotate OpenAI keys, invalidate the old ones.
- Pull upstream vaccine updates: curl -O https://raw.githubusercontent.com/openclaw/openclaw/main/templates/openclaw_memory.md
- Merge carefully; upstream might loosen settings that you rely on.
Incident Response
- Flip the --memory read-only flag (0.9.0+) to freeze state.
- Dump recent interactions: claw memory export --since 24h
- Search for suspicious directives, credentials, or mass requests.
- Purge with: claw memory delete --id <uuid>
- Restart the daemon with a clean snapshot.
Known Limitations and Open Problems
- Model Ignorance – Some LLMs happily ignore earlier instructions if a later message is forceful enough. OpenAI GPT-4o behaves better than GPT-3.5, but nothing is guaranteed.
- Token Budget – Vaccine memory eats ~2.5 k tokens. On GPT-3.5-Turbo-16K you start burning context after a 20-minute chat. Solutions: summarization or shorter vaccine variants.
- Covert Channels – An attacker can encode data in whitespace. The memory cannot detect what it cannot see.
- Dynamic Tool Loading – OpenClaw can install Composio integrations at runtime. Unless vaccine memory forbids npm install or restricts require(), you have a supply-chain issue.
- Untrusted Frontends – If you expose the gateway publicly without auth, vaccine memory won’t save you from a DDoS that racks up token bills.
Quick Checklist: Secure Your Agent Today
- [ ] OpenClaw v0.9.2 or newer running on Node 22+
- [ ] openclaw_memory.md present and path correct
- [ ] Custom rules for your internal APIs and data policy
- [ ] Input sanitization middleware enabled
- [ ] Weekly memory lint + key rotation scheduled
- [ ] CODEOWNERS on /security/openclaw_memory.md
- [ ] Context summarization if using GPT-3.5
If you tick every box above, you are in the top decile of OpenClaw installations we’ve audited. It’s not bulletproof, but you’ve raised the cost of a persistent attack from "paste a jailbreak" to "breach multiple layers." Good trade.