You already know how weird it feels when a generic AI writes “Hey fam!” on LinkedIn. The fix is simple: teach the model how you actually write. This guide shows exactly how I got OpenClaw (v0.43.2) to learn my style from 2,317 historical posts and start spitting out drafts that my own friends can’t spot. We’ll cover data export, persona config, iterative refinement, and a fully automated pipeline that pushes platform-specific variants to LinkedIn, X/Twitter, and Instagram.
Why style learning in OpenClaw beats manual prompting
You could paste a few examples into a prompt every time, but:
- Context windows run out fast. My corpus alone runs to megabytes.
- Manual copy/paste destroys flow when you’re batching weekly content.
- Consistency dies once the prompt hits Slack, Trello, and three co-workers.
OpenClaw’s vector memory + persona system solves all three. You ingest once, then reference it with a single {{style_memory}} tag. The agent fetches relevant snippets automatically.
Exporting your historical posts (the only painful step)
LinkedIn export
LinkedIn lets you request a JSON download:
- Settings > Data privacy > Get a copy of your data.
- Select Posts and Articles. LinkedIn emails you a ZIP in ~10 minutes.
The JSON arrives as one giant array. I wrote a short Node script to flatten it:
#!/usr/bin/env node
import fs from 'node:fs/promises';
const raw = JSON.parse(await fs.readFile('LinkedIn_Posts.json', 'utf8'));
const lines = raw.map(p => JSON.stringify({
text: p.text.body,
platform: 'linkedin',
timestamp: p.createdAt
}));
await fs.writeFile('linkedin.jsonl', lines.join('\n'));
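Before ingesting, it's worth sanity-checking the JSONL — a quick validator sketch (validateJsonl is my own helper, not part of OpenClaw):

```javascript
// Sanity-check a JSONL string: every line must parse as JSON and carry
// the fields the ingest step expects.
function validateJsonl(jsonl, requiredKeys = ['text', 'platform', 'timestamp']) {
  const errors = [];
  jsonl.split('\n').filter(Boolean).forEach((line, i) => {
    try {
      const obj = JSON.parse(line);
      for (const key of requiredKeys) {
        if (!(key in obj)) errors.push(`line ${i + 1}: missing "${key}"`);
      }
    } catch {
      errors.push(`line ${i + 1}: invalid JSON`);
    }
  });
  return errors;
}

// Example: one good record, one missing its timestamp
const sample = [
  JSON.stringify({ text: 'hello', platform: 'linkedin', timestamp: 1 }),
  JSON.stringify({ text: 'oops', platform: 'linkedin' }),
].join('\n');
console.log(validateJsonl(sample)); // → [ 'line 2: missing "timestamp"' ]
```

Running it over each export before concatenating catches malformed lines early, while you still know which platform they came from.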
X / Twitter export
Twitter’s archive is a tarball full of HTML. I used the community script referenced in issue #892:
npx @clawcloud/tw2json my-archive.tar.gz --out twitter.jsonl
Instagram export
Meta packs everything into a monstrous messages.json. I ignored DMs and parsed only media_posts:
jq -c '.media_posts[] | {text: .caption, platform: "instagram", timestamp: .creation_time}' messages.json > instagram.jsonl
At this point I had three .jsonl files totaling 5.3 MB.
Consolidate and clean the corpus
I like small files. Also, typos teach the model to repeat typos. Quick cleaning:
cat *.jsonl > posts.raw.jsonl
# remove URLs and trailing whitespace with sed, then strip emoji with perl
# (GNU sed has no \x{...} Unicode escapes)
sed -E 's_https?://[^ ]+__g; s/ +$//' posts.raw.jsonl \
  | perl -CSD -pe 's/[\x{1F300}-\x{1FAFF}\x{2600}-\x{27BF}]//g' > posts.clean.jsonl
The final file ended up with 2,317 lines. That’s still plenty for style capture.
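If you'd rather stay in Node, the same cleanup is a few lines — a sketch, assuming these emoji ranges cover the common blocks in your posts:

```javascript
// Strip URLs, emoji, and trailing whitespace from one post's text.
// The emoji ranges below are an approximation; extend as needed.
function cleanText(text) {
  return text
    .replace(/https?:\/\/\S+/g, '')                          // URLs
    .replace(/[\u{1F300}-\u{1FAFF}\u{2600}-\u{27BF}]/gu, '') // emoji
    .replace(/ +$/gm, '');                                   // trailing spaces
}

console.log(cleanText('Ship it 🚀 https://example.com ')); // → 'Ship it'
```

Map it over each parsed JSONL record's text field before writing posts.clean.jsonl.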
Ingesting the corpus into OpenClaw memory
Create a dedicated vector namespace
OpenClaw supports Pinecone, Chroma, and local SQLite. I’m cheap, so SQLite:
# .claw/config.yaml
vectorStore:
  provider: "sqlite"
  path: "~/.claw/vector.db"
Batch import
The CLI’s ingest subcommand landed in v0.41.0. Usage:
npx openclaw ingest \
--file posts.clean.jsonl \
--namespace my-style \
--textKey text \
--metadataKeys platform,timestamp
On my M2 Air it processed ~180 records/sec. Done in 13 seconds.
Setting up the persona
The persona governs system and user prompts. Mine is ~/.claw/personas/drpete.yaml:
name: "Pete Social Writer"
role: "Writes social media posts that sound like Dr. Pete."
styleMemoryNamespace: "my-style"
voiceGuidelines: |
  - Avoid exclamation points.
  - Use parenthetical asides sparingly.
  - Prefer concise sentences (< 20 words).
  - Start LinkedIn posts with a hook line, end with a question.
  - On X: 260 characters max, no hashtags unless ironic.
  - On Instagram: one emoji allowed, at the end.
Notice the last three platform-specific bullets. We’ll select them later.
Testing the style with an interactive chat
npx openclaw chat --persona drpete
Prompt: “Write a LinkedIn post announcing my team’s move to four-day weeks.”
First draft looked eerily like me—short sentences, a bracketed thought mid-line, no exclamation points. Good sign.
I iterate inside chat:
User> Tone is a bit stiff. Loosen first sentence.
The memory kicks in, retrieves “Big news: I’m off caffeine.” from last year, and uses that cadence. After three tweaks I hit /save. Claw stores the final draft as a memory.draft item so the agent reuses the phrasing style later.
Version-controlled content pipeline
Repo layout
content/
  weekly-2024-05-13/
    topic.md
    linkedin.out.md
    x.out.md
    insta.out.md
persona/
  drpete.yaml
.claw/
  config.yaml
topic.md holds raw notes. A 50-line GitHub Action (abridged below) converts it into three platform drafts nightly.
GitHub Action
# .github/workflows/social.yml
name: social-drafts
on:
  schedule:
    - cron: '0 2 * * 1'
  workflow_dispatch:
jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
      - run: npm ci
      - name: Run agent  # calls the persona
        run: |
          WEEK="content/weekly-$(date +%F)"
          npx openclaw run \
            --persona persona/drpete.yaml \
            --input "$(cat "$WEEK/topic.md")" \
            --outputDir "$WEEK"
Prompt template
Stored at prompts/social.hbs:
{{#each platforms}}
Write a {{this}} post using the user's style memory.
Follow voice guidelines strictly.
Content:
{{{topic}}}
{{/each}}
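To make the expansion concrete, here's what that loop produces, mimicked in plain JS (a stand-in for the Handlebars rendering, not OpenClaw's internal code):

```javascript
// Expand the prompt template once per platform, the way the
// {{#each platforms}} block does.
function renderPrompts(platforms, topic) {
  return platforms.map(platform =>
    [
      `Write a ${platform} post using the user's style memory.`,
      `Follow voice guidelines strictly.`,
      `Content:`,
      topic,
    ].join('\n')
  );
}

const prompts = renderPrompts(['LinkedIn', 'X'], 'We moved to 4-day weeks.');
console.log(prompts.length);                                // → 2
console.log(prompts[0].startsWith('Write a LinkedIn post')); // → true
```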
Tone adaptation per platform
The persona knows the guidelines, but I still need per-platform variables at runtime. The run subcommand accepts JSON:
npx openclaw run \
--persona persona/drpete.yaml \
--vars '{"platforms":["LinkedIn","X","Instagram"]}'
Behind the curtain, the template loops and the agent writes three drafts in one shot. Length checks use a custom {{#if (gt (length output) 260)}} helper pair (gt and length are helpers I register; they are not Handlebars built-ins) to auto-trim X posts.
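The trim itself is simple enough to sketch in plain JS (trimForX is my own helper name, not an OpenClaw built-in):

```javascript
// Trim a draft to X's 260-character budget at a word boundary,
// appending an ellipsis when anything was cut.
function trimForX(text, limit = 260) {
  if (text.length <= limit) return text;
  const cut = text.slice(0, limit - 1);          // leave room for the ellipsis
  const lastSpace = cut.lastIndexOf(' ');
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + '…';
}

console.log(trimForX('short post'));                 // → 'short post'
console.log(trimForX('word '.repeat(80)).length <= 260); // → true
```

Cutting at a word boundary matters: a mid-word truncation is the fastest way to make a post read machine-generated.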
Scheduling posts directly from OpenClaw
Composio integration (v0.27) gives us 800+ tools. I wired up:
- linkedin.publishPost
- twitter.createTweet
- instagram.scheduleMedia
Example task in scheduler.yaml:
- when: "0 14 * * 2" # Tuesdays at 14:00 UTC
  run:
    persona: "drpete"
    template: "prompts/social.hbs"
    vars:
      platforms: ["LinkedIn"]
    tools:
      - "linkedin.publishPost"
Tested in staging first—no embarrassing midnight blasts.
Quality guardrails: no cringe allowed
I don’t trust any model unsupervised. Two tricks help:
- Sentiment filter. Pass drafts through openai-moderation; block if any category score > 0.8.
- Temperature control. Drafts use temperature=0.7; scheduled posts re-generate at 0.3 to reduce variance.
Fail case counts (# blocked posts) get logged to Datadog. Last month the angry score tripped twice—both times me ranting about ticketing systems.
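The gate itself is just a threshold over category scores. A minimal sketch, with the moderation response stubbed out (gateDraft is my own name, and the scores shape is an assumption about the API response):

```javascript
// Block any draft whose worst moderation category score exceeds
// the threshold. `scores` would come from the moderation API.
function gateDraft(scores, threshold = 0.8) {
  const worst = Math.max(...Object.values(scores));
  return { allowed: worst <= threshold, worst };
}

// Stubbed moderation results for demonstration
console.log(gateDraft({ anger: 0.92, spam: 0.05 }).allowed); // → false
console.log(gateDraft({ anger: 0.10, spam: 0.05 }).allowed); // → true
```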
Feedback loop: letting your audience fine-tune the style
Not everything can be pre-trained. I added a Zapier zap: a “👍” reaction on a post triggers a webhook that stores that post as memory.positive. The next draft call weighs those memories higher via embedding_score * 1.2. Result: phrases with good engagement bubble up.
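The weighting step amounts to a re-rank over retrieved memories. A sketch (rerank and the positive flag are my own names; the 1.2 multiplier is the one from the zap):

```javascript
// Boost memories the audience reacted well to: multiply similarity
// score by 1.2 when the memory is tagged positive, then re-sort.
function rerank(memories) {
  return memories
    .map(m => ({ ...m, score: m.positive ? m.score * 1.2 : m.score }))
    .sort((a, b) => b.score - a.score);
}

const ranked = rerank([
  { text: 'A', score: 0.80, positive: false },
  { text: 'B', score: 0.70, positive: true }, // 0.70 * 1.2 ≈ 0.84
]);
console.log(ranked[0].text); // → 'B'
```

A multiplicative boost keeps relative ordering among positives intact, unlike a flat bonus that can swamp the similarity signal.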
What this setup still gets wrong
A few trade-offs worth mentioning:
- Humor misses. Sarcasm embeddings are still weak; 4-5% of drafts sound deadpan.
- Image selection. Instagram requires an image. I still pick one manually.
- Latency. Local SQLite vector store slows to ~900 ms retrieval at 10k memories; Pinecone is faster but costs $.
- No multi-lingual yet. My Spanish posts confuse the persona; I keep a separate namespace.
Next steps: your five-minute starter checklist
- Export last year of posts from LinkedIn, X, Instagram.
- Clean + combine into posts.clean.jsonl.
- Run npx openclaw ingest --file posts.clean.jsonl --namespace my-style.
- Create persona.yaml with styleMemoryNamespace: my-style and voice guidelines.
- Test in npx openclaw chat --persona MY_PERSONA.
Once the drafts stop sounding like a corporate intern, wire up the scheduler and forget about Sunday night content panic. Your future self will thank you.