I burned the first week of April 2023 manually unzipping email attachments, renaming PDFs, and filling yet another "2023-Business-Expenses.xlsx". Never again. This season I pushed the entire workflow into OpenClaw and spent the saved time finding more deductible subscriptions. This post is exactly what I would have killed for twelve months ago: concrete setup notes, categorization logic, and the bits that glue OpenClaw to QuickBooks and my accountant’s inbox.
Why bother automating tax prep with OpenClaw?
Every spring looks the same: banks drip 1099-INTs, Stripe spits out monthly statements, AWS invoices pile up. The data is digital; the pain is shuffling it. OpenClaw already sits between my chat apps and half the internet. Turning it loose on tax paperwork took fewer lines of code than the number of times I clicked "Download PDF" last year.
- Costs: OpenClaw is MIT-licensed. Running on ClawCloud’s free tier kept infra at zero.
- Reach: Composio adds 800+ connectors (Gmail, Outlook, Dropbox, Notion, QuickBooks, Xero, Google Sheets).
- Extensibility: Node 22+ lets me write categorization logic in plain JavaScript, hot-reloaded into the agent.
- Memory: Built-in vector store remembers file hashes, preventing double-imports.
I’m not pitching "set it and forget it" magic. You will still tweak regexes and occasionally override a GL code. But once the plumbing is live, 90% of the grunt work is gone.
High-level architecture
Everything runs inside one named agent (tax-bot-24) on ClawCloud.
- Gateway — web UI for logs, prompt tweaks, manual triggers.
- Daemon — keeps the worker alive, schedules nightly fetch jobs.
- Composio connectors — Gmail, Dropbox, QuickBooks.
- Code module — tax.js with categorization and summary generation.
- Persistent memory — stores processed file IDs and category totals.
The only external runtime piece is a QuickBooks sandbox app for API push. If you still mail a shoebox of receipts, you can swap that step for plain CSV export.
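The double-import guard mentioned earlier comes down to hashing each file and checking the hash before processing. A minimal sketch, assuming an in-memory Set as a stand-in for OpenClaw's persistent memory (the names `fileHash`, `seen`, and `shouldProcess` are hypothetical, not OpenClaw APIs):

```javascript
const crypto = require("node:crypto");

// Stand-in for persistent memory; a real agent would store hashes durably.
const seen = new Set();

// SHA-256 of the raw bytes, hex-encoded.
function fileHash(buffer) {
  return crypto.createHash("sha256").update(buffer).digest("hex");
}

// Returns true the first time a given content is seen, false afterwards.
function shouldProcess(buffer) {
  const hash = fileHash(buffer);
  if (seen.has(hash)) return false;
  seen.add(hash);
  return true;
}
```

Hashing the content rather than the file name means a re-sent invoice with a new name is still skipped.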
1. Connecting OpenClaw to your document sources
Gmail & "tax-doc" label
I centralized incoming docs in Gmail with a server-side filter:
Matches: (subject:(invoice OR statement OR 1099) OR from:(amazonaws.com))
Action: Apply label "tax-doc" • Skip Inbox
Then wired the label to OpenClaw via Composio:
// ClawCloud web UI → Integrations → Gmail → Add
GMAIL_LABEL="tax-doc"
By default Composio hands over message IDs, body, and attachments. I limited the OAuth scope to read-only (https://www.googleapis.com/auth/gmail.readonly rather than the full-access https://mail.google.com/) and let OpenClaw request upgrades if needed.
Dropbox & "Receipts" folder
Field expenses scanned with mobile Dropbox Scan land in /Apps/Receipts/2024. The watchFolder helper (shipped with OpenClaw v0.18.2) polls the folder every 15 minutes and emits a file:new event for each new file.
// .openclawrc.json
{
"watchers": [
{
"type": "dropbox-folder",
"path": "/Apps/Receipts/2024",
"interval": 900
}
]
}
That single JSON block replaced the classic cron+Dropbox CLI I hacked last year.
Bank and card portals
Most banks still don't offer sane APIs, so I fell back to browser automation:
// tax.js
exports.fetchChaseStatements = async ({ browser }) => {
const page = await browser.newPage();
await page.goto("https://secure08a.chase.com/web/auth/#/logon/login-index");
await login(page); // 2FA handled via authy Push API
await downloadLatestPDF(page, "/Downloads/Chase/");
};
Schedule weekly; PDF lands in the same Dropbox folder and triggers the normal pipeline.
2. Categorization logic: from file to GL code
This is the part you will tweak the most. My goal: map every transaction or document into one of the 23 expense categories my accountant defined last decade. OpenClaw ships with GPT-4o support and can also run a local Llama-3 70B if you bring your own GPU. In my experience, large models hallucinate less when deterministic heuristics run first and the model only sees what the rules miss.
Step A — deterministic rules
// tax.js
const rules = [
{ match: /amazon web services/i, gl: "6520 – Hosting" },
{ match: /github\.com/i, gl: "6525 – Dev Tools" },
{ match: /uber|lyft/i, gl: "6510 – Travel" },
{ match: /apple\.com.*developer/i,gl: "6525 – Dev Tools" }
];
function ruleBasedCategorize(text) {
const r = rules.find(r => r.match.test(text));
return r ? r.gl : null;
}
Match against file name, email subject, and first 500 bytes of OCR text.
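One way to assemble that match input, as a sketch: concatenate the three fields and cap the OCR portion. The `buildMatchText` helper and its field names are assumptions for illustration. Note that `slice` counts UTF-16 code units rather than bytes, which is close enough for ASCII-heavy invoices.

```javascript
// Hypothetical helper: joins file name, subject, and the head of the OCR text
// into one string that the rule regexes can be tested against.
function buildMatchText({ fileName = "", subject = "", ocrText = "" }, limit = 500) {
  return [fileName, subject, ocrText.slice(0, limit)]
    .filter(Boolean) // drop empty fields so there are no stray separators
    .join(" ");
}
```

The result can be passed straight to ruleBasedCategorize. Joining with a space rather than a newline keeps patterns that use `.*` working across field boundaries.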
Step B — LLM fallback
async function llmCategorize(text) {
const prompt = `You are a tax accounting assistant. Map the following transaction to one of these GL codes: ${glListString}. Text: "${text}".`;
const res = await openclaw.ai.generate({
model: "gpt-4o",
prompt,
max_tokens: 16,
temperature: 0
});
return extractGL(res);
}
Temperature 0 is critical; otherwise you get "Training Data" jokes instead of ledger codes.
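The snippet above relies on two names it does not define (glListString and extractGL). A minimal sketch of what they could look like, assuming the reply text has already been pulled out of the response object and using a shortened, hypothetical GL list (the real one has 23 entries; the codes here are the ones that appear in this post):

```javascript
// Hypothetical subset of the chart of accounts.
const glList = [
  "6510 – Travel",
  "6520 – Hosting",
  "6525 – Dev Tools",
];
const glListString = glList.join("; ");

// Pull the first standalone 4-digit code out of the reply and accept it only
// if it belongs to the known list; anything else returns null so the caller
// falls through to "Uncategorized".
function extractGL(reply) {
  const m = /\b(\d{4})\b/.exec(String(reply));
  if (!m) return null;
  return glList.find((gl) => gl.startsWith(m[1])) ?? null;
}
```

Validating against the list matters more than the parsing: a model that invents a plausible-looking code gets the same treatment as one that returns prose.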
Putting it together
exports.categorize = async (doc) => {
const text = await doc.extractText(); // OCR images, strip whitespace
let gl = ruleBasedCategorize(text);
if (!gl) gl = await llmCategorize(text);
if (!gl) gl = "6999 – Uncategorized";
await doc.addTag(gl);
await memory.upsert({ year: 2024, gl, amount: doc.amount });
};
The memory.upsert helper is part of v0.19.0 and backs everything with SQLite. You could swap in Postgres if you already run one.
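To make the bookkeeping concrete, here is a sketch of what an upsert that maintains per-category totals might do, using a plain Map instead of SQLite. This illustrates the idea only and is not OpenClaw's implementation; `createTotals` and its methods are hypothetical names. Amounts are kept in integer cents to avoid floating-point drift.

```javascript
function createTotals() {
  const totals = new Map(); // key: "year|gl" -> running total in cents

  return {
    // Adds an amount (in dollars) to the running total for year + GL code.
    upsert({ year, gl, amount }) {
      const key = `${year}|${gl}`;
      const cents = Math.round(amount * 100);
      totals.set(key, (totals.get(key) ?? 0) + cents);
    },
    // Returns the running total in dollars, or 0 if nothing was recorded.
    total(year, gl) {
      return (totals.get(`${year}|${gl}`) ?? 0) / 100;
    },
  };
}
```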
3. Tracking deductions & receipt metadata
Beyond category, my accountant loves metadata: vendor, payment method, deductible-yes-no, origin (email, scan, scrape). I store these as JSON in OpenClaw memory.
await doc.setMeta({
vendor: guessVendor(text),
payment: detectCard(text),
isDeductible: checkDeductible(gl),
source: doc.origin
});
Conditional rules for home office vs. general admin run inside checkDeductible. Edge cases (mixed personal/business Amazon orders) still need human judgment; the system flags anything with multiple SKUs as needs-review.
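A sketch of how those two checks could be structured. The non-deductible set and the `skuCount` field are placeholders made up for illustration; which codes actually count as deductible is a question for your accountant, not for this snippet.

```javascript
// Hypothetical: GL codes treated as never deductible.
const NON_DEDUCTIBLE = new Set(["6999 – Uncategorized"]);

// Deductible unless the code is on the exclusion list.
function checkDeductible(gl) {
  return !NON_DEDUCTIBLE.has(gl);
}

// Multi-SKU receipts may mix personal and business items, so they go to a human.
function needsReview(doc) {
  return (doc.skuCount ?? 0) > 1;
}
```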
Interactive Slack channel
I added a private #tax-review Slack channel. OpenClaw posts line items with needs-review and waits for a 👍 or 👎 reaction. The message format:
{
"vendor": "Amazon",
"amount": "$47.99",
"preview": "USB-C Hub 8-in-1",
"href": "https://claw.cloud/d/abcd.pdf"
}
Reaction events update the isDeductible flag immediately. No more spreadsheet commenting.
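The reaction handling reduces to a small mapping. A sketch, assuming the event carries Slack's emoji name in a `reaction` field (I am assuming 👍 arrives as `+1` or `thumbsup` and 👎 as `-1` or `thumbsdown`, optionally followed by a skin-tone suffix); `applyReaction` and the doc shape are hypothetical:

```javascript
const APPROVE = new Set(["+1", "thumbsup"]);
const REJECT = new Set(["-1", "thumbsdown"]);

// Returns a copy of doc with isDeductible updated, or the doc unchanged
// when the reaction is not a thumbs up/down.
function applyReaction(doc, event) {
  const name = String(event.reaction).split("::")[0]; // drop skin-tone suffix
  if (APPROVE.has(name)) return { ...doc, isDeductible: true, needsReview: false };
  if (REJECT.has(name)) return { ...doc, isDeductible: false, needsReview: false };
  return doc;
}
```

Returning the original object for unrelated emoji keeps a stray 🎉 from silently clearing the review flag.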
4. Pushing data to accounting software
I use QuickBooks Online Simple Start (yes, the lowest tier). Composio exposes it via OAuth 2. The flow:
- Nightly Lambda in OpenClaw generates a JSON of new/updated docs.
- Transforms line items into Expense objects.
- Calls the /v3/company/:id/expense endpoint.
// tax.js
exports.syncQuickBooks = async () => {
const docs = await memory.getUnpushed();
for (const d of docs) {
await quickbooks.createExpense({
AccountRef: { value: accountMap[d.payment] },
Line: [{
Description: d.vendor,
Amount: d.amount,
DetailType: "AccountBasedExpenseLineDetail",
AccountBasedExpenseLineDetail: {
TaxCodeRef: { value: d.isDeductible ? "TAX" : "NON" },
ExpenseAccountRef: { value: glToQB(d.gl) }
}
}]
});
await memory.markPushed(d.id);
}
};
If you’re on Xero, swap quickbooks for xero. The wrapper APIs are 90% identical.
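One caveat with the sync loop above: a single failed createExpense call aborts the whole batch. A possible variant, sketched with injected `client` and `store` objects so it runs without QuickBooks (both are stand-ins, and `pushAll` is a hypothetical name), records failures and keeps going:

```javascript
// Pushes each doc independently; a failure is recorded and the loop continues.
// `toExpense` converts a stored doc into the payload the client expects.
async function pushAll(docs, client, store, toExpense) {
  const failed = [];
  for (const d of docs) {
    try {
      await client.createExpense(toExpense(d));
      await store.markPushed(d.id); // only after a successful create
    } catch (err) {
      failed.push({ id: d.id, error: err.message });
    }
  }
  return failed;
}
```

Docs that fail stay unpushed, so the next nightly run picks them up again without any extra bookkeeping.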
Generating the accountant packet
On February 1st I run:
openclaw agent run tax-bot-24:generate-packet
The handler:
exports.generatePacket = async () => {
const year = 2024;
const csv = await memory.toCSV({ year });
const zip = await archiveDocs({ year });
await gmail.send({
to: "accountant@example.com",
subject: `${year} Financial Packet – ${csv.total} lines`,
attachments: [csv.path, zip.path]
});
};
Generated packet is a 150 KB CSV (one row per doc) plus a 40 MB ZIP of PDFs, JPGs, and HTML statements. That’s it. The accountant never logs into my systems, which keeps SOC 2 people happier.
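For readers without the memory.toCSV helper, the CSV half of the packet is easy to reproduce. A sketch with RFC 4180-style quoting; the column names are taken from the metadata fields described earlier, and `toCSV` here is a standalone function, not the OpenClaw helper:

```javascript
const COLUMNS = ["vendor", "amount", "gl", "payment", "isDeductible", "source"];

// Quote a field if it contains a comma, double quote, or line break.
function csvField(value) {
  const s = value == null ? "" : String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// One header row plus one row per document, CRLF-terminated.
function toCSV(docs) {
  const rows = [COLUMNS, ...docs.map((d) => COLUMNS.map((c) => d[c]))];
  return rows.map((r) => r.map(csvField).join(",")).join("\r\n") + "\r\n";
}
```

Vendor names are the usual troublemakers ("Acme, Inc."), which is why the quoting is worth getting right before the accountant opens the file in Excel.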
5. Scheduling, alerts & year-over-year hygiene
Nightly job definition
// .openclawrc.json (excerpt)
{
"cron": [
{
"expr": "0 2 * * *", // 02:00 UTC nightly
"task": "tax-bot-24:collect"
},
{
"expr": "15 2 * * *",
"task": "tax-bot-24:categorize"
},
{
"expr": "45 2 * * *",
"task": "tax-bot-24:syncQuickBooks"
}
]
}
If any step throws, the built-in notifier pushes to my personal Telegram. Error volume stayed low once I cached 2FA cookies properly.
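The failure path can be sketched as a wrapper that runs a task and hands any error to a notifier. Here `notify` is a plain callback standing in for the built-in Telegram notifier, and `runStep` is a hypothetical name:

```javascript
// Runs one task; on failure, reports the task name and message, then rethrows
// so the scheduler still sees the run as failed.
async function runStep(name, task, notify) {
  try {
    return await task();
  } catch (err) {
    await notify(`${name} failed: ${err.message}`);
    throw err;
  }
}
```

Rethrowing after the notification is deliberate: swallowing the error would let the later cron entries run against incomplete data.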
Rollover script
January 1st:
openclaw agent run tax-bot-24:new-year 2025
This creates a new Dropbox folder, Gmail label, and memory partition, and updates the cron expressions. I fought for an hour to avoid hard-coding the year everywhere; wrap it in process.env.TAX_YEAR early.
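A sketch of that pattern: resolve the year once, validate it, and derive everything else from it. The derived names mirror the ones used in this post; `taxConfig` itself is a hypothetical helper.

```javascript
// Reads TAX_YEAR from the environment (or an injected object) and derives
// the year-specific names so nothing else hard-codes the year.
function taxConfig(env = process.env) {
  const year = Number.parseInt(env.TAX_YEAR, 10);
  if (!Number.isInteger(year) || year < 2000 || year > 2100) {
    throw new Error(`TAX_YEAR must be a four-digit year, got "${env.TAX_YEAR}"`);
  }
  return {
    year,
    dropboxPath: `/Apps/Receipts/${year}`,
    agent: `tax-bot-${String(year).slice(-2)}`,
  };
}
```

Failing loudly on a missing variable beats discovering in March that receipts were filed under `/Apps/Receipts/NaN`.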
Security & privacy trade-offs
- OAuth scopes — grant read-only where possible. QuickBooks obviously needs write.
- Encrypted memory — ClawCloud's at-rest encryption is AES-256, but I opted to PGP-encrypt the ZIP packet prior to email because Gmail search still indexes attachment text.
- 2FA workaround — Browser automation for banks stores session cookies in the agent vault; I rotate them every seven days by re-authenticating with an SMS code. Ugly, but still better than me clicking things.
I would be fired from my own side hustle if data ever leaked. So far the threat model feels acceptable.
Performance notes
Full 2023 set: 1,873 documents, 2.6 GB. Nightly run time on ClawCloud shared CPU instance:
- Collect: 42 s (Gmail bulk API dominates)
- OCR: 5 m 11 s (Tesseract, two concurrent workers)
- LLM categorize: 1 m 02 s (remote GPT-4o)
- QuickBooks push: 16 s
- Total: ~7 min
LLM latency shrinks if you host a local model on a GPU box, but I’m happy paying OpenAI $0.20 per run rather than buying more hardware.
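As a quick check on the total, the stage timings add up as follows (a throwaway script, nothing OpenClaw-specific):

```javascript
// Stage durations from the list above, in seconds.
const stages = { collect: 42, ocr: 5 * 60 + 11, llm: 60 + 2, push: 16 };

const totalSeconds = Object.values(stages).reduce((a, b) => a + b, 0);
const formatted = `${Math.floor(totalSeconds / 60)} m ${totalSeconds % 60} s`;

console.log(totalSeconds, formatted); // prints: 431 7 m 11 s
```

OCR accounts for roughly 72% of the run (311 of 431 seconds), so that is the stage to optimize first if the nightly window ever gets tight.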
Cost breakdown
| Item | Cost |
| --- | --- |
| ClawCloud (shared) | $0 |
| Composio Gmail connector | $5/month |
| OpenAI usage (avg) | $3/month |
| QuickBooks Simple Start | $30/month (unchanged) |
The moment this automation saved a single billable hour, it had paid for itself.
Where the setup still hurts
- Bank 2FA — still flaky when SMS arrives late. A push-based authenticator API would fix half my alert noise.
- PDF mutants — some vendors embed text as images inside form XObjects. Tesseract copes, but it adds minutes.
- Edge categories — "Meals & Entertainment" split across personal/business remains a gray zone. No automation will guess intent.
Each annoyance is still cheaper than spreadsheets.
Next step: ship your own agent
If you already have OpenClaw running, clone github.com/yourhandle/openclaw-tax-bot (public, MIT-licensed), copy .env.example to .env, and fill in your keys. New users: npm i -g openclaw@0.19.0, then openclaw login, openclaw deploy. Total cold start to first categorized receipt is under 60 minutes, assuming your accountant emails back the GL list promptly. See you in the #tax-automation thread on Discord; bug reports welcome.