Automating Tax Prep with OpenClaw: Document Collection, Categorization, Summaries

I burned the first week of April 2023 manually unzipping email attachments, renaming PDFs, and filling yet another "2023-Business-Expenses.xlsx". Never again. This season I pushed the entire workflow into OpenClaw and spent the saved time finding more deductible subscriptions. This post is exactly what I would have killed for twelve months ago: concrete setup notes, categorization logic, and the bits that glue OpenClaw to QuickBooks and my accountant’s inbox.

Why bother automating tax prep with OpenClaw?

Every spring looks the same: banks drip 1099-INTs, Stripe spits out monthly statements, AWS invoices pile up. The data is digital; the pain is shuffling it. OpenClaw already sits between my chat apps and half the internet. Turning it loose on tax paperwork required fewer lines of code than I wasted clicking "Download PDF" last year.

Costs: OpenClaw is MIT-licensed. Running on ClawCloud’s free tier kept infra at zero.
Reach: Composio adds 800+ connectors (Gmail, Outlook, Dropbox, Notion, QuickBooks, Xero, Google Sheets).
Extensibility: Node 22+ lets me write categorization logic in plain JavaScript, hot-reloaded into the agent.
Memory: Built-in vector store remembers file hashes, preventing double-imports.

I’m not pitching "set it and forget it" magic. You will still tweak regexes and occasionally override a GL code. But once the plumbing is live, 90% of the grunt work is gone.

High-level architecture

Everything runs inside one named agent (tax-bot-24) on ClawCloud.

Gateway — web UI for logs, prompt tweaks, manual triggers.
Daemon — keeps the worker alive, schedules nightly fetch jobs.
Composio connectors — Gmail, Dropbox, QuickBooks.
Code module — tax.js with categorization and summary generation.
Persistent memory — stores processed file IDs and category totals.
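
To make the "processed file IDs" role concrete, here is a minimal in-memory sketch. ProcessedStore and filterNew are hypothetical names for illustration; OpenClaw's real memory API is not shown here and may look quite different.

```javascript
// Hypothetical stand-in for the agent's persistent memory (illustration only).
class ProcessedStore {
  constructor() {
    this.seen = new Set(); // IDs (or content hashes) of documents already handled
  }

  // Returns true the first time an ID is offered, false on every repeat.
  claim(id) {
    if (this.seen.has(id)) return false;
    this.seen.add(id);
    return true;
  }
}

// Keep only documents that have not been processed before.
function filterNew(store, docs) {
  return docs.filter((doc) => store.claim(doc.id));
}
```

Running every collected batch through filterNew before categorization is what keeps a re-sent invoice from being counted twice.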

The only external runtime piece is a QuickBooks sandbox app for API push. If you still mail a shoebox of receipts, you can swap that step for plain CSV export.

1. Connecting OpenClaw to your document sources

Gmail & "tax-doc" label

I centralized incoming docs in Gmail with a server-side filter:

Matches: (subject:(invoice OR statement OR 1099) OR from:(amazonaws.com))
Action: Apply label "tax-doc" • Skip Inbox
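
Before trusting a filter like this, it can help to dry-run it against a few sample messages. The sketch below is a rough local approximation; wouldGetTaxLabel is a hypothetical helper, and Gmail's own matching (tokenization, display names in the From header) will not behave exactly like these regexes.

```javascript
// Local approximation of the Gmail filter above, for dry-running sample messages.
// Expects `from` to be a bare address such as "billing@amazonaws.com".
const SUBJECT_TERMS = /\b(invoice|statement|1099)\b/i;
const SENDER_DOMAIN = /(^|[@.])amazonaws\.com$/i;

function wouldGetTaxLabel({ subject = "", from = "" }) {
  return SUBJECT_TERMS.test(subject) || SENDER_DOMAIN.test(from.trim());
}
```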

Then wired the label to OpenClaw via Composio:

// ClawCloud web UI → Integrations → Gmail → Add
GMAIL_LABEL="tax-doc"

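By default Composio hands over message IDs, body, and attachments. I limited the OAuth scope to the read-only https://www.googleapis.com/auth/gmail.readonly (the broader https://mail.google.com/ scope grants full mailbox access) and let OpenClaw request upgrades if needed.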

Dropbox & "Receipts" folder

Field expenses captured with mobile Dropbox Scan land in /Apps/Receipts/2024. The watchFolder helper (shipped with OpenClaw v0.18.2) polls that folder every 15 minutes and emits a file:new event for each file it has not seen before.

// .openclawrc.json
{
  "watchers": [
    {
      "type": "dropbox-folder",
      "path": "/Apps/Receipts/2024",
      "interval": 900
    }
  ]
}

That single JSON block replaced the classic cron+Dropbox CLI I hacked last year.
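
For anyone curious what a polling watcher does on each tick, here is a minimal sketch. It is a general illustration, not watchFolder's actual internals; newFilesSince is a hypothetical name.

```javascript
// One polling tick: compare the current folder listing with what was seen before
// and return the paths that should trigger a "file:new" event.
function newFilesSince(seenPaths, currentPaths) {
  const fresh = currentPaths.filter((p) => !seenPaths.has(p));
  fresh.forEach((p) => seenPaths.add(p)); // remember them for the next tick
  return fresh;
}
```

With "interval": 900, a function like this would run once every 900 seconds against the latest Dropbox listing.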

Bank and card portals

Most banks still don't offer sane APIs. I solved this with browser automation:

// tax.js
exports.fetchChaseStatements = async ({ browser }) => {
  const page = await browser.newPage();
  await page.goto("https://secure08a.chase.com/web/auth/#/logon/login-index");
  await login(page); // 2FA via SMS code; session cookies are cached for later runs
  await downloadLatestPDF(page, "/Downloads/Chase/");
};

I schedule this weekly; the PDF lands in the same Dropbox folder and triggers the normal pipeline.

2. Categorization logic: from file to GL code

This is the part you will tweak the most. My goal: map every transaction or document into one of the 23 expense categories my accountant defined last decade. OpenClaw ships with GPT-4o support and can also run a local Llama-3 70B if you bring your own GPU. I saw fewer hallucinated categories when deterministic heuristics run first and the LLM only handles what they miss.

Step A — deterministic rules

// tax.js
const rules = [
  { match: /amazon web services/i,  gl: "6520 – Hosting" },
  { match: /github\.com/i,          gl: "6525 – Dev Tools" },
  { match: /uber|lyft/i,            gl: "6510 – Travel" },
  { match: /apple\.com.*developer/i, gl: "6525 – Dev Tools" }
];

function ruleBasedCategorize(text) {
  const r = rules.find(r => r.match.test(text));
  return r ? r.gl : null;
}

Match against the file name, email subject, and first 500 characters of OCR text.
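
A minimal sketch of how those three signals could be combined before the rules run. The field names (fileName, subject, ocrText) and the helper names are assumptions for illustration; the rules table is trimmed to two entries.

```javascript
// Same shape as the rules table above, trimmed to two entries for the example.
const sampleRules = [
  { match: /amazon web services/i, gl: "6520 – Hosting" },
  { match: /uber|lyft/i,           gl: "6510 – Travel" }
];

// Join the three signals into one string; OCR text is capped at 500 characters.
// The field names (fileName, subject, ocrText) are assumptions for this sketch.
function buildMatchText({ fileName = "", subject = "", ocrText = "" }) {
  return [fileName, subject, ocrText.slice(0, 500)].join("\n");
}

// First matching rule wins; null means "hand this one to the LLM fallback".
function categorizeByRules(doc, rules = sampleRules) {
  const text = buildMatchText(doc);
  const hit = rules.find((rule) => rule.match.test(text));
  return hit ? hit.gl : null;
}
```

Capping the OCR text keeps a vendor name buried on page nine of a statement from hijacking the category.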

Step B — LLM fallback

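// glListString: the accountant's GL codes joined into one string (defined elsewhere).
// extractGL: parses the model reply into a GL code, or a falsy value if none is found.
async function llmCategorize(text) {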
  const prompt = `You are a tax accounting assistant. Map the following transaction to one of these GL codes: ${glListString}. Text: "${text}".`;
  const res = await openclaw.ai.generate({
    model: "gpt-4o",
    prompt,
    max_tokens: 16,
    temperature: 0
  });
  return extractGL(res);
}

Temperature 0 is critical; otherwise you get "Training Data" jokes instead of ledger codes.
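
Even at temperature 0 the reply deserves validation before it touches the ledger. Below is one possible shape for extractGL, assuming the model reply has already been reduced to a plain string and that glList is a hypothetical subset of the accountant's codes.

```javascript
// Hypothetical subset of the accountant's GL list.
const glList = ["6510 – Travel", "6520 – Hosting", "6525 – Dev Tools"];

// Pull the first standalone four-digit code out of the model's reply and accept
// it only if it belongs to the known list; anything else returns null.
function extractGL(reply, allowed = glList) {
  const match = /\b(\d{4})\b/.exec(String(reply));
  if (!match) return null;
  return allowed.find((entry) => entry.startsWith(match[1])) ?? null;
}
```

Returning null for anything off-list lets the caller fall through to an "Uncategorized" bucket instead of inventing a code.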

Putting it together

exports.categorize = async (doc) => {
  const text = await doc.extractText(); // OCR images, strip whitespace
  let gl = ruleBasedCategorize(text);
  if (!gl) gl = await llmCategorize(text);
  if (!gl) gl = "6999 – Uncategorized";
  await doc.addTag(gl);
  await memory.upsert({ year: 2024, gl, amount: doc.amount });
};

The memory.upsert helper is part of v0.19.0 and is backed by SQLite. You could swap in Postgres if you already run one.
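
If you keep running totals per category, summing in integer cents avoids floating-point drift. This is a sketch with hypothetical helpers (toCents, addToTotals) and a plain Map standing in for the store; how memory.upsert aggregates internally is not covered here.

```javascript
// Convert "$1,047.99" or 47.99 into integer cents to avoid float drift in sums.
function toCents(amount) {
  const cleaned = String(amount).replace(/[$,\s]/g, "");
  const value = Number(cleaned);
  if (cleaned === "" || !Number.isFinite(value)) {
    throw new Error(`Unparseable amount: ${amount}`);
  }
  return Math.round(value * 100);
}

// Add one document to a Map keyed by "year|gl"; returns the new running total in cents.
function addToTotals(totals, { year, gl, amount }) {
  const key = `${year}|${gl}`;
  const next = (totals.get(key) ?? 0) + toCents(amount);
  totals.set(key, next);
  return next;
}
```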

3. Tracking deductions & receipt metadata

Beyond category, my accountant loves metadata: vendor, payment method, deductible-yes-no, origin (email, scan, scrape). I store these as JSON in OpenClaw memory.

await doc.setMeta({
  vendor: guessVendor(text),
  payment: detectCard(text),
  isDeductible: checkDeductible(gl),
  source: doc.origin
});

Conditional rules for home office vs. general admin run inside checkDeductible. Edge cases (mixed personal/business Amazon orders) still need human judgment; the system flags anything with multiple SKUs as needs-review.
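
A stripped-down sketch of both checks follows. The non-deductible code set and the lineItems/sku field names are placeholders for illustration, not tax advice and not the real rule set.

```javascript
// Placeholder set of GL code prefixes that should never be auto-marked deductible.
const NON_DEDUCTIBLE = new Set(["6999"]); // e.g. Uncategorized

function checkDeductible(gl) {
  return !NON_DEDUCTIBLE.has(String(gl).slice(0, 4));
}

// More than one distinct SKU may mean a mixed personal/business order.
function needsReview(lineItems = []) {
  return new Set(lineItems.map((item) => item.sku)).size > 1;
}
```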

Interactive Slack channel

I added a private #tax-review Slack channel. OpenClaw posts line items with needs-review and waits for a 👍 or 👎 reaction. The message format:

{
  "vendor": "Amazon",
  "amount": "$47.99",
  "preview": "USB-C Hub 8-in-1",
  "href": "https://claw.cloud/d/abcd.pdf"
}

Reaction events update the isDeductible flag immediately. No more spreadsheet commenting.
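
The reaction handling can be sketched as a pure function over Slack's reaction_added event, which carries the emoji name in reaction and the message timestamp in item.ts. The pending map and the document fields are assumptions for this example.

```javascript
// Map a Slack reaction name to a deductibility decision.
// Slack typically reports a thumbs-up as "+1" (alias "thumbsup") and a
// thumbs-down as "-1" (alias "thumbsdown").
function decisionFromReaction(reaction) {
  const base = reaction.split("::")[0]; // drop a skin-tone suffix if present
  if (base === "+1" || base === "thumbsup") return true;
  if (base === "-1" || base === "thumbsdown") return false;
  return null; // any other emoji is ignored
}

// pending: Map from Slack message timestamp to the document awaiting review (assumed shape).
function applyReaction(pending, event) {
  const doc = pending.get(event.item.ts);
  const decision = decisionFromReaction(event.reaction);
  if (!doc || decision === null) return false;
  doc.isDeductible = decision;
  doc.needsReview = false;
  pending.delete(event.item.ts);
  return true;
}
```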

4. Pushing data to accounting software

I use QuickBooks Online Simple Start (yes, the lowest tier). Composio exposes it via OAuth 2. The flow:

A nightly job in OpenClaw generates a JSON file of new/updated docs.
It transforms line items into Expense objects.
It calls the /v3/company/:id/purchase endpoint (QuickBooks Online models expenses as Purchase objects).

// tax.js
exports.syncQuickBooks = async () => {
  const docs = await memory.getUnpushed();
  for (const d of docs) {
    await quickbooks.createExpense({
      AccountRef: { value: accountMap[d.payment] },
      Line: [{
        Description: d.vendor,
        Amount: d.amount,
        DetailType: "AccountBasedExpenseLineDetail",
        AccountBasedExpenseLineDetail: {
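          // TaxCodeRef is a sales-tax code in QuickBooks; reused here to mirror the deductible flag
          TaxCodeRef: { value: d.isDeductible ? "TAX" : "NON" },
          AccountRef: { value: glToQB(d.gl) } // expense account for this line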
        }
      }]
    });
    await memory.markPushed(d.id);
  }
};

If you’re on Xero, swap quickbooks for xero. The wrapper APIs are 90% identical.
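
The glToQB lookup is worth making strict. Here is a minimal sketch; the account IDs are made up and would come from your own chart of accounts.

```javascript
// Hypothetical mapping from GL code to QuickBooks account ID.
const qbAccountIds = new Map([
  ["6510", "61"], // Travel
  ["6520", "62"], // Hosting
  ["6525", "63"]  // Dev Tools
]);

// Fail loudly on unmapped codes rather than posting to a default account.
function glToQB(gl) {
  const code = String(gl).trim().slice(0, 4);
  const id = qbAccountIds.get(code);
  if (id === undefined) {
    throw new Error(`No QuickBooks account mapped for GL ${code}`);
  }
  return id;
}
```

Because the sync loop only marks a document as pushed after the create call succeeds, a thrown error here leaves the document unpushed for the next run.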

Generating the accountant packet

On February 1st I run:

openclaw agent run tax-bot-24:generate-packet

The handler:

exports.generatePacket = async () => {
  const year = 2024;
  const csv  = await memory.toCSV({ year });
  const zip  = await archiveDocs({ year });
  await gmail.send({
    to: "accountant@example.com",
    subject: `${year} Financial Packet – ${csv.total} lines`,
    attachments: [csv.path, zip.path]
  });
};

The generated packet is a 150 KB CSV (one row per doc) plus a 40 MB ZIP of PDFs, JPGs, and HTML statements. That’s it. The accountant never logs into my systems, which keeps the SOC 2 people happier.
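
If you ever need to produce the one-row-per-doc CSV yourself (say, for the plain CSV export route), the main trap is vendor names containing commas or quotes. A minimal sketch with hypothetical helpers, following RFC 4180 quoting:

```javascript
// Quote a field when it contains a comma, double quote, or line break.
function csvField(value) {
  const text = value == null ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build a CSV string with a header row and one row per document.
function toCsv(rows, columns) {
  const header = columns.map(csvField).join(",");
  const body = rows.map((row) => columns.map((c) => csvField(row[c])).join(","));
  return [header, ...body].join("\r\n") + "\r\n";
}
```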

5. Scheduling, alerts & year-over-year hygiene

Nightly job definition

// .openclawrc.json (excerpt); times are UTC, so the first job runs nightly at 02:00
{
  "cron": [
    {
      "expr": "0 2 * * *",
      "task": "tax-bot-24:collect"
    },
    {
      "expr": "15 2 * * *",
      "task": "tax-bot-24:categorize"
    },
    {
      "expr": "45 2 * * *",
      "task": "tax-bot-24:syncQuickBooks"
    }
  ]
}

If any step throws, the built-in notifier pushes to my personal Telegram. Error volume stayed low once I cached 2FA cookies properly.

Rollover script

January 1st:

openclaw agent run tax-bot-24:new-year 2025

This creates a new Dropbox folder, Gmail label, and memory partition, and updates the cron expressions. I fought for an hour to avoid hard-coding the year everywhere; read it from process.env.TAX_YEAR early.
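
One way to centralize the year is sketched below. taxConfig and the per-year Gmail label scheme are hypothetical; the Dropbox path follows the /Apps/Receipts/2024 layout used earlier.

```javascript
// Read the tax year once, validate it, and derive every year-dependent name from it.
function taxConfig(env = process.env) {
  const year = Number(env.TAX_YEAR);
  if (!Number.isInteger(year) || year < 2000 || year > 2100) {
    throw new Error(`TAX_YEAR must be a four-digit year, got: ${env.TAX_YEAR}`);
  }
  return {
    year,
    dropboxPath: `/Apps/Receipts/${year}`,
    gmailLabel: `tax-doc-${year}`, // hypothetical naming scheme
    packetSubject: `${year} Financial Packet`
  };
}
```

Handlers can then take the year from taxConfig() instead of a literal, so the rollover becomes a one-variable change.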

Security & privacy trade-offs

OAuth scopes — grant read-only where possible. QuickBooks obviously needs write.
Encrypted memory — ClawCloud at rest encryption is AES-256, but I opted to PGP-encrypt the ZIP packet prior to email because Gmail search still indexes attachment text.
2FA bypass haul — Browser automation for banks stores session cookies in the agent vault, rotated every seven days via SMS code. Ugly, but still better than me clicking things.

I would be fired from my own side hustle if data ever leaked. So far the threat model feels acceptable.

Performance notes

Full 2023 set: 1,873 documents, 2.6 GB. Nightly run time on ClawCloud shared CPU instance:

Collect: 42 s (Gmail bulk API dominates)
OCR: 5 m 11 s (Tesseract, two concurrent workers)
LLM categorize: 1 m 02 s (remote GPT-4o)
QuickBooks push: 16 s
Total: ~7 min

LLM latency shrinks if you host a local model on a GPU box, but I’m happy paying OpenAI $0.20 per run rather than buying more hardware.

Cost breakdown

ClawCloud (shared)	$0
Composio Gmail connector	$5/month
OpenAI usage (avg)	$3/month
QuickBooks Simple Start	$30/month (unchanged)

The moment this automation saved a single billable hour, it had paid for itself.

Where the setup still hurts

Bank 2FA — still flaky when SMS arrives late. A push-based authenticator API would fix half my alert noise.
PDF mutants — some vendors embed text as images inside form XObjects. Tesseract copes, but it adds minutes.
Edge categories — "Meals & Entertainment" split across personal/business remains a gray zone. No automation will guess intent.

Each annoyance is still cheaper than spreadsheets.

Next step: ship your own agent

If you already have OpenClaw running, clone github.com/yourhandle/openclaw-tax-bot (public, MIT-licensed), copy .env.example to .env, and fill in your keys. New users: npm i -g openclaw@0.19.0, then openclaw login, openclaw deploy. Total time from cold start to first categorized receipt is under 60 minutes, assuming your accountant emails back the GL list promptly. See you in the #tax-automation thread on Discord; bug reports welcome.