OpenClaw VirusTotal partnership: what it means for skill security

The short version: every skill pushed to the public OpenClaw registry is now run through VirusTotal’s static and dynamic malware engines. You’ll see a green, yellow, or red badge next to the skill before you hit Install. That answers the most common request we kept seeing in GitHub issues — “How do I know this random skill isn’t dropping a cryptominer?” But security people (myself included) are genetically unable to stop at the short version, so here’s the long one.

Why we wired VirusTotal into the skill registry

OpenClaw skills are just npm packages that the agent runs inside a sandbox. Sandbox or not, skills have file-system access, network access, and in 0.14.0 they even get a headless browser context if they ask for it. That is more than enough surface area to hide something nasty.

Until last week our only protection was code review by maintainers. That does not scale; the registry is sitting at 3 127 skills and we merge ~40 new ones per week. So we reached out to VirusTotal (owned by Chronicle/Google) and asked for API quota to scan every commit. They were already providing something similar for Homebrew, so the conversation was short: we signed, they whitelisted our IP, and we pushed the first batch on Friday 2024-05-10.

What VirusTotal scanning actually checks

Each skill tarball is uploaded to VirusTotal’s v3 file endpoint (POST /api/v3/files). Behind that one call you get:

76 static AV engines (ClamAV, Kaspersky, MS Defender, etc.)
Five sandbox detonation systems that run the code in a stripped-down container for ~90 seconds
YARA rule matching for common malicious patterns
Reputation look-up on outbound IPs and domains

The report is cached; we only rescan if the tarball checksum changes. We store the sha256, the overall score (the number of engines that flagged the file), and a link to the public VT page. That’s what powers the badge you see in the Gateway UI:

Badge logic

0/76 detections → green “Clean”
1-3/76 → yellow “Suspicious”
>3/76 → red “Malicious”
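
The thresholds above, together with the checksum-keyed cache, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the registry's actual code: badge_for, ReportCache, and the scan callback are hypothetical names, and the callback stands in for whatever performs the real VirusTotal request.

```python
import hashlib
from typing import Callable, Dict


def badge_for(detections: int) -> str:
    """Map an engine detection count to the badge thresholds listed above."""
    if detections < 0:
        raise ValueError("detections cannot be negative")
    if detections == 0:
        return "clean"        # green badge
    if detections <= 3:
        return "suspicious"   # yellow badge
    return "malicious"        # red badge


class ReportCache:
    """Hypothetical cache keyed by tarball sha256: rescan only on a new checksum."""

    def __init__(self, scan: Callable[[bytes], int]) -> None:
        self._scan = scan                      # stand-in for the real VT call
        self._detections: Dict[str, int] = {}  # sha256 -> engines that flagged

    def verdict(self, tarball: bytes) -> str:
        digest = hashlib.sha256(tarball).hexdigest()
        if digest not in self._detections:
            # Unknown checksum: scan once and remember the result.
            self._detections[digest] = self._scan(tarball)
        return badge_for(self._detections[digest])
```

Because the cache key is the content hash, republishing an identical tarball costs no API quota, while any byte-level change forces a fresh scan.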

You can also pull the raw JSON from the CLI:


$ claw skill vt-report openclaw-weather@1.2.9
{
  "sha256": "b18c…e4",
  "engines_detected": 0,
  "total_engines": 76,
  "sandbox_ips": [],
  "verdict": "clean",
  "vt_url": "https://www.virustotal.com/gui/file/b18c…e4"
}
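
If you want to log or gate on that report in a script, a few lines of Python are enough. The sketch below assumes the CLI prints valid JSON with the field names shown above; summarize_report is a hypothetical helper, not part of the claw CLI.

```python
import json


def summarize_report(raw: str) -> str:
    """Turn a vt-report JSON document into a one-line summary."""
    report = json.loads(raw)
    detected = int(report["engines_detected"])
    total = int(report["total_engines"])
    # Refuse reports whose counts cannot be right, rather than guessing.
    if total <= 0 or not 0 <= detected <= total:
        raise ValueError("inconsistent engine counts in report")
    return f"{report['verdict']}: {detected}/{total} engines flagged ({report['vt_url']})"
```

Failing loudly on inconsistent counts is deliberate: a truncated or tampered report should stop the pipeline, not pass as clean.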

What the VirusTotal badge does not tell you

No scanner can reason about the intent of an LLM prompt. If a skill passes a user’s secret document into the system prompt and asks the model to summarize it, that could be fine or it could leak the whole thing to a paste-bin URL. VT will not flag that because it’s “just text.”

Same for prompt injection vectors:

Indirect prompts hidden in a PDF the user feeds to the agent
Jailbreak strings stored in a remote calendar entry
Self-modifying chain-of-thought that writes to its own memory store

That class of issues lives at the logic layer, not the code layer. VirusTotal (and all other AV) is focused on the code layer: malware, backdoors, trojans, ransomware, crypto-miners, typo-squatted dependencies. Important but different.

Hands-on: using VirusTotal reports in the Gateway and CLI

Gateway >= 0.28.0 shows the badge right below the skill name. Click it and you get a side panel with the full VT JSON. You can also set an org-wide policy:


# /etc/openclaw/gateway.yaml
skillSecurity:
  minVerdict: clean   # allow clean only
  allowSuspicious: false
  allowMalicious: false

With that in place, installs that resolve to a yellow or red badge abort with HTTP 412.
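
To make the policy concrete, here is a rough Python model of the decision. The precedence between minVerdict and the allow* flags is an assumption made for illustration (a verdict passes if it is no worse than minVerdict, or if its explicit flag is true); check_install is a hypothetical name, and the Gateway's real evaluation may differ.

```python
from typing import Mapping, Tuple

# Verdicts ordered from safest to riskiest.
_RANK = {"clean": 0, "suspicious": 1, "malicious": 2}


def check_install(policy: Mapping[str, object], verdict: str) -> Tuple[int, str]:
    """Return (http_status, message) for an install request.

    Assumed semantics: a verdict passes if it is no worse than minVerdict,
    or if its explicit allow* flag is set to true.
    """
    if verdict not in _RANK:
        raise ValueError(f"unknown verdict: {verdict!r}")
    floor = str(policy.get("minVerdict", "clean"))
    if floor not in _RANK:
        raise ValueError(f"unknown minVerdict: {floor!r}")
    explicit = {
        "suspicious": bool(policy.get("allowSuspicious", False)),
        "malicious": bool(policy.get("allowMalicious", False)),
    }
    if _RANK[verdict] <= _RANK[floor] or explicit.get(verdict, False):
        return 200, "install allowed"
    return 412, f"install blocked: VT verdict {verdict}"
```

With the gateway.yaml shown above, only a clean verdict returns 200; yellow and red both map to 412.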

From the CLI (v0.13.2+):


$ claw policy set vt.cleanOnly true
$ claw install openclaw-rss-summarizer
Error: VT verdict suspicious (1/76 engines). Install blocked.

If you are self-hosting and want the raw scan artifact, you can request it once per 24 h without eating into our quota:


$ claw admin vt-fetch b18c…e4 --outfile report.zip

Digging into detections: false positives and real threats

Out of the first 3 127 skills we scanned, 34 came back red. We investigated each:

21 were flat-out malicious. Six of them tried to download xmrig from a GitHub raw URL. Yes, still a thing in 2024.
9 were false positives. Mostly shell scripts that vendor curl | bash install blobs.
4 were gray-zone crypto utilities embedding bip39-wordlist.txt; some AV engines flag any crypto wallet helper.

We yanked the confirmed bad packages, opened PRs on the false positives (re-structured the installers), and pushed a doc update: if your skill shells out, expect scrutiny.

Are VirusTotal scans enough to trust community skills?

Short answer: no. Longer answer: it raises the floor but not the ceiling.

Covers: malicious binaries, obfuscated scripts, network callbacks to known C2 infra, dependency confusion.
Misses: prompt logic attacks, eval of user input, privilege escalation via tool APIs, misconfigured memory scope.
Gray area: post-install scripts that are technically legit but run risky commands (chmod -R 777 /tmp came up last week).

The only way to evaluate logic-level safety today is manual review or sandbox replay with synthetic prompts. We’re experimenting with Semgrep rules on the prompt templates, but that’s bleeding edge.

Hardening your agent beyond VirusTotal

Runtime caps. Set processSandbox.seccompProfile: 'default' in daemon.yaml so syscalls are limited even if a skill escapes Node.
Network egress ACL. Gateway 0.28.0 lets you define outboundAllowList. List only the hosts a skill genuinely needs; everything else, including *.pastebin.com, *.ngrok.io, and friends, is blocked.
Memory scopes. Skills default to scope: session. Only flip to scope: global if you need cross-conversation state. Fewer leaks.
Prompt linting. Run claw lint. The 0.15.0 linter catches unescaped {{userInput}} in system prompts.
Sign your skill releases. From 0.19.0 the registry accepts .sig files generated by npm pack --sign git.
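
The egress item is the easiest one to reason about in code. A minimal sketch of default-deny host matching, assuming the allow list holds shell-style glob patterns; egress_allowed is a hypothetical helper and the Gateway's actual matching rules are not documented here.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def egress_allowed(host: str, allow_list: Iterable[str]) -> bool:
    """Return True only if host matches one of the allow-list glob patterns."""
    # Normalize case and drop a trailing root dot so "Example.COM." still matches.
    normalized = host.strip().rstrip(".").lower()
    return any(fnmatchcase(normalized, pattern.lower()) for pattern in allow_list)
```

For example, with allow_list = ["api.example.com", "*.virustotal.com"], a request to www.virustotal.com passes while anything under pastebin.com is refused simply because no pattern covers it.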

Practical checklist before installing an untrusted skill

Look at the VirusTotal badge. Yellow is a smell, red is a no-go.
Open the repo, scan for spawn, exec, child_process. Humans beat AV here.
Search the prompt templates for {{raw}} or unfiltered user input.
Check the publisher’s npm history. One-day-old accounts with a string of typosquatted package names are a common pattern.
Run the skill in a throw-away agent first. Use claw dev sandbox, which spins up an agent with no persistent memory and blocks outbound net.
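
The two grep-style steps in this checklist are easy to script as a first pass before you read the code properly. The sketch below is a rough aid, not a replacement for review: find_risky_lines and the pattern list are illustrative assumptions, and a regex will produce false positives (for example RegExp.prototype.exec).

```python
import re
from typing import List, Tuple

# Patterns from the checklist: process-spawning APIs and raw template slots.
RISKY = re.compile(
    r"\b(?:child_process|spawn(?:Sync)?|exec(?:Sync|File|FileSync)?)\b"
    r"|\{\{\s*raw\s*\}\}"
)


def find_risky_lines(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, matched_text) for each line with a risky pattern."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = RISKY.search(line)
        if match:
            # Record only the first hit per line; the goal is to point a human at it.
            hits.append((number, match.group(0)))
    return hits
```

Run it over each source file and prompt template in the repo, then read the flagged lines in context.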

If you hit something sketchy, click Report Skill in the Gateway or run claw report <skill>. Reports create a private GitHub issue that the security team triages within 24 h.

What’s next for skill security

Two items on the roadmap:

Reproducible builds. We’re working on npm ci --ignore-scripts pipelines to prove the tarball matches the repo tag.
LLM red teaming harness. Think Swiss cheese prompts fed through each skill to detect role leaks and indirect injections. Early prototype lives on feature/red-team-harness.

The VirusTotal partnership gives us a solid first gate. It doesn’t absolve anyone from reading code or thinking through prompt flow. Security is layers; we just added one.

Next step for you: upgrade to Gateway 0.28.0, enable skillSecurity.minVerdict: clean, and rescan your existing agents. Five minutes of work, one less thing to worry about.