The short version: every skill pushed to the public OpenClaw registry is now run through VirusTotal’s static and dynamic malware engines. You’ll see a green, yellow, or red badge next to the skill before you hit Install. That answers the most common request we kept seeing in GitHub issues — “How do I know this random skill isn’t dropping a cryptominer?” But security people (myself included) are genetically unable to stop at the short version, so here’s the long one.
Why we wired VirusTotal into the skill registry
OpenClaw skills are just npm packages that the agent runs inside a sandbox. Sandbox or not, skills have file-system access, network access, and in 0.14.0 they even get a headless browser context if they ask for it. That is more than enough surface area to hide something nasty.
Until last week our only protection was code review by maintainers. That does not scale; the registry is sitting at 3 127 skills and we merge ~40 new ones per week. So we reached out to VirusTotal (owned by Chronicle/Google) and asked for API quota to scan every commit. They were already providing something similar for Homebrew, so the conversation was short: we signed, they whitelisted our IP, and we pushed the first batch on Friday 2024-05-10.
What VirusTotal scanning actually checks
Each skill tarball is uploaded to VirusTotal's v3 file-scan endpoint (/api/v3/files). Behind that one call you get:
- 76 static AV engines (ClamAV, Kaspersky, MS Defender, etc.)
- Five sandbox detonation systems that run the code in a stripped container for ~90 seconds
- YARA rule matching for common malicious patterns
- Reputation look-up on outbound IPs and domains
The report is cached; we only rescan if the tarball checksum changes. We store the sha256, the overall score (the number of engines that flagged the file), and a link to the public VT page. That’s what powers the badge you see in the Gateway UI:
Badge logic
- 0/76 detections → green “Clean”
- 1-3/76 → yellow “Suspicious”
- >3/76 → red “Malicious”
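The thresholds above fit in a few lines. This is a minimal sketch, not the registry's actual code: the function name badge_for and its return shape are made up for illustration, and only the 0 / 1-3 / >3 cut-offs come from the list above.

```python
def badge_for(detections: int, total_engines: int = 76) -> tuple[str, str]:
    """Map a detection count to (colour, label) using the thresholds listed above.

    Hypothetical helper: the name and return shape are assumptions for illustration.
    """
    if not 0 <= detections <= total_engines:
        raise ValueError("detections must be between 0 and total_engines")
    if detections == 0:
        return ("green", "Clean")
    if detections <= 3:
        return ("yellow", "Suspicious")
    return ("red", "Malicious")


if __name__ == "__main__":
    for count in (0, 1, 3, 4):
        print(count, badge_for(count))
```

Note that a single engine flagging the file is already enough to leave green, which is why false positives show up as yellow badges fairly often.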
You can also pull the raw JSON from the CLI:
$ claw skill vt-report openclaw-weather@1.2.9
{
  "sha256": "b18c…e4",
  "engines_detected": 0,
  "total_engines": 76,
  "sandbox_ips": [],
  "verdict": "clean",
  "vt_url": "https://www.virustotal.com/gui/file/b18c…e4"
}
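The sha256 field is also what drives the caching rule mentioned earlier: a tarball is only rescanned when its checksum changes. A rough sketch of that check, assuming the cache is a simple mapping from checksum to stored report (the names sha256_hex, needs_rescan, and cached_reports are hypothetical, not part of the claw CLI):

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the tarball bytes."""
    return hashlib.sha256(data).hexdigest()


def needs_rescan(tarball: bytes, cached_reports: dict[str, dict]) -> bool:
    """A tarball needs a scan only if no report is stored under its checksum.

    cached_reports is an assumed in-memory stand-in for wherever reports live.
    """
    return sha256_hex(tarball) not in cached_reports


if __name__ == "__main__":
    cache: dict[str, dict] = {}
    release = b"pretend these are tarball bytes"
    print(needs_rescan(release, cache))        # first time: scan it
    cache[sha256_hex(release)] = {"verdict": "clean"}
    print(needs_rescan(release, cache))        # unchanged bytes: reuse the report
    print(needs_rescan(release + b"!", cache)) # any change: scan again
```

Keying on the content hash rather than the version string means a republished tarball under the same version number still gets a fresh scan.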
What the VirusTotal badge does not tell you
No scanner can reason about the intent of an LLM prompt. If a skill passes a user’s secret document into the system prompt and asks the model to summarize it, that could be fine or it could leak the whole thing to a paste-bin URL. VT will not flag that because it’s “just text.”
Same for prompt injection vectors:
- Indirect prompts hidden in a PDF the user feeds to the agent
- Jailbreak strings stored in a remote calendar entry
- Self-modifying chain-of-thought that writes to its own memory store
That class of issues lives at the logic layer, not the code layer. VirusTotal (and all other AV) is focused on the code layer: malware, backdoors, trojans, ransomware, crypto-miners, typo-squatted dependencies. Important but different.
Hands-on: using VirusTotal reports in the Gateway and CLI
Gateway >= 0.28.0 shows the badge right below the skill name. Click it and you get a side panel with the full VT JSON. You can also set an org-wide policy:
# /etc/openclaw/gateway.yaml
skillSecurity:
  minVerdict: clean  # allow clean only
  allowSuspicious: false
  allowMalicious: false
With that in place, installs that resolve to a yellow or red badge abort with HTTP 412.
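To make the policy's effect concrete, here is one way the install gate could be modelled. It is a sketch under assumptions: it only models the two allow* booleans, it does not attempt to capture how minVerdict interacts with them, and the class and function names are hypothetical rather than the Gateway's real internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillSecurityPolicy:
    """Illustrative mirror of the allowSuspicious / allowMalicious keys."""
    allow_suspicious: bool = False
    allow_malicious: bool = False


def install_allowed(verdict: str, policy: SkillSecurityPolicy) -> bool:
    """Return True if a skill with this verdict may be installed.

    Unknown verdicts are rejected (fail closed); that choice is an assumption.
    """
    if verdict == "clean":
        return True
    if verdict == "suspicious":
        return policy.allow_suspicious
    if verdict == "malicious":
        return policy.allow_malicious
    return False


if __name__ == "__main__":
    strict = SkillSecurityPolicy()
    for verdict in ("clean", "suspicious", "malicious"):
        print(verdict, "->", "install" if install_allowed(verdict, strict) else "blocked")
```

Failing closed on anything unrecognised is the conservative choice: a missing or malformed report then blocks the install instead of silently passing.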
From the CLI (v0.13.2+):
$ claw policy set vt.cleanOnly true
$ claw install openclaw-rss-summarizer
Error: VT verdict suspicious (1/76 engines). Install blocked.
If you are self-hosting and want the raw scan artifact you can request it once per 24h without eating into our quota:
$ claw admin vt-fetch b18c…e4 --outfile report.zip
Digging into detections: false positives and real threats
Out of the first 3 127 skills we scanned, 34 came back red. We investigated each:
- 21 were flat-out malicious. Six of them tried to download xmrig from a GitHub raw URL. Yes, still a thing in 2024.
- 9 were false positives. Mostly shell scripts that vendor curl | bash install blobs.
- 4 were gray zone crypto utilities embedding bip39-wordlist.txt; some AV engines flag any crypto wallet helper.
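For a sense of scale, the breakdown above works out as follows. The snippet only does arithmetic on the numbers already quoted (3 127 scanned, 21 / 9 / 4 red); the variable names are mine.

```python
scanned = 3127
red = {"malicious": 21, "false_positive": 9, "gray_zone": 4}

total_red = sum(red.values())                          # should equal the 34 red verdicts
red_rate = total_red / scanned                         # share of the registry flagged red
malicious_rate = red["malicious"] / scanned            # share confirmed malicious
false_positive_share = red["false_positive"] / total_red

if __name__ == "__main__":
    print(f"red verdicts:            {total_red}")
    print(f"red rate:                {red_rate:.2%}")
    print(f"confirmed malicious:     {malicious_rate:.2%}")
    print(f"false positives in red:  {false_positive_share:.1%}")
```

Roughly one skill in a hundred came back red, and about a quarter of those reds were false positives, which is why a red badge triggers a manual look rather than an automatic ban.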
We yanked the confirmed bad packages, opened PRs on the false positives (re-structured the installers), and pushed a doc update: if your skill shells out, expect scrutiny.
Are VirusTotal scans enough to trust community skills?
Short answer: no. Longer answer: it raises the floor but not the ceiling.
- Covers: malicious binaries, obfuscated scripts, network callbacks to known C2 infra, dependency confusion.
- Misses: prompt logic attacks, eval of user input, privilege escalation via tool APIs, misconfigured memory scope.
- Gray area: post-install scripts that are technically legit but run risky commands (chmod -R 777 /tmp came up last week).
The only way to evaluate logic-level safety today is manual review or sandbox replay with synthetic prompts. We’re experimenting with Semgrep rules on the prompt templates, but that’s bleeding edge.
Hardening your agent beyond VirusTotal
- Runtime caps. Set processSandbox.seccompProfile: 'default' in daemon.yaml so syscalls are limited even if a skill escapes Node.
- Network egress ACL. Gateway 0.28.0 lets you define outboundAllowList. Block *.pastebin.com, *.ngrok.io, and friends.
- Memory scopes. Skills default to scope: session. Only flip to scope: global if you need cross-conversation state. Fewer leaks.
- Prompt linting. Run claw lint. The 0.15.0 linter catches unescaped {{userInput}} in system prompts.
- Sign your skill releases. From 0.19.0 the registry accepts .sig files generated by npm pack --sign git.
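If you want to sanity-check which hostnames a wildcard like *.pastebin.com would actually catch, a quick glob test helps. This is a sketch using Python's fnmatch, not the Gateway's matcher; how the Gateway evaluates its patterns is not documented here, and the names BLOCKED_PATTERNS and is_blocked are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns taken from the hardening list above; the constant name is illustrative.
BLOCKED_PATTERNS = ["*.pastebin.com", "*.ngrok.io"]


def is_blocked(hostname: str, patterns: list[str] = BLOCKED_PATTERNS) -> bool:
    """Return True if the hostname matches any blocked glob pattern."""
    host = hostname.lower().rstrip(".")  # normalise case and a trailing root dot
    return any(fnmatchcase(host, pattern) for pattern in patterns)


if __name__ == "__main__":
    for host in ("evil.pastebin.com", "abc123.ngrok.io", "pastebin.com", "api.github.com"):
        print(f"{host:22} blocked={is_blocked(host)}")
```

With plain glob semantics, *.pastebin.com does not match the bare pastebin.com, so it is worth listing the apex domain separately if your matcher behaves the same way.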
Practical checklist before installing an untrusted skill
- Look at the VirusTotal badge. Yellow is a smell, red is a no-go.
- Open the repo, scan for spawn, exec, child_process. Humans beat AV here.
- Search the prompt templates for {{raw}} or unfiltered user input.
- Check the publisher’s npm history. One-day-old accounts plus lots of typosquats are common.
- Run the skill in a throw-away agent first. Use claw dev sandbox, which spins up an agent with no persistent memory and blocks outbound net.
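The grep-style steps in this checklist are easy to script as a first pass before you read the code properly. The sketch below is a crude regex scan, not a parser; the pattern set and the function name find_risky_lines are my own choices, and it will produce false positives (for example, a JavaScript regex.exec(...) call).

```python
import re

# Hypothetical pattern set covering the identifiers named in the checklist.
RISKY_PATTERNS = {
    "child_process": re.compile(r"\bchild_process\b"),
    "spawn": re.compile(r"\bspawn(?:Sync)?\s*\("),
    "exec": re.compile(r"\bexec(?:Sync|File|FileSync)?\s*\("),
    "raw template": re.compile(r"\{\{\s*raw\s*\}\}"),
}


def find_risky_lines(source: str) -> list[tuple[int, str]]:
    """Return (line number, pattern name) for every risky-looking line."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


if __name__ == "__main__":
    sample = (
        'const cp = require("child_process");\n'
        'cp.exec("ls");\n'
        "const x = 1;\n"
        'const prompt = "Summarize: {{raw}}";\n'
    )
    for lineno, name in find_risky_lines(sample):
        print(f"line {lineno}: {name}")
```

Treat the hits as a reading list, not a verdict: the point is to find the lines a human should look at first.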
If you hit something sketchy, click Report Skill in the Gateway or run claw report <skill>. Reports create a private GitHub issue that the security team triages within 24 h.
What’s next for skill security
Two items on the roadmap:
- Reproducible builds. We’re working on npm ci --ignore-scripts pipelines to prove the tarball matches the repo tag.
- LLM red teaming harness. Think Swiss cheese prompts fed through each skill to detect role leaks and indirect injections. Early prototype lives on feature/red-team-harness.
The VirusTotal partnership gives us a solid first gate. It doesn’t absolve anyone from reading code or thinking through prompt flow. Security is layers; we just added one.
Next step for you: upgrade to Gateway 0.28.0, enable skillSecurity.minVerdict: clean, and rescan your existing agents. Five minutes of work, one less thing to worry about.