You tell Claude Code to deploy a Cloudflare Pages site. It asks you for the project name. You tell it. It asks for the account ID. You tell it. It guesses the wrong wrangler flags. You correct it. Next week, same task, same questions, same corrections.
Claude Code has no long-term memory between sessions. Every conversation starts from zero. The agent is capable, but it forgets everything you taught it the moment you close the terminal.
Skills fix this. A Claude Code skill is a markdown file (SKILL.md) that gives the agent persistent, reusable expertise on a specific topic. Write it once, and Claude Code reads it automatically every time the topic comes up. No re-explaining. No correcting the same mistakes.
I've built seven production skills over the past two months: a brand guide enforcer, a frontend design system, a blog publisher, a humanizer that strips AI writing patterns, a Proxmox infrastructure monitor, a smart kettle controller, and an email API integration. Some are 30 lines. Some are 200+. All of them save me from repeating myself.
This post covers how to build Claude Code skills from scratch. You'll get the file structure, the YAML frontmatter format, mode detection patterns, reference file organization, and real code from production skills. By the end you'll be able to write your own.
01 What a Skill Actually Is
A Claude Code skill is a directory containing a SKILL.md file. That file has two parts: YAML frontmatter (metadata the system uses to decide when to load the skill) and markdown body (instructions the agent follows when the skill is active).
The simplest possible skill:
---
name: deploy-helper
description: Deploy sites to Cloudflare Pages. Use when the user
  mentions deploying, publishing, or pushing to production.
---
# Deploy Helper
Deploy to Cloudflare Pages using wrangler.
## Steps
1. Run `wrangler pages deploy ./dist --project-name my-site`
2. Verify the deployment at the Pages URL
3. Report the live URL back to the user
That's a working skill. Put it in a SKILL.md file inside a skill directory, and Claude Code will read it whenever deployment comes up in conversation. The agent follows your instructions instead of guessing.
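Putting it on disk is a mkdir and a heredoc. The `skills/` root here is an assumption; where Claude Code actually discovers skills depends on whether they're installed at the project or user level in your setup:

```shell
# Sketch: create the deploy-helper skill from the example above.
# The skills/ root is a placeholder; use whatever directory your
# Claude Code setup actually scans for skills.
mkdir -p skills/deploy-helper
cat > skills/deploy-helper/SKILL.md <<'EOF'
---
name: deploy-helper
description: Deploy sites to Cloudflare Pages. Use when the user
  mentions deploying, publishing, or pushing to production.
---
# Deploy Helper

Deploy to Cloudflare Pages using wrangler.
EOF
```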
But simple skills like this only scratch the surface. The real power comes from structure.
02 The YAML Frontmatter
The frontmatter block between the --- markers tells the system two things: what this skill is called, and when to activate it.
---
name: humanizer
description: Strip AI writing patterns from text before sending.
  Run automatically on any user-facing prose longer than 3 sentences.
  Catches stock phrases, structural tells, em dash overuse,
  rule-of-three abuse, performed authenticity, hedging, puffery,
  and superficial analysis. Use when writing blog posts, product
  descriptions, emails, social posts, or any public-facing content.
---
The name field is a short identifier. Keep it lowercase with hyphens.
The description field is the trigger. This is how the system decides whether to load your skill for a given conversation. Be specific about when the skill applies. List the exact scenarios. The description above mentions "blog posts, product descriptions, emails, social posts" so the system knows to activate the humanizer whenever the user is writing any of those.
A vague description like "helps with writing" will either fire too often (wasting context) or not fire when you need it. Specificity matters here.
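That judgment can be partly mechanized. A rough lint (the checks and the 15-word floor are my own heuristics, nothing official) flags frontmatter whose description is too thin to match reliably:

```shell
# skill_lint: sanity-check SKILL.md frontmatter. The 15-word floor and the
# checks themselves are my own heuristics, not an official Claude Code rule.
skill_lint() {
  local file="$1"
  # Frontmatter = the lines between the first pair of '---' markers.
  local fm
  fm=$(awk '/^---$/{n++; next} n==1 {print}' "$file")

  echo "$fm" | grep -q '^name:'        || { echo "FAIL: missing name";        return 1; }
  echo "$fm" | grep -q '^description:' || { echo "FAIL: missing description"; return 1; }

  # Vague descriptions are usually short ones. Count everything from
  # 'description:' to the end of the frontmatter (rough, but good enough).
  local words
  words=$(echo "$fm" | sed -n '/^description:/,$p' | wc -w)
  if [ "$words" -lt 15 ]; then
    echo "WARN: thin description ($words words); list concrete trigger scenarios"
    return 2
  fi
  echo "OK"
}
```

Run it over every skill before committing; a WARN usually means the skill will misfire.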
What NOT to put in the frontmatter
Don't put instructions in the description. The description is for matching, not for behavior. All behavioral instructions go in the markdown body below the frontmatter. If you stuff instructions into the description, the system reads them during matching but may not carry them into the active session properly.
03 Structuring the Markdown Body
The body of your SKILL.md is regular markdown. Claude Code reads it as instructions. The structure you choose determines how well the agent follows your intent.
After building seven skills, I've landed on a pattern that works consistently:
# Skill Name
One-line summary of what this skill does.
## When to Apply
- Bullet list of specific triggers
- "NOT:" list of when to skip this skill
## Process
1. Numbered steps the agent follows
2. Each step is concrete and actionable
3. Include exact commands, file paths, tool names
## Quick Reference
- The most-used patterns, condensed
- Agent can scan this section fast during execution
## File Reference
| File | When to Read | Purpose |
|------|-------------|---------|
| `references/foo.md` | Entering mode X | Detailed instructions for X |
The "When to Apply" section with explicit NOT conditions is something I added after my humanizer skill kept activating during casual Telegram chats. The exclusion list:
## When to Apply
- Blog posts, product descriptions, landing page copy
- Emails, social media posts, forum posts
- Any content that will be read by humans who didn't ask an AI to write it
- NOT: code, configs, internal docs, memory files, Telegram chat replies
That "NOT" line saved me from the agent rewriting its own memory files to remove em dashes.
04 Reference Files: Keeping Skills Lean
A SKILL.md file that's 500 lines long wastes context tokens every time it loads. The fix: put detailed instructions in reference files and tell the agent when to read them.
Here's how the brand guide skill handles this. The SKILL.md is about 120 lines. It contains the high-level logic: how to detect which mode to enter, what the Builder does at a conceptual level, what rules the Enforcer follows. But the detailed interview flow (10 stages with visual preview instructions) lives in a separate file:
## File Reference
| File | When to Read | Purpose |
|------|-------------|---------|
| `references/builder-interview.md` | Entering Builder mode | Full guided interview flow |
| `references/css-generation.md` | Entering Enforcer mode | How to translate JSON into CSS |
| `templates/brand-guide-schema.json` | Builder mode (output) | JSON schema for the guide |
| `templates/example-guide.json` | Builder mode (reference) | Filled-out example |
The agent reads builder-interview.md only when someone asks to create a brand guide. It reads css-generation.md only when generating frontend code with an existing guide. The SKILL.md stays small, and detailed instructions load on demand.
Directory structure for a skill with reference files:
my-skill/
├── SKILL.md                  # Main skill file (loaded on match)
├── references/
│   ├── detailed-guide.md     # Loaded on demand
│   └── patterns.md           # Loaded on demand
├── scripts/
│   └── check.sh              # Shell scripts the agent can run
└── templates/
    └── schema.json           # Data templates
05 Single-Mode vs. Multi-Mode Skills
Simple skills do one thing. The blog publisher skill follows the same steps every time: write the post, add SEO meta tags, update the index page, update the sitemap, deploy. One mode. Linear flow.
More sophisticated skills change behavior based on context. The brand guide skill operates in two distinct modes, and the SKILL.md explicitly tells the agent how to detect which one to enter:
## Detecting the Mode
**Enter Builder mode when:**
- User says "create/build/develop a brand guide"
- User says "help me define my brand"
- No brand-guide.json exists yet and the user wants
to build frontends with consistency
**Enter Enforcer mode when:**
- A brand-guide.json file exists (in uploads or referenced)
- User asks to build any frontend (page, component, app)
- User says "follow my brand" or "use my brand guide"
**Both modes in one session:**
- User creates a guide (Builder) then immediately asks
to build something (Enforcer)
- User builds a frontend, notices inconsistencies,
and wants to update the guide (Enforcer then Builder)
This detection block is explicit. It lists exact phrases and file conditions the agent should look for. No ambiguity about which mode to pick.
Each mode then has its own section in the SKILL.md with separate rules, separate reference files, and separate workflows. The Enforcer has ten strict rules (colors are locked, typography is locked, spacing follows the scale). The Builder has five principles (show don't ask, start from what they have, opinionated defaults).
When should you use multi-mode? When your skill needs to both create something and enforce something. Or when the same domain has distinct workflows depending on what the user needs. If every invocation follows the same steps, stick with single-mode.
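The file-existence half of that detection can also be scripted so it never depends on phrasing. A sketch (the filename and search depth come from the example above; the phrase triggers stay with the agent):

```shell
# detect_mode: the file-condition half of the Builder/Enforcer detection.
# Enforcer when a brand-guide.json is present near the project root,
# Builder otherwise. Phrase triggers are still matched by the agent itself.
detect_mode() {
  local root="${1:-.}"
  if find "$root" -maxdepth 2 -name 'brand-guide.json' 2>/dev/null | grep -q .; then
    echo "enforcer"
  else
    echo "builder"
  fi
}
```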
06 Writing Rules the Agent Will Follow
Claude Code is an LLM. It's probabilistic. It will drift from your instructions if those instructions are vague. The fix: write rules that are absolute and auditable.
Bad rule:
Use the brand colors consistently.
Good rule:
Colors are locked. Use ONLY colors defined in the brand guide.
Every color in the output must trace back to a brand token.
No eyeballed hex values. No "close enough."
The good version leaves no room for interpretation. "ONLY," "every," "no" are absolute terms the agent respects better than "try to" or "should."
Another pattern that works well: self-audit instructions. Tell the agent to check its own output before presenting it.
## Self-audit before delivering
Before presenting the output, scan for:
- Any hardcoded color not from the guide
- Any font-family not from the guide
- Any spacing value not on the scale
- Any violation of the "don'ts" list
Fix violations before showing the user.
This creates a feedback loop inside the skill. The agent generates, checks, fixes, then presents. It catches drift that would otherwise reach you.
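One way to make that loop concrete is a small retry wrapper: keep applying a fixer until the checker passes, capped so a stubborn violation can't spin forever. `checker` and `fixer` here are stand-ins for whatever check script and rewrite step your skill defines:

```shell
# run_until_clean: apply a fixer until a checker passes, capped at 5 rounds
# so one stubborn violation can't loop forever. The checker and fixer are
# stand-ins for whatever check script and rewrite step the skill defines.
run_until_clean() {
  local checker="$1" fixer="$2" file="$3" tries=0
  while ! "$checker" "$file"; do
    tries=$((tries + 1))
    if [ "$tries" -gt 5 ]; then
      echo "giving up after 5 fix rounds" >&2
      return 1
    fi
    "$fixer" "$file"
  done
  echo "clean after $tries fix round(s)"
}
```

The cap matters: without it, a rule the fixer can't satisfy turns the self-audit into an infinite loop.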
Negative constraints beat positive ones
My humanizer skill is almost entirely negative constraints. Instead of "write naturally," it says:
- "Never use 'it's worth noting.' Just state the thing."
- "Kill 'in today's [adjective] world' on sight."
- "Max 1 em dash per paragraph. Use commas or periods instead."
Telling the agent what to avoid eliminates specific failure modes. Telling it what to do leaves room for interpretation. "Write naturally" means nothing to an LLM. "Never start a sentence with 'Furthermore'" is unambiguous.
Skip the Trial and Error
The Claude Code Skills Starter Kit includes 6 production-tested skill templates plus a guide covering frontmatter patterns, mode detection, reference file organization, and the debugging techniques I learned building 7+ skills. Each template is annotated with comments explaining why every section exists.
Get the Starter Kit — $24 →

07 Scripts and Automation Inside Skills
Skills can include shell scripts the agent runs as part of its workflow. My humanizer skill has a regression test script that checks for AI writing patterns:
#!/bin/bash
# humanizer-test.sh
# Usage: bash humanizer-test.sh <file>
# Stdin is buffered to a temp file so each check below can re-read it;
# bare /dev/stdin would be drained after the first grep.
if [ -n "${1:-}" ]; then FILE="$1"; else FILE=$(mktemp); cat > "$FILE"; fi
MATCHES=0
check() {
  local category="$1"
  local pattern="$2"
  local label="$3"
  local hits
  hits=$(grep -niE "$pattern" "$FILE" 2>/dev/null || true)
  if [ -n "$hits" ]; then
    MATCHES=$((MATCHES + 1))
    echo "❌ [$category] $label"
    echo "$hits" | head -5 | sed 's/^/ /'
  fi
}
check "STOCK" "it'?s worth (noting|mentioning)" \
"It's worth noting"
check "STOCK" "at the end of the day" \
"At the end of the day"
check "STOCK" "\bdelve(s|d)?\b" "Delve"
check "PUFFERY" "\b(groundbreaking|game-?changing)\b" \
"Groundbreaking/game-changing"
check "PUFFERY" "\bseamless(ly)?\b" "Seamless"
# ... 40+ more patterns
exit "$MATCHES"
The SKILL.md tells the agent to run this script on every piece of prose before sending it:
## Process
1. Write the content normally
2. Run `scripts/humanizer-test.sh` on the text
3. Auto-fix all matches (don't report them, just rewrite)
4. Run the script again to verify fixes
5. Repeat until 0 matches
6. Only then proceed with publishing
The script turns a subjective judgment ("does this sound like AI?") into an objective test ("does this match any of 40+ known patterns?"). The agent runs it, fixes hits, runs it again. Mechanical. Repeatable.
08 Real Example: The Blog Publisher Skill
This is the skill that publishes the post you're reading right now. It covers the entire pipeline from writing to deployment.
---
name: blog-publisher
description: Write, optimize, and deploy SEO blog posts for
  bmosan.com. Use when creating new blog posts, updating the
  blog index, or deploying blog content to Cloudflare Pages.
---
# Blog Publisher
Publish SEO-optimized blog posts to bmosan.com.
## Site Stack
- Static HTML/CSS (no framework)
- Hosted on Cloudflare Pages (project: `bmosan`)
- Deploy via: `wrangler pages deploy ./bmosan-labs/ --project-name bmosan`
## Workflow
### 1. Write the Post
- Target 2000-3000 words (10-15 min read)
- Use the HTML template at `assets/post-template.html`
- Give away ~70% of value freely, CTA to Gumroad product
### 2. CTA Placement
- Mid-article CTA: After 40-50% of content
- End-of-article CTA: After the last section
- Max 2 CTAs per post
### 3. SEO Checklist
Every post MUST have:
- <title> with primary keyword
- <meta name="description"> (150-160 chars)
- Open Graph tags
- Twitter Card tags
- JSON-LD Article schema
### 4. Update Blog Index
- Add new .post-card div at TOP of .posts
### 5. Update Sitemap
- Add new <url> entry to sitemap.xml
### 6. Deploy
- Set Cloudflare env vars from 1Password
- Run wrangler pages deploy
This skill is single-mode. Every blog post follows the same steps. The SEO checklist ensures I never forget meta tags. The CTA placement rules keep the sales pitch to exactly two spots. The deploy step includes the exact commands with the right flags.
Without this skill, I'd forget the JSON-LD schema half the time. Or I'd deploy without updating the sitemap. The skill makes those oversights impossible because the agent follows the checklist on every run.
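The sitemap step is the one a human forgets most easily, and it's one sed call. A sketch, assuming GNU sed and a sitemap whose closing `</urlset>` sits on its own line (the URL argument is a placeholder):

```shell
# add_sitemap_entry: splice a <url> block in just before </urlset>.
# Assumes GNU sed (\n in the replacement text) and </urlset> on its own line.
add_sitemap_entry() {
  local sitemap="$1" url="$2"
  local today; today=$(date +%Y-%m-%d)
  sed -i "s|</urlset>|  <url>\n    <loc>${url}</loc>\n    <lastmod>${today}</lastmod>\n  </url>\n</urlset>|" "$sitemap"
}
```

Putting a helper like this in the skill's scripts/ directory means the agent runs the exact same splice every time instead of re-deriving it.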
09 Debugging Skills That Don't Work
Three failure modes I've hit repeatedly:
The skill never activates. Your description doesn't match the user's phrasing. Fix: add more trigger phrases to the description. Be redundant. If the skill handles "deploying," also mention "publishing," "pushing to production," "shipping," and "going live."
The agent ignores instructions mid-task. Your SKILL.md is too long. The agent loses track of rules that appear 200 lines into the file. Fix: move detailed instructions to reference files. Keep the SKILL.md under 150 lines. Put the most important rules near the top.
The agent follows the letter but not the spirit. Your rules are too vague. "Make it look professional" gives the agent nothing to work with. Fix: replace every subjective instruction with an objective one. Instead of "use appropriate spacing," write "use the 4px base unit. Valid values: 4, 8, 12, 16, 24, 32, 48, 64."
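Rules written that way are also mechanically auditable. A sketch using the scale from the example rule (swap in your own allowed values):

```shell
# check_spacing: flag any px value not on the 4px scale from the rule above.
# Prints line:value pairs for violations; prints nothing when clean.
check_spacing() {
  local file="$1"
  grep -noE '[0-9]+px' "$file" \
    | grep -vE ':(4|8|12|16|24|32|48|64)px$' || true
}
```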
The context window trap
Every skill loaded into a session consumes context tokens. If you have 15 skills and 8 of them activate for a conversation about frontend development, that's a lot of instructions competing for attention. The agent may follow the first skill's rules and quietly ignore the fifth.
Mitigations: keep descriptions narrow so fewer skills activate simultaneously. Use reference files so the base SKILL.md is small. Put the most important rules first in every file (the agent weighs early content more heavily).
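Word count is a rough proxy for tokens, but it's enough to rank offenders. A quick audit sketch, assuming the skills/ layout shown earlier:

```shell
# audit_skills: rough proxy for how much always-loaded context each skill
# costs: word count of every SKILL.md under a skills/ root, heaviest first.
# (A word is not a token, but the ranking is what matters.)
audit_skills() {
  local root="${1:-skills}"
  local f
  for f in "$root"/*/SKILL.md; do
    [ -f "$f" ] || continue
    printf '%6d words  %s\n' "$(wc -w < "$f")" "$f"
  done | sort -rn
}
```

The skill at the top of the list is the first candidate for moving detail into reference files.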
10 Skill Composition: Making Skills Work Together
Some skills are designed to work in combination. My brand guide skill has an explicit section about this:
### Interaction with frontend-design skill
This skill works WITH the frontend-design skill, not against it.
- frontend-design provides creative methodology and layout thinking
- brand-guide constrains the visual palette
When both skills are active, the frontend-design skill's instruction
to "choose bold, unexpected aesthetics" is tempered by: "bold and
unexpected within the brand's defined system."
If you're building skills that might run together, address the interaction directly. Which skill takes priority when rules conflict? Which one handles layout vs. visual style vs. content? Spell it out. The agent can't resolve ambiguity between two conflicting skill files on its own.
11 Skill Directory Structure
For a single skill:
skills/
└── my-skill/
    ├── SKILL.md
    ├── references/
    │   └── detailed-guide.md
    ├── scripts/
    │   └── check.sh
    └── templates/
        └── schema.json
For a collection of skills that share a domain:
skills/
├── blog-publisher/
│   ├── SKILL.md
│   ├── assets/
│   │   └── post-template.html
│   └── references/
│       └── design-tokens.md
├── humanizer/
│   ├── SKILL.md
│   ├── references/
│   │   └── patterns.md
│   └── scripts/
│       └── humanizer-test.sh
└── frontend-design/
    ├── SKILL.md
    └── references/
        └── design-principles.md
Each skill is self-contained. It can reference its own files with relative paths (references/patterns.md, scripts/check.sh). The agent resolves those paths relative to the SKILL.md location.
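Since a broken relative path fails silently (the agent is told to read a file that isn't there), a sanity check helps. A sketch, assuming references live under the conventional subdirectories shown above:

```shell
# check_refs: verify every relative path a SKILL.md mentions actually exists.
# Assumes references are written as references/..., scripts/..., templates/...,
# or assets/... paths relative to the SKILL.md itself.
check_refs() {
  local skill_md="$1"
  local dir; dir=$(dirname "$skill_md")
  local missing=0 path
  # Pull out anything that looks like a relative path into a known subdir.
  for path in $(grep -oE '(references|scripts|templates|assets)/[A-Za-z0-9._-]+' "$skill_md" | sort -u); do
    if [ ! -f "$dir/$path" ]; then
      echo "MISSING: $path"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}
```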
What We Didn't Cover
This post gives you the fundamentals for building working Claude Code skills. But there's more to the craft that we didn't get into:
- Skill testing frameworks that verify your skill produces correct output across different prompt variations
- Version control strategies for evolving skills without breaking existing workflows
- Advanced mode detection using file existence checks, environment variables, and conversation history patterns
- Skill packaging for distribution to teams, including dependency management between skills
- Performance profiling to measure how much context each skill consumes and optimize token usage
Get the Claude Code Skills Starter Kit
6 production-tested skill templates (single-mode, multi-mode, script-integrated, reference-heavy, composition-aware, and audit-loop). Each template is annotated with inline comments explaining the reasoning behind every section. Plus a 40-page guide covering the patterns in this post and the advanced techniques we didn't cover here.
Download the Starter Kit — $24 →

Start Building
Pick one task you repeat with Claude Code. Something where you find yourself re-explaining the same context, the same file paths, the same commands. Write a SKILL.md for it. Start with 20 lines. Add the YAML frontmatter with a specific description. Put the steps in the markdown body.
Test it. See where the agent deviates. Tighten the rules. Add a "NOT" list. Move long sections to reference files. Run it again.
A good skill saves you five minutes per session. Over a month of daily use, that's two and a half hours. Over a year, thirty hours. The 20 minutes you spend writing the SKILL.md pays for itself on the first day.