%%{init: {'theme': 'neutral', 'flowchart': {'useMaxWidth': false, 'htmlLabels': true, 'padding': 20, 'nodeSpacing': 30, 'rankSpacing': 40}, 'themeVariables': {'primaryColor': '#8B9DAF', 'primaryTextColor': '#ffffff', 'primaryBorderColor': '#6E7F91', 'secondaryColor': '#9CAF88', 'secondaryTextColor': '#ffffff', 'secondaryBorderColor': '#7A8D68', 'tertiaryColor': '#C2856E', 'tertiaryTextColor': '#ffffff', 'tertiaryBorderColor': '#A06A54', 'lineColor': '#B5A99A', 'textColor': '#4A4A4A', 'mainBkg': '#8B9DAF', 'nodeBorder': '#6E7F91', 'clusterBkg': 'rgba(139,157,175,0.12)', 'clusterBorder': '#B5A99A', 'edgeLabelBackground': 'transparent'}}}%%
flowchart TD
SK["<b>The Skills System</b>"]
FMT["SKILL.md Format<br>and Discovery"]
INV["11 Built-in<br>Skills (Inventory)"]
TRG["Trigger Conditions<br>and Matching"]
PIJ["Prompt Injection<br>and Token Budget"]
CMP["Skills vs Tools<br>vs MCP vs Hooks"]
SK --> FMT
SK --> INV
SK --> TRG
SK --> PIJ
SK --> CMP
style SK fill:#8B9DAF,color:#fff,stroke:#6E7F91
style FMT fill:#9CAF88,color:#fff,stroke:#7A8D68
style INV fill:#C2856E,color:#fff,stroke:#A06A54
style TRG fill:#B39EB5,color:#fff,stroke:#8E7A93
style PIJ fill:#C4A882,color:#fff,stroke:#A08562
style CMP fill:#8E9B7A,color:#fff,stroke:#6E7B5A
The Skills System
Plugins for Agent Cognition
A tool gives an agent new capabilities – read files, run commands, search the web. A skill gives it new expertise – how to write production-grade frontends, how to debug methodically, how to use the Anthropic SDK correctly across eight languages. Tools change what the agent does; skills change how it thinks. This distinction between capability and expertise is the central design insight of Claude Code’s skills system. Skills are Markdown files that inject domain-specific instructions directly into the system prompt, so the model reasons with that expertise from its first decision.
This post is a deep dive into the skills architecture: the SKILL.md format, the 11 built-in skills, the trigger system, the prompt injection mechanism, the 1% context budget, and the design patterns that make skills a compelling extension point. We covered skills at the survey level in Part III.4: Hooks & Lifecycle. Here we go deep.
How to read this diagram. The central node is “The Skills System,” and five branches radiate outward to the subtopics covered in this post. Each leaf represents a standalone section: SKILL.md Format and Discovery, the 11 Built-in Skills Inventory, Trigger Conditions and Matching, Prompt Injection and Token Budget, and the comparison of Skills versus other extension mechanisms. Read the branches as a table of contents for the post.
Source files covered in this post:
| File | Purpose | Size |
|---|---|---|
| `src/skills/bundledSkills.ts` | 11 built-in skill definitions | ~200 LOC |
| `src/skills/loadSkillsDir.ts` | Skill discovery from `.claude/skills/` | ~300 LOC |
| `src/tools/SkillTool/SkillTool.ts` | Skill invocation and trigger matching | ~400 LOC |
| `src/tools/SkillTool/prompt.ts` | Skill prompt injection templates | ~200 LOC |
| `.claude/skills/` | User-defined skill directory (SKILL.md files) | N custom files |
Why Skills Exist: The Expertise Gap
LLMs are trained on broad knowledge but lack situational expertise. Skills fill this gap by injecting curated, domain-specific instructions at prompt assembly time.
Consider a concrete scenario. A developer asks Claude Code to build a React component. The model has been trained on millions of React examples and can produce plausible code. But “plausible” is not “production-grade.” A production React component needs ARIA labels for accessibility, responsive breakpoints, performant rendering patterns, and consistent design tokens. The model knows about these things – it has seen them in training data – but it does not prioritize them unless instructed to.
This is the expertise gap. Training data gives the model knowledge. Skills give it priorities. The frontend-design skill injects instructions that say, in effect: “When building UI, you are a senior frontend engineer. Accessibility is non-negotiable. Use semantic HTML. Follow responsive design patterns. Avoid generic AI aesthetics.” These are not new capabilities – the model could already do all of this. The skill ensures it actually does.
The distinction maps to a concept in cognitive science: declarative knowledge (knowing that) versus procedural knowledge (knowing how). The model’s training data provides declarative knowledge – it knows that ARIA labels exist. Skills provide procedural knowledge – they tell the model when and how to apply ARIA labels. This is why skills are called “plugins for cognition” rather than “plugins for capability.”
The SKILL.md Format: Markdown as Configuration
A skill is a Markdown file with trigger conditions and instructions. Discovery is filesystem-based: Claude Code scans .claude/skills/ for SKILL.md files at startup.
The format is deliberately simple. A SKILL.md file contains:
- A trigger block specifying when the skill should activate
- Instruction content that gets injected into the system prompt
- Optional metadata (name, description, priority)
```markdown
# Example: custom-api-style.md in .claude/skills/
---
name: custom-api-style
trigger:
  slash_command: /api-style
  keywords: ["REST API", "endpoint", "route handler"]
---
When writing REST API endpoints, follow these conventions:
1. Use explicit HTTP status codes (never rely on framework defaults)
2. Validate all input with zod schemas before processing
3. Return consistent error shapes: { error: string, code: number }
4. Log every request with correlation IDs
5. Write integration tests, not just unit tests
```

The discovery mechanism is straightforward. The loader in loadSkillsDir.ts scans the .claude/skills/ directory recursively, parsing each Markdown file into a skill registration. Built-in skills ship in the Claude Code binary itself, loaded from src/skills/ (4,066 lines of code). The SkillTool implementation in tools/SkillTool/ (1,477 lines) handles invocation, trigger matching, and content injection.
%%{init: {'theme': 'neutral', 'flowchart': {'useMaxWidth': false, 'htmlLabels': true, 'padding': 20, 'nodeSpacing': 30, 'rankSpacing': 40}, 'themeVariables': {'primaryColor': '#8B9DAF', 'primaryTextColor': '#ffffff', 'primaryBorderColor': '#6E7F91', 'secondaryColor': '#9CAF88', 'secondaryTextColor': '#ffffff', 'secondaryBorderColor': '#7A8D68', 'tertiaryColor': '#C2856E', 'tertiaryTextColor': '#ffffff', 'tertiaryBorderColor': '#A06A54', 'lineColor': '#B5A99A', 'textColor': '#4A4A4A', 'mainBkg': '#8B9DAF', 'nodeBorder': '#6E7F91', 'clusterBkg': 'rgba(139,157,175,0.12)', 'clusterBorder': '#B5A99A', 'edgeLabelBackground': 'transparent'}}}%%
flowchart LR
A["Scan built-in skills<br><i>src/skills/</i><br>11 bundled skills"]
B["Scan .claude/skills/<br><i>user-defined</i><br>N custom skills"]
C["Register all skills<br>in SkillRegistry"]
D["Ready for trigger<br>matching on each turn"]
A --> C
B --> C
C --> D
style A fill:#8B9DAF,color:#fff,stroke:#6E7F91
style B fill:#9CAF88,color:#fff,stroke:#7A8D68
style C fill:#C2856E,color:#fff,stroke:#A06A54
style D fill:#B39EB5,color:#fff,stroke:#8E7A93
How to read this diagram. Two sources on the left – built-in skills from src/skills/ (11 bundled) and user-defined skills from .claude/skills/ (N custom) – both feed into the central SkillRegistry node. From there, a single arrow leads to the “Ready for trigger matching” state on the right. The key point is that built-in and custom skills are merged into one unified registry at startup, and from that point forward the system treats them identically.
The choice of Markdown is not accidental. Skills are natural language instructions injected into a natural language prompt. Using Markdown means the skill content IS the prompt content – no serialization, no template language, no compilation step. The file you write is the text the model reads. This directness eliminates an entire class of “the template rendered wrong” bugs.
The Built-in Skills Inventory: 11 Bundled Specialists
Claude Code ships with 11 built-in skills, ranging from a 143-fragment API reference to a one-command loop timer. Each skill represents a curated expertise domain that Anthropic determined was valuable enough to bundle.
| Skill | Trigger | What It Injects | Category |
|---|---|---|---|
| `claude-api` | Anthropic SDK imports | 143 prompt fragments, 8 langs | Reference |
| `frontend-design` | Web UI building requests | Production-grade UI guide | Methodology |
| `code-review` | `/code-review` command | Structured review format | Methodology |
| `debugging` | `/debugging` or stuck pattern | Reproduce-hypothesize-fix | Methodology |
| `simplify` | `/simplify` command | Reuse/quality/efficiency | Methodology |
| `loop` | `/loop 5m /foo` | Recurring command execution | Automation |
| `init-claudemd` | New project setup | Generate CLAUDE.md | Setup |
| `stuck` | `/stuck` command | Reset approach, alternatives | Methodology |
| `update-config` | Configuration requests | settings.json guidance | Setup |
| `create-verifier` | Verification needs | Build validation skills | Automation |
| `pdf` | PDF operations | Read/combine/split/create | Reference |
The skills fall into four functional categories:
Reference skills (claude-api, pdf) inject domain-specific documentation. The claude-api skill is the heavyweight: 143 curated prompt fragments covering the Anthropic SDK across Python, TypeScript, Go, C#, Java, Ruby, PHP, and Kotlin. When activated, it gives the model accurate, version-specific API references rather than relying on potentially outdated training data.
Methodology skills (frontend-design, code-review, debugging, simplify, stuck) inject structured approaches. Instead of letting the model debug however it pleases, the debugging skill enforces a reproduce-hypothesize-test-fix cycle. Instead of an ad hoc code review, code-review mandates checking bugs, security, performance, and maintainability in sequence. These skills impose engineering discipline.
Automation skills (loop, create-verifier) enable recurring or self-referential workflows. The loop skill lets users run commands on intervals – useful for watching CI pipelines or monitoring deploys. The create-verifier skill is meta: it helps build skills that validate Claude Code’s own output.
Setup skills (init-claudemd, update-config) guide project initialization. They are activated during onboarding and help configure Claude Code for new repositories.
The distribution is telling. Five of the 11 skills are methodology skills – they inject process, not data. This reflects a core truth about LLM agents: the model’s knowledge base is rarely the bottleneck. Its approach is. Skills correct the approach by injecting expert workflows that training data alone does not reliably produce.
Trigger Conditions and Matching: When Skills Activate
Skills are not always active. Each skill defines trigger conditions that determine when its content should be injected. The trigger system supports four activation mechanisms, and they can be composed.
%%{init: {'theme': 'neutral', 'flowchart': {'useMaxWidth': false, 'htmlLabels': true, 'padding': 20, 'nodeSpacing': 30, 'rankSpacing': 40}, 'themeVariables': {'primaryColor': '#8B9DAF', 'primaryTextColor': '#ffffff', 'primaryBorderColor': '#6E7F91', 'secondaryColor': '#9CAF88', 'secondaryTextColor': '#ffffff', 'secondaryBorderColor': '#7A8D68', 'tertiaryColor': '#C2856E', 'tertiaryTextColor': '#ffffff', 'tertiaryBorderColor': '#A06A54', 'lineColor': '#B5A99A', 'textColor': '#4A4A4A', 'mainBkg': '#8B9DAF', 'nodeBorder': '#6E7F91', 'clusterBkg': 'rgba(139,157,175,0.12)', 'clusterBorder': '#B5A99A', 'edgeLabelBackground': 'transparent'}}}%%
flowchart TD
UI["<b>User Input</b>"]
SC["Slash Command Match<br><i>exact string match</i><br>/code-review, /simplify, /loop 5m"]
ID["Import Detection<br><i>AST/regex scan</i><br>import anthropic, @anthropic-ai/sdk"]
KP["Keyword Pattern<br><i>fuzzy match</i><br>debug+stuck, PDF+merge, React+UI"]
TC["Task Context Match<br><i>environment analysis</i><br>New project + no CLAUDE.md"]
MS["<b>Matched Skills</b><br>0 to N skills can match simultaneously"]
UI --> SC --> ID --> KP --> TC --> MS
style UI fill:#8B9DAF,color:#fff,stroke:#6E7F91
style SC fill:#9CAF88,color:#fff,stroke:#7A8D68
style ID fill:#C2856E,color:#fff,stroke:#A06A54
style KP fill:#B39EB5,color:#fff,stroke:#8E7A93
style TC fill:#C4A882,color:#fff,stroke:#A08562
style MS fill:#8E9B7A,color:#fff,stroke:#6E7B5A
How to read this diagram. Start at the top with User Input and follow the arrows downward through four sequential matching stages: Slash Command Match (exact string), Import Detection (AST/regex scan), Keyword Pattern (fuzzy match), and Task Context Match (environment analysis). Each stage can independently match zero or more skills. All matches accumulate into the “Matched Skills” node at the bottom, where zero to N skills may be active simultaneously.
Slash command triggers are the most explicit. When a user types /code-review, the skill system matches on the exact command string and activates the corresponding skill. This is direct, unambiguous, and user-initiated. Five of the 11 built-in skills use slash commands as their primary trigger: code-review, simplify, debugging, stuck, and loop.
Import detection triggers are the most clever. The claude-api skill activates when the code being worked on imports the Anthropic SDK – detected by scanning recent file reads for patterns like import anthropic, from anthropic, or @anthropic-ai/sdk. This is event-driven activation: the skill engages when the context demands it, not when the user explicitly requests it.
Keyword pattern triggers scan the user’s prompt for relevant terms. A prompt mentioning “React component” and “accessibility” might trigger frontend-design. A prompt mentioning “PDF” and “merge” triggers pdf. These are fuzzy matches – they do not require exact strings, but rather semantic relevance to the skill’s domain.
Task context triggers analyze the broader session state. When Claude Code detects it is working in a new repository with no CLAUDE.md file, init-claudemd activates. When the user asks about configuration, update-config activates. These triggers are the most sophisticated because they reason about the environment, not just the immediate prompt.
Multiple triggers can fire simultaneously. A user writing Anthropic SDK code in a React frontend could activate both claude-api and frontend-design at once, receiving both API reference material and UI design guidelines. The skill system does not enforce mutual exclusion – expertise composes.
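The composable matching described above can be sketched as a filter over the registry. Field names and the stage order are illustrative; the real SkillTool matching logic is ~400 LOC and also handles task-context triggers, which reason about session state rather than the prompt text.

```typescript
interface SkillTrigger {
  slashCommand?: string;     // exact match, e.g. "/code-review"
  keywords?: string[];       // fuzzy match against the user's prompt
  importPatterns?: RegExp[]; // scanned against recently read file contents
}

interface Skill {
  name: string;
  trigger: SkillTrigger;
}

// Return every skill whose trigger fires. Matches accumulate -- the system
// does not enforce mutual exclusion, so 0 to N skills can be active at once.
function matchSkills(
  skills: Skill[],
  userInput: string,
  recentFileContents: string[],
): Skill[] {
  const lowered = userInput.toLowerCase();
  return skills.filter(({ trigger }) => {
    // Stage 1: exact slash-command match (user-initiated).
    if (trigger.slashCommand && userInput.trimStart().startsWith(trigger.slashCommand)) {
      return true;
    }
    // Stage 2: import detection over recent file reads (event-driven).
    if (trigger.importPatterns?.some((re) => recentFileContents.some((c) => re.test(c)))) {
      return true;
    }
    // Stage 3: keyword pattern match against the prompt (fuzzy).
    if (trigger.keywords?.some((k) => lowered.includes(k.toLowerCase()))) {
      return true;
    }
    return false;
  });
}
```

A prompt about React UI written in a file that imports the Anthropic SDK would match two skills at once, mirroring the composition described above.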
System Prompt Injection: Compile-Time, Not Runtime
Skills inject their content at prompt assembly time – before the model sees its first token. This is a fundamental architectural choice that distinguishes skills from every other extension mechanism.
Think of it in compiler terms. The system prompt is assembled from fragments (see Part III.1: Prompt Assembly for the full pipeline). Skills inject at assembly time, meaning their content becomes part of the prompt the model reasons with from the start. This is compile-time injection: the skill modifies the program (the prompt) before it runs (the model generates).
Compare this to hooks, which fire during execution. A PreToolUse hook runs after the model has already decided to use a tool – it intercepts the action, not the reasoning. Skills operate earlier: they shape the reasoning itself.
%%{init: {'theme': 'neutral', 'flowchart': {'useMaxWidth': false, 'htmlLabels': true, 'padding': 20, 'nodeSpacing': 30, 'rankSpacing': 40}, 'themeVariables': {'primaryColor': '#8B9DAF', 'primaryTextColor': '#ffffff', 'primaryBorderColor': '#6E7F91', 'secondaryColor': '#9CAF88', 'secondaryTextColor': '#ffffff', 'secondaryBorderColor': '#7A8D68', 'tertiaryColor': '#C2856E', 'tertiaryTextColor': '#ffffff', 'tertiaryBorderColor': '#A06A54', 'lineColor': '#B5A99A', 'textColor': '#4A4A4A', 'mainBkg': '#8B9DAF', 'nodeBorder': '#6E7F91', 'clusterBkg': 'rgba(139,157,175,0.12)', 'clusterBorder': '#B5A99A', 'edgeLabelBackground': 'transparent'}}}%%
flowchart TD
S1["S1: Identity<br><i>You are Claude...</i>"]
S2["S2: Tool Policy<br><i>10 rules</i>"]
S3["S3: Safety<br><i>anti-patterns, limits</i>"]
S4["S4: Mode-specific instructions"]
S5["<b>S5: SKILL INJECTION</b><br>Matched skills inject here"]
D1["D1: Memory<br><i>CLAUDE.md</i>"]
D2["D2: Environment<br><i>CWD, git, OS</i>"]
D3["D3-D7: Language, output style,<br>MCP, scratchpad, tool descriptions"]
D8["D8-D9: System reminders,<br>conversation history"]
MODEL["<b>MODEL REASONING BEGINS</b><br>Skills already internalized"]
S1 --> S2 --> S3 --> S4 --> S5 --> D1 --> D2 --> D3 --> D8 --> MODEL
style S1 fill:#8B9DAF,color:#fff,stroke:#6E7F91
style S2 fill:#9CAF88,color:#fff,stroke:#7A8D68
style S3 fill:#C2856E,color:#fff,stroke:#A06A54
style S4 fill:#B39EB5,color:#fff,stroke:#8E7A93
style S5 fill:#C4A882,color:#fff,stroke:#A06A54,stroke-width:3px
style D1 fill:#8E9B7A,color:#fff,stroke:#6E7B5A
style D2 fill:#8B9DAF,color:#fff,stroke:#6E7F91
style D3 fill:#9CAF88,color:#fff,stroke:#7A8D68
style D8 fill:#C2856E,color:#fff,stroke:#A06A54
style MODEL fill:#B39EB5,color:#fff,stroke:#8E7A93
How to read this diagram. Follow the arrow chain from S1 (Identity) at the top through each assembly stage down to MODEL REASONING BEGINS at the bottom. The S5 node – SKILL INJECTION – is highlighted with a thick border, marking where matched skills insert their content. Notice that skills appear after the core personality (S1-S4) but before user-defined CLAUDE.md instructions (D1) and conversation history (D8-D9), meaning skills shape reasoning from the start but can be overridden by user instructions.
The injection position matters. Skills appear after the core personality but before dynamic context. This means:
- Skills can override default behaviors (they come after identity)
- Skills are overridden by user-specific CLAUDE.md instructions (which come after skills)
- Skills are cache-stable (they do not change between turns in the same session)
Point 3 has a direct economic benefit. Because skill content does not change between turns, it benefits from Anthropic’s prompt caching – the 90% input token savings described in Part III.1. This makes skills effectively “free” after the first turn: their tokens are cached, and subsequent turns pay only 10% of the cost.
Skill injection is the Dependency Injection pattern applied to knowledge. In a DI container, you do not hardcode which ILogger implementation a class uses – you inject it at construction time based on configuration. In Claude Code, you do not hardcode which domain expertise the model has – you inject it at prompt assembly time based on trigger conditions. The prompt is the constructor. The skill is the injected dependency. The model is the class that consumes it.
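The injection position can be illustrated with a toy assembly function. The fragment names below are hypothetical shorthand for the S1-S5 and D1-D2 regions; the real pipeline is described in Part III.1.

```typescript
// Hypothetical fragment names mirroring the assembly order shown in the
// diagram: identity through mode instructions (S1-S4), then skills (S5),
// then user memory and environment (D1-D2).
interface PromptFragments {
  identity: string;         // S1: "You are Claude..."
  toolPolicy: string;       // S2
  safety: string;           // S3
  modeInstructions: string; // S4
  claudeMd: string;         // D1: user memory -- can override skills
  environment: string;      // D2: CWD, git, OS
}

function assembleSystemPrompt(
  fragments: PromptFragments,
  activeSkillContent: string[], // already trigger-matched and budget-truncated
): string {
  return [
    fragments.identity,
    fragments.toolPolicy,
    fragments.safety,
    fragments.modeInstructions,
    ...activeSkillContent, // S5: after the core personality, before memory
    fragments.claudeMd,
    fragments.environment,
  ].join("\n\n");
}
```

Because the skill content is fixed for the session, the assembled prefix is byte-stable across turns, which is what makes it cacheable.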
The 1% Context Window Budget: Token Economics of Skills
Skills compete for the same scarce resource as everything else: the context window. Claude Code allocates approximately 1% of the context window as the truncation budget for skill content.
The math is concrete. Claude Sonnet 4.5’s context window is 200,000 tokens. 1% is 2,000 tokens – roughly 1,500 words. That is the total budget for ALL active skills on a given turn. If two skills are active and each wants 1,500 tokens, something has to give.
| Region | Tokens | Share |
|---|---|---|
| System Prompt (static) | ~15,000 | 7.5% |
| Tool Descriptions (core 14) | ~4,000 | 2.0% |
| CLAUDE.md / Memory | ~3,000 | 1.5% |
| SKILL CONTENT | ~2,000 | 1.0% |
| MCP Server Instructions | ~500 | 0.25% |
| System Reminders (volatile) | ~500 | 0.25% |
| Conversation History | ~175,000 | 87.5% |
The 1% skill budget must hold ALL active skills. If a skill exceeds the budget, it is truncated.
This budget constraint has design implications:
Skills must be concise. A skill that dumps 5,000 tokens of documentation into the prompt will be truncated, losing its most important content (which is often at the end). Effective skills are dense: high information per token, no filler, no redundancy with the system prompt.
The claude-api skill is the exception that proves the rule. Its 143 prompt fragments are not all loaded simultaneously. The skill selectively loads only the fragments relevant to the detected programming language. Working in Python? You get Python SDK fragments. Working in TypeScript? You get TypeScript fragments. This selective loading is what makes a 143-fragment skill fit within a 2,000-token budget.
Multiple active skills share the budget. If claude-api and frontend-design are both active, their combined content must fit within the budget. This creates an implicit priority system: skills loaded earlier consume budget that later skills cannot use.
The 1% budget is a deliberate constraint, not a technical limitation. Anthropic could allocate more. But every token given to skills is a token taken from conversation history – which is where the actual work happens. The budget forces skill authors to write tight, actionable instructions rather than encyclopedic documentation. This constraint produces better skills, the same way Twitter’s 140-character limit produced more concise communication.
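A shared, first-come-first-served budget of this kind might be enforced as below. This is a sketch under stated assumptions: the source does not show the actual truncation code, and the ~4 characters-per-token figure is the usual estimation heuristic, not an exact tokenizer.

```typescript
const CONTEXT_WINDOW = 200_000;
const SKILL_BUDGET = Math.floor(CONTEXT_WINDOW * 0.01); // 2,000 tokens for ALL skills

// Rough heuristic: ~4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Skills loaded earlier consume budget that later skills cannot use.
// A skill that overflows the remaining budget is truncated from the tail,
// and anything after the budget is exhausted is dropped entirely.
function fitSkillsToBudget(skillContents: string[], budget = SKILL_BUDGET): string[] {
  const fitted: string[] = [];
  let remaining = budget;
  for (const content of skillContents) {
    if (remaining <= 0) break; // budget exhausted: later skills are dropped
    const tokens = estimateTokens(content);
    if (tokens <= remaining) {
      fitted.push(content);
      remaining -= tokens;
    } else {
      fitted.push(content.slice(0, remaining * 4)); // truncate to what fits
      remaining = 0;
    }
  }
  return fitted;
}
```

Note the failure mode this sketch makes visible: tail truncation silently cuts whatever a verbose skill put last, which is why effective skills front-load their most important instructions.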
Skills Are Not Tools: A Fundamental Distinction
Skills do not appear in the tool registry. They cannot be called with tool_use blocks. They do not return results. They modify the model’s reasoning context, not its action space.
This confusion is common enough to warrant an explicit comparison. When people hear “extensibility,” they think plugins that add new functions. Skills are not that. They are closer to configuration than to code. A tool call says “do this thing.” A skill says “when doing things, reason this way.”
| Dimension | Skills | Tools | MCP | Hooks |
|---|---|---|---|---|
| What it is | Prompt content | Executable function | External service | Shell cmd interceptor |
| Modifies | System prompt | Tool registry | Tool registry + prompt | Execution pipeline |
| Called via | NOT called. Injected automatically | tool_use blocks from model output | tool_use blocks via mcp__* | Lifecycle events (auto) |
| Returns results? | Nothing. Shapes reasoning | tool_result message to conversation | tool_result via JSON-RPC | Exit code + stdout as reminder |
| When active | Prompt assembly (compile-time) | During tool execution (runtime) | During tool execution (runtime, cross-proc) | During execution (runtime) |
| Can block actions? | No | No | No | YES (exit 2) |
The temporal distinction is the key architectural difference:
- Skills operate at compile-time (prompt assembly). They shape how the model thinks.
- Tools operate at runtime (tool execution). They determine what the model does.
- MCP operates at runtime across process boundaries. It extends what the model can reach.
- Hooks operate at runtime as interceptors. They enforce what the model must not do.
This gives you a clean decision tree for extensibility:
Need to change HOW the model reasons? –> Skill
Need to give the model a NEW capability? –> Tool or MCP
Need to CONNECT to an external system? –> MCP
Need to ENFORCE a policy on actions? –> Hook
Need a SPECIALIZED persona? –> Custom Agent
The claude-api Skill: A Case Study in Curated Knowledge
The claude-api skill is Claude Code’s most data-intensive skill: 143 prompt fragments providing SDK reference documentation for 8 programming languages. It is a case study in the difference between training data and curated instructions.
The model’s training data includes Anthropic SDK documentation. So why inject it again as a skill? Three reasons:
Freshness. Training data has a cutoff. The SDK evolves. New endpoints, changed parameter names, deprecated methods – the skill carries the current version, not the version the model was trained on.
Accuracy. Training data is noisy. The model has seen correct usage, incorrect usage, outdated tutorials, and hallucinated blog posts. The skill provides a single authoritative reference that overrides the noise.
Specificity. Training data is broad. The skill provides language-specific patterns, not generic API concepts. The Python fragment shows client.messages.create(model="claude-sonnet-4-20250514") with real model IDs. The TypeScript fragment shows the equivalent with proper typing. Each fragment is tailored to the conventions of its language.
%%{init: {'theme': 'neutral', 'flowchart': {'useMaxWidth': false, 'htmlLabels': true, 'padding': 20, 'nodeSpacing': 30, 'rankSpacing': 40}, 'themeVariables': {'primaryColor': '#8B9DAF', 'primaryTextColor': '#ffffff', 'primaryBorderColor': '#6E7F91', 'secondaryColor': '#9CAF88', 'secondaryTextColor': '#ffffff', 'secondaryBorderColor': '#7A8D68', 'tertiaryColor': '#C2856E', 'tertiaryTextColor': '#ffffff', 'tertiaryBorderColor': '#A06A54', 'lineColor': '#B5A99A', 'textColor': '#4A4A4A', 'mainBkg': '#8B9DAF', 'nodeBorder': '#6E7F91', 'clusterBkg': 'rgba(139,157,175,0.12)', 'clusterBorder': '#B5A99A', 'edgeLabelBackground': 'transparent'}}}%%
flowchart TD
SKILL["<b>claude-api skill</b>"]
TRIGGER["Trigger: import detection<br><i>import anthropic, @anthropic-ai/sdk</i>"]
FRAGS["143 prompt fragments"]
SKILL --> TRIGGER
SKILL --> FRAGS
FRAGS --> PY["Python (18)"]
FRAGS --> TS["TypeScript (18)"]
FRAGS --> GO["Go (18)"]
FRAGS --> CS["C# (18)"]
FRAGS --> JAVA["Java (18)"]
FRAGS --> RUBY["Ruby (18)"]
FRAGS --> PHP["PHP (18)"]
FRAGS --> KT["Kotlin (17)"]
FRAGS --> GEN["Language-agnostic<br><i>model IDs, rate limits, caching</i>"]
GEN -.- NOTE["Only fragments matching the<br>detected language are loaded"]
style SKILL fill:#8B9DAF,color:#fff,stroke:#6E7F91
style TRIGGER fill:#9CAF88,color:#fff,stroke:#7A8D68
style FRAGS fill:#C2856E,color:#fff,stroke:#A06A54
style PY fill:#B39EB5,color:#fff,stroke:#8E7A93
style TS fill:#C4A882,color:#fff,stroke:#A08562
style GO fill:#8E9B7A,color:#fff,stroke:#6E7B5A
style CS fill:#8B9DAF,color:#fff,stroke:#6E7F91
style JAVA fill:#9CAF88,color:#fff,stroke:#7A8D68
style RUBY fill:#C2856E,color:#fff,stroke:#A06A54
style PHP fill:#B39EB5,color:#fff,stroke:#8E7A93
style KT fill:#C4A882,color:#fff,stroke:#A08562
style GEN fill:#8E9B7A,color:#fff,stroke:#6E7B5A
style NOTE fill:#8B9DAF,color:#fff,stroke:#6E7F91,stroke-dasharray:5 5
How to read this diagram. Start at the top with the claude-api skill node, which branches into its trigger mechanism (import detection) and its 143 prompt fragments. The fragments node fans out into 8 language-specific branches (Python, TypeScript, Go, etc.) plus a language-agnostic branch covering shared concepts like model IDs and rate limits. The dashed note at the bottom emphasizes the critical constraint: only fragments matching the detected working language are loaded at runtime, which is how a 143-fragment skill fits within the 2,000-token budget.
The selective loading mechanism is critical. Loading all 143 fragments would consume far more than the 1% skill budget. Instead, the skill detects which language the user is working in (from the import syntax, file extensions, and recent tool results) and loads only the relevant fragments. A developer writing Python gets ~18 Python-specific fragments plus a few language-agnostic ones. The rest are never loaded.
This is the same principle behind lazy evaluation in functional programming. Haskell does not evaluate an expression until its value is needed. The claude-api skill does not load a fragment until its language is needed. Both strategies conserve a scarce resource (Haskell: computation time and memory; skills: context window tokens) by deferring work until it is known to be necessary.
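Selective loading reduces to a filter keyed on the detected language. The fragment shape and the detector heuristics below are illustrative assumptions, not the actual claude-api implementation.

```typescript
interface ApiFragment {
  language: string; // "python", "typescript", ..., or "any" for shared material
  content: string;
}

// Load only the fragments for the detected working language, plus the
// language-agnostic ones (model IDs, rate limits, caching). The rest of
// the 143 fragments are never loaded, which is how the skill stays
// within its token budget.
function selectFragments(all: ApiFragment[], detectedLanguage: string): string[] {
  return all
    .filter((f) => f.language === detectedLanguage || f.language === "any")
    .map((f) => f.content);
}

// A crude detector over recent file contents, in the spirit of the
// import-based detection described above.
function detectLanguage(recentFileContents: string[]): string | null {
  const joined = recentFileContents.join("\n");
  if (/from anthropic import|import anthropic/.test(joined)) return "python";
  if (/@anthropic-ai\/sdk/.test(joined)) return "typescript";
  return null;
}
```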
The claude-api skill demonstrates a general principle: curated instructions beat training data for task-specific accuracy. The model’s training data is vast but noisy. A skill’s instructions are small but precise. When the two conflict, the skill wins because it appears in the system prompt, which has higher influence weight than conversation history or training data. Skills are not redundant with training – they are corrections to it.
Custom Skill Development: Building Your Own Expertise Modules
Creating a custom skill requires a Markdown file in .claude/skills/ with trigger conditions and domain instructions. No code, no compilation, no registration API.
Here is a practical example. Suppose your team has a specific coding convention: all database queries must use parameterized statements, all error responses must follow a consistent shape, and all API endpoints must include rate limiting. You can encode this as a skill:
````markdown
# .claude/skills/team-api-standards.md
---
name: team-api-standards
trigger:
  keywords: ["API", "endpoint", "route", "handler", "controller"]
---

## Team API Standards

When writing API code for this project, follow these conventions:

### Database Queries
- ALWAYS use parameterized queries. Never interpolate values into SQL strings.
- Use the query builder from `src/db/query.ts`, not raw SQL.

### Error Responses
Return consistent error shapes:

```json
{ "error": { "code": "VALIDATION_ERROR", "message": "...", "details": [...] } }
```

### Rate Limiting
- Every public endpoint must use the `@rateLimit` decorator.
- Default: 100 requests/minute for authenticated, 20 for anonymous.
- Document the limit in the endpoint's JSDoc comment.

### Testing
- Every endpoint needs at least one happy-path and one error-path integration test.
- Use `supertest` for HTTP assertions.
- Mock external services with `msw`.
````
This skill activates whenever the user’s prompt mentions API-related keywords. It does not add any tools – the model already knows how to write code. It shapes *how* the model writes code for your specific project, ensuring consistent adherence to team conventions.
Best practices for custom skills:
**Keep it under 500 tokens.** Remember the 1% budget. A skill that consumes the entire budget crowds out other active skills. Aim for density: every sentence should change the model’s behavior.
**Use imperative language.** “ALWAYS use parameterized queries” is more effective than “It’s generally a good idea to consider using parameterized queries.” The model responds to clear directives.
**Be specific to your project.** Generic advice ("write clean code") is already in the model's training data. Skill content should encode project-specific decisions that the model cannot infer: your query builder's API, your error shape, your rate limit defaults.
**Test trigger conditions.** If your trigger keywords are too broad ("code", "fix", "write"), the skill will activate on nearly every prompt, consuming budget unnecessarily. If they are too narrow ("PostgreSQL connection pool configuration"), the skill will rarely fire. Find the middle ground.
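One way to find that middle ground is to measure trigger hit rate over a corpus of representative prompts from your team's actual usage. This harness is an illustrative suggestion, not part of Claude Code:

```typescript
// Fraction of sample prompts that would activate a skill with the given
// trigger keywords (case-insensitive substring match, as a rough proxy
// for the fuzzy matcher). Near 1.0 means the trigger is too broad and
// will waste budget; near 0.0 means the skill will rarely fire.
function triggerHitRate(keywords: string[], samplePrompts: string[]): number {
  const hits = samplePrompts.filter((prompt) =>
    keywords.some((k) => prompt.toLowerCase().includes(k.toLowerCase())),
  ).length;
  return hits / samplePrompts.length;
}
```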
---
## Skills and the Extensibility Stack: Composition Patterns
**Skills compose with hooks, MCP, and custom agents to form a complete extensibility architecture. Each mechanism operates at a different layer, and they complement rather than compete.**
Consider a full-stack development workflow:
```{mermaid}
%%| label: fig-composition
%%| fig-cap: "Extensibility composition across four layers in a full-stack workflow. Layer 1 (Skills) injects behavioral guidance such as semantic HTML rules and parameterized-query mandates. Layer 2 (Tools + MCP) provides executable capabilities including built-in file operations and external Postgres/GitHub integrations. Layer 3 (Hooks) enforces policies via PreToolUse and PostToolUse interceptors. Layer 4 (Custom Agents) adds isolated personas like a security-reviewer or API doc-writer. Data flows downward: skills shape how the model reasons, tools define what it can do, hooks enforce what it must not do, and agents provide specialized execution contexts."
%%{init: {'theme': 'neutral', 'flowchart': {'useMaxWidth': false, 'htmlLabels': true, 'padding': 20, 'nodeSpacing': 30, 'rankSpacing': 40}, 'themeVariables': {'primaryColor': '#8B9DAF', 'primaryTextColor': '#ffffff', 'primaryBorderColor': '#6E7F91', 'secondaryColor': '#9CAF88', 'secondaryTextColor': '#ffffff', 'secondaryBorderColor': '#7A8D68', 'tertiaryColor': '#C2856E', 'tertiaryTextColor': '#ffffff', 'tertiaryBorderColor': '#A06A54', 'lineColor': '#B5A99A', 'textColor': '#4A4A4A', 'mainBkg': '#8B9DAF', 'nodeBorder': '#6E7F91', 'clusterBkg': 'rgba(139,157,175,0.12)', 'clusterBorder': '#B5A99A', 'edgeLabelBackground': 'transparent'}}}%%
flowchart TD
subgraph L1["LAYER 1: SKILLS (shape reasoning)"]
FD["frontend-design: semantic HTML, ARIA labels"]
TA["team-api-standards: parameterized queries only"]
end
subgraph L2["LAYER 2: TOOLS + MCP (enable actions)"]
BI["Built-in: Read, Write, Edit, Bash, Grep"]
PG["MCP: postgres query"]
GH["MCP: github create pr"]
end
subgraph L3["LAYER 3: HOOKS (enforce policies)"]
H1["PreToolUse on Bash: block rm -rf / patterns"]
H2["PostToolUse on Write: run prettier"]
H3["PostToolUse on Edit: run eslint"]
end
subgraph L4["LAYER 4: CUSTOM AGENTS (specialized personas)"]
SR["security-reviewer: read-only, checks for vulns"]
DW["api-doc-writer: Write access, generates OpenAPI"]
end
L1 -- "Skills shape HOW the model reasons" --> L2
L2 -- "Tools + MCP define WHAT the model can do" --> L3
L3 -- "Hooks enforce WHAT the model must not do" --> L4
style FD fill:#8B9DAF,color:#fff,stroke:#6E7F91
style TA fill:#9CAF88,color:#fff,stroke:#7A8D68
style BI fill:#C2856E,color:#fff,stroke:#A06A54
style PG fill:#B39EB5,color:#fff,stroke:#8E7A93
style GH fill:#C4A882,color:#fff,stroke:#A08562
style H1 fill:#8E9B7A,color:#fff,stroke:#6E7B5A
style H2 fill:#8B9DAF,color:#fff,stroke:#6E7F91
style H3 fill:#9CAF88,color:#fff,stroke:#7A8D68
style SR fill:#C2856E,color:#fff,stroke:#A06A54
style DW fill:#B39EB5,color:#fff,stroke:#8E7A93
```

**How to read this diagram.** Four horizontal layers stack from top to bottom, each representing a different extension mechanism. Layer 1 (Skills) contains behavioral guidance like semantic HTML rules. Layer 2 (Tools + MCP) lists executable capabilities. Layer 3 (Hooks) shows enforcement interceptors. Layer 4 (Custom Agents) defines specialized personas. Arrows between layers indicate the direction of influence: skills shape reasoning, which flows into tool execution, which is gated by hooks, which feeds into agent-specific review contexts.
The composition works because each layer is independent:
- Skills tell the model to use parameterized queries. They cannot enforce it.
- Hooks enforce that no raw SQL reaches the database. They do not know why parameterized queries matter.
- MCP provides the database connection. It does not care about query style.
- Custom agents apply specialized review. They inherit the skill context of their parent.
This layered independence is the separation of concerns principle applied to agent extensibility. No single mechanism tries to do everything. Skills are pure knowledge injection. Hooks are pure policy enforcement. MCP is pure capability extension. The composition produces behavior that no single mechanism could achieve alone.
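To sketch the Layer 1/Layer 3 split: the skill *tells* the model to parameterize queries, while a `PreToolUse` hook *enforces* it. The stdin-JSON payload and the exit-code-2 blocking convention follow Claude Code's hook protocol, but the regex heuristic and field access below are illustrative assumptions:

```python
import re
import sys

# Heuristic: SQL keywords inside an f-string are a sign of string-built SQL.
RAW_SQL = re.compile(r'f["\']\s*(SELECT|INSERT|UPDATE|DELETE)\b', re.IGNORECASE)

def check(payload: dict) -> int:
    """Return the hook's exit code: 0 allows the tool call, 2 blocks it."""
    command = payload.get("tool_input", {}).get("command", "")
    if RAW_SQL.search(command):
        # In Claude Code hooks, exit code 2 blocks the call and the stderr
        # message is fed back to the model so it can self-correct.
        print("Blocked: use parameterized queries, not string-built SQL.",
              file=sys.stderr)
        return 2
    return 0

# A real hook script would be wired up as:
#   sys.exit(check(json.load(sys.stdin)))   # with `import json`
```

Note the hook knows nothing about *why* parameterized queries matter; the skill carries that rationale, and the hook carries only the mechanical check.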
The decision tree for choosing the right extension point:
- "I want the model to follow our coding standards" → Skill (modify reasoning)
- "I want every file write to be auto-formatted" → Hook (PostToolUse on Write)
- "I want the model to query our internal database" → MCP server (add capability)
- "I want a security-focused review of my PR" → Custom agent (isolated persona)
- "I want to block deployment on Fridays" → Hook (PreToolUse on Bash, check day-of-week)
- "I want the model to know our deployment process" → Skill (inject process documentation)
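The Friday case is worth sketching because it shows how cheaply a hook encodes a policy that a skill could only suggest. This is a hypothetical predicate for such a PreToolUse check; the deploy-command markers are assumptions:

```python
import datetime

# Commands that look like a deployment (illustrative, not exhaustive).
DEPLOY_MARKERS = ("deploy", "kubectl apply", "terraform apply")

def should_block(command: str, today: datetime.date) -> bool:
    """PreToolUse check on Bash: block deploy-like commands on Fridays."""
    is_friday = today.weekday() == 4  # Monday is 0, Friday is 4
    looks_like_deploy = any(marker in command for marker in DEPLOY_MARKERS)
    return is_friday and looks_like_deploy
```

The same predicate written as a skill instruction ("avoid deploying on Fridays") would be advisory only; as a hook it is unconditional.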
---
## Design Patterns in the Skills System
Three classical design patterns converge in the skills architecture: Template Method, Dependency Injection, and Separation of Concerns.
### Template Method: Base Behavior + Skill Customizations
The Template Method pattern defines the skeleton of an algorithm in a base class, letting subclasses override specific steps. Claude Code’s base system prompt defines the agent’s general behavior: be helpful, use tools effectively, follow safety rules. Skills “override” specific aspects by injecting domain-specific instructions that customize behavior for particular contexts.
The model’s general behavior is the template method. Each skill is a hook method that customizes one aspect. Without skills, the reasoning step uses general reasoning. With frontend-design, it prioritizes accessibility and responsive design. With debugging, it follows reproduce-hypothesize-test-fix. With code-review, it checks bugs, security, performance, and maintainability in sequence.
The base behavior is never modified. Skills overlay additional constraints that specialize the general algorithm for a specific domain.
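The overlay mechanism above can be sketched in a few lines. The base prompt is the invariant skeleton; each active skill "overrides" one step by appending its instructions. All names here are illustrative, not Claude Code internals:

```python
BASE_PROMPT = "You are a coding agent. Be helpful, use tools, follow safety rules."

class Skill:
    def __init__(self, name: str, instructions: str):
        self.name = name
        self.instructions = instructions

def assemble_prompt(base: str, active_skills: list[Skill]) -> str:
    """Template method: fixed skeleton, skill-specific customization steps."""
    sections = [base]
    for skill in active_skills:
        # Each skill overlays constraints; the base text is never modified.
        sections.append(f"## Skill: {skill.name}\n{skill.instructions}")
    return "\n\n".join(sections)

frontend = Skill("frontend-design", "Prefer semantic HTML; add ARIA labels.")
debugging = Skill("debugging", "Reproduce, hypothesize, test, then fix.")

prompt = assemble_prompt(BASE_PROMPT, [frontend, debugging])
```

With no skills active, `assemble_prompt` returns the base unchanged; with skills active, the skeleton is intact and only the overlays vary.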
### Dependency Injection of Knowledge
In a traditional DI container, you inject services at construction time: `new OrderController(paymentService, inventoryService)`. The controller does not know which payment implementation it uses; it depends on the abstraction, and the container provides the concrete implementation.
Skills inject knowledge at prompt construction time. The agent does not know which coding standards it follows until the prompt is assembled and the relevant skill is injected. The skill registry is the DI container. The trigger conditions are the binding configuration. The assembled prompt is the fully-constructed object with all dependencies resolved.
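Under that analogy, a minimal registry might look like this. Everything here (class names, trigger matching, skill content) is a hypothetical sketch of the container/binding/resolution roles, not the actual implementation:

```python
from dataclasses import dataclass, field

@dataclass
class SkillBinding:
    name: str
    triggers: list[str]   # the binding configuration
    instructions: str     # the dependency being injected

@dataclass
class SkillRegistry:
    """The DI container: holds bindings, resolves them per prompt."""
    bindings: list[SkillBinding] = field(default_factory=list)

    def register(self, binding: SkillBinding) -> None:
        self.bindings.append(binding)

    def resolve(self, user_prompt: str) -> list[SkillBinding]:
        """Bind concrete expertise to the abstract skill slot by trigger."""
        lowered = user_prompt.lower()
        return [b for b in self.bindings
                if any(t in lowered for t in b.triggers)]

registry = SkillRegistry()
registry.register(SkillBinding("claude-api", ["anthropic sdk", "claude api"],
                               "Use the current Messages API."))
registry.register(SkillBinding("debugging", ["bug", "stack trace"],
                               "Reproduce before you patch."))

resolved = registry.resolve("Why does this Anthropic SDK call raise a bug?")
```

As with a DI container, the agent never names its dependencies; the registry decides, per prompt, which expertise gets wired in.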
### Separation of Concerns: Capabilities vs. Behavior
The cleanest design pattern in the skills system is the separation of what the agent can do (tools) from how it should behave (skills). This separation means:
- Adding a new tool does not change the agent’s coding style
- Adding a new skill does not change the agent’s capability set
- Changing a skill’s trigger does not affect other skills
- Each concern evolves independently
This is the same principle that separates CSS from HTML, business logic from data access, and policy from mechanism in operating systems.
The skills system as a whole implements the Strategy pattern at the knowledge level. Traditional Strategy lets you swap algorithms at runtime. Skills let you swap domain expertise at prompt assembly time. The context (the agent) has a placeholder for expertise. The concrete strategy (the skill) fills it based on trigger conditions. The model’s reasoning is parameterized by the injected knowledge.
---
## Summary
The skills system reveals several principles that generalize well beyond Claude Code:
**Skills decouple expertise from capabilities.** The model’s capability set (tools) and its behavioral expertise (skills) evolve independently. Adding a new tool does not require updating any skill. Adding a new skill does not touch the tool registry. This separation makes the system composable: any skill works with any tool combination. In traditional software, this is the interface segregation principle – clients should not depend on interfaces they do not use.
**The system prompt is a configuration surface, not a static string.** Skills, MCP instructions, CLAUDE.md content, and system reminders all inject into the system prompt. It is assembled at runtime from fragments selected by context. This transforms Claude Code from a fixed-behavior agent into a configurable platform – the same binary serves different workflows by assembling different prompts.
**Compile-time injection beats runtime correction.** Skills inject before reasoning begins, shaping the model’s approach from its first token. This is cheaper and more effective than trying to correct the model’s behavior after it has already started down the wrong path. The same principle applies in traditional software: it is cheaper to prevent bugs at compile time (type systems, static analysis) than to detect them at runtime (testing, monitoring).
**Token economics are the physics of LLM engineering.** The 1% skill budget is not arbitrary – it reflects a real constraint. Every token allocated to skills is a token taken from conversation history. The budget forces conciseness, which produces better skills. Constraint breeds creativity; scarcity breeds efficiency.
**Curated instructions beat training data for task-specific work.** The claude-api skill exists even though the model was trained on Anthropic SDK documentation. The skill wins because it is authoritative (single source of truth), current (updated with the SDK), and prominent (injected into the system prompt, which has higher influence weight than training data). When accuracy matters, curate.
**Skills transform a general agent into a specialist without forking.** A single Claude Code binary serves React developers, backend engineers, data scientists, and DevOps teams. Skills are the mechanism: each specialty loads its own expertise modules. This is the plugin architecture pattern applied to cognition – the same pattern that lets VS Code serve every programming language through extensions.
This post extends the Inside Claude Code series with a deep dive into the skills system that was surveyed in Part III.4: Hooks & Lifecycle. For the prompt assembly pipeline that skills inject into, see Part III.1: Prompt Assembly. For system reminders – the runtime counterpart to compile-time skills – see Part III.2: Context Compaction.