1. Introduction
The Claude Code Skill ecosystem is expanding rapidly. As of March 2026, the anthropics/skills repository has reached over 87,000 stars on GitHub, and more people are building and sharing Skills every week.
How do you build a Skill from scratch in a structured way? This article walks through designing, building, and distributing a Skill step by step. I’ll use my own experience shipping an e-commerce review Skill (Link) as a running example throughout.
2. What Is a Claude Skill?
A Claude Skill is a set of instructions that teaches Claude how to handle specific tasks or workflows. Skills are one of the most powerful ways to customize Claude for your specific needs.
Skills are built around progressive disclosure. Claude fetches information in three stages:
- Metadata (name + description): always in Claude’s context. About 100 tokens. Claude decides whether to load a Skill based on this alone.
- SKILL.md body: Loaded only when triggered.
- Bundled resources (scripts/, references/, assets/): Loaded on demand when needed.
With this structure, you can install many Skills without blowing up the context window. If you keep copy-pasting the same long prompt, just turn it into a Skill.
3. Skills vs MCP vs Subagents
Before building a Skill, let me walk you through how Skills, MCP, and Subagents differ, so you can make sure a Skill is the right choice.
- Skills teach Claude how to behave — review workflows, coding standards, brand guidelines.
- MCP servers give Claude new tools — sending a Slack message, querying a database.
- Subagents let Claude run independent work in a separate context.
An analogy that helped me: MCP is the kitchen — knives, pots, ingredients. A Skill is the recipe that tells you how to use them. You can combine them. Sentry’s code review Skill, for instance, defines the PR review workflow in a Skill and fetches error data via MCP. But in many cases a Skill alone is enough to start.
4. Planning and Design
The first time, I jumped straight into writing SKILL.md and ran into problems. If the description is not well designed, the Skill will not even trigger. Spend time on design before you write the prompts or code.
4a. Start with Use Cases
The first thing to do is define 2–3 concrete use cases. Not “a useful Skill” in the abstract, but actual repetitive work you observe in practice.
Let me share my own example. I noticed that many colleagues and I were repeating the same monthly and quarterly business reviews. In e-commerce and retail, the process of breaking down KPIs tends to follow a similar pattern.
That was the starting point. Instead of building a generic “data analysis Skill,” I defined it like this: “A Skill that takes order CSV data, decomposes KPIs into a tree, summarizes findings with priorities, and generates a concrete action plan.”
Here, it is important to imagine how users will actually phrase their requests:
- “run a review of my store using this orders.csv”
- “analyze last 90 days of sales data, break down why revenue dropped”
- “compare Q3 vs Q4, find the top 3 things I should fix”
When you write concrete prompts like these first, the shape of the Skill becomes clear. The input is CSV. The analysis axis is KPI decomposition. The output is a review report and an action plan. The user is not a data scientist — they are someone running a business, and they want to know what to do next.
That level of detail shapes everything else: Skill name, description, file formats, output format.
Questions to ask when defining use cases:
- Who will use it?
- In what situation?
- How will they phrase their request?
- What’s the input?
- What’s the expected output?
4b. YAML Frontmatter
Once use cases are clear, write the name and description. They decide whether your Skill actually triggers.
As mentioned earlier, Claude sees only the metadata when deciding which Skill to load. When a user request comes in, Claude picks Skills based on this metadata alone. If the description is vague, Claude will never reach the Skill — no matter how good the instructions in the body are.
To make things trickier, Claude tends to handle simple tasks on its own without consulting Skills. It defaults to not triggering. So your description needs to be specific enough that Claude recognizes “this is a job for the Skill, not for me.”
In short, the description needs to be somewhat “pushy.” Here’s what I mean.
```yaml
# Bad — too vague. Claude doesn't know when to trigger.
name: data-helper
description: Helps with data tasks
```

```yaml
# Good — specific trigger conditions, slightly "pushy"
name: sales-data-analyzer
description: >
  Analyze sales/revenue CSV and Excel files to find patterns,
  calculate metrics, and create visualizations. Use when user
  mentions sales data, revenue analysis, profit margins, churn,
  ad spend, or asks to find patterns in business metrics.
  Also trigger when user uploads xlsx/csv with financial or
  transactional column headers.
```
The most important thing is being explicit about what the Skill does and what input it expects — “Analyze sales/revenue CSV and Excel files” leaves no ambiguity. After that, list the trigger keywords. Go back to the use case prompts you wrote in 4a and pull out the words users actually say: sales data, revenue analysis, profit margins, churn. Finally, think about the cases where the user doesn’t mention your Skill by name. “Also trigger when user uploads xlsx/csv with financial or transactional column headers” catches those silent matches.
The constraints: name up to 64 characters, description up to 1,024 characters (per the Agent Skills API spec). You have room, but prioritize information that directly affects triggering.
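These limits are easy to check mechanically before you ship. Below is a minimal, stdlib-only sketch; it deliberately avoids a YAML library and only understands simple `key: value` pairs plus a `>` block scalar for the description, which is all the examples above use:

```python
import re

NAME_MAX = 64          # per the Agent Skills API spec
DESC_MAX = 1024

def check_frontmatter(skill_md: str) -> list[str]:
    """Return a list of problems with a SKILL.md's name/description limits."""
    problems = []
    m = re.match(r"---\n(.*?)\n---", skill_md, re.DOTALL)
    if not m:
        return ["no YAML frontmatter block found"]

    name, desc_parts, in_desc = None, [], False
    for line in m.group(1).splitlines():
        if line.startswith("name:"):
            name, in_desc = line.split(":", 1)[1].strip(), False
        elif line.startswith("description:"):
            rest = line.split(":", 1)[1].strip()
            desc_parts = [rest] if rest not in ("", ">", "|") else []
            in_desc = True
        elif in_desc and line.startswith((" ", "\t")):
            desc_parts.append(line.strip())   # folded continuation line
        else:
            in_desc = False

    description = " ".join(desc_parts)
    if not name:
        problems.append("missing name")
    elif len(name) > NAME_MAX:
        problems.append(f"name is {len(name)} chars (max {NAME_MAX})")
    if not description:
        problems.append("missing description")
    elif len(description) > DESC_MAX:
        problems.append(f"description is {len(description)} chars (max {DESC_MAX})")
    return problems
```

Run it over every SKILL.md in your repo before packaging; it catches the two limit violations that silently hurt triggering.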
5. Implementation Patterns
Once the design is set, it’s time to implement. First, understand the file structure, then pick the right pattern.
5a. File Structure
The physical structure of a Skill is straightforward:
```
my-skill/
├── SKILL.md              # Required. YAML frontmatter + Markdown instructions
├── scripts/              # Optional. Python/JS for deterministic processing
│   ├── analyzer.py
│   └── validator.js
├── references/           # Optional. Loaded by Claude as needed
│   ├── advanced-config.md
│   └── error-patterns.md
└── assets/               # Optional. Templates, fonts, icons, etc.
    └── report-template.docx
```
Only SKILL.md is required. That alone makes a working Skill. Try to keep SKILL.md under 500 lines. If it gets longer, move content into the references/ directory and tell Claude in SKILL.md where to look. Claude will not read reference files unless you point it there.
For Skills that branch by domain, the variant approach works well:
```
cloud-deploy/
├── SKILL.md              # Shared workflow + selection logic
└── references/
    ├── aws.md
    ├── gcp.md
    └── azure.md
```
Claude reads only the relevant reference file based on the user’s context.
5b. Pattern A: Prompt-Only
The simplest pattern. Just Markdown instructions in SKILL.md, no scripts.
Good for: brand guidelines, coding standards, review checklists, commit message formatting, writing style enforcement.
When to use: if Claude’s language ability and judgment are enough for the task, use this pattern.
Here’s a compact example:
```markdown
---
name: commit-message-formatter
description: >
  Format git commit messages using Conventional Commits.
  Use when user mentions commit, git message, or asks to
  format/write a commit message.
---

# Commit Message Formatter

Format all commit messages following Conventional Commits 1.0.0.

## Format

<type>(<scope>): <description>

## Rules

- Imperative mood, lowercase, no period, max 72 chars
- Breaking changes: add `!` after type/scope

## Example

Input: "added user auth with JWT"
Output: `feat(auth): implement JWT-based authentication`
```
That’s it. No scripts, no dependencies. If Claude’s judgment is enough for the task, this is all you need.
5c. Pattern B: Prompt + Scripts
Markdown instructions plus executable code in the scripts/ directory.
Good for: data transformation/validation, PDF/Excel/image processing, template-based document generation, numerical reports.
Supported languages: Python and JavaScript/Node.js. Here is an example structure:
```
data-analysis-skill/
├── SKILL.md
└── scripts/
    ├── analyze.py           # Main analysis logic
    └── validate_schema.js   # Input data validation
```
In SKILL.md, you specify when to call each script:
```markdown
## Workflow

1. User uploads a CSV or Excel file
2. Run `scripts/validate_schema.js` to check column structure
3. If validation passes, run `scripts/analyze.py` with the file path
4. Present results with visualizations
5. If validation fails, ask user to clarify column mapping
```
The SKILL.md defines the “when and why.” The scripts handle the “how.”
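For concreteness, here is a hedged sketch of what an `analyze.py` for the sales example might contain. The column names (`order_date`, `revenue`) and the monthly breakdown are my illustrative assumptions, not anything the Skill format requires — Claude simply runs whatever you put in scripts/:

```python
import csv
import sys
from collections import defaultdict

REQUIRED = {"order_date", "revenue"}   # assumed schema — adjust to your data

def analyze(path: str) -> dict[str, float]:
    """Sum revenue per month from an orders CSV (the deterministic 'how')."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        missing = REQUIRED - set(reader.fieldnames or [])
        if missing:
            # Fail loudly so the workflow's "ask user to clarify column
            # mapping" branch can kick in.
            raise ValueError(f"missing columns: {sorted(missing)}")
        totals: dict[str, float] = defaultdict(float)
        for row in reader:
            month = row["order_date"][:7]        # "2026-03-15" -> "2026-03"
            totals[month] += float(row["revenue"])
    return dict(totals)

if __name__ == "__main__":
    for month, total in sorted(analyze(sys.argv[1]).items()):
        print(f"{month}\t{total:.2f}")
```

Keeping the arithmetic in a script like this, rather than in prose instructions, is the whole point of Pattern B: the numbers come out the same on every run.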
5d. Pattern C: Skill + MCP / Subagent
This pattern calls MCP servers or Subagents from within the Skill’s workflow. Good for workflows involving external services — think create issue → create branch → fix code → open PR. More moving parts mean more things to debug, so I’d recommend getting comfortable with Pattern A or B first.
Selecting the Right Pattern
If you are not sure which pattern to pick, go through these in order:
- Need real-time communication with external APIs? → Yes → Pattern C
- Need deterministic processing like calculations, validation, or file conversion? → Yes → Pattern B
- Claude’s language ability and judgment handle it alone? → Yes → Pattern A
When in doubt, start with Pattern A. It is easy to add scripts later and evolve into Pattern B. But simplifying an overly complex Skill is harder.
6. Testing
Writing the SKILL.md is not the end. What makes a Skill good is how much you test and iterate.
6a. Writing Test Prompts
“Testing” here doesn’t mean unit tests. It means throwing real prompts at the Skill and checking whether it behaves appropriately.
The one rule for test prompts: write them the way real users actually talk.
```
# Good test prompt (realistic)
"okay so my boss just sent me this XLSX file (its in my downloads,
called something like 'Q4 sales final FINAL v2.xlsx') and she wants
me to add a column that shows the profit margin as a percentage.
The revenue is in column C and costs are in column D i think"

# Bad test prompt (too clean)
"Please analyze the sales data in the uploaded Excel file
and add a profit margin column"
```
The issue with clean test prompts is that they don’t reflect reality. Real users make typos, use casual abbreviations, and forget file names. A Skill tested only with clean prompts will break in unexpected ways in production.
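One cheap way to get messy variants is to derive them from the clean prompts you already have. A toy sketch — the substitution list is purely my own heuristic, so tune it to how your users actually write:

```python
CASUALIZE = {   # my own substitution list — nothing official about it
    "please ": "",
    "the uploaded ": "this ",
    "analyze": "take a look at",
    "do not": "dont",
}

def roughen(prompt: str) -> str:
    """Derive a messier, more realistic variant from a clean test prompt."""
    text = prompt[0].lower() + prompt[1:]      # real users rarely capitalize
    for formal, casual in CASUALIZE.items():
        text = text.replace(formal, casual)
    return "okay so " + text                   # casual opener users often lead with

clean = "Please analyze the sales data in the uploaded Excel file"
print(roughen(clean))
```

Keep both versions in your test set — the clean one alone tells you nothing about production behavior.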
6b. The Iteration Loop
The basic testing loop is simple:
1. Run the Skill with your test prompts
2. Evaluate whether the output matches what you defined as good output in 4a
3. Fix the SKILL.md if needed
4. Go back to step 1
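If you want to script step 1, Claude Code’s headless print mode can drive it. A sketch — I’m assuming the `claude` CLI is on your PATH and that `-p` (print mode: respond once and exit) behaves as documented for your version:

```python
import subprocess

# Test prompts from section 4a — replace with your own.
TEST_PROMPTS = [
    "run a review of my store using this orders.csv",
    "analyze last 90 days of sales data, break down why revenue dropped",
]

def build_command(prompt: str) -> list[str]:
    """One headless Claude Code invocation for a single test prompt."""
    return ["claude", "-p", prompt]

def collect_outputs() -> list[str]:
    """Automates step 1 of the loop; evaluating and fixing stay manual."""
    return [
        subprocess.run(build_command(p), capture_output=True, text=True).stdout
        for p in TEST_PROMPTS
    ]
```

Dump the collected outputs to files and diff them between SKILL.md revisions — regressions jump out quickly.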
You can run this loop manually, but Anthropic’s skill-creator helps a lot. It semi-automates test case generation, execution, and review. It uses a train/test split for evaluation and lets you review outputs in an HTML viewer.
6c. Optimizing the Description
As you test, you may find the Skill works well when triggered but doesn’t trigger often enough. The skill-creator has a built-in optimization loop for this: it splits test cases 60/40 into train/test, measures trigger rate, generates improved descriptions, and picks the best one by test score.
One thing I learned: Claude rarely triggers Skills for short, simple requests. So make sure your test set includes prompts with enough complexity.
7. Distribution
Once your Skill is ready, you need to get it to users. The best method depends on whether it’s just for you, your team, or everyone.
Getting Your Skill to Users
For most people, two methods cover everything:
ZIP upload (claude.ai): ZIP the Skill folder and upload via Settings > Customize > Skills. One gotcha — the ZIP must contain the folder itself at the root, not just the contents.
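The folder-at-root gotcha is easy to get wrong with a GUI zip tool. A small sketch that gets it right by always prefixing every entry with the folder name (the paths are illustrative):

```python
import zipfile
from pathlib import Path

def zip_skill(skill_dir: str, out_path: str) -> list[str]:
    """ZIP a Skill so the folder itself sits at the archive root."""
    root = Path(skill_dir)
    names = []
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for file in sorted(root.rglob("*")):
            if file.is_file():
                # Keep the top-level folder in the arcname:
                # "my-skill/SKILL.md", not a bare "SKILL.md" at the root.
                arcname = root.name + "/" + file.relative_to(root).as_posix()
                zf.write(file, arcname)
                names.append(arcname)
    return names
```

Checking the returned names (every one should start with the folder name and a slash) is a quick pre-upload sanity test.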
.claude/skills/ directory (Claude Code): place the Skill in your project repo under .claude/skills/. When teammates clone the repo, everyone gets the same Skill.
Beyond these, there are more options as your distribution needs grow: the Plugin Marketplace for open-source distribution, the Anthropic Official Marketplace for broader reach, Vercel’s npx skills add for cross-agent installs, and the Skills API for programmatic management. I won’t go into detail on each here — the docs cover them well.
Before sharing, check three things: the ZIP has the folder at the root (not just the contents), the frontmatter has both name and description within the character limits, and there are no hardcoded API keys.
And one more thing — bump the version field when you update; auto-update won’t kick in otherwise. Treat user feedback like “it didn’t trigger on this prompt” as new test cases. The iteration loop from Section 6 doesn’t stop at launch.
Conclusion
A Skill is a reusable prompt with structure. You package what you know about a domain into something others can install and run.
The flow: decide whether you need a Skill, MCP, or a Subagent. Design from use cases and write a description that actually triggers. Pick the simplest pattern that works. Test with messy, realistic prompts. Ship it and keep iterating.
Skills are still new and there’s plenty of room. If you keep doing the same analysis, the same review, the same formatting work over and over, that repetition is your Skill waiting to be built.
If you have questions or want to share what you built, find me on LinkedIn.
