Agent Skills: Teaching AI Agents How to Actually Work

This is an AI-generated summary. The source video may include demos, visuals and additional context.
In Brief
AI agents are surprisingly good at knowing things. Ask one about Kubernetes architecture or the history of SQL and it answers fluently. But ask it to run your company's 47-step workflow for generating a compliant financial report and it either needs someone to spell out every single step (every time) or it just guesses.
The missing piece is procedural knowledge: not facts about the world, but how to do specific work in a specific order. Agent skills are how you close that gap.
What a Skill Actually Is
A skill is, as IBM's Martin Keen puts it, "almost comically simple": a folder containing a single text file called SKILL.md.
That markdown file has two parts. At the top sits a small block of YAML metadata (key information written as name: value pairs) with two mandatory fields: name (what the skill is called) and description (what the skill does and when the agent should use it). The description is the trigger. It tells the agent whether this skill applies to whatever task is in front of it.
Below the metadata lives the body: plain markdown with step-by-step instructions, rules, and examples of input and output — whatever the agent needs to do the job.
The folder can also contain three optional subdirectories:
- scripts/ — executable JavaScript, Python, or shell scripts the agent can run
- references/ — extra documentation loaded only if the agent decides it needs it
- assets/ — static resources like templates and data files
That's the whole thing.
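To make the two-part layout concrete, here is a minimal Python sketch that splits a SKILL.md file into its metadata and body. The function name `parse_skill` is hypothetical, and the sketch assumes the metadata block is fenced by `---` lines and holds only flat `name: value` pairs; a real loader would more likely rely on a full YAML parser.

```python
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split SKILL.md text into (metadata, body).

    Assumption: the metadata block sits between two '---' lines and contains
    only flat 'key: value' pairs (no nesting, no multi-line values).
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' metadata block")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("metadata block is never closed") from None

    metadata: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' pair: {line!r}")
        metadata[key.strip()] = value.strip()

    # The two mandatory fields described above.
    for field in ("name", "description"):
        if not metadata.get(field):
            raise ValueError(f"missing mandatory field: {field}")

    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


example = """---
name: pdf-processing
description: Extracts text and tables from PDF files.
---
# PDF Processing
Extract text with pdfplumber.
"""

metadata, body = parse_skill(example)
print(metadata["name"])  # pdf-processing
```

Rejecting a file that lacks `name` or `description` mirrors the point that the description is the trigger: without it, an agent would have nothing to match a task against.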
What a real SKILL.md looks like
Here's the main file (SKILL.md) from Anthropic's own pdf-processing skill. At the top sits the YAML metadata the agent reads at startup:
---
name: pdf-processing
description: Extracts text and tables from PDF files, fills forms, and merges documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction.
---
Below comes the body in plain markdown. The file shows a short Python snippet to get started, then points to other files the agent can pull in on demand:
# PDF Processing
## Quick start
Extract text with pdfplumber:
```python
import pdfplumber
with pdfplumber.open("file.pdf") as pdf:
    text = pdf.pages[0].extract_text()
```
## Advanced features
**Form filling**: See [FORMS.md](FORMS.md) for complete guide
**API reference**: See [REFERENCE.md](REFERENCE.md) for all methods
**Examples**: See [EXAMPLES.md](EXAMPLES.md) for common patterns
The Python snippet does one concrete thing: open a PDF file and pull text from the first page. The rest of the skill lives in separate files. The agent reads only the top-level file first. If the user asks about form filling, it fetches FORMS.md on demand. Otherwise that file stays untouched and consumes no space in the context window.
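One way a loader could discover which files a skill body points to is to scan it for markdown links. The sketch below is an illustration only: `linked_files` is a hypothetical helper, and the regular expression handles simple `[label](target)` links rather than the full markdown grammar.

```python
import re

# Matches simple markdown links of the form [label](target).
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def linked_files(body: str) -> list[str]:
    """Return local link targets in order of first appearance, without duplicates."""
    targets: list[str] = []
    for _label, target in LINK_PATTERN.findall(body):
        # Skip web URLs and in-page anchors; only local files can be fetched on demand.
        is_remote = "://" in target or target.startswith("#")
        if not is_remote and target not in targets:
            targets.append(target)
    return targets


skill_body = """## Advanced features
**Form filling**: See [FORMS.md](FORMS.md) for complete guide
**API reference**: See [REFERENCE.md](REFERENCE.md) for all methods
**Examples**: See [EXAMPLES.md](EXAMPLES.md) for common patterns
"""

print(linked_files(skill_body))  # ['FORMS.md', 'REFERENCE.md', 'EXAMPLES.md']
```

Nothing here reads the linked files. The list only tells a loader what could be fetched later, which keeps their contents out of the context window until a request calls for them.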
Progressive Disclosure: Three Tiers
An agent can have hundreds of skills installed. Loading all of them into the model's context window (its working memory) at startup would fill it up before anyone asked a single question. So skills use progressive disclosure — a three-tier loading strategy:
- Metadata only: at startup, the agent loads just the name and description from every installed skill. A tiny amount of space per skill, even across a hundred.
- Full instructions: when the agent receives a request that matches a skill's description, it reads the complete SKILL.md body into context. The matching is done by the model's own reasoning, which is why a precise description matters so much.
- Resources on demand: scripts, references, and assets are pulled in only when a specific task actually needs them.
The agent starts with a lightweight index of everything it can do, pulls in detailed instructions when relevant, and grabs resources at the exact moment of need.
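The three tiers can be sketched as three small functions. This is a minimal illustration under stated assumptions, not how any particular agent implements loading: the names `SkillEntry`, `build_index`, `load_instructions`, and `load_resource` are hypothetical, the skills are assumed to live in one folder per skill under a common root, and the step where the model decides which skill matches a request is left out entirely.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillEntry:
    name: str
    description: str
    folder: Path


def read_front_matter(skill_file: Path) -> dict[str, str]:
    """Read only the 'key: value' lines between the two '---' markers."""
    metadata: dict[str, str] = {}
    with skill_file.open(encoding="utf-8") as handle:
        if handle.readline().strip() != "---":
            return metadata
        for line in handle:
            if line.strip() == "---":
                break  # stop before the body, so it is never read at startup
            key, sep, value = line.partition(":")
            if sep:
                metadata[key.strip()] = value.strip()
    return metadata


def build_index(skills_root: Path) -> list[SkillEntry]:
    """Tier 1: collect name and description for every installed skill."""
    index = []
    for skill_file in sorted(skills_root.glob("*/SKILL.md")):
        meta = read_front_matter(skill_file)
        if "name" in meta and "description" in meta:
            index.append(SkillEntry(meta["name"], meta["description"], skill_file.parent))
    return index


def load_instructions(entry: SkillEntry) -> str:
    """Tier 2: read the full SKILL.md once this skill has been selected."""
    return (entry.folder / "SKILL.md").read_text(encoding="utf-8")


def load_resource(entry: SkillEntry, relative_path: str) -> str:
    """Tier 3: read one resource, refusing paths that escape the skill folder."""
    root = entry.folder.resolve()
    target = (root / relative_path).resolve()
    if not target.is_relative_to(root):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"{relative_path} is outside the skill folder")
    return target.read_text(encoding="utf-8")
```

A caller would run `build_index` once at startup, pass only the names and descriptions to the model, and call `load_instructions` or `load_resource` when the model asks for them. The path check in `load_resource` is a design choice of this sketch, added so that a skill cannot pull in files from elsewhere on disk.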
Four Ways to Give an Agent Knowledge
Skills are one of several ways to add knowledge to an agent. They each handle something different:
| Method | What it gives the agent | Limitation |
|---|---|---|
| Skills | Procedural knowledge — how to do things, in what order, with what judgment | Only useful for repeatable, definable workflows |
| MCP (Model Context Protocol) | Tool access — the ability to call external APIs and services | Gives the capability to reach out; doesn't teach the agent when or how |
| RAG (Retrieval-Augmented Generation) | Factual knowledge — pulls relevant chunks from a database at runtime | Reference material; doesn't teach workflows |
| Fine-tuning | Bakes knowledge permanently into the model's weights | Expensive, and must be redone every time the model changes |
In practice, skills and MCP work well together: MCP provides the capability to invoke something external, and the skill provides the judgment for when and how to do it.
The Cognitive Science Behind It
There's a useful parallel from cognitive science. Humans have three distinct types of memory:
- Semantic memory — facts. Rome is the capital of Italy.
- Episodic memory — personal experiences. I went to Rome last summer (and it was lovely).
- Procedural memory — know-how. How to ride a scooter through Roman traffic and live to tell the tale.
Agent architectures are starting to mirror this: RAG and knowledge bases map to semantic memory, conversation history maps to episodic memory, and skill files map directly to procedural memory.
Before You Install That Skill
Because skills can include executable scripts with access to your file system, environment variables, and API keys, they're powerful — and that power cuts both ways. Security audits have found publicly available skills containing prompt injection attacks, tool poisoning, and hidden malware.
Treat skill installation the way any responsible team treats installing a software dependency: read it, understand what it does, and verify the source before running it on your machine.
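As a small aid for that review, the sketch below lists every file in a skill folder with a SHA-256 digest and flags the ones that look executable. It is an assumption-laden illustration: `inventory` and `SCRIPT_SUFFIXES` are hypothetical names, the suffix list is a guess at what counts as a script, and nothing here detects prompt injection or malware. It only shows a reviewer what is in the folder and lets them confirm later that the files have not changed.

```python
import hashlib
from pathlib import Path

# Assumption: these suffixes cover the script types a skill might ship.
SCRIPT_SUFFIXES = {".py", ".js", ".sh"}


def inventory(skill_dir: Path) -> list[tuple[str, str, bool]]:
    """Return (relative path, SHA-256 hex digest, needs_review) for each file."""
    rows = []
    for path in sorted(skill_dir.rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        relative = path.relative_to(skill_dir).as_posix()
        # Flag anything under scripts/ or with a script-like suffix for a manual read.
        needs_review = path.suffix in SCRIPT_SUFFIXES or relative.startswith("scripts/")
        rows.append((relative, digest, needs_review))
    return rows


def print_report(skill_dir: Path) -> None:
    """Print one line per file so a reviewer can see what the skill contains."""
    for relative, digest, needs_review in inventory(skill_dir):
        marker = "REVIEW" if needs_review else "      "
        print(f"{marker}  {digest[:12]}  {relative}")
```

Recording the digests at review time gives a simple way to notice if a skill's files are altered after it was approved.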
An Open Standard
The SKILL.md format is an open standard published at agentskills.io under an Apache 2.0 license and maintained by Anthropic. It has been adopted by Claude Code, OpenAI Codex, Cursor, GitHub Copilot, and a growing list of other platforms.
A skill built for one platform works on any platform that supports the spec — the same way a PDF opens in any PDF reader. The procedural knowledge travels with the file, not the tool.
Glossary
| Term | Definition |
|---|---|
| Procedural knowledge | Step-by-step knowledge of how to do something, as opposed to factual knowledge about what something is |
| Context window | The total amount of text an AI model can hold in its working memory at one time |
| Progressive disclosure | A loading strategy where only the minimum necessary information is loaded upfront, with more pulled in as needed |
| MCP (Model Context Protocol) | An open protocol that lets AI agents call external tools and services |
| RAG (Retrieval-Augmented Generation) | A technique where the AI retrieves relevant documents from a knowledge base at runtime before generating a response |
| Fine-tuning | Retraining a model on specific data to permanently change its behavior — baked into the model's weights |
| Prompt injection | An attack where malicious instructions are hidden inside content the AI is asked to process |