
Agent Skills: Teaching AI Agents How to Actually Work

April 20, 2026/5 min read/933 words
Tags: AI Agents, IBM, MCP, AI Security, Open Source
Published April 20, 2026
IBM Technology
Hosts: Martin Keen

This is an AI-generated summary. The source video may include demos, visuals and additional context.


In Brief

AI agents are surprisingly good at knowing things. Ask one about Kubernetes architecture or the history of SQL and it answers fluently. But ask it to run your company's 47-step workflow for generating a compliant financial report and it either needs someone to spell out every single step (every time) or it just guesses.

The missing piece is procedural knowledge: not facts about the world, but how to do specific work in a specific order. Agent skills are how you close that gap.

What a Skill Actually Is

A skill is, as IBM's Martin Keen puts it, "almost comically simple": a folder whose only required content is a text file called SKILL.md.

That markdown file has two parts. At the top sits a small block of YAML metadata (key information written as name: value pairs) with two mandatory fields: name (what the skill is called) and description (what the skill does and when the agent should use it). The description is the trigger. It tells the agent whether this skill applies to whatever task is in front of it.

Below the metadata lives the body: plain markdown with step-by-step instructions, rules, and examples of input and output — whatever the agent needs to do the job.

The folder can also contain three optional subdirectories:

  • scripts/ — executable JavaScript, Python, or shell scripts the agent can run
  • references/ — extra documentation loaded only if the agent decides it needs it
  • assets/ — static resources like templates and data files

That's the whole thing.
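To make the layout concrete, here is a minimal sketch that creates such a folder on disk. The function name `scaffold_skill` and the placeholder body text are assumptions for illustration and are not part of any official tooling.

```python
from pathlib import Path


def scaffold_skill(root: Path, name: str, description: str) -> Path:
    """Create a skill folder: SKILL.md plus the three optional subdirectories."""
    skill_dir = root / name
    skill_dir.mkdir(parents=True, exist_ok=True)

    # YAML metadata block with the two mandatory fields, then a markdown body.
    skill_md = (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "---\n\n"
        f"# {name}\n\n"
        "1. First step of the workflow.\n"  # placeholder body (hypothetical)
    )
    (skill_dir / "SKILL.md").write_text(skill_md, encoding="utf-8")

    # Optional subdirectories; an empty one is harmless.
    for sub in ("scripts", "references", "assets"):
        (skill_dir / sub).mkdir(exist_ok=True)
    return skill_dir
```

Calling `scaffold_skill(Path("skills"), "quarterly-report", "Builds the quarterly report. Use when the user asks for it.")` would leave a `skills/quarterly-report/` folder ready to be filled in by hand. The skill name and description here are made up.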

What a real SKILL.md looks like

Here's the main file (SKILL.md) from Anthropic's own pdf-processing skill. At the top sits the YAML metadata the agent reads at startup:

---
name: pdf-processing
description: Extracts text and tables from PDF files, fills forms, and merges documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction.
---

Below comes the body in plain markdown. The file shows a short Python snippet to get started, then points to other files the agent can pull in on demand:

# PDF Processing

## Quick start

Extract text with pdfplumber:
```python
import pdfplumber
with pdfplumber.open("file.pdf") as pdf:
    text = pdf.pages[0].extract_text()
```

## Advanced features

**Form filling**: See [FORMS.md](FORMS.md) for complete guide
**API reference**: See [REFERENCE.md](REFERENCE.md) for all methods
**Examples**: See [EXAMPLES.md](EXAMPLES.md) for common patterns

The Python snippet does one concrete thing: open a PDF file and pull text from the first page. The rest of the skill lives in separate files. The agent reads only the top-level file first. If the user asks about form filling, it fetches FORMS.md on demand. Otherwise that file stays untouched and consumes no space in the context window.
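The split between metadata and body can be sketched in a few lines of standard-library Python. This is an illustration, not how any particular agent does it: `split_skill_md` is a hypothetical helper, and it assumes the metadata block holds only flat `key: value` lines. General YAML would need a real YAML parser.

```python
def split_skill_md(text: str) -> tuple[dict[str, str], str]:
    """Split SKILL.md text into (metadata, body).

    Assumption: the metadata block is delimited by '---' lines and contains
    only flat 'key: value' pairs.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' metadata block")

    # Find the line that closes the metadata block.
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("metadata block is never closed") from None

    metadata: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        metadata[key.strip()] = value.strip()

    # The two mandatory fields described in the article.
    for field in ("name", "description"):
        if not metadata.get(field):
            raise ValueError(f"missing mandatory field: {field}")

    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body
```

Run on the file above, `metadata["description"]` would hold the trigger text, and `body` would hold everything from `# PDF Processing` onward.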

Progressive Disclosure: Three Tiers

An agent can have hundreds of skills installed. Loading all of them into the model's context window (its working memory) at startup would fill it up before anyone asked a single question. So skills use progressive disclosure — a three-tier loading strategy:

  1. Metadata only — at startup, the agent loads just the name and description from every installed skill. A tiny amount of space per skill, even across a hundred.
  2. Full instructions — when the agent receives a request that matches a skill's description, it reads the complete SKILL.md body into context. The matching is done by the model's own reasoning, which is why a precise description matters so much.
  3. Resources on demand — scripts, references, and assets are pulled in only when a specific task actually needs them.

The agent starts with a lightweight index of everything it can do, pulls in detailed instructions when relevant, and grabs resources at the exact moment of need.
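The three tiers can be sketched as follows. This is a simplified illustration under stated assumptions, not any platform's actual loader: `SkillEntry`, `build_index`, and `render_index` are hypothetical names, and the metadata is assumed to be flat `key: value` lines. The matching step is deliberately left out, because the article attributes it to the model's own reasoning over the index text.

```python
from dataclasses import dataclass
from pathlib import Path


def read_metadata(skill_md: Path) -> dict[str, str]:
    """Read only the 'key: value' lines between the '---' markers."""
    metadata: dict[str, str] = {}
    with skill_md.open(encoding="utf-8") as handle:
        if handle.readline().strip() != "---":
            return metadata
        for line in handle:
            if line.strip() == "---":
                break  # stop before the body, so it is never read at startup
            key, sep, value = line.partition(":")
            if sep:
                metadata[key.strip()] = value.strip()
    return metadata


@dataclass
class SkillEntry:
    name: str
    description: str
    folder: Path

    def load_body(self) -> str:
        """Tier 2: read the full instructions once the skill is selected."""
        lines = (self.folder / "SKILL.md").read_text(encoding="utf-8").splitlines()
        closing = [line.strip() for line in lines].index("---", 1)
        return "\n".join(lines[closing + 1:]).strip()

    def load_resource(self, relative_path: str) -> str:
        """Tier 3: read one file from scripts/, references/, or assets/ on demand."""
        return (self.folder / relative_path).read_text(encoding="utf-8")


def build_index(skills_root: Path) -> list[SkillEntry]:
    """Tier 1: collect name and description for every installed skill."""
    index = []
    for skill_md in sorted(skills_root.glob("*/SKILL.md")):
        meta = read_metadata(skill_md)
        if "name" in meta and "description" in meta:
            index.append(SkillEntry(meta["name"], meta["description"], skill_md.parent))
    return index


def render_index(index: list[SkillEntry]) -> str:
    """Format the tier-1 index as the short text a model would see at startup."""
    return "\n".join(f"- {entry.name}: {entry.description}" for entry in index)
```

Only the output of `render_index` would sit in context from the start. `load_body` and `load_resource` run later, and only for the skill and file a task actually calls for.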

Four Ways to Give an Agent Knowledge

Skills are one of several ways to add knowledge to an agent. They each handle something different:

| Method | What it gives the agent | Limitation |
| --- | --- | --- |
| Skills | Procedural knowledge: how to do things, in what order, with what judgment | Only useful for repeatable, definable workflows |
| MCP (Model Context Protocol) | Tool access: the ability to call external APIs and services | Gives the capability to reach out; doesn't teach the agent when or how |
| RAG (Retrieval-Augmented Generation) | Factual knowledge: pulls relevant chunks from a database at runtime | Reference material; doesn't teach workflows |
| Fine-tuning | Bakes knowledge permanently into the model's weights | Expensive, and must be redone every time the model changes |

In practice, skills and MCP work well together: MCP provides the capability to invoke something external, and the skill provides the judgment for when and how to do it.

The Cognitive Science Behind It

There's a useful parallel from cognitive science. Human long-term memory is commonly divided into three types:

  • Semantic memory — facts. Rome is the capital of Italy.
  • Episodic memory — personal experiences. I went to Rome last summer (and it was lovely).
  • Procedural memory — know-how. How to ride a scooter through Roman traffic and live to tell the tale.

Agent architectures are starting to mirror this: RAG and knowledge bases map to semantic memory, conversation history maps to episodic memory, and skill files map directly to procedural memory.

Before You Install That Skill

Because skills can include executable scripts with access to your file system, environment variables, and API keys, they're powerful — and that power cuts both ways. Security audits have found publicly available skills containing prompt injection attacks, tool poisoning, and hidden malware.

Treat skill installation the way any responsible team treats installing a software dependency: read it, understand what it does, and verify the source before running it on your machine.
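One way to start that review is to list what a skill folder actually contains before anything runs. The sketch below is only a checklist aid and makes no attempt to detect prompt injection or malware. `inventory_skill` and the suffix list are assumptions chosen for illustration.

```python
import hashlib
from pathlib import Path

# Assumption: file types treated as runnable; adjust for your environment.
SCRIPT_SUFFIXES = {".py", ".sh", ".js"}


def inventory_skill(skill_dir: Path) -> list[dict[str, str]]:
    """List every file in a skill folder with its SHA-256 digest and a review flag."""
    report = []
    for path in sorted(p for p in skill_dir.rglob("*") if p.is_file()):
        relative = path.relative_to(skill_dir)
        # Flag anything under scripts/ or with a script-like suffix.
        is_script = relative.parts[0] == "scripts" or path.suffix in SCRIPT_SUFFIXES
        report.append({
            "path": relative.as_posix(),
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            "review": "executable: read before running" if is_script else "text/data",
        })
    return report
```

Recording the digests also makes it possible to notice later if a file differs from the version that was reviewed.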

An Open Standard

The SKILL.md format is an open standard published at agentskills.io under an Apache 2.0 license and maintained by Anthropic. It has been adopted by Claude Code, OpenAI Codex, Cursor, GitHub Copilot, and a growing list of other platforms.

A skill built for one platform works on any platform that supports the spec — the same way a PDF opens in any PDF reader. The procedural knowledge travels with the file, not the tool.

Glossary

| Term | Definition |
| --- | --- |
| Procedural knowledge | Step-by-step knowledge of how to do something, as opposed to factual knowledge about what something is |
| Context window | The total amount of text an AI model can hold in its working memory at one time |
| Progressive disclosure | A loading strategy where only the minimum necessary information is loaded upfront, with more pulled in as needed |
| MCP (Model Context Protocol) | An open protocol that lets AI agents call external tools and services |
| RAG (Retrieval-Augmented Generation) | A technique where the AI retrieves relevant documents from a knowledge base at runtime before generating a response |
| Fine-tuning | Retraining a model on specific data to permanently change its behavior, with the change baked into the model's weights |
| Prompt injection | An attack where malicious instructions are hidden inside content the AI is asked to process |
