Why AI Agents Need Their Own File System

Key insights
- AI agents are stateless. Their context window works like RAM: everything disappears when the session ends.
- RAG only solves the input side. Agents also need a way to save their output, like code or reports, between sessions.
- MCP provides a standard interface so agents can interact with any storage system without custom integrations.
- Safety layers like immutable versioning and sandboxing are essential when giving autonomous agents write access.
This article is a summary of What Is Agentic Storage? Solving AI's Limits with LLMs & MCP. Watch the video →
In Brief
Martin Keen, master inventor at IBM, argues that AI agents have a fundamental memory problem. Their context window works like RAM: the moment a session ends, everything the agent learned or created disappears. Retrieval-augmented generation (RAG) helps agents read information, but it does nothing for the output side. Agentic storage is the proposed solution: a persistent storage layer built specifically for autonomous agents, connected through the Model Context Protocol (MCP) and protected by safety mechanisms like immutable versioning and sandboxing.
The core problem: agents with amnesia
Large language models (LLMs) are stateless, meaning they don't remember anything between sessions. Their entire working memory exists inside the context window, a temporary buffer that Keen compares to RAM in a computer (0:26). When the session ends or the context window fills up, the agent's memory resets completely. It forgets what it did, what it learned, and what it produced.
This is fine for chatbots that answer one-off questions. But agentic AI systems do real work: they write code, create reports, and fix incidents autonomously (0:13). Without persistent memory, every session starts from scratch.
Why RAG is only half the answer
RAG is a technique where the AI looks up relevant documents in a vector database (a database that converts text, numbers, and images into mathematical representations for search) before generating a response. This partially addresses the memory problem by letting agents pull in context from external sources (1:07).
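The read side of RAG boils down to ranking documents by similarity to the query in some vector space. A minimal sketch of that idea, using a toy bag-of-words "embedding" and cosine similarity in place of the neural embedding model a real vector database would use:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words term counts. A real RAG system would
    # use a neural embedding model producing dense vectors instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank all documents by similarity to the query; return the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "Retention policy: delete logs older than 90 days.",
    "Deployment guide for the payments service.",
]
print(retrieve("how long do we keep logs", docs))
```

Note that nothing in this loop writes anything back: retrieval is a pure lookup, which is exactly the read-only limitation the next section addresses.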
But Keen points out that RAG is fundamentally a read-only operation (1:42). It solves the input problem of getting information into the model, but not the output problem. If an agent writes a Python script or creates a step-by-step plan for fixing a problem, where does that work product go? Without a writable storage layer, it simply vanishes.
The proposed solution: agentic storage via MCP
Keen describes agentic storage as more than just "giving an agent a hard drive" (2:07). It's a storage layer that is aware of and designed for autonomous agents.
The integration challenge
Connecting agents to storage is not straightforward. A typical enterprise might have object storage (for files and media), block storage (for databases), and network-attached storage (NAS), each with different APIs, data models, and login methods (2:38). Writing custom integrations for each one does not scale.
MCP as the standard interface
Keen argues the industry is converging on the Model Context Protocol (MCP), an open standard created by Anthropic, as the solution (3:10). MCP provides a uniform interface between an AI application (the MCP host) and a storage system (the MCP server), using JSON-RPC (a lightweight protocol for sending structured requests) as the communication layer.

The MCP server offers two key building blocks:
- Resources: Passive data objects like file contents and database records. When the agent needs context, it requests resources. Conceptually similar to RAG, but standardized (4:18).
- Tools: Executable functions like list directory, read file, write file, and create snapshot. The agent calls the tool, and the MCP server handles the translation to whatever storage system sits underneath (4:38).
Opposing perspectives
The security concern
Keen acknowledges the obvious objection by bringing in colleague Jeff Crume, IBM Distinguished Engineer and security specialist. Giving AI agents write access to a company's storage infrastructure raises serious risks (5:04). Agents can hallucinate (generate confident but wrong answers), misinterpret instructions, and take actions that seem logical in isolation but are catastrophic in context (5:10).
MCP adoption is still early
While Keen claims the industry is "converging" on MCP, the protocol is still relatively new. Anthropic introduced it in November 2024, and major providers like OpenAI and Google have adopted it, but storage companies are still in early stages of integration. Whether MCP becomes the dominant standard remains an open question.
Three safety layers for agent-aware storage
Rather than treating this as a reason to avoid agent storage entirely, Keen proposes building safety into the storage layer itself. He outlines three mechanisms that are "overkill for humans, but essential for AI" (5:43):
1. Immutable versioning. Every write operation creates a new version rather than overwriting. The agent can never truly delete data, only archive it. This provides a complete audit trail (a log of every change) and the ability to roll back any action (5:50).
2. Sandboxing. The agent operates within a limited environment with access to specific directories and operations only. If an agent manages application logs, it has no path to system binaries. This prevents the confused deputy problem (a security flaw where a trusted program is tricked into misusing its permissions) (6:16).
3. Intent validation. Before executing high-impact operations, the storage layer requires the agent to explain why. The agent must generate a reasoning chain, for example: "I'm deleting these files because they're older than 90 days and that matches the retention policy." The storage layer then verifies that claim before proceeding (6:47).
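The three layers compose naturally in a single storage wrapper. The sketch below is an illustrative in-memory model, not a product feature: writes append versions instead of overwriting, paths are confined to a sandbox root, and the archive operation (deletion's stand-in) refuses to run without a stated intent. All class and method names are assumptions for illustration.

```python
import time
from pathlib import Path

class AgentAwareStore:
    """Sketch of the three safety layers: immutable versioning,
    sandboxing, and intent validation. Names are illustrative."""

    def __init__(self, root: str):
        self.root = Path(root).resolve()  # sandbox boundary
        self.versions: dict[str, list[tuple[float, bytes]]] = {}

    def _check_sandbox(self, path: str) -> str:
        # Sandboxing: reject any path that resolves outside the root,
        # closing off confused-deputy tricks like "../etc/passwd".
        full = (self.root / path).resolve()
        if not full.is_relative_to(self.root):
            raise PermissionError(f"path escapes sandbox: {path}")
        return str(full.relative_to(self.root))

    def write(self, path: str, data: bytes) -> int:
        # Immutable versioning: every write appends a timestamped new
        # version; nothing is ever overwritten, so the full history
        # doubles as an audit trail.
        key = self._check_sandbox(path)
        self.versions.setdefault(key, []).append((time.time(), data))
        return len(self.versions[key]) - 1  # version number

    def read(self, path: str, version: int = -1) -> bytes:
        key = self._check_sandbox(path)
        return self.versions[key][version][1]

    def archive(self, path: str, reason: str) -> None:
        # Intent validation: a high-impact operation must carry a stated
        # reason, which a real system would verify against policy before
        # proceeding. "Deletion" just appends an empty version, so every
        # earlier version stays available for rollback.
        if not reason.strip():
            raise ValueError("archive requires a stated intent")
        self.write(path, b"")

store = AgentAwareStore("/tmp/agent-store")
v0 = store.write("logs/app.log", b"first draft")
store.write("logs/app.log", b"second draft")
assert store.read("logs/app.log", v0) == b"first draft"  # rollback works
```

A production system would of course persist versions durably and validate the stated intent against actual policy, but the shape of the guarantees, append-only history, confined paths, and reason-before-action, is the same.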
How to interpret these claims
A concept video, not a product demo
This video explains a concept rather than demonstrating a working product. The safety layers Keen describes (immutable versioning, sandboxing, intent validation) are architectural patterns, not features you can deploy today from a single vendor. Organizations wanting agentic storage would need to assemble these pieces from multiple tools and platforms.
IBM's commercial context
IBM launched its next-generation FlashSystem portfolio with agentic AI capabilities in February 2026. While the video does not mention FlashSystem directly, it introduces terminology and concepts that fit IBM's product strategy. This does not invalidate the technical arguments, but readers should be aware of the commercial context.
What remains uncertain
The video presents a clean separation between "agents that read" (RAG) and "agents that write" (agentic storage). In practice, the boundary is blurrier. Many agent frameworks already persist state through databases, file systems, and tool integrations without a formal "agentic storage" layer. The question is whether a standardized, safety-aware approach will prove necessary at scale, or whether existing ad-hoc solutions are enough.
Practical implications
For developers building AI agents
The distinction between read and write operations is worth considering when designing agent architectures. If agents produce output (code, documents, configurations), planning where and how that output is saved is as important as designing the prompt or choosing the model.
For enterprise IT teams
The safety layers Keen describes align with established security principles (least privilege, audit trails, access control) applied to a new context. Organizations already running AI agents should check whether their current storage setup has enough safeguards for autonomous systems.
Glossary
| Term | Definition |
|---|---|
| Agentic storage | Storage designed for autonomous AI agents, with built-in safety layers like versioning and sandboxing. |
| Context window | The limited working memory an AI model uses during a conversation. Like RAM: temporary and volatile. |
| RAG | Retrieval-augmented generation. A technique where the AI looks up relevant documents before answering. |
| MCP | Model Context Protocol. An open standard for connecting AI applications to external tools and storage. |
| LLM | Large language model. The AI system that powers tools like ChatGPT and Claude. |
| Vector database | A database that converts text, numbers, and images into mathematical representations, enabling search by meaning. |
| JSON-RPC | A lightweight protocol for sending structured requests between systems. Used by MCP. |
| Immutable versioning | Every change creates a new version instead of overwriting. Nothing is truly deleted. |
| Sandboxing | Restricting a program to a limited environment so it cannot access things outside its scope. |
| Confused deputy problem | A security flaw where a trusted program is tricked into misusing its permissions. |
| Intent validation | Requiring an AI agent to explain its reasoning before executing high-impact actions. |
| Object storage | Storage that manages data as objects (files with metadata). Used for unstructured data. |
| Block storage | Storage that manages data as fixed-size blocks. Used for databases and virtual machines. |
| NAS | Network-attached storage. Storage connected to a network that multiple devices can access. |
Sources and resources
Want to go deeper? Watch the full video on YouTube →