Why AI Agents Need Their Own File System

Key insights
- AI agents are stateless. Their context window works like RAM: everything disappears when the session ends.
- RAG only solves the input side. Agents also need a way to save their output, like code or reports, between sessions.
- MCP provides a standard interface so agents can interact with any storage system without custom integrations.
- Safety layers like immutable versioning and sandboxing are essential when giving autonomous agents write access.
This article is a summary of What Is Agentic Storage? Solving AI's Limits with LLMs & MCP. Watch the video →
In Brief
Martin Keen, master inventor at IBM, argues that AI agents have a fundamental memory problem. Their context window works like RAM: the moment a session ends, everything the agent learned or created disappears. Retrieval-augmented generation (RAG) helps agents read information, but it does nothing for the output side. Agentic storage is the proposed solution: a persistent storage layer built specifically for autonomous agents, connected through the Model Context Protocol (MCP) and protected by safety mechanisms like immutable versioning and sandboxing.
The core problem: agents with amnesia
Large language models (LLMs) are stateless, meaning they don't remember anything between sessions. Their entire working memory exists inside the context window, a temporary buffer that Keen compares to RAM in a computer (0:26). When the session ends or the context window fills up, the agent's memory resets completely. It forgets what it did, what it learned, and what it produced.
This is fine for chatbots that answer one-off questions. But agentic AI systems do real work: they write code, create reports, and fix incidents autonomously (0:13). Without persistent memory, every session starts from scratch.
Why RAG is only half the answer
RAG is a technique where the AI looks up relevant documents in a vector database (a database that converts text, numbers, and images into mathematical representations for search) before generating a response. This partially addresses the memory problem by letting agents pull in context from external sources (1:07).
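The read side of RAG boils down to ranking documents by similarity to the query in some vector space. A minimal sketch of that idea, using a toy bag-of-words "embedding" and cosine similarity in place of the neural embedding model a real vector database would use:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words term counts. A real RAG system would
    # use a neural embedding model producing dense vectors instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank all documents by similarity to the query; return the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "Retention policy: delete logs older than 90 days.",
    "Deployment guide for the payments service.",
]
print(retrieve("how long do we keep logs", docs))
```

Note that nothing in this loop writes anything back: retrieval is a pure lookup, which is exactly the read-only limitation the next section addresses.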
But Keen points out that RAG is fundamentally a read-only operation (1:42). It solves the input problem of getting information into the model, but not the output problem. If an agent writes a Python script or creates a step-by-step plan for fixing a problem, where does that work product go? Without a writable storage layer, it simply vanishes.
The proposed solution: agentic storage via MCP
Keen describes agentic storage as more than just "giving an agent a hard drive" (2:07). It's a storage layer that is aware of and designed for autonomous agents.
The integration challenge
Connecting agents to storage is not straightforward. A typical enterprise might have object storage (for files and media), block storage (for databases), and network-attached storage (NAS), each with different APIs, data models, and login methods (2:38). Writing custom integrations for each one does not scale.
MCP as the standard interface
Keen argues the industry is converging on the Model Context Protocol (MCP), an open standard created by Anthropic, as the solution (3:10). MCP provides a uniform interface between an AI application (the MCP host) and a storage system (the MCP server), using JSON-RPC (a lightweight protocol for sending structured requests) as the communication layer.

The MCP server offers two key building blocks:
- Resources: Passive data objects like file contents and database records. When the agent needs context, it requests resources. Conceptually similar to RAG, but standardized (4:18).
- Tools: Executable functions like list directory, read file, write file, and create snapshot. The agent calls the tool, and the MCP server handles the translation to whatever storage system sits underneath (4:38).
Opposing perspectives
The security concern
Keen acknowledges the obvious objection by bringing in colleague Jeff Crume, IBM Distinguished Engineer and security specialist. Giving AI agents write access to a company's storage infrastructure raises serious risks (5:04). Agents can hallucinate (generate confident but wrong answers), misinterpret instructions, and take actions that seem logical in isolation but are catastrophic in context (5:10).
MCP adoption is still early
While Keen claims the industry is "converging" on MCP, the protocol is still relatively new. Anthropic introduced it in November 2024, and major providers like OpenAI and Google have adopted it, but storage companies are still in early stages of integration. Whether MCP becomes the dominant standard remains an open question.
Three safety layers for agent-aware storage
Rather than treating this as a reason to avoid agent storage entirely, Keen proposes building safety into the storage layer itself. He outlines three mechanisms that are "overkill for humans, but essential for AI" (5:43):
1. Immutable versioning. Every write operation creates a new version rather than overwriting. The agent can never truly delete data, only archive it. This provides a complete audit trail (a log of every change) and the ability to roll back any action (5:50).
2. Sandboxing. The agent operates within a limited environment with access to specific directories and operations only. If an agent manages application logs, it has no path to system binaries. This prevents the confused deputy problem (a security flaw where a trusted program is tricked into misusing its permissions) (6:16).
3. Intent validation. Before executing high-impact operations, the storage layer requires the agent to explain why. The agent must generate a reasoning chain, for example: "I'm deleting these files because they're older than 90 days and that matches the retention policy." The storage layer then verifies that claim before proceeding (6:47).
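The three layers compose naturally in a single storage wrapper. The sketch below is an illustrative in-memory model, not a product feature: writes append versions instead of overwriting, paths are confined to a sandbox root, and the archive operation (deletion's stand-in) refuses to run without a stated intent. All class and method names are assumptions for illustration.

```python
import time
from pathlib import Path

class AgentAwareStore:
    """Sketch of the three safety layers: immutable versioning,
    sandboxing, and intent validation. Names are illustrative."""

    def __init__(self, root: str):
        self.root = Path(root).resolve()  # sandbox boundary
        self.versions: dict[str, list[tuple[float, bytes]]] = {}

    def _check_sandbox(self, path: str) -> str:
        # Sandboxing: reject any path that resolves outside the root,
        # closing off confused-deputy tricks like "../etc/passwd".
        full = (self.root / path).resolve()
        if not full.is_relative_to(self.root):
            raise PermissionError(f"path escapes sandbox: {path}")
        return str(full.relative_to(self.root))

    def write(self, path: str, data: bytes) -> int:
        # Immutable versioning: every write appends a timestamped new
        # version; nothing is ever overwritten, so the full history
        # doubles as an audit trail.
        key = self._check_sandbox(path)
        self.versions.setdefault(key, []).append((time.time(), data))
        return len(self.versions[key]) - 1  # version number

    def read(self, path: str, version: int = -1) -> bytes:
        key = self._check_sandbox(path)
        return self.versions[key][version][1]

    def archive(self, path: str, reason: str) -> None:
        # Intent validation: a high-impact operation must carry a stated
        # reason, which a real system would verify against policy before
        # proceeding. "Deletion" just appends an empty version, so every
        # earlier version stays available for rollback.
        if not reason.strip():
            raise ValueError("archive requires a stated intent")
        self.write(path, b"")

store = AgentAwareStore("/tmp/agent-store")
v0 = store.write("logs/app.log", b"first draft")
store.write("logs/app.log", b"second draft")
assert store.read("logs/app.log", v0) == b"first draft"  # rollback works
```

A production system would of course persist versions durably and validate the stated intent against actual policy, but the shape of the guarantees, append-only history, confined paths, and reason-before-action, is the same.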
How to interpret these claims
A concept video, not a product demo
This video explains a concept rather than demonstrating a working product. The safety layers Keen describes (immutable versioning, sandboxing, intent validation) are architectural patterns, not features you can deploy today from a single vendor. Organizations wanting agentic storage would need to assemble these pieces from multiple tools and platforms.
IBM's commercial context
IBM launched its next-generation FlashSystem portfolio with agentic AI capabilities in February 2026. While the video does not mention FlashSystem directly, it introduces terminology and concepts that fit IBM's product strategy. This does not invalidate the technical arguments, but readers should be aware of the commercial context.
What remains uncertain
The video presents a clean separation between "agents that read" (RAG) and "agents that write" (agentic storage). In practice, the boundary is blurrier. Many agent frameworks already persist state through databases, file systems, and tool integrations without a formal "agentic storage" layer. The question is whether a standardized, safety-aware approach will prove necessary at scale, or whether existing ad-hoc solutions are enough.
Practical implications
For developers building AI agents
The distinction between read and write operations is worth considering when designing agent architectures. If agents produce output (code, documents, configurations), planning where and how that output is saved is as important as designing the prompt or choosing the model.
For enterprise IT teams
The safety layers Keen describes align with established security principles (least privilege, audit trails, access control) applied to a new context. Organizations already running AI agents should check whether their current storage setup has enough safeguards for autonomous systems.
Glossary
| Term | Definition |
|---|---|
| Agentic storage | Storage designed for autonomous AI agents, with built-in safety layers like versioning and sandboxing. |
| Context window | The limited working memory an AI model uses during a conversation. Like RAM: temporary and volatile. |
| RAG | Retrieval-augmented generation. A technique where the AI looks up relevant documents before answering. |
| MCP | Model Context Protocol. An open standard for connecting AI applications to external tools and storage. |
| LLM | Large language model. The AI system that powers tools like ChatGPT and Claude. |
| Vector database | A database that converts text, numbers, and images into mathematical representations, enabling search by meaning. |
| JSON-RPC | A lightweight protocol for sending structured requests between systems. Used by MCP. |
| Immutable versioning | Every change creates a new version instead of overwriting. Nothing is truly deleted. |
| Sandboxing | Restricting a program to a limited environment so it cannot access things outside its scope. |
| Confused deputy problem | A security flaw where a trusted program is tricked into misusing its permissions. |
| Intent validation | Requiring an AI agent to explain its reasoning before executing high-impact actions. |
| Object storage | Storage that manages data as objects (files with metadata). Used for unstructured data. |
| Block storage | Storage that manages data as fixed-size blocks. Used for databases and virtual machines. |
| NAS | Network-attached storage. Storage connected to a network that multiple devices can access. |
Sources and resources
Want to go deeper? Watch the full video on YouTube →