When would you use Memory tool (API)?

Persistent User Preferences: A conversational assistant stores user style preferences (output format, tone, language) in a memory file at the start of the relationship. On every subsequent session it reads that file first, so the user never has to repeat instructions like 'always respond in Markdown' or 'keep answers under 200 words'.

When would you use Memory tool (API)?

Long-Running Coding Agent with Architecture State: A coding agent maintains a CLAUDE.md file in /memories that records the project's architecture, linting rules, and key design decisions. As the developer makes changes across days or weeks, the agent updates this file so it always has an accurate understanding of the codebase without requiring a full re-read of every file each session.

When would you use Memory tool (API)?

Customer Support Issue Resolution Caching: A support agent diagnoses a recurring login error in session one and saves the root cause and fix to /memories/common_issues.md. In session two, a different customer reports the same symptom. The agent reads memory first, recognizes the pattern, and provides the cached solution immediately without re-diagnosing.

← ContentsClaude API · advanced

Memory tool (API)

The memory tool is a beta API feature that lets Claude persistently store, retrieve, update, and delete information across completely separate conversations. It works through a filesystem metaphor: Claude issues structured commands to read and write files in a dedicated /memories directory, while your application executes those operations against whatever storage backend you choose — local disk, a database, encrypted cloud storage, or anything else. Because Anthropic never stores the memory files themselves, you retain full control over the data and where it lives. This solves a fundamental limitation of large language models: every conversation normally starts fresh with no knowledge of prior sessions. With the memory tool, a Claude-powered agent can save what it learns — bug fixes it discovered, user preferences it noted, project decisions it made — and recall that knowledge in future sessions without requiring the user to repeat themselves or the developer to stuff enormous conversation histories into the context window. The tool is client-side by design. Claude generates structured tool-call requests (create, view, str_replace, insert, delete, rename) and your application handles the actual I/O. Anthropic provides SDK helper classes (BetaAbstractMemoryTool in Python, betaMemoryTool in TypeScript) to standardize request and response parsing. Because storage is your responsibility, implementations are eligible for Zero Data Retention arrangements and can meet enterprise compliance requirements.

🎧 Listen to this as a podcast episode

When you’d use it

◆Persistent User Preferences — A conversational assistant stores user style preferences (output format, tone, language) in a memory file at the start of the relationship. On every subsequent session it reads that file first, so the user never has to repeat instructions like 'always respond in Markdown' or 'keep answers under 200 words'.
◆Long-Running Coding Agent with Architecture State — A coding agent maintains a CLAUDE.md file in /memories that records the project's architecture, linting rules, and key design decisions. As the developer makes changes across days or weeks, the agent updates this file so it always has an accurate understanding of the codebase without requiring a full re-read of every file each session.
◆Customer Support Issue Resolution Caching — A support agent diagnoses a recurring login error in session one and saves the root cause and fix to /memories/common_issues.md. In session two, a different customer reports the same symptom. The agent reads memory first, recognizes the pattern, and provides the cached solution immediately without re-diagnosing.
◆Multi-Session Research Synthesis — A research agent analyzes batches of papers across ten separate sessions. After each batch it writes key findings and identified gaps to a synthesis memory file. In the final session it reads all accumulated memory files and produces a comprehensive report, avoiding re-analysis of already-processed material.
◆Extended Agentic Workflow with Context Overflow Protection — An agent running a 100-turn web search workflow approaches the context token limit. Combined with context editing, Claude receives a warning and writes critical findings to memory before stale tool results are cleared. On subsequent turns it retrieves exactly the facts it needs from memory rather than from a bloated conversation history, cutting token consumption by up to 84%.

What changed recently

◆2025-06-27 — Memory tool beta launched. Beta header 'context-management-2025-06-27' introduced to the Claude API, enabling the memory tool and context editing features for long-running agentic workflows.
◆2025-08-18 — Memory tool type identifier stabilized as 'memory_20250818', replacing earlier beta identifiers. This is the identifier required in the tools array.
◆2025-09-29 — Anthropic officially announced the memory tool (memory_20250818) in beta in release notes, confirming general availability to API developers alongside context editing.
◆2026-01-06 — Anthropic published performance benchmarks showing memory tool combined with context editing improved agent task completion by 39% over baseline. Context editing alone delivered 29% improvement. On a 100-turn web search evaluation, the combination reduced token consumption by 84% versus loading full conversation history.

This is the short version

The full chapter has three worked examples, the common pitfalls, and the workflow that makes it pay — plus the other 84 features, kept current.

Get Claude Master — $97 →