When would you use Prefilling?

Guaranteed JSON output without preamble: A data pipeline needs Claude to extract structured fields from free-text product descriptions. Any conversational opener before the JSON breaks the downstream parser. Prefilling '{' forces Claude to begin its response as a valid JSON object immediately.

When would you use Prefilling?

Enforcing XML or other structured markup: A document-processing system requires responses wrapped in a specific XML schema. Prefilling ' ' locks Claude into that tag structure, preventing it from adding explanatory prose before the XML.

When would you use Prefilling?

Persona and role-play consistency: An interactive storytelling app maintains a named character across many turns. Prefilling the character's name or a persona marker at the start of each assistant turn prevents Claude from breaking character or explaining that it is an AI.

← ContentsPrompting · advanced

Prefilling

Prefilling is an API-level technique that lets you supply the opening text of Claude's response by placing a final message in the conversation array with the role set to 'assistant'. Instead of waiting for Claude to choose how to begin its reply, you inject the first characters yourself—for example, an opening curly brace to force JSON output, or a persona marker to lock a role-play character. Claude then continues generating from exactly where your prefill ends, making it a precise way to control output format, skip conversational filler, and maintain character consistency. The feature works by appending a dictionary with 'role': 'assistant' and your desired opening text as 'content' to the messages list in an API call. Claude treats that text as if it had already written it and completes the response from that point forward. This bypasses any preamble Claude might otherwise add (such as 'Certainly! Here is…') and steers the model directly into a specific output trajectory before a single new token is generated. Prefilling is strictly an API feature—it is not available on Claude.ai. It is also subject to important model-version restrictions: starting with the Claude 4.6 family and Claude Mythos Preview, prefilling is explicitly disabled at the API level and returns a 400 error. On supported models it remains a useful low-overhead technique, but Anthropic now recommends structured outputs and system-prompt instructions as the primary approach for newer models.

When you’d use it

◆Guaranteed JSON output without preamble — A data pipeline needs Claude to extract structured fields from free-text product descriptions. Any conversational opener before the JSON breaks the downstream parser. Prefilling '{' forces Claude to begin its response as a valid JSON object immediately.
◆Enforcing XML or other structured markup — A document-processing system requires responses wrapped in a specific XML schema. Prefilling '<output>' locks Claude into that tag structure, preventing it from adding explanatory prose before the XML.
◆Persona and role-play consistency — An interactive storytelling app maintains a named character across many turns. Prefilling the character's name or a persona marker at the start of each assistant turn prevents Claude from breaking character or explaining that it is an AI.
◆Single-token constrained answers — A quiz or multiple-choice grading tool needs exactly one letter answer. By setting max_tokens to 1 and prefilling 'Answer:', the response is locked to a single character continuation, making parsing trivial and deterministic.
◆Skipping conversational filler to save tokens — A high-volume summarization service is billed by token. Prefilling 'Summary:' eliminates opener phrases like 'Certainly! Here is a summary of…', reducing per-call token costs and response latency at scale.

What changed recently

◆2025-11 — Anthropic announced that prefilling assistant messages would be deprecated on the Claude 4.6 model family and Claude Mythos Preview. The stated reason is that model instruction-following has advanced to the point where prefill is no longer necessary for most use cases.
◆2025-12 — Claude Sonnet 4.6 and Claude Opus 4.6 released with prefilling disabled at the API level. Requests including a trailing assistant message to these models return a 400 error: 'Prefilling assistant messages is not supported for this model'. Structured outputs and system-prompt instructions are the recommended replacements.
◆2026-04 — Claude Opus 4.7 and Claude Opus 4.8 confirmed as non-supporting prefilling, continuing the deprecation trajectory across the 4.x model family. Claude Mythos Preview also listed as unsupported.
◆2025-10 — The Anthropic Prompt Improver tool (in the developer console) was updated to automatically add prefill text to assistant messages as one of its optimization strategies, helping developers on supported models discover the technique.

This is the short version

The full chapter has three worked examples, the common pitfalls, and the workflow that makes it pay — plus the other 84 features, kept current.

Get Claude Master — $97 →