
Trim Your System Prompt to Save Tokens

A bloated system prompt wastes tokens on every message. Cut yours in half by removing redundancy, using references, and being concise.

Every token in your system prompt is sent with every API call. A 2,000-token system prompt across 50 turns costs you 100K input tokens just in repeated instructions.
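The arithmetic is worth making concrete. A quick sketch, using the figures above:

```python
# Overhead from resending the system prompt on every API call.
# Figures match the example above: 2,000-token prompt, 50 turns.
SYSTEM_PROMPT_TOKENS = 2_000
TURNS = 50

repeated_overhead = SYSTEM_PROMPT_TOKENS * TURNS
print(f"{repeated_overhead:,} input tokens spent on the system prompt alone")
# → 100,000 input tokens
```

And that is before counting the conversation history itself, which also grows with every turn.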

Common Bloat

Most system prompts have these problems:

  • Repetition: saying the same rule three different ways
  • Examples that could be references: embedding full documents instead of pointing to them
  • Defensive instructions: “don’t do X, also don’t do Y, also avoid Z” when one rule covers all three
  • Boilerplate: paragraphs of context the model doesn’t need

Before and After

Before (~60 tokens):

You are a helpful coding assistant. You should always write clean,
well-documented code. Make sure to add comments to explain complex
logic. Always follow best practices. When writing code, ensure it
is readable and maintainable. Add docstrings to all functions.

After (~15 tokens):

Write clean, documented code. Add docstrings. Comment non-obvious logic.

Techniques

  • Merge overlapping rules. If three rules all say “be concise,” keep one
  • Use shorthand. Claude understands “No yapping” as well as a paragraph about brevity
  • Move examples to few-shot messages instead of embedding in the system prompt
  • Reference, don’t embed. Say “Follow the style in src/utils/” instead of pasting the whole file
  • Delete “be helpful” instructions. Claude already tries to be helpful
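The few-shot technique deserves a sketch. Examples sent as message-history turns are only paid for while they stay in the window, and you can drop them once the conversation is established. A minimal illustration (the example prompts here are hypothetical):

```python
# Keep the system prompt short; move worked examples into the
# message history as few-shot turns instead of embedding them.
system = "Write clean, documented code. Add docstrings."

few_shot = [
    {"role": "user", "content": "Write a function that reverses a string."},
    {
        "role": "assistant",
        "content": 'def reverse(s: str) -> str:\n'
                   '    """Return s reversed."""\n'
                   '    return s[::-1]',
    },
]

def build_messages(user_input: str) -> list[dict]:
    """Prepend the few-shot examples to the real user message."""
    return few_shot + [{"role": "user", "content": user_input}]

messages = build_messages("Write a function that counts vowels.")
# `system` and `messages` can then be passed to client.messages.create(...)
```

The payoff: the system prompt stays tiny, and the examples live where they can be trimmed or swapped per request.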

Measure It

import anthropic

client = anthropic.Anthropic()
# The token-counting endpoint requires a model and at least one message,
# so the count includes the short "hi" message as well as the prompt.
count = client.messages.count_tokens(
    model="claude-sonnet-4-0",  # substitute any current model
    system=your_system_prompt,
    messages=[{"role": "user", "content": "hi"}],
)
print(f"System prompt: {count.input_tokens} tokens")

Aim for under 500 tokens for most use cases. If you’re over 1,000, you almost certainly have bloat.
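If you want a rough check without an API call, a common heuristic for English prose is about four characters per token. This is an approximation, not the real tokenizer, but it is good enough to flag obvious bloat:

```python
def approx_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token for English prose.

    This is a heuristic, not the actual tokenizer; use the token-counting
    API above when you need the real number.
    """
    return max(1, len(text) // 4)

prompt = "Write clean, documented code. Add docstrings. Comment non-obvious logic."
print(f"~{approx_tokens(prompt)} tokens")
```

If the estimate lands anywhere near 1,000, run the real count and start cutting.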