The fastest way to cut API costs is to stop using Opus for everything. Opus costs five times as much as Sonnet per token, and for most tasks Sonnet produces comparable quality.
Model Routing Strategy
```python
def select_model(task_type: str, complexity: str) -> str:
    """Return the model ID for a task. Verify IDs against the current model list."""
    # Opus: complex reasoning, nuanced analysis, difficult code
    if complexity == "high" or task_type in ["architecture_review", "complex_debug", "research"]:
        return "claude-opus-4-6-20260301"
    # Sonnet: the workhorse for 80% of tasks
    if task_type in ["code_generation", "summarization", "refactoring"]:
        return "claude-sonnet-4-6-20260301"
    # Haiku: simple formatting, extraction, classification
    if task_type in ["formatting", "extraction", "classification", "simple_qa"]:
        return "claude-3-5-haiku-20241022"
    # Anything unrecognized falls back to Sonnet
    return "claude-sonnet-4-6-20260301"
```
Cost Comparison (per 1M tokens)
| Model | Input | Output | Best For |
|---|---|---|---|
| Opus 4.6 | $15 | $75 | Deep reasoning, complex refactoring |
| Sonnet 4.6 | $3 | $15 | Most coding, summaries, analysis |
| Haiku 3.5 | $0.80 | $4 | Classification, formatting, extraction |
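
To see what these per-million prices mean for a single request, they can be turned into a small estimator. This is a minimal sketch: `PRICES` and `estimate_cost` are illustrative names rather than part of any SDK, the prices are copied from the table above and should be checked against current pricing, and the 5,000-in / 1,000-out request is a made-up example.

```python
# Prices in USD per 1M tokens, copied from the table above (assumed current).
PRICES = {
    "opus": {"input": 15.00, "output": 75.00},
    "sonnet": {"input": 3.00, "output": 15.00},
    "haiku": {"input": 0.80, "output": 4.00},
}


def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at list price."""
    price = PRICES[tier]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical request: 5,000 input tokens and 1,000 output tokens.
for tier in PRICES:
    print(f"{tier}: ${estimate_cost(tier, 5_000, 1_000):.4f}")
```

For that hypothetical request, the estimate comes to $0.15 on Opus, $0.03 on Sonnet, and $0.008 on Haiku.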
Stack Savings
Combine model routing with other techniques:
- Sonnet + Batch API = 50% off
- Sonnet + Prompt Caching = 90% off cached inputs
- Sonnet + Low Effort = fewer thinking tokens
- All combined = up to 95% savings vs. Opus at full price
A request costing $0.15 on Opus can cost under $0.01 on Sonnet with caching and batching.
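
The arithmetic behind that claim can be checked with a short sketch. It assumes a hypothetical request of 5,000 input tokens (all read from cache) and 1,000 output tokens, the list prices from the table above, and that the 50% batch discount and the 90% cached-input discount multiply. `stacked_cost` and its parameters are illustrative names, and cache-write surcharges are ignored.

```python
def stacked_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: float,            # USD per 1M input tokens
    output_price: float,           # USD per 1M output tokens
    cached_fraction: float = 0.0,  # share of input tokens read from cache
    batch: bool = False,
) -> float:
    """Estimate the USD cost of one request with optional caching and batching."""
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    # Assumption: cached reads are billed at 10% of the input price.
    input_cost = (uncached + cached * 0.10) * input_price
    output_cost = output_tokens * output_price
    total = (input_cost + output_cost) / 1_000_000
    # Assumption: the batch discount halves the whole bill.
    return total * 0.5 if batch else total


opus_full = stacked_cost(5_000, 1_000, 15.00, 75.00)
sonnet_stacked = stacked_cost(
    5_000, 1_000, 3.00, 15.00, cached_fraction=1.0, batch=True
)
savings = 1 - sonnet_stacked / opus_full
print(f"Opus: ${opus_full:.4f}")
print(f"Sonnet, cached and batched: ${sonnet_stacked:.5f}")
print(f"Savings: {savings:.1%}")
```

Under these assumptions the Opus request costs $0.15 and the stacked Sonnet request costs $0.00825, a saving of about 94.5%. Requests with less cacheable input or more output will save less.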