Route Tasks to the Right Model for 80% Cost Savings

Use Sonnet for roughly 80% of tasks and switch to Opus only for complex reasoning. Many developers overspend by sending every request to the most expensive model.

March 15, 2026

The fastest way to cut API costs is to stop using Opus for everything. Opus costs five times as much as Sonnet ($15 vs. $3 per 1M input tokens, $75 vs. $15 per 1M output tokens), and for most tasks Sonnet produces comparable quality.

Model Routing Strategy

def select_model(task_type: str, complexity: str) -> str:
    """Return the model ID for a task, choosing the cheapest tier that fits."""
    # Model IDs change over time; check them against the current model list.

    # Opus: complex reasoning, nuanced analysis, difficult code
    if complexity == "high" or task_type in ("architecture_review", "complex_debug", "research"):
        return "claude-opus-4-6-20260301"

    # Sonnet: the workhorse for 80% of tasks
    if task_type in ("code_generation", "summarization", "refactoring"):
        return "claude-sonnet-4-6-20260301"

    # Haiku: simple formatting, extraction, classification
    if task_type in ("formatting", "extraction", "classification", "simple_qa"):
        return "claude-3-5-haiku-20241022"

    # Anything unlisted falls back to Sonnet
    return "claude-sonnet-4-6-20260301"

Cost Comparison (per 1M tokens)

Model	Input	Output	Best For
Opus 4.6	$15	$75	Deep reasoning, complex refactoring
Sonnet 4.6	$3	$15	Most coding, summaries, analysis
Haiku 3.5	$0.80	$4	Classification, formatting, extraction
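
To see what the table means for a single request, here is a minimal sketch that turns the per-1M-token prices into a per-request estimate. The short model keys, the estimate_cost helper, and the 5,000-input / 1,000-output workload are illustrative assumptions, not part of any SDK.

```python
# Prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "opus": {"input": 15.00, "output": 75.00},
    "sonnet": {"input": 3.00, "output": 15.00},
    "haiku": {"input": 0.80, "output": 4.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated list-price cost in USD of one request."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 5,000 input tokens and 1,000 output tokens.
for name in PRICES:
    print(f"{name}: ${estimate_cost(name, 5_000, 1_000):.4f}")
```

For this assumed workload the estimate comes to about $0.15 on Opus, $0.03 on Sonnet, and $0.008 on Haiku, which matches the 5x gap between Opus and Sonnet.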

Stack Savings

Combine model routing with other techniques:

Sonnet + Batch API = 50% off
Sonnet + Prompt Caching = 90% off cached inputs
Sonnet + Low Effort = fewer thinking tokens
All combined = up to 95% savings vs. Opus at full price

A request costing $0.15 on Opus can cost under $0.01 on Sonnet with caching and batching.
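
The arithmetic behind that figure can be checked with a short sketch. It rests on several assumptions: the request has 5,000 input and 1,000 output tokens, every input token is a cache hit, the batch and caching discounts multiply, and the extra cost of writing the cache is ignored. The request_cost helper is hypothetical and only illustrates the calculation.

```python
# USD per 1M tokens, taken from the cost table in this article.
OPUS = {"input": 15.00, "output": 75.00}
SONNET = {"input": 3.00, "output": 15.00}

BATCH_MULTIPLIER = 0.5       # Batch API: 50% off the whole request
CACHE_READ_MULTIPLIER = 0.1  # cached input tokens: 90% off


def request_cost(prices, input_tokens, output_tokens, cached_fraction=0.0, batch=False):
    """Estimate one request's cost in USD.

    Assumes the batch and cache discounts multiply, and ignores the cost
    of writing the cache in the first place.
    """
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    input_cost = (uncached + cached * CACHE_READ_MULTIPLIER) * prices["input"]
    output_cost = output_tokens * prices["output"]
    total = (input_cost + output_cost) / 1_000_000
    return total * BATCH_MULTIPLIER if batch else total


# Hypothetical request: 5,000 input tokens, 1,000 output tokens.
opus_full = request_cost(OPUS, 5_000, 1_000)
sonnet_stacked = request_cost(SONNET, 5_000, 1_000, cached_fraction=1.0, batch=True)
savings = 1 - sonnet_stacked / opus_full

print(f"Opus, list price:       ${opus_full:.4f}")
print(f"Sonnet, cached + batch: ${sonnet_stacked:.4f}")
print(f"Savings:                {savings:.1%}")
```

Under these assumptions the Opus request costs $0.15 and the stacked Sonnet request a little over $0.008, a saving of roughly 94.5%. Real workloads with partial cache hits or longer outputs will save less, since output tokens only receive the batch discount.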