Automate Desktop Tasks with the Computer Use API

Use Claude's computer use capability to control a desktop environment, click through UIs, fill forms, and automate tasks that don't have APIs.

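Some tasks don’t have an API. With the computer use tool, Claude looks at screenshots of a screen and requests mouse moves, clicks, and keystrokes, which your code then carries out. It’s like giving Claude remote desktop access, with your application relaying each action.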

Basic API Call

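import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

# Computer use is a beta tool: call it through client.beta and pass the
# beta flag that matches the tool version. Tool versions are tied to model
# generations, so confirm this pairing against the current docs.
response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[{
        "type": "computer_20250124",
        "name": "computer",
        # Size of the screen that Claude will see in screenshots.
        "display_width_px": 1920,
        "display_height_px": 1080,
    }],
    messages=[{
        "role": "user",
        "content": "Open the browser, go to our staging app, and test the login flow."
    }],
    betas=["computer-use-2025-01-24"],
)

# The reply contains tool_use blocks (requested actions), not finished work.
print(response.stop_reason)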
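
A single call like the one above only returns Claude's first requested action. Your code has to perform that action and send the outcome back, repeating until Claude stops asking for tools. The sketch below shows one turn of that loop on plain dictionaries. The names build_tool_results and fake_execute are hypothetical, and the dictionaries mirror the tool_use and tool_result block shapes of the Messages API (the Python SDK returns objects carrying the same fields as attributes). A real executor would drive the mouse and keyboard inside a sandboxed desktop; the stand-in here only describes the action.

```python
from typing import Callable, List

# Hypothetical executor type: receives the input dict Claude sent, for example
# {"action": "left_click", "coordinate": [640, 360]}, and returns a list of
# content blocks (text and/or image) describing the outcome.
Executor = Callable[[dict], List[dict]]

# Illustrative subset of action names; the real tool defines more.
KNOWN_ACTIONS = {"screenshot", "left_click", "type", "key", "mouse_move"}


def fake_execute(tool_input: dict) -> List[dict]:
    """Stand-in executor: reports the action instead of performing it."""
    action = tool_input.get("action")
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return [{"type": "text", "text": f"performed {action}"}]


def build_tool_results(content_blocks: List[dict], execute: Executor) -> dict:
    """Turn the tool_use blocks of one assistant turn into the next user message."""
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # plain text blocks need no reply
        result = {"type": "tool_result", "tool_use_id": block["id"]}
        try:
            result["content"] = execute(block["input"])
        except Exception as exc:
            # Report failures back to Claude so it can try something else.
            result["content"] = [{"type": "text", "text": f"Error: {exc}"}]
            result["is_error"] = True
        results.append(result)
    return {"role": "user", "content": results}


assistant_turn = [
    {"type": "text", "text": "I'll click the login button."},
    {
        "type": "tool_use",
        "id": "toolu_01",
        "name": "computer",
        "input": {"action": "left_click", "coordinate": [640, 360]},
    },
]
next_message = build_tool_results(assistant_turn, fake_execute)
```

In a full loop you would append the assistant's content and next_message to messages, call the API again, and stop once stop_reason is no longer "tool_use" or a step cap of your choosing is reached.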

What It Can Do

  • Navigate web apps that don’t have APIs
  • Fill out forms and submit them
  • Take screenshots and verify visual state
  • Click through multi-step UI workflows
  • Read text from the screen
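
Screenshots reach Claude as image content inside a tool result. As a minimal sketch, assuming you already have the screenshot as PNG bytes from whatever capture method you use, the helpers below wrap those bytes in a base64 image block. The names image_block and screenshot_tool_result are hypothetical; the dictionary layout follows the image and tool_result block shapes of the Messages API.

```python
import base64


def image_block(png_bytes: bytes) -> dict:
    """Wrap raw PNG bytes as a base64 image content block."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": "image/png",
            "data": base64.b64encode(png_bytes).decode("ascii"),
        },
    }


def screenshot_tool_result(tool_use_id: str, png_bytes: bytes) -> dict:
    """Answer a screenshot request with the captured image."""
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,  # must match the id of the tool_use block
        "content": [image_block(png_bytes)],
    }


# Placeholder bytes (only the PNG signature) stand in for a real capture.
placeholder_png = b"\x89PNG\r\n\x1a\n"
result = screenshot_tool_result("toolu_01", placeholder_png)
```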

Real Use Cases

  • QA testing: walk through user flows in staging
  • Data entry: fill out forms in legacy systems
  • Admin tasks: configure settings in web dashboards
  • Visual verification: confirm UI matches a design mockup

Limitations

  • Slower than direct API calls (each step is a round trip that needs a fresh screenshot plus model processing)
  • Resolution matters: higher resolution means more tokens per screenshot
  • Not great for pixel-precise interactions
  • Works best with clear, well-labeled UIs
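
One common way to soften the resolution trade-off is to downscale screenshots before sending them, report the smaller size in display_width_px and display_height_px, and map the coordinates Claude returns back to the real screen. The arithmetic can be sketched as follows. The function names are hypothetical, and the 1280x800 ceiling is an assumed target chosen for illustration, not a documented requirement.

```python
from typing import Tuple


def fit_within(width: int, height: int,
               max_width: int = 1280, max_height: int = 800) -> Tuple[float, int, int]:
    """Return (scale, scaled_width, scaled_height) without ever upscaling."""
    scale = min(max_width / width, max_height / height, 1.0)
    return scale, round(width * scale), round(height * scale)


def to_screen(x: int, y: int, scale: float) -> Tuple[int, int]:
    """Map a coordinate from the scaled screenshot back to the real screen."""
    return round(x / scale), round(y / scale)


# A 1920x1080 screen is limited by the width ceiling here.
scale, scaled_w, scaled_h = fit_within(1920, 1080)

# A click Claude requests at the centre of the scaled image
# lands at the centre of the real screen.
real_x, real_y = to_screen(scaled_w // 2, scaled_h // 2, scale)
```

Rounding means a mapped click can land a pixel or so off target, which is one more reason to prefer large, clearly labeled controls.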

Tip

Combine computer use with Playwright MCP for web tasks. Playwright is faster and more reliable for structured web automation. Save computer use for tasks that genuinely need visual understanding.