The morning grind for any developer looks the same: manually verifying complex code refactors, babysitting multi-step agent tasks that stall halfway through, and squinting at high-resolution images that the model can barely parse. This week, Anthropic quietly shipped a model that changes that equation. Claude Opus 4.7 is now live, and early testers say it's the first model they trust to walk away from the hardest problems.
What Opus 4.7 Actually Changes
Anthropic made Claude Opus 4.7 available on April 16 across all Claude products, the API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. Pricing stays flat against Opus 4.6: $5 per million input tokens and $25 per million output tokens. The API identifier is `claude-opus-4-7`.
The headline improvements land in three areas. On advanced software engineering, Opus 4.7 shows a clear jump over Opus 4.6, especially on the hardest tasks. Early testers report they can now hand off their most difficult coding work without tight supervision. The model handles complex, long-running tasks with consistency, follows instructions precisely, and builds its own verification steps before reporting results.
Vision capability gets a meaningful upgrade too. The model processes images at higher resolution and produces more polished, creative outputs for professional tasks like interface design, slide decks, and document creation. It doesn't match the overall capability of Claude Mythos Preview, but it beats Opus 4.6 across a range of benchmarks.
The Security Trade-Off: Opus 4.7 as a Testing Ground
Opus 4.7 is the first model to ship under Project Glasswing, announced last week. Anthropic explicitly states that Claude Mythos Preview's cybersecurity capabilities are too powerful, so they're testing new safety measures on a less capable model first. Opus 4.7 has deliberately reduced cyber capabilities compared to Mythos Preview, and the training process included experiments to differentially suppress those abilities.
The model includes automatic detection and blocking of requests that indicate prohibited or high-risk cybersecurity use. Security professionals who need Opus 4.7 for legitimate work — vulnerability research, penetration testing, red team operations — can enroll in Anthropic's new Cyber Verification Program.
Safety evaluations show Opus 4.7 has a similar profile to Opus 4.6. Rates of concerning behaviors like deception, sycophancy, and misuse collaboration remain low. Honesty and resistance to malicious prompt injection improved over Opus 4.6, though the model's guardrails against providing overly detailed harm-reduction advice on controlled substances weakened slightly. The alignment evaluation concludes the model is "generally well-aligned and reliable, but behavior is not fully ideal."
What Developers Need to Know About Token Usage
Opus 4.7 is a direct upgrade from Opus 4.6, but two changes affect token consumption. First, it uses an updated tokenizer: the same input can map to roughly 1.0 to 1.35 times as many tokens, depending on content type. Second, the model "thinks" more at higher effort levels, which improves reliability on hard problems but generates more output tokens.
Developers can control token usage through the `effort` parameter, adjusting task budgets, or prompting the model to respond more concisely. Anthropic reports improved token efficiency across all effort levels in internal tests but recommends measuring the difference in real traffic. A migration guide is available.
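The pricing and tokenizer figures above are enough for a rough migration budget. The sketch below combines them; the $5/$25 per-million-token prices and the 1.0–1.35x multiplier come from the announcement, while the function and constant names are illustrative, and real output-token growth should be measured on live traffic as Anthropic recommends.

```python
# Rough cost projection for migrating a workload from Opus 4.6 to Opus 4.7.
# Prices and the tokenizer multiplier range come from the announcement;
# the function itself is an illustrative sketch, not an official tool.

INPUT_PRICE_PER_M = 5.00    # USD per million input tokens
OUTPUT_PRICE_PER_M = 25.00  # USD per million output tokens

def projected_cost(input_tokens_46: int, output_tokens_46: int,
                   tokenizer_multiplier: float = 1.35) -> float:
    """Estimate daily Opus 4.7 cost for a workload measured on Opus 4.6.

    Applies the tokenizer multiplier to input tokens only; output growth
    depends on the effort level and is best measured on real traffic.
    """
    input_tokens_47 = input_tokens_46 * tokenizer_multiplier
    return (input_tokens_47 / 1_000_000 * INPUT_PRICE_PER_M
            + output_tokens_46 / 1_000_000 * OUTPUT_PRICE_PER_M)

# Example: 10M input and 2M output tokens per day on Opus 4.6.
low = projected_cost(10_000_000, 2_000_000, tokenizer_multiplier=1.0)
high = projected_cost(10_000_000, 2_000_000, tokenizer_multiplier=1.35)
print(f"${low:.2f} - ${high:.2f} per day")  # prints "$100.00 - $117.50 per day"
```

Bracketing the estimate with both ends of the multiplier range gives a best-case and worst-case bill before any real traffic is measured.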
Image handling changes are immediately visible to developers. This is a model-level change, not an API parameter, so images sent to Claude are automatically processed at higher resolution. High-resolution images consume more tokens, so users who don't need the extra detail can downsample images before sending them to the model.
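Because the higher-resolution processing is automatic, the only lever is client-side. A minimal sketch of the downsampling step follows; the 1024 px cap is an arbitrary example, not a documented Claude limit, and the actual pixel resizing would be done with an image library such as Pillow.

```python
# Compute aspect-ratio-preserving target dimensions before upload.
# The 1024 px default cap is an arbitrary example value, not a documented
# limit; pass whatever long-edge budget suits your token costs.

def downsample_dims(width: int, height: int, max_edge: int = 1024) -> tuple[int, int]:
    """Return (w, h) scaled so the longer edge is at most max_edge."""
    longest = max(width, height)
    if longest <= max_edge:
        return (width, height)  # already small enough; send as-is
    scale = max_edge / longest
    return (max(1, round(width * scale)), max(1, round(height * scale)))

print(downsample_dims(4032, 3024))  # a 12 MP photo -> (1024, 768)
```

Computing the target size separately from the resize keeps the policy (how much detail you are willing to pay for) independent of whichever image library performs the actual scaling.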
Claude Design: A New Visual Tool
Alongside Opus 4.7, Anthropic launched Claude Design, a new Anthropic Labs product. The tool lets users collaborate with Claude to produce visual work — designs, prototypes, slides, and one-page documents.
Early tester feedback is broadly positive. One tester called Opus 4.7 "the first model I can hand my hardest coding problems to," while another noted that "image understanding has improved dramatically — it analyzes complex diagrams and UI mockups far better."
The full evaluation results are available in the Opus 4.7 system card.
This is the model that finally makes multi-step agent tasks feel like delegation instead of babysitting.