---
title: "Claude Opus 4.7 Released: 3× Wins & New xhigh Tier"
description: "Anthropic's Claude Opus 4.7 lands with 13% coding gains, 3× more production task wins, a new xhigh reasoning tier, task budgets, and upgraded vision."
date: 2026-04-17
category: AI
tags: ["AI", "Anthropic", "Claude", "LLM", "Programming"]
url: https://uper.pl/en/blog/claude-opus-4-7/
---

# Claude Opus 4.7 Released — 3× More Production Wins, 13% Better Coding, and the New xhigh Reasoning Tier Explained

On April 16, 2026, Anthropic shipped **Claude Opus 4.7** — the next iteration of its flagship model, laser-focused on long-context tasks, autonomous reasoning, and disciplined execution across multi-hour engineering projects. The launch arrives just as the entire industry is shifting weight from single-shot prompts to **agentic workflows** — sessions where a model plans, writes code, fixes bugs, and reports progress for hours or days without human supervision. Opus 4.7 is tuned exactly for that: slower error cascades, better cross-session memory, and visibly stronger results on engineering benchmarks.

![Claude Opus 4.7 — coding and agentic benchmark results](../../assets/images/blog/claude-opus-4-7-benchmarks.webp)

## What's New in Claude Opus 4.7

The new model keeps the same pricing as its predecessor ($5 per million input tokens, $25 per million output tokens), but ships meaningful upgrades across architecture and reasoning control. This is not a cosmetic bump — Anthropic explicitly notes the model is "substantially better at following instructions" and warns that some teams will need to **re-tune their prompts** to get the full benefit.

Key changes versus Opus 4.6:

- **`xhigh` effort level** — a new reasoning tier sitting between `high` and `max`, giving finer control over the cost-to-quality trade-off.
- **Task budgets** — a parameter that caps token spend inside long agentic runs without manual interruption.
- **Upgraded vision** — images up to 2,576 px on the long edge (~3.75 megapixels), more than 3× the previous capacity.
- **`/ultrareview` command** — a dedicated code-review mode for Claude Pro and Max subscribers.
- **Auto mode** — autonomous decision-making expanded to Max plan users.
- **Updated tokenizer** — better text handling, but the same prompt can now generate **1.0–1.35× more tokens** than before, which matters for API budgets.

## Benchmarks: Real Production Numbers, Not Marketing

Anthropic backs the release with hard numbers measured alongside production partners. That's a tonal shift — instead of closed academic evaluations, the company is leaning on data from real engineering environments.

| Benchmark / Partner | Opus 4.7 | Opus 4.6 |
| :--- | :--- | :--- |
| **Replit** (93-task coding test) | +13% | baseline |
| **Rakuten** (production tasks) | **3× more resolutions** | baseline |
| **CursorBench** | 70% success | 58% success |
| **GDPval-AA** (finance/legal) | state-of-the-art | lower |
| **XBOW** (pentesting) | 98.5% visual acuity | — |

The Rakuten result deserves attention — a 3× jump in resolved production tasks isn't just about code quality. It signals the model is dramatically more resistant to "getting stuck" in long workflows. That translates directly to operating cost: fewer iterations, fewer wasted tokens, fewer human corrections needed downstream.

## Agentic Behavior and Cross-Session Memory

The most important direction in Opus 4.7 is **operating across long time horizons**. The model has been tuned for scenarios where an agent runs for hours or days — writing code, running tests, resuming tasks after interruption, and keeping decisions consistent along the way.

The key improvements in this area:

- **Better cross-session memory use** — the model handles earlier notes, logs, and project artifacts more effectively.
- **Tool coordination** — smoother chaining of function calls inside multi-step plans.
- **Long-context stability** — fewer "fatigue hallucinations" when working with large repositories.

For teams building their own integrations with Claude-class models — whether through the API or Claude Code — this means you can push more logic into the model itself and strip out some of the orchestration layers that used to be required to keep things on the rails.

## Tokenizer Changes: What This Means for Your API Bill

The updated tokenizer improves text handling, but it comes with a practical consequence: **the same prompts can cost more**. Anthropic reports a 1.0–1.35× multiplier depending on content, and at `xhigh` and `max` effort the model also produces more output tokens — deliberately, in service of higher reliability.

What this means in practice for production teams:

1. **Revisit monthly budgets** — especially if you run large contexts or many parallel agents.
2. **Audit your long prompts** — non-Latin scripts and code with many special characters are particularly sensitive.
3. **Run A/B cost tests** — even at identical per-million rates, the effective cost per request can rise 10–35%.

## Safety and Prompt Injection Resistance

The external safety assessment is cautiously positive. Opus 4.7 is judged as **"largely well-aligned and trustworthy,"** with visible gains in honesty and resistance to prompt injection — one of the biggest attack vectors against agentic systems.

The downside is a modest regression in harm-reduction advice, which Anthropic openly flags in the documentation. For teams building products in regulated spaces — health, finance, legal — that means maintaining additional filter layers and validation, just as with [earlier models in this class](/en/blog/claude-mythos-capybara-class/).

## Availability and Integrations

Claude Opus 4.7 is available from day one across every major platform:

- **Claude.ai** — Free, Pro, Max, and Teams/Enterprise users.
- **Anthropic API** — under the model ID `claude-opus-4-7`.
- **Amazon Bedrock** — in all regions supporting the Claude family.
- **Google Cloud Vertex AI** — full integration with GCP pipelines.
- **Microsoft Foundry** — a new distribution channel in the Microsoft ecosystem.

Pro and Max subscribers also get access to the `/ultrareview` command, which kicks off a dedicated code-review session where the model spends a larger compute budget on verifying changes, catching regressions, and proposing justified refactors.

## Summary

Claude Opus 4.7 is an **evolutionary model — but in the places that actually matter** for production teams: long agentic sessions, precise instruction-following, discipline across multi-hour engineering projects, and finer control over the cost-to-quality trade-off thanks to the new `xhigh` tier. The 3× lift in resolved production tasks at partners like Rakuten shows the improvement goes beyond benchmarks and translates into real operational savings.

For companies building workflows on top of [AI-powered SEO](/en/blog/seo-in-ai-era/) or their own agentic tooling, Opus 4.7 sends a clear signal: the model layer is absorbing another chunk of autonomy, and the work of orchestration should increasingly focus on boundaries and observability rather than hand-holding the model step by step. The real test, as always, will be the API bill after a few weeks of production use — especially given the tokenizer changes.

## Sources

1. **Introducing Claude Opus 4.7 — Anthropic**
[https://www.anthropic.com/news/claude-opus-4-7](https://www.anthropic.com/news/claude-opus-4-7)

2. **Claude Opus 4.7 System Card — Anthropic**
[https://www.anthropic.com/claude-opus-4-7-system-card](https://www.anthropic.com/claude-opus-4-7-system-card)

3. **Claude API Documentation — Anthropic**
[https://docs.anthropic.com/claude/docs](https://docs.anthropic.com/claude/docs)

4. **Amazon Bedrock — Claude Models**
[https://aws.amazon.com/bedrock/claude/](https://aws.amazon.com/bedrock/claude/)

5. **Vertex AI — Claude on Google Cloud**
[https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/claude](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/claude)