Coding #anthropic #python-sdk #claude-opus-4-8 #sdk-release

Anthropic Python SDK 0.105: Opus 4.8 and Mid-Session System Prompts

Three SDK releases in 7.5 hours ship claude-opus-4-8 support, mid-conversation system blocks, and finer output usage reporting.

Creeta

May 29, 2026

Anthropic Python SDK 0.105: Opus 4.8 and Mid-Session System Prompts

Anthropic shipped three consecutive Python SDK releases — v0.105.0 through v0.105.2 — within 7.5 hours on May 28–29, 2026 , timed to the Claude Opus 4.8 launch. Only the first carries developer-visible changes: Opus 4.8 type support, mid-conversation system blocks, finer output usage reporting, and custom file size overrides for the Files API. The two follow-on patches fixed CI plumbing around PyPI publishing. Here's what changed and what requires action in your codebase.

Three Releases, 7.5 Hours: The v0.105 Sequence

v0.105.0 is the one that matters. It shipped at 16:52 UTC on May 28 via commit 43b5b1f , bundling four developer-facing changes. Everything after was infrastructure correction.

Quick Answer: Upgrade to v0.105.2. All four API-facing changes — Opus 4.8 type stubs, mid-conversation system blocks, output_tokens_details, and custom file size overrides — landed in v0.105.0. The two follow-on patches fixed PyPI Trusted Publishing with zero SDK surface changes.

Release	Time (UTC)	Changes
v0.105.0	May 28, 16:52	Opus 4.8 model literal, mid-conversation system blocks, `output_tokens_details`, custom file size cap override
v0.105.1	May 29, 00:07	PyPI Trusted Publishing via GitHub Actions OIDC — no SDK API changes
v0.105.2	May 29, 00:20	Packaging fix for Trusted Publishing migration — no SDK API changes

v0.105.1 and v0.105.2 are separated by 13 minutes . The v0.105.2 changelog is empty, confirming it was a pyproject.toml or workflow YAML correction required for the Trusted Publishing upload to land cleanly. No observable SDK differences between the two. Pin to 0.105.2.

Mid-Conversation System Blocks: How It Works

The Messages API now accepts system-role entries inside the messages array — not only at the top-level system parameter. This lets you inject updated instructions at any point in a conversation without resetting state or replaying message history.

"System entries now accepted within messages array for mid-task instruction updates." — Anthropic Platform Documentation (source: platform.claude.com)

This is the SDK change most relevant to long-horizon agentic work. Claude Code's Dynamic Workflows feature (research preview) allows hundreds of parallel subagents within a single session . When requirements shift mid-run — new constraints, reprioritized goals — you can now push updated system instructions without tearing down session state.

No breaking migration. The top-level system parameter continues to work exactly as before. The messages-array form is purely additive.

claude-opus-4-8 in the SDK: Type Stubs and IDE Completion

Commit f18b014 registers claude-opus-4-8 as a valid model literal in the SDK's type stubs and internal routing . On SDK versions ≤0.104.1, you can still target the model by passing the string as an untyped raw literal — the REST API accepts it — but IDE completion and type validation are lost.

Model string: claude-opus-4-8. No date suffix required. Identical across Claude API, Bedrock, and Vertex routing.

Two models are formally deprecated with a hard retirement date :

claude-sonnet-4-20250514 — retires June 15, 2026
claude-opus-4-20250514 — retires June 15, 2026

If you're on either, migrate now. Three weeks is a short runway for production services.

Finer Output Usage Reporting

v0.105.0 adds usage.output_tokens_details, completing a three-layer token reporting structure in the SDK :

Input layer: input_tokens_details — cache hit/miss breakdown (existing)
Thinking layer: thinking_token_count beta — introduced in v0.104.0
Output layer: output_tokens_details — new in v0.105.0

The sub-field schema for output_tokens_details is not yet documented in the changelog. Add the field to your cost dashboard parsers now and log the raw object until the schema stabilizes. This field is purely additive — no existing response parsers break.

Custom File Size Caps in Document Uploads

PR #1825 lets developers pass a per-call file size override when uploading via the Files API . Before this change, handling larger payloads — legal filings, research corpora, large codebases — required monkey-patching the SDK's internal size constant. The override is per-call only; no SDK-level defaults are altered.

If your pipeline currently monkey-patches the file size constant, remove the patch after upgrading to 0.105.2 and use the documented parameter instead. The old approach will conflict with future SDK internals.

Opus 4.8: Effort Parameter, Adaptive Thinking, and Platform Coverage

Claude Opus 4.8 (claude-opus-4-8) launched May 28, 2026 on four platforms: Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry. The Foundry context cap is 200k tokens; all other platforms support 1M tokens and 128k max output .

Claude Opus 4.8 scores 69.2% on SWE-Bench Pro, outperforming GPT-5.5 and Gemini 3.1 Pro . — Anthropic benchmarks (source: Decrypt, May 2026)

Mode	Speed	Input	Output	Context
Standard	Baseline	$5 / 1M tokens	$25 / 1M tokens	1M tokens (200k on Foundry)
Fast	~2.5×	~⅓ prior Opus fast cost	~⅓ prior Opus fast cost	Same

The effort parameter defaults to high on all surfaces. This is a latent cost trap: code targeting earlier Opus models will silently run at maximum effort without an explicit override. Set effort='low' or effort='medium' to reduce spend and latency.

Adaptive thinking is enabled on Opus 4.8; extended thinking is not. Opus 4.5 and 4.6 used a separate thinking block — Opus 4.8 does not support that configuration. If your code passes a thinking config object, test behavior carefully before deploying to the new model.

PyPI Trusted Publishing: Dropping the Stored Secret

v0.105.1 replaces long-lived PyPI API token uploads with GitHub Actions OIDC-based Trusted Publishing . Removing a stored secret from the release pipeline is a concrete supply-chain security improvement, not just a process change.

v0.105.2 followed 13 minutes later with an empty changelog — consistent with a packaging metadata fix required for the upload to succeed on PyPI. There are no observable SDK API differences between 0.105.1 and 0.105.2. Pin to 0.105.2 as the stable endpoint.

Frequently Asked Questions

Do I need to upgrade to v0.105 to call claude-opus-4-8?

Not strictly — you can pass "claude-opus-4-8" as an untyped string literal on SDK versions ≤0.104.1 and the REST API will accept it. But you lose IDE type completion and validation. For any production code targeting Opus 4.8, upgrading to 0.105.2 is the recommended path. The two-step upgrade cost is minimal; the type safety benefit is not.

What does the `effort` parameter default to on Opus 4.8, and how do I override it?

It defaults to high on all surfaces — Claude API, Claude Code, Amazon Bedrock, and Google Vertex AI. Pass effort='low' or effort='medium' explicitly in your API call to trade reasoning depth for reduced latency and cost. If your cost model was built against earlier Opus pricing and latency baselines, review your assumptions before deploying to Opus 4.8 without an explicit effort value.

How do mid-conversation system blocks differ from the top-level system parameter?

The top-level system parameter is set once at conversation start and applies uniformly to the entire session. Mid-conversation system blocks let you inject a system-role entry at any point inside the messages array — updating instructions without resetting conversation state or replaying message history. Both mechanisms remain valid. The messages-array form is purely additive: no migration required, no existing behavior changes.

Why did Anthropic publish three SDK releases in one night?

v0.105.0 was the substantive release carrying all four developer-facing changes. v0.105.1 switched the publishing pipeline to PyPI Trusted Publishing via GitHub Actions OIDC. v0.105.2 patched a workflow or packaging issue that surfaced immediately in that transition — its empty changelog and 13-minute gap from v0.105.1 are the tell. The last two releases carry no SDK API changes and were infrastructure corrections to the publishing automation.

What is the cost difference between Opus 4.8 standard and fast mode?

Standard is $5/M input tokens and $25/M output tokens . Fast mode runs at approximately 2.5× speed at roughly one-third of prior Opus fast-mode cost. Fast-mode per-token rates are subject to change, so confirm current figures on the Anthropic pricing page before committing to a cost budget.

What to Watch Next

Two open questions remain from this release cycle. The output_tokens_details sub-field schema is undocumented — watch the SDK CHANGELOG for a follow-up entry before relying on specific sub-fields in production parsers. The v0.105.2 defect description is also unpublished; if you mirror or audit Anthropic's release tooling, review the workflow YAML diff when it becomes visible in the commit history.

Anthropic has referenced forthcoming "Mythos-class models" in the coming weeks without a public specification. Given the SDK's rapid patch cadence around this launch, expect another tight release window when those models land. Keeping a pinned dependency and a clear upgrade checklist will reduce friction when that happens.

Last updated: 2026-05-29. Based on SDK release notes and CHANGELOG as of May 29, 2026.