Apply the same fix to the three Dockerfiles not in the release pipeline
today (alpine, dev, health_check) so they stay correct if/when they're
built for arm64 in the future.
Wolfi pins are not present in these files; the python:3.11-alpine and
python:3.13-slim digests they already use are multi-arch indexes that
include arm64/v8, so only the uv pin needed swapping.
The previous pins resolved to single-platform amd64 manifests, so buildx
pulled the same amd64 base for both linux/amd64 and linux/arm64 targets.
The published OCI index then advertised an arm64 entry whose layers are
byte-identical to amd64 -- arm64 users got an amd64 binary.
Switch all three Dockerfiles to the multi-arch image-index digests:
- cgr.dev/chainguard/wolfi-base (index has linux/amd64 + linux/arm64)
- ghcr.io/astral-sh/uv:0.11.7 (index has linux/amd64 + linux/arm64)
Resolved with `docker buildx imagetools inspect <ref>` -- that returns
the index digest. `docker pull` + `docker inspect` returns the per-host
platform digest, which is what slipped in last time.
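
A quick guard against repeating this is to check the manifest's media type before pinning a digest. The sketch below assumes the raw manifest JSON has already been fetched (for example with `docker buildx imagetools inspect --raw <ref>`) and parsed into a dict. `is_multi_arch_index` and `platforms` are hypothetical helper names, not part of any existing tool.

```python
# Media types that identify a multi-platform index
# (OCI image index or Docker manifest list).
INDEX_MEDIA_TYPES = {
    "application/vnd.oci.image.index.v1+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
}


def is_multi_arch_index(manifest: dict) -> bool:
    """Return True if the parsed manifest is an index, not a single-platform manifest."""
    return manifest.get("mediaType") in INDEX_MEDIA_TYPES


def platforms(manifest: dict) -> set[str]:
    """Collect "os/arch" strings from an index; empty set for a single-platform manifest."""
    if not is_multi_arch_index(manifest):
        return set()
    found = set()
    for entry in manifest.get("manifests", []):
        plat = entry.get("platform", {})
        os_name = plat.get("os")
        arch = plat.get("architecture")
        # Skip entries without a real platform (e.g. unknown/unknown attestations).
        if os_name and arch and os_name != "unknown":
            found.add(f"{os_name}/{arch}")
    return found
```

A digest is only safe to pin for a two-platform build when `platforms(...)` contains both `linux/amd64` and `linux/arm64`.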
Anthropic's main API no longer resolves the non-canonical 'claude-4-sonnet-20250514'
alias for freshly issued keys, returning 404 not_found_error. PR #27031 already
swept three other live tests pinned to this alias to claude-haiku-4-5-20251001
but missed test_multiturn_tool_calls in the responses API suite, which is now
failing reliably on PR CI runs (e.g. PR #27074, job 1603363).
Bump the two model references in test_multiturn_tool_calls to the same
claude-haiku-4-5-20251001 snapshot used by PR #27031 -- it covers everything
this test exercises (tool calling, multi-turn) and isn't on a deprecation
schedule.
Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
The Vertex batch output transformer was emitting both a populated 'response' and 'error' for failed batch entries. The OpenAI Batch output spec defines them as mutually exclusive: on error, 'response' MUST be null. This broke any consumer using 'result["response"] is None' to detect failures.
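
The mutual-exclusivity rule can be sketched as follows. This is a minimal illustration, not the transformer's actual code; `build_batch_output_line` and its parameters are hypothetical names.

```python
from typing import Any, Optional


def build_batch_output_line(
    custom_id: str,
    response: Optional[dict[str, Any]],
    error: Optional[dict[str, Any]],
) -> dict[str, Any]:
    """Build one output line where 'response' and 'error' are mutually exclusive."""
    if error is not None:
        # A failed entry carries only the error; 'response' must be null.
        return {"custom_id": custom_id, "response": None, "error": error}
    return {"custom_id": custom_id, "response": response, "error": None}
```

With this shape, a consumer's `result["response"] is None` check reliably identifies failed entries.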
Setting reasoning_effort="none" on Anthropic chat models (direct, Bedrock
Invoke, Bedrock Converse, Vertex AI Anthropic, Azure AI Anthropic) crashed
LiteLLM with:
litellm.APIConnectionError: 'NoneType' object has no attribute 'get'
Both the Anthropic chat transformation and Bedrock Converse called
``AnthropicConfig._map_reasoning_effort`` and assigned the ``None`` it returns
for ``"none"`` directly to ``optional_params["thinking"]``. Downstream
``is_thinking_enabled`` then did ``optional_params["thinking"].get("type")``
and crashed.
Pop ``thinking`` (and on Claude 4.6/4.7, ``output_config``) instead of
assigning ``None``, restoring the documented contract that
``reasoning_effort="none"`` means "do not enable thinking". This also
prevents downstream Anthropic 400s ("thinking: Input should be an object",
"output_config.effort: Input should be ...") if the bug were ever masked.
Verified end-to-end against the live Anthropic API and Bedrock Converse
on claude-opus-4-{5,6,7} and claude-sonnet-4-6, plus Bedrock Invoke for
Claude 4.5/4.6. Vertex AI Anthropic and Azure AI Anthropic inherit the
fixed ``map_openai_params`` from ``AnthropicConfig`` and need no further
changes.
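
The pop-instead-of-assign pattern can be sketched roughly as below. All three functions are hypothetical stand-ins rather than LiteLLM's real implementations, the budget values are illustrative only, and the sketch pops `output_config` unconditionally for brevity where the fix does so only on Claude 4.6/4.7.

```python
from typing import Any, Optional


def map_reasoning_effort(effort: str) -> Optional[dict[str, Any]]:
    """Hypothetical stand-in: None for "none", a thinking config otherwise."""
    if effort == "none":
        return None
    budgets = {"low": 1024, "medium": 2048, "high": 4096}  # illustrative values only
    return {"type": "enabled", "budget_tokens": budgets[effort]}


def apply_reasoning_effort(optional_params: dict[str, Any], effort: str) -> None:
    """Set or clear thinking on optional_params without ever storing None."""
    thinking = map_reasoning_effort(effort)
    if thinking is None:
        # Remove the keys rather than assigning None, so later
        # `.get("type")` lookups never run against a None value.
        optional_params.pop("thinking", None)
        optional_params.pop("output_config", None)
    else:
        optional_params["thinking"] = thinking


def is_thinking_enabled(optional_params: dict[str, Any]) -> bool:
    """Hypothetical downstream check that reads the thinking config."""
    thinking = optional_params.get("thinking")
    return thinking is not None and thinking.get("type") == "enabled"
```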
Add litellm.disable_vertex_batch_output_transformation (default False).
When True, afile_content returns the raw Vertex predictions.jsonl untouched,
so users who parse candidates/modelVersion directly are not broken.
The method was overwriting logging_obj.optional_params, logging_obj.model,
and logging_obj.start_time on the caller's Logging instance. When invoked
from llm_http_handler.py's generic framework path, the framework's own
logging_obj (which already went through pre_call) had its properties
clobbered, causing model and start_time to reflect the last batch line's
values rather than the original call context.
Fix: create a fresh local Logging instance for the per-line transformation
instead of mutating the incoming logging_obj. The caller's object is now
left entirely untouched regardless of whether a logging_obj was passed in
or not.
Regression tests added to verify model, start_time, and optional_params
are not mutated on the caller's logging_obj.
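
The shape of the fix, sketched under assumptions: `FakeLogging` is a hypothetical stand-in for the real Logging class, and `transform_batch_line` is an illustrative name, not the actual method.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class FakeLogging:
    """Hypothetical stand-in for a logging object; not LiteLLM's Logging class."""

    model: str
    start_time: float
    optional_params: dict[str, Any] = field(default_factory=dict)


def transform_batch_line(
    line: dict[str, Any], logging_obj: Optional[FakeLogging] = None
) -> FakeLogging:
    """Build per-line logging state without touching the caller's object."""
    # `logging_obj` is accepted for signature compatibility but never written to.
    local_logging = FakeLogging(
        model=line["model"],
        start_time=line["start_time"],
        optional_params=copy.deepcopy(line.get("params", {})),
    )
    return local_logging
```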
Co-authored-by: Sameer Kankute <Sameerlite@users.noreply.github.com>
- Remove unused _encode_gcp_label_value / _decode_gcp_label_value singular
helpers; only the _chunks variants are actually called.
- Use 'is not None' check for custom_id so empty-string custom_ids are
still labeled and round-trip through batch outputs.
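
The difference between the two checks, as a minimal sketch; `custom_id_label` and the `custom_id` label key are hypothetical names used only for illustration.

```python
from typing import Optional


def custom_id_label(custom_id: Optional[str]) -> dict[str, str]:
    """Return the label to attach for a custom_id, or no label when it is None."""
    # `is not None` keeps "" as a real value; a plain truthiness check
    # (`if custom_id:`) would drop it and break the round trip.
    if custom_id is not None:
        return {"custom_id": custom_id}
    return {}
```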
Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
Newer Cloudflare Workers AI models (e.g. Nemotron) emit 'response_text'
instead of 'response' on streamed chunks. The non-streaming path was
already updated to fall back to 'response_text' (#26385), but the
streaming chunk parser still only read 'response', which caused
streaming requests against those models to silently produce empty
content.
Mirror the non-streaming fallback in CloudflareChatResponseIterator.chunk_parser
and add a streaming test for the response_text shape.
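
A minimal sketch of the fallback, assuming each streamed chunk has already been decoded into a dict; `extract_chunk_text` is a hypothetical helper, not the body of the real chunk_parser.

```python
from typing import Any


def extract_chunk_text(chunk: dict[str, Any]) -> str:
    """Read streamed text from 'response', falling back to 'response_text'."""
    text = chunk.get("response")
    if text is None:
        # Newer models emit 'response_text' instead of 'response'.
        text = chunk.get("response_text")
    return text or ""
```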
Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
base_anthropic_messages_test.test_anthropic_messages_with_thinking and
test_anthropic_streaming_with_thinking were still pinned to
claude-4-sonnet-20250514, the same legacy alias Anthropic no longer
recognizes under freshly issued keys. The other four tests in this base
class already use claude-sonnet-4-5-20250929; these two were missed.
Bump to claude-haiku-4-5-20251001 (supports_reasoning=true, no upcoming
deprecation). Subclasses including TestAnthropicPassthroughBasic
inherit these methods.
test_anthropic_messages_streaming_cost_injection hits the proxy's
/v1/messages route, which routes via the anthropic/* wildcard to
api.anthropic.com. The 404 surfaced in the test was Anthropic's own
not_found_error propagated back through the proxy (visible from the
x-litellm-model-id hash on the response — the proxy did route).
Same root cause as the prior commit: the legacy claude-4-sonnet-20250514
alias is no longer recognized by Anthropic's main API under the new key.
Swap to claude-haiku-4-5-20251001 — same routing path, canonical model.
Three live-API tests were pinned to claude-4-sonnet-20250514, which is a
non-canonical alias of claude-sonnet-4-20250514. Anthropic's main API
no longer resolves the legacy form under freshly issued keys, so the
tests fail with not_found_error. The token counter test pinned to
claude-sonnet-4-20250514 itself (deprecation_date 2026-05-14, two weeks
out) was on borrowed time too.
Bump all four to claude-haiku-4-5-20251001 — capability superset for what
these tests exercise (streaming, parallel tool calling, extended thinking,
token counting), no upcoming deprecation, cheaper per-token.
Mirrors the membership rule on /key/update so that /key/generate and
/key/{key}/regenerate apply the same `_validate_caller_can_assign_key_org`
gate when the caller specifies an `organization_id`. Proxy admins bypass it.
The check no-ops when `organization_id` is not being set.
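
The three outcomes described above (no-op, admin bypass, membership check) can be sketched as follows. The function name, parameters, and `OrgAssignmentError` are hypothetical; the real `_validate_caller_can_assign_key_org` may take different arguments and raise a different error.

```python
from typing import Optional


class OrgAssignmentError(Exception):
    """Hypothetical error raised when the caller may not assign the org."""


def validate_caller_can_assign_org(
    organization_id: Optional[str],
    caller_is_proxy_admin: bool,
    caller_org_ids: set[str],
) -> None:
    """Raise unless the caller is allowed to put a key in `organization_id`."""
    if organization_id is None:
        return  # no organization is being set, so there is nothing to check
    if caller_is_proxy_admin:
        return  # proxy admins bypass the membership rule
    if organization_id not in caller_org_ids:
        raise OrgAssignmentError(
            f"caller is not a member of organization {organization_id!r}"
        )
```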