Commit Graph

38722 Commits

Author SHA1 Message Date
user
bfdd786962
chore(deps): refresh dependency locks
2026-05-04 11:36:18 -07:00
Mateo Wang
790d8bbe1a
Merge pull request #26899 from BerriAI/litellm_suppress-spend-log-tracebacks-2208
feat(spend-logs): opt-in suppression of stack traces in spend-tracking error logs
2026-05-04 11:09:03 -07:00
yuneng-jiang
b2c270e653
Merge pull request #27123 from BerriAI/litellm_/intelligent-fermat-298a82
[Fix] Docker: Pin Wolfi And Uv To Multi-Arch Index Digests
2026-05-04 10:08:47 -07:00
Yuneng Jiang
25a5cccc7a
[Fix] Docker: Pin Uv To Multi-Arch Index Digest In Remaining Dockerfiles
Apply the same fix to the three Dockerfiles not in the release pipeline
today (alpine, dev, health_check) so they stay correct if/when they're
built for arm64 in the future.

Wolfi pins are not present in these files; the python:3.11-alpine and
python:3.13-slim digests they already use are multi-arch indexes that
include arm64/v8, so only the uv pin needed swapping.
2026-05-04 10:02:48 -07:00
Michael-RZ-Berri
675e49ed94
Merge pull request #26894 from BerriAI/litellm_langsmithRedactApiInfo
[Fix] Remove unwanted metadata info from LangSmith
2026-05-04 09:55:59 -07:00
Yuneng Jiang
08d130a8fe
[Fix] Docker: Pin Wolfi And Uv To Multi-Arch Index Digests
The previous pins resolved to single-platform amd64 manifests, so buildx
pulled the same amd64 base for both linux/amd64 and linux/arm64 targets.
The published OCI index then advertised an arm64 entry whose layers are
byte-identical to amd64 -- arm64 users got an amd64 binary.

Switch all three Dockerfiles to the multi-arch image-index digests:
  - cgr.dev/chainguard/wolfi-base   (index has linux/amd64 + linux/arm64)
  - ghcr.io/astral-sh/uv:0.11.7     (index has linux/amd64 + linux/arm64)

Resolved with `docker buildx imagetools inspect <ref>` -- that returns
the index digest. `docker pull` + `docker inspect` returns the per-host
platform digest, which is what slipped in last time.
2026-05-04 09:55:53 -07:00
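The distinction commit 08d130a8fe draws between an index digest and a per-platform digest can be checked mechanically. What follows is a minimal sketch, assuming the manifest document has already been fetched and parsed into a dict (for example from the raw output of the `docker buildx imagetools inspect` command the commit mentions). The function names and the idea of a "required platforms" check are illustrative assumptions, not part of the repository.

```python
# Media types that denote a multi-platform index rather than a single image.
INDEX_MEDIA_TYPES = {
    "application/vnd.oci.image.index.v1+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
}


def platforms_in_manifest(doc: dict) -> list:
    """Return sorted 'os/arch' strings if doc is an image index, else []."""
    if doc.get("mediaType") not in INDEX_MEDIA_TYPES:
        # A single-platform manifest carries no platform list of its own.
        return []
    found = []
    for entry in doc.get("manifests", []):
        platform = entry.get("platform") or {}
        os_name = platform.get("os")
        arch = platform.get("architecture")
        # Skip attestation entries, which report an "unknown" platform.
        if os_name and arch and os_name != "unknown":
            found.append(f"{os_name}/{arch}")
    return sorted(found)


def covers(doc: dict, required=("linux/amd64", "linux/arm64")) -> bool:
    """True only when every required platform is listed in the index."""
    return set(required) <= set(platforms_in_manifest(doc))
```

A pin that resolves to a single-platform manifest fails this check, which is the failure mode the commit message describes.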
Mateo Wang
196c7a0c09
Merge pull request #27077 from BerriAI/litellm_fix_responses_api_legacy_claude_4_sonnet-9574
test(responses): replace legacy `claude-4-sonnet-20250514` alias in multiturn tool-call test
2026-05-04 09:50:17 -07:00
Mateo Wang
c011a7e3ba
Merge pull request #27041 from BerriAI/litellm_vertex-batch-error-response-null-46dd
fix(vertex-ai): set response=null on batch error entries per OpenAI spec
2026-05-03 04:08:42 -07:00
Cursor Agent
2e6965381e
test(responses): replace legacy claude-4-sonnet alias in multiturn tool-call test
Anthropic's main API no longer resolves the non-canonical 'claude-4-sonnet-20250514'
alias for freshly issued keys, returning 404 not_found_error. PR #27031 already
swept three other live tests pinned to this alias to claude-haiku-4-5-20251001
but missed test_multiturn_tool_calls in the responses API suite, which is now
failing reliably on PR CI runs (e.g. PR #27074, job 1603363).

Bump the two model references in test_multiturn_tool_calls to the same
claude-haiku-4-5-20251001 snapshot used by PR #27031 -- it covers everything
this test exercises (tool calling, multi-turn) and isn't on a deprecation
schedule.

Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
2026-05-03 07:48:56 +00:00
Mateo Wang
c94a8d6514
Merge pull request #27039 from BerriAI/litellm_fix_reasoning_effort_none_anthropic
fix(anthropic,bedrock): omit thinking/output_config when reasoning_effort="none"
2026-05-02 01:42:50 -07:00
mateo-berri
7f3d7616b7
Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_fix_reasoning_effort_none_anthropic
Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
2026-05-02 08:37:07 +00:00
mateo-berri
cf9c2f0200
test(vertex-ai): cover transformation_error path emits response=null
2026-05-02 08:33:13 +00:00
Mateo Wang
4953b9e296
Merge pull request #26456 from BerriAI/litellm_hotfix_gpt-5.5-minimal-flag
2026-05-02 01:23:14 -07:00
mateo-berri
946dfb63c7
fix(vertex-ai): set response=null on batch error entries per OpenAI spec
The Vertex batch output transformer was emitting both a populated 'response' and 'error' for failed batch entries. The OpenAI Batch output spec defines them as mutually exclusive: on error 'response' MUST be null. This broke any consumer using 'result["response"] is None' to detect failures.
2026-05-02 08:15:56 +00:00
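The consumer pattern that commit 946dfb63c7 says was broken can be sketched as follows. This is an illustrative example only: the helper name and the sample records are made up, and the only thing taken from the commit message is the rule that a failed entry carries `response` set to null alongside a populated `error`.

```python
import json


def split_batch_results(jsonl_text: str):
    """Partition batch output lines into (succeeded, failed) custom_ids."""
    succeeded, failed = [], []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the output file
        result = json.loads(line)
        # Per the rule quoted in the commit, an error entry has response == null.
        if result["response"] is None:
            failed.append(result["custom_id"])
        else:
            succeeded.append(result["custom_id"])
    return succeeded, failed


# Hypothetical two-line output file: one success, one failure.
SAMPLE = "\n".join([
    json.dumps({"custom_id": "req-1",
                "response": {"status_code": 200, "body": {}},
                "error": None}),
    json.dumps({"custom_id": "req-2",
                "response": None,
                "error": {"code": "bad_request", "message": "invalid input"}}),
])
```

If a failed entry also carried a populated `response`, this check would misfile it as a success, which is the behavior the fix removes.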
Mateo Wang
6dd04357f6
Merge pull request #25627 from BerriAI/litellm_vertex-batch-output-transformation
feat(vertex-ai): transform batch prediction outputs to OpenAI format
2026-05-02 01:10:09 -07:00
mateo-berri
3835306c83
fix(anthropic,bedrock): omit thinking/output_config when reasoning_effort="none"
Setting reasoning_effort="none" on Anthropic chat models (direct, Bedrock
Invoke, Bedrock Converse, Vertex AI Anthropic, Azure AI Anthropic) crashed
LiteLLM with:

  litellm.APIConnectionError: 'NoneType' object has no attribute 'get'

Both the Anthropic chat transformation and Bedrock Converse called
``AnthropicConfig._map_reasoning_effort`` and assigned the ``None`` it returns
for ``"none"`` directly to ``optional_params["thinking"]``. Downstream
``is_thinking_enabled`` then did ``optional_params["thinking"].get("type")``
and crashed.

Pop ``thinking`` (and on Claude 4.6/4.7, ``output_config``) instead of
assigning ``None``, restoring the documented contract that
``reasoning_effort="none"`` means "do not enable thinking". This also
prevents downstream Anthropic 400s ("thinking: Input should be an object",
"output_config.effort: Input should be ...") if the bug were ever masked.

Verified end-to-end against the live Anthropic API and Bedrock Converse
on claude-opus-4-{5,6,7} and claude-sonnet-4-6, plus Bedrock Invoke for
Claude 4.5/4.6. Vertex AI Anthropic and Azure AI Anthropic inherit the
fixed ``map_openai_params`` from ``AnthropicConfig`` and need no further
changes.
2026-05-02 01:08:07 -07:00
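The pop-instead-of-assign fix described in commit 3835306c83 can be illustrated with a small standalone sketch. Everything here is a hypothetical stand-in: `map_effort`, `apply_reasoning_effort`, and the token budgets are invented for the example and are not LiteLLM's actual functions or values. The sketch also drops `output_config` unconditionally, whereas the commit limits that to Claude 4.6/4.7.

```python
from typing import Optional


def map_effort(effort: str) -> Optional[dict]:
    """Hypothetical mapper: returns None for "none", a thinking dict otherwise."""
    if effort == "none":
        return None
    budgets = {"low": 1024, "medium": 2048, "high": 4096}  # illustrative values
    return {"type": "enabled", "budget_tokens": budgets[effort]}


def apply_reasoning_effort(optional_params: dict, effort: str) -> dict:
    thinking = map_effort(effort)
    if thinking is None:
        # Remove the keys entirely instead of storing None under them.
        optional_params.pop("thinking", None)
        optional_params.pop("output_config", None)
    else:
        optional_params["thinking"] = thinking
    return optional_params


def is_thinking_enabled(optional_params: dict) -> bool:
    # Mirrors the failure mode: calling .get on a stored None would raise.
    return ("thinking" in optional_params
            and optional_params["thinking"].get("type") == "enabled")
```

With the key absent, the downstream check never touches a `None` value, so the `'NoneType' object has no attribute 'get'` error cannot occur on this path.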
mateo-berri
439217511a
feat: add opt-out flag for Vertex batch output transformation
Adds litellm.disable_vertex_batch_output_transformation (default False).
When True, afile_content returns raw Vertex predictions.jsonl untouched
so users that parse candidates/modelVersion directly are not broken.
2026-05-02 07:39:17 +00:00
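The opt-out behavior described in commit 439217511a amounts to a guard in front of the transformation. Below is a minimal sketch, assuming a module-level settings object; the `settings` namespace, `file_content`, and the `transform` callback are hypothetical stand-ins, and only the flag name and its default of False come from the commit message.

```python
from types import SimpleNamespace
from typing import Callable

# Stand-in for the module that would hold the flag (default False per the commit).
settings = SimpleNamespace(disable_vertex_batch_output_transformation=False)


def file_content(raw: str, transform: Callable[[str], str]) -> str:
    """Return raw content untouched when the opt-out flag is set."""
    if settings.disable_vertex_batch_output_transformation:
        return raw
    return transform(raw)
```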
Cursor Agent
480bea2111
fix: don't mutate caller's logging_obj in _try_transform_vertex_batch_output_to_openai
The method was overwriting logging_obj.optional_params, logging_obj.model,
and logging_obj.start_time on the caller's Logging instance. When invoked
from llm_http_handler.py's generic framework path, the framework's own
logging_obj (which already went through pre_call) had its properties
clobbered, causing model and start_time to reflect the last batch line's
values rather than the original call context.

Fix: create a fresh local Logging instance for the per-line transformation
instead of mutating the incoming logging_obj. The caller's object is now
left entirely untouched regardless of whether a logging_obj was passed in
or not.

Regression tests added to verify model, start_time, and optional_params
are not mutated on the caller's logging_obj.

Co-authored-by: Sameer Kankute <Sameerlite@users.noreply.github.com>
2026-05-02 07:13:25 +00:00
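The approach described in commit 480bea2111, building a fresh local object per line rather than writing to the caller's, might look like the sketch below. The `Logging` dataclass and `transform_lines` are simplified hypothetical stand-ins, not the real LiteLLM classes or method.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Logging:
    """Hypothetical stand-in for a per-call logging context."""
    model: str
    start_time: float
    optional_params: dict = field(default_factory=dict)


def transform_lines(lines: list, logging_obj: Optional[Logging] = None) -> list:
    """Transform each line with its own Logging; never write to logging_obj."""
    out = []
    for line in lines:
        # Fresh instance per line, so per-line values cannot leak to the caller.
        local_log = Logging(
            model=line["model"],
            start_time=time.time(),
            optional_params=dict(line.get("params", {})),
        )
        out.append({"model": local_log.model, "text": line["text"]})
    return out
```

The caller's object is accepted only so the signature stays compatible; its `model`, `start_time`, and `optional_params` are left as they were.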
Cursor Agent
04133ba07d
Fix Vertex batch output logging mutation
2026-05-02 07:11:55 +00:00
Cursor Agent
12ecac6f46
test vertex file content logging forwarding
Co-authored-by: Sameer Kankute <Sameerlite@users.noreply.github.com>
2026-05-02 07:00:55 +00:00
Cursor Agent
10261e4f90
Forward Vertex file content logging context
2026-05-02 06:57:55 +00:00
Cursor Agent
9d51706502
Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_vertex-batch-output-transformation
Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
2026-05-02 06:20:43 +00:00
Cursor Agent
9275eea131
Address bugbot: drop dead encode/decode helpers; preserve empty custom_id
- Remove unused _encode_gcp_label_value / _decode_gcp_label_value singular
  helpers; only the _chunks variants are actually called.
- Use 'is not None' check for custom_id so empty-string custom_ids are
  still labeled and round-trip through batch outputs.

Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
2026-05-02 06:13:44 +00:00
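The second bullet of commit 9275eea131 turns on the difference between a truthiness check and an explicit `is not None` check. A minimal sketch, with a hypothetical `labels_for` helper and label key:

```python
from typing import Optional


def labels_for(custom_id: Optional[str]) -> dict:
    """Attach a label whenever a custom_id was supplied, even an empty one."""
    labels = {}
    # `if custom_id:` would skip "", because an empty string is falsy.
    if custom_id is not None:
        labels["custom_id"] = custom_id
    return labels
```

With the truthiness form, an empty-string `custom_id` would be dropped and could not round-trip through the batch output.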
Mateo Wang
cfa058c3e9
Merge pull request #26530 from BerriAI/litellm_oss_staging_04_25_2026
chore(staging): roll oss_staging_04_25_2026 into internal staging (output_config fix + 4 upstream sync fixes)
2026-05-01 23:10:58 -07:00
Sameer Kankute
2ba009a435
Merge pull request #26878 from BerriAI/litellm_presidio-responses-stream-passthrough
fix(guardrails): preserve responses event streams in presidio output masking
2026-05-02 11:39:11 +05:30
Cursor Agent
1ce92da9e8
Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_oss_staging_04_25_2026
Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
2026-05-02 06:04:39 +00:00
Sameer Kankute
891445783c
Fix code qa
2026-05-02 11:31:31 +05:30
Cursor Agent
b07908133b
fix(cloudflare): support response_text in streaming chunk parser
Newer Cloudflare Workers AI models (e.g. Nemotron) emit 'response_text'
instead of 'response' on streamed chunks. The non-streaming path was
already updated to fall back to 'response_text' (#26385), but the
streaming chunk parser still only read 'response', which caused
streaming requests against those models to silently produce empty
content.

Mirror the non-streaming fallback in CloudflareChatResponseIterator.chunk_parser
and add a streaming test for the response_text shape.

Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
2026-05-02 05:59:15 +00:00
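The fallback that commit b07908133b describes for streamed chunks can be sketched in a few lines. This is an assumption about the shape of the logic, not the actual `CloudflareChatResponseIterator.chunk_parser` code; `extract_chunk_text` is a hypothetical helper.

```python
def extract_chunk_text(chunk: dict) -> str:
    """Read 'response', falling back to 'response_text' when it is absent."""
    text = chunk.get("response")
    if text is None:
        # Newer models are said to emit 'response_text' instead.
        text = chunk.get("response_text")
    return text or ""
```

Reading only `response` would return an empty string for the newer chunk shape, which matches the silent empty-content symptom in the commit message.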
Sameer Kankute
a5a8f39845
Merge pull request #27036 from BerriAI/litellm_internal_staging
merge main
2026-05-02 11:26:44 +05:30
Sameer Kankute
f576eb3228
Merge pull request #26960 from BerriAI/litellm_org_mcp_permissions
feat(mcp): enforce org-level MCP server and toolset permissions
2026-05-02 11:25:55 +05:30
Cursor Agent
a30bcc9a41
Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_hotfix_gpt-5.5-minimal-flag
# Conflicts:
#	tests/test_litellm/llms/vertex_ai/test_vertex_ai_common_utils.py

Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
2026-05-02 05:55:51 +00:00
Mateo Wang
d493606ad6
Merge pull request #25764 from BerriAI/litellm_gemini_provider_default_thinking
fix(gemini): follow provider defaults for Gemini 3 thinking
2026-05-01 22:52:21 -07:00
Sameer Kankute
d2015f0baf
Merge pull request #26161 from BerriAI/litellm_access-group-routing-fix
fix(router): constrain same-name deployment routing by access groups
2026-05-02 11:19:51 +05:30
mateo-berri
a1b4330c18
fix: linting error
2026-05-01 21:05:50 -07:00
mateo-berri
b15136afa7
Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_suppress-spend-log-tracebacks-2208
2026-05-01 20:54:52 -07:00
mateo-berri
ea0d92a3d8
fix: remove traceback key instead of it being ""
2026-05-01 20:49:49 -07:00
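The change in commit ea0d92a3d8, omitting the key rather than emitting an empty string, can be pictured with the sketch below. The payload shape and the `build_error_payload` helper are assumptions made for illustration; the commit message only states that the `traceback` key is removed instead of being set to "".

```python
import traceback


def build_error_payload(exc: BaseException, include_traceback: bool) -> dict:
    """Build an error log payload; leave out 'traceback' when suppressed."""
    payload = {"error": str(exc)}
    if include_traceback:
        payload["traceback"] = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
    # When suppressed, the key is absent rather than present with "".
    return payload
```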
yuneng-jiang
c3f7158b2b
Merge pull request #27008 from stuxf/fix/jwt-audience-and-issuer-verification
fix(auth): support JWT issuer verification + warn when unscoped
2026-05-01 19:58:52 -07:00
yuneng-jiang
5d73c31b27
Merge pull request #27031 from BerriAI/litellm_/inspiring-euclid-41919d
[Test] Anthropic: Replace Legacy Claude-4-Sonnet Alias With Haiku 4.5
2026-05-01 19:42:26 -07:00
shin-berri
38ddcdabdb
Merge pull request #27032 from BerriAI/litellm_yj_may1_2
[Infra] Merge dev branch
2026-05-01 19:39:42 -07:00
Yuneng Jiang
abfaab5dc3
[Test] Anthropic Passthrough: Bump Thinking Tests Off Legacy Sonnet 4 Alias
base_anthropic_messages_test.test_anthropic_messages_with_thinking and
test_anthropic_streaming_with_thinking were still pinned to
claude-4-sonnet-20250514, the same legacy alias Anthropic no longer
recognizes under freshly issued keys. The other four tests in this base
class already use claude-sonnet-4-5-20250929; these two were missed.

Bump to claude-haiku-4-5-20251001 (supports_reasoning=true, no upcoming
deprecation). Subclasses including TestAnthropicPassthroughBasic
inherit these methods.
2026-05-01 19:33:08 -07:00
shin-berri
3372b151d0
Merge pull request #26966 from BerriAI/litellm_fix_create_release_prerelease_detection
[Fix] Release Workflow: Detect SemVer-Style Pre-Release Dev Tags
2026-05-01 19:25:19 -07:00
Yuneng Jiang
1e63be7a72
[Test] Anthropic Passthrough: Bump Streaming Cost-Injection Test To Haiku 4.5
test_anthropic_messages_streaming_cost_injection hits the proxy's
/v1/messages route, which routes via the anthropic/* wildcard to
api.anthropic.com. The 404 surfaced in the test was Anthropic's own
not_found_error propagated back through the proxy (visible from the
x-litellm-model-id hash on the response — the proxy did route).

Same root cause as the prior commit: the legacy claude-4-sonnet-20250514
alias is no longer recognized by Anthropic's main API under the new key.
Swap to claude-haiku-4-5-20251001 — same routing path, canonical model.
2026-05-01 19:22:39 -07:00
Yuneng Jiang
95ccfee7ca
[Chore] Proxy/UI: Drop stray _experimental/out/chat/index.html
This file is a regenerable UI build artifact that should not be tracked
in source. Removing so the merge into litellm_internal_staging stays clean.
2026-05-01 19:22:30 -07:00
Yuneng Jiang
e3917c9d08
[Test] Anthropic: Replace Legacy Claude-4-Sonnet Alias With Haiku 4.5
Three live-API tests pinned to claude-4-sonnet-20250514, which is a
non-canonical alias of claude-sonnet-4-20250514. Anthropic's main API
no longer resolves the legacy form under freshly issued keys, so the
tests fail with not_found_error. The token counter test pinned to
claude-sonnet-4-20250514 itself (deprecation_date 2026-05-14, two weeks
out) was on borrowed time too.

Bump all four to claude-haiku-4-5-20251001 — capability superset for what
these tests exercise (streaming, parallel tool calling, extended thinking,
token counting), no upcoming deprecation, cheaper per-token.
2026-05-01 19:10:27 -07:00
yuneng-jiang
0ff9d65f8d
Merge pull request #26944 from stuxf/fix/sso-state-cookie-binding
chore(sso): bind generic SSO state to a session cookie
2026-05-01 18:52:02 -07:00
yuneng-jiang
5614469f22
Merge pull request #26825 from stuxf/fix/oauth2-proxy-header-forgery
chore(auth): require trusted proxy for header identity auth
2026-05-01 18:47:58 -07:00
ryan-crabbe-berri
85d426c6b5
Merge pull request #26275 from BerriAI/litellm_fix-ag-not-resolved
2026-05-01 18:37:38 -07:00
Yuneng Jiang
92d3bdbb27
[Fix] Proxy/Key Management: Align Key-Org Membership Checks On Generate And Regenerate
Mirrors the membership rule on /key/update so that /key/generate and
/key/{key}/regenerate apply the same `_validate_caller_can_assign_key_org`
gate when the caller specifies an `organization_id`. Proxy admins bypass.
The check no-ops when `organization_id` is not being set.
2026-05-01 18:19:24 -07:00
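The three rules stated in commit 92d3bdbb27 (no-op when `organization_id` is not being set, bypass for proxy admins, membership required otherwise) can be sketched as a single gate. The function below is a hypothetical simplification: its signature, the `"proxy_admin"` role string, and the use of `PermissionError` are assumptions, not the real `_validate_caller_can_assign_key_org`.

```python
from typing import Iterable, Optional


def validate_caller_can_assign_key_org(
    caller_role: str,
    caller_org_ids: Iterable[str],
    organization_id: Optional[str],
) -> None:
    """Raise PermissionError if the caller may not assign the key to the org."""
    if organization_id is None:
        return  # nothing is being assigned, so there is nothing to check
    if caller_role == "proxy_admin":
        return  # admins bypass the membership rule
    if organization_id not in set(caller_org_ids):
        raise PermissionError(
            f"caller is not a member of organization {organization_id!r}"
        )
```

Calling the same gate from the generate, regenerate, and update paths keeps the rule identical across all three.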
yuneng-jiang
e78d87ee00
Merge pull request #27011 from stuxf/fix/project-update-cross-team-hijack
fix(proxy): close project hijacking and key org IDOR (VERIA-55)
2026-05-01 18:01:06 -07:00
Yassin Kortam
9ed2fc24bf
Merge pull request #27018 from BerriAI/litellm_otelDualHandlerSpans
[Fix] Isolate dual OTEL handlers
2026-05-01 17:59:15 -07:00