Commit Graph

36703 Commits

Author SHA1 Message Date
stuxf
a6c30b30bf
build: migrate packaging, CI, and Docker from Poetry to uv (#25007)
* build: migrate packaging metadata to uv

* ci: move automation and local tooling to uv

* docker: migrate image builds and runtime setup to uv

* docs: update install and deployment guidance for uv

* chore: align auxiliary scripts and tests with uv

* test: harden test_litellm isolation

* fix: keep release and health check images self-contained

* build: pin uv tooling and health check deps

* test: isolate bedrock image request formatting from suite state

* test: cover sandbox executor requirements flow

* ci: fix circleci no-op command steps

* ci: fix circleci publish workflow parsing

* fix: stabilize remaining uv migration CI checks

* ci: increase matrix test timeout headroom

* fix: restore published docker and license coverage

* fix: restore proxy runtime build parity

* fix: restore proxy extras parity and venv migrations

* ci: persist uv path across circleci steps

* fix: keep psycopg binary in default test env

* docker: preserve prisma cache across stages

* test: run local proxy checks through uv python

* build: restore runtime deps moved into ci

* build: refresh uv lock after upstream merge

* fix: restore module import in test_check_migration after merge

The conflict resolution imported only the function but the test body
references check_migration as a module throughout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching

- Move google-generativeai, Pillow, tenacity back to ci group (they are
  lazily imported and bloat the base SDK install needlessly)
- Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant
  in Docker where system Node.js is already installed via apk)
- Remove all nodejs-wheel node replacement and venv npm patching blocks
  from Dockerfiles since the wheel is no longer installed
- Add --no-default-groups to CodSpeed benchmark workflow so the benchmark
  environment matches the old minimal pip install footprint
- Apply standard uv two-phase Docker pattern: copy metadata first, install
  deps (cached layer), then copy source and install project
- Replace CircleCI enterprise no-op with proper uv sync command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate uv.lock after removing nodejs-wheel-binaries

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): use cache/restore instead of cache to prevent cache poisoning

The old workflow used actions/cache/restore (read-only). The uv migration
changed it to actions/cache (read-write), which zizmor flags as a cache
poisoning risk. Restore the safer read-only variant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert

The setup-uv action enables caching by default, which zizmor flags as a
cache poisoning risk. Disable it since we already use a read-only
cache/restore step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): disable setup-uv cache in publish workflow

Silences zizmor cache-poisoning alert. Publishing workflow runs
infrequently on protected branches so caching adds no real benefit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(test): remove duplicate verbose_logger mock in test_check_migration

The logger was patched twice — first via mocker.patch() then via
mocker.patch.object(autospec=True). The second call fails because
autospec cannot inspect an already-mocked attribute. Remove the
redundant first patch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): free disk space before Docker build in test-server-root-path

The Dockerfile.non_root build ran out of disk on the CI runner. Remove
Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:46:23 -07:00
michelligabriele
cd9c511df6
feat(proxy): add credential overrides per team/project via model_config metadata (#24438) 2026-04-09 07:22:27 -07:00
Sameer Kankute
97f722f558
feat(cost): add baseten model api pricing entries (#25358)
Add Baseten Model API pricing entries for Nemotron, GLM, Kimi, GPT OSS, and DeepSeek models with validated model slugs. Include a focused regression test to assert provider and per-token pricing values.

Made-with: Cursor
2026-04-08 21:39:58 -07:00
Krrish Dholakia
f42ffed2bd
Litellm oss staging 04 02 2026 p1 (#25055)
* fix(vertex_ai): support pluggable (executable) credential_source for WIF auth (#24700)

The WIF credential dispatch in load_auth() only handled identity_pool and
aws credential types. When credential_source.executable was present (used
for Azure Managed Identity via Workload Identity Federation), it fell
through to identity_pool.Credentials which rejected it with MalformedError.

Add dispatch to google.auth.pluggable.Credentials for executable-type
credential sources, following the same pattern as the existing identity_pool
and aws helpers.

Fixes authentication for Azure Container Apps → GCP Vertex AI via WIF
with executable credential sources.

* feat(logging): add component and logger fields to JSON logs for 3rd p… (#24447)

* feat(logging): add component and logger fields to JSON logs for 3rd party filtering

* Let user-supplied extra fields win over auto-generated component/logger, tighten test assertions

* Feat - Add organization into the metrics metadata for org_id & org_alias (#24440)

* Add org_id and org_alias label names to Prometheus metric definitions

* Add user_api_key_org_alias to StandardLoggingUserAPIKeyMetadata

* Populate user_api_key_org_alias in pre-call metadata

* Pass org_id and org_alias into per-request Prometheus metric labels

* Add test for org labels on per-request Prometheus metrics

* chore: resolve test mockdata

* Address review: populate org_alias from DB view, add feature flag, use .get() for org metadata

* Add org labels to failure path and verify flag behavior in test

* Fix test: build flag-off enum_values without org fields

* Gate org labels behind feature flag in get_labels() instead of static metric lists

* Scope org label injection to metrics that carry team context, remove orphaned budget label defs, add test teardown

* Use explicit metric allowlist for org label injection instead of team heuristic

* Fix duplicate org label guard, move _org_label_metrics to class constant

* Reset custom_prometheus_metadata_labels after duplicate label assertion

* fix: emit org labels by default, remove flag, fix missing org_alias in all metadata paths

* fix: emit org labels by default, no opt-in flag required

* fix: write org_alias to metadata unconditionally in proxy_server.py

* fix: 429s from batch creation being converted to 500 (#24703)

* add us gov models (#24660)

* add us gov models

* added max tokens

* Litellm dev 04 02 2026 p1 (#25052)

* fix: replace hardcoded url

* fix: Anthropic web search cost not tracked for Chat Completions

The ModelResponse branch in response_object_includes_web_search_call()
only checked url_citation annotations and prompt_tokens_details, missing
Anthropic's server_tool_use.web_search_requests field. This caused
_handle_web_search_cost() to never fire for Anthropic Claude models.

Also routes vertex_ai/claude-* models to the Anthropic cost calculator
instead of the Gemini one, since Claude on Vertex uses the same
server_tool_use billing structure as the direct Anthropic API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix(anthropic): pass logging_obj to client.post for litellm_overhead_time_ms (#24071)

When LITELLM_DETAILED_TIMING=true, litellm_overhead_time_ms was null for
Anthropic because the handler did not pass logging_obj to client.post(),
so track_llm_api_timing could not set llm_api_duration_ms. Pass
logging_obj=logging_obj at all four post() call sites (make_call,
make_sync_call, acompletion, completion). Add test to ensure make_call
passes logging_obj to client.post.

Made-with: Cursor

* sap - add additional parameters for grounding

- additional parameter for grounding added for the sap provider

* sap - fix models

* (sap) add filtering, masking, translation SAP GEN AI Hub modules

* (sap) add tests and docs for new SAP modules

* (sap) add support of multiple modules config

* (sap) code refactoring

* (sap) rename file

* test(): add safeguard tests

* (sap) update tests

* (sap) update docs, solve merge conflict in transformation.py

* (sap) linter fix

* (sap) Align embedding request transformation with current API

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) mock commit

* (sap) run black formatter

* (sap) add literals to models, add negative tests, fix test for tool transformation

* (sap) fix formatting

* (sap) fix models

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) commit for rerun bot review

* (sap) minor improve

* (sap) fix after bot review

* (sap) lint fix

* docs(sap): update documentation

* fix(sap): change creds priority

* fix(sap): change creds priority

* fix(sap): fix sap creds unit test

* fix(sap): linter fix

* fix(sap): linter fix

* linter fix

* (sap) update logic of fetching creds, add additional tests

* (sap) clean up code

* (sap) fix after review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) add a possibility to put the service key by both variants

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) update test

* (sap) update service key resolve function

* (sap) run black formatter

* (sap) fix validate credentials, add negative tests for credential fetching

* (sap) fix validate credentials, add negative tests for credential fetching

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) lint fix

* (sap) lint fix

* feat: support service_tier in gemini

* chore: add a service_tier field mapping from openai to gemini

* fix: use x-gemini-service-tier header in response

* docs: add service_tier to gemini docs

* chore: add default/standard mapping, and some tests

* chore: tidying up some case insensitivity

* chore: remove unnecessary guard

* fix: remove redundant test file

* fix: handle 'auto' case-insensitively

* fix: return service_tier on final streamed chunk

* chore: black

* feat: enable supports_service_tier to gemini models

* Fix get_standard_logging_metadata tests

* Fix test_get_model_info_bedrock_models

* Fix test_get_model_info_bedrock_models

* Fix remaining tests

* Fix mypy issues

* Fix tests

* Fix merge conflicts

* Fix code qa

* Fix code qa

* Fix code qa

* Fix greptile review

---------

Co-authored-by: michelligabriele <gabriele.michelli@icloud.com>
Co-authored-by: Josh <36064836+J-Byron@users.noreply.github.com>
Co-authored-by: mubashir1osmani <mubashir.osmani777@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: milan-berri <milan@berri.ai>
Co-authored-by: Alperen Kömürcü <alperen.koemuercue@sap.com>
Co-authored-by: Vasilisa Parshikova <vasilisa.parshikova@sap.com>
Co-authored-by: Lin Xu <lin.xu03@sap.com>
Co-authored-by: Mark McDonald <macd@google.com>
Co-authored-by: Sameer Kankute <sameer@berri.ai>
2026-04-08 21:37:10 -07:00
Sameer Kankute
6a0e0ce061
fix(router): pass custom_llm_provider to get_llm_provider for unprefixed model names (#25334)
Fixes 'LLM Provider NOT provided' errors when models are configured with
custom_llm_provider but model names lack provider prefix (e.g., 'gpt-4.1-mini'
instead of 'azure/gpt-4.1-mini').

Changes:
- Router now passes deployment's custom_llm_provider to get_llm_provider()
- Fixes 6 code paths: file creation, file content, batch operations, vector store
- Adds regression tests for file creation and file content operations

Made-with: Cursor
2026-04-08 21:27:13 -07:00
Sameer Kankute
6e6f5be3e4
feat(triton): add embedding usage estimation for self-hosted responses (#25345)
* feat(triton): add embedding usage estimation for self-hosted responses

Populate Triton embedding usage from request input using token counting with a safe fallback so cost/observability flows work even when provider usage is missing.

Made-with: Cursor

* fix(triton): sum per-input embedding token counts for batches

Joining batch strings with newlines before token_counter added spurious
tokens. Count each input separately and sum, matching OpenAI-style usage.
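
A minimal sketch of the difference described above, assuming a toy tokenizer. `toy_token_counter` is a hypothetical stand-in (it is not LiteLLM's `token_counter`), and the function names are illustrative only. It shows why counting a newline-joined batch can overcount compared with summing per-input counts.

```python
from typing import Callable, List


def toy_token_counter(text: str) -> int:
    """Hypothetical tokenizer: one token per whitespace-separated word,
    plus one token for every newline character."""
    return len(text.split()) + text.count("\n")


def count_joined(inputs: List[str], counter: Callable[[str], int]) -> int:
    """Old approach: join the batch with newlines, then count once.
    The separators themselves may be counted as tokens."""
    return counter("\n".join(inputs))


def count_per_input(inputs: List[str], counter: Callable[[str], int]) -> int:
    """New approach: count each input on its own and sum the results."""
    return sum(counter(text) for text in inputs)


if __name__ == "__main__":
    batch = ["hello world", "foo bar baz"]
    print(count_joined(batch, toy_token_counter))
    print(count_per_input(batch, toy_token_counter))
```

With a real tokenizer the size of the discrepancy depends on how newlines are encoded, so the toy numbers only illustrate the direction of the error.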

Made-with: Cursor
2026-04-08 21:14:27 -07:00
Sameer Kankute
3a4ed48f54
fix(router): don't create litellm_metadata for non-Responses API calls in encrypted_content_affinity_check (#25347)
Using setdefault('litellm_metadata', {}) unconditionally created an empty
litellm_metadata key for chat completions and embeddings. This caused
_get_metadata_variable_name_from_kwargs to return 'litellm_metadata' instead
of 'metadata', so tag-based routing looked for tags in the wrong dict and
ignored all tag filters.

Fix: only set the encrypted_content_affinity_enabled flag when litellm_metadata
already exists (Responses API path). Chat completions and embeddings never have
this key, so nothing is created and tag routing works correctly.
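
The failure mode above can be sketched as follows. This is a simplified illustration, not the router's code: `metadata_key` is a hypothetical stand-in for `_get_metadata_variable_name_from_kwargs`, assumed here to pick `litellm_metadata` whenever that key is present, and the two `mark_affinity_*` helpers are invented names.

```python
from typing import Any, Dict


def metadata_key(kwargs: Dict[str, Any]) -> str:
    """Hypothetical stand-in: prefer 'litellm_metadata' when the key exists."""
    return "litellm_metadata" if "litellm_metadata" in kwargs else "metadata"


def mark_affinity_buggy(kwargs: Dict[str, Any]) -> None:
    """Unconditional setdefault: creates the key on every call type."""
    kwargs.setdefault("litellm_metadata", {})[
        "encrypted_content_affinity_enabled"
    ] = True


def mark_affinity_fixed(kwargs: Dict[str, Any]) -> None:
    """Only set the flag when the dict already exists (Responses API path)."""
    existing = kwargs.get("litellm_metadata")
    if isinstance(existing, dict):
        existing["encrypted_content_affinity_enabled"] = True


if __name__ == "__main__":
    chat_call = {"metadata": {"tags": ["prod"]}}
    mark_affinity_buggy(chat_call)
    print(metadata_key(chat_call))  # tags are now looked up in the wrong dict

    chat_call = {"metadata": {"tags": ["prod"]}}
    mark_affinity_fixed(chat_call)
    print(metadata_key(chat_call))
```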
2026-04-08 21:11:19 -07:00
Kedar Thakkar
233870d7b2
Add Ramp as a built-in generic API callback with docs (#23769) 2026-04-08 20:06:48 -07:00
yuneng-jiang
072d4108c3
Merge pull request #25365 from BerriAI/litellm_e2e_ui_tests
[Feature] UI E2E Tests: Proxy Admin Team and Key Management
2026-04-08 15:29:10 -07:00
shin-berri
d871bce86e
Merge pull request #25354 from BerriAI/litellm_migrate_redis_tests_to_circleci
[Infra] Migrate Redis caching tests from GHA to CircleCI
2026-04-08 15:14:55 -07:00
Yuneng Jiang
467dbc4a3c
[Fix] Remove old broken key tests superseded by proxy-admin/keys.spec.ts 2026-04-08 13:32:37 -07:00
Yuneng Jiang
4ee7d42981
[Fix] Restructure HTML files after UI build so extensionless routes work in CI 2026-04-08 13:24:52 -07:00
Yuneng Jiang
ac9ebdf4d8
[Fix] Rename CI job to e2e_ui_testing and remove duplicate old job definition 2026-04-08 13:17:45 -07:00
Yuneng Jiang
a8f4f464ce
[Fix] Add missing test fixtures and address review feedback
- Add constants.ts with all required exports (key aliases, team IDs)
- Add fixtures/users.ts with all role definitions and storage paths
- Add fixtures/seed.sql for deterministic test database seeding
- Remove Firefox project from playwright config (only Chromium installed)
- Remove unused variable in teams.spec.ts
- Rename CircleCI job to e2e_ui_testing
2026-04-08 12:40:41 -07:00
Yuneng Jiang
d09d98a70a
[Feature] E2E UI tests: proxy-admin team and key management with CI integration
Add Playwright E2E tests covering proxy admin team and key management
workflows, with a self-contained test runner and CircleCI integration.

Tests cover: create team, invite user, edit/delete team members, create
key in team, regenerate key, update TPM/RPM limits, delete key, and
verify internal user keys are visible.

Infrastructure: run_e2e.sh builds the UI from source before starting
the proxy, ensuring tests always run against the latest UI changes.
Added data-testid attributes to key UI components for reliable selectors.
2026-04-08 11:51:15 -07:00
Yuneng Jiang
7ba0c69a07
[Fix] Install pytest-rerunfailures in redis caching CircleCI job 2026-04-08 11:50:00 -07:00
yuneng-jiang
2dac54b732
Merge pull request #25343 from BerriAI/litellm_fix-mcp-stdio-rce3
fix(mcp): block arbitrary command execution via stdio transport
2026-04-08 11:12:39 -07:00
Yuneng Jiang
0104b60d8e
[Infra] Add redis_caching_coverage to coverage combine command 2026-04-08 10:48:41 -07:00
Yuneng Jiang
3a02c0ac6b
[Infra] Migrate Redis caching tests from GHA to CircleCI
Redis caching unit tests (test_dual_cache, test_redis_batch_optimizations,
test_router_utils) required Redis secrets that should live in CircleCI.

- Add redis_caching_unit_tests job to CircleCI config
- Delete test-unit-caching-redis.yml GHA workflow
- Remove all Redis plumbing (inputs, secrets, env vars) from
  _test-unit-services-base.yml and its callers
2026-04-08 09:07:12 -07:00
Sameer Kankute
65829f79d7
docs: document LITELLM_MCP_STDIO_EXTRA_COMMANDS in env reference
Required by tests/documentation_tests/test_env_keys.py for os.getenv usage in constants.

Made-with: Cursor
2026-04-08 21:31:51 +05:30
Sameer Kankute
69be5be88b
fix(mcp): move inline imports to module level and enforce stdio allowlist
- Move os and MCP_STDIO_ALLOWED_COMMANDS imports to module level in mcp_server_manager.py
- Move MCP_STDIO_ALLOWED_COMMANDS import to module level in _types.py
- Change defense-in-depth warning to HTTPException 403 for legacy non-allowlisted commands
- Ensures arbitrary command execution is blocked for both new and legacy MCP servers

Addresses Greptile review comments:
- P2: Inline imports violate CLAUDE.md style guide
- P1 security: Defense-in-depth should block, not warn, for legacy commands

Made-with: Cursor
2026-04-08 21:28:43 +05:30
Sameer Kankute
ad31e79b97
fix(mcp): address Greptile review feedback
- Defense-in-depth: warn instead of hard-fail for legacy servers
- Move os import to module level in _types.py
- Document args residual risk in allowlist comment
- Add UpdateMCPServerRequest allowlist test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 19:42:41 +05:30
Sameer Kankute
7b7f304675
fix(mcp): block arbitrary command execution via stdio transport
Add command allowlist for MCP stdio transport to prevent RCE via
/mcp-rest/test/* endpoints. Restrict test endpoints to PROXY_ADMIN
role. Fix docker/README.md MASTER_KEY -> LITELLM_MASTER_KEY.
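
A rough sketch of what a command allowlist for a stdio transport might look like. Everything here is an assumption for illustration: the default command set, the comma-separated format for extra commands, matching on the executable's basename, and the use of `PermissionError` are not taken from the actual change.

```python
import os
from typing import FrozenSet

# Hypothetical default allowlist; the real set of commands is not shown in this log.
DEFAULT_ALLOWED: FrozenSet[str] = frozenset({"npx", "uvx", "python", "node"})


def allowed_commands(extra: str = "") -> FrozenSet[str]:
    """Combine the defaults with a comma-separated list of extra commands
    (the comma-separated format is an assumption of this sketch)."""
    extras = {name.strip() for name in extra.split(",") if name.strip()}
    return DEFAULT_ALLOWED | extras


def check_stdio_command(command: str, allowed: FrozenSet[str]) -> None:
    """Raise PermissionError unless the executable's basename is allowlisted.
    Arguments are not inspected, so an allowlisted binary can still be
    misused through its args."""
    name = os.path.basename(command)
    if name not in allowed:
        raise PermissionError(f"stdio command not allowed: {name!r}")


if __name__ == "__main__":
    check_stdio_command("npx", allowed_commands())
    try:
        check_stdio_command("/bin/bash", allowed_commands())
    except PermissionError as exc:
        print(exc)
```

Matching on the basename keeps the sketch short; a stricter variant could require an exact match against full paths.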

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 19:42:37 +05:30
shin-berri
62757ff48f
Merge pull request #25316 from BerriAI/litellm_yj_apr7
[Infra] Bump version 1.83.4 → 1.83.5
2026-04-07 18:53:11 -07:00
Yuneng Jiang
bd327dbe54
bump: version 1.83.4 → 1.83.5 2026-04-07 18:37:29 -07:00
yuneng-jiang
5f49f29f4e
Merge pull request #25048 from joereyna/fix/dockerfile-node-gyp-path
Fix node-gyp symlink path after npm upgrade in Dockerfile
2026-04-07 17:14:04 -07:00
joereyna
41407d0287
Fix node-gyp symlink path after npm upgrade in Dockerfile 2026-04-07 17:01:55 -07:00
yuneng-jiang
08f34aa3cc
Merge pull request #25313 from BerriAI/litellm_align_v2_key_info_with_v1
[Refactor] Align /v2/key/info response handling with v1
2026-04-07 15:54:12 -07:00
yuneng-jiang
096893ea97
Merge pull request #25273 from BerriAI/litellm_pin_cosign_pub_to_commit
[Infra] Pin cosign.pub verification to initial commit hash
2026-04-07 15:40:46 -07:00
Yuneng Jiang
021429b797
[Refactor] Align /v2/key/info response handling with v1
The /v2/key/info endpoint was missing response filtering that
the v1 /key/info endpoint already had. This aligns the two
endpoints so v2 applies the same per-key permission checks and
strips internal fields from the response. Also fixes the
key_aliases query path to resolve aliases before querying.
2026-04-07 15:21:42 -07:00
milan-berri
bf8b615b64
fix(auth): support selective jwt override oauth2 routing (#25252)
Allow JWT tokens matching routing_overrides to use OAuth2 introspection without enabling global OAuth2 while keeping OAuth2 routing limited to LLM/info routes. Add regression coverage for management-route boundary and tighten opaque-token assertions; update docs to reflect selective-mode route scope.

Made-with: Cursor
2026-04-07 13:52:47 -07:00
yuneng-jiang
f3bc20056d
Merge pull request #25307 from BerriAI/litellm_/fix_npmrc_dockerfile
[Fix] Dockerfile.non_root: handle missing .npmrc gracefully
2026-04-07 13:01:30 -07:00
Yuneng Jiang
537727f0da
[Fix] Dockerfile.non_root: handle missing .npmrc gracefully
The .npmrc file (ignore-scripts=true, min-release-age=3d) is temporarily
removed during the Docker build since lifecycle scripts are needed by
npm ci. However, the unconditional `mv` fails when the build context
doesn't include .npmrc (e.g. when LiteLLM is vendored in a subdirectory).

Make all .npmrc mv operations conditional. This is safe because npm ci
already installs from package-lock.json with pinned versions and
integrity hashes.
2026-04-07 12:44:04 -07:00
yuneng-jiang
23e1a7d7c2
Merge pull request #25126 from BerriAI/litellm_ui_e2e_psql_pr
[Test] UI - E2E: Add Playwright tests with local PostgreSQL
2026-04-07 11:59:11 -07:00
Yuneng Jiang
184050e2a1
Merge remote main into litellm_ui_e2e_psql_pr 2026-04-07 10:27:12 -07:00
Yuneng Jiang
ce75fde727
Merge remote main into litellm_pin_cosign_pub_to_commit 2026-04-07 10:27:00 -07:00
yuneng-jiang
730ba0f670
Merge pull request #25299 from BerriAI/litellm_fix_check_responses_cost_tests
[Fix] Update check_responses_cost tests for _expire_stale_rows
2026-04-07 10:22:58 -07:00
Yuneng Jiang
48a68230c8
fix(test): update check_responses_cost tests for _expire_stale_rows
PR #25258 changed _cleanup_stale_managed_objects from update_many to
execute_raw via _expire_stale_rows, but the tests were not updated.
The tests now mock _expire_stale_rows on the instance and assert
update_many calls only for job completion, not stale cleanup.
2026-04-07 10:09:11 -07:00
Yuneng Jiang
8c16bc0346
Merge remote-tracking branch 'origin/main' into litellm_ui_e2e_psql_pr 2026-04-07 09:12:33 -07:00
Shivam Rawat
2bb7387a83
Litellm aws gov cloud mode support (#25254)
* add us gov models

* added max tokens

* greptile fix

---------

Co-authored-by: mubashir1osmani <mubashir.osmani777@gmail.com>
2026-04-07 08:49:17 -07:00
Yuneng Jiang
54f4be6ab6
Merge remote-tracking branch 'origin/main' into litellm_ui_e2e_psql_pr 2026-04-06 23:18:35 -07:00
Yuneng Jiang
965879e74f
fix: address Greptile review comments
- team-admin: assert Admin Settings is not visible (role-specific check)
- proxy-admin: use users[Role.ProxyAdmin].password from constants instead of duplicating the env var fallback inline
2026-04-06 23:12:40 -07:00
Yuneng Jiang
30565581be
[Infra] Pin cosign.pub verification to initial commit hash
Pin all cosign public key references to the immutable commit hash
(0112e53) that first introduced the key, instead of fetching it from
the release tag. This addresses the concern that an attacker with push
access could replace the key on main/tags and re-sign tampered images.

Docs now show two verification methods: commit hash (recommended) and
release tag (convenience), with explanation of why the hash is stronger.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:53:23 -07:00
ishaan-berri
03d9746815
bump litellm version to 1.83.4 (#25266)
* bump litellm version to 1.83.4

* regenerate poetry.lock
2026-04-06 21:30:20 -07:00
ishaan-berri
4bcd4bef44
bump litellm-enterprise to 0.1.37 (#25265)
* bump litellm-enterprise to 0.1.37

* update poetry.lock for enterprise 0.1.37 bump
2026-04-06 21:23:25 -07:00
ishaan-berri
7a9a9f0c79
fix: batch-limit stale managed object cleanup to prevent 300K row UPD… (#25258)
* fix: batch-limit stale managed object cleanup to prevent 300K row UPDATE (#25257)

* Add STALE_OBJECT_CLEANUP_BATCH_SIZE constant

Configurable batch limit (default 1000) for stale managed object cleanup,
preventing unbounded UPDATE queries from hitting 300K+ rows at once.

* Batch-limit stale managed object cleanup with single bounded SQL query

Two fixes to _cleanup_stale_managed_objects:

1. Replace unbounded update_many with a single execute_raw using a
   subquery LIMIT, capping each poll cycle to STALE_OBJECT_CLEANUP_BATCH_SIZE
   rows. Zero rows loaded into Python memory — everything stays in Postgres.
   Uses the same PostgreSQL raw-SQL pattern as spend_log_cleanup.py
   (the proxy requires PostgreSQL per schema.prisma).

2. Extract _expire_stale_rows as a separate method for testability.

Keeps the file_purpose='response' filter to avoid incorrectly expiring
long-running batch or fine-tune jobs that legitimately exceed the
staleness cutoff.
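
The bounded-update pattern above can be sketched as below. This uses Python's built-in `sqlite3` rather than PostgreSQL and Prisma's `execute_raw`, and the table name, column names, and status values are hypothetical; only the shape of the query (an UPDATE restricted by a subquery with LIMIT) mirrors the description.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with hypothetical columns for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE managed_objects ("
        "id INTEGER PRIMARY KEY, file_purpose TEXT, status TEXT, updated_at INTEGER)"
    )
    rows = [("response", "in_progress", 10)] * 5 + [("batch", "in_progress", 10)]
    conn.executemany(
        "INSERT INTO managed_objects (file_purpose, status, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def expire_stale_rows(conn: sqlite3.Connection, cutoff: int, batch_size: int) -> int:
    """Expire at most batch_size stale 'response' rows in one statement.
    No rows are loaded into Python; the subquery picks the ids to update."""
    cur = conn.execute(
        """
        UPDATE managed_objects
        SET status = 'stale_expired'
        WHERE id IN (
            SELECT id FROM managed_objects
            WHERE file_purpose = 'response'
              AND status = 'in_progress'
              AND updated_at < ?
            LIMIT ?
        )
        """,
        (cutoff, batch_size),
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    db = make_demo_db()
    # Each poll cycle touches at most two rows until nothing stale is left.
    while expire_stale_rows(db, cutoff=100, batch_size=2):
        pass
```

Calling the function once per poll cycle spreads a large backlog across cycles instead of issuing one very large UPDATE.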

* docs: add STALE_OBJECT_CLEANUP_BATCH_SIZE to env vars reference

* test: remove deprecated embed-english-v2.0 cohere embedding tests
2026-04-06 19:11:55 -07:00
ryan-crabbe-berri
3ac61a519b
Merge pull request #25239 from BerriAI/litellm_backfill-team-member-permissions
feat: add POST /team/permissions_bulk_update endpoint
2026-04-06 18:15:42 -07:00
Ryan Crabbe
fdd2672e93
feat: add POST /team/permissions_bulk_update endpoint
Adds a new endpoint to bulk-update team_member_permissions across
teams. Supports apply_to_all_teams (with cursor-based pagination)
or a specific list of team_ids. Merges new permissions into each
team's existing set rather than overwriting.
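
The merge-rather-than-overwrite behaviour can be illustrated with a small helper. `merge_permissions` and the permission strings are hypothetical, and preserving the existing order is a choice made for this sketch rather than something the commit states.

```python
from typing import List, Optional


def merge_permissions(existing: Optional[List[str]], new: List[str]) -> List[str]:
    """Return the existing permissions followed by any new ones not already
    present. Existing entries are never dropped."""
    merged = list(existing or [])
    for permission in new:
        if permission not in merged:
            merged.append(permission)
    return merged


if __name__ == "__main__":
    current = ["/key/generate"]
    print(merge_permissions(current, ["/key/delete", "/key/generate"]))
```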

Also fixes test isolation bug in test_get_prompt_info_by_base_id
where leaked prisma_client state from other tests caused a
TypeError on await.
2026-04-06 17:45:35 -07:00
yuneng-jiang
d132b1bf51
[Infra] Remove Redundant Matrix Unit Test Workflow (#25251)
* Remove redundant matrix unit test workflow

All test paths in test-litellm-matrix.yml are fully covered by the
newer semantic unit test workflows (test-unit-*.yml), making the
matrix workflow redundant CI spend.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add Codecov coverage reporting to semantic unit test workflows

Add coverage collection (--cov) and Codecov OIDC upload to both
reusable base workflows and all 12 caller workflows, replacing the
coverage reporting that was previously only in the matrix workflow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Move id-token/pull-requests permissions to job level for multi-job workflows

For workflows with multiple jobs (llm-providers, proxy-db), move
id-token: write and pull-requests: write from workflow level to job
level so permissions are scoped to only the jobs that need them.
Removes zizmor inline suppressions that were masking the issue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:52:38 -07:00
yuneng-jiang
d2d99aa082
[Docs] Enforce Black Formatting in Contributor Docs (#25135)
* [Docs] Enforce Black formatting in contributor docs

Black formatting is now enforced in CI. Update CLAUDE.md, AGENTS.md,
and CONTRIBUTING.md to instruct contributors and AI agents to run
`poetry run black .` before committing, and add VS Code setup guidance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: fixes

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:51:25 -07:00