- Add constants.ts with all required exports (key aliases, team IDs)
- Add fixtures/users.ts with all role definitions and storage paths
- Add fixtures/seed.sql for deterministic test database seeding
- Remove Firefox project from playwright config (only Chromium installed)
- Remove unused variable in teams.spec.ts
- Rename CircleCI job to e2e_ui_testing
Add Playwright E2E tests covering proxy admin team and key management
workflows, with a self-contained test runner and CircleCI integration.
Tests cover: create team, invite user, edit/delete team members, create
key in team, regenerate key, update TPM/RPM limits, delete key, and
verify internal user keys are visible.
Infrastructure: run_e2e.sh builds the UI from source before starting
the proxy, ensuring tests always run against the latest UI changes.
Added data-testid attributes to key UI components for reliable selectors.
- Move os and MCP_STDIO_ALLOWED_COMMANDS imports to module level in mcp_server_manager.py
- Move MCP_STDIO_ALLOWED_COMMANDS import to module level in _types.py
- Change defense-in-depth warning to HTTPException 403 for legacy non-allowlisted commands
- Ensures arbitrary command execution is blocked for both new and legacy MCP servers
Addresses Greptile review comments:
- P2: Inline imports violate CLAUDE.md style guide
- P1 security: Defense-in-depth should block, not warn, for legacy commands
Made-with: Cursor
- Defense-in-depth: warn instead of hard-fail for legacy servers
- Move os import to module level in _types.py
- Document args residual risk in allowlist comment
- Add UpdateMCPServerRequest allowlist test
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add command allowlist for MCP stdio transport to prevent RCE via
/mcp-rest/test/* endpoints. Restrict test endpoints to PROXY_ADMIN
role. Fix docker/README.md MASTER_KEY -> LITELLM_MASTER_KEY.
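
A minimal sketch of the allowlist idea, assuming a hypothetical set of permitted executables and a hypothetical `CommandNotAllowedError` standing in for the HTTP 403 response. The real contents of MCP_STDIO_ALLOWED_COMMANDS are not listed in this log, so the names below are illustrative only.

```python
# Hypothetical allowlist; the real MCP_STDIO_ALLOWED_COMMANDS may differ.
ALLOWED_COMMANDS = frozenset({"npx", "uvx", "python", "python3", "node"})


class CommandNotAllowedError(Exception):
    """Stand-in for the 403 the proxy would return to the caller."""


def validate_stdio_command(command: str) -> str:
    # Require an exact match on the bare command name. Paths such as
    # "/tmp/evil/npx" are rejected because they are not in the set.
    if command not in ALLOWED_COMMANDS:
        raise CommandNotAllowedError(f"command {command!r} is not in the allowlist")
    return command
```

An allowlist on the command alone does not constrain its arguments, so an allowed interpreter can still be pointed at arbitrary code; that is a residual risk to handle separately.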
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The /v2/key/info endpoint was missing response filtering that
the v1 /key/info endpoint already had. This aligns the two
endpoints so v2 applies the same per-key permission checks and
strips internal fields from the response. Also fixes the
key_aliases query path to resolve aliases before querying.
Allow JWT tokens matching routing_overrides to use OAuth2 introspection
without enabling global OAuth2, while keeping OAuth2 routing limited to
LLM/info routes.
Add regression coverage for the management-route boundary and tighten
opaque-token assertions. Update docs to reflect selective-mode route
scope.
Made-with: Cursor
The .npmrc file (ignore-scripts=true, min-release-age=3d) is temporarily
removed during the Docker build since lifecycle scripts are needed by
npm ci. However, the unconditional `mv` fails when the build context
doesn't include .npmrc (e.g. when LiteLLM is vendored in a subdirectory).
Make all .npmrc mv operations conditional. This is safe because npm ci
already installs from package-lock.json with pinned versions and
integrity hashes.
PR #25258 changed _cleanup_stale_managed_objects from update_many to
execute_raw via _expire_stale_rows, but the tests were not updated.
The tests now mock _expire_stale_rows on the instance and assert
update_many calls only for job completion, not stale cleanup.
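
The mocking approach can be sketched roughly as follows. `StaleObjectChecker` and the `managed_objects` table accessor are hypothetical stand-ins, since the owning class is not named in this log; only `_expire_stale_rows`, `_cleanup_stale_managed_objects`, `execute_raw`, and `update_many` come from the message above.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


class StaleObjectChecker:
    """Hypothetical stand-in for the class that owns the cleanup method."""

    def __init__(self, prisma_client):
        self.prisma_client = prisma_client

    async def _expire_stale_rows(self) -> None:
        # The real method issues a bounded raw SQL UPDATE.
        await self.prisma_client.db.execute_raw("UPDATE ...")

    async def _cleanup_stale_managed_objects(self) -> None:
        await self._expire_stale_rows()


def run_cleanup_with_mock():
    prisma = MagicMock()
    prisma.db.execute_raw = AsyncMock()
    prisma.db.managed_objects.update_many = AsyncMock()  # hypothetical table name
    checker = StaleObjectChecker(prisma)
    # Patch on the instance so the raw SQL is never reached.
    checker._expire_stale_rows = AsyncMock()
    asyncio.run(checker._cleanup_stale_managed_objects())
    return checker, prisma
```

A test can then assert that `_expire_stale_rows` was awaited once and that `update_many` was not called for stale cleanup.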
- team-admin: assert Admin Settings is not visible (role-specific check)
- proxy-admin: use users[Role.ProxyAdmin].password from constants instead of duplicating the env var fallback inline
Pin all cosign public key references to the immutable commit hash
(0112e53) that first introduced the key, instead of fetching it from
the release tag. This addresses the concern that an attacker with push
access could replace the key on main/tags and re-sign tampered images.
Docs now show two verification methods: commit hash (recommended) and
release tag (convenience), with explanation of why the hash is stronger.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: batch-limit stale managed object cleanup to prevent 300K row UPDATE (#25257)
* Add STALE_OBJECT_CLEANUP_BATCH_SIZE constant
Configurable batch limit (default 1000) for stale managed object cleanup,
preventing unbounded UPDATE queries from hitting 300K+ rows at once.
* Batch-limit stale managed object cleanup with single bounded SQL query
Two fixes to _cleanup_stale_managed_objects:
1. Replace unbounded update_many with a single execute_raw using a
subquery LIMIT, capping each poll cycle to STALE_OBJECT_CLEANUP_BATCH_SIZE
rows. Zero rows loaded into Python memory — everything stays in Postgres.
Uses the same PostgreSQL raw-SQL pattern as spend_log_cleanup.py
(the proxy requires PostgreSQL per schema.prisma).
2. Extract _expire_stale_rows as a separate method for testability.
Keeps the file_purpose='response' filter to avoid incorrectly expiring
long-running batch or fine-tune jobs that legitimately exceed the
staleness cutoff.
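
A rough sketch of the bounded-update pattern, using an in-memory SQLite database only so the example runs anywhere. The proxy itself uses PostgreSQL via execute_raw; the table name, column names, and status values here are assumptions, not the real schema.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    # Hypothetical table; the real schema lives in schema.prisma.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE managed_objects ("
        "id INTEGER PRIMARY KEY, file_purpose TEXT, status TEXT, created_at INTEGER)"
    )
    rows = [
        ("response", "queued", 1),
        ("response", "queued", 2),
        ("response", "queued", 3),
        ("batch", "queued", 1),      # wrong purpose: must never be expired
        ("response", "queued", 99),  # newer than the cutoff: must stay queued
    ]
    conn.executemany(
        "INSERT INTO managed_objects (file_purpose, status, created_at) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def expire_stale_rows(conn: sqlite3.Connection, cutoff: int, batch_size: int) -> int:
    # The subquery LIMIT caps how many rows one call may touch, and no
    # rows are loaded into Python: the database does all the work.
    cur = conn.execute(
        """
        UPDATE managed_objects
        SET status = 'stale_expired'
        WHERE id IN (
            SELECT id FROM managed_objects
            WHERE file_purpose = 'response'
              AND status = 'queued'
              AND created_at < ?
            ORDER BY created_at
            LIMIT ?
        )
        """,
        (cutoff, batch_size),
    )
    conn.commit()
    return cur.rowcount
```

Calling `expire_stale_rows` repeatedly drains the backlog one batch per call, which mirrors one bounded batch per poll cycle.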
* docs: add STALE_OBJECT_CLEANUP_BATCH_SIZE to env vars reference
* test: remove deprecated embed-english-v2.0 cohere embedding tests
Adds a new endpoint to bulk-update team_member_permissions across
teams. Supports apply_to_all_teams (with cursor-based pagination)
or a specific list of team_ids. Merges new permissions into each
team's existing set rather than overwriting.
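
The merge-instead-of-overwrite behavior can be sketched like this. The helper name and the permission strings are illustrative assumptions; the endpoint's actual implementation is not shown in this log.

```python
from typing import Iterable, List, Optional


def merge_permissions(existing: Optional[Iterable[str]], new: Iterable[str]) -> List[str]:
    # Keep the team's current permissions in their original order,
    # then append only those new permissions not already present.
    merged = list(existing or [])
    seen = set(merged)
    for permission in new:
        if permission not in seen:
            merged.append(permission)
            seen.add(permission)
    return merged
```

Under this approach a bulk update never removes a permission a team already holds, so the operation is additive and safe to repeat.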
Also fixes test isolation bug in test_get_prompt_info_by_base_id
where leaked prisma_client state from other tests caused a
TypeError on await.
* Remove redundant matrix unit test workflow
All test paths in test-litellm-matrix.yml are fully covered by the
newer semantic unit test workflows (test-unit-*.yml), making the
matrix workflow redundant CI spend.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add Codecov coverage reporting to semantic unit test workflows
Add coverage collection (--cov) and Codecov OIDC upload to both
reusable base workflows and all 12 caller workflows, replacing the
coverage reporting that was previously only in the matrix workflow.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Move id-token/pull-requests permissions to job level for multi-job workflows
For workflows with multiple jobs (llm-providers, proxy-db), move
id-token: write and pull-requests: write from workflow level to job
level so permissions are scoped to only the jobs that need them.
Removes zizmor inline suppressions that were masking the issue.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [Docs] Enforce Black formatting in contributor docs
Black formatting is now enforced in CI. Update CLAUDE.md, AGENTS.md,
and CONTRIBUTING.md to instruct contributors and AI agents to run
`poetry run black .` before committing, and add VS Code setup guidance.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: fixes
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The proxy_e2e_azure_batches_tests workflow is consistently flaky and
does not provide reliable signal on whether changes break anything.
Remove the workflow from both CircleCI and GitHub Actions, along with
the test directory it exclusively used.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs(blog): add cosign Docker image verification instructions
Add steps for verifying Docker images with cosign to three security blog posts:
CI/CD v2, Security Townhall, and Security Update.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs(proxy): add cosign verification to Docker/Helm/Terraform deploy page
Add image signature verification steps to the main deployment doc so
users pulling Docker images know how to verify them with cosign.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: fixes
* Update index.md
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* [Docs] Scope cosign signing docs to GHCR and specify starting version
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [Docs] Add starting version callout to ci_cd_v2 blog post
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Krrish Dholakia <krrish+github@berri.ai>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* bump litellm-proxy-extras version to 0.4.65
* bump litellm-proxy-extras==0.4.65 in pyproject.toml
* bump litellm-proxy-extras==0.4.65 in requirements.txt
* added support for metadata (#24261)
* added support for metadata
* fix: PR review - meta truthiness, BlobResourceContents mimeType, add Blob+empty meta tests
Made-with: Cursor
* pyproject to .25
* feat(teams): resolve access group models/MCPs/agents in team endpoints
Add access_group_models, access_group_mcp_server_ids, and
access_group_agent_ids to /team/info and /v2/team/list responses.
These fields contain resources inherited from access groups, kept
separate from direct assignments so the UI can distinguish the source.
Backend: _resolve_access_group_resources() helper resolves access
group resources via existing _get_*_from_access_groups() functions.
UI: Teams table and detail view show direct models as blue badges
and access-group-sourced models as green badges.
* perf(teams): single-pass access group resolution + asyncio.gather in list endpoint
- Fetch each access group object once and extract all 3 resource fields
in a single pass instead of 3 separate calls (3N → N lookups)
- Use asyncio.gather to resolve access groups across teams concurrently
in list_team_v2 instead of sequential awaits
- Add 5 unit tests for _resolve_access_group_resources
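
A small sketch of the two optimizations above, under stated assumptions: `AccessGroup`, `FAKE_GROUPS`, and `fetch_access_group` are hypothetical stand-ins for the proxy's access group object and lookup. Only the three output field names come from this log.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AccessGroup:
    # Hypothetical shape; only the three resource fields matter here.
    models: List[str] = field(default_factory=list)
    mcp_server_ids: List[str] = field(default_factory=list)
    agent_ids: List[str] = field(default_factory=list)


# Stand-in for the cache/DB lookup; not the proxy's real storage.
FAKE_GROUPS: Dict[str, AccessGroup] = {
    "ag-1": AccessGroup(models=["gpt-4o"], mcp_server_ids=["mcp-a"]),
    "ag-2": AccessGroup(models=["claude-sonnet"], agent_ids=["agent-x"]),
}


async def fetch_access_group(group_id: str) -> Optional[AccessGroup]:
    await asyncio.sleep(0)  # yield control, as a real I/O call would
    return FAKE_GROUPS.get(group_id)


async def resolve_access_group_resources(group_ids: List[str]) -> Dict[str, List[str]]:
    models, mcp_ids, agent_ids = set(), set(), set()
    for group_id in group_ids:
        group = await fetch_access_group(group_id)  # one lookup per group
        if group is None:
            continue
        # Single pass: read all three fields from the same object.
        models.update(group.models or [])
        mcp_ids.update(group.mcp_server_ids or [])
        agent_ids.update(group.agent_ids or [])
    return {
        "access_group_models": sorted(models),
        "access_group_mcp_server_ids": sorted(mcp_ids),
        "access_group_agent_ids": sorted(agent_ids),
    }


async def resolve_for_teams(
    team_groups: Dict[str, List[str]]
) -> Dict[str, Dict[str, List[str]]]:
    # Resolve every team concurrently instead of awaiting one at a time.
    results = await asyncio.gather(
        *(resolve_access_group_resources(ids) for ids in team_groups.values())
    )
    return dict(zip(team_groups.keys(), results))
```

`asyncio.gather` returns results in the order the awaitables were passed, which is why zipping them back onto the team IDs is safe.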
* docs: add default_team_params to config reference and update examples
- Add default_team_params to litellm_settings reference table in
config_settings.md with all sub-fields documented
- Update self_serve.md and msft_sso.md examples to include
team_member_permissions, tpm_limit, and rpm_limit
- Fix misleading comment that implied default_team_params only applies
to SSO auto-created teams — it applies to all /team/new calls
* docs: clarify that models sub-field only applies to SSO auto-created teams
* fix: lazy import get_access_object to break cyclic import + short-circuit all-proxy-models display
- Remove get_access_object from module-level import in team_endpoints.py
and use a lazy _get_access_object wrapper to avoid cyclic dependency
- Add _prisma_client is None early-exit guard in _resolve_access_group_resources
- Short-circuit UI to show "All Proxy Models" when team.models is empty
or contains "all-proxy-models", skipping access group model resolution
* add: make organizations a select instead of read-only badges
* fix(ui): only send organization_id when changed and use raw initial value
* fix(ui): add paginated team search to usage page filter
Replace the static team dropdown on the usage page with a new
TeamMultiSelect component that uses the paginated v2/team/list
endpoint with debounced server-side search and infinite scroll.
* fix(ui): fix imports and update placeholder for team multi select
* fix(ui): wire team_id filter to key alias dropdown on Virtual Keys tab
The Key Alias dropdown on the Virtual Keys page was showing aliases from
all teams regardless of which team was selected. The team_id was never
passed through the frontend chain to the backend /key/aliases endpoint.
- Backend: add optional team_id query param to /key/aliases endpoint
- networking.tsx: add team_id param to keyAliasesCall
- useKeyAliases: accept and forward team_id to API call and query key
- filter.tsx: pass allFilters context to custom filter components
- PaginatedKeyAliasSelect: read Team ID from allFilters and pass to hook
* fix(tests): correct mock targets in TestResolveAccessGroupResources
Three tests were patching the non-existent `get_access_object` instead
of `_get_access_object` (the lazy-import wrapper), causing AttributeError.
Also added missing `prisma_client` mock so tests get past the early-exit
guard and actually exercise the resolution logic.
* fix: use direct attribute access with or [] fallback in _resolve_access_group_resources
Replace getattr(ag, "field", []) with ag.field or [] for cleaner
access and safe handling if a field is None.
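
The difference between the two forms shows up when the attribute exists but holds None. In this sketch `SimpleNamespace` is a stand-in for the access group object, and `models` is used as a sample field name.

```python
from types import SimpleNamespace

# Stand-in object whose field exists but is set to None.
ag = SimpleNamespace(models=None)

# getattr's default applies only when the attribute is missing,
# so an explicit None passes straight through.
via_getattr = getattr(ag, "models", [])

# "or []" replaces any falsy value, including None, with an empty list.
via_or = ag.models or []
```

Direct attribute access raises AttributeError if the field is absent, so this form assumes the object always defines the field, as a typed model would.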
* fix(ui): remove model source legend from team detail view
The blue/green color distinction is self-explanatory; the legend added
visual clutter without providing enough value.
* fix(ui): add missing access_group fields to TeamData.team_info type
The TeamData interface was missing access_group_models,
access_group_mcp_server_ids, and access_group_agent_ids fields,
causing a TypeScript build failure.
* perf(teams): batch-fetch access groups in single DB query
Replace per-ID _resolve_access_group_resources loop with a single
find_many call that deduplicates IDs across all teams. Removes the
N+1 query pattern on cold cache for the team list endpoint.
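
A minimal sketch of the batching idea, assuming a hypothetical `FakeAccessGroupTable` whose `find_many` takes a list of IDs. The real call goes through the Prisma client, and its exact arguments are not shown in this log.

```python
from typing import Dict, List


class FakeAccessGroupTable:
    """Hypothetical in-memory table that counts how often it is queried."""

    def __init__(self, rows: Dict[str, dict]):
        self.rows = rows
        self.query_count = 0

    def find_many(self, ids: List[str]) -> List[dict]:
        self.query_count += 1
        return [{"id": i, **self.rows[i]} for i in ids if i in self.rows]


def resolve_models_for_teams(
    table: FakeAccessGroupTable, team_groups: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    # Deduplicate IDs across all teams so shared groups are fetched once.
    unique_ids = sorted({gid for ids in team_groups.values() for gid in ids})
    rows = table.find_many(unique_ids) if unique_ids else []
    by_id = {row["id"]: row for row in rows}
    # Fan the fetched rows back out to each team from the in-memory map.
    return {
        team: sorted({m for gid in ids if gid in by_id for m in by_id[gid]["models"]})
        for team, ids in team_groups.items()
    }
```

However many teams share an access group, the table is queried once per list request rather than once per ID.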
* refactor(proxy): extract helpers to fix PLR0915 violations
Extract `_apply_non_admin_alias_scope` from `key_aliases`,
`_resolve_team_access_group_resources` from `team_info`, and
`_enforce_list_team_v2_access` from `list_team_v2` to bring each
function under ruff's 50-statement limit. No behavior changes.
* test(ui): update tests to match new team_id / access-group signatures
- useKeyAliases, PaginatedKeyAliasSelect: add trailing `undefined` to
spy matchers for the new `team_id` param on `useInfiniteKeyAliases`
and `keyAliasesCall`.
- EntityUsage: mock new `TeamMultiSelect` child so QueryClientProvider
is not required for team-entity tests.
- ModelsCell: replace the overflow-accordion test with one that
verifies the new collapse-on-`all-proxy-models` behavior (no
accordion, single badge).
* fix(ui): send null (not '') for cleared organization_id on team update
AntD <Select allowClear> returns undefined when the user clears the
selection. Coalescing to "" caused the team-update payload to carry
organization_id: "" instead of null, relying on the backend to coerce
it. Send null directly so the intent is explicit at the source.
* poetry
* chore: regen poetry.lock for litellm-proxy-extras 0.4.64 bump
* chore: update Next.js build artifacts (2026-04-04 17:55 UTC, node v22.16.0)
---------
Co-authored-by: shivam <shivam@uni.minerva.edu>
Co-authored-by: Ryan Crabbe <ryan@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>
* Tag query fix (#25094)
* feat(tag-spend): implement separate scheduler job for daily tag spend updates
* fix(docker): add g++ to build dependencies in Dockerfile
* initial test cases. TODO: check scheduler init and test cases in proxy_server related to it
* resolved QPS issue when Redis transaction buffer is enabled
* resolving circular import error flagged by greptile
* fix(mypy): use Optional[str] for api_base in PydanticAI provider to match superclass signature
---------
Co-authored-by: Shivam Rawat <shivam@berri.ai>
Co-authored-by: shivam <shivam@uni.minerva.edu>
Co-authored-by: Ryan Crabbe <ryan@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>
Co-authored-by: Harish <harishgokul01@gmail.com>
Co-authored-by: Ishaan Jaffer <ishaan@berri.ai>