litellm

Author	SHA1	Message	Date
yuneng-jiang	f2969ca78a	Merge pull request #27165 from BerriAI/litellm_/friendly-lichterman-35cf02 [Fix] CI: Enable VCR replay for test_azure_o_series	2026-05-04 20:59:46 -07:00
Yuneng Jiang	0976fbc6c4	[Fix] Tests: Restore /metrics access for prometheus test suite /metrics now requires auth by default; tests/otel_tests/test_prometheus.py makes 4+ unauthenticated GETs against http://0.0.0.0:4000/metrics, so every prometheus test in CI now fails the metric assertion. Set require_auth_for_metrics_endpoint: false in otel_test_config.yaml to opt out for this test job, which scrapes /metrics directly. Verified locally: 8/8 prometheus tests green (one flaky retry on test_proxy_success_metrics that pre-dates this PR). Also drop the -x stop-on-first-failure flag from the otel test command so all failures in the job surface in a single CI run rather than hiding behind whichever one trips first.	2026-05-04 20:54:54 -07:00
Yuneng Jiang	6a6c79d992	[Fix] CI: Enable VCR replay for test_azure_o_series The Azure o-series tests were excluded from the conftest's VCR auto-marker because of a respx/vcrpy transport-patching conflict, but the only respx reference in the file was an unused `MockRouter` import. Drop the dead import and remove the file from the conflict set so cassettes record on first run and replay thereafter, eliminating the 60-95s live Azure latency that was crashing xdist workers under --timeout=120 thread-mode timeouts.	2026-05-04 20:48:26 -07:00
Sameer Kankute	b0edffb883	Merge pull request #27103 from BerriAI/litellm_azure-deployment-image-body fix(azure): omit model from deployment image gen and image edit bodies	2026-05-05 09:09:45 +05:30
Yuneng Jiang	e6f524f951	[Fix] Tests: Pick chat-completion OTEL trace by content, not recency The /otel-spans endpoint returns process-wide spans and tags most_recent_parent by max start_time. After tightening that route to proxy_admin (sk-1234), the GET /otel-spans request itself emits auth spans that beat the chat-completion spans on start_time, so most_recent_parent now points at the request's own auth trace (['postgres', 'postgres']) and the >=5-span assertion fails. Pick the chat-completion trace by content: it is the only trace whose span list is a superset of {postgres, redis, raw_gen_ai_request, batch_write_to_db}. Verified locally end-to-end against otel_test_config.yaml + OTEL_EXPORTER=in_memory: 3/3 runs green.	2026-05-04 20:35:09 -07:00
Sameer Kankute	4487d8352f	Merge pull request #27115 from Sameerlite/litellm_health_check_reasoning_effort feat(proxy): add health_check_reasoning_effort for model health checks	2026-05-05 09:00:09 +05:30
Yuneng Jiang	8a1b6635fa	[Fix] Tests: Use master key for /otel-spans in test_chat_completion_check_otel_spans /otel-spans now requires proxy admin (returns 401 'Only proxy admin can be used to generate, delete, update info for new keys/users/teams. Route=/otel-spans' for non-admin callers). Switch the GET call to use the master key sk-1234 while keeping the generated key for the chat-completion request that produces the spans.	2026-05-04 20:23:11 -07:00
Sameer Kankute	b4ee6a2355	test(proxy): cover health_check_reasoning_effort for completion mode Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-05 08:52:57 +05:30
Sameer Kankute	bb0e4168ad	refactor(azure): move image gen JSON helper; rename image edit finalize hook - Add image_generation/http_utils.azure_deployment_image_generation_json_body; call from azure.py (keeps AzureChatCompletion focused on chat). - Rename finalize_image_edit_multipart_data to finalize_image_edit_request_data with docstring covering multipart and JSON POST payloads (review feedback). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-05 08:49:46 +05:30
Yuneng Jiang	193907a4a3	[Fix] Lint: Mark _user_has_admin_view re-export in common_utils Ruff F401 flagged the aliased import as unused within common_utils.py because the name is consumed only by external modules (~15 callers across guardrails, spend tracking, MCP, agents, management endpoints). Add `# noqa: F401 re-exported` so the alias survives lint while keeping a single source of truth in litellm.proxy._types.	2026-05-04 20:16:59 -07:00
Yuneng Jiang	8cac6c5bff	[Fix] Proxy: Address Greptile feedback on hook-cycle PR - Move _user_has_admin_view to litellm.proxy._types as user_api_key_has_admin_view (single source of truth). common_utils.py and isolation.py both import from there now, removing the duplicated role-check that could silently diverge if new admin roles are added. - Add pytest.importorskip("litellm_enterprise") to the two regression tests that assert managed_files / managed_vector_stores are registered; those keys come from ENTERPRISE_PROXY_HOOKS so the tests would fail unconditionally in a checkout without the enterprise extra installed.	2026-05-04 20:13:31 -07:00
Yuneng Jiang	727ab8dcc4	[Fix] Proxy: Break managed-resources import cycle on Python 3.13 The Python 3.13 CCI smoke matrix surfaces a partially-initialized-module ImportError when loading the managed files hook chain: litellm.proxy.hooks/__init__ (mid-import) -> enterprise.enterprise_hooks -> litellm_enterprise.proxy.hooks.managed_files -> litellm.llms.base_llm.managed_resources.isolation -> litellm.proxy.management_endpoints.common_utils -> litellm.proxy.utils (re-enters litellm.proxy.hooks) The except ImportError block in hooks/__init__.py silently swallowed the failure, leaving managed_files unregistered and POST /files returning 500 "Managed files hook not found". Two-layer fix: - Inline the 3-line _user_has_admin_view check in isolation.py instead of importing it from litellm.proxy.management_endpoints.common_utils. litellm.llms.* should not depend on litellm.proxy.* — removing this layering violation breaks the cycle at its root. - Define PROXY_HOOKS and get_proxy_hook before the conditional enterprise import in litellm/proxy/hooks/__init__.py, so any future re-entry resolves the public names instead of hitting an ImportError on a partially-initialized module. Also fold in two unrelated CCI repairs surfaced in the same staging run: - tests/otel_tests/test_key_logging_callbacks.py: per-key gcs_bucket_name / gcs_path_service_account are now stripped by initialize_dynamic_callback_params, so the GCS client falls through to the env-only branch. Update the assertion to match the new "GCS_BUCKET_NAME is not set" message. - .circleci/config.yml: tests/pass_through_tests now resolves google-auth-library@10.x via the @google-cloud/vertexai 1.12.0 bump, which uses dynamic ESM imports Jest 29 cannot load without --experimental-vm-modules. Pass that flag in the Vertex JS test step. Adds tests/test_litellm/proxy/hooks/test_proxy_hooks_init.py as a regression guard: managed_files / managed_vector_stores must register, and isolation.py must not transitively import litellm.proxy.utils.	2026-05-04 20:05:24 -07:00
Yuneng Jiang	7c8409d013	chore: update Next.js build artifacts (2026-05-05 02:13 UTC, node v20.20.2)	2026-05-04 19:13:25 -07:00
yuneng-jiang	9ea824d5bf	Merge pull request #27143 from BerriAI/cursor/fix-secret-fields-in-spend-logs-a532 fix(security): prevent secret_fields from leaking into spend logs	2026-05-04 19:07:54 -07:00
yuneng-jiang	be5f217aaf	Merge pull request #26861 from BerriAI/litellm_fix_scim_virtual_key_deactivation fix(scim): revoke virtual keys when SCIM deprovisions a user	2026-05-04 19:03:55 -07:00
Cursor Agent	5923c3209b	fix(security): prevent secret_fields from leaking into spend logs secret_fields (containing raw HTTP headers including Authorization Bearer tokens) was being included in proxy_server_request['body'] because the body snapshot was a copy.copy(data) of the full request dict. This body gets serialized and persisted in the LiteLLM_SpendLogs table, exposing user credentials in the database. Root cause: data['secret_fields'] was set before the body snapshot at data['proxy_server_request']['body'] = copy.copy(data), so the full raw headers (including auth tokens) ended up in the snapshot. Fix (defense in depth): 1. Exclude 'secret_fields' when creating the body snapshot in litellm_pre_call_utils.py (primary fix) 2. Strip 'secret_fields' in _sanitize_request_body_for_spend_logs_payload as a secondary safeguard secret_fields remains available on the live data dict for legitimate downstream consumers (MCP, Responses API). Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>	2026-05-05 02:01:41 +00:00
yuneng-jiang	555a8131fe	Merge pull request #26951 from stuxf/codex/skills-containers-tenant-guard chore(proxy): tighten resource ownership checks	2026-05-04 18:47:17 -07:00
yuneng-jiang	2f305050ce	Merge pull request #27004 from stuxf/fix/managed-resource-service-account-isolation fix(proxy): isolate managed resources for service-account API keys	2026-05-04 18:45:55 -07:00
user	3dcb6bd3f9	Merge remote-tracking branch 'upstream/litellm_internal_staging' into codex/skills-containers-tenant-guard # Conflicts: # litellm/proxy/auth/auth_utils.py	2026-05-05 01:41:25 +00:00
user	7faba9656f	Merge remote-tracking branch 'upstream/litellm_internal_staging' into fix/managed-resource-service-account-isolation	2026-05-05 01:38:11 +00:00
yuneng-jiang	281296f9cf	Merge pull request #27151 from BerriAI/litellm_yj_may4 [Infra] Merge dev branch	2026-05-04 18:29:52 -07:00
user	aee064ad37	Merge remote-tracking branch 'upstream/litellm_internal_staging' into fix/managed-resource-service-account-isolation	2026-05-05 01:29:05 +00:00
yuneng-jiang	dcb357ee2d	Merge pull request #27149 from BerriAI/litellm_/peaceful-bell-ba8ca5 [Fix] Tests: Replace deprecated openrouter/claude-3.7-sonnet with claude-sonnet-4.5	2026-05-04 18:27:45 -07:00
yuneng-jiang	efca16ccfa	Merge pull request #27043 from stuxf/fix/ssti-prompt-managers fix(security): sandbox jinja2 in gitlab/arize/bitbucket prompt managers	2026-05-04 18:23:41 -07:00
Yuneng Jiang	e35cd5af76	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_yj_may4	2026-05-04 18:22:47 -07:00
Yuneng Jiang	7f550a5d67	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_/peaceful-bell-ba8ca5	2026-05-04 18:21:33 -07:00
Yassin Kortam	db2a3cafb6	Merge pull request #27131 from BerriAI/litellm_fix/routing-groups-ui feat: routing groups ui	2026-05-04 18:16:49 -07:00
mateo-berri	4179159f0f	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_azure-deployment-image-body	2026-05-04 18:16:46 -07:00
Yassin Kortam	a56256e5ee	feat: routing groups ui	2026-05-04 18:09:14 -07:00
yuneng-jiang	42cd9493e9	Merge pull request #27071 from stuxf/fix/strip-pricing-fields chore(proxy): drop client-supplied pricing fields from request bodies	2026-05-04 18:08:41 -07:00
yuneng-jiang	68c120a68f	Merge pull request #26957 from stuxf/chore/guardrail-coverage chore(guardrails): cover multimodal + Responses-API content shapes	2026-05-04 18:01:27 -07:00
Yuneng Jiang	00d0c3e745	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_/peaceful-bell-ba8ca5	2026-05-04 17:51:54 -07:00
Yuneng Jiang	22782f3c3f	[Fix] Tests: Replace deprecated openrouter/claude-3.7-sonnet with claude-sonnet-4.5 OpenRouter has dropped active endpoints for anthropic/claude-3.7-sonnet, causing test_reasoning_content_completion to fail with a 404 "No endpoints found" error. Switch to anthropic/claude-sonnet-4.5, which is current and supports reasoning streaming.	2026-05-04 17:51:50 -07:00
user	4699b3dc81	chore(container): use delete_cache, json-encode scope key, clean test /simplify follow-ups: * Replace the two-``pop`` reach into ``cache_dict``/``ttl_dict`` with the existing public ``InMemoryCache.delete_cache(key)`` — the same idiom used elsewhere in the proxy. Bonus: ``delete_cache`` calls ``_remove_key`` which also handles ``expiration_heap`` consistency the direct pops were silently leaking. * JSON-encode the sorted scope list for the cache key instead of ``"\|".join``. ``user_id`` / ``team_id`` / ``org_id`` / ``api_key`` are free-form strings and could contain a literal ``\|`` — JSON quoting escapes any in-string separator unambiguously. * Extract ``_allowed_container_ids_cache_key()`` so the read and invalidation sites compute the key the same way. * Fix a placeholder-then-overwrite test construction: the ``__module__.split(".")[0] and "proxy_admin"`` line evaluated to a literal string that was immediately overwritten with the real enum value. Hoist the import and construct directly.	2026-05-05 00:43:47 +00:00
user	2adfa96db2	fix(container): cache list-allow-set, track admin-created containers Address Greptile P2 follow-ups from the prior round: * Cache ``_get_allowed_container_ids`` (60s LRU/TTL keyed by sorted owner-scope tuple) so ``GET /v1/containers`` doesn't issue a fresh ``find_many`` against ``litellm_managedobjecttable`` on every list call. Invalidate the caller's own cache entry when they record a new owner so the just-created container shows up on their next list. * Tighten the admin early-return in ``record_container_owner`` to skip ONLY when there's literally no container ID to stamp. An admin with identity (the master-key path populates ``user_id`` + ``api_key``) flows through the normal record path so admin-created containers are tracked like any other caller's. The truly-identity-less admin case still falls through to the 403 below — correct fail-secure default. Skill-cache invalidation gap (also flagged by Greptile) is moot: there is no skill update endpoint exposed; ownership-affecting mutations are only delete (already invalidates) and create (new ID, no cache entry to update).	2026-05-05 00:39:53 +00:00
user	6ce84effe1	chore: simplify ownership tracking — drop thin stores, in-memory fallback, hand-rolled cache Substantial reduction (~765 LOC) without changing the security boundary: * Drop ContainerOwnershipStore and LiteLLMSkillsStore — both were one-method-per-Prisma-call wrappers. Inline the calls instead, matching the established pattern in vector_store_endpoints, agent_endpoints, and mcp_server/db.py. * Drop the prisma_client is None in-memory fallback. Production deploys always have Prisma; running ownership-critical paths on a process-local dict is a security footgun in the dev-mode case it was meant to support, and complicates every code path with a branch. Fail-secure: skip recording if Prisma is unavailable, and treat reads as "not found" (admin-only). * Drop the hand-rolled module-level cache. Replace with the existing litellm.caching.in_memory_cache.InMemoryCache, which already has TTL + max-size + eviction tested in its own module. Sentinel string for negative caching since InMemoryCache can't disambiguate "miss" from "cached as None". * Tests: drop coverage for removed code paths (in-memory fallback, hand-rolled cache internals). Keep tests for actual behavior (cache hit-rate, negative caching, owner check, list filtering, identity-less reject, admin bypass).	2026-05-05 00:23:32 +00:00
user	83971a8712	fix(proxy): normalize managed resource team owner field	2026-05-04 17:05:50 -07:00
yuneng-jiang	de7175d6ab	Merge pull request #26912 from stuxf/codex/auth-sensitive-routes chore(proxy): guard sensitive public endpoints	2026-05-04 17:04:10 -07:00
user	12fe945e7b	fix: keep skills handler FastAPI-free; fold gcs deny list into the body bouncer Two cleanups: * ``LiteLLMSkillsHandler.create_skill`` raised ``HTTPException`` for identity-less callers, importing FastAPI from a ``litellm/llms/`` module — that violates the project rule that FastAPI lives only under ``proxy/``. Switch to ``ValueError`` (the same shape the rest of the handler uses for not-found/forbidden) and update the test. * The proxy-auth body bouncer derived its observability ban list from ``_supported_callback_params`` only, missing ``_request_blocked_callback_params`` (where ``gcs_bucket_name`` and ``gcs_path_service_account`` live). Two recently-merged sibling PRs (#27019 added the deny list, #27081 added the test asserting these are rejected at the request body root) crossed without folding them together. Union the GCS deny list into the bouncer's derivation so the single source of truth covers both code paths.	2026-05-04 23:54:33 +00:00
user	abcf204d38	fix(proxy): include request-blocked callback params in auth bans	2026-05-04 16:54:04 -07:00
user	b5a14f22d6	Merge remote-tracking branch 'upstream/litellm_internal_staging' into codex/skills-containers-tenant-guard	2026-05-04 23:50:29 +00:00
user	6a3f6b47de	Merge remote-tracking branch 'origin/litellm_internal_staging' into fix/strip-pricing-fields-pr27071 # Conflicts: # litellm/proxy/litellm_pre_call_utils.py	2026-05-04 16:45:21 -07:00
user	777862a018	Merge remote-tracking branch 'upstream/litellm_internal_staging' into codex/skills-containers-tenant-guard	2026-05-04 23:40:26 +00:00
user	758b488326	fix(ownership): reject identity-less callers instead of sharing a sentinel scope UNSCOPED_RESOURCE_OWNER_SCOPE collapsed every caller without an identity field (no user_id / team_id / org_id / api_key / token) into a single shared owner — a cross-tenant access primitive: any two such callers could see and delete each other's containers and skills. Drop the sentinel. ``get_primary_resource_owner_scope`` returns ``None`` and ``get_resource_owner_scopes`` returns ``[]`` for identity-less callers. ``record_container_owner`` and ``LiteLLMSkillsHandler.create_skill`` now reject creates from identity-less callers with a 403 instead of stamping the placeholder. Read paths already deny ``owner is None`` correctly so legacy rows (if any) are admin-only.	2026-05-04 23:40:22 +00:00
user	de682c810e	chore(container,skills): drop legacy-access opt-out env vars LITELLM_ALLOW_UNTRACKED_CONTAINER_ACCESS and LITELLM_ALLOW_UNOWNED_SKILL_ACCESS were operator-toggleable opt-outs for the cross-tenant access primitive this PR closes — flipping either on re-enabled exactly the VERIA-20 read path. Default-secure with no escape hatch matches sibling fixes (vector-store cred isolation, semantic cache key isolation, user_config strip): all rejected the opt-out-of-security pattern. Untracked containers and unowned skills (rows that pre-date this enforcement) are admin-only. Non-admin owners need to either re-create via the now-tracked flow or have an admin assign ``created_by`` on the existing row. Update tests to assert the strict-only behaviour.	2026-05-04 23:22:19 +00:00
yuneng-jiang	07824b5eec	Merge pull request #26990 from stuxf/codex/semantic-cache-tenant-isolation chore(caching): isolate semantic cache entries	2026-05-04 16:02:43 -07:00
yuneng-jiang	0c0b5e005f	Merge pull request #27082 from stuxf/fix/vector-store-cred-leak fix(vector_store): resolve embedding config at request time, never persist creds	2026-05-04 15:55:40 -07:00
user	ec9b84d38c	chore(container,skills): LRU eviction for owner caches; widen file_purpose Literal Two cleanups from the /simplify pass: * ``_CONTAINER_OWNER_CACHE`` and ``_SKILL_CACHE`` now LRU-evict via ``OrderedDict.popitem(last=False)`` instead of full ``clear()`` at capacity. Full clears converted a steady-state cached workload into a periodic full-DB-load oscillation as the cache repopulated from zero and cleared again. Reads now ``move_to_end`` so the just-touched entry survives the next eviction. Mirrors the pre-existing LRU pattern in ``_remember_container_owner``. * ``LiteLLM_ManagedObjectTable.file_purpose`` Literal now includes ``"container"`` so Pydantic validation accepts rows written by the ownership store.	2026-05-04 22:52:54 +00:00
yuneng-jiang	e4ac46b5d1	Merge pull request #27081 from stuxf/fix/strip-callback-fields chore(proxy): close callback-config and observability-credential side channels	2026-05-04 15:45:42 -07:00
user	4fa577810b	fix(container): keep ownership-filter exceptions out of the LLM-error path filter_container_list_response runs after the upstream call has already succeeded; treating an ownership-lookup failure as an LLM-API error fires post_call_failure_hook for a successful upstream call and returns a misleading provider-shaped error to the client. Run the filter outside the try/except so genuine LLM errors stay scoped to the upstream call.	2026-05-04 22:43:18 +00:00

1 2 3 4 5 ...

39037 Commits