litellm

Author	SHA1	Message	Date
Yuneng Jiang	491aa7ea51	[Fix] CI: fix 6 more CircleCI job failures from uv migration 1. check_code_and_doc_quality: add PLR0915 ignore for sandbox_executor.py 2. auth_ui_unit_tests: add prisma generate step (entrypoint.sh only runs migration, not client generation) 3. proxy_store_model_in_db_tests: move does_mcp_server_exist outside `if MCP_AVAILABLE:` so it's importable on Python < 3.10 4. build_and_test: fix datetime.fromisoformat('...Z') on Python 3.9 (Z suffix support was added in 3.11) 5. proxy_logging_guardrails: fix container name typo my-app-2 -> my-app-3 6. upload-coverage: use `uv tool run` instead of `uv run --with` to avoid resolving the full workspace (which fails for Python 3.14)	2026-04-10 21:06:25 -07:00
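Item 4 of the commit above turns on a version difference: `datetime.fromisoformat` accepts a trailing `Z` only from Python 3.11 onward. A minimal sketch of a portable workaround follows, assuming the timestamps are UTC with a literal `Z` suffix. The helper name `parse_iso_utc` is hypothetical and is not taken from the commit.

```python
from datetime import datetime, timezone


def parse_iso_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, tolerating a trailing 'Z' on Python < 3.11."""
    # Older interpreters reject the 'Z' designator, so rewrite it as an
    # explicit UTC offset, which fromisoformat has accepted since Python 3.7.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


parsed = parse_iso_utc("2026-04-10T21:06:25Z")
```

The result is timezone-aware, so it compares safely against other aware datetimes.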
Yuneng Jiang	93d340c1ad	[Fix] CI: fix uv migration breaking 7 CircleCI jobs setup_litellm_enterprise_pip was running `uv sync --package litellm-enterprise` which overwrites the shared .venv, stripping out pytest, prisma, and other dev/test deps. Since litellm-enterprise is a workspace member it is already installed by the main `uv sync --all-groups --all-extras`. Replace with a verification-only import check. Also prefix bare `ruff check` with `uv run --no-sync` since uv does not auto-activate the venv.	2026-04-10 17:09:33 -07:00
Yuneng Jiang	bb7ac7c4ca	[Fix] Finish uv migration for redis_caching, e2e_ui, and fix prisma/black in CI - Replace `uv run --no-sync prisma generate` with `python -m prisma generate` in proxy_part1, proxy_part2, and enterprise jobs (fixes spawn error) - Migrate redis_caching_unit_tests from requirements.txt to uv sync - Migrate e2e_ui_testing from requirements.txt to uv sync, replace bare prisma/python calls with uv run equivalents - Bump venv cache keys from v1 to v2 with config.yml checksum to bust stale caches missing black and other dev dependencies	2026-04-10 16:10:50 -07:00
Yuneng Jiang	4b6eb02b66	[Fix] Pin uv/pip versions and fix bare prisma calls in CI - Pin `pip==26.0.1` and `uv==0.10.9` in CCI jobs that used unpinned `pip install uv` (redis_caching_unit_tests, ui_e2e_tests) - Replace bare `prisma generate` with `uv run --no-sync prisma generate` in proxy_part1, proxy_part2, and enterprise test jobs - Remove duplicate `check=True` kwarg in test_basic_python_version.py that caused TypeError with `_run_uv()` helper	2026-04-10 00:04:32 -07:00
stuxf	a6c30b30bf	build: migrate packaging, CI, and Docker from Poetry to uv (#25007 ) * build: migrate packaging metadata to uv * ci: move automation and local tooling to uv * docker: migrate image builds and runtime setup to uv * docs: update install and deployment guidance for uv * chore: align auxiliary scripts and tests with uv * test: harden test_litellm isolation * fix: keep release and health check images self-contained * build: pin uv tooling and health check deps * test: isolate bedrock image request formatting from suite state * test: cover sandbox executor requirements flow * ci: fix circleci no-op command steps * ci: fix circleci publish workflow parsing * fix: stabilize remaining uv migration CI checks * ci: increase matrix test timeout headroom * fix: restore published docker and license coverage * fix: restore proxy runtime build parity * fix: restore proxy extras parity and venv migrations * ci: persist uv path across circleci steps * fix: keep psycopg binary in default test env * docker: preserve prisma cache across stages * test: run local proxy checks through uv python * build: restore runtime deps moved into ci * build: refresh uv lock after upstream merge * fix: restore module import in test_check_migration after merge The conflict resolution imported only the function but the test body references check_migration as a module throughout. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching - Move google-generativeai, Pillow, tenacity back to ci group (they are lazily imported and bloat the base SDK install needlessly) - Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant in Docker where system Node.js is already installed via apk) - Remove all nodejs-wheel node replacement and venv npm patching blocks from Dockerfiles since the wheel is no longer installed - Add --no-default-groups to CodSpeed benchmark workflow so the benchmark environment matches the old minimal pip install footprint - Apply standard uv two-phase Docker pattern: copy metadata first, install deps (cached layer), then copy source and install project - Replace CircleCI enterprise no-op with proper uv sync command Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate uv.lock after removing nodejs-wheel-binaries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): use cache/restore instead of cache to prevent cache poisoning The old workflow used actions/cache/restore (read-only). The uv migration changed it to actions/cache (read-write), which zizmor flags as a cache poisoning risk. Restore the safer read-only variant. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert The setup-uv action enables caching by default, which zizmor flags as a cache poisoning risk. Disable it since we already use a read-only cache/restore step. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): disable setup-uv cache in publish workflow Silences zizmor cache-poisoning alert. Publishing workflow runs infrequently on protected branches so caching adds no real benefit. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(test): remove duplicate verbose_logger mock in test_check_migration The logger was patched twice — first via mocker.patch() then via mocker.patch.object(autospec=True). The second call fails because autospec cannot inspect an already-mocked attribute. Remove the redundant first patch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): free disk space before Docker build in test-server-root-path The Dockerfile.non_root build ran out of disk on the CI runner. Remove Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 11:46:23 -07:00
yuneng-jiang	072d4108c3	Merge pull request #25365 from BerriAI/litellm_e2e_ui_tests [Feature] UI E2E Tests: Proxy Admin Team and Key Management	2026-04-08 15:29:10 -07:00
Yuneng Jiang	4ee7d42981	[Fix] Restructure HTML files after UI build so extensionless routes work in CI	2026-04-08 13:24:52 -07:00
Yuneng Jiang	ac9ebdf4d8	[Fix] Rename CI job to e2e_ui_testing and remove duplicate old job definition	2026-04-08 13:17:45 -07:00
Yuneng Jiang	a8f4f464ce	[Fix] Add missing test fixtures and address review feedback - Add constants.ts with all required exports (key aliases, team IDs) - Add fixtures/users.ts with all role definitions and storage paths - Add fixtures/seed.sql for deterministic test database seeding - Remove Firefox project from playwright config (only Chromium installed) - Remove unused variable in teams.spec.ts - Rename CircleCI job to e2e_ui_testing	2026-04-08 12:40:41 -07:00
Yuneng Jiang	d09d98a70a	[Feature] E2E UI tests: proxy-admin team and key management with CI integration Add Playwright E2E tests covering proxy admin team and key management workflows, with a self-contained test runner and CircleCI integration. Tests cover: create team, invite user, edit/delete team members, create key in team, regenerate key, update TPM/RPM limits, delete key, and verify internal user keys are visible. Infrastructure: run_e2e.sh builds the UI from source before starting the proxy, ensuring tests always run against the latest UI changes. Added data-testid attributes to key UI components for reliable selectors.	2026-04-08 11:51:15 -07:00
Yuneng Jiang	7ba0c69a07	[Fix] Install pytest-rerunfailures in redis caching CircleCI job	2026-04-08 11:50:00 -07:00
Yuneng Jiang	0104b60d8e	[Infra] Add redis_caching_coverage to coverage combine command	2026-04-08 10:48:41 -07:00
Yuneng Jiang	3a02c0ac6b	[Infra] Migrate Redis caching tests from GHA to CircleCI Redis caching unit tests (test_dual_cache, test_redis_batch_optimizations, test_router_utils) required Redis secrets that should live in CircleCI. - Add redis_caching_unit_tests job to CircleCI config - Delete test-unit-caching-redis.yml GHA workflow - Remove all Redis plumbing (inputs, secrets, env vars) from _test-unit-services-base.yml and its callers	2026-04-08 09:07:12 -07:00
yuneng-jiang	a60e19aeb8	Remove flaky proxy_e2e_azure_batches_tests CI workflow (#25247) The proxy_e2e_azure_batches_tests workflow is consistently flaky and does not provide reliable signal on whether changes break anything. Remove the workflow from both CircleCI and GitHub Actions, along with the test directory it exclusively used. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 16:49:14 -07:00
Yuneng Jiang	006d481025	[Fix] Remove neon CLI dependency and pin all JS dependencies Remove @neondatabase/api-client and neonctl to address CVE-2026-25639 (axios supply chain vulnerability). Pin all JS dependencies to exact versions across all package.json files to prevent future supply chain attacks via semver range resolution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 16:15:32 -07:00
Yuneng Jiang	85f72c9d24	[Fix] Remove unused aioboto3 dependency and botocore conflict workarounds aioboto3 was listed as a dependency for async sagemaker calls but is not imported anywhere in the codebase — async calls use httpx + botocore SigV4 instead. Removing it eliminates the unresolvable botocore version conflict between boto3 and aiobotocore, along with all grep -v / --no-deps workarounds across Dockerfiles and CI. Also addresses Greptile review feedback: collapse redundant grpcio python-version markers, bump pyproject.toml cryptography to 46.0.5 to match Docker (GHSA-r6ph-v2qm-q3c2), and fix misleading .npmrc comment. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 14:25:44 -07:00
Yuneng Jiang	821a634d25	[Fix] Handle boto3/aioboto3 botocore conflict across CI and Docker builds boto3==1.42.80 and aioboto3==15.5.0 have incompatible botocore version ranges. No aioboto3 release supports botocore 1.42.x yet. Both uv and pip 26.0.1 reject the resolution. Fix: filter aioboto3 out of requirements.txt at install time, then install aioboto3+aiobotocore with --no-deps to bypass resolution. Added wrapt and aioitertools to requirements.txt as pinned transitive deps of aiobotocore (skipped by --no-deps). Fixed pip stdin handling (/dev/stdin). Applied to all 5 Dockerfiles and all CircleCI install paths. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 12:27:21 -07:00
Yuneng Jiang	fc8eb81549	[Fix] Filter aioboto3 from resolver to fix boto3/aioboto3 conflict boto3==1.42.80 and aioboto3==15.5.0 have incompatible botocore ranges. Both uv and pip 26.0.1 reject the resolution. Fix: filter aioboto3 out of requirements.txt at install time, then install aioboto3+aiobotocore separately with --no-deps to bypass resolution. Removes uv-overrides.txt which only partially addressed the conflict. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 11:57:07 -07:00
Yuneng Jiang	467abd1909	[Fix] Add uv override for boto3/aioboto3 botocore conflict boto3==1.42.80 requires botocore>=1.42.80 but aioboto3==15.5.0 (via aiobotocore==2.25.1) requires botocore<1.40.62. No aioboto3 release supports botocore 1.42.x yet. pip's lenient resolver handles this for Docker builds, but uv's strict resolver rejects it in CI. Added uv-overrides.txt to force botocore to match boto3 during uv installs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 11:34:19 -07:00
Yuneng Jiang	43077af378	[Fix] Sync CircleCI dependency pins with requirements.txt CircleCI had stale version pins (e.g. boto3==1.36.0, aioboto3==13.4.0) that conflict with requirements.txt (boto3==1.42.80, aioboto3==15.5.0), causing uv resolution failures. Updated all mismatched pins across config.yml and .circleci/requirements.txt to match requirements.txt as the source of truth. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 11:27:44 -07:00
Ishaan Jaffer	be553c7204	fix aporia	2026-03-30 21:49:31 -07:00
Krrish Dholakia	cbd6253f9c	test: skip chromium/firefox check - TODO: move to a dynamic db	2026-03-30 20:55:27 -07:00
Ishaan Jaffer	f1e7aa9dbb	db_migration_disable_update_check	2026-03-30 19:46:18 -07:00
Ishaan Jaffer	3fbe7d1059	prisma_schema_sync	2026-03-30 19:40:25 -07:00
Krrish Dholakia	37440c28b7	test: use dynamic db	2026-03-30 19:33:52 -07:00
Yuneng Jiang	7aec9101f5	[Infra] Remove CircleCI jobs now covered by GitHub Actions Removes 10 CircleCI jobs that are fully duplicated by GHA workflows: - caching_unit_tests → test-unit-caching-redis.yml - litellm_security_tests → test-unit-security.yml - litellm_proxy_unit_testing_{key_generation,part1,part2} → test-unit-proxy-db.yml + test-unit-proxy-legacy.yml - litellm_mapped_tests_{llms,core,litellm_core_utils,mcps,integrations} → test-unit-*.yml workflows Also cleans up upload-coverage requires and workflow entries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 19:06:48 -07:00
Yuneng Jiang	ba8455a3be	[Infra] Migrate PyPI publishing from CircleCI to GitHub Actions OIDC - Add .github/workflows/publish_to_pypi.yml with OIDC trusted publisher - Remove publish_to_pypi job from .circleci/config.yml - Zero long-lived tokens, all actions SHA-pinned, build deps version-pinned	2026-03-26 19:02:14 -07:00
Ishaan Jaff	81dadb698a	Ishaan - March 18th changes (#24056 ) * add DD Tracing (#24033) * feat(models): add Azure GPT-5.4 mini and nano variants (#24045) Add `azure/gpt-5.4-mini` and `azure/gpt-5.4-nano` to the model database with official pricing from Azure OpenAI: - GPT-5.4 mini: $0.75/M input, $0.075/M cached, $4.5/M output - GPT-5.4 nano: $0.20/M input, $0.02/M cached, $1.25/M output Both models support: - 1.05M input / 128K output context window - Chat, batch, and responses endpoints - Function calling, tools, vision, reasoning - Prompt caching with automatic tiered pricing Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * Add new model pricing details for volcengine Doubao-Seed-2.0 series (#23871) Add entries for volcengine Doubao-Seed-2.0 series * fix(mcp): support refresh_token grant type in OAuth token endpoint (#23701) * fix(mcp): support refresh_token grant type in OAuth token endpoint (#23700) The .well-known/oauth-authorization-server metadata advertises refresh_token as a supported grant type, but the token endpoint rejected it with HTTP 400. This adds refresh_token grant support so MCP clients can refresh expired tokens without re-authenticating. * test(mcp): add tests for refresh_token grant type in OAuth token endpoint * fix(mcp): move code_verifier guard into authorization_code branch code_verifier is only relevant for authorization_code grants (PKCE). Move it inside the else branch so it doesn't apply to refresh_token. 
* fix(mcp): guard None client_secret and forward scope in token exchange - Conditionally include client_secret in form data to prevent httpx from sending the literal string "None" (applies to both authorization_code and refresh_token branches) - Forward optional scope parameter per RFC 6749 §6, allowing clients to request a subset of originally-granted scopes on refresh * fix(mcp): validate code param in authorization_code grant Guard against None code being form-encoded as literal string "None" by httpx, symmetric with the existing refresh_token guard. * docs: add incident report for guardrail logging secret exposure (#24059) Add blog post documenting the guardrail logging path exposing internal request data (e.g. Authorization headers) in spend logs and OTEL traces. Fix available in LiteLLM 1.82.3+. Made-with: Cursor * [Fix] Datadog LLM Observability tags format (env, service, version missing) (#23673) * tag fix * greptile comment * fix(ci): stabilize 6 failing CI jobs 1. mypy: remove duplicate type annotation for token_data in discoverable_endpoints.py 2. integrations tests: add parameterized to CI test deps 3. doc quality: document OTEL_IGNORE_CONTEXT_PROPAGATION env key 4. security: allowlist CVE-2026-2673, CVE-2026-3644, CVE-2026-4224 (no fix available) 5. proxy_store_model_in_db: fix missing x-litellm-call-id header on error responses 6. google tests: add --retries 3 for transient Vertex AI rate limits Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(streaming): handle RuntimeError during model_copy in streaming handler The race condition occurs when model_copy(deep=True) tries to deepcopy _hidden_params dict while it's being concurrently modified by logging callbacks. Fall back to shallow copy if the deep copy fails. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(cost): handle non-string traffic_type in cost calculator + add retries 1. 
Fix AttributeError in _map_traffic_type_to_service_tier when traffic_type is an integer (cast to str before calling .upper()). This was causing pass-through vertex spend logging to fail silently. 2. Add --retries to llm_translation_testing for flaky external API calls. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> --------- Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: ExMatics HydrogenC <33123710+HydrogenC@users.noreply.github.com> Co-authored-by: Jack Venberg <jack.venberg@rover.com> Co-authored-by: milan-berri <milan@berri.ai> Co-authored-by: Shivam Rawat <161387515+shivamrawat1@users.noreply.github.com> Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>	2026-03-19 10:20:35 -07:00
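The `traffic_type` fix described in the commit above (cast to `str` before calling `.upper()`) can be illustrated roughly as follows. This is a sketch under stated assumptions: the mapping keys and values are hypothetical placeholders, and the function is a stand-in for, not a copy of, `_map_traffic_type_to_service_tier`.

```python
from typing import Any, Dict, Optional

# Hypothetical lookup table, for illustration only.
_EXAMPLE_TIERS: Dict[str, str] = {
    "EXAMPLE_STANDARD": "standard",
    "EXAMPLE_PRIORITY": "priority",
}


def map_traffic_type(traffic_type: Any) -> Optional[str]:
    """Return a tier for the given traffic type, or None if unknown."""
    if traffic_type is None:
        return None
    # An int has no .upper(), so coerce to str before normalizing the case.
    key = str(traffic_type).upper()
    return _EXAMPLE_TIERS.get(key)
```

With the coercion in place, an unexpected integer yields `None` instead of raising `AttributeError`.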
yuneng-jiang	278c9babc6	[Infra] Merging RC Branch with Main (#23786 ) * fix(test): add missing mocks for test_streamable_http_mcp_handler_mock The test was missing mocks for extract_mcp_auth_context and set_auth_context, causing the handler to fail silently in the except block instead of reaching session_manager.handle_request. This mirrors the fix already applied to the sibling test_sse_mcp_handler_mock. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(ci): route OpenAI models through chat completions in pass-through tests The test_anthropic_messages_openai_model_streaming_cost_injection test fails because the OpenAI Responses API returns 400 for requests routed through the Anthropic Messages endpoint. Setting LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES=true routes OpenAI models through the stable chat completions path instead. Cost injection still works since it happens at the proxy level. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(ci): fix assemblyai custom auth and router wildcard test flakiness 1. custom_auth_basic.py: Add user_role='proxy_admin' so the custom auth user can access management endpoints like /key/generate. The test test_assemblyai_transcribe_with_non_admin_key was hidden behind an earlier -x failure and was never reached before. 2. test_router_utils.py: Add flaky(retries=3) and increase sleep from 1s to 2s for test_router_get_model_group_usage_wildcard_routes. The async callback needs time to write usage to cache, and 1s is insufficient on slower CI hardware. 
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * ci: retrigger CI pipeline Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(mypy): use LitellmUserRoles enum instead of raw string in custom_auth_basic Fixes mypy error: Argument 'user_role' has incompatible type 'str'; expected 'LitellmUserRoles | None' Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: don't close HTTP/SDK clients on LLMClientCache eviction (#22926) * fix: don't close HTTP/SDK clients on LLMClientCache eviction Removing the _remove_key override that eagerly called aclose()/close() on evicted clients. Evicted clients may still be held by in-flight streaming requests; closing them causes: RuntimeError: Cannot send a request, as the client has been closed. This is a regression from commit `fb72979432`. Clients that are no longer referenced will be garbage-collected naturally. Explicit shutdown cleanup happens via close_litellm_async_clients(). Fixes production crashes after the 1-hour cache TTL expires. * test: update LLMClientCache unit tests for no-close-on-eviction behavior Flip the assertions: evicted clients must NOT be closed. Replace test_remove_key_closes_async_client → test_remove_key_does_not_close_async_client and equivalents for sync/eviction paths. Add test_remove_key_removes_plain_values for non-client cache entries. Remove test_background_tasks_cleaned_up_after_completion (no more _background_tasks). Remove test_remove_key_no_event_loop variant that depended on old behavior. * test: add e2e tests for OpenAI SDK client surviving cache eviction Add two new e2e tests using real AsyncOpenAI clients: - test_evicted_openai_sdk_client_stays_usable: verifies size-based eviction doesn't close the client - test_ttl_expired_openai_sdk_client_stays_usable: verifies TTL expiry eviction doesn't close the client Both tests sleep after eviction so any create_task()-based close would have time to run, making the regression detectable. 
Also expand the module docstring to explain why the sleep is required. * docs(AGENTS.md): add rule — never close HTTP/SDK clients on cache eviction * docs(CLAUDE.md): add HTTP client cache safety guideline * [Fix] Install bsdmainutils for column command in security scans The security_scans.sh script uses `column` to format vulnerability output, but the package wasn't installed in the CI environment. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: handle string callback values in prometheus multiproc setup When callbacks are configured as a plain string (e.g., `callbacks: "my_callback"`) instead of a list, the proxy crashes on startup with: TypeError: can only concatenate str (not "list") to str Normalize each callback setting to a list before concatenating. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * bump: version 1.82.2 → 1.82.3 * fix(test): update test_startup_fails_when_db_setup_fails for opt-in enforcement The --enforce_prisma_migration_check flag is now required to trigger sys.exit(1) on DB migration failure, after #23675 flipped the default behavior to warn-and-continue. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(cost_calculator): use model name for per-request custom pricing when router_model_id has no pricing When custom pricing is passed as per-request kwargs (input_cost_per_token/output_cost_per_token), completion() registers pricing under the model name, but _select_model_name_for_cost_calc was selecting the router deployment hash (which has no pricing data), causing response_cost to be 0.0. Now checks whether the router_model_id entry actually has pricing before preferring it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 15:32:20 -07:00
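The prometheus multiproc fix in the commit above normalizes each callback setting to a list before concatenating. A minimal sketch of that idea, assuming a setting may be a string, a list, or unset; the helper name `as_list` is hypothetical:

```python
from typing import List, Optional, Union


def as_list(value: Optional[Union[str, List[str]]]) -> List[str]:
    """Normalize a callback setting to a list of callback names."""
    if value is None:
        return []
    if isinstance(value, str):
        # A plain string such as "my_callback" becomes a one-element list.
        return [value]
    return list(value)


# Concatenation is now safe regardless of how each setting was written.
combined = as_list("my_callback") + as_list(["a", "b"]) + as_list(None)
```

Without the normalization, `"my_callback" + ["a", "b"]` raises the `TypeError` quoted in the commit message.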
yuneng-jiang	9cec81a087	[Fix] Revert proxy unit test groupings to prevent xdist state pollution Part1 had 4 test files combined (was originally 2), causing cross-file state pollution under xdist. Reverted to original grouping. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 00:48:56 -07:00
yuneng-jiang	2372427dbc	[Fix] Remove xdist from caching_unit_tests to fix GCS cache test failures GCS cache tests (test_gcs_cache_unit_tests.py) rely on module-level state (vertex_chat_completion singleton, credential caches) that importlib.reload resets but the xdist-safe function-scoped fixture does not. Removing -n 4 from this job restores single-process execution where module reload properly resets all state before each test, while CI-level parallelism (parallelism: 2) still splits test files across nodes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 00:23:04 -07:00
yuneng-jiang	96183e8bde	[Fix] Drop --no-deps from aurelio_sdk in guardrails and enterprise tests aurelio_sdk imports requests_toolbelt at load time, so it needs its deps. Unlike semantic_router, aurelio_sdk has no conflict with openai>=2, so --no-deps is unnecessary. Verified via uv dry-run locally. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 23:11:43 -07:00
yuneng-jiang	f68a9be04d	[Infra] Optimize CI: migrate litellm_security_tests from machine to docker xlarge Switch from expensive Linux machine (medium) to docker xlarge executor. Drop miniconda, manual Docker CLI install, and manual PostgreSQL container in favor of cimg/python:3.13, setup_remote_docker, and service container. Use uv + cache for dependency installation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 23:07:22 -07:00
yuneng-jiang	eba54bae11	[Fix] Add aurelio_sdk --no-deps alongside semantic_router in guardrails and enterprise tests semantic_router imports aurelio_sdk at module load time, so it must be installed even when using --no-deps. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 23:02:35 -07:00
yuneng-jiang	ae1e827319	[Infra] Optimize CI: add xdist to caching tests, drop Docker CLI installs, reduce verbosity - caching_unit_tests: add resource_class large, enable xdist -n 4, drop unused coverage collection - build_and_test & proxy_pass_through_endpoint_tests: remove redundant Docker CLI install (machine executor has it) - Downgrade -vv to -v across 4 jobs to reduce log noise Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 22:41:27 -07:00
yuneng-jiang	f07301a518	[Infra] Optimize CI: right-size resource classes, drop unused coverage, increase xdist workers Downgrade langfuse, assistants, and python 3.13 install jobs to medium (were defaulting to large at ~25% CPU). Bump enterprise and image_gen xdist workers to -n 4 on explicit large instances. Drop coverage collection and persist_to_workspace for 4 jobs that no longer need it. Downgrade verbosity from -vv to -v across all 5 jobs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 22:27:50 -07:00
yuneng-jiang	65b3335735	[Infra] Use uv for requirements.txt installs across 22 CI jobs Switch pip install -r requirements.txt to uv pip install --system -r requirements.txt for all docker-based jobs that use the main requirements.txt. This applies the same optimization already proven in the mapped test jobs to the rest of the CI pipeline. Also adds --no-deps to semantic_router installs in guardrails_testing and litellm_mapped_enterprise_tests to avoid uv's strict resolution conflict with openai>=2. Skipped: machine executor + conda jobs (security, proxy_spend_accuracy, proxy_multi_instance, proxy_store_model_in_db) and Group B jobs using .circleci/requirements.txt. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 22:22:44 -07:00
yuneng-jiang	379c3952f4	[Fix] Use uv for requirements.txt only, pip for test deps with conflicting pins uv's strict resolver rejects transitive dep conflicts (semantic-router wants openai<2, llm-sandbox wants pydantic>=2.11.5). Use uv for the heavy requirements.txt install and pip for the small test dep batch. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 21:58:07 -07:00
yuneng-jiang	26207bb7be	[Infra] Speed up mapped test jobs: uv installs, site-packages caching, drop unused coverage - Switch setup_litellm_test_deps from pip to uv with batched installs - Cache installed site-packages (~/.local/lib, ~/.local/bin) instead of pip download cache for near-instant installs on cache hit - Remove unused coverage collection from 6 mapped test jobs (only mcps coverage is consumed by the coverage combine step) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 21:50:18 -07:00
yuneng-jiang	1a00dd4dbb	Fix router test isolation for xdist and rebalance proxy unit tests Router tests: expand conftest save/restore to cover all globals mutated by router tests (default_fallbacks, tag_budget_config, request_timeout, enable_azure_ad_token_refresh, num_retries_per_request, model_cost, token_counter). These were leaking across xdist workers. Proxy tests: move test_proxy_utils.py (169 parametrized) and test_proxy_server.py (72 parametrized) from part2 to part1, balancing ~370 vs ~360 tests (was ~129 vs ~600). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 21:36:56 -07:00
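The save/restore pattern mentioned for the router tests above can be sketched as a context manager. This is an illustration only: `fake_module` stands in for a module with mutable globals, and `preserve_attrs` is a hypothetical helper, not the project's conftest code.

```python
import copy
from contextlib import contextmanager
from types import SimpleNamespace
from typing import Iterator, Sequence


@contextmanager
def preserve_attrs(target: object, names: Sequence[str]) -> Iterator[None]:
    """Snapshot the named attributes, then restore them on exit."""
    saved = {name: copy.deepcopy(getattr(target, name)) for name in names}
    try:
        yield
    finally:
        # Restore even if the body raised, so later tests see clean state.
        for name, value in saved.items():
            setattr(target, name, value)


# Stand-in for a module whose globals the tests mutate.
fake_module = SimpleNamespace(num_retries_per_request=None, default_fallbacks=None)

with preserve_attrs(fake_module, ["num_retries_per_request", "default_fallbacks"]):
    fake_module.num_retries_per_request = 5
    fake_module.default_fallbacks = ["some-model"]
```

In a pytest suite the same logic would typically live in an autouse fixture, with the `yield` separating setup from teardown.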
yuneng-jiang	74e57bdd27	Optimize CI test jobs: increase xdist workers, drop coverage, add caching Increase pytest-xdist parallelism to match available CPU on I/O-bound and CPU-bound test jobs. Drop coverage collection from 8 jobs (still collected by ~15 other jobs). Add dependency caching to 4 uncached jobs. Reduce verbose output (-vv to -v) and remove -s/--log-cli-level overhead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 21:20:50 -07:00
yuneng-jiang	c1efbd3c8a	[Fix] Drop httpx and opentelemetry pins from local_testing batched installs Batching pip installs surfaced more hidden conflicts: - respx==0.22.0 requires httpx>=0.25.0, conflicting with httpx==0.24.1 - traceloop-sdk==0.21.1 requires otel-semantic-conventions<0.46, conflicting with opentelemetry-sdk==1.25.0 (needs ==0.46b0) These were masked before because separate pip install calls let later installs silently override earlier pins. Dropping the pins lets pip resolve compatible versions. Verified with pip --dry-run locally. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 21:10:25 -07:00
yuneng-jiang	59f0db0538	[Fix] Remove anyio==4.2.0 pin from local_testing batched installs Batching pip installs exposed a dependency conflict: langfuse==2.59.7 requires anyio>=4.4.0, which conflicts with the anyio==4.2.0 pin. Dropping the pin lets pip resolve a compatible version. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 21:02:50 -07:00
yuneng-jiang	f3cc292daf	[Infra] Speed up CI: batch pip installs and fix pytest -n parallelism Several test jobs were underutilizing their CPU allocation (~25%) because they were either missing pytest-xdist -n or using -n 2 on 4-vCPU machines. Batching individual pip install calls into single commands reduces resolver overhead and saves ~30-60s per job. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 20:54:46 -07:00
yuneng-jiang	19e8a16cce	Optimize logging_testing CI: suppress DEBUG logs, fix xdist isolation - Add LITELLM_LOG=WARNING to suppress verbose DEBUG log output - Remove -s flag to stop capturing all stdout - Bump xdist workers from -n 2 to -n 4 - Add --timeout=120 for safety - Rewrite conftest.py to use save/restore pattern (matching guardrails_tests) instead of per-function importlib.reload + event loop creation Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 18:24:57 -07:00
yuneng-jiang	ec537dd973	Add -n 2 parallelism to 8 medium jobs running serial tests These jobs run on medium (2 CPU) but weren't using pytest-xdist, leaving the second CPU idle. Added pytest-xdist dep and -n 2 to: - auth_ui_unit_tests (~33 tests) - litellm_router_unit_testing (~191 tests) - mcp_testing (~112 tests) - llm_responses_api_testing (~80 tests) - search_testing (~53 tests) - batches_testing (~45 tests) - litellm_utils_testing (~205 tests) - pass_through_unit_testing (~102 tests) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 17:15:54 -07:00
yuneng-jiang	a95cae6b1e	Speed up build_docker_database_image: drop Docker upgrade, use zstd - Use ubuntu-2204:2024.04.1 which ships with a recent Docker, eliminating the 1-minute `curl get.docker.com | sh` upgrade step - Switch image save/load from gzip to zstd -1 -T0 for ~3-5x faster compression/decompression, saving ~30s on save and on each downstream load Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 17:09:37 -07:00
yuneng-jiang	4c4246ab4a	Downgrade oversized resource classes to match actual workload - proxy_unit_testing_key_generation: large→medium (serial, 1 test file) - proxy_unit_testing_part1: large→medium, -n 4→2 (only 2 test files) - mapped_tests_proxy_part1: xlarge→large, -n 8→4 (~2000 tests, 4 CPUs sufficient) Saves ~40 credits/min across these 3 jobs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 16:19:43 -07:00
yuneng-jiang	65575f3992	Fix pytest -n worker oversubscription to match available CPUs 8 jobs had -n workers set higher than available vCPUs, causing context switch overhead and degraded performance. Aligned -n to match resource_class: - medium (2 CPU): enterprise -n 8→2, image_gen/logging/guardrails -n 4→2 - large (4 CPU): proxy_part1/llms/core/integrations -n 8→4 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 15:59:53 -07:00
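The alignment rule in the commit above (never request more `-n` workers than the resource class has vCPUs) amounts to a simple clamp. A rough sketch, using only the CPU counts stated in the commit message; the function name `xdist_workers` is hypothetical:

```python
# vCPU counts as stated in the commit message above.
VCPUS_BY_RESOURCE_CLASS = {"medium": 2, "large": 4}


def xdist_workers(resource_class: str, requested: int) -> int:
    """Clamp the requested pytest-xdist worker count to the available vCPUs."""
    available = VCPUS_BY_RESOURCE_CLASS[resource_class]
    return max(1, min(requested, available))


enterprise_workers = xdist_workers("medium", 8)
proxy_part1_workers = xdist_workers("large", 8)
```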
yuneng-jiang	a66b635a4c	Upsize ui_build and ui_unit_tests CI machines for faster feedback ui_build (medium → medium+) is on the critical path blocking 3 downstream jobs. ui_unit_tests (medium+ → large, maxForks 3 → 5) targets ~7 min from ~11. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 15:36:59 -07:00

783 Commits