litellm

Author	SHA1	Message	Date
user	bfdd786962	chore(deps): refresh dependency locks	2026-05-04 11:36:18 -07:00
Mateo Wang	05439530c2	Merge branch 'litellm_internal_staging' into litellm_vcr-cassette-llm-tests-af37	2026-05-01 14:37:48 -07:00
Yuneng Jiang	6da13efcec	uv lock	2026-04-30 21:40:09 -07:00
Cursor Agent	05333e42ba	tests(llm_translation): switch to pytest-recording for marker-based bulk capture Per Yuneng's feedback, use a single @pytest.mark.vcr marker so one record sweep populates cassettes for every marked test across all providers, instead of forcing each test to bind to a hard-coded cassette path. Changes vs. the initial scaffolding: - Add 'pytest-recording==0.13.4' on top of vcrpy. Adopt its layout: cassettes live at 'cassettes/<test_module>/<test_name>.yaml', resolved automatically. New tests just decorate with '@pytest.mark.vcr' — no imports or path bookkeeping. - Move the shared filter/match config into a 'vcr_config' fixture in 'tests/llm_translation/conftest.py' (consumed by pytest-recording for every marked test in the dir). Drop the standalone 'vcr_config.py'. - Bulk record / replay via the standard '--record-mode' CLI flag: 'make test-llm-translation-record' now sweeps every '@pytest.mark.vcr' test under tests/llm_translation in one shot. Optional 'TARGET=' var scopes to a single file. - Move existing cassettes to the per-test paths and update the local in-process Anthropic regenerator to write to the same paths. - Refresh README + Makefile target docs to match the sweep workflow. Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>	2026-04-30 18:08:57 +00:00
Cursor Agent	94b319c577	tests(llm_translation): add VCR cassette infrastructure for offline replay Live LLM e2e tests have been draining provider billing accounts and going flaky on outages (LIT-2683). This change introduces vcrpy-backed cassette replay so CI can exercise the same end-to-end LiteLLM transformation paths without hitting the live provider: - Add 'vcrpy==8.1.1' to the dev dependency group. - New 'tests/llm_translation/vcr_config.py' centralises the VCR config: filters auth/secret headers and per-request response headers, matches on method+URI+body, and exposes 'LITELLM_VCR_RECORD_MODE' for re-recording. - New 'tests/llm_translation/test_anthropic_completion_vcr.py' demonstrates the pattern with one non-streaming and one streaming Anthropic test that replay from cassettes shipped under 'cassettes/'. - New 'tests/llm_translation/cassettes/_record_anthropic_fixtures.py' lets contributors regenerate the canned Anthropic cassettes against a local in-process mock (no API key required), and 'cassettes/README.md' documents the full record/replay/refresh workflow. - New 'make test-llm-translation-record FILE=...' Makefile target to refresh cassettes against the live API. Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>	2026-04-30 00:45:50 +00:00
Yuneng Jiang	b4d9006f92	uv lock	2026-04-28 17:43:36 -07:00
Yuneng Jiang	a10fff888d	uv lock	2026-04-25 19:32:41 -07:00
user	4d74a30412	chore(deps): fix brace-expansion pin and revert risky dev bumps - Dockerfile: pin the unscoped `brace-expansion@5.0.5` alongside `@isaacs/brace-expansion@5.0.1`. The scoped package only has 5.0.0 and 5.0.1 published; CVE-2026-33750's fix (5.0.5) is on the unscoped package which npm also vendors. The override loop now swaps both. - Revert `black` 26.3.1 -> 24.10.0, `pytest` 9.0.3 -> 8.3.5, and `pytest-asyncio` 1.3.0 -> 1.2.0. The major-version bumps cause CI lint (black reformats hundreds of files) and code-quality (liccheck.ini has no entry for the new versions) failures. Both CVEs are dev-only; skipping leaves no runtime exposure.	2026-04-24 00:37:07 +00:00
user	fed1a14646	chore(deps): bump vulnerable dependencies Closes Nexus IQ policy violations and open Dependabot alerts for shipped Python deps and runtime-stage npm pins in the Docker image.	2026-04-24 00:36:59 +00:00
Yuneng Jiang	ffaeff54cd	add uv	2026-04-23 17:00:20 -07:00
Yuneng Jiang	95fa7678af	uv lock	2026-04-22 18:25:37 -07:00
Yuneng Jiang	e65d547c4d	adding uv lock	2026-04-21 18:10:47 -07:00
ishaan-berri	2f22a1293e	bump litellm-proxy-extras to 0.4.67 (#26043) * bump litellm-proxy-extras version to 0.4.67 * bump litellm-proxy-extras pin to 0.4.67 in litellm pyproject * regenerate uv.lock for litellm-proxy-extras 0.4.67 * bump litellm-enterprise version to 0.1.38 * bump litellm-enterprise pin to 0.1.38 in litellm pyproject * regenerate uv.lock for litellm-enterprise 0.1.38	2026-04-18 19:03:56 -07:00
Yuneng Jiang	49ba6b8160	add uv lock	2026-04-18 18:43:09 -07:00
Yuneng Jiang	9bdb3b1772	chore: lower python floor from 3.11 to 3.10 All three dependency bumps in this PR resolve on Python 3.10, so there is no need to jump the floor all the way to 3.11. Also restore the py3.10-specific lunary==1.4.36 pin that was collapsed when the floor was temporarily at 3.11.	2026-04-18 12:50:04 -07:00
Yuneng Jiang	d1e665742b	chore: drop stale python_version markers after floor raise Now that requires-python starts at 3.11, the "python_version >= '3.9'" and ">= '3.10'" markers are unconditionally true, and the "< '3.10'" entries for psycopg, Pillow, pyarrow, langchain, lunary, and pylint can never resolve. Drop the dead markers and remove the unreachable pins so the dependency list reflects what actually gets installed.	2026-04-18 12:31:53 -07:00
Yuneng Jiang	1c29c5e903	chore: bump proxy deps and raise python floor to 3.11 Bumps orjson, fastapi-sso, and python-multipart to their latest releases in the proxy extra, and raises the project python floor to 3.11 so the updated pins can resolve. CI already runs on 3.11 / 3.12 / 3.13 and the Docker images ship python 3.13, so the floor change aligns the declared support range with what is actually tested and shipped.	2026-04-18 12:16:35 -07:00
Ishaan Jaffer	375cfb7f95	chore: update uv.lock after merging main	2026-04-17 12:56:23 -07:00
Yuneng Jiang	c294bbe4f0	fix(deps): pin langgraph-prebuilt==1.0.8 to avoid broken 1.0.9 langgraph-prebuilt 1.0.9 imports ExecutionInfo and ServerInfo from langgraph.runtime, but those symbols are not exported until langgraph 1.1.0. Our pin of langgraph==1.0.10 allows langgraph-prebuilt<1.1.0,>=1.0.8, and uv resolves to 1.0.9 (the latest in range), which breaks at import time in every test that touches langgraph.prebuilt (e.g. tests/pass_through_tests/test_mcp_routes.py): ImportError: cannot import name 'ExecutionInfo' from 'langgraph.runtime' Pinning langgraph-prebuilt to 1.0.8 pairs correctly with langgraph==1.0.10 and restores the import path.	2026-04-16 09:36:05 -07:00
Yuneng Jiang	dafa1bf97c	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_yj_apr15 # Conflicts: # litellm/litellm_core_utils/litellm_logging.py # uv.lock	2026-04-16 09:17:20 -07:00
Brendan Smith-Elion	265a960472	fix(noma-v2): fall back to key_alias for application_id in Noma dashboard (#25795) Noma v1 resolved application_id from user_api_key_alias when no explicit value was set (PR #16832). Noma v2 (PR #21400) was rewritten from scratch and this fallback was not ported, causing all requests from shared LiteLLM instances to appear as a single generic "litellm" application in the Noma dashboard — breaking per-user traceability. Fix: after checking dynamic_params and self.application_id, fall back to user_api_key_alias from litellm_metadata or metadata. This matches the pattern used by PromptSecurityGuardrail._resolve_key_alias_from_request_data() and restores the v1 behavior where each API key gets its own application entry in the Noma dashboard. Fixes #25794 Co-authored-by: Brendan Smith-Elion <brendan.smith-elion@arcadia.io> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 19:04:24 +05:30
Ishaan Jaffer	9114b0da96	fix(ci): sync uv.lock with pyproject.toml	2026-04-15 18:16:22 -07:00
jayden	0a1b4427a6	fix(guardrails): replace custom_code sandbox with RestrictedPython	2026-04-15 15:13:52 -07:00
Yuneng Jiang	83c459225c	[Fix] CI: fix GHA timeouts and uv lock --check failures 1. exclude-newer: change from absolute "2026-04-10" to relative "3 days". All pinned deps were published before the 3-day cutoff. Re-locked so uv lock --check passes in test-mcp.yml and test-linting.yml. 2. test_eager_tiktoken_load: run all 10 env var values in a single subprocess instead of spawning 10 separate processes. Each cold import litellm takes ~78s on CI, so the old loop took ~13 min on a single xdist worker. Now takes ~78s total. 3. proxy-db remaining timeout: increase from 20 to 30 minutes. The remaining group has 51 test files and was consistently timing out at 71% across all branches (pre-existing issue, not migration-related).	2026-04-11 09:04:49 -07:00
Yuneng Jiang	d9a460277a	[Fix] CI: fix uv lock resolution and tiktoken test timeout 1. Cap requires-python to <3.14 — no deps ship 3.14 wheels yet, and uv's cross-version resolver fails on the Python 3.14 split. 2. Change exclude-newer from relative "30 days" to absolute "2026-04-10" so the lockfile stays reproducible. The relative date caused cryptography==46.0.7 (published April 8) to fall outside the window. 3. Parametrize test_eager_loading_env_var_values instead of looping — with xdist the 6 subprocess cases can run in parallel instead of all running sequentially on one worker (~13 min → ~2 min). Also removed redundant case variants (Yes/YES/On/ON) that test the same str_to_bool code path.	2026-04-10 22:21:15 -07:00
user	8d1493ed08	fix(security): bump vulnerable dependencies pip: - cryptography 43.0.3 → 46.0.7 (5 CVEs including CVSS 8.2 ECDH key leak) npm: - hono 4.1.4/4.12.7 → 4.12.12 (prototype pollution, cookie injection, path traversal, middleware bypass, IP matching bypass) - @hono/node-server 1.19.6 → 1.19.13 (serveStatic middleware bypass) - vite 7.3.1 → 7.3.2 (file read via WebSocket, path traversal, fs.deny bypass) - lodash override 4.17.23 → 4.18.1 (code injection via _.template, prototype pollution via _.unset/_.omit) mlflow left at 3.9.0 — 2 of 3 alerts have no upstream fix, and 3.11.1 is blocked by exclude-newer (transitive dep chain). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 19:35:19 +00:00
stuxf	a6c30b30bf	build: migrate packaging, CI, and Docker from Poetry to uv (#25007) * build: migrate packaging metadata to uv * ci: move automation and local tooling to uv * docker: migrate image builds and runtime setup to uv * docs: update install and deployment guidance for uv * chore: align auxiliary scripts and tests with uv * test: harden test_litellm isolation * fix: keep release and health check images self-contained * build: pin uv tooling and health check deps * test: isolate bedrock image request formatting from suite state * test: cover sandbox executor requirements flow * ci: fix circleci no-op command steps * ci: fix circleci publish workflow parsing * fix: stabilize remaining uv migration CI checks * ci: increase matrix test timeout headroom * fix: restore published docker and license coverage * fix: restore proxy runtime build parity * fix: restore proxy extras parity and venv migrations * ci: persist uv path across circleci steps * fix: keep psycopg binary in default test env * docker: preserve prisma cache across stages * test: run local proxy checks through uv python * build: restore runtime deps moved into ci * build: refresh uv lock after upstream merge * fix: restore module import in test_check_migration after merge The conflict resolution imported only the function but the test body references check_migration as a module throughout.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching - Move google-generativeai, Pillow, tenacity back to ci group (they are lazily imported and bloat the base SDK install needlessly) - Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant in Docker where system Node.js is already installed via apk) - Remove all nodejs-wheel node replacement and venv npm patching blocks from Dockerfiles since the wheel is no longer installed - Add --no-default-groups to CodSpeed benchmark workflow so the benchmark environment matches the old minimal pip install footprint - Apply standard uv two-phase Docker pattern: copy metadata first, install deps (cached layer), then copy source and install project - Replace CircleCI enterprise no-op with proper uv sync command Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate uv.lock after removing nodejs-wheel-binaries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): use cache/restore instead of cache to prevent cache poisoning The old workflow used actions/cache/restore (read-only). The uv migration changed it to actions/cache (read-write), which zizmor flags as a cache poisoning risk. Restore the safer read-only variant. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert The setup-uv action enables caching by default, which zizmor flags as a cache poisoning risk. Disable it since we already use a read-only cache/restore step. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): disable setup-uv cache in publish workflow Silences zizmor cache-poisoning alert. Publishing workflow runs infrequently on protected branches so caching adds no real benefit. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(test): remove duplicate verbose_logger mock in test_check_migration The logger was patched twice — first via mocker.patch() then via mocker.patch.object(autospec=True). The second call fails because autospec cannot inspect an already-mocked attribute. Remove the redundant first patch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): free disk space before Docker build in test-server-root-path The Dockerfile.non_root build ran out of disk on the CI runner. Remove Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 11:46:23 -07:00
Ryan Malloy	f76938af5e	fix(ollama): set finish_reason to tool_calls and remove broken capability check (#18924) * Update CLAUDE.md with qwen3 tool_calls bug fix instructions (#18922) * fix(ollama): set finish_reason to "tool_calls" when tool_calls present When qwen3 models return tool_calls through Ollama, the finish_reason was incorrectly left as "stop" instead of being set to "tool_calls". This caused clients to miss the tool_calls in the response. Added _get_finish_reason helper method following OpenAI provider's pattern, and fixed both streaming and non-streaming response paths. Fixes: https://github.com/BerriAI/litellm/issues/18922 * fix(ollama): pass tools directly without model capability check The previous code tried to check model capability via get_model_info() which made network calls to localhost:11434. When Ollama is remote, this fails and falls back to JSON format, breaking tool calling. Ollama 0.4+ supports native tool calling - let Ollama handle model capability detection instead of LiteLLM. Fixes #18922 * fix(ollama): transform tool_calls response to OpenAI format Ollama returns tool_calls with arguments as dict, but OpenAI format requires arguments to be a JSON string. Also ensures 'type': 'function' field is present. Completes the fix for #18922 * fix(ollama): set finish_reason to "tool_calls" when tool_calls present Fixes #18922 Two issues addressed: 1. Remove broken model capability check - get_model_info() fails when Ollama runs on remote server - Broken fallback triggered JSON prompt injection - Now passes tools directly - Ollama 0.4+ handles detection 2. Set finish_reason correctly - Was hardcoded to "stop" even with tool_calls present - Clients use this to know how to process the response - Now returns "tool_calls" when tool_calls are in response Both streaming and non-streaming responses are fixed. Tests: - All 14 existing Ollama tests pass - Added 3 focused tests for the fixes	2026-01-14 03:52:26 +05:30

28 Commits
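
The Ollama commit (f76938af5e) describes two behaviors: tool-call arguments arriving as a dict are serialized to a JSON string with 'type': 'function' set, and finish_reason becomes "tool_calls" whenever tool calls are present. The sketch below illustrates that logic only; it is not LiteLLM's actual code, and the function names `to_openai_tool_calls` and `pick_finish_reason` are hypothetical, as is the assumed shape of the Ollama payload.

```python
import json
from typing import Any, Dict, List, Optional


def to_openai_tool_calls(ollama_tool_calls: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert Ollama-style tool calls to the OpenAI-style shape (hypothetical helper)."""
    converted = []
    for call in ollama_tool_calls:
        function = call.get("function", {})
        arguments = function.get("arguments", {})
        # OpenAI format expects arguments as a JSON string, not a dict.
        if not isinstance(arguments, str):
            arguments = json.dumps(arguments)
        converted.append(
            {
                "type": "function",  # make sure the type field is always present
                "function": {"name": function.get("name"), "arguments": arguments},
            }
        )
    return converted


def pick_finish_reason(tool_calls: Optional[List[Dict[str, Any]]], default: str = "stop") -> str:
    """Return "tool_calls" when any tool calls are present, else the default."""
    return "tool_calls" if tool_calls else default


# Assumed example payload: one tool call whose arguments are a dict.
raw = [{"function": {"name": "get_weather", "arguments": {"city": "Paris"}}}]
calls = to_openai_tool_calls(raw)
reason = pick_finish_reason(calls)
```

Keeping the finish_reason decision in one small helper lets streaming and non-streaming paths share the same rule, which is the property the commit message emphasizes.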