Commit Graph

28 Commits

Author SHA1 Message Date
user
bfdd786962
chore(deps): refresh dependency locks 2026-05-04 11:36:18 -07:00
Mateo Wang
05439530c2
Merge branch 'litellm_internal_staging' into litellm_vcr-cassette-llm-tests-af37 2026-05-01 14:37:48 -07:00
Yuneng Jiang
6da13efcec
uv lock 2026-04-30 21:40:09 -07:00
Cursor Agent
05333e42ba
tests(llm_translation): switch to pytest-recording for marker-based bulk capture
Per Yuneng's feedback, use a single @pytest.mark.vcr marker so one record
sweep populates cassettes for every marked test across all providers,
instead of forcing each test to bind to a hard-coded cassette path.

Changes vs. the initial scaffolding:

- Add 'pytest-recording==0.13.4' on top of vcrpy. Adopt its layout:
  cassettes live at 'cassettes/<test_module>/<test_name>.yaml', resolved
  automatically. New tests just decorate with '@pytest.mark.vcr' — no
  imports or path bookkeeping.
- Move the shared filter/match config into a 'vcr_config' fixture in
  'tests/llm_translation/conftest.py' (consumed by pytest-recording for
  every marked test in the dir). Drop the standalone 'vcr_config.py'.
- Bulk record / replay via the standard '--record-mode' CLI flag:
  'make test-llm-translation-record' now sweeps every '@pytest.mark.vcr'
  test under tests/llm_translation in one shot. Optional 'TARGET=' var
  scopes to a single file.
- Move existing cassettes to the per-test paths and update the local
  in-process Anthropic regenerator to write to the same paths.
- Refresh README + Makefile target docs to match the sweep workflow.

Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
2026-04-30 18:08:57 +00:00
Cursor Agent
94b319c577
tests(llm_translation): add VCR cassette infrastructure for offline replay
Live LLM e2e tests have been draining provider billing accounts and going
flaky on outages (LIT-2683). This change introduces vcrpy-backed cassette
replay so CI can exercise the same end-to-end LiteLLM transformation paths
without hitting the live provider:

- Add 'vcrpy==8.1.1' to the dev dependency group.
- New 'tests/llm_translation/vcr_config.py' centralises the VCR config:
  filters auth/secret headers and per-request response headers, matches on
  method+URI+body, and exposes 'LITELLM_VCR_RECORD_MODE' for re-recording.
- New 'tests/llm_translation/test_anthropic_completion_vcr.py' demonstrates
  the pattern with one non-streaming and one streaming Anthropic test that
  replay from cassettes shipped under 'cassettes/'.
- New 'tests/llm_translation/cassettes/_record_anthropic_fixtures.py' lets
  contributors regenerate the canned Anthropic cassettes against a local
  in-process mock (no API key required), and 'cassettes/README.md' documents
  the full record/replay/refresh workflow.
- New 'make test-llm-translation-record FILE=...' Makefile target to refresh
  cassettes against the live API.

Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
2026-04-30 00:45:50 +00:00
Yuneng Jiang
b4d9006f92
uv lock 2026-04-28 17:43:36 -07:00
Yuneng Jiang
a10fff888d
uv lock 2026-04-25 19:32:41 -07:00
user
4d74a30412
chore(deps): fix brace-expansion pin and revert risky dev bumps
- Dockerfile: pin the unscoped `brace-expansion@5.0.5` alongside
  `@isaacs/brace-expansion@5.0.1`. The scoped package only has 5.0.0
  and 5.0.1 published; CVE-2026-33750's fix (5.0.5) is on the unscoped
  package which npm also vendors. The override loop now swaps both.
- Revert `black` 26.3.1 -> 24.10.0, `pytest` 9.0.3 -> 8.3.5, and
  `pytest-asyncio` 1.3.0 -> 1.2.0. The major-version bumps cause CI
  lint (black reformats hundreds of files) and code-quality
  (liccheck.ini has no entry for the new versions) failures. Both
  CVEs are dev-only; skipping leaves no runtime exposure.
2026-04-24 00:37:07 +00:00
user
fed1a14646
chore(deps): bump vulnerable dependencies
Closes Nexus IQ policy violations and open Dependabot alerts for
shipped Python deps and runtime-stage npm pins in the Docker image.
2026-04-24 00:36:59 +00:00
Yuneng Jiang
ffaeff54cd
add uv 2026-04-23 17:00:20 -07:00
Yuneng Jiang
95fa7678af
uv lock 2026-04-22 18:25:37 -07:00
Yuneng Jiang
e65d547c4d
adding uv lock 2026-04-21 18:10:47 -07:00
ishaan-berri
2f22a1293e
bump litellm-proxy-extras to 0.4.67 (#26043)
* bump litellm-proxy-extras version to 0.4.67

* bump litellm-proxy-extras pin to 0.4.67 in litellm pyproject

* regenerate uv.lock for litellm-proxy-extras 0.4.67

* bump litellm-enterprise version to 0.1.38

* bump litellm-enterprise pin to 0.1.38 in litellm pyproject

* regenerate uv.lock for litellm-enterprise 0.1.38
2026-04-18 19:03:56 -07:00
Yuneng Jiang
49ba6b8160
add uv lock 2026-04-18 18:43:09 -07:00
Yuneng Jiang
9bdb3b1772
chore: lower python floor from 3.11 to 3.10
All three dependency bumps in this PR resolve on Python 3.10, so there
is no need to jump the floor all the way to 3.11. Also restore the
py3.10-specific lunary==1.4.36 pin that was collapsed when the floor
was temporarily at 3.11.
2026-04-18 12:50:04 -07:00
Yuneng Jiang
d1e665742b
chore: drop stale python_version markers after floor raise
Now that requires-python starts at 3.11, the "python_version >= '3.9'"
and ">= '3.10'" markers are unconditionally true, and the "< '3.10'"
entries for psycopg, Pillow, pyarrow, langchain, lunary, and pylint can
never resolve. Drop the dead markers and remove the unreachable pins so
the dependency list reflects what actually gets installed.
2026-04-18 12:31:53 -07:00
Yuneng Jiang
1c29c5e903
chore: bump proxy deps and raise python floor to 3.11
Bumps orjson, fastapi-sso, and python-multipart to their latest releases
in the proxy extra, and raises the project python floor to 3.11 so the
updated pins can resolve. CI already runs on 3.11 / 3.12 / 3.13 and the
Docker images ship python 3.13, so the floor change aligns the declared
support range with what is actually tested and shipped.
2026-04-18 12:16:35 -07:00
Ishaan Jaffer
375cfb7f95
chore: update uv.lock after merging main 2026-04-17 12:56:23 -07:00
Yuneng Jiang
c294bbe4f0
fix(deps): pin langgraph-prebuilt==1.0.8 to avoid broken 1.0.9
langgraph-prebuilt 1.0.9 imports ExecutionInfo and ServerInfo from
langgraph.runtime, but those symbols are not exported until
langgraph 1.1.0. Our pin of langgraph==1.0.10 allows
langgraph-prebuilt<1.1.0,>=1.0.8, and uv resolves to 1.0.9 (the
latest in range), which breaks at import time in every test that
touches langgraph.prebuilt (e.g. tests/pass_through_tests/test_mcp_routes.py):

  ImportError: cannot import name 'ExecutionInfo' from 'langgraph.runtime'

Pinning langgraph-prebuilt to 1.0.8 pairs correctly with
langgraph==1.0.10 and restores the import path.
2026-04-16 09:36:05 -07:00
Yuneng Jiang
dafa1bf97c
Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_yj_apr15
# Conflicts:
#	litellm/litellm_core_utils/litellm_logging.py
#	uv.lock
2026-04-16 09:17:20 -07:00
Brendan Smith-Elion
265a960472
fix(noma-v2): fall back to key_alias for application_id in Noma dashboard (#25795)
Noma v1 resolved application_id from user_api_key_alias when no explicit
value was set (PR #16832). Noma v2 (PR #21400) was rewritten from scratch
and this fallback was not ported, causing all requests from shared LiteLLM
instances to appear as a single generic "litellm" application in the Noma
dashboard — breaking per-user traceability.

Fix: after checking dynamic_params and self.application_id, fall back to
user_api_key_alias from litellm_metadata or metadata. This matches the
pattern used by PromptSecurityGuardrail._resolve_key_alias_from_request_data()
and restores the v1 behavior where each API key gets its own application
entry in the Noma dashboard.
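
The fallback order described above can be sketched roughly as follows. The function name, its parameters, and the `"litellm"` default are assumptions made for illustration; the real guardrail code may be structured differently.

```python
def resolve_application_id(
    dynamic_params,
    configured_application_id,
    request_data,
    default="litellm",
):
    """Pick an application id using the fallback order from the commit message."""
    # 1. An explicit per-request value, then the guardrail's configured value.
    explicit = dynamic_params.get("application_id") or configured_application_id
    if explicit:
        return explicit

    # 2. Fall back to the key alias found in litellm_metadata or metadata.
    for key in ("litellm_metadata", "metadata"):
        metadata = request_data.get(key) or {}
        alias = metadata.get("user_api_key_alias")
        if alias:
            return alias

    # 3. Nothing found: use the generic application name (assumed default).
    return default
```

Under these assumptions, two requests made with different key aliases resolve to different application entries, while a request with no alias still falls through to the generic name.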

Fixes #25794

Co-authored-by: Brendan Smith-Elion <brendan.smith-elion@arcadia.io>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 19:04:24 +05:30
Ishaan Jaffer
9114b0da96
fix(ci): sync uv.lock with pyproject.toml 2026-04-15 18:16:22 -07:00
jayden
0a1b4427a6
fix(guardrails): replace custom_code sandbox with RestrictedPython 2026-04-15 15:13:52 -07:00
Yuneng Jiang
83c459225c
[Fix] CI: fix GHA timeouts and uv lock --check failures
1. exclude-newer: change from absolute "2026-04-10" to relative "3 days".
   All pinned deps were published before the 3-day cutoff. Re-locked so
   uv lock --check passes in test-mcp.yml and test-linting.yml.

2. test_eager_tiktoken_load: run all 10 env var values in a single
   subprocess instead of spawning 10 separate processes. Each cold
   `import litellm` takes ~78s on CI, so the old loop took ~13 min
   (10 x ~78s) on a single xdist worker. Now takes ~78s total.

3. proxy-db remaining timeout: increase from 20 to 30 minutes. The
   remaining group has 51 test files and was consistently timing out at
   71% across all branches (pre-existing issue, not migration-related).
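
The single-subprocess approach from item 2 might look like the sketch below. The child script is a hypothetical stand-in: it replaces the expensive cold import with a trivial truthiness check, and the set of truthy strings is an assumption, not the values the real test uses.

```python
import json
import subprocess
import sys

# Hypothetical child script: pay the start-up cost once, then check every value.
CHILD_SCRIPT = """
import json, sys
values = json.loads(sys.argv[1])
truthy = {"true", "1", "yes", "on"}  # assumed truthy spellings
print(json.dumps({v: v.strip().lower() in truthy for v in values}))
"""


def check_values_in_one_subprocess(values):
    """Evaluate all values inside a single child interpreter."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT, json.dumps(values)],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

The design point is that interpreter start-up and any heavy import are paid once per run rather than once per value.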
2026-04-11 09:04:49 -07:00
Yuneng Jiang
d9a460277a
[Fix] CI: fix uv lock resolution and tiktoken test timeout
1. Cap requires-python to <3.14 — no deps ship 3.14 wheels yet, and
   uv's cross-version resolver fails on the Python 3.14 split.
2. Change exclude-newer from relative "30 days" to absolute "2026-04-10"
   so the lockfile stays reproducible. The relative date caused
   cryptography==46.0.7 (published April 8) to fall outside the window.
3. Parametrize test_eager_loading_env_var_values instead of looping —
   with xdist the 6 subprocess cases can run in parallel instead of all
   running sequentially on one worker (~13 min → ~2 min).
   Also removed redundant case variants (Yes/YES/On/ON) that test the
   same str_to_bool code path.
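
The date arithmetic behind item 2 can be checked with a short sketch. It assumes that `exclude-newer` ignores any release published after the cutoff, that a relative "30 days" means a cutoff 30 days before lock time, and that the lock ran on 2026-04-10 (the commit date).

```python
from datetime import datetime, timedelta, timezone


def is_excluded(published, cutoff):
    """A release is ignored when it was published after the cutoff."""
    return published > cutoff


# Assumed lock time, taken from the commit date.
lock_time = datetime(2026, 4, 10, tzinfo=timezone.utc)
# cryptography==46.0.7 publication date, per the commit message.
published = datetime(2026, 4, 8, tzinfo=timezone.utc)

# Relative "30 days": cutoff sits 30 days before the lock, i.e. 2026-03-11.
relative_cutoff = lock_time - timedelta(days=30)
# Absolute "2026-04-10": the cutoff is the date itself.
absolute_cutoff = datetime(2026, 4, 10, tzinfo=timezone.utc)

excluded_with_relative = is_excluded(published, relative_cutoff)
excluded_with_absolute = is_excluded(published, absolute_cutoff)
```

Under these assumptions the release is excluded by the relative window and admitted by the absolute date, which matches the behaviour the commit reports.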
2026-04-10 22:21:15 -07:00
user
8d1493ed08
fix(security): bump vulnerable dependencies
pip:
- cryptography 43.0.3 → 46.0.7 (5 CVEs including CVSS 8.2 ECDH key leak)

npm:
- hono 4.1.4/4.12.7 → 4.12.12 (prototype pollution, cookie injection,
  path traversal, middleware bypass, IP matching bypass)
- @hono/node-server 1.19.6 → 1.19.13 (serveStatic middleware bypass)
- vite 7.3.1 → 7.3.2 (file read via WebSocket, path traversal, fs.deny bypass)
- lodash override 4.17.23 → 4.18.1 (code injection via _.template,
  prototype pollution via _.unset/_.omit)

mlflow left at 3.9.0 — 2 of 3 alerts have no upstream fix, and
3.11.1 is blocked by exclude-newer (transitive dep chain).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 19:35:19 +00:00
stuxf
a6c30b30bf
build: migrate packaging, CI, and Docker from Poetry to uv (#25007)
* build: migrate packaging metadata to uv

* ci: move automation and local tooling to uv

* docker: migrate image builds and runtime setup to uv

* docs: update install and deployment guidance for uv

* chore: align auxiliary scripts and tests with uv

* test: harden test_litellm isolation

* fix: keep release and health check images self-contained

* build: pin uv tooling and health check deps

* test: isolate bedrock image request formatting from suite state

* test: cover sandbox executor requirements flow

* ci: fix circleci no-op command steps

* ci: fix circleci publish workflow parsing

* fix: stabilize remaining uv migration CI checks

* ci: increase matrix test timeout headroom

* fix: restore published docker and license coverage

* fix: restore proxy runtime build parity

* fix: restore proxy extras parity and venv migrations

* ci: persist uv path across circleci steps

* fix: keep psycopg binary in default test env

* docker: preserve prisma cache across stages

* test: run local proxy checks through uv python

* build: restore runtime deps moved into ci

* build: refresh uv lock after upstream merge

* fix: restore module import in test_check_migration after merge

The conflict resolution imported only the function but the test body
references check_migration as a module throughout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching

- Move google-generativeai, Pillow, tenacity back to ci group (they are
  lazily imported and bloat the base SDK install needlessly)
- Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant
  in Docker where system Node.js is already installed via apk)
- Remove all nodejs-wheel node replacement and venv npm patching blocks
  from Dockerfiles since the wheel is no longer installed
- Add --no-default-groups to CodSpeed benchmark workflow so the benchmark
  environment matches the old minimal pip install footprint
- Apply standard uv two-phase Docker pattern: copy metadata first, install
  deps (cached layer), then copy source and install project
- Replace CircleCI enterprise no-op with proper uv sync command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate uv.lock after removing nodejs-wheel-binaries

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): use cache/restore instead of cache to prevent cache poisoning

The old workflow used actions/cache/restore (read-only). The uv migration
changed it to actions/cache (read-write), which zizmor flags as a cache
poisoning risk. Restore the safer read-only variant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert

The setup-uv action enables caching by default, which zizmor flags as a
cache poisoning risk. Disable it since we already use a read-only
cache/restore step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): disable setup-uv cache in publish workflow

Silences zizmor cache-poisoning alert. Publishing workflow runs
infrequently on protected branches so caching adds no real benefit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(test): remove duplicate verbose_logger mock in test_check_migration

The logger was patched twice — first via mocker.patch() then via
mocker.patch.object(autospec=True). The second call fails because
autospec cannot inspect an already-mocked attribute. Remove the
redundant first patch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): free disk space before Docker build in test-server-root-path

The Dockerfile.non_root build ran out of disk on the CI runner. Remove
Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:46:23 -07:00
Ryan Malloy
f76938af5e
fix(ollama): set finish_reason to tool_calls and remove broken capability check (#18924)
* Update CLAUDE.md with qwen3 tool_calls bug fix instructions (#18922)

* fix(ollama): set finish_reason to "tool_calls" when tool_calls present

When qwen3 models return tool_calls through Ollama, the finish_reason
was incorrectly left as "stop" instead of being set to "tool_calls".
This caused clients to miss the tool_calls in the response.

Added _get_finish_reason helper method following OpenAI provider's
pattern, and fixed both streaming and non-streaming response paths.

Fixes: https://github.com/BerriAI/litellm/issues/18922

* fix(ollama): pass tools directly without model capability check

The previous code tried to check model capability via get_model_info()
which made network calls to localhost:11434. When Ollama is remote,
this fails and falls back to JSON format, breaking tool calling.

Ollama 0.4+ supports native tool calling - let Ollama handle
model capability detection instead of LiteLLM.

Fixes #18922

* fix(ollama): transform tool_calls response to OpenAI format

Ollama returns tool_calls with arguments as dict, but OpenAI format
requires arguments to be a JSON string. Also ensures 'type': 'function'
field is present.

Completes the fix for #18922
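
A rough sketch of the transformation described above, assuming an Ollama tool call arrives as a dict with a nested `function` entry. The function name and the fallback `id` scheme are hypothetical and not taken from the LiteLLM source.

```python
import json


def to_openai_tool_call(ollama_tool_call, index=0):
    """Convert one Ollama-style tool call into the OpenAI shape (illustrative)."""
    function = ollama_tool_call.get("function", {})
    arguments = function.get("arguments", {})
    # OpenAI expects arguments as a JSON string; Ollama may return a dict.
    if not isinstance(arguments, str):
        arguments = json.dumps(arguments)
    return {
        # Hypothetical id scheme, used only when the input carries no id.
        "id": ollama_tool_call.get("id", f"call_{index}"),
        # Always present in the OpenAI format.
        "type": "function",
        "function": {"name": function.get("name"), "arguments": arguments},
    }
```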

* fix(ollama): set finish_reason to "tool_calls" when tool_calls present

Fixes #18922

Two issues addressed:

1. Remove broken model capability check
   - get_model_info() fails when Ollama runs on remote server
   - Broken fallback triggered JSON prompt injection
   - Now passes tools directly - Ollama 0.4+ handles detection

2. Set finish_reason correctly
   - Was hardcoded to "stop" even with tool_calls present
   - Clients use this to know how to process the response
   - Now returns "tool_calls" when tool_calls are in response

Both streaming and non-streaming responses are fixed.
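
The finish_reason rule from item 2 reduces to a small check. This is a sketch only: `get_finish_reason` and its parameters are hypothetical names, not the `_get_finish_reason` helper added by the commit.

```python
def get_finish_reason(message, done_reason="stop"):
    """Return "tool_calls" when the message carries tool calls, else done_reason."""
    # An empty list or a missing key both count as "no tool calls".
    if message.get("tool_calls"):
        return "tool_calls"
    return done_reason
```

Applying the same check on both the streaming and non-streaming paths keeps the two responses consistent for clients that branch on finish_reason.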

Tests:
- All 14 existing Ollama tests pass
- Added 3 focused tests for the fixes
2026-01-14 03:52:26 +05:30