Removes commentary that restated the code, including:
- module-level banners explaining what the conftest does (covered by
Readme.md and the function bodies)
- docstrings on _scrub_response, _before_record_response, vcr_config,
_vcr_disabled, pytest_recording_configure (function names + bodies
are self-evident)
- inline notes about header filtering, match_on, etc.
- per-test docstrings restating the test name
Keeps the three non-obvious notes that aren't recoverable from the code:
the vcrpy/respx httpx-transport collision rationale on
_RESPX_CONFLICTING_FILES, the vcrpy "return None to skip persisting"
contract on filter_non_2xx_response, and the fixture-ordering
dependency on _vcr_record_retries.
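For reference, the "return None to skip persisting" contract can be sketched as below. Only the function name comes from this change; the body is an assumption about how such a hook might look, relying on vcrpy passing the response as a dict with a nested status code.

```python
from typing import Any, Dict, Optional


def filter_non_2xx_response(response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Keep 2xx responses; return None so vcrpy does not persist anything else."""
    # Assumed shape: vcrpy hands the hook a dict like {"status": {"code": 200, ...}, ...}.
    status_code = response.get("status", {}).get("code")
    if isinstance(status_code, int) and 200 <= status_code < 300:
        return response
    # Returning None tells vcrpy to skip recording this response.
    return None
```

A hook like this would typically be wired in through the before_record_response option of the VCR config.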
Removes the YAML cassette feature entirely and replaces it with a
Redis-only flow. Every test in tests/llm_translation/ and
tests/llm_responses_api_testing/ is auto-marked @pytest.mark.vcr via
conftest.pytest_collection_modifyitems, so any provider call lands in
the Redis cache (litellm:vcr:cassette:<rel_path>, 24h TTL). The first run
records, later runs within the TTL replay, and once the entry expires the
next run re-records, so upstream API drift surfaces within 24h.
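A minimal sketch of the key layout and TTL described above. The helper and constant names are hypothetical; only the key prefix and the 24h TTL come from this change.

```python
CASSETTE_KEY_PREFIX = "litellm:vcr:cassette:"
CASSETTE_TTL_SECONDS = 24 * 60 * 60  # 24h TTL, after which the next run re-records


def cassette_key(rel_path: str) -> str:
    """Build the Redis key for a cassette from its relative path (hypothetical helper)."""
    # Normalise Windows separators so the same test maps to the same key everywhere.
    return CASSETTE_KEY_PREFIX + rel_path.replace("\\", "/")
```

With redis-py, a persister could then store the serialized cassette with something like client.set(cassette_key(path), payload, ex=CASSETTE_TTL_SECONDS).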
VCR is on by default. To opt out, set LITELLM_VCR_DISABLE=1 or simply leave
REDIS_HOST unset; either one bypasses the auto-marker entirely, so no
cassette logic runs. record_mode is "once", so a cache miss records and a
cache hit replays.
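The opt-out rule can be sketched as a small predicate. The function name is hypothetical and the real conftest may structure the check differently; the two environment variables are the ones named above.

```python
import os
from typing import Mapping, Optional


def vcr_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the auto-marker should apply (hypothetical helper)."""
    env = os.environ if env is None else env
    # Explicit opt-out wins over everything else.
    if env.get("LITELLM_VCR_DISABLE") == "1":
        return False
    # Without a Redis host there is nowhere to store cassettes, so VCR stays off.
    return bool(env.get("REDIS_HOST"))
```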
The 8 existing respx-using files in tests/llm_translation are excluded
from the auto-marker (vcrpy and respx both patch the httpx transport;
applying both makes one silently win). The persister's own unit-test
file is also excluded so it doesn't recursively run inside a cassette.
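A sketch of how the exclusion check might look. The file names below are placeholders, not the real contents of _RESPX_CONFLICTING_FILES, and should_auto_mark is a hypothetical helper.

```python
from pathlib import PurePosixPath
from typing import FrozenSet

# Placeholder entries; the real set lists the 8 respx-using files.
_RESPX_CONFLICTING_FILES: FrozenSet[str] = frozenset({"test_example_using_respx.py"})
# Placeholder name for the persister's own unit-test file.
_PERSISTER_TEST_FILE = "test_example_persister.py"


def should_auto_mark(test_file: str) -> bool:
    """Return True when a test file should receive the vcr marker."""
    name = PurePosixPath(test_file.replace("\\", "/")).name
    return name not in _RESPX_CONFLICTING_FILES and name != _PERSISTER_TEST_FILE


# Inside conftest.pytest_collection_modifyitems this would be used roughly as:
#     for item in items:
#         if should_auto_mark(str(item.path)):
#             item.add_marker(pytest.mark.vcr)
```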
The persister moved from tests/llm_translation/_vcr_redis_persister.py
to tests/_vcr_redis_persister.py so both conftests share it. The two
demo tests in test_anthropic_completion_vcr.py were ported into
test_anthropic_completion.py and the demo file was deleted.
Adds tests/_flush_vcr_cache.py + a Make target
(test-llm-translation-flush-vcr-cache) that scans
litellm:vcr:cassette:* and pipelines DELETEs, for the
"I want the next CI run to re-record now" workflow. Drops the now-dead
test-llm-translation-record target.
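The scan-and-pipeline flush can be sketched as follows, assuming a redis-py style client (scan_iter, pipeline, delete, execute). The function name and batch size are hypothetical and tests/_flush_vcr_cache.py may differ.

```python
from typing import Any


def flush_cassettes(
    client: Any,
    pattern: str = "litellm:vcr:cassette:*",
    batch_size: int = 500,
) -> int:
    """Delete every key matching pattern; return how many keys were removed."""
    deleted = 0
    pending = 0
    pipe = client.pipeline()
    # SCAN iterates incrementally, so it does not block Redis the way KEYS would.
    for key in client.scan_iter(match=pattern, count=batch_size):
        pipe.delete(key)
        pending += 1
        if pending >= batch_size:
            # execute() returns one integer result per queued DEL.
            deleted += sum(pipe.execute())
            pending = 0
    if pending:
        deleted += sum(pipe.execute())
    return deleted
```

Batching the DELETEs through a pipeline keeps the number of round trips low even when many cassettes are cached.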
Provider keys are still required on cache-miss (which happens on first
run and once a day after that). Replay-mode runs need only Redis.
Per Yuneng's feedback, use a single @pytest.mark.vcr marker so one record
sweep populates cassettes for every marked test across all providers,
instead of forcing each test to bind to a hard-coded cassette path.
Changes vs. the initial scaffolding:
- Add 'pytest-recording==0.13.4' on top of vcrpy. Adopt its layout:
cassettes live at 'cassettes/<test_module>/<test_name>.yaml', resolved
automatically. New tests just decorate with '@pytest.mark.vcr' — no
imports or path bookkeeping.
- Move the shared filter/match config into a 'vcr_config' fixture in
'tests/llm_translation/conftest.py' (consumed by pytest-recording for
every marked test in the dir). Drop the standalone 'vcr_config.py'.
- Bulk record / replay via the standard '--record-mode' CLI flag:
'make test-llm-translation-record' now sweeps every '@pytest.mark.vcr'
test under tests/llm_translation in one shot. Optional 'TARGET=' var
scopes to a single file.
- Move existing cassettes to the per-test paths and update the local
in-process Anthropic regenerator to write to the same paths.
- Refresh README + Makefile target docs to match the sweep workflow.
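To illustrate the per-test layout described in the list above, here is a sketch that mirrors the 'cassettes/<test_module>/<test_name>.yaml' convention. It is not pytest-recording's own code; the helper name and the example test name are hypothetical.

```python
from pathlib import Path


def default_cassette_path(test_file: str, test_name: str) -> Path:
    """Compute where a marked test's cassette would live under the described layout."""
    test_path = Path(test_file)
    # cassettes/ sits next to the test module; one sub-directory per module.
    return test_path.parent / "cassettes" / test_path.stem / f"{test_name}.yaml"
```

For example, a test named test_streaming in tests/llm_translation/test_anthropic_completion.py would resolve to tests/llm_translation/cassettes/test_anthropic_completion/test_streaming.yaml.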
Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
Live LLM e2e tests have been draining provider billing accounts and going
flaky on outages (LIT-2683). This change introduces vcrpy-backed cassette
replay so CI can exercise the same end-to-end LiteLLM transformation paths
without hitting the live provider:
- Add 'vcrpy==8.1.1' to the dev dependency group.
- New 'tests/llm_translation/vcr_config.py' centralises the VCR config:
filters auth/secret headers and per-request response headers, matches on
method+URI+body, and exposes 'LITELLM_VCR_RECORD_MODE' for re-recording.
- New 'tests/llm_translation/test_anthropic_completion_vcr.py' demonstrates
the pattern with one non-streaming and one streaming Anthropic test that
replay from cassettes shipped under 'cassettes/'.
- New 'tests/llm_translation/cassettes/_record_anthropic_fixtures.py' lets
contributors regenerate the canned Anthropic cassettes against a local
in-process mock (no API key required), and 'cassettes/README.md' documents
the full record/replay/refresh workflow.
- New 'make test-llm-translation-record FILE=...' Makefile target to refresh
cassettes against the live API.
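A minimal sketch of the kind of settings vcr_config.py centralises. The header names and the "none" default are assumptions, and build_vcr_kwargs is a hypothetical helper; the option names (filter_headers, match_on, record_mode) are standard vcrpy settings.

```python
import os
from typing import Any, Dict, Mapping, Optional


def build_vcr_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Assemble keyword arguments suitable for vcr.VCR(**kwargs)."""
    env = os.environ if env is None else env
    return {
        # Assumed header names; the real list may be longer.
        "filter_headers": ["authorization", "x-api-key"],
        # Match on method + URI + body, as described above.
        "match_on": ["method", "uri", "body"],
        # Assumed default of "none" (replay only) unless re-recording is requested.
        "record_mode": env.get("LITELLM_VCR_RECORD_MODE", "none"),
    }
```

Setting LITELLM_VCR_RECORD_MODE=all in the environment would then switch the same config to re-record against the live API.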
Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
* build: migrate packaging metadata to uv
* ci: move automation and local tooling to uv
* docker: migrate image builds and runtime setup to uv
* docs: update install and deployment guidance for uv
* chore: align auxiliary scripts and tests with uv
* test: harden test_litellm isolation
* fix: keep release and health check images self-contained
* build: pin uv tooling and health check deps
* test: isolate bedrock image request formatting from suite state
* test: cover sandbox executor requirements flow
* ci: fix circleci no-op command steps
* ci: fix circleci publish workflow parsing
* fix: stabilize remaining uv migration CI checks
* ci: increase matrix test timeout headroom
* fix: restore published docker and license coverage
* fix: restore proxy runtime build parity
* fix: restore proxy extras parity and venv migrations
* ci: persist uv path across circleci steps
* fix: keep psycopg binary in default test env
* docker: preserve prisma cache across stages
* test: run local proxy checks through uv python
* build: restore runtime deps moved into ci
* build: refresh uv lock after upstream merge
* fix: restore module import in test_check_migration after merge
The conflict resolution imported only the function but the test body
references check_migration as a module throughout.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching
- Move google-generativeai, Pillow, tenacity back to ci group (they are
lazily imported and bloat the base SDK install needlessly)
- Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant
in Docker where system Node.js is already installed via apk)
- Remove all nodejs-wheel node replacement and venv npm patching blocks
from Dockerfiles since the wheel is no longer installed
- Add --no-default-groups to CodSpeed benchmark workflow so the benchmark
environment matches the old minimal pip install footprint
- Apply standard uv two-phase Docker pattern: copy metadata first, install
deps (cached layer), then copy source and install project
- Replace CircleCI enterprise no-op with proper uv sync command
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate uv.lock after removing nodejs-wheel-binaries
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): use cache/restore instead of cache to prevent cache poisoning
The old workflow used actions/cache/restore (read-only). The uv migration
changed it to actions/cache (read-write), which zizmor flags as a cache
poisoning risk. Restore the safer read-only variant.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert
The setup-uv action enables caching by default, which zizmor flags as a
cache poisoning risk. Disable it since we already use a read-only
cache/restore step.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): disable setup-uv cache in publish workflow
Silences zizmor cache-poisoning alert. Publishing workflow runs
infrequently on protected branches so caching adds no real benefit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(test): remove duplicate verbose_logger mock in test_check_migration
The logger was patched twice — first via mocker.patch() then via
mocker.patch.object(autospec=True). The second call fails because
autospec cannot inspect an already-mocked attribute. Remove the
redundant first patch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): free disk space before Docker build in test-server-root-path
The Dockerfile.non_root build ran out of disk on the CI runner. Remove
Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split tests/test_litellm into 10 parallel CI jobs using GitHub Actions
matrix strategy to reduce PR feedback time from ~25 min to ~8-10 min.
Changes:
- Add new test-litellm-matrix.yml workflow with 10 matrix jobs:
- llms (~225 files, 4 workers)
- proxy-guardrails (~51 files, 4 workers)
- proxy-core (~52 files, 4 workers)
- proxy-misc (~77 files, 4 workers)
- integrations (~60 files, 4 workers)
- core-utils (~32 files, 2 workers)
- other (~69 files, 4 workers) - includes all previously uncovered dirs
- root (~34 files, 4 workers)
- proxy-unit-a (~20 files, 2 workers)
- proxy-unit-b (~28 files, 2 workers)
- Deprecate test-litellm.yml (moved to workflow_dispatch for manual use)
- Add matching Makefile targets for local testing:
- make test-unit-llms
- make test-unit-proxy-guardrails
- make test-unit-proxy-core
- make test-unit-proxy-misc
- make test-unit-integrations
- make test-unit-core-utils
- make test-unit-other
- make test-unit-root
- make test-proxy-unit-a
- make test-proxy-unit-b
Benefits:
- ~3x faster wall-clock time through parallelization
- Dependency caching for faster subsequent runs
- Concurrency control to cancel stale runs
- Better failure isolation per test group
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* feat: add faster linting targets for development workflow
- Add lint-dev target that only checks changed files vs origin/main
- Add lint-format-changed to format only modified Python lines
- Add lint-ruff-dev using diff-quality for incremental lint checks
- Upgrade ruff from 0.1.x to 0.2.x for --range formatting support
- Add pylint and diff-cover as dev dependencies
- Use portable PIP variable for cross-platform compatibility
- Suppress poetry warnings in install-dev target
* fix(mypy): fix type: ignore placement for OTEL LogRecord import
The type: ignore[attr-defined] comment was on the import alias line
inside parentheses, but mypy reports the error on the `from` line.
Collapse to single-line imports so the suppression is on the correct
line. Also add no-redef to the fallback branch.
* fix: address review issues in faster linting PR
- Remove poetry lock/check from install-dev (slow, can mutate lockfile)
- Remove misplaced [virtualenvs] and [installer] from pyproject.toml
(these belong in poetry.toml, not project metadata)
- Remove unused pylint dev dependency (diff-quality uses pylint output
format, not the pylint package itself)
- Fix trailing whitespace in .PHONY declaration
- Use mktemp instead of hardcoded /tmp/ruff.txt in lint-ruff-dev
- Guard lint-ruff-FULL-dev against empty file list from git diff
- Fix incorrect comment on lint-dev target
- Regenerate poetry.lock
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* chore: Consistency in install-test-deps using poetry run
* feat: update in-memory guardrails after database CRUD operations
* test: add parameterized tests for guardrail CRUD with memory sync
- Move from CircleCI dependency to direct pytest execution
- Add Python script to generate beautiful markdown reports
- Update GitHub workflow to run tests directly
- Update Makefile to use the new test runner script
- Generate both JUnit XML and markdown artifacts
- Group test results by provider with detailed statistics
- Introduced a comprehensive contributing guide outlining the checklist for PR submissions, including signing the Contributor License Agreement, adding tests, and ensuring code quality.
- Updated README.md to link to the new CONTRIBUTING.md and provide a quick start for contributors.
- Enhanced Makefile with additional commands for installation and testing to streamline the development workflow.
* fix(proxy_server.py): get master key from environment, if not set in general settings or general settings not set at all
* test: mark flaky test
* test(test_proxy_server.py): mock prisma client
* ci: add new github workflow for testing just the mock tests
* fix: fix linting error
* ci(conftest.py): add conftest.py to isolate proxy tests
* build(pyproject.toml): add respx to dev dependencies
* build(pyproject.toml): add prisma to dev dependencies
* test: fix mock prompt management tests to use a mock anthropic key
* ci(test-litellm.yml): parallelize mock testing
make it run faster
* build(pyproject.toml): add hypercorn as dev dep
* build(pyproject.toml): separate proxy vs. core dev dependencies
make it easier for non-proxy contributors to run tests locally - e.g. no need to install hypercorn
* ci(test-litellm.yml): pin python version
* test(test_rerank.py): move test - cannot be mocked, requires aws credentials for e2e testing
* ci: add thank you message to ci
* test: add mock env var to test
* test: add autouse to tests
* test: test mock env vars for e2e tests
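For the proxy_server.py master-key fix listed above, the fallback order can be sketched like this. The helper name, the "master_key" settings key, and the LITELLM_MASTER_KEY variable name are assumptions, not taken from the commit.

```python
import os
from typing import Any, Mapping, Optional


def resolve_master_key(
    general_settings: Optional[Mapping[str, Any]],
    env: Optional[Mapping[str, str]] = None,
    env_var: str = "LITELLM_MASTER_KEY",  # assumed variable name
) -> Optional[str]:
    """Prefer the configured key; fall back to the environment when it is missing."""
    env = os.environ if env is None else env
    # general_settings may be absent entirely, so treat None as an empty mapping.
    configured = (general_settings or {}).get("master_key")
    return configured or env.get(env_var)
```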