PR was blocked by .github/workflows/guard-fork-dependencies.yml: fork PRs
cannot modify uv.lock. Reverting:
- uv.lock + pyproject.toml black bump (24.10.0 -> 26.3.1) and the 295
files of mechanical Black 26 reformat coupled to it
- pyproject.toml diskcache extra change (kept the runtime mitigation in
litellm/caching/disk_cache.py via JSONDisk)
Kept:
- Dockerfile cache narrowing (drops ~660 MB of uv build cache that
surfaced cached setuptools as CVE findings)
- litellm/caching/disk_cache.py: dc.JSONDisk to neutralize CVE-2025-69872
- ui/litellm-dashboard/package-lock.json + litellm-js/spend-logs/package-lock.json:
next/postcss/hono/uuid CVE bumps (these are not blocked by the fork guard)
- tests/test_litellm/caching/test_disk_cache.py
- tests/code_coverage_tests/liccheck.ini: harmless black authorization
Black + gitpython + langchain dep upgrades will need a follow-up from a
maintainer pushing a branch in the canonical BerriAI/litellm repo.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Narrow /root/.cache COPY in Dockerfile to /root/.cache/prisma{,-python}
only — drops ~660MB of uv build cache including a setuptools wheel
that surfaced as CVE-2024-6345 / CVE-2025-47273 even though it was
never on the runtime sys.path.
- DiskCache: switch to dc.JSONDisk to neutralize the pickle code path
(CVE-2025-69872, no upstream fix). Values must be JSON-serializable;
cleanup get_cache to skip the now-dead json.loads(dict) branch by
guarding on isinstance(str).
- pyproject.toml: drop diskcache pin from [caching] extra (no fixed
version exists). Stub kept so `pip install litellm[caching]` doesn't
warn; users who want disk caching install diskcache themselves.
- Bump black 24.10.0 → 26.3.1 (CVE-2026-32274) + apply 296-file mechanical
reformat. Black is dev-only (not in the runtime image), but bumping
clears the manifest-scan finding.
- Refresh ui/litellm-dashboard/package-lock.json to pick up next 16.2.4
(was 16.1.7, GHSA-q4gf-8mx6-v5v3), uuid 14.0.0, postcss 8.5.13.
- Refresh litellm-js/spend-logs/package-lock.json to pick up
hono 4.12.16 (GHSA-458j-xx4x-4375).
- uv lock: gitpython 3.1.46 → 3.1.49 (clears two High GHSAs),
langchain-text-splitters 1.1.1 → 1.1.2.
- Add tests/test_litellm/caching/test_disk_cache.py covering JSONDisk
enforcement, dict/string round-trip, TTL, increment, delete/flush.
Net delta on combined trivy + grype scans: 17 findings → 4 (all
remaining 4 are Wolfi system python-3.13 CVEs marked WONTFIX upstream
in CPython 3.14; CVE-2026-3298 is Windows-unreachable on Linux).
Existing on-disk caches written by the previous pickle-format Disk
will silently miss after upgrade — diskcache is intended to be
ephemeral so impact is recreate-on-next-write.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TogetherAIConfig.get_supported_openai_params called get_model_info(),
whose first line calls litellm.get_supported_openai_params() — which for
together_ai routes straight back into this method. The recursion only
terminated when Python's recursion limit was hit or when
_get_model_info_helper raised "not mapped" at the deepest level. Either
way the try/except caught it, so the bug stayed silent — but the cycle
ran ~332 deep every time, emitting hundreds of DEBUG log lines per
call. Surfaced as "infinite loop" in CI when the success_handler thread
emitted that log spam against an already-closed stderr during test
teardown.
Replace the get_model_info() call with supports_function_calling(),
which uses _get_model_info_helper directly and does not call
get_supported_openai_params. Measured drop from 332 to 2
_get_model_info_helper calls per first uncached lookup.
Also swap the test model from Qwen/Qwen3.5-9B (not in model_cost map)
back to a mapped serverless model, Qwen/Qwen2.5-7B-Instruct-Turbo. The
mapping gap is what made the recursion's tail end raise up into the
success handler during teardown in the first place.
Mixtral-8x7B-Instruct-v0.1 is no longer on Together AI's serverless tier
and now requires a dedicated endpoint, causing multiple tests to fail in CI:
- test_together_ai.py::TestTogetherAI::test_empty_tools
- test_completion.py::test_completion_together_ai_stream
- test_completion.py::test_customprompt_together_ai
- test_completion.py::test_completion_custom_provider_model_name
- test_text_completion.py::test_async_text_completion_together_ai
Qwen/Qwen3.5-9B is currently serverless on Together AI and supports
function calling, satisfying BaseLLMChatTest capability requirements.
The DeepInfra tests were making real API calls and failing with AuthenticationError.
Mock the HTTP layer to verify request shape instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Llama-3.2-3B-Instruct-Turbo is no longer available as a serverless model
on Together AI. Switch to Llama-3.3-70B-Instruct-Turbo which is still
available and has cost data in the model prices map.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- test_hanging_request_azure: mock httpx.AsyncClient.send to simulate slow
response instead of racing real network latency against a 10ms timeout.
The old non-existent deployment (gpt-4o-new-test) returned 404 faster
than the timeout, causing NotFoundError instead of APITimeoutError.
- test_completion_together_ai_llama: update model from deprecated
Meta-Llama-3.1-8B-Instruct-Turbo to Llama-3.2-3B-Instruct-Turbo
(Together AI removed the old model from serverless).
- conftest.py: clear litellm.callbacks list before each test to prevent
proxy hooks (SkillsInjectionHook, VirtualKeyModelMaxBudgetLimiter)
from leaking across tests via Router initialization.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
hosted_vllm no longer uses the OpenAI client, so these tests
that mock the OpenAI client are not applicable to hosted_vllm.
Removes hosted_vllm from:
- test_openai_compatible_custom_api_base
- test_openai_compatible_custom_api_video
- Filter skip_mcp_handler and other internal params in fallback_utils.py before calling acompletion
Fixes issue where internal parameters were being passed to provider APIs causing errors
- Remove deployment field from GCS bucket logger test metadata
Fixes model name mismatch where deployment field was overriding the model in logging
- Update Bedrock Titan test to use non-deprecated model (titan-text-express-v1)
Fixes test failure due to deprecated amazon.titan-text-lite-v1 model