Commit Graph

127 Commits

Author SHA1 Message Date
stuxf
a6c30b30bf
build: migrate packaging, CI, and Docker from Poetry to uv (#25007)
* build: migrate packaging metadata to uv

* ci: move automation and local tooling to uv

* docker: migrate image builds and runtime setup to uv

* docs: update install and deployment guidance for uv

* chore: align auxiliary scripts and tests with uv

* test: harden test_litellm isolation

* fix: keep release and health check images self-contained

* build: pin uv tooling and health check deps

* test: isolate bedrock image request formatting from suite state

* test: cover sandbox executor requirements flow

* ci: fix circleci no-op command steps

* ci: fix circleci publish workflow parsing

* fix: stabilize remaining uv migration CI checks

* ci: increase matrix test timeout headroom

* fix: restore published docker and license coverage

* fix: restore proxy runtime build parity

* fix: restore proxy extras parity and venv migrations

* ci: persist uv path across circleci steps

* fix: keep psycopg binary in default test env

* docker: preserve prisma cache across stages

* test: run local proxy checks through uv python

* build: restore runtime deps moved into ci

* build: refresh uv lock after upstream merge

* fix: restore module import in test_check_migration after merge

The conflict resolution imported only the function but the test body
references check_migration as a module throughout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching

- Move google-generativeai, Pillow, tenacity back to ci group (they are
  lazily imported and bloat the base SDK install needlessly)
- Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant
  in Docker where system Node.js is already installed via apk)
- Remove all nodejs-wheel node replacement and venv npm patching blocks
  from Dockerfiles since the wheel is no longer installed
- Add --no-default-groups to CodSpeed benchmark workflow so the benchmark
  environment matches the old minimal pip install footprint
- Apply standard uv two-phase Docker pattern: copy metadata first, install
  deps (cached layer), then copy source and install project
- Replace CircleCI enterprise no-op with proper uv sync command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate uv.lock after removing nodejs-wheel-binaries

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): use cache/restore instead of cache to prevent cache poisoning

The old workflow used actions/cache/restore (read-only). The uv migration
changed it to actions/cache (read-write), which zizmor flags as a cache
poisoning risk. Restore the safer read-only variant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert

The setup-uv action enables caching by default, which zizmor flags as a
cache poisoning risk. Disable it since we already use a read-only
cache/restore step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): disable setup-uv cache in publish workflow

Silences zizmor cache-poisoning alert. Publishing workflow runs
infrequently on protected branches so caching adds no real benefit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(test): remove duplicate verbose_logger mock in test_check_migration

The logger was patched twice — first via mocker.patch() then via
mocker.patch.object(autospec=True). The second call fails because
autospec cannot inspect an already-mocked attribute. Remove the
redundant first patch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): free disk space before Docker build in test-server-root-path

The Dockerfile.non_root build ran out of disk on the CI runner. Remove
Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:46:23 -07:00
Yuneng Jiang
85f72c9d24
[Fix] Remove unused aioboto3 dependency and botocore conflict workarounds
aioboto3 was listed as a dependency for async sagemaker calls but is not
imported anywhere in the codebase — async calls use httpx + botocore SigV4
instead. Removing it eliminates the unresolvable botocore version conflict
between boto3 and aiobotocore, along with all grep -v / --no-deps workarounds
across Dockerfiles and CI.

Also addresses Greptile review feedback: collapse redundant grpcio
python-version markers, bump pyproject.toml cryptography to 46.0.5 to
match Docker (GHSA-r6ph-v2qm-q3c2), and fix misleading .npmrc comment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 14:25:44 -07:00
Yuneng Jiang
821a634d25
[Fix] Handle boto3/aioboto3 botocore conflict across CI and Docker builds
boto3==1.42.80 and aioboto3==15.5.0 have incompatible botocore version
ranges. No aioboto3 release supports botocore 1.42.x yet. Both uv and
pip 26.0.1 reject the resolution.

Fix: filter aioboto3 out of requirements.txt at install time, then
install aioboto3+aiobotocore with --no-deps to bypass resolution.
Added wrapt and aioitertools to requirements.txt as pinned transitive
deps of aiobotocore (skipped by --no-deps). Fixed pip stdin handling
(/dev/stdin). Applied to all 5 Dockerfiles and all CircleCI install
paths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 12:27:21 -07:00
Yuneng Jiang
5f63873dca
[Infra] Pin all Docker build dependencies to exact versions
Pin every dependency across all Docker builds so upgrades are intentional.
Verified by building all 3 production images and diffing pip freeze against
known-good v1.83.0-nightly baselines — zero version drift.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:05:39 -07:00
yuneng-jiang
d3587b1d8e fix: bump PyJWT to 2.12.0 in all Dockerfiles and tar to 7.5.11
All Dockerfiles were pinning PyJWT 2.9.0 (Dockerfile, Dockerfile.database,
Dockerfile.dev) or had a stale wheel build for 2.9.0 (Dockerfile.non_root).
Updated to 2.12.0 to match pyproject.toml. Also bumps tar to 7.5.11 in
Dockerfile.non_root for security.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 19:54:54 -07:00
yuneng-jiang
6a90596377 updating Dockerfile to tar 7.5.11 2026-03-13 11:16:17 -07:00
Krish Dholakia
e7714f0ce6
Fix CVEs: bump tar/minimatch/pypdf + harden Docker SBOM patching (#23082)
* fix(docker): bump tar/minimatch/pypdf for CVE fixes + harden SBOM patching

- Bump tar 7.5.8→7.5.10, minimatch 10.2.1→10.2.4, pypdf 6.6.2→6.7.3
- Add sed-based SBOM metadata patching with properly indented find/sed
- Add npm package manager cleanup (apk del / apt-get purge) to remove
  stale SBOM entries from image scanners
- Scope || true to only apk del via brace grouping { ... || true; }
- Guard npm root -g with non-empty assertion to prevent silent failures
- Scope minimatch sed regex to ^10.x to avoid matching other major versions

Addresses: CVE-2026-27903, CVE-2026-27904, GHSA-qffp-2rhf-9h96, CVE-2026-27888

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(docker): scope find to /usr/local/lib /usr/lib, drop autoremove

- Replace `find /` with `find /usr/local/lib /usr/lib` to avoid
  traversing /proc, /sys, /dev during SBOM metadata patching
- Remove `apt-get autoremove -y` from Debian-based Dockerfiles to
  prevent nodejs from being removed as an auto-installed dependency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 18:31:27 -08:00
Ishaan Jaff
29e3fd5d79
[Release Fix] (#22411)
* fix(lint): suppress PLR0915 for 3 complex methods that exceed 50-statement limit

- streaming_iterator.py: _process_event (84 statements)
- transformation.py: translate_messages_to_responses_input (51 statements)
- transformation.py: transform_realtime_response (54 statements)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(mypy): resolve type errors in public_endpoints, user_api_key_auth, common_utils, transformation

- public_endpoints.py: fix _cached_endpoints type annotation
- user_api_key_auth.py: accept Optional[str] for end_user_id parameter
- common_utils.py: add NewProjectRequest/UpdateProjectRequest to Union type
- transformation.py: add ChatCompletionRedactedThinkingBlock and list[Any] to content type

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(proxy-extras): bump version to 0.4.50 and sync schema

- Bump litellm-proxy-extras from 0.4.49 to 0.4.50
- Sync schema.prisma with main proxy schema
- Includes new LiteLLM_ClaudeCodePluginTable model
- Includes new @@index([startTime, request_id]) on SpendLogs
- Update version references in requirements.txt and pyproject.toml

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(router): use string id in test_add_deployment and add defensive str() in register_model

- Change test to use string '100' instead of int 100 for model_info.id
- Add str() conversion in register_model to prevent AttributeError on non-string keys

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(security): update minimatch to 10.2.4 to fix CVE-2026-27903 and CVE-2026-27904

- Run npm audit fix in docs/my-website
- Updates minimatch from 10.2.1 to 10.2.4 (fixes HIGH severity ReDoS vulnerabilities)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): update realtime guardrail test assertions to match actual guardrail behavior

- test_text_message_blocked_by_guardrail_no_ai_response: allow guardrail's own block
  message text in response.done (previously expected empty content)
- test_voice_transcript_blocked_by_guardrail: allow guardrail to send response.cancel
  + block message + response.create flow (previously expected no response.create)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: revert proxy-extras version in requirements.txt and pyproject.toml

The litellm-proxy-extras 0.4.50 is not published to PyPI yet, so consumer
references must stay at 0.4.49. Only the source package pyproject.toml
should be bumped to 0.4.50 for the publish_proxy_extras CI job.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: make transcript delta check optional in voice guardrail test

The guardrail sends an error event (guardrail_violation) when blocking
voice transcripts; it does not always produce transcript deltas. Remove
the assertion requiring response.audio_transcript.delta since the error
event is the primary signal that blocked content was handled.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Add missing env keys to documentation: LITELLM_MAX_STREAMING_DURATION_SECONDS and LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES

These two environment variables were used in code but not documented in the
environment variables reference section of config_settings.md, causing the
test_env_keys.py CI test to fail.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Fix 13 mypy type errors across 6 files

- in_flight_requests_middleware.py: Fix type: ignore error codes from
  [union-attr] to [attr-defined], add [arg-type] for Gauge **kwargs
- transformation.py: Add [assignment] ignore for output_format reassignment,
  add fallback empty string for tool use id to fix arg-type
- responses/main.py: Remove redundant type annotation on second
  secret_fields assignment to fix no-redef
- streaming_iterator.py: Add [assignment] ignores for intermediate
  cache token assignments
- handler.py: Add [typeddict-item] ignore for AnthropicMessagesRequest
  construction from dict
- public_endpoints.py: Add [arg-type] ignore for _load_endpoints()
  return type mismatch with SupportedEndpoint model

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add auth overrides to spend tracking tests, fix realtime guardrail assertion, update UI minimatch

- Add app.dependency_overrides for user_api_key_auth in 4 spend tracking tests
  that were returning 401 Unauthorized (error_code, error_message,
  error_code_and_key_alias, key_hash)
- Fix realtime guardrail test to check ANY error event for guardrail_violation
  instead of just the first (OpenAI may send its own errors first)
- Update ui/litellm-dashboard/package-lock.json to fix minimatch vulnerability

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Fix failing MCP e2e and create_mcp_server UI tests

Test 1 (test_independent_clients_no_shared_session):
- Add allow_all_keys: true to MCP servers in test config. With master_key
  and no DB, get_allowed_mcp_servers returned empty, causing 0 tools and
  403 on tool calls. allow_all_keys bypasses per-key restrictions.
- Add asyncio.sleep(0.5) between client connections to allow MCP SDK
  TaskGroup cleanup and avoid ExceptionGroup on connection close (MCP #915).

Test 2 (create_mcp_server 'auth value is provided'):
- Use userEvent.setup({ delay: null }) for instant keystrokes to avoid
  timeout from default typing delay on CI.
- Increase per-test timeout to 15000ms for CI environments.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: stabilize proxy unit tests for parallel execution

- test_response_polling_handler: add xdist_group to prevent heavy import OOM
- test_db_schema_migration: use temp dir for worker isolation, sync schema.prisma index
- test_custom_tokenizer_bug: use lighter tokenizer to prevent OOM in parallel

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add auth overrides to more spend tracking and model info tests

- Fix test_ui_view_spend_logs_pagination missing auth override (401)
- Fix test_view_spend_tags missing auth override (401)
- Fix test_view_spend_tags_no_database missing auth override (401)
- Fix test_empty_model_list.py to use app.dependency_overrides instead of patch()
  for FastAPI dependency injection auth

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): use patch.object for aiohttp transport test to work in parallel execution

The @patch decorator was not intercepting the static method call in parallel
xdist workers. Using patch.object on the directly-imported class is more reliable.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(security): update minimatch from 10.2.1 to 10.2.4 in Dockerfile

The Docker image was explicitly pinning minimatch@10.2.1 which has HIGH
severity ReDoS vulnerabilities (GHSA-7r86-cg39-jmmj, GHSA-23c5-xmqv-rm74).
Update to 10.2.4 which includes fixes for both CVEs.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ui): prevent MCP and TeamInfo test timeouts on CI

- Add userEvent.setup({ delay: null }) to all tests using userEvent in both files
- Add timeout: 15000 to tests with significant user interaction (typing, multiple clicks)
- Fixes: create_mcp_server Bearer Token test, TeamInfo cancel button test

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: stabilize parallel test execution and aiohttp transport test

- test_aiohttp_handler: rewrite transport test to not rely on static method mock
  (consistently fails in parallel xdist workers)
- test_proxy_cli: add xdist_group to prevent timeout during heavy imports
- test_swagger_chat_completions: add xdist_group to prevent timeout

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(security): add serialize-javascript override to fix GHSA-5c6j-r48x-rmvq

Add npm override for serialize-javascript>=7.0.3 in docs/my-website
to fix HIGH severity RCE vulnerability via RegExp.flags.
Also bump minimatch override to >=10.2.4.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Fix flaky tests: remove broken Vertex model, add retries for Anthropic

- Remove vertex_ai/meta/llama-4-scout-17b-16e-instruct-maas from
  test_partner_models_httpx_streaming - consistently returns 400 BadRequest
- Add @pytest.mark.flaky(retries=6, delay=10) to test_function_call_parsing
  for transient Anthropic API overload errors
- Add @pytest.mark.flaky(retries=6, delay=10) to test_openai_stream_options_call
  for transient Anthropic InternalServerError

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): add xdist_group(proxy_heavy) to prevent OOM in parallel proxy tests

- Add pytestmark = pytest.mark.xdist_group('proxy_heavy') to test_proxy_utils.py
- Change test_db_schema_migration.py from schema_migration to proxy_heavy group
- Add @pytest.mark.xdist_group('proxy_heavy') to test_proxy_server.py::test_health

Groups heavy proxy tests to run on same worker, avoiding worker OOM crashes.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Fix vertex AI qwen global endpoint test to mock vertexai module import

The test_vertex_ai_qwen_global_endpoint_url test was failing because the
VertexAIPartnerModels.completion() method tries to 'import vertexai' before
any of the mocked code runs. In environments without google-cloud-aiplatform
installed, this import fails with a VertexAIError(status_code=400).

Fix by:
- Adding patch.dict('sys.modules', {'vertexai': MagicMock()}) to mock the
  vertexai module import
- Adding vertex_ai_location parameter to the acompletion call for completeness

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): add xdist_group to health endpoint and watsonx tests for parallel stability

- test_health_liveliness_endpoint: add xdist_group('proxy_health') to prevent timeout
- test_watsonx_gpt_oss tests: add xdist_group('watsonx_heavy') to prevent mock interference

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): pre-populate WatsonX IAM token cache to prevent parallel test interference

The watsonx prompt transformation test was failing in parallel execution because
litellm.module_level_client.post mock was being interfered with by other tests.
Pre-populating the IAM token cache avoids the HTTP call entirely.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): add spend data polling with retries for e2e pass-through tests

- test_vertex_with_spend.test.js: Replace 15s fixed wait with polling loop
  (up to 6 attempts, 10s apart) for spend data to appear in DB
- Increase test timeout from 25s to 90s to accommodate polling
- base_anthropic_messages_tool_search_test.py: Add flaky(retries=3) for
  streaming test that depends on live Anthropic API

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): reduce parallel workers from 8 to 4 for proxy tests to prevent OOM

- litellm_proxy_unit_testing_part2: -n 8 -> -n 4
- litellm_mapped_tests_proxy_part2: -n 8 -> -n 4, timeout 60 -> 120
- Worker crashes consistently caused by too many parallel proxy tests
  each loading the full FastAPI app and heavy dependency tree

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(db): add migration for SpendLogs composite index (startTime, request_id)

The @@index([startTime, request_id]) was added to schema.prisma but had no
corresponding migration. This caused test_aaaasschema_migration_check to fail
because prisma migrate diff detected the missing index.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(db): add migration for MCP available_on_public_internet default change to true

The schema.prisma changed the default for available_on_public_internet from
false to true, but no migration was created. This caused the schema migration
test to detect drift.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): increase server wait time and add retry to flaky external API tests

- test_basic_python_version.py: increase server startup wait from 60s to 90s
  for slower CI environments (fixes installing_litellm_on_python_3_13)
- test_a2a_agent.py: add flaky(retries=3, delay=5) for non-streaming test
  that depends on live A2A agent endpoint

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): add flaky retries to all intermittent external API tests for 0-fail CI

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): add auth overrides to file endpoint tests that return 500

The test_target_storage tests were getting 500 because the FastAPI auth
dependency wasn't overridden. Added app.dependency_overrides for proper
auth bypass in test environment.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
2026-02-28 09:46:35 -08:00
Harshit28j
3e6c10a071 security: fix critical/high CVEs in OS-level libs and NPM transitive 2026-02-24 19:40:09 +05:30
Harshit Jain
3b043ee8bf
fix critical CVE vulnerabliltes (#20683) 2026-02-07 22:23:01 -08:00
Ishaan Jaffer
ef66a6cb62 fix security scans 2026-02-07 11:15:02 -08:00
Ishaan Jaffer
a002907389 fix tar security issue with TAR 2026-01-31 11:46:53 -08:00
Cesar Garcia
2a48d12507
fix(docker): add libsndfile to main Dockerfile for ARM64 audio processing (#19776)
Fixes #16920 for users of the stable release images.

The previous fix (PR #18092) added libsndfile to docker/Dockerfile.alpine,
but stable releases are built from the main Dockerfile (Wolfi-based),
not the Alpine variant.
2026-01-28 21:33:41 -08:00
Harshit Jain
c9d5185099
fix(docker): use correct schema path for prisma generation (#19631) 2026-01-23 21:22:19 -08:00
Alexsander Hamir
1544e8f971
feat: Add line_profiler support for performance analysis and fix Windows CRLF issues in Docker builds (#18773) 2026-01-07 11:36:57 -08:00
Krish Dholakia
74ba18df55
Litellm chainguard fixes 12 02 2025 p1 (#17406)
* build: update dockerfile non root

* build: update build

* build: update non root

* build: dockerfile fixes

* build: ensure dockerfile + dockerfile.database also work
2025-12-02 22:50:13 -08:00
Dmitriy Alergant
90850bf6d5
fix: add nodejs and npm to runtime dependencies for prisma generate (#16903)
Fixes cross-platform Docker build issue where `prisma generate` fails
when building for amd64 platform from macOS. The Prisma CLI requires
Node.js and npm to be available in the runtime environment.

The Python prisma package (v0.11.0) uses nodeenv to bootstrap Node.js
if not found. However, the downloaded npm v10 fails with a
"sizeCalculation" error in minimal Chainguard environments during
cross-platform builds. Providing system nodejs/npm resolves this.

Changes:
- Added nodejs and npm to runtime dependencies (Dockerfile:51)
- This enables prisma generate to run successfully during the build

Error without fix:
npm error cannot set sizeCalculation without setting maxSize or maxEntrySize
subprocess.CalledProcessError: Command '[...nodeenv/bin/npm', 'install',
'prisma@5.4.2']' returned non-zero exit status 1.

Testing:
docker buildx build --platform linux/amd64 -t litellm:test .

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-22 19:27:06 -08:00
Ishaan Jaff
9288c8543c
fix docker (#16342) 2025-11-07 14:38:20 -08:00
Ishaan Jaffer
03e73c9b7c fix security 2025-09-26 19:31:56 -07:00
Ishaan Jaffer
f5d04c15e6 security fix 2025-09-26 19:21:06 -07:00
Mritunjay Kumar Sharma
8b0e3c14ff
chore(docker): remove redundant Admin UI build step
The Admin UI is already built before packaging, so the second invocation of docker/build_admin_ui.sh after PyJWT adjustments was unnecessary. Removing it speeds up the builder stage, reduces cache invalidation, and doesn’t change the resulting wheel or runtime image.
2025-09-18 11:09:20 -05:00
Ishaan Jaff
a328ad56e3
[Bug Fix] Fixes for using Auto Router with LiteLLM Docker Image (#13788)
* fix install auto router.sh

* fixes for Docker IMG
2025-08-19 18:36:30 -07:00
Ishaan Jaff
106a298f0a
[Feat] UI - Allow Adding LiteLLM Auto Router on UI (#12960)
* add router.json

* test_router_auto_router

* async_pre_routing_hook

* fixes for auto router

* add async_pre_routing_hook

* add LiteLLMRouterEncoder

* update test auto_router_embedding_model

* add auto_router_embedding_model

* add AutoRouter

* fix async_pre_routing_hook

* update async_pre_routing_hook

* fix auto router

* fix router.json

* working router init

* working embedding encoder

* working auto router

* test_router_auto_router

* test auto router

* add semantic-router as optional for litellm

* add extras

* semantic_router==0.1.10

* ruff fix

* use aiohttp==3.10.11

* python-dotenv==1.0.1

* test auto router

* test_router_auto_router

* semantic_router

* test_is_auto_router_deployment

* fix check

* fix docker build step

* add semantic_router

* UI  - Add auto router on litellm

* working utterances config

* fix route config builder

* kind of working add automodel router

* move loc of add deployment

* fixes for AutoRouter

* add auto_router_config in types.py

* fixes for init_auto_router_deployment

* fix adding auto router models

* working auto-router with dB

* Revert "add semantic_router"

This reverts commit 537b67288798731a119d811f643b682086377ee9.

* TestAutoRouter

* fix linting

* add semantic router to docker

* test fix

* fix router config builder

* remove export button
2025-07-24 19:58:49 -07:00
Jugal D. Bhatt
a112ec5b02
Health check app on separate port (#12718)
* add separate health app

* add new docs

* refactor

* fix colons

* Update config_settings.md

* refactor

* docs

* add unit test

* added supervisord

* remove app

* add supervisor conf

* Add markdown

* add video to md

* remove test

* docs build failure

* add to all docker files, change prod.md and add tests

* change dockerfiles

* remove extra file

* remove extra file

* remove extra file

* change apt->apk

* remove rdb file

* add fixed file
2025-07-18 11:17:15 -07:00
Krish Dholakia
64f325b92e
adds tzdata (#10796) (#11052)
With tzdata installed, the environment variable `TZ` will be respected by Python's datetime module. This means that users can specify the timezone they want LiteLLM to use.

Co-authored-by: Simon Stone <sipreuss@gmail.com>
2025-05-22 22:36:19 -07:00
Peter Dave Hello
6b67006b0c
Remove redundant apk update in Dockerfiles (cc #5016) (#9055)
The `apk` commands can utilize the `--no-cache` option, making the
`update` step superfluous and ensuring the latest packages are used
without maintaining a local cache. An additional `apk update` in the
Dockerfile will just make the image larger with no benefits.
2025-04-08 09:03:25 -07:00
Tyler Hutcherson
7864cd1f76 update redisvl dependency 2025-03-24 08:42:11 -04:00
Ishaan Jaff
60c89a3e8a
(Fix) security of base image (#7620)
* fix security of base images

* fix dockerfile
2025-01-07 20:35:57 -08:00
Ishaan Jaff
6125ba1e2b
(Feat) - allow including dd-trace in litellm base image (#7587)
* introduce USE_DDTRACE=true

* update dd tracer

* update

* bump dd trace

* use og slim image

* DD tracing

* fix _init_dd_tracer
2025-01-06 17:27:09 -08:00
Krish Dholakia
e332e93786
Litellm security fixes (#7282)
* build(Dockerfile): bump node version

* build(Dockerfile): bump python version

fix critical errors

* build(requirements.txt): fix snyk errors
2024-12-18 09:38:52 -08:00
Ishaan Jaff
d1760b1b04
(fix) clean up root repo - move entrypoint.sh and build_admin_ui to /docker (#6110)
* fix move docker files to docker folders

* move check file length

* fix docker hub deploy

* fix clean up root

* fix circle ci config
2024-10-08 11:34:43 +05:30
Ishaan Jaff
5de69cb1b2 fix using Dockerfile 2024-10-08 08:45:40 +05:30
Ishaan Jaff
d742e8cb43
(clean up) move docker files from root to docker folder (#6109)
* fix move docker files to docker folders

* move check file length

* fix docker hub deploy
2024-10-08 08:23:52 +05:30
Jacob Hagstedt P Suorra
9ec3365ba6
Upgrade dependencies in dockerfile (#5862)
* Upgrade dependencies in dockerfile

* Change apt-get to apk for alpine image

* Set requirements file to same as dockerfile

---------

Co-authored-by: Jacob Hagstedt <wcgs@novonordisk.com>
2024-09-27 07:51:20 -07:00
Ishaan Jaff
4f9f505ebe
docker - handle debian issue on docker builds (#5752) 2024-09-23 17:58:22 -07:00
superpoussin22
acfb060bf1
Correct casing (#5817)
* Update Dockerfile

correct casing

* Update Dockerfile.database

correct casing

* Update Dockerfile.alpine

correct casing

* Update Dockerfile.non_root

correct casing
2024-09-21 08:21:11 -07:00
Ishaan Jaff
143bbe32cb fix dockerfile 2024-08-20 15:38:11 -07:00
Ishaan Jaff
fef6f50e23 fic docker file to run in non root model 2024-08-13 19:29:40 -07:00
Krrish Dholakia
3c6bc031de build(dockerfile): remove --config proxy_server_config.yaml from docker run
prevents startup errors with dockerfile
2024-04-08 13:23:56 -07:00
Ishaan Jaff
19a1d999ec (feat) update docs to not include gunicorn usage 2024-03-23 17:40:22 -07:00
Ishaan Jaff
ff57887b70 (feat) bump to python 3.11 2024-03-22 14:44:41 -07:00
Krrish Dholakia
8c91156842 build: build fixes 2024-03-19 16:59:59 -07:00
Krish Dholakia
88152c77c5
Update Dockerfile 2024-03-15 21:47:13 -07:00
ishaan-jaff
dd6138ecbf (fix) run prisma generate in default dockerfile 2024-03-14 09:10:56 -07:00
ishaan-jaff
60f34066a5 (fix) default dockerfile use num_workers = 1 2024-03-13 15:23:53 -07:00
ishaan-jaff
c50b0e315a (feat) don't use --detailed_debug on all default litellm images 2024-03-06 16:31:32 -08:00
ishaan-jaff
9ba2fa44aa (fix) admin ui allow custom ui theme 2024-02-21 21:28:58 -08:00
ishaan-jaff
d07846646c (ui) fix build command 2024-02-21 21:02:46 -08:00
ishaan-jaff
c69eaebfd8 (fix) dockerfile for semantic caching 2024-02-06 19:23:27 -08:00
Krrish Dholakia
8e9197b5b4 build(proxy_cli.py): make running gunicorn an optional cli arg
when running proxy locally, running with uvicorn is much better for debugging
2024-01-29 15:32:34 -08:00