build: migrate packaging, CI, and Docker from Poetry to uv (#25007)

* build: migrate packaging metadata to uv
* ci: move automation and local tooling to uv
* docker: migrate image builds and runtime setup to uv
* docs: update install and deployment guidance for uv
* chore: align auxiliary scripts and tests with uv
* test: harden test_litellm isolation
* fix: keep release and health check images self-contained
* build: pin uv tooling and health check deps
* test: isolate bedrock image request formatting from suite state
* test: cover sandbox executor requirements flow
* ci: fix circleci no-op command steps
* ci: fix circleci publish workflow parsing
* fix: stabilize remaining uv migration CI checks
* ci: increase matrix test timeout headroom
* fix: restore published docker and license coverage
* fix: restore proxy runtime build parity
* fix: restore proxy extras parity and venv migrations
* ci: persist uv path across circleci steps
* fix: keep psycopg binary in default test env
* docker: preserve prisma cache across stages
* test: run local proxy checks through uv python
* build: restore runtime deps moved into ci
* build: refresh uv lock after upstream merge

* fix: restore module import in test_check_migration after merge

The conflict resolution imported only the function but the test body references check_migration as a module throughout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching

- Move google-generativeai, Pillow, tenacity back to ci group (they are lazily imported and bloat the base SDK install needlessly)
- Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant in Docker where system Node.js is already installed via apk)
- Remove all nodejs-wheel node replacement and venv npm patching blocks from Dockerfiles since the wheel is no longer installed
- Add --no-default-groups to CodSpeed benchmark workflow so the benchmark environment matches the old minimal pip install footprint
- Apply standard uv two-phase Docker pattern: copy metadata first, install deps (cached layer), then copy source and install project
- Replace CircleCI enterprise no-op with proper uv sync command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate uv.lock after removing nodejs-wheel-binaries

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): use cache/restore instead of cache to prevent cache poisoning

The old workflow used actions/cache/restore (read-only). The uv migration changed it to actions/cache (read-write), which zizmor flags as a cache poisoning risk. Restore the safer read-only variant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert

The setup-uv action enables caching by default, which zizmor flags as a cache poisoning risk. Disable it since we already use a read-only cache/restore step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): disable setup-uv cache in publish workflow

Silences zizmor cache-poisoning alert. Publishing workflow runs infrequently on protected branches so caching adds no real benefit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(test): remove duplicate verbose_logger mock in test_check_migration

The logger was patched twice: first via mocker.patch(), then via mocker.patch.object(autospec=True). The second call fails because autospec cannot inspect an already-mocked attribute. Remove the redundant first patch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): free disk space before Docker build in test-server-root-path

The Dockerfile.non_root build ran out of disk on the CI runner. Remove Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:46:23 -07:00 · a6c30b30bf
commit a6c30b30bf
parent cd9c511df6
170 changed files with 13172 additions and 11736 deletions
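
Note on the publish workflow change in this diff (.github/workflows/publish_to_pypi.yml): the sanity check now reads the exact `litellm-proxy-extras` pin from `project.optional-dependencies.proxy` instead of grepping requirements.txt. The matching rule can be sketched as a standalone function. This is a simplified illustration, not the workflow's code: `find_exact_pin` is a hypothetical name, and the sample requirement strings and version numbers are invented for the example.

```python
from typing import Iterable, Optional


def find_exact_pin(requirements: Iterable[str], package: str) -> Optional[str]:
    """Return the version from a `package==version` requirement, if present."""
    for requirement in requirements:
        # Drop any environment marker, e.g. "; python_version >= '3.9'".
        spec = requirement.split(";", 1)[0].strip()
        name, sep, version = spec.partition("==")
        # Only an exact pin on exactly this package name counts.
        if sep and name.strip() == package and version.strip():
            return version.strip()
    return None


# Hypothetical requirement strings, for illustration only.
sample = [
    "fastapi>=0.100.0",
    "litellm-proxy-extras==1.2.3; python_version >= '3.9'",
]
pinned = find_exact_pin(sample, "litellm-proxy-extras")
```

Requiring `==` means a range such as `>=` is treated as a missing pin, which matches the workflow's intent of failing the release when no exact version can be checked against PyPI.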
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
--- a/.circleci/requirements.txt
+++ b/.circleci/requirements.txt
@ -1,21 +0,0 @@
-# used by CI/CD testing
-openai==1.100.1
-python-dotenv
-tiktoken
-importlib_metadata
-cohere
-redis==5.2.1
-redisvl==0.4.1
-anthropic
-orjson==3.10.15 # fast /embedding responses
-pydantic==2.12.5
-google-cloud-aiplatform==1.133.0
-google-cloud-iam==2.19.1
-fastapi-sso==0.16.0 
-uvloop==0.21.0
-mcp==1.26.0    # for MCP server
-semantic_router==0.1.10 # for auto-routing with litellm
-fastuuid==0.14.0
-responses==0.25.7  # for proxy client tests
-pytest-retry==1.6.3  # for automatic test retries
-litellm-proxy-extras  # for prisma migrations
--- a/.devcontainer/post-create.sh
+++ b/.devcontainer/post-create.sh
@ -1,17 +1,17 @@
 #!/usr/bin/env bash
 set -e

-echo "[post-create] Installing poetry via pip"
-python -m pip install --upgrade pip
-python -m pip install poetry
+echo "[post-create] Installing uv"
+curl -LsSf https://astral.sh/uv/0.10.9/install.sh | env UV_NO_MODIFY_PATH=1 sh
+export PATH="$HOME/.local/bin:$PATH"

-echo "[post-create] Installing Python dependencies (poetry)"
-poetry install --with dev --extras proxy
+echo "[post-create] Installing Python dependencies (uv)"
+uv sync --frozen --group proxy-dev --extra proxy

 echo "[post-create] Generating Prisma client"
-poetry run prisma generate
+uv run --no-sync prisma generate

 echo "[post-create] Installing npm dependencies"
 cd ui/litellm-dashboard && npm ci

-echo "[post-create] Done"
+echo "[post-create] Done"
--- a/.gitguardian.yaml
+++ b/.gitguardian.yaml
@ -37,7 +37,7 @@ secret:
    - "docs/**"
    - "**/*.md"
    - "**/*.lock"
-    - "poetry.lock"
+    - "uv.lock"
    - "package-lock.json"

  # Ignore security incidents with the SHA256 of the occurrence (false positives)
--- a/.github/workflows/_test-unit-base.yml
+++ b/.github/workflows/_test-unit-base.yml
@ -51,37 +51,30 @@ jobs:
        with:
          python-version: "3.12"

-      - name: Install Poetry
-        run: pip install 'poetry==2.3.2'
+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"

-      - name: Cache Poetry dependencies
+      - name: Cache uv dependencies
        uses: actions/cache@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
        with:
          path: |
-            ~/.cache/pypoetry
-            ~/.cache/pip
+            ~/.cache/uv
            .venv
-          key: ${{ runner.os }}-poetry-${{ hashFiles('poetry.lock') }}
+          key: ${{ runner.os }}-uv-${{ hashFiles('uv.lock') }}
          restore-keys: |
-            ${{ runner.os }}-poetry-
+            ${{ runner.os }}-uv-

      - name: Install dependencies
        run: |
-          poetry config virtualenvs.in-project true
-          poetry install --with dev,proxy-dev --extras "proxy semantic-router"
-          poetry run pip install google-genai==1.22.0 \
-            google-cloud-aiplatform==1.115.0 fastapi-offline==1.7.3 python-multipart==0.0.22 openapi-core==0.23.0
-
-      - name: Setup litellm-enterprise
-        run: |
-          poetry run pip install --force-reinstall --no-deps -e enterprise/
+          uv sync --frozen --group ci --group proxy-dev --extra google --extra proxy --extra semantic-router

      - name: Generate Prisma client
        env:
          PRISMA_BINARY_CACHE_DIR: ${{ runner.temp }}/prisma-cache
        run: |
-          poetry run pip install nodejs-wheel-binaries==24.13.1
-          poetry run prisma generate --schema litellm/proxy/schema.prisma
+          uv run --no-sync prisma generate --schema litellm/proxy/schema.prisma

      - name: Run tests
        env:
@ -90,7 +83,7 @@ jobs:
          WORKERS: ${{ inputs.workers }}
          RERUNS: ${{ inputs.reruns }}
        run: |
-          poetry run pytest ${TEST_PATH:?} \
+          uv run --no-sync pytest ${TEST_PATH:?} \
            --tb=short -vv \
            --maxfail="${MAX_FAILURES}" \
            -n "${WORKERS}" \
--- a/.github/workflows/_test-unit-services-base.yml
+++ b/.github/workflows/_test-unit-services-base.yml
@ -86,44 +86,37 @@ jobs:
        with:
          python-version: "3.12"

-      - name: Install Poetry
-        run: pip install 'poetry==2.3.2'
+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"

-      - name: Cache Poetry dependencies
+      - name: Cache uv dependencies
        uses: actions/cache@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
        with:
          path: |
-            ~/.cache/pypoetry
-            ~/.cache/pip
+            ~/.cache/uv
            .venv
-          key: ${{ runner.os }}-poetry-services-${{ hashFiles('poetry.lock') }}
+          key: ${{ runner.os }}-uv-services-${{ hashFiles('uv.lock') }}
          restore-keys: |
-            ${{ runner.os }}-poetry-services-
+            ${{ runner.os }}-uv-services-

      - name: Install dependencies
        run: |
-          poetry config virtualenvs.in-project true
-          poetry install --with dev,proxy-dev --extras "proxy semantic-router"
-          poetry run pip install google-genai==1.22.0 \
-            google-cloud-aiplatform==1.115.0 fastapi-offline==1.7.3 python-multipart==0.0.22 openapi-core==0.23.0
-
-      - name: Setup litellm-enterprise
-        run: |
-          poetry run pip install --force-reinstall --no-deps -e enterprise/
+          uv sync --frozen --group ci --group proxy-dev --extra google --extra proxy --extra semantic-router

      - name: Generate Prisma client
        env:
          PRISMA_BINARY_CACHE_DIR: ${{ runner.temp }}/prisma-cache
        run: |
-          poetry run pip install nodejs-wheel-binaries==24.13.1
-          poetry run prisma generate --schema litellm/proxy/schema.prisma
+          uv run --no-sync prisma generate --schema litellm/proxy/schema.prisma

      - name: Run Prisma migrations
        if: ${{ inputs.enable-postgres }}
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
        run: |
-          poetry run prisma db push --schema litellm/proxy/schema.prisma --accept-data-loss
+          uv run --no-sync prisma db push --schema litellm/proxy/schema.prisma --accept-data-loss

      - name: Run tests
        env:
@ -134,7 +127,7 @@ jobs:
          DATABASE_URL: ${{ inputs.enable-postgres && secrets.DATABASE_URL || '' }}
        run: |
          if [ "${WORKERS}" = "0" ]; then
-            poetry run pytest ${TEST_PATH:?} \
+            uv run --no-sync pytest ${TEST_PATH:?} \
              --tb=short -vv \
              --maxfail="${MAX_FAILURES}" \
              --reruns "${RERUNS}" \
@ -144,7 +137,7 @@ jobs:
              --cov-report=xml:coverage.xml \
              --cov-config=pyproject.toml
          else
-            poetry run pytest ${TEST_PATH:?} \
+            uv run --no-sync pytest ${TEST_PATH:?} \
              --tb=short -vv \
              --maxfail="${MAX_FAILURES}" \
              -n "${WORKERS}" \
--- a/.github/workflows/auto_update_price_and_context_window.yml
+++ b/.github/workflows/auto_update_price_and_context_window.yml
@ -17,12 +17,13 @@ jobs:
      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
        with:
          persist-credentials: false
-      - name: Install Dependencies
-        run: |
-          pip install 'aiohttp==3.13.3'
+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"
      - name: Update JSON Data
        run: |
-          python ".github/workflows/auto_update_price_and_context_window_file.py"
+          uv run --frozen --with 'aiohttp==3.13.3' python ".github/workflows/auto_update_price_and_context_window_file.py"
      - name: Create Pull Request
        run: |
          git add model_prices_and_context_window.json 
--- a/.github/workflows/codspeed.yml
+++ b/.github/workflows/codspeed.yml
@ -34,13 +34,21 @@ jobs:
        with:
          python-version: "3.12"

-      - name: Install dependencies
-        run: |
-          pip install -e "."
-          pip install pytest pytest-codspeed==4.3.0
+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"

      - name: Run benchmarks
        uses: CodSpeedHQ/action@1c8ae4843586d3ba879736b7f6b7b0c990757fab # v4.12.1
        with:
          mode: simulation
-          run: pytest tests/benchmarks/ --codspeed
+          run: >
+            env PYTEST_DISABLE_PLUGIN_AUTOLOAD=1
+            uv run --frozen --no-default-groups
+            --with pytest==8.3.5
+            --with pytest-codspeed==4.3.0
+            pytest
+            -p pytest_codspeed.plugin
+            tests/benchmarks/
+            --codspeed
--- a/.github/workflows/llm-translation-testing.yml
+++ b/.github/workflows/llm-translation-testing.yml
@ -31,26 +31,25 @@ jobs:
        with:
          python-version: "3.11"

-      - name: Install Poetry
-        run: |
-          pip install 'poetry==2.3.2'
-          poetry config virtualenvs.create true
-          poetry config virtualenvs.in-project true
+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"
+          enable-cache: false

-      - name: Restore Poetry dependencies cache
-        uses: actions/cache/restore@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.0.0
+      - name: Restore uv dependencies cache
+        uses: actions/cache/restore@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
        with:
          path: |
-            ~/.cache/pypoetry
+            ~/.cache/uv
            .venv
-          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
+          key: ${{ runner.os }}-uv-${{ hashFiles('uv.lock') }}
          restore-keys: |
-            ${{ runner.os }}-poetry-
+            ${{ runner.os }}-uv-

      - name: Install dependencies
        run: |
-          poetry install --with dev
-          poetry run pip install 'pytest-xdist==3.8.0' 'pytest-timeout==2.4.0'
+          uv sync --frozen

      - name: Create test results directory
        run: mkdir -p test-results
--- a/.github/workflows/publish_to_pypi.yml
+++ b/.github/workflows/publish_to_pypi.yml
@ -24,10 +24,22 @@ jobs:
        with:
          python-version: "3.12"

+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"
+          enable-cache: false
+
      - name: Check litellm version on PyPI
        id: check-litellm
        run: |
-          VERSION=$(grep -m1 '^version' pyproject.toml | sed 's/version = "\(.*\)"/\1/')
+          VERSION=$(python - <<'PY'
+          import tomllib
+
+          with open("pyproject.toml", "rb") as f:
+              print(tomllib.load(f)["project"]["version"])
+          PY
+          )
          echo "version=$VERSION" >> "$GITHUB_OUTPUT"
          echo "Checking if litellm $VERSION exists on PyPI..."

@ -42,43 +54,46 @@ jobs:

      - name: Sanity check proxy-extras version
        run: |
-          # Read pinned version from requirements.txt
-          REQ_VERSION=$(grep -oP 'litellm-proxy-extras==\K[0-9.]+' requirements.txt)
-          if [ -z "$REQ_VERSION" ]; then
-            echo "::error::Could not find litellm-proxy-extras version in requirements.txt"
-            exit 1
-          fi
-          echo "requirements.txt pins litellm-proxy-extras==$REQ_VERSION"
+          # Read pinned version from project optional dependencies
+          PYPROJECT_VERSION=$(python3 - <<'PY'
+          import sys
+          import tomllib

-          # Read pinned version from pyproject.toml dependency
-          PYPROJECT_VERSION=$(python3 -c "
-          import re
-          with open('pyproject.toml') as f:
-              content = f.read()
-          match = re.search(r'litellm-proxy-extras\s*=\s*\{version\s*=\s*\"([^\"]+)\"', content)
-          if match:
-              print(match.group(1).lstrip('^~>='))
-          else:
-              import sys
-              print('::error::Could not find litellm-proxy-extras dependency in pyproject.toml', file=sys.stderr)
+          with open("pyproject.toml", "rb") as f:
+              proxy_requirements = tomllib.load(f)["project"]["optional-dependencies"]["proxy"]
+
+          version = None
+          for requirement in proxy_requirements:
+              normalized = requirement.split(";", 1)[0].strip()
+              if not normalized.startswith("litellm-proxy-extras"):
+                  continue
+              parts = normalized.split("==", 1)
+              if len(parts) == 2 and parts[0].strip() == "litellm-proxy-extras":
+                  candidate = parts[1].strip()
+                  if candidate:
+                      version = candidate
+                      break
+
+          if version is None:
+              print(
+                  "::error::Could not find an exact litellm-proxy-extras pin in project.optional-dependencies.proxy",
+                  file=sys.stderr,
+              )
              sys.exit(1)
-          ")
+
+          print(version)
+          PY
+          )
          echo "pyproject.toml pins litellm-proxy-extras version: $PYPROJECT_VERSION"

-          # Check that both pinned versions match
-          if [ "$REQ_VERSION" != "$PYPROJECT_VERSION" ]; then
-            echo "::error::Version mismatch: requirements.txt has $REQ_VERSION but pyproject.toml has $PYPROJECT_VERSION"
-            exit 1
-          fi
-
          # Check that the pinned version exists on PyPI
-          echo "Checking if litellm-proxy-extras $REQ_VERSION exists on PyPI..."
-          HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "https://pypi.org/pypi/litellm-proxy-extras/$REQ_VERSION/json")
+          echo "Checking if litellm-proxy-extras $PYPROJECT_VERSION exists on PyPI..."
+          HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "https://pypi.org/pypi/litellm-proxy-extras/$PYPROJECT_VERSION/json")
          if [ "$HTTP_STATUS" != "200" ]; then
-            echo "::error::litellm-proxy-extras $REQ_VERSION is not published on PyPI yet. Publish it before releasing litellm."
+            echo "::error::litellm-proxy-extras $PYPROJECT_VERSION is not published on PyPI yet. Publish it before releasing litellm."
            exit 1
          fi
-          echo "litellm-proxy-extras $REQ_VERSION exists on PyPI. Sanity check passed."
+          echo "litellm-proxy-extras $PYPROJECT_VERSION exists on PyPI. Sanity check passed."

  publish-litellm:
    name: Publish litellm to PyPI
@ -100,16 +115,19 @@ jobs:
        with:
          python-version: "3.12"

+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"
+          enable-cache: false
+
      - name: Copy model prices backup
        run: cp model_prices_and_context_window.json litellm/model_prices_and_context_window_backup.json

-      - name: Install build tools
-        run: python -m pip install --upgrade pip build==1.4.2
-
      - name: Build package
        run: |
          rm -rf build dist
-          python -m build
+          uv build

      - name: Verify build artifacts
        env:
@ -129,8 +147,7 @@ jobs:

      - name: Validate package metadata
        run: |
-          pip install twine==6.2.0
-          twine check dist/*
+          uv tool run --from 'twine==6.2.0' twine check dist/*

      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@ed0c53931b1dc9bd32cbe73a98c7f6766f8a527e # v1.13.0
--- a/.github/workflows/run_llm_translation_tests.py
+++ b/.github/workflows/run_llm_translation_tests.py
@ -325,7 +325,7 @@ def run_tests(test_path: str = "tests/llm_translation/",
    
    # Run pytest
    cmd = [
-        "poetry", "run", "pytest", test_path,
+        "uv", "run", "--no-sync", "pytest", test_path,
        f"--junitxml={junit_xml}",
        "-v",
        "--tb=short",
@ -335,7 +335,7 @@ def run_tests(test_path: str = "tests/llm_translation/",
    
    # Add timeout if pytest-timeout is installed
    try:
-        subprocess.run(["poetry", "run", "python", "-c", "import pytest_timeout"], 
+        subprocess.run(["uv", "run", "--no-sync", "python", "-c", "import pytest_timeout"], 
                      capture_output=True, check=True)
        cmd.extend(["--timeout=300"])
    except:
@ -436,4 +436,4 @@ if __name__ == "__main__":
        commit=args.commit
    )
    
-    sys.exit(exit_code)
+    sys.exit(exit_code)
--- a/.github/workflows/test-linting.yml
+++ b/.github/workflows/test-linting.yml
@ -24,26 +24,28 @@ jobs:
        with:
          python-version: "3.12"

-      - name: Install Poetry
-        run: pip install 'poetry==2.3.2'
+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"

      - name: Clean Python cache
        run: |
          find . -type d -name "__pycache__" -exec rm -rf {} + || true
          find . -name "*.pyc" -delete || true

-      - name: Check poetry.lock is up to date
+      - name: Check uv.lock is up to date
        run: |
-          poetry check --lock || (echo "❌ poetry.lock is out of sync with pyproject.toml. Run 'poetry lock' locally and commit the result." && exit 1)
+          uv lock --check || (echo "❌ uv.lock is out of sync with pyproject.toml. Run 'uv lock' locally and commit the result." && exit 1)

      - name: Install dependencies
        run: |
-          poetry install --with dev
+          uv sync --frozen

      - name: Check Black formatting
        run: |
          cd litellm
-          poetry run black --check --exclude '/enterprise/' .
+          uv run --no-sync black --check --exclude '/enterprise/' .
          cd ..

      - name: Debug - Check file state
@ -58,28 +60,28 @@ jobs:
      - name: Run Ruff linting
        run: |
          cd litellm
-          poetry run ruff check .
+          uv run --no-sync ruff check .
          cd ..

      - name: Print OpenAI version
        run: |
-          poetry run python -c "import openai; print(f'OpenAI version: {openai.__version__}')"
+          uv run --no-sync python -c "import openai; print(f'OpenAI version: {openai.__version__}')"

      - name: Run MyPy type checking
        run: |
          cd litellm
-          poetry run mypy . 
+          uv run --no-sync mypy .
          cd ..

      - name: Check for circular imports
        run: |
          cd litellm
-          poetry run python ../tests/documentation_tests/test_circular_imports.py
+          uv run --no-sync python ../tests/documentation_tests/test_circular_imports.py
          cd ..

      - name: Check import safety
        run: |
-          poetry run python -c "from litellm import *" || (echo '🚨 import failed, this means you introduced unprotected imports! 🚨'; exit 1)
+          uv run --no-sync python -c "from litellm import *" || (echo '🚨 import failed, this means you introduced unprotected imports! 🚨'; exit 1)

  secret-scan:
    runs-on: ubuntu-latest
@ -98,18 +100,21 @@ jobs:
        with:
          python-version: "3.12"

+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"
+
      - name: Run secret scan test
        run: |
-          pip install 'pytest==9.0.2'
-          pytest tests/litellm/test_no_hardcoded_secrets.py -v
+          uv run --frozen --with 'pytest==9.0.2' pytest tests/litellm/test_no_hardcoded_secrets.py -v

      - name: Run ggshield secret scan
        env:
          GITGUARDIAN_API_KEY: ${{ secrets.GITGUARDIAN_API_KEY }}
        run: |
          if [ -n "$GITGUARDIAN_API_KEY" ]; then
-            pip install 'ggshield==1.48.0'
-            ggshield secret scan repo .
+            uv tool run --from 'ggshield==1.48.0' ggshield secret scan repo .
          else
            echo "GITGUARDIAN_API_KEY not set, skipping ggshield scan"
          fi
--- a/.github/workflows/test-litellm.yml
+++ b/.github/workflows/test-litellm.yml
@ -31,23 +31,15 @@ jobs:
        with:
          python-version: "3.12"

-      - name: Install Poetry
-        run: pip install 'poetry==2.3.2'
+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"

      - name: Install dependencies
        run: |
-          poetry lock
-          poetry install --with dev,proxy-dev --extras "proxy semantic-router"
-          poetry run pip install "pytest-retry==1.6.3"
-          poetry run pip install 'pytest-xdist==3.8.0'
-          poetry run pip install "google-genai==1.22.0"
-          poetry run pip install "google-cloud-aiplatform==1.115.0"
-          poetry run pip install "fastapi-offline==1.7.3"
-          poetry run pip install "python-multipart==0.0.22"
-          poetry run pip install "openapi-core==0.23.0"
-      - name: Setup litellm-enterprise as local package
-        run: |
-          poetry run pip install --force-reinstall --no-deps -e enterprise/
+          uv lock --check
+          uv sync --frozen --group ci --group proxy-dev --extra google --extra proxy --extra semantic-router
      - name: Run tests
        run: |
-          poetry run pytest tests/test_litellm --tb=short -vv --maxfail=10 -n 4 --durations=50
+          uv run --no-sync pytest tests/test_litellm --tb=short -vv --maxfail=10 -n 4 --durations=50
--- a/.github/workflows/test-mcp.yml
+++ b/.github/workflows/test-mcp.yml
@ -27,26 +27,16 @@ jobs:
        with:
          python-version: "3.12"

-      - name: Install Poetry
-        run: pip install 'poetry==2.3.2'
+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"

      - name: Install dependencies
        run: |
-          poetry lock
-          poetry install --with dev,proxy-dev --extras "proxy semantic-router"
-          poetry run pip install "pytest==7.3.1"
-          poetry run pip install "pytest-retry==1.6.3"
-          poetry run pip install "pytest-cov==5.0.0"
-          poetry run pip install "pytest-asyncio==0.21.1"
-          poetry run pip install "respx==0.22.0"
-          poetry run pip install "pydantic==2.11.0"
-          poetry run pip install "mcp==1.25.0"
-          poetry run pip install 'pytest-xdist==3.8.0'
-
-      - name: Setup litellm-enterprise as local package
-        run: |
-          poetry run pip install --force-reinstall --no-deps -e enterprise/
+          uv lock --check
+          uv sync --frozen --group proxy-dev --extra proxy --extra semantic-router

      - name: Run MCP tests
        run: |
-          poetry run pytest tests/mcp_tests -x -vv -n 4 --cov=litellm --cov-report=xml --durations=5
+          uv run --no-sync pytest tests/mcp_tests -x -vv -n 4 --cov=litellm --cov-report=xml --durations=5
--- a/.github/workflows/test-unit-documentation.yml
+++ b/.github/workflows/test-unit-documentation.yml
@ -26,42 +26,35 @@ jobs:
        with:
          python-version: "3.12"

-      - name: Install Poetry
-        run: pip install 'poetry==2.3.2'
+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"

-      - name: Cache Poetry dependencies
+      - name: Cache uv dependencies
        uses: actions/cache@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
        with:
          path: |
-            ~/.cache/pypoetry
-            ~/.cache/pip
+            ~/.cache/uv
            .venv
-          key: ${{ runner.os }}-poetry-${{ hashFiles('poetry.lock') }}
+          key: ${{ runner.os }}-uv-${{ hashFiles('uv.lock') }}
          restore-keys: |
-            ${{ runner.os }}-poetry-
+            ${{ runner.os }}-uv-

      - name: Install dependencies
        run: |
-          poetry config virtualenvs.in-project true
-          poetry install --with dev,proxy-dev --extras "proxy semantic-router"
-          poetry run pip install google-genai==1.22.0 \
-            google-cloud-aiplatform==1.115.0 fastapi-offline==1.7.3 python-multipart==0.0.22 openapi-core==0.23.0
-
-      - name: Setup litellm-enterprise
-        run: |
-          poetry run pip install --force-reinstall --no-deps -e enterprise/
+          uv sync --frozen --group ci --group proxy-dev --extra google --extra proxy --extra semantic-router

      - name: Generate Prisma client
        env:
          PRISMA_BINARY_CACHE_DIR: ${{ runner.temp }}/prisma-cache
        run: |
-          poetry run pip install nodejs-wheel-binaries==24.13.1
-          poetry run prisma generate --schema litellm/proxy/schema.prisma
+          uv run --no-sync prisma generate --schema litellm/proxy/schema.prisma

      # Run the same documentation tests that CircleCI ran (as direct Python scripts)
      - name: Run documentation validation tests
        run: |
-          poetry run python ./tests/documentation_tests/test_env_keys.py
-          poetry run python ./tests/documentation_tests/test_router_settings.py
-          poetry run python ./tests/documentation_tests/test_api_docs.py
-          poetry run python ./tests/documentation_tests/test_circular_imports.py
+          uv run --no-sync python ./tests/documentation_tests/test_env_keys.py
+          uv run --no-sync python ./tests/documentation_tests/test_router_settings.py
+          uv run --no-sync python ./tests/documentation_tests/test_api_docs.py
+          uv run --no-sync python ./tests/documentation_tests/test_circular_imports.py
--- a/.github/workflows/test-unit-proxy-legacy.yml
+++ b/.github/workflows/test-unit-proxy-legacy.yml
@ -50,43 +50,36 @@ jobs:
        with:
          python-version: "3.12"

-      - name: Install Poetry
-        run: pip install 'poetry==2.3.2'
+      - name: Set up uv
+        uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7
+        with:
+          version: "0.10.9"

-      - name: Cache Poetry dependencies
+      - name: Cache uv dependencies
        uses: actions/cache@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
        with:
          path: |
-            ~/.cache/pypoetry
-            ~/.cache/pip
+            ~/.cache/uv
            .venv
-          key: ${{ runner.os }}-poetry-${{ hashFiles('poetry.lock') }}
+          key: ${{ runner.os }}-uv-${{ hashFiles('uv.lock') }}
          restore-keys: |
-            ${{ runner.os }}-poetry-
+            ${{ runner.os }}-uv-

      - name: Install dependencies
        run: |
-          poetry config virtualenvs.in-project true
-          poetry install --with dev,proxy-dev --extras "proxy semantic-router"
-          poetry run pip install google-genai==1.22.0 \
-            google-cloud-aiplatform==1.115.0 fastapi-offline==1.7.3 python-multipart==0.0.22 openapi-core==0.23.0
-
-      - name: Setup litellm-enterprise
-        run: |
-          poetry run pip install --force-reinstall --no-deps -e enterprise/
+          uv sync --frozen --group ci --group proxy-dev --extra google --extra proxy --extra semantic-router

      - name: Generate Prisma client
        env:
          PRISMA_BINARY_CACHE_DIR: ${{ runner.temp }}/prisma-cache
        run: |
-          poetry run pip install nodejs-wheel-binaries==24.13.1
-          poetry run prisma generate --schema litellm/proxy/schema.prisma
+          uv run --no-sync prisma generate --schema litellm/proxy/schema.prisma

      - name: Run tests - ${{ matrix.test-group.name }}
        env:
          TEST_PATH: ${{ matrix.test-group.path }}
        run: |
-          poetry run pytest ${TEST_PATH} \
+          uv run --no-sync pytest ${TEST_PATH} \
            --tb=short -vv \
            --maxfail=10 \
            -n 2 \
--- a/.github/workflows/test_server_root_path.yml
+++ b/.github/workflows/test_server_root_path.yml
@@ -21,6 +21,12 @@ jobs:
        with:
          persist-credentials: false

+      - name: Free up disk space
+        run: |
+          sudo rm -rf /usr/local/lib/android /usr/share/dotnet /opt/ghc /usr/local/share/boost
+          sudo apt-get clean
+          df -h /
+
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12

--- a/AGENTS.md
+++ b/AGENTS.md
@@ -121,7 +121,7 @@ LiteLLM supports MCP for agent workflows:

 ## RUNNING SCRIPTS

-Use `poetry run python script.py` to run Python scripts in the project environment (for non-test files).
+Use `uv run python script.py` to run Python scripts in the project environment (for non-test files).

 ## GITHUB TEMPLATES

@@ -232,16 +232,16 @@ When opening issues or pull requests, follow these templates:

 ### Environment

-- Poetry is installed in `~/.local/bin`; the update script ensures it is on `PATH`.
+- uv is installed in `~/.local/bin`; the update script ensures it is on `PATH`.
 - Python 3.12, Node 22 are pre-installed.
-- The virtual environment lives under `~/.cache/pypoetry/virtualenvs/`.
+- The project virtual environment lives under `.venv/`.

 ### Running the proxy server

 Start the proxy with a config file:

 ```bash
-poetry run litellm --config dev_config.yaml --port 4000
+uv run litellm --config dev_config.yaml --port 4000
 ```

 The proxy takes ~15-20 seconds to fully start (it runs Prisma migrations on boot). Wait for `/health` to return before sending requests. Without a PostgreSQL `DATABASE_URL`, the proxy connects to a default Neon dev database embedded in the `litellm-proxy-extras` package.
@@ -250,17 +250,16 @@ The proxy takes ~15-20 seconds to fully start (it runs Prisma migrations on boot

 See `CLAUDE.md` and the `Makefile` for standard commands. Key notes:

-- `psycopg-binary` must be installed (`poetry run pip install psycopg-binary`) because the pytest-postgresql plugin requires it and the lock file only includes `psycopg` (no binary).
-- `openapi-core` must be installed (`poetry run pip install openapi-core`) for the OpenAPI compliance tests in `tests/test_litellm/interactions/`.
+- `uv sync --group proxy-dev --extra proxy` installs the Prisma and proxy-side test dependencies used by the standard local workflow.
 - The `--timeout` pytest flag is NOT available; don't pass it.
-- Unit tests: `poetry run pytest tests/test_litellm/ -x -vv -n 4`
-- **Before committing, always run `poetry run black .` to format your code.** Black formatting is enforced in CI.
-- If `poetry install` fails with "pyproject.toml changed significantly since poetry.lock was last generated", run `poetry lock` first to regenerate the lock file.
+- Unit tests: `uv run pytest tests/test_litellm/ -x -vv -n 4`
+- **Before committing, always run `uv run black .` to format your code.** Black formatting is enforced in CI.
+- If `uv sync` fails because the lockfile is outdated, run `uv lock` and retry.

 ### Lint

 ```bash
-cd litellm && poetry run ruff check .
+cd litellm && uv run ruff check .
 ```

 Ruff is the primary fast linter. For the full lint suite (including mypy, black, circular imports), run `make lint` per `CLAUDE.md`.
@@ -271,4 +270,4 @@ Ruff is the primary fast linter. For the full lint suite (including mypy, black,
 - The proxy at port 4000 serves a **pre-built** static UI from `litellm/proxy/_experimental/out/`. After making UI code changes, you must run `npm run build` in the dashboard directory and copy the output: `cp -r ui/litellm-dashboard/out/* litellm/proxy/_experimental/out/` for the proxy to serve the updated UI.
 - SVGs used as provider logos (loaded via `<img>` tags) must NOT use `fill="currentColor"` — replace with an explicit color like `#000000` or use the `-color` variant from lobehub icons, since CSS color inheritance does not work inside `<img>` elements.
 - Provider logos live in `ui/litellm-dashboard/public/assets/logos/` (source) and `litellm/proxy/_experimental/out/assets/logos/` (pre-built). Both locations must have the file for it to work in dev and proxy-served modes.
-- UI Vitest tests: `cd ui/litellm-dashboard && npx vitest run`
+- UI Vitest tests: `cd ui/litellm-dashboard && npx vitest run`
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -7,7 +7,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 ### Installation
 - `make install-dev` - Install core development dependencies
 - `make install-proxy-dev` - Install proxy development dependencies with full feature set
-- `make install-test-deps` - Install all test dependencies
+- `make install-test-deps` - Install the full local test environment and generate the Prisma client

 ### Testing
 - `make test` - Run all tests
@@ -20,14 +20,14 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 - `make format` - Apply Black code formatting
 - `make lint-ruff` - Run Ruff linting only
 - `make lint-mypy` - Run MyPy type checking only
-- **Before committing, always run `poetry run black .` to format your code.** Black formatting is enforced in CI.
+- **Before committing, always run `uv run black .` to format your code.** Black formatting is enforced in CI.

 ### Single Test Files
-- `poetry run pytest tests/path/to/test_file.py -v` - Run specific test file
-- `poetry run pytest tests/path/to/test_file.py::test_function -v` - Run specific test
+- `uv run pytest tests/path/to/test_file.py -v` - Run specific test file
+- `uv run pytest tests/path/to/test_file.py::test_function -v` - Run specific test

 ### Running Scripts
-- `poetry run python script.py` - Run Python scripts (use for non-test files)
+- `uv run python script.py` - Run Python scripts (use for non-test files)

 ### GitHub Issue & PR Templates
 When contributing to the project, use the appropriate templates:
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -122,9 +122,17 @@ Run all unit tests (uses parallel execution for speed):
 make test-unit
 ```

+If you're running broader test suites, proxy tests, or anything that touches PostgreSQL-backed fixtures/plugins, install the full local test environment first:
+
+```bash
+make install-test-deps
+```
+
+This syncs the locked test environment used across the repo, including `psycopg` v3 plus `psycopg-binary` (used by `pytest-postgresql`), `psycopg2-binary` (used by some proxy E2E tests), and a generated Prisma client for DB-backed proxy tests, so pytest startup matches CI without manual package installs.
+
 Run specific test files:
 ```bash
-poetry run pytest tests/test_litellm/test_your_file.py -v
+uv run pytest tests/test_litellm/test_your_file.py -v
 ```

 ### Running Linting and Formatting Checks
@@ -185,7 +193,7 @@ Run `make help` to see all available commands:
 make help                       # Show all available commands
 make install-dev               # Install development dependencies
 make install-proxy-dev         # Install proxy development dependencies
-make install-test-deps         # Install test dependencies (for running tests)
+make install-test-deps         # Install the full local test environment
 make format                    # Apply Black code formatting
 make format-check              # Check Black formatting (matches CI)
 make lint                      # Run all linting checks
@@ -247,7 +255,7 @@ To run the proxy server locally:
 make install-proxy-dev

 # Start the proxy server
-poetry run litellm --config your_config.yaml
+uv run litellm --config your_config.yaml
 ```

 ### Docker Development
@@ -332,4 +340,4 @@ Looking for ideas? Check out:
 - 🧪 Test coverage improvements
 - 🔌 New LLM provider integrations

-Thank you for contributing to LiteLLM! 🚀 
+Thank you for contributing to LiteLLM! 🚀 
--- a/137
+++ b/137
@@ -3,57 +3,75 @@ ARG LITELLM_BUILD_IMAGE=cgr.dev/chainguard/wolfi-base@sha256:a5a619c1793039dcf92

 # Runtime image
 ARG LITELLM_RUNTIME_IMAGE=cgr.dev/chainguard/wolfi-base@sha256:a5a619c1793039dcf92f02178f37c94bb3d6001403716da59d6092dfe8d9b502
+ARG UV_IMAGE=ghcr.io/astral-sh/uv:0.10.9@sha256:10902f58a1606787602f303954cea099626a4adb02acbac4c69920fe9d278f82
+
+FROM $UV_IMAGE AS uvbin

 # Builder stage
 FROM $LITELLM_BUILD_IMAGE AS builder

-# Set the working directory to /app
 WORKDIR /app
-
 USER root

-# Install build dependencies
-RUN apk add --no-cache bash gcc py3-pip python3 python3-dev openssl openssl-dev
+COPY --from=uvbin /uv /usr/local/bin/uv
+COPY --from=uvbin /uvx /usr/local/bin/uvx

-RUN python -m pip install build==1.4.2
+RUN apk add --no-cache \
+    bash \
+    gcc \
+    python3 \
+    python3-dev \
+    openssl \
+    openssl-dev \
+    nodejs \
+    npm \
+    libsndfile

-# Copy the current directory contents into the container at /app
+ENV PRISMA_BINARY_CACHE_DIR=/app/.cache/prisma-python/binaries \
+    UV_PROJECT_ENVIRONMENT=/app/.venv \
+    UV_LINK_MODE=copy \
+    XDG_CACHE_HOME=/app/.cache \
+    PATH="/app/.venv/bin:${PATH}"
+
+# Copy dependency metadata first for layer caching
+COPY pyproject.toml uv.lock ./
+COPY enterprise/pyproject.toml enterprise/
+COPY litellm-proxy-extras/pyproject.toml litellm-proxy-extras/
+
+# Install third-party dependencies (cached unless pyproject.toml/uv.lock change)
+RUN uv sync --frozen --no-install-project --no-install-workspace --no-default-groups --no-editable \
+    --extra proxy \
+    --extra proxy-runtime \
+    --extra extra_proxy \
+    --extra semantic-router \
+    --python python3
+
+# Copy full source tree
 COPY . .

-# Build Admin UI
-# Convert Windows line endings to Unix and make executable
+# Build Admin UI before final sync
 RUN sed -i 's/\r$//' docker/build_admin_ui.sh && chmod +x docker/build_admin_ui.sh && ./docker/build_admin_ui.sh

-# Build the package
-RUN rm -rf dist/* && python -m build
+# Install project and workspace packages (fast - deps already cached)
+RUN uv sync --frozen --no-default-groups --no-editable \
+    --extra proxy \
+    --extra proxy-runtime \
+    --extra extra_proxy \
+    --extra semantic-router \
+    --python python3

-# There should be only one wheel file now, assume the build only creates one
-RUN ls -1 dist/*.whl | head -1
+RUN prisma generate --schema=./schema.prisma

-# Install the package
-RUN pip install dist/*.whl
-
-# install dependencies as wheels
-RUN pip wheel --no-cache-dir --wheel-dir=/wheels/ -r requirements.txt
-
-# ensure pyjwt is used, not jwt
-RUN pip uninstall jwt -y
-RUN pip uninstall PyJWT -y
-RUN pip install PyJWT==2.12.0 --no-cache-dir
+RUN sed -i 's/\r$//' docker/entrypoint.sh && chmod +x docker/entrypoint.sh && \
+    sed -i 's/\r$//' docker/prod_entrypoint.sh && chmod +x docker/prod_entrypoint.sh

 # Runtime stage
 FROM $LITELLM_RUNTIME_IMAGE AS runtime

-# Ensure runtime stage runs as root
 USER root

-# Install runtime dependencies (libsndfile needed for audio processing on ARM64)
-RUN apk add --no-cache bash openssl tzdata nodejs npm python3 py3-pip libsndfile && \
+RUN apk add --no-cache bash openssl tzdata nodejs npm python3 libsndfile supervisor && \
    npm install -g npm@11.12.1 tar@7.5.11 glob@11.1.0 @isaacs/brace-expansion@5.0.1 minimatch@10.2.4 diff@8.0.3 && \
-    # SECURITY FIX: npm bundles tar, glob, and brace-expansion at multiple nested
-    # levels inside its dependency tree. `npm install -g <pkg>` only creates a
-    # SEPARATE global package, it does NOT replace npm's internal copies.
-    # We must find and replace EVERY copy inside npm's directory.
    GLOBAL="$(npm root -g)" && \
    find "$GLOBAL/npm" -type d -name "tar" -path "*/node_modules/tar" | while read d; do \
        rm -rf "$d" && cp -rL "$GLOBAL/tar" "$d"; \
@@ -70,73 +88,24 @@ RUN apk add --no-cache bash openssl tzdata nodejs npm python3 py3-pip libsndfile
    find "$GLOBAL/npm" -type d -name "diff" -path "*/node_modules/diff" | while read d; do \
        rm -rf "$d" && cp -rL "$GLOBAL/diff" "$d"; \
    done && \
-    # SECURITY FIX: patch npm's own package.json metadata so scanners see the
-    # actual installed versions instead of the stale declared dependencies.
    find /usr/local/lib /usr/lib -path "*/node_modules/npm/package.json" -exec \
        sed -i 's/"tar": "\^7\.5\.[0-9]*"/"tar": "^7.5.10"/g; s/"minimatch": "\^10\.[0-9.]*"/"minimatch": "^10.2.4"/g' {} + 2>/dev/null && \
    npm cache clean --force && \
-    # Remove the apk-tracked npm so its stale SBOM metadata (tar 7.5.9) is
-    # no longer visible to image scanners.  The globally installed npm@latest
-    # at /usr/local/lib/node_modules/npm/ remains fully functional.
    { apk del --no-cache npm 2>/dev/null || true; }

 WORKDIR /app
-# Copy the current directory contents into the container at /app
-COPY . .
-RUN ls -la /app
+ENV PRISMA_BINARY_CACHE_DIR=/app/.cache/prisma-python/binaries \
+    XDG_CACHE_HOME=/app/.cache \
+    PATH="/app/.venv/bin:${PATH}"

-# Copy the built wheel from the builder stage to the runtime stage; assumes only one wheel file is present
-COPY --from=builder /app/dist/*.whl .
-COPY --from=builder /wheels/ /wheels/
+COPY --from=builder /app /app

-# Install the built wheel using pip; again using a wildcard if it's the only file
-RUN pip install *.whl /wheels/* --no-index --find-links=/wheels/ --no-deps && rm -f *.whl && rm -rf /wheels
-
-# Replace the nodejs-wheel-binaries bundled node with the system node (fixes CVE-2025-55130)
-RUN NODEJS_WHEEL_NODE=$(find /usr/lib -path "*/nodejs_wheel/bin/node" 2>/dev/null) && \
-    if [ -n "$NODEJS_WHEEL_NODE" ]; then cp /usr/bin/node "$NODEJS_WHEEL_NODE"; fi
-
-# Remove test files and keys from dependencies
-RUN find /usr/lib -type f -path "*/tornado/test/*" -delete && \
-    find /usr/lib -type d -path "*/tornado/test" -delete
-
-# SECURITY FIX: nodejs-wheel-binaries (pip package used by Prisma) bundles a complete
-# npm with old vulnerable deps at /usr/lib/python3.*/site-packages/nodejs_wheel/.
-# Patch every copy of tar, glob, and brace-expansion inside that tree.
-RUN GLOBAL="$(npm root -g)" && \
-    [ -n "$GLOBAL" ] || { echo "ERROR: npm root -g returned empty; aborting"; exit 1; } && \
-    find /usr/lib -type d -name "tar" -path "*/node_modules/tar" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/tar" "$d"; \
-    done && \
-    find /usr/lib -type d -name "glob" -path "*/node_modules/glob" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/glob" "$d"; \
-    done && \
-    find /usr/lib -type d -name "brace-expansion" -path "*/node_modules/@isaacs/brace-expansion" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/@isaacs/brace-expansion" "$d"; \
-    done && \
-    find /usr/lib -type d -name "minimatch" -path "*/node_modules/minimatch" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/minimatch" "$d"; \
-    done && \
-    find /usr/lib -type d -name "diff" -path "*/node_modules/diff" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/diff" "$d"; \
-    done
-
-# Install semantic_router and aurelio-sdk using script
-# Convert Windows line endings to Unix and make executable
-RUN sed -i 's/\r$//' docker/install_auto_router.sh && chmod +x docker/install_auto_router.sh && ./docker/install_auto_router.sh
-
-# Generate prisma client using the correct schema
-RUN prisma generate --schema=./litellm/proxy/schema.prisma
-# Convert Windows line endings to Unix for entrypoint scripts
-RUN sed -i 's/\r$//' docker/entrypoint.sh && chmod +x docker/entrypoint.sh
-RUN sed -i 's/\r$//' docker/prod_entrypoint.sh && chmod +x docker/prod_entrypoint.sh
+RUN find /app/.venv -type f -path "*/tornado/test/*" -delete && \
+    find /app/.venv -type d -path "*/tornado/test" -delete

 EXPOSE 4000/tcp

-RUN apk add --no-cache supervisor
 COPY docker/supervisord.conf /etc/supervisord.conf

 ENTRYPOINT ["docker/prod_entrypoint.sh"]
-
-# Append "--detailed_debug" to the end of CMD to view detailed debug logs
 CMD ["--port", "4000"]
--- a/GEMINI.md
+++ b/GEMINI.md
@@ -22,11 +22,11 @@ This file provides guidance to Gemini when working with code in this repository.
 - `make lint-mypy` - Run MyPy type checking only

 ### Single Test Files
-- `poetry run pytest tests/path/to/test_file.py -v` - Run specific test file
-- `poetry run pytest tests/path/to/test_file.py::test_function -v` - Run specific test
+- `uv run pytest tests/path/to/test_file.py -v` - Run specific test file
+- `uv run pytest tests/path/to/test_file.py::test_function -v` - Run specific test

 ### Running Scripts
-- `poetry run python script.py` - Run Python scripts (use for non-test files)
+- `uv run python script.py` - Run Python scripts (use for non-test files)

 ### GitHub Issue & PR Templates
 When contributing to the project, use the appropriate templates:
@@ -105,4 +105,4 @@ LiteLLM is a unified interface for 100+ LLM providers with two main components:
 ### Enterprise Features
 - Enterprise-specific code in `enterprise/` directory
 - Optional features enabled via environment variables
-- Separate licensing and authentication for enterprise features
+- Separate licensing and authentication for enterprise features
--- a/88
+++ b/88
@@ -15,7 +15,7 @@ help:
 	@echo "  make install-proxy-dev  - Install proxy development dependencies"
 	@echo "  make install-dev-ci     - Install dev dependencies (CI-compatible, pins OpenAI)"
 	@echo "  make install-proxy-dev-ci - Install proxy dev dependencies (CI-compatible)"
-	@echo "  make install-test-deps  - Install test dependencies"
+	@echo "  make install-test-deps  - Install the full local test environment"
 	@echo "  make install-helm-unittest - Install helm unittest plugin"
 	@echo "  make format             - Apply Black code formatting"
 	@echo "  make format-check       - Check Black code formatting (matches CI)"
@@ -40,49 +40,44 @@ help:
 	@echo "  make test-integration   - Run integration tests"
 	@echo "  make test-unit-helm     - Run helm unit tests"

-# Keep PIP simple for edge cases:
-PIP := $(shell command -v pip > /dev/null 2>&1 && echo "pip" || echo "python3 -m pip")
+UV := uv
+UV_RUN := $(UV) run --no-sync

 # Show info
 info:
-	@echo "PIP: $(PIP)"
+	@echo "UV: $(UV)"

 # Installation targets
 install-dev:
-	poetry install --with dev
+	$(UV) sync --frozen

 install-proxy-dev:
-	poetry install --with dev,proxy-dev --extras proxy
+	$(UV) sync --frozen --group proxy-dev --extra proxy

 # CI-compatible installations (matches GitHub workflows exactly)
 install-dev-ci:
-	$(PIP) install openai==2.8.0
-	poetry install --with dev
-	$(PIP) install openai==2.8.0
+	$(UV) sync --frozen

 install-proxy-dev-ci:
-	poetry install --with dev,proxy-dev --extras proxy
-	$(PIP) install openai==2.8.0
+	$(UV) sync --frozen --group proxy-dev --extra proxy

 install-test-deps: install-proxy-dev
-	poetry run $(PIP) install "pytest-retry==1.6.3"
-	poetry run $(PIP) install pytest-xdist
-	poetry run $(PIP) install openapi-core
-	cd enterprise && poetry run $(PIP) install -e . && cd ..
+	$(UV) sync --frozen --all-groups --all-extras
+	$(UV_RUN) prisma generate --schema litellm/proxy/schema.prisma

 install-helm-unittest:
 	helm plugin install https://github.com/helm-unittest/helm-unittest --version v0.4.4 || echo "ignore error if plugin exists"

 # Formatting
 format: install-dev
-	cd litellm && poetry run black . && cd ..
+	cd litellm && $(UV_RUN) black . && cd ..

 format-check: install-dev
-	cd litellm && poetry run black --check . && cd ..
+	cd litellm && $(UV_RUN) black --check . && cd ..

 # Linting targets
 lint-ruff: install-dev
-	cd litellm && poetry run ruff check . && cd ..
+	cd litellm && $(UV_RUN) ruff check . && cd ..

 # faster linter for developing ...
 # inspiration from:
@@ -96,37 +91,36 @@ lint-format-changed: install-dev
 			$$start = $$1; $$count = $$2 || 1; $$end = $$start + $$count - 1; \
 			print "$$file:$$start:1-$$end:999\n"; \
 		}' | \
-	while read range; do \
-		file="$${range%%:*}"; \
-		lines="$${range#*:}"; \
-		echo "Formatting $$file (lines $$lines)"; \
-		poetry run ruff format --range "$$lines" "$$file"; \
-	done
+		while read range; do \
+			file="$${range%%:*}"; \
+			lines="$${range#*:}"; \
+			echo "Formatting $$file (lines $$lines)"; \
+			$(UV_RUN) ruff format --range "$$lines" "$$file"; \
+		done

 lint-ruff-dev: install-dev
 	@tmpfile=$$(mktemp /tmp/ruff-dev.XXXXXX) && \
 	cd litellm && \
-	(poetry run ruff check . --output-format=pylint || true) > "$$tmpfile" && \
-	poetry run diff-quality --violations=pylint "$$tmpfile" --compare-branch=origin/main && \
+	($(UV_RUN) ruff check . --output-format=pylint || true) > "$$tmpfile" && \
+	$(UV_RUN) diff-quality --violations=pylint "$$tmpfile" --compare-branch=origin/main && \
 	cd .. ; \
 	rm -f "$$tmpfile"

 lint-ruff-FULL-dev: install-dev
 	@files=$$(git diff --name-only origin/main -- '*.py'); \
-	if [ -n "$$files" ]; then echo "$$files" | xargs poetry run ruff check; \
+	if [ -n "$$files" ]; then echo "$$files" | xargs $(UV_RUN) ruff check; \
 	else echo "No changed .py files to check."; fi

 lint-mypy: install-dev
-	poetry run $(PIP) install types-requests types-setuptools types-redis types-PyYAML
-	cd litellm && poetry run mypy . --ignore-missing-imports && cd ..
+	cd litellm && $(UV_RUN) mypy . --ignore-missing-imports && cd ..

 lint-black: format-check

 check-circular-imports: install-dev
-	cd litellm && poetry run python ../tests/documentation_tests/test_circular_imports.py && cd ..
+	cd litellm && $(UV_RUN) python ../tests/documentation_tests/test_circular_imports.py && cd ..

 check-import-safety: install-dev
-	@poetry run python -c "from litellm import *; print('[from litellm import *] OK! no issues!');" || (echo '🚨 import failed, this means you introduced unprotected imports! 🚨'; exit 1)
+	@$(UV_RUN) python -c "from litellm import *; print('[from litellm import *] OK! no issues!');" || (echo '🚨 import failed, this means you introduced unprotected imports! 🚨'; exit 1)

 # Combined linting (matches test-linting.yml workflow)
 lint: format-check lint-ruff lint-mypy check-circular-imports check-import-safety
@@ -135,46 +129,46 @@ lint: format-check lint-ruff lint-mypy check-circular-imports check-import-safet
 lint-dev: lint-format-changed lint-mypy check-circular-imports check-import-safety

 # Testing targets
-test:
-	poetry run pytest tests/
+test: install-test-deps
+	$(UV_RUN) pytest tests/

 test-unit: install-test-deps
-	poetry run pytest tests/test_litellm -x -vv -n 4
+	$(UV_RUN) pytest tests/test_litellm -x -vv -n 4

 # Matrix test targets (matching CI workflow groups)
 test-unit-llms: install-test-deps
-	poetry run pytest tests/test_litellm/llms --tb=short -vv -n 4 --durations=20
+	$(UV_RUN) pytest tests/test_litellm/llms --tb=short -vv -n 4 --durations=20

 test-unit-proxy-guardrails: install-test-deps
-	poetry run pytest tests/test_litellm/proxy/guardrails tests/test_litellm/proxy/management_endpoints tests/test_litellm/proxy/management_helpers --tb=short -vv -n 4 --durations=20
+	$(UV_RUN) pytest tests/test_litellm/proxy/guardrails tests/test_litellm/proxy/management_endpoints tests/test_litellm/proxy/management_helpers --tb=short -vv -n 4 --durations=20

 test-unit-proxy-core: install-test-deps
-	poetry run pytest tests/test_litellm/proxy/auth tests/test_litellm/proxy/client tests/test_litellm/proxy/db tests/test_litellm/proxy/hooks tests/test_litellm/proxy/policy_engine --tb=short -vv -n 4 --durations=20
+	$(UV_RUN) pytest tests/test_litellm/proxy/auth tests/test_litellm/proxy/client tests/test_litellm/proxy/db tests/test_litellm/proxy/hooks tests/test_litellm/proxy/policy_engine --tb=short -vv -n 4 --durations=20

 test-unit-proxy-misc: install-test-deps
-	poetry run pytest tests/test_litellm/proxy/_experimental tests/test_litellm/proxy/agent_endpoints tests/test_litellm/proxy/anthropic_endpoints tests/test_litellm/proxy/common_utils tests/test_litellm/proxy/discovery_endpoints tests/test_litellm/proxy/experimental tests/test_litellm/proxy/google_endpoints tests/test_litellm/proxy/health_endpoints tests/test_litellm/proxy/image_endpoints tests/test_litellm/proxy/middleware tests/test_litellm/proxy/openai_files_endpoint tests/test_litellm/proxy/pass_through_endpoints tests/test_litellm/proxy/prompts tests/test_litellm/proxy/public_endpoints tests/test_litellm/proxy/response_api_endpoints tests/test_litellm/proxy/spend_tracking tests/test_litellm/proxy/ui_crud_endpoints tests/test_litellm/proxy/vector_store_endpoints tests/test_litellm/proxy/test_*.py --tb=short -vv -n 4 --durations=20
+	$(UV_RUN) pytest tests/test_litellm/proxy/_experimental tests/test_litellm/proxy/agent_endpoints tests/test_litellm/proxy/anthropic_endpoints tests/test_litellm/proxy/common_utils tests/test_litellm/proxy/discovery_endpoints tests/test_litellm/proxy/experimental tests/test_litellm/proxy/google_endpoints tests/test_litellm/proxy/health_endpoints tests/test_litellm/proxy/image_endpoints tests/test_litellm/proxy/middleware tests/test_litellm/proxy/openai_files_endpoint tests/test_litellm/proxy/pass_through_endpoints tests/test_litellm/proxy/prompts tests/test_litellm/proxy/public_endpoints tests/test_litellm/proxy/response_api_endpoints tests/test_litellm/proxy/spend_tracking tests/test_litellm/proxy/ui_crud_endpoints tests/test_litellm/proxy/vector_store_endpoints tests/test_litellm/proxy/test_*.py --tb=short -vv -n 4 --durations=20

 test-unit-integrations: install-test-deps
-	poetry run pytest tests/test_litellm/integrations --tb=short -vv -n 4 --durations=20
+	$(UV_RUN) pytest tests/test_litellm/integrations --tb=short -vv -n 4 --durations=20

 test-unit-core-utils: install-test-deps
-	poetry run pytest tests/test_litellm/litellm_core_utils --tb=short -vv -n 2 --durations=20
+	$(UV_RUN) pytest tests/test_litellm/litellm_core_utils --tb=short -vv -n 2 --durations=20

 test-unit-other: install-test-deps
-	poetry run pytest tests/test_litellm/caching tests/test_litellm/responses tests/test_litellm/secret_managers tests/test_litellm/vector_stores tests/test_litellm/a2a_protocol tests/test_litellm/anthropic_interface tests/test_litellm/completion_extras tests/test_litellm/containers tests/test_litellm/enterprise tests/test_litellm/experimental_mcp_client tests/test_litellm/google_genai tests/test_litellm/images tests/test_litellm/interactions tests/test_litellm/passthrough tests/test_litellm/router_strategy tests/test_litellm/router_utils tests/test_litellm/types --tb=short -vv -n 4 --durations=20
+	$(UV_RUN) pytest tests/test_litellm/caching tests/test_litellm/responses tests/test_litellm/secret_managers tests/test_litellm/vector_stores tests/test_litellm/a2a_protocol tests/test_litellm/anthropic_interface tests/test_litellm/completion_extras tests/test_litellm/containers tests/test_litellm/enterprise tests/test_litellm/experimental_mcp_client tests/test_litellm/google_genai tests/test_litellm/images tests/test_litellm/interactions tests/test_litellm/passthrough tests/test_litellm/router_strategy tests/test_litellm/router_utils tests/test_litellm/types --tb=short -vv -n 4 --durations=20

 test-unit-root: install-test-deps
-	poetry run pytest tests/test_litellm/test_*.py --tb=short -vv -n 4 --durations=20
+	$(UV_RUN) pytest tests/test_litellm/test_*.py --tb=short -vv -n 4 --durations=20

 # Proxy unit tests (tests/proxy_unit_tests split alphabetically)
 test-proxy-unit-a: install-test-deps
-	poetry run pytest tests/proxy_unit_tests/test_[a-o]*.py --tb=short -vv -n 2 --durations=20
+	$(UV_RUN) pytest tests/proxy_unit_tests/test_[a-o]*.py --tb=short -vv -n 2 --durations=20

 test-proxy-unit-b: install-test-deps
-	poetry run pytest tests/proxy_unit_tests/test_[p-z]*.py --tb=short -vv -n 2 --durations=20
+	$(UV_RUN) pytest tests/proxy_unit_tests/test_[p-z]*.py --tb=short -vv -n 2 --durations=20

-test-integration:
-	poetry run pytest tests/ -k "not test_litellm"
+test-integration: install-test-deps
+	$(UV_RUN) pytest tests/ -k "not test_litellm"

 test-unit-helm: install-helm-unittest
 	helm unittest -f 'tests/*.yaml' deploy/charts/litellm-helm
@@ -188,6 +182,6 @@ test-llm-translation-single: install-test-deps
 	@echo "Running single LLM translation test file..."
 	@if [ -z "$(FILE)" ]; then echo "Usage: make test-llm-translation-single FILE=test_filename.py"; exit 1; fi
 	@mkdir -p test-results
-	poetry run pytest tests/llm_translation/$(FILE) \
+	$(UV_RUN) pytest tests/llm_translation/$(FILE) \
 		--junitxml=test-results/junit.xml \
 		-v --tb=short --maxfail=100 --timeout=300
--- a/README.md
+++ b/README.md
@@ -50,7 +50,7 @@
 ### Python SDK

 ```shell
-pip install litellm
+uv add litellm
 ```

 ```python
@@ -72,7 +72,7 @@ response = completion(model="anthropic/claude-sonnet-4-20250514", messages=[{"ro
 [**Getting Started - E2E Tutorial**](https://docs.litellm.ai/docs/proxy/docker_quick_start) - Setup virtual keys, make your first request

 ```shell
-pip install 'litellm[proxy]'
+uv tool install 'litellm[proxy]'
 litellm --model gpt-4o
 ```

@@ -394,8 +394,8 @@ Support for more providers. Missing a provider or LLM Platform, raise a [feature
 ### Backend
 1. (In root) create virtual environment `python -m venv .venv`
 2. Activate virtual environment `source .venv/bin/activate`
-3. Install dependencies `pip install -e ".[all]"`
-4. `pip install prisma`
+3. Install dependencies `uv sync --all-extras --group proxy-dev`
+4. `uv run prisma generate`
 5. `prisma generate`
 6. Start proxy backend `python litellm/proxy/proxy_cli.py`

@@ -450,7 +450,7 @@ We welcome contributions to LiteLLM! Whether you're fixing bugs, adding features

 ## Quick Start for Contributors

-This requires poetry to be installed.
+This requires uv to be installed.

 ```bash
 git clone https://github.com/BerriAI/litellm.git
@@ -504,4 +504,3 @@ All these checks must pass before your PR can be merged.
 <a href="https://github.com/BerriAI/litellm/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=BerriAI/litellm" />
 </a>
-
--- a/docker/Dockerfile.alpine
+++ b/docker/Dockerfile.alpine
@@ -3,55 +3,66 @@ ARG LITELLM_BUILD_IMAGE=python:3.11-alpine@sha256:f07e2ace46f560f09a6eeec7b4913b

 # Runtime image
 ARG LITELLM_RUNTIME_IMAGE=python:3.11-alpine@sha256:f07e2ace46f560f09a6eeec7b4913b80ee99546e749ef82342a419a326620856
+ARG UV_IMAGE=ghcr.io/astral-sh/uv:0.10.9@sha256:10902f58a1606787602f303954cea099626a4adb02acbac4c69920fe9d278f82
+
+FROM $UV_IMAGE AS uvbin

-# Builder stage
 FROM $LITELLM_BUILD_IMAGE AS builder

-# Set the working directory to /app
 WORKDIR /app

-# Install build dependencies
-RUN apk add --no-cache gcc python3-dev musl-dev
+COPY --from=uvbin /uv /usr/local/bin/uv
+COPY --from=uvbin /uvx /usr/local/bin/uvx

-RUN pip install --upgrade pip==26.0.1 && \
-    pip install build==1.4.2
+RUN apk add --no-cache gcc python3-dev musl-dev nodejs npm libsndfile

-# Copy the current directory contents into the container at /app
+ENV PRISMA_BINARY_CACHE_DIR=/app/.cache/prisma-python/binaries \
+    UV_PROJECT_ENVIRONMENT=/app/.venv \
+    UV_LINK_MODE=copy \
+    XDG_CACHE_HOME=/app/.cache \
+    PATH="/app/.venv/bin:${PATH}"
+
+# Copy dependency metadata first for layer caching
+COPY pyproject.toml uv.lock ./
+COPY enterprise/pyproject.toml enterprise/
+COPY litellm-proxy-extras/pyproject.toml litellm-proxy-extras/
+
+# Install third-party dependencies (cached unless pyproject.toml/uv.lock change)
+RUN uv sync --frozen --no-install-project --no-install-workspace --no-default-groups --no-editable \
+    --extra proxy \
+    --extra proxy-runtime \
+    --extra extra_proxy \
+    --extra semantic-router \
+    --python python3
+
+# Copy full source tree
 COPY . .

-# Build the package
-RUN rm -rf dist/* && python -m build
+# Install project and workspace packages (fast - deps already cached)
+RUN uv sync --frozen --no-default-groups --no-editable \
+    --extra proxy \
+    --extra proxy-runtime \
+    --extra extra_proxy \
+    --extra semantic-router \
+    --python python3

-# There should be only one wheel file now, assume the build only creates one
-RUN ls -1 dist/*.whl | head -1
+RUN prisma generate --schema=./schema.prisma

-# Install the package
-RUN pip install dist/*.whl
+RUN sed -i 's/\r$//' docker/entrypoint.sh && chmod +x docker/entrypoint.sh && \
+    sed -i 's/\r$//' docker/prod_entrypoint.sh && chmod +x docker/prod_entrypoint.sh

-# install dependencies as wheels
-RUN pip wheel --no-cache-dir --wheel-dir=/wheels/ -r requirements.txt
-
-# Runtime stage
 FROM $LITELLM_RUNTIME_IMAGE AS runtime

-# Update dependencies and clean up, install libsndfile for audio processing
-RUN apk upgrade --no-cache && apk add --no-cache libsndfile
+RUN apk upgrade --no-cache && apk add --no-cache libsndfile nodejs npm

 WORKDIR /app
+ENV PRISMA_BINARY_CACHE_DIR=/app/.cache/prisma-python/binaries \
+    XDG_CACHE_HOME=/app/.cache \
+    PATH="/app/.venv/bin:${PATH}"

-# Copy the built wheel from the builder stage to the runtime stage; assumes only one wheel file is present
-COPY --from=builder /app/dist/*.whl .
-COPY --from=builder /wheels/ /wheels/
-
-# Install the built wheel using pip; again using a wildcard if it's the only file
-RUN pip install *.whl /wheels/* --no-index --find-links=/wheels/ --no-deps && rm -f *.whl && rm -rf /wheels
-
-# Convert Windows line endings to Unix for entrypoint scripts
-RUN sed -i 's/\r$//' docker/entrypoint.sh && chmod +x docker/entrypoint.sh
-RUN sed -i 's/\r$//' docker/prod_entrypoint.sh && chmod +x docker/prod_entrypoint.sh
+COPY --from=builder /app /app

 EXPOSE 4000/tcp

-# Set your entrypoint and command
 ENTRYPOINT ["docker/prod_entrypoint.sh"]
 CMD ["--port", "4000"]
--- a/docker/Dockerfile.database
+++ b/docker/Dockerfile.database
@@ -3,53 +3,72 @@ ARG LITELLM_BUILD_IMAGE=cgr.dev/chainguard/wolfi-base@sha256:a5a619c1793039dcf92

 # Runtime image
 ARG LITELLM_RUNTIME_IMAGE=cgr.dev/chainguard/wolfi-base@sha256:a5a619c1793039dcf92f02178f37c94bb3d6001403716da59d6092dfe8d9b502
-# Builder stage
+ARG UV_IMAGE=ghcr.io/astral-sh/uv:0.10.9@sha256:10902f58a1606787602f303954cea099626a4adb02acbac4c69920fe9d278f82
+
+FROM $UV_IMAGE AS uvbin
+
 FROM $LITELLM_BUILD_IMAGE AS builder

-# Set the working directory to /app
 WORKDIR /app
-
 USER root

-# Install build dependencies
+COPY --from=uvbin /uv /usr/local/bin/uv
+COPY --from=uvbin /uvx /usr/local/bin/uvx
+
 RUN apk add --no-cache \
    bash \
    gcc \
-    py3-pip \
    python3 \
    python3-dev \
    openssl \
-    openssl-dev
+    openssl-dev \
+    nodejs \
+    npm \
+    libsndfile

-RUN python -m pip install build==1.4.2
+ENV PRISMA_BINARY_CACHE_DIR=/app/.cache/prisma-python/binaries \
+    UV_PROJECT_ENVIRONMENT=/app/.venv \
+    UV_LINK_MODE=copy \
+    XDG_CACHE_HOME=/app/.cache \
+    PATH="/app/.venv/bin:${PATH}"

-# Copy the current directory contents into the container at /app
+# Copy dependency metadata first for layer caching
+COPY pyproject.toml uv.lock ./
+COPY enterprise/pyproject.toml enterprise/
+COPY litellm-proxy-extras/pyproject.toml litellm-proxy-extras/
+
+# Install third-party dependencies (cached unless pyproject.toml/uv.lock change)
+RUN uv sync --frozen --no-install-project --no-install-workspace --no-default-groups --no-editable \
+    --extra proxy \
+    --extra proxy-runtime \
+    --extra extra_proxy \
+    --extra semantic-router \
+    --python python3
+
+# Copy full source tree
 COPY . .

-# Build Admin UI
-# Convert Windows line endings to Unix and make executable
+# Build Admin UI before final sync
 RUN sed -i 's/\r$//' docker/build_admin_ui.sh && chmod +x docker/build_admin_ui.sh && ./docker/build_admin_ui.sh

-# Build the package
-RUN rm -rf dist/* && python -m build
+# Install project and workspace packages (fast - deps already cached)
+RUN uv sync --frozen --no-default-groups --no-editable \
+    --extra proxy \
+    --extra proxy-runtime \
+    --extra extra_proxy \
+    --extra semantic-router \
+    --python python3

-# There should be only one wheel file now, assume the build only creates one
-RUN ls -1 dist/*.whl | head -1
+RUN prisma generate --schema=./schema.prisma

-# Install the package
-RUN pip install dist/*.whl
+RUN sed -i 's/\r$//' docker/entrypoint.sh && chmod +x docker/entrypoint.sh && \
+    sed -i 's/\r$//' docker/prod_entrypoint.sh && chmod +x docker/prod_entrypoint.sh

-# install dependencies as wheels
-RUN pip wheel --no-cache-dir --wheel-dir=/wheels/ -r requirements.txt
-
-# Runtime stage
 FROM $LITELLM_RUNTIME_IMAGE AS runtime

-# Ensure runtime stage runs as root
 USER root

-# Install runtime dependencies
-RUN apk add --no-cache bash openssl tzdata nodejs npm python3 py3-pip libsndfile && \
+RUN apk add --no-cache bash openssl tzdata nodejs npm python3 libsndfile supervisor && \
    npm install -g npm@11.12.1 tar@7.5.11 glob@11.1.0 @isaacs/brace-expansion@5.0.1 minimatch@10.2.4 diff@8.0.3 && \
    GLOBAL="$(npm root -g)" && \
    find "$GLOBAL/npm" -type d -name "tar" -path "*/node_modules/tar" | while read d; do \
@@ -73,66 +92,18 @@ RUN apk add --no-cache bash openssl tzdata nodejs npm python3 py3-pip libsndfile
    { apk del --no-cache npm 2>/dev/null || true; }

 WORKDIR /app
-# Copy the current directory contents into the container at /app
-COPY . .
-RUN ls -la /app
+ENV PRISMA_BINARY_CACHE_DIR=/app/.cache/prisma-python/binaries \
+    XDG_CACHE_HOME=/app/.cache \
+    PATH="/app/.venv/bin:${PATH}"

-# Copy the built wheel from the builder stage to the runtime stage; assumes only one wheel file is present
-COPY --from=builder /app/dist/*.whl .
-COPY --from=builder /wheels/ /wheels/
+COPY --from=builder /app /app

-# Install the built wheel using pip; again using a wildcard if it's the only file
-RUN pip install *.whl /wheels/* --no-index --find-links=/wheels/ --no-deps && rm -f *.whl && rm -rf /wheels
+RUN find /app/.venv -type d -path "*/tornado/test" -prune \
+    -exec rm -rf {} \;

-# SECURITY FIX: nodejs-wheel-binaries (pip package used by Prisma) bundles a complete
-# npm with old vulnerable deps at /usr/lib/python3.*/site-packages/nodejs_wheel/.
-# Patch every copy of tar, glob, and brace-expansion inside that tree.
-RUN GLOBAL="$(npm root -g)" && \
-    [ -n "$GLOBAL" ] || { echo "ERROR: npm root -g returned empty; aborting"; exit 1; } && \
-    find /usr/lib -type d -name "tar" -path "*/node_modules/tar" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/tar" "$d"; \
-    done && \
-    find /usr/lib -type d -name "glob" -path "*/node_modules/glob" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/glob" "$d"; \
-    done && \
-    find /usr/lib -type d -name "brace-expansion" -path "*/node_modules/@isaacs/brace-expansion" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/@isaacs/brace-expansion" "$d"; \
-    done && \
-    find /usr/lib -type d -name "minimatch" -path "*/node_modules/minimatch" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/minimatch" "$d"; \
-    done && \
-    find /usr/lib -type d -name "diff" -path "*/node_modules/diff" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/diff" "$d"; \
-    done
-
-# Install semantic_router and aurelio-sdk using script
-# Convert Windows line endings to Unix and make executable
-RUN sed -i 's/\r$//' docker/install_auto_router.sh && chmod +x docker/install_auto_router.sh && ./docker/install_auto_router.sh
-
-# ensure pyjwt is used, not jwt
-RUN pip uninstall jwt -y
-RUN pip uninstall PyJWT -y
-RUN pip install PyJWT==2.12.0 --no-cache-dir
-
-# Build Admin UI (runtime stage)
-# Convert Windows line endings to Unix and make executable
-RUN sed -i 's/\r$//' docker/build_admin_ui.sh && chmod +x docker/build_admin_ui.sh && ./docker/build_admin_ui.sh
-
-# Generate prisma client
-RUN prisma generate
-# Convert Windows line endings to Unix for entrypoint scripts
-RUN sed -i 's/\r$//' docker/entrypoint.sh && chmod +x docker/entrypoint.sh
-RUN sed -i 's/\r$//' docker/prod_entrypoint.sh && chmod +x docker/prod_entrypoint.sh
 EXPOSE 4000/tcp

-RUN apk add --no-cache supervisor
 COPY docker/supervisord.conf /etc/supervisord.conf

-# # Set your entrypoint and command
-
-
 ENTRYPOINT ["docker/prod_entrypoint.sh"]
-
-# Append "--detailed_debug" to the end of CMD to view detailed debug logs 
-# CMD ["--port", "4000", "--detailed_debug"]
 CMD ["--port", "4000"]
--- a/docker/Dockerfile.dev
+++ b/docker/Dockerfile.dev
@@ -3,60 +3,70 @@ ARG LITELLM_BUILD_IMAGE=python:3.13-slim@sha256:739e7213785e88c0f702dcdc12c0973a

 # Runtime image
 ARG LITELLM_RUNTIME_IMAGE=python:3.13-slim@sha256:739e7213785e88c0f702dcdc12c0973afcbd606dbf021a589cab77d6b00b579d
+ARG UV_IMAGE=ghcr.io/astral-sh/uv:0.10.9@sha256:10902f58a1606787602f303954cea099626a4adb02acbac4c69920fe9d278f82
+
+FROM $UV_IMAGE AS uvbin

-# Builder stage
 FROM $LITELLM_BUILD_IMAGE AS builder

-# Set the working directory to /app
 WORKDIR /app
-
 USER root

-# Install build dependencies in one layer
+COPY --from=uvbin /uv /usr/local/bin/uv
+COPY --from=uvbin /uvx /usr/local/bin/uvx
+
 RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    g++ \
    python3-dev \
    libssl-dev \
    pkg-config \
-    && rm -rf /var/lib/apt/lists/* \
-    && pip install --upgrade pip==26.0.1 build==1.4.2
+    nodejs \
+    npm \
+    && rm -rf /var/lib/apt/lists/*

-# Copy requirements first for better layer caching
-COPY requirements.txt .
+ENV PRISMA_BINARY_CACHE_DIR=/app/.cache/prisma-python/binaries \
+    UV_PROJECT_ENVIRONMENT=/app/.venv \
+    UV_LINK_MODE=copy \
+    XDG_CACHE_HOME=/app/.cache \
+    PATH="/app/.venv/bin:${PATH}"

-# Install Python dependencies with cache mount for faster rebuilds
-RUN --mount=type=cache,target=/root/.cache/pip \
-    pip wheel --no-cache-dir --wheel-dir=/wheels/ -r requirements.txt
+# Copy dependency metadata first for layer caching
+COPY pyproject.toml uv.lock ./
+COPY enterprise/pyproject.toml enterprise/
+COPY litellm-proxy-extras/pyproject.toml litellm-proxy-extras/

-# Fix JWT dependency conflicts early
-RUN pip uninstall jwt -y || true && \
-    pip uninstall PyJWT -y || true && \
-    pip install PyJWT==2.12.0 --no-cache-dir
+# Install third-party dependencies (cached unless pyproject.toml/uv.lock change)
+RUN uv sync --frozen --no-install-project --no-install-workspace --no-default-groups --no-editable \
+    --extra proxy \
+    --extra proxy-runtime \
+    --extra extra_proxy \
+    --extra semantic-router \
+    --python python

-# Copy only necessary files for build
-COPY pyproject.toml README.md schema.prisma poetry.lock ./
-COPY litellm/ ./litellm/
-COPY enterprise/ ./enterprise/
-COPY docker/ ./docker/
+# Copy full source tree
+COPY . .

-# Build Admin UI once
-# Convert Windows line endings to Unix and make executable
+# Build Admin UI before final sync
 RUN sed -i 's/\r$//' docker/build_admin_ui.sh && chmod +x docker/build_admin_ui.sh && ./docker/build_admin_ui.sh

-# Build the package
-RUN rm -rf dist/* && python -m build
+# Install project and workspace packages (fast - deps already cached)
+RUN uv sync --frozen --no-default-groups --no-editable \
+    --extra proxy \
+    --extra proxy-runtime \
+    --extra extra_proxy \
+    --extra semantic-router \
+    --python python

-# Install the built package
-RUN pip install dist/*.whl
+RUN prisma generate --schema=./schema.prisma
+
+RUN sed -i 's/\r$//' docker/entrypoint.sh && chmod +x docker/entrypoint.sh && \
+    sed -i 's/\r$//' docker/prod_entrypoint.sh && chmod +x docker/prod_entrypoint.sh

-# Runtime stage
 FROM $LITELLM_RUNTIME_IMAGE AS runtime

-# Ensure runtime stage runs as root
 USER root

-# Install only runtime dependencies
 RUN apt-get update && apt-get upgrade -y \
        libxml2 \
        libexpat1 \
@@ -72,9 +82,9 @@ RUN apt-get update && apt-get upgrade -y \
        libc6 \
    && apt-get install -y --no-install-recommends \
        libssl3 \
-    libatomic1 \
-    nodejs \
-    npm \
+        libatomic1 \
+        nodejs \
+        npm \
    && rm -rf /var/lib/apt/lists/* \
    && npm install -g npm@11.12.1 tar@7.5.11 glob@11.1.0 @isaacs/brace-expansion@5.0.1 minimatch@10.2.4 diff@8.0.3 \
    && GLOBAL="$(npm root -g)" \
@@ -99,53 +109,13 @@ RUN apt-get update && apt-get upgrade -y \
    && apt-get purge -y npm

 WORKDIR /app
+ENV PRISMA_BINARY_CACHE_DIR=/app/.cache/prisma-python/binaries \
+    XDG_CACHE_HOME=/app/.cache \
+    PATH="/app/.venv/bin:${PATH}"

-# Copy only necessary runtime files
-COPY docker/entrypoint.sh docker/prod_entrypoint.sh ./docker/
-COPY litellm/ ./litellm/
-COPY pyproject.toml README.md schema.prisma poetry.lock ./
-
-# Copy pre-built wheels and install everything at once
-COPY --from=builder /wheels/ /wheels/
-COPY --from=builder /app/dist/*.whl .
-
-# Install all dependencies in one step with no-cache for smaller image
-RUN pip install --no-cache-dir *.whl /wheels/* --no-index --find-links=/wheels/ --no-deps && \
-    rm -f *.whl && \
-    rm -rf /wheels
-
-# SECURITY FIX: nodejs-wheel-binaries (pip package used by Prisma) bundles a complete
-# npm with old vulnerable deps at /usr/lib/python3.*/site-packages/nodejs_wheel/.
-# Patch every copy of tar, glob, and brace-expansion inside that tree.
-RUN GLOBAL="$(npm root -g)" && \
-    [ -n "$GLOBAL" ] || { echo "ERROR: npm root -g returned empty; aborting"; exit 1; } && \
-    find /usr/lib -type d -name "tar" -path "*/node_modules/tar" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/tar" "$d"; \
-    done && \
-    find /usr/lib -type d -name "glob" -path "*/node_modules/glob" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/glob" "$d"; \
-    done && \
-    find /usr/lib -type d -name "brace-expansion" -path "*/node_modules/@isaacs/brace-expansion" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/@isaacs/brace-expansion" "$d"; \
-    done && \
-    find /usr/lib -type d -name "minimatch" -path "*/node_modules/minimatch" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/minimatch" "$d"; \
-    done && \
-    find /usr/lib -type d -name "diff" -path "*/node_modules/diff" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/diff" "$d"; \
-    done
-
-# Generate prisma client and set permissions
-# Convert Windows line endings to Unix for entrypoint scripts
-RUN prisma generate && \
-    sed -i 's/\r$//' docker/entrypoint.sh && \
-    sed -i 's/\r$//' docker/prod_entrypoint.sh && \
-    chmod +x docker/entrypoint.sh && \
-    chmod +x docker/prod_entrypoint.sh
+COPY --from=builder /app /app

 EXPOSE 4000/tcp

 ENTRYPOINT ["docker/prod_entrypoint.sh"]
-
-# Append "--detailed_debug" to the end of CMD to view detailed debug logs 
-CMD ["--port", "4000"]
+CMD ["--port", "4000"]
--- a/docker/Dockerfile.health_check
+++ b/docker/Dockerfile.health_check
@@ -1,16 +1,22 @@
+ARG UV_IMAGE=ghcr.io/astral-sh/uv:0.10.9@sha256:10902f58a1606787602f303954cea099626a4adb02acbac4c69920fe9d278f82
+FROM $UV_IMAGE AS uvbin
+
 FROM python:3.13-slim@sha256:739e7213785e88c0f702dcdc12c0973afcbd606dbf021a589cab77d6b00b579d

 WORKDIR /app

-# Copy health check script and requirements
+# Copy the uv binary, the project metadata, and the health check script.
+COPY --from=uvbin /uv /usr/local/bin/uv
+COPY pyproject.toml uv.lock /app/
 COPY scripts/health_check/health_check_client.py /app/health_check_client.py
-COPY scripts/health_check/health_check_requirements.txt /app/requirements.txt

-# Install dependencies
-RUN pip install --no-cache-dir -r requirements.txt
-
-# Make script executable
-RUN chmod +x /app/health_check_client.py
+# Resolve and install the health-check dependencies from the project lockfile
+# so the runtime image stays self-contained and reproducible.
+RUN uv export --frozen --no-default-groups --only-group healthcheck --no-emit-project --no-hashes --output-file /tmp/health-check-requirements.txt \
+  && uv pip install --system -r /tmp/health-check-requirements.txt \
+  && rm /tmp/health-check-requirements.txt \
+  && rm /app/pyproject.toml /app/uv.lock \
+  && chmod +x /app/health_check_client.py

 # Run as non-root user
 RUN adduser --disabled-password --gecos "" --uid 1001 healthcheck
--- a/docker/Dockerfile.non_root
+++ b/docker/Dockerfile.non_root
@@ -2,51 +2,84 @@
 ARG LITELLM_BUILD_IMAGE=cgr.dev/chainguard/wolfi-base@sha256:a5a619c1793039dcf92f02178f37c94bb3d6001403716da59d6092dfe8d9b502
 ARG LITELLM_RUNTIME_IMAGE=cgr.dev/chainguard/wolfi-base@sha256:a5a619c1793039dcf92f02178f37c94bb3d6001403716da59d6092dfe8d9b502
 ARG PROXY_EXTRAS_SOURCE=published
+ARG UV_IMAGE=ghcr.io/astral-sh/uv:0.10.9@sha256:10902f58a1606787602f303954cea099626a4adb02acbac4c69920fe9d278f82
+
+FROM $UV_IMAGE AS uvbin

-# -----------------
-# Builder Stage
-# -----------------
 FROM $LITELLM_BUILD_IMAGE AS builder
 ARG PROXY_EXTRAS_SOURCE
 WORKDIR /app
 USER root

-# Install build dependencies with retry logic (includes node for UI build)
+COPY --from=uvbin /uv /usr/local/bin/uv
+COPY --from=uvbin /uvx /usr/local/bin/uvx
+
 RUN for i in 1 2 3; do \
    apk add --no-cache \
-    python3 \
-    python3-dev \
-    py3-pip \
-    clang \
-    llvm \
-    lld \
-    gcc \
-    linux-headers \
-    build-base \
-    bash \
-    nodejs \
-    npm && break || sleep 5; \
-    done \
-  && pip install --no-cache-dir --upgrade pip==26.0.1 build==1.4.2
+      python3 \
+      python3-dev \
+      clang \
+      llvm \
+      lld \
+      gcc \
+      linux-headers \
+      build-base \
+      bash \
+      coreutils \
+      curl \
+      openssl \
+      openssl-dev \
+      nodejs \
+      npm \
+      libsndfile && break || sleep 5; \
+    done

-# Cache Python dependencies
-COPY requirements.txt .
-RUN pip wheel --no-cache-dir --wheel-dir=/wheels/ -r requirements.txt \
-  && pip wheel --no-cache-dir --wheel-dir=/wheels/ "semantic_router==0.1.11" "aurelio-sdk==0.0.19" "PyJWT==2.12.0"
+ENV UV_PROJECT_ENVIRONMENT=/app/.venv \
+    UV_LINK_MODE=copy \
+    NVM_DIR=/root/.nvm \
+    PATH="/root/.nvm/versions/node/v20.20.2/bin:/app/.venv/bin:${PATH}" \
+    LITELLM_NON_ROOT=true \
+    PRISMA_BINARY_CACHE_DIR=/app/.cache/prisma-python/binaries \
+    PRISMA_CLI_BINARY_TARGETS="debian-openssl-3.0.x" \
+    XDG_CACHE_HOME=/app/.cache

-# Copy source after dependency layers
+# Copy dependency metadata first for layer caching
+COPY pyproject.toml uv.lock ./
+COPY enterprise/pyproject.toml enterprise/
+COPY litellm-proxy-extras/pyproject.toml litellm-proxy-extras/
+
+# Install third-party dependencies (cached unless pyproject.toml/uv.lock change)
+RUN uv sync --frozen --no-install-project --no-install-workspace --no-default-groups --no-editable \
+    --extra proxy \
+    --extra proxy-runtime \
+    --extra extra_proxy \
+    --extra semantic-router \
+    --python python3
+
+# Copy full source tree
 COPY . .

 # Set non-root flag for build time consistency
 ENV LITELLM_NON_ROOT=true

-# Build Admin UI using the upstream command order while keeping a single RUN layer
+# Build Admin UI once and stage the static output for the runtime image.
 # NOTE: .npmrc files (which may set ignore-scripts=true and min-release-age=3d)
 # are temporarily renamed during npm install/ci so they don't block lifecycle
 # scripts needed by the build. This is safe because npm ci installs from
 # package-lock.json with pinned versions + integrity hashes.
-RUN mkdir -p /var/lib/litellm/ui && \
+RUN mkdir -p /var/lib/litellm/ui /var/lib/litellm/assets && \
    ([ -f /app/.npmrc ] && mv /app/.npmrc /app/.npmrc.bak || true) && \
+    NVM_VERSION="v0.40.4" && \
+    NVM_CHECKSUM="4b7412c49960c7d31e8df72da90c1fb5b8cccb419ac99537b737028d497aba4f" && \
+    NODE_VERSION="v20.20.2" && \
+    NVM_SCRIPT="/tmp/install-nvm.sh" && \
+    curl -fsSL "https://raw.githubusercontent.com/nvm-sh/nvm/${NVM_VERSION}/install.sh" -o "$NVM_SCRIPT" && \
+    echo "${NVM_CHECKSUM}  ${NVM_SCRIPT}" | sha256sum -c - && \
+    bash "$NVM_SCRIPT" && \
+    export NVM_DIR="$HOME/.nvm" && \
+    . "$NVM_DIR/nvm.sh" && \
+    nvm install "${NODE_VERSION}" && \
+    nvm use "${NODE_VERSION}" && \
    npm install -g npm@11.12.1 && \
    npm install -g node-gyp@12.2.0 && \
    ln -sf "$(npm root -g)/node-gyp" "$(npm root -g)/npm/node_modules/node-gyp" && \
@@ -56,12 +89,11 @@ RUN mkdir -p /var/lib/litellm/ui && \
      cp /app/enterprise/enterprise_ui/enterprise_colors.json ./ui_colors.json; \
    fi && \
    ([ -f .npmrc ] && mv .npmrc .npmrc.bak || true) && \
-    npm ci && \
+    npm ci --no-audit --no-fund && \
    ([ -f .npmrc.bak ] && mv .npmrc.bak .npmrc || true) && \
    ([ -f /app/.npmrc.bak ] && mv /app/.npmrc.bak /app/.npmrc || true) && \
    npm run build && \
    cp -r /app/ui/litellm-dashboard/out/* /var/lib/litellm/ui/ && \
-    mkdir -p /var/lib/litellm/assets && \
    cp /app/litellm/proxy/logo.jpg /var/lib/litellm/assets/logo.jpg && \
    ( cd /var/lib/litellm/ui && \
      for html_file in *.html; do \
@@ -74,175 +106,106 @@ RUN mkdir -p /var/lib/litellm/ui && \
      touch .litellm_ui_ready ) && \
    cd /app/ui/litellm-dashboard && rm -rf ./out

-# Build litellm wheel and place it in wheels dir (replace any PyPI wheels)
-RUN rm -rf dist/* && python -m build && \
-  rm -f /wheels/litellm-*.whl && \
-  cp dist/*.whl /wheels/
-
-# Optionally build local litellm-proxy-extras wheel
-RUN if [ "$PROXY_EXTRAS_SOURCE" = "local" ]; then \
-      cd /app/litellm-proxy-extras && rm -rf dist && python -m build && \
-      cp dist/*.whl /wheels/; \
+RUN if [ "$PROXY_EXTRAS_SOURCE" = "published" ]; then \
+      uv sync --frozen --no-default-groups --no-editable \
+        --extra proxy \
+        --extra proxy-runtime \
+        --extra extra_proxy \
+        --extra semantic-router \
+        --python python3 \
+        --no-sources-package litellm-proxy-extras; \
+    else \
+      uv sync --frozen --no-default-groups --no-editable \
+        --extra proxy \
+        --extra proxy-runtime \
+        --extra extra_proxy \
+        --extra semantic-router \
+        --python python3; \
    fi

-# Pre-cache Prisma binaries in the builder stage
-ENV PRISMA_BINARY_CACHE_DIR=/app/.cache/prisma-python/binaries \
-    PRISMA_CLI_BINARY_TARGETS="debian-openssl-3.0.x" \
-    XDG_CACHE_HOME=/app/.cache \
-    PATH="/usr/lib/python3.13/site-packages/nodejs/bin:${PATH}"
-
-RUN pip install --no-cache-dir prisma==0.11.0 nodejs-wheel-binaries==24.13.1 \
-    && mkdir -p /app/.cache/npm
-
-RUN NPM_CONFIG_CACHE=/app/.cache/npm \
-    python -c "import prisma.cli.prisma as p; p.ensure_cached()"
-
-RUN prisma generate && \
+RUN mkdir -p /app/.cache/npm && \
+    prisma generate --schema=./schema.prisma && \
    prisma --version && \
    prisma migrate diff --from-empty --to-schema-datamodel ./schema.prisma --script > /dev/null 2>&1 || true

-# -----------------
-# Runtime Stage
-# -----------------
+RUN sed -i 's/\r$//' docker/entrypoint.sh && chmod +x docker/entrypoint.sh && \
+    sed -i 's/\r$//' docker/prod_entrypoint.sh && chmod +x docker/prod_entrypoint.sh
+
 FROM $LITELLM_RUNTIME_IMAGE AS runtime
 ARG PROXY_EXTRAS_SOURCE
 WORKDIR /app
 USER root

-# Install runtime dependencies with retry
 RUN for i in 1 2 3; do \
    apk upgrade --no-cache && break || sleep 5; \
-    done \
-  && for i in 1 2 3; do \
-    apk add --no-cache python3 py3-pip bash openssl tzdata nodejs npm supervisor && break || sleep 5; \
-    done \
-  && apk upgrade --no-cache nodejs \
-  && npm install -g npm@11.12.1 tar@7.5.11 glob@11.1.0 @isaacs/brace-expansion@5.0.1 minimatch@10.2.4 diff@8.0.3 \
-  && GLOBAL="$(npm root -g)" \
-  && find "$GLOBAL/npm" -type d -name "tar" -path "*/node_modules/tar" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/tar" "$d"; \
-     done \
-  && find "$GLOBAL/npm" -type d -name "glob" -path "*/node_modules/glob" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/glob" "$d"; \
-     done \
-  && find "$GLOBAL/npm" -type d -name "brace-expansion" -path "*/node_modules/@isaacs/brace-expansion" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/@isaacs/brace-expansion" "$d"; \
-     done \
-  && find "$GLOBAL/npm" -type d -name "minimatch" -path "*/node_modules/minimatch" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/minimatch" "$d"; \
-     done \
-  && find "$GLOBAL/npm" -type d -name "diff" -path "*/node_modules/diff" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/diff" "$d"; \
-     done \
-  && find /usr/local/lib /usr/lib -path "*/node_modules/npm/package.json" -exec \
-        sed -i 's/"tar": "\^7\.5\.[0-9]*"/"tar": "^7.5.10"/g; s/"minimatch": "\^10\.[0-9.]*"/"minimatch": "^10.2.4"/g' {} + 2>/dev/null \
-  && npm cache clean --force \
-  && { apk del --no-cache npm 2>/dev/null || true; }
+    done && \
+    for i in 1 2 3; do \
+      apk add --no-cache python3 bash openssl tzdata nodejs npm supervisor libsndfile && break || sleep 5; \
+    done && \
+    apk upgrade --no-cache nodejs && \
+    npm install -g npm@11.12.1 tar@7.5.11 glob@11.1.0 @isaacs/brace-expansion@5.0.1 minimatch@10.2.4 diff@8.0.3 && \
+    GLOBAL="$(npm root -g)" && \
+    find "$GLOBAL/npm" -type d -name "tar" -path "*/node_modules/tar" | while read d; do \
+      rm -rf "$d" && cp -rL "$GLOBAL/tar" "$d"; \
+    done && \
+    find "$GLOBAL/npm" -type d -name "glob" -path "*/node_modules/glob" | while read d; do \
+      rm -rf "$d" && cp -rL "$GLOBAL/glob" "$d"; \
+    done && \
+    find "$GLOBAL/npm" -type d -name "brace-expansion" -path "*/node_modules/@isaacs/brace-expansion" | while read d; do \
+      rm -rf "$d" && cp -rL "$GLOBAL/@isaacs/brace-expansion" "$d"; \
+    done && \
+    find "$GLOBAL/npm" -type d -name "minimatch" -path "*/node_modules/minimatch" | while read d; do \
+      rm -rf "$d" && cp -rL "$GLOBAL/minimatch" "$d"; \
+    done && \
+    find "$GLOBAL/npm" -type d -name "diff" -path "*/node_modules/diff" | while read d; do \
+      rm -rf "$d" && cp -rL "$GLOBAL/diff" "$d"; \
+    done && \
+    find /usr/local/lib /usr/lib -path "*/node_modules/npm/package.json" -exec \
+      sed -i 's/"tar": "\^7\.5\.[0-9]*"/"tar": "^7.5.10"/g; s/"minimatch": "\^10\.[0-9.]*"/"minimatch": "^10.2.4"/g' {} + 2>/dev/null && \
+    npm cache clean --force && \
+    { apk del --no-cache npm 2>/dev/null || true; }

-# Copy artifacts from builder
-COPY --from=builder /app/requirements.txt /app/requirements.txt
-COPY --from=builder /app/docker/entrypoint.sh /app/docker/prod_entrypoint.sh /app/docker/
-COPY --from=builder /app/docker/supervisord.conf /etc/supervisord.conf
-COPY --from=builder /app/schema.prisma /app/
-# Keep enterprise bridge module in runtime so `enterprise.enterprise_hooks`
-# can load and register managed enterprise hooks (e.g. managed_files).
-COPY --from=builder /app/enterprise /app/enterprise
-# Copy prisma_migration.py for Helm migrations job compatibility
-COPY --from=builder /app/litellm/proxy/prisma_migration.py /app/litellm/proxy/prisma_migration.py
-COPY --from=builder /wheels/ /wheels/
+COPY --from=builder /app /app
 COPY --from=builder /var/lib/litellm/ui /var/lib/litellm/ui
 COPY --from=builder /var/lib/litellm/assets /var/lib/litellm/assets
-COPY --from=builder /app/.cache /app/.cache
-COPY --from=builder /app/litellm-proxy-extras /app/litellm-proxy-extras
-COPY --from=builder \
-  /usr/lib/python3.13/site-packages/nodejs* \
-  /usr/lib/python3.13/site-packages/prisma* \
-  /usr/lib/python3.13/site-packages/tomlkit* \
-  /usr/lib/python3.13/site-packages/nodeenv* \
-  /usr/lib/python3.13/site-packages/
-COPY --from=builder /usr/bin/prisma /usr/bin/prisma
+COPY --from=builder /app/docker/supervisord.conf /etc/supervisord.conf

-# Final runtime environment configuration
-ENV PRISMA_BINARY_CACHE_DIR=/app/.cache/prisma-python/binaries \
+ENV PATH="/app/.venv/bin:${PATH}" \
+    PRISMA_BINARY_CACHE_DIR=/app/.cache/prisma-python/binaries \
    PRISMA_CLI_BINARY_TARGETS="debian-openssl-3.0.x" \
    HOME=/app \
    LITELLM_NON_ROOT=true \
-    XDG_CACHE_HOME=/app/.cache
-
-# Install packages from wheels and optional extras without network
-RUN pip install --no-index --find-links=/wheels/ -r requirements.txt && \
-    pip install --no-index --find-links=/wheels/ /wheels/litellm-*-py3-none-any.whl && \
-    pip install --no-index --find-links=/wheels/ --no-deps semantic_router==0.1.11 && \
-    pip install --no-index --find-links=/wheels/ aurelio-sdk==0.0.19 && \
-    if [ "$PROXY_EXTRAS_SOURCE" = "local" ]; then \
-      if ls /wheels/litellm_proxy_extras-*.whl >/dev/null 2>&1; then \
-        pip install --no-index --find-links=/wheels/ /wheels/litellm_proxy_extras-*.whl; \
-      else \
-        echo "litellm_proxy_extras wheel not found; skipping local install"; \
-      fi; \
-    fi
-
-# SECURITY FIX: nodejs-wheel-binaries (pip package used by Prisma) bundles a complete
-# npm with old vulnerable deps at /usr/lib/python3.*/site-packages/nodejs_wheel/.
-# Patch every copy of tar, glob, and brace-expansion inside that tree.
-RUN GLOBAL="$(npm root -g)" && \
-    [ -n "$GLOBAL" ] || { echo "ERROR: npm root -g returned empty; aborting"; exit 1; } && \
-    find /usr/lib -type d -name "tar" -path "*/node_modules/tar" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/tar" "$d"; \
-    done && \
-    find /usr/lib -type d -name "glob" -path "*/node_modules/glob" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/glob" "$d"; \
-    done && \
-    find /usr/lib -type d -name "brace-expansion" -path "*/node_modules/@isaacs/brace-expansion" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/@isaacs/brace-expansion" "$d"; \
-    done && \
-    find /usr/lib -type d -name "minimatch" -path "*/node_modules/minimatch" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/minimatch" "$d"; \
-    done && \
-    find /usr/lib -type d -name "diff" -path "*/node_modules/diff" | while read d; do \
-        rm -rf "$d" && cp -rL "$GLOBAL/diff" "$d"; \
-    done
-
-# Permissions, cleanup, and Prisma prep
-# Convert Windows line endings to Unix for entrypoint scripts
-RUN sed -i 's/\r$//' docker/entrypoint.sh && \
-    sed -i 's/\r$//' docker/prod_entrypoint.sh && \
-    chmod +x docker/entrypoint.sh docker/prod_entrypoint.sh && \
-    mkdir -p /nonexistent /.npm /var/lib/litellm/assets /var/lib/litellm/ui && \
-    chown -R nobody:nogroup /app /var/lib/litellm/ui /var/lib/litellm/assets /nonexistent /.npm && \
-    pip uninstall jwt -y || true && \
-    pip uninstall PyJWT -y || true && \
-    pip install --no-index --find-links=/wheels/ PyJWT==2.12.0 --no-cache-dir && \
-    rm -rf /wheels && \
-    PRISMA_PATH=$(python -c "import os, prisma; print(os.path.dirname(prisma.__file__))") && \
-    chown -R nobody:nogroup $PRISMA_PATH && \
-    LITELLM_PKG_MIGRATIONS_PATH="$(python -c 'import os, litellm_proxy_extras; print(os.path.dirname(litellm_proxy_extras.__file__))' 2>/dev/null || echo '')/migrations" && \
-    [ -n "$LITELLM_PKG_MIGRATIONS_PATH" ] && chown -R nobody:nogroup $LITELLM_PKG_MIGRATIONS_PATH && \
-    LITELLM_PROXY_EXTRAS_PATH=$(python -c "import os, litellm_proxy_extras; print(os.path.dirname(litellm_proxy_extras.__file__))" 2>/dev/null || echo "") && \
-    chgrp -R 0 $PRISMA_PATH /var/lib/litellm/ui /var/lib/litellm/assets && \
-    [ -n "$LITELLM_PROXY_EXTRAS_PATH" ] && chgrp -R 0 $LITELLM_PROXY_EXTRAS_PATH || true && \
-    chmod -R g=u $PRISMA_PATH /var/lib/litellm/ui /var/lib/litellm/assets && \
-    [ -n "$LITELLM_PROXY_EXTRAS_PATH" ] && chmod -R g=u $LITELLM_PROXY_EXTRAS_PATH || true && \
-    chmod -R g+w $PRISMA_PATH /var/lib/litellm/ui /var/lib/litellm/assets && \
-    [ -n "$LITELLM_PROXY_EXTRAS_PATH" ] && chmod -R g+w $LITELLM_PROXY_EXTRAS_PATH || true && \
-    chmod -R g+rX $PRISMA_PATH && \
-    chmod -R g+rX /app/.cache && \
-    mkdir -p /tmp/.npm /nonexistent /.npm
-
-# Switch to non-root user for runtime
-USER nobody
-
-# Generate Prisma client as nobody user to ensure correct file ownership
-RUN prisma generate
-
-# Prisma runtime knobs for offline containers
-ENV PRISMA_SKIP_POSTINSTALL_GENERATE=1 \
+    XDG_CACHE_HOME=/app/.cache \
+    PRISMA_SKIP_POSTINSTALL_GENERATE=1 \
    PRISMA_HIDE_UPDATE_MESSAGE=1 \
    PRISMA_ENGINES_CHECKSUM_IGNORE_MISSING=1 \
    NPM_CONFIG_CACHE=/app/.cache/npm \
    NPM_CONFIG_PREFER_OFFLINE=true \
    PRISMA_OFFLINE_MODE=true

+RUN sed -i 's/\r$//' docker/entrypoint.sh && \
+    sed -i 's/\r$//' docker/prod_entrypoint.sh && \
+    chmod +x docker/entrypoint.sh docker/prod_entrypoint.sh && \
+    mkdir -p /nonexistent /.npm /var/lib/litellm/assets /var/lib/litellm/ui /tmp/.npm && \
+    chown -R nobody:nogroup /app /var/lib/litellm/ui /var/lib/litellm/assets /nonexistent /.npm /tmp/.npm && \
+    PRISMA_PATH=$(python -c "import os, prisma; print(os.path.dirname(prisma.__file__))") && \
+    chown -R nobody:nogroup "$PRISMA_PATH" && \
+    LITELLM_PKG_MIGRATIONS_PATH="$(python -c 'import os, litellm_proxy_extras; print(os.path.dirname(litellm_proxy_extras.__file__))' 2>/dev/null || echo '')/migrations" && \
+    [ -d "$LITELLM_PKG_MIGRATIONS_PATH" ] && chown -R nobody:nogroup "$LITELLM_PKG_MIGRATIONS_PATH" || true && \
+    LITELLM_PROXY_EXTRAS_PATH=$(python -c "import os, litellm_proxy_extras; print(os.path.dirname(litellm_proxy_extras.__file__))" 2>/dev/null || echo "") && \
+    chgrp -R 0 "$PRISMA_PATH" /var/lib/litellm/ui /var/lib/litellm/assets && \
+    [ -n "$LITELLM_PROXY_EXTRAS_PATH" ] && chgrp -R 0 "$LITELLM_PROXY_EXTRAS_PATH" || true && \
+    chmod -R g=u "$PRISMA_PATH" /var/lib/litellm/ui /var/lib/litellm/assets && \
+    [ -n "$LITELLM_PROXY_EXTRAS_PATH" ] && chmod -R g=u "$LITELLM_PROXY_EXTRAS_PATH" || true && \
+    chmod -R g+w "$PRISMA_PATH" /var/lib/litellm/ui /var/lib/litellm/assets && \
+    [ -n "$LITELLM_PROXY_EXTRAS_PATH" ] && chmod -R g+w "$LITELLM_PROXY_EXTRAS_PATH" || true && \
+    chmod -R g+rX "$PRISMA_PATH" /var/lib/litellm/ui /var/lib/litellm/assets /app/.cache
+
+USER nobody
+
+RUN prisma generate --schema=./schema.prisma
+
 EXPOSE 4000/tcp
+
 ENTRYPOINT ["/app/docker/prod_entrypoint.sh"]
 CMD ["--port", "4000"]
--- a/docker/build_from_pip/Dockerfile.build_from_pip
+++ b/docker/build_from_pip/Dockerfile.build_from_pip
@ -1,27 +1,53 @@
+ARG UV_IMAGE=ghcr.io/astral-sh/uv:0.10.9@sha256:10902f58a1606787602f303954cea099626a4adb02acbac4c69920fe9d278f82
+FROM $UV_IMAGE AS uvbin
+
 FROM python:3.13-slim@sha256:739e7213785e88c0f702dcdc12c0973afcbd606dbf021a589cab77d6b00b579d

+ARG LITELLM_VERSION=1.83.0
+
 WORKDIR /app

-ENV HOME=/home/litellm
-ENV PATH="${HOME}/venv/bin:$PATH"
+COPY --from=uvbin /uv /usr/local/bin/uv
+COPY --from=uvbin /uvx /usr/local/bin/uvx

-# Install runtime dependencies needed for building native extensions
 RUN apt-get update && \
-    apt-get install -y --no-install-recommends gcc libffi-dev && \
+    apt-get install -y --no-install-recommends gcc libffi-dev nodejs npm && \
    rm -rf /var/lib/apt/lists/*

-RUN python -m venv ${HOME}/venv
-RUN ${HOME}/venv/bin/pip install --no-cache-dir --upgrade pip==26.0.1
+ENV UV_PROJECT_ENVIRONMENT=/app/.venv \
+    UV_LINK_MODE=copy \
+    PATH="/app/.venv/bin:${PATH}"

-COPY docker/build_from_pip/requirements.txt .
-RUN --mount=type=cache,target=${HOME}/.cache/pip \
-    ${HOME}/venv/bin/pip install -r requirements.txt
-
-# Copy Prisma schema file
 COPY schema.prisma .

-# Generate prisma client
-RUN prisma generate
+# This image is specifically for validating/installing the published PyPI
+# artifact, not the checked-out source tree.
+# Keep the moved proxy-runtime packages explicit until the published PyPI
+# artifact includes that extra; newer releases will simply dedupe these.
+RUN uv venv --python python && \
+    uv pip install --python /app/.venv/bin/python \
+      "litellm[proxy,proxy-runtime]==${LITELLM_VERSION}" \
+      "google-cloud-aiplatform==1.133.0" \
+      "google-genai==1.37.0" \
+      "anthropic[vertex]==0.84.0" \
+      "grpcio==1.78.0" \
+      "prometheus-client==0.20.0" \
+      "langfuse==2.59.7" \
+      "opentelemetry-api==1.28.0" \
+      "opentelemetry-sdk==1.28.0" \
+      "opentelemetry-exporter-otlp==1.28.0" \
+      "ddtrace==2.19.0" \
+      "sentry-sdk==2.21.0" \
+      "mangum==0.17.0" \
+      "azure-ai-contentsafety==1.0.0" \
+      "azure-storage-file-datalake==12.20.0" \
+      "pypdf==6.7.5" \
+      "llm-sandbox==0.3.31" \
+      "detect-secrets==1.5.0" \
+      "prisma==0.11.0" \
+      "openai==2.24.0"
+
+RUN prisma generate --schema=./schema.prisma

 EXPOSE 4000/tcp

--- a/docker/build_from_pip/requirements.txt
+++ b/docker/build_from_pip/requirements.txt
@ -1,6 +0,0 @@
-litellm[proxy]==1.83.0
-prometheus_client==0.20.0
-langfuse==2.59.7
-prisma==0.11.0
-openai==2.24.0
-ddtrace==2.19.0 # for advanced DD tracing / profiling
--- a/docker/entrypoint.sh
+++ b/docker/entrypoint.sh
@ -1,13 +1,16 @@
 #!/bin/bash
-echo $(pwd)
+set -euo pipefail

-# Run the Python migration script
-python3 litellm/proxy/prisma_migration.py
+REPO_ROOT="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")/.." && pwd)"
+VENV_PYTHON="$REPO_ROOT/.venv/bin/python"
+MIGRATION_SCRIPT="$REPO_ROOT/litellm/proxy/prisma_migration.py"

-# Check if the Python script executed successfully
-if [ $? -eq 0 ]; then
-    echo "Migration script ran successfully!"
+if [ -x "$VENV_PYTHON" ]; then
+    "$VENV_PYTHON" "$MIGRATION_SCRIPT"
+elif command -v uv >/dev/null 2>&1; then
+    (cd "$REPO_ROOT" && uv run --no-sync python "$MIGRATION_SCRIPT")
 else
-    echo "Migration script failed!"
-    exit 1
+    python3 "$MIGRATION_SCRIPT"
 fi
+
+echo "Migration script ran successfully!"
--- a/docker/install_auto_router.sh
+++ b/docker/install_auto_router.sh
@ -1,3 +1,4 @@
 #!/bin/bash
-pip install semantic_router==0.1.11 --no-deps
-pip install aurelio-sdk==0.0.19 --no-deps
+set -euo pipefail
+
+# semantic-router dependencies are installed via `uv sync`.
--- a/docs/my-website/Dockerfile
+++ b/docs/my-website/Dockerfile
@ -1,9 +1,32 @@
+ARG UV_IMAGE=ghcr.io/astral-sh/uv:0.10.9
+
+FROM $UV_IMAGE AS uvbin
+
 FROM python:3.14.0a3-slim

+COPY --from=uvbin /uv /usr/local/bin/uv
+COPY --from=uvbin /uvx /usr/local/bin/uvx
 COPY . /app
 WORKDIR /app
-RUN pip install -r requirements.txt
+
+ENV UV_PROJECT_ENVIRONMENT=/app/.venv \
+    UV_LINK_MODE=copy \
+    PATH="/app/.venv/bin:${PATH}"
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    gcc \
+    python3-dev \
+    libssl-dev \
+    pkg-config \
+    && rm -rf /var/lib/apt/lists/*
+
+RUN uv sync --frozen --no-default-groups --no-editable \
+    --extra proxy \
+    --extra proxy-runtime \
+    --extra extra_proxy \
+    --extra semantic-router \
+    --python python

 EXPOSE $PORT 

-CMD litellm --host 0.0.0.0 --port $PORT --workers 10 --config config.yaml
+CMD ["sh", "-c", "litellm --host 0.0.0.0 --port $PORT --workers 10 --config config.yaml"]
--- a/docs/my-website/docs/adding_provider/generic_prompt_management_api.md
+++ b/docs/my-website/docs/adding_provider/generic_prompt_management_api.md
@ -378,7 +378,7 @@ if __name__ == "__main__":

 1. Install dependencies:
 ```bash
-pip install fastapi uvicorn
+uv add fastapi uvicorn
 ```

 2. Save the code above to `prompt_server.py`
--- a/docs/my-website/docs/caching/all_caches.md
+++ b/docs/my-website/docs/caching/all_caches.md
@ -23,7 +23,7 @@ import TabItem from '@theme/TabItem';

 Install redis
 ```shell
-pip install redis
+uv add redis
 ```

 For the hosted version you can setup your own Redis DB here: https://redis.io/try-free/
@ -55,7 +55,7 @@ response2 = completion(
 For GCP Memorystore Redis with IAM authentication:

 ```shell
-pip install google-cloud-iam
+uv add google-cloud-iam
 ```

 ```python
@ -150,7 +150,7 @@ response2 = completion(

 Install boto3
 ```shell
-pip install boto3
+uv add boto3
 ```

 Set AWS environment variables
@ -187,7 +187,7 @@ response2 = completion(

 Install azure-storage-blob and azure-identity
 ```shell
-pip install azure-storage-blob azure-identity
+uv add azure-storage-blob azure-identity
 ```

 ```python
@ -219,7 +219,7 @@ response2 = completion(

 Install redisvl client
 ```shell
-pip install redisvl==0.4.1
+uv add redisvl==0.4.1
 ```

 For the hosted version you can setup your own Redis DB here: https://redis.io/try-free/
@ -366,7 +366,7 @@ response2 = completion(
 Install the disk caching extra:

 ```shell
-pip install "litellm[caching]"
+uv add "litellm[caching]"
 ```

 Then you can use the disk cache as follows.
--- a/docs/my-website/docs/completion/message_sanitization.md
+++ b/docs/my-website/docs/completion/message_sanitization.md
@ -401,7 +401,7 @@ response = litellm.completion(

 3. Ensure you're using a recent version of LiteLLM:
   ```bash
-   pip install --upgrade litellm
+   uv add --upgrade-package litellm litellm
   ```

 ### Unexpected Dummy Tool Results
--- a/docs/my-website/docs/contributing.md
+++ b/docs/my-website/docs/contributing.md
@ -29,7 +29,7 @@ general_settings:
 Start the proxy on port 4000:

 ```bash
-poetry run litellm --config config.yaml --port 4000
+uv run litellm --config config.yaml --port 4000
 ```

 The UI comes pre-built in the repo. Access it at `http://localhost:4000/ui`
--- a/docs/my-website/docs/default_code_snippet.md
+++ b/docs/my-website/docs/default_code_snippet.md
@ -16,7 +16,7 @@ If you want to use the non-hosted version, [go here](https://docs.litellm.ai/doc


 ```
-pip install litellm
+uv add litellm
 ```

 <QueryParamReader/>
--- a/docs/my-website/docs/extras/contributing_code.md
+++ b/docs/my-website/docs/extras/contributing_code.md
@ -41,7 +41,7 @@ git clone https://github.com/BerriAI/litellm.git
 Step 2: Install dev dependencies

 ```shell
-poetry install --with dev --extras proxy
+uv sync --group dev --extra proxy
 ```

 ### 2. Adding tests
--- a/docs/my-website/docs/index.md
+++ b/docs/my-website/docs/index.md
@ -26,13 +26,13 @@ import Image from '@theme/IdealImage';
 ## Installation

 ```shell
-pip install litellm
+uv add litellm
 ```

 To run the full Proxy Server (LLM Gateway):

 ```shell
-pip install 'litellm[proxy]'
+uv tool install 'litellm[proxy]'
 ```

 ---
@ -336,7 +336,7 @@ The proxy is a self-hosted OpenAI-compatible gateway. Any client that works with
 #### Step 1 — Start the proxy

 <Tabs>
-<TabItem value="pip" label="pip">
+<TabItem value="cli" label="LiteLLM CLI">

 ```shell
 litellm --model huggingface/bigcode/starcoder
--- a/docs/my-website/docs/integrations/letta.md
+++ b/docs/my-website/docs/integrations/letta.md
@ -16,7 +16,7 @@ Letta allows you to build LLM agents that can:
 ## Prerequisites

 ```bash
-pip install letta litellm
+uv add letta litellm
 ```

 ## Quick Start
@ -910,7 +910,7 @@ for model in models:
 ```

 ### Common SDK Issues
-- **Import errors**: Ensure `pip install litellm letta` is run
+- **Import errors**: Ensure `uv add litellm letta` is run
 - **Model format**: Use `provider/model` format (e.g., `openai/gpt-4`)
 - **API key format**: Different providers have different key formats
 - **Rate limits**: Implement exponential backoff for retries
--- a/docs/my-website/docs/langchain/langchain.md
+++ b/docs/my-website/docs/langchain/langchain.md
@ -5,7 +5,7 @@ import TabItem from '@theme/TabItem';

 ## Pre-Requisites
 ```shell
-!pip install litellm langchain
+!uv add litellm langchain
 ```
 ## Quick Start

--- a/docs/my-website/docs/learn/gateway_quickstart.md
+++ b/docs/my-website/docs/learn/gateway_quickstart.md
@ -13,7 +13,7 @@ If you need a Docker or database-first setup, use the [Docker + Database tutoria
 ## 1. Install The Gateway

 ```bash
-pip install 'litellm[proxy]'
+uv tool install 'litellm[proxy]'
 ```

 ## 2. Set One Provider Key
--- a/docs/my-website/docs/learn/sdk_quickstart.md
+++ b/docs/my-website/docs/learn/sdk_quickstart.md
@ -11,7 +11,7 @@ Use this path if you are integrating LiteLLM directly into application code.
 ## 1. Install LiteLLM

 ```bash
-pip install litellm==1.82.6
+uv add 'litellm==1.82.6'
 ```

 ## 2. Set Provider Credentials
--- a/docs/my-website/docs/load_test.md
+++ b/docs/my-website/docs/load_test.md
@ -17,7 +17,7 @@ model_list:
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
 ```

-2. `pip install locust`
+2. `uv add locust`

 3. Create a file called `locustfile.py` on your local machine. Copy the contents from the litellm load test located [here](https://github.com/BerriAI/litellm/blob/main/.github/workflows/locustfile.py)

--- a/docs/my-website/docs/load_test_advanced.md
+++ b/docs/my-website/docs/load_test_advanced.md
@ -70,7 +70,7 @@ litellm_settings:
  callbacks: ["prometheus"] # Enterprise LiteLLM Only - use prometheus to get metrics on your load test
 ```

-2. `pip install locust`
+2. `uv add locust`

 3. Create a file called `locustfile.py` on your local machine. Copy the contents from the litellm load test located [here](https://github.com/BerriAI/litellm/blob/main/.github/workflows/locustfile.py)

@ -138,7 +138,7 @@ litellm_settings:
  callbacks: ["prometheus"] # Enterprise LiteLLM Only - use prometheus to get metrics on your load test
 ```

-2. `pip install locust`
+2. `uv add locust`

 3. Create a file called `locustfile.py` on your local machine. Copy the contents from the litellm load test located [here](https://github.com/BerriAI/litellm/blob/main/.github/workflows/locustfile.py)

--- a/docs/my-website/docs/mcp_aws_sigv4.md
+++ b/docs/my-website/docs/mcp_aws_sigv4.md
@ -224,7 +224,7 @@ SigV4-authenticated MCP servers skip the standard health check on proxy startup.
 Install the `botocore` package:

 ```bash
-pip install botocore
+uv add botocore
 ```

 `botocore` is used for SigV4 credential handling and is required when using `aws_sigv4` auth.
--- a/docs/my-website/docs/mcp_oauth.md
+++ b/docs/my-website/docs/mcp_oauth.md
@ -205,7 +205,7 @@ sequenceDiagram
 Use [BerriAI/mock-oauth2-mcp-server](https://github.com/BerriAI/mock-oauth2-mcp-server) to test locally:

 ```bash title="Terminal 1 - Start mock server" showLineNumbers
-pip install fastapi uvicorn
+uv add fastapi uvicorn
 python mock_oauth2_mcp_server.py  # starts on :8765
 ```

--- a/docs/my-website/docs/observability/braintrust.md
+++ b/docs/my-website/docs/observability/braintrust.md
@ -9,7 +9,7 @@ import TabItem from '@theme/TabItem';
 ## Quick Start

 ```python
-# pip install braintrust
+# uv add braintrust
 import litellm
 import os

--- a/docs/my-website/docs/observability/lago.md
+++ b/docs/my-website/docs/observability/lago.md
@ -22,7 +22,7 @@ litellm.callbacks = ["lago"] # logs cost + usage of successful calls to lago
 <TabItem value="sdk" label="SDK">

 ```python
-# pip install lago 
+# uv add lago 
 import litellm
 import os

--- a/docs/my-website/docs/observability/langfuse_integration.md
+++ b/docs/my-website/docs/observability/langfuse_integration.md
@ -26,9 +26,9 @@ For Langfuse v3, we recommend using the [Langfuse OTEL](./langfuse_otel_integrat
 ## Usage with LiteLLM Python SDK

 ### Pre-Requisites
-Ensure you have run `pip install langfuse` for this integration
+Ensure you have run `uv add langfuse` for this integration
 ```shell
-pip install langfuse==2.59.7 litellm
+uv add langfuse==2.59.7 litellm
 ```

 ### Quick Start
@ -44,7 +44,7 @@ litellm.success_callback = ["langfuse"]
 litellm.failure_callback = ["langfuse"] # logs errors to langfuse
 ```
 ```python
-# pip install langfuse 
+# uv add langfuse 
 import litellm
 import os

@ -335,7 +335,7 @@ Be aware that if you are continuing an existing trace, and you set `update_trace

 ## Troubleshooting & Errors
 ### Data not getting logged to Langfuse ? 
-- Ensure you're on the latest version of langfuse `pip install langfuse -U`. The latest version allows litellm to log JSON input/outputs to langfuse
+- Ensure you're on the latest version of langfuse `uv add langfuse -U`. The latest version allows litellm to log JSON input/outputs to langfuse
 - Follow [this checklist](https://langfuse.com/faq/all/missing-traces) if you don't see any traces in langfuse.

 ## Support & Talk to Founders
--- a/docs/my-website/docs/observability/langfuse_otel_integration.md
+++ b/docs/my-website/docs/observability/langfuse_otel_integration.md
@ -24,7 +24,7 @@ The Langfuse OpenTelemetry integration allows you to send LiteLLM traces and obs
 2. **API Keys**: Get your public and secret keys from your Langfuse project settings
 3. **Dependencies**: Install required packages:
   ```bash
-   pip install litellm opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp
+   uv add litellm opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp
   ```

 ## Configuration
--- a/docs/my-website/docs/observability/langsmith_integration.md
+++ b/docs/my-website/docs/observability/langsmith_integration.md
@ -18,7 +18,7 @@ join our [discord](https://discord.gg/wuPM9dRgDw)

 ## Pre-Requisites
 ```shell
-pip install litellm
+uv add litellm
 ```

 ## Quick Start
--- a/docs/my-website/docs/observability/levo_integration.md
+++ b/docs/my-website/docs/observability/levo_integration.md
@ -36,7 +36,7 @@ Send all your LLM requests and responses to Levo for monitoring and analysis usi
 **1. Install OpenTelemetry dependencies:**

 ```bash
-pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp-proto-http opentelemetry-exporter-otlp-proto-grpc
+uv add opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp-proto-http opentelemetry-exporter-otlp-proto-grpc
 ```

 **2. Enable Levo callback in your LiteLLM config:**
@ -133,7 +133,7 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
   ```

 4. **Check for initialization errors**: Look for errors in LiteLLM startup logs. Common issues:
-   - Missing OpenTelemetry packages: Install with `pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp-proto-http opentelemetry-exporter-otlp-proto-grpc`
+   - Missing OpenTelemetry packages: Install with `uv add opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp-proto-http opentelemetry-exporter-otlp-proto-grpc`
   - Missing required environment variables: All four required variables must be set
   - Invalid collector URL: Ensure the URL is correct and reachable

@ -150,7 +150,7 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
 - Solution: Set the `LEVOAI_COLLECTOR_URL` environment variable with your collector endpoint URL from Levo support.

 **Error: "No module named 'opentelemetry'"**
-- Solution: Install OpenTelemetry packages: `pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp-proto-http opentelemetry-exporter-otlp-proto-grpc`
+- Solution: Install OpenTelemetry packages: `uv add opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp-proto-http opentelemetry-exporter-otlp-proto-grpc`

 ## Additional Resources

--- a/docs/my-website/docs/observability/literalai_integration.md
+++ b/docs/my-website/docs/observability/literalai_integration.md
@ -11,7 +11,7 @@ import Image from '@theme/IdealImage';
 Ensure you have the `literalai` package installed:

 ```shell
-pip install literalai litellm
+uv add literalai litellm
 ```

 ## Quick Start
--- a/docs/my-website/docs/observability/logfire_integration.md
+++ b/docs/my-website/docs/observability/logfire_integration.md
@ -17,11 +17,11 @@ join our [discord](https://discord.gg/wuPM9dRgDw)
 Ensure you have installed the following packages to use this integration

 ```shell
-pip install litellm
+uv add litellm

-pip install opentelemetry-api==1.25.0
-pip install opentelemetry-sdk==1.25.0
-pip install opentelemetry-exporter-otlp==1.25.0
+uv add opentelemetry-api==1.25.0
+uv add opentelemetry-sdk==1.25.0
+uv add opentelemetry-exporter-otlp==1.25.0
 ```

 ## Quick Start
@ -33,7 +33,7 @@ litellm.callbacks = ["logfire"]
 ```

 ```python
-# pip install logfire
+# uv add logfire
 import litellm
 import os

--- a/docs/my-website/docs/observability/lunary_integration.md
+++ b/docs/my-website/docs/observability/lunary_integration.md
@ -15,7 +15,7 @@ You can reach out to us anytime by [email](mailto:hello@lunary.ai) or directly [
 ### Pre-Requisites

 ```shell
-pip install litellm lunary
+uv add litellm lunary
 ```

 ### Quick Start
@ -124,7 +124,7 @@ my_chain("Chain input")
 ### Step1: Install dependencies and set your environment variables 
 Install the dependencies
 ```shell
-pip install litellm lunary
+uv add litellm lunary
 ```

 Get you Lunary public key from from https://app.lunary.ai/settings 
--- a/docs/my-website/docs/observability/mlflow.md
+++ b/docs/my-website/docs/observability/mlflow.md
@ -17,7 +17,7 @@ MLflow’s integration with LiteLLM supports advanced observability compatible w
 Install MLflow:

 ```shell
-pip install "litellm[mlflow]"
+uv add "litellm[mlflow]"
 ```

 To enable MLflow auto tracing for LiteLLM:
@ -167,7 +167,7 @@ This approach generates a unified trace, combining your custom Python code with
 For using `mlflow` on LiteLLM Proxy Server, you need to install the `mlflow` package on your docker container.

 ```shell
-pip install "mlflow>=3.1.4"
+uv add "mlflow>=3.1.4"
 ```

 ### Configuration
--- a/docs/my-website/docs/observability/openmeter.md
+++ b/docs/my-website/docs/observability/openmeter.md
@ -28,7 +28,7 @@ litellm.callbacks = ["openmeter"] # logs cost + usage of successful calls to ope
 <TabItem value="sdk" label="SDK">

 ```python
-# pip install openmeter 
+# uv add openmeter 
 import litellm
 import os

--- a/docs/my-website/docs/observability/opentelemetry_integration.md
+++ b/docs/my-website/docs/observability/opentelemetry_integration.md
@ -27,7 +27,7 @@ USE_OTEL_LITELLM_REQUEST_SPAN=true
 Install the OpenTelemetry SDK:

 ```
-pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp
+uv add opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp
 ```

 Set the environment variables (different providers may require different variables):
@ -63,7 +63,7 @@ OTEL_EXPORTER_OTLP_PROTOCOL=grpc
 OTEL_EXPORTER_OTLP_HEADERS="api-key=key,other-config-value=value"
 ```

-> Note: OTLP gRPC requires `grpcio`. Install via `pip install "litellm[grpc]"` (or `grpcio`).
+> Note: OTLP gRPC requires `grpcio`. Install via `uv add "litellm[grpc]"` (or `grpcio`).

 </TabItem>

@ -75,7 +75,7 @@ OTEL_ENDPOINT="https://api.lmnr.ai:8443"
 OTEL_HEADERS="authorization=Bearer <project-api-key>"
 ```

-> Note: OTLP gRPC requires `grpcio`. Install via `pip install "litellm[grpc]"` (or `grpcio`).
+> Note: OTLP gRPC requires `grpcio`. Install via `uv add "litellm[grpc]"` (or `grpcio`).

 </TabItem>

--- a/docs/my-website/docs/observability/phoenix_integration.md
+++ b/docs/my-website/docs/observability/phoenix_integration.md
@ -22,7 +22,7 @@ Use just 2 lines of code, to instantly log your responses **across all providers
 You can also use the instrumentor option instead of the callback, which you can find [here](https://docs.arize.com/phoenix/tracing/integrations-tracing/litellm).

 ```bash
-pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp litellm[proxy]
+uv add opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp "litellm[proxy]"
 ```
 ```python
 litellm.callbacks = ["arize_phoenix"]
@ -73,7 +73,7 @@ environment_variables:
    PHOENIX_COLLECTOR_HTTP_ENDPOINT: "https://app.phoenix.arize.com/s/<space-name>/v1/traces" # OPTIONAL - For setting the HTTP endpoint
 ```

-> Note: If you set the gRPC endpoint, install `grpcio` via `pip install "litellm[grpc]"` (or `grpcio`).
+> Note: If you set the gRPC endpoint, install `grpcio` via `uv add "litellm[grpc]"` (or `grpcio`).

 2. Start the proxy

--- a/docs/my-website/docs/observability/qualifire_integration.md
+++ b/docs/my-website/docs/observability/qualifire_integration.md
@ -23,7 +23,7 @@ Looking for Qualifire Guardrails? Check out the [Qualifire Guardrails Integratio
 2. Get your API key and webhook URL from the Qualifire dashboard

 ```bash
-pip install litellm
+uv add litellm
 ```

 ## Quick Start
--- a/docs/my-website/docs/observability/raw_request_response.md
+++ b/docs/my-website/docs/observability/raw_request_response.md
@ -12,7 +12,7 @@ See the raw request/response sent by LiteLLM in your logging provider (OTEL/Lang
 <TabItem value="sdk" label="SDK">

 ```python
-# pip install langfuse 
+# uv add langfuse 
 import litellm
 import os

--- a/docs/my-website/docs/observability/scrub_data.md
+++ b/docs/my-website/docs/observability/scrub_data.md
@ -60,7 +60,7 @@ litellm.callbacks = [customHandler]
 3. Test it!

 ```python
-# pip install langfuse 
+# uv add langfuse 

 import os
 import litellm
--- a/docs/my-website/docs/observability/signoz.md
+++ b/docs/my-website/docs/observability/signoz.md
@ -17,7 +17,7 @@ Instrumenting LiteLLM in your AI applications with telemetry ensures full observ
 - A [SigNoz Cloud account](https://signoz.io/teams/) with an active ingestion key
 - Internet access to send telemetry data to SigNoz Cloud
 - [LiteLLM](https://www.litellm.ai/) SDK or Proxy integration
-- For Python: `pip` installed for managing Python packages and _(optional but recommended)_ a Python virtual environment to isolate dependencies
+- For Python: `uv` installed for managing Python packages and _(optional but recommended)_ a Python virtual environment to isolate dependencies

 ## Monitoring LiteLLM

@ -37,7 +37,7 @@ No-code auto-instrumentation is recommended for quick setup with minimal code ch
 **Step 1:** Install the necessary packages in your Python environment.

 ```bash
-pip install \
+uv add \
  opentelemetry-api \
  opentelemetry-distro \
  opentelemetry-exporter-otlp \
@ -99,7 +99,7 @@ OTEL_PYTHON_DISABLED_INSTRUMENTATIONS=openai \
 opentelemetry-instrument <your_run_command>
 ```

-> Note: OTLP gRPC requires `grpcio`. Install via `pip install "litellm[grpc]"` (or `grpcio`).
+> Note: OTLP gRPC requires `grpcio`. Install via `uv add "litellm[grpc]"` (or `grpcio`).

 > 📌 Note: We're using `OTEL_PYTHON_DISABLED_INSTRUMENTATIONS=openai` in the run command to disable the OpenAI instrumentor for tracing. This avoids conflicts with LiteLLM's native telemetry/instrumentation, ensuring that telemetry is captured exclusively through LiteLLM's built-in instrumentation.

@ -120,7 +120,7 @@ Code-based instrumentation gives you fine-grained control over your telemetry co
 **Step 1:** Install the necessary packages in your Python environment.

 ```bash
-pip install \
+uv add \
  opentelemetry-api \
  opentelemetry-sdk \
  opentelemetry-exporter-otlp \
@ -338,7 +338,7 @@ You can also check out our custom LiteLLM SDK dashboard [here](https://signoz.i
 **Step 1:** Install the necessary packages in your Python environment.

 ```bash
-pip install opentelemetry-api \
+uv add opentelemetry-api \
  opentelemetry-sdk \
  opentelemetry-exporter-otlp \
  'litellm[proxy]'
@ -364,7 +364,7 @@ export OTEL_METRICS_EXPORTER="otlp"
 export OTEL_LOGS_EXPORTER="otlp"
 ```

-> Note: OTLP gRPC requires `grpcio`. Install via `pip install "litellm[grpc]"` (or `grpcio`).
+> Note: OTLP gRPC requires `grpcio`. Install via `uv add "litellm[grpc]"` (or `grpcio`).

 - Set the `<region>` to match your SigNoz Cloud [region](https://signoz.io/docs/ingestion/signoz-cloud/overview/#endpoint)
 - Replace `<your_ingestion_key>` with your SigNoz [ingestion key](https://signoz.io/docs/ingestion/signoz-cloud/keys/)
--- a/docs/my-website/docs/observability/slack_integration.md
+++ b/docs/my-website/docs/observability/slack_integration.md
@ -13,7 +13,7 @@ join our [discord](https://discord.gg/wuPM9dRgDw)

 ### Step 1
 ```shell
-pip install litellm
+uv add litellm
 ```

 ### Step 2
--- a/docs/my-website/docs/observability/sumologic_integration.md
+++ b/docs/my-website/docs/observability/sumologic_integration.md
@ -25,7 +25,7 @@ join our [discord](https://discord.gg/wuPM9dRgDw)
 For more details, see the [HTTP Logs & Metrics Source](https://www.sumologic.com/help/docs/send-data/hosted-collectors/http-source/logs-metrics/) documentation.

 ```shell
-pip install litellm
+uv add litellm
 ```

 ## Quick Start
--- a/docs/my-website/docs/observability/wandb_integration.md
+++ b/docs/my-website/docs/observability/wandb_integration.md
@ -21,9 +21,9 @@ join our [discord](https://discord.gg/wuPM9dRgDw)
 ::: 

 ## Pre-Requisites
-Ensure you have run `pip install wandb` for this integration
+Ensure you have run `uv add wandb` for this integration
 ```shell
-pip install wandb litellm
+uv add wandb litellm
 ```

 ## Quick Start
@ -33,7 +33,7 @@ Use just 2 lines of code, to instantly log your responses **across all providers
 litellm.success_callback = ["wandb"]
 ```
 ```python
-# pip install wandb 
+# uv add wandb 
 import litellm
 import os

--- a/docs/my-website/docs/pass_through/bedrock.md
+++ b/docs/my-website/docs/pass_through/bedrock.md
@ -566,7 +566,7 @@ You can use the [LangChain AWS SDK](https://python.langchain.com/docs/integratio
 **1. Install LangChain AWS**:

 ```bash showLineNumbers
-pip install langchain-aws
+uv add langchain-aws
 ```

 **2. Setup LiteLLM Proxy**:
--- a/docs/my-website/docs/projects/Harbor.md
+++ b/docs/my-website/docs/projects/Harbor.md
@ -5,7 +5,7 @@

 ```bash
 # Install
-pip install harbor
+uv add harbor

 # Run a benchmark with any LiteLLM-supported model
 harbor run --dataset terminal-bench@2.0 \
--- a/docs/my-website/docs/projects/openai-agents.md
+++ b/docs/my-website/docs/projects/openai-agents.md
@ -12,7 +12,7 @@ The [OpenAI Agents SDK](https://github.com/openai/openai-agents-python) is a lig
 ### 1. Install Dependencies

 ```bash
-pip install "openai-agents[litellm]"
+uv add "openai-agents[litellm]"
 ```

 ### 2. Add Model to Config
--- a/docs/my-website/docs/providers/azure/azure.md
+++ b/docs/my-website/docs/providers/azure/azure.md
@ -1143,7 +1143,7 @@ In production, [Router connects to a Redis Cache](#redis-queue) to track usage a
 #### Quick Start

 ```python
-pip install litellm
+uv add litellm
 ```

 ```python
--- a/docs/my-website/docs/providers/azure_ai.md
+++ b/docs/my-website/docs/providers/azure_ai.md
@ -121,7 +121,7 @@ response = completion(
 See all litellm.completion supported params [here](../completion/input.md#translated-openai-params)

 ```python
-# !pip install litellm
+# !uv add litellm
 from litellm import completion
 import os
 ## set ENV variables
--- a/docs/my-website/docs/providers/bedrock.md
+++ b/docs/my-website/docs/providers/bedrock.md
@ -16,7 +16,7 @@ ALL Bedrock models (Anthropic, Meta, Deepseek, Mistral, Amazon, etc.) are Suppor

 LiteLLM requires `boto3` to be installed on your system for Bedrock requests
 ```shell
-pip install boto3>=1.28.57
+uv add "boto3>=1.28.57"
 ```

 :::info
--- a/docs/my-website/docs/providers/bedrock_realtime_with_audio.md
+++ b/docs/my-website/docs/providers/bedrock_realtime_with_audio.md
@ -319,7 +319,7 @@ Complete working examples are available in the LiteLLM repository:
 ## Requirements

 ```bash
-pip install litellm websockets pyaudio
+uv add litellm websockets pyaudio
 ```

 ## AWS Configuration
--- a/docs/my-website/docs/providers/bytez.md
+++ b/docs/my-website/docs/providers/bytez.md
@ -126,7 +126,7 @@ If you wish to use custom formatting, please let us know via either [help@bytez.
 See all litellm.completion supported params [here](https://docs.litellm.ai/docs/completion/input)

 ```py
-# !pip install litellm
+# !uv add litellm
 from litellm import completion
 import os
 ## set ENV variables
@ -160,7 +160,7 @@ Any kwarg supported by huggingface we also support! (Provided the model supports
 Example `repetition_penalty`

 ```py
-# !pip install litellm
+# !uv add litellm
 from litellm import completion
 import os
 ## set ENV variables
--- a/docs/my-website/docs/providers/clarifai.md
+++ b/docs/my-website/docs/providers/clarifai.md
@ -14,7 +14,7 @@ Anthropic, OpenAI, Qwen, xAI, Gemini and most of Open soured LLMs are Supported
 ## Pre-Requisites

 ```bash
-pip install litellm
+uv add litellm
 ```

 ## Required Environment Variables
--- a/docs/my-website/docs/providers/databricks.md
+++ b/docs/my-website/docs/providers/databricks.md
@ -59,7 +59,7 @@ If no credentials are provided, LiteLLM will use the Databricks SDK for automati
 from litellm import completion

 # No environment variables needed - uses Databricks SDK unified auth
-# Requires: pip install databricks-sdk
+# Requires: uv add databricks-sdk
 response = completion(
    model="databricks/databricks-dbrx-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
@ -220,7 +220,7 @@ response = completion(
 See all litellm.completion supported params [here](../completion/input.md#translated-openai-params)

 ```python
-# !pip install litellm
+# !uv add litellm
 from litellm import completion
 import os
 ## set ENV variables
@ -457,7 +457,7 @@ For embedding models, databricks lets you pass in an additional param 'instructi


 ```python
-# !pip install litellm
+# !uv add litellm
 from litellm import embedding
 import os
 ## set ENV variables
--- a/docs/my-website/docs/providers/huggingface.md
+++ b/docs/my-website/docs/providers/huggingface.md
@ -341,7 +341,7 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
 <TabItem value="python" label="python">

 ```python
-# pip install openai
+# uv add openai
 from openai import OpenAI

 client = OpenAI(
--- a/docs/my-website/docs/providers/langgraph.md
+++ b/docs/my-website/docs/providers/langgraph.md
@ -187,7 +187,7 @@ Before using LiteLLM with LangGraph, you need a running LangGraph server.
 ### 1. Install the LangGraph CLI

 ```bash
-pip install "langgraph-cli[inmem]"
+uv add "langgraph-cli[inmem]"
 ```

 ### 2. Create a new LangGraph project
@ -200,7 +200,7 @@ cd my-agent
 ### 3. Install dependencies

 ```bash
-pip install -e .
+uv pip install -e .
 ```

 ### 4. Set your API key
--- a/docs/my-website/docs/providers/oci.md
+++ b/docs/my-website/docs/providers/oci.md
@ -80,7 +80,7 @@ Use an OCI SDK `Signer` object for authentication. This method:

 To use this method, install the OCI SDK:
 ```bash
-pip install oci
+uv add oci
 ```

 This method is an alternative when using the LiteLLM SDK on Oracle Cloud Infrastructure (instances or Oracle Kubernetes Engine).
--- a/docs/my-website/docs/providers/ollama.md
+++ b/docs/my-website/docs/providers/ollama.md
@ -49,7 +49,7 @@ for chunk in response:
 ## Example usage - Streaming + Acompletion
 Ensure you have async_generator installed for using ollama acompletion with streaming
 ```shell
-pip install async_generator
+uv add async_generator
 ```

 ```python
--- a/docs/my-website/docs/providers/petals.md
+++ b/docs/my-website/docs/providers/petals.md
@ -8,7 +8,7 @@ Petals: https://github.com/bigscience-workshop/petals
 ## Pre-Requisites
 Ensure you have `petals` installed
 ```shell
-pip install git+https://github.com/bigscience-workshop/petals
+uv add git+https://github.com/bigscience-workshop/petals
 ```

 ## Usage
--- a/docs/my-website/docs/providers/predibase.md
+++ b/docs/my-website/docs/providers/predibase.md
@ -186,7 +186,7 @@ model_list:
 See all litellm.completion supported params [here](https://docs.litellm.ai/docs/completion/input)

 ```python
-# !pip install litellm
+# !uv add litellm
 from litellm import completion
 import os
 ## set ENV variables
@ -219,7 +219,7 @@ Send params [not supported by `litellm.completion()`](https://docs.litellm.ai/do
 Example `adapter_id`, `adapter_source` are Predibase specific param - [See List](https://github.com/BerriAI/litellm/blob/8a35354dd6dbf4c2fcefcd6e877b980fcbd68c58/litellm/llms/predibase.py#L54)

 ```python
-# !pip install litellm
+# !uv add litellm
 from litellm import completion
 import os
 ## set ENV variables
--- a/docs/my-website/docs/providers/pydantic_ai_agent.md
+++ b/docs/my-website/docs/providers/pydantic_ai_agent.md
@ -23,7 +23,7 @@ LiteLLM requires Pydantic AI agents to follow the [A2A (Agent-to-Agent) protocol
 #### Install Dependencies

 ```bash
-pip install pydantic-ai fasta2a uvicorn
+uv add pydantic-ai fasta2a uvicorn
 ```

 #### Create Agent
--- a/docs/my-website/docs/providers/replicate.md
+++ b/docs/my-website/docs/providers/replicate.md
@ -231,7 +231,7 @@ Model Name                  | Function Call
 See all litellm.completion supported params [here](https://docs.litellm.ai/docs/completion/input)

 ```python
-# !pip install litellm
+# !uv add litellm
 from litellm import completion
 import os
 ## set ENV variables
@ -264,7 +264,7 @@ Send params [not supported by `litellm.completion()`](https://docs.litellm.ai/do
 Example `seed`, `min_tokens` are Replicate specific param

 ```python
-# !pip install litellm
+# !uv add litellm
 from litellm import completion
 import os
 ## set ENV variables
--- a/docs/my-website/docs/providers/sap.md
+++ b/docs/my-website/docs/providers/sap.md
@ -51,7 +51,7 @@ The resource group is typically configured separately in your AI Core deployment
 ### Step 1: Install LiteLLM

 ```bash
-pip install litellm
+uv add litellm
 ```

 ### Step 2: Set Your Credentials
--- a/docs/my-website/docs/providers/vertex.md
+++ b/docs/my-website/docs/providers/vertex.md
@ -1216,7 +1216,7 @@ curl http://0.0.0.0:4000/chat/completions \
 </Tabs>

 ## Pre-requisites
-* `pip install google-cloud-aiplatform` (pre-installed on proxy docker image)
+* `uv add google-cloud-aiplatform` (pre-installed on proxy docker image)
 * Authentication: 
    * run `gcloud auth application-default login` See [Google Cloud Docs](https://cloud.google.com/docs/authentication/external/set-up-adc)
    * Alternatively you can set `GOOGLE_APPLICATION_CREDENTIALS`
--- a/docs/my-website/docs/providers/vllm.md
+++ b/docs/my-website/docs/providers/vllm.md
@ -517,11 +517,11 @@ curl -X POST http://0.0.0.0:4000/chat/completions \
 </Tabs>


-## (Deprecated) for `vllm pip package` 
+## (Deprecated) for packaged `vllm` installs
 ### Using - `litellm.completion`

 ```
-pip install litellm vllm
+uv add litellm vllm
 ```
 ```python
 import litellm 
@ -616,4 +616,3 @@ test_vllm_custom_model()
 ```

 [Implementation Code](https://github.com/BerriAI/litellm/blob/6b3cb1898382f2e4e80fd372308ea232868c78d1/litellm/utils.py#L1414)
-
--- a/docs/my-website/docs/proxy/caching.md
+++ b/docs/my-website/docs/proxy/caching.md
@ -214,7 +214,7 @@ For GCP Memorystore Redis with IAM authentication, install the required dependen
 :::

 ```shell
-pip install google-cloud-iam
+uv add google-cloud-iam
 ```

 <Tabs>
--- a/docs/my-website/docs/proxy/deploy.md
+++ b/docs/my-website/docs/proxy/deploy.md
@ -32,10 +32,10 @@ docker pull docker.litellm.ai/berriai/litellm:main-latest

 </TabItem>

-<TabItem value="pip" label="LiteLLM CLI (pip package)">
+<TabItem value="cli" label="LiteLLM CLI">

 ```shell
-$ pip install 'litellm[proxy]'
+$ uv tool install 'litellm[proxy]'
 ```

 </TabItem>
@ -191,33 +191,32 @@ EXPOSE 4000/tcp
 CMD ["--port", "4000", "--config", "config.yaml", "--detailed_debug"]
 ```

-### Build from litellm `pip` package
+### Build from published LiteLLM packages

-Follow these instructions to build a docker container from the litellm pip package. If your company has a strict requirement around security / building images you can follow these steps.
+Follow these instructions to build a Docker container from published LiteLLM packages. If your company has a strict requirement around security or image provenance, you can follow these steps.

-**Note:** You'll need to copy the `schema.prisma` file from the [litellm repository](https://github.com/BerriAI/litellm/blob/main/schema.prisma) to your build directory alongside the Dockerfile and requirements.txt.
+**Note:** Copy the `schema.prisma` file from the [LiteLLM repository](https://github.com/BerriAI/litellm/blob/main/schema.prisma) into your build directory alongside this Dockerfile.

 Dockerfile 

 ```shell
 FROM cgr.dev/chainguard/python:latest-dev
+ARG UV_IMAGE=ghcr.io/astral-sh/uv:0.10.9

 USER root
 WORKDIR /app

-ENV HOME=/home/litellm
-ENV PATH="${HOME}/venv/bin:$PATH"
+ENV UV_TOOL_BIN_DIR=/usr/local/bin

 # Install runtime dependencies
 RUN apk update && \
    apk add --no-cache gcc python3-dev openssl openssl-dev

-RUN python -m venv ${HOME}/venv
-RUN ${HOME}/venv/bin/pip install --no-cache-dir --upgrade pip
+COPY --from=$UV_IMAGE /uv /usr/local/bin/uv
+COPY --from=$UV_IMAGE /uvx /usr/local/bin/uvx

-COPY requirements.txt .
-RUN --mount=type=cache,target=${HOME}/.cache/pip \
-    ${HOME}/venv/bin/pip install -r requirements.txt
+RUN uv tool install 'litellm[proxy,proxy-runtime,extra_proxy]==1.57.3' \
+    --python python

 # Copy Prisma schema file
 COPY schema.prisma .
@ -232,22 +231,12 @@ CMD ["--port", "4000"]
 ```


-Example `requirements.txt`
-
-```shell
-litellm[proxy]==1.57.3 # Specify the litellm version you want to use
-litellm-enterprise
-prometheus_client
-langfuse
-prisma
-```
-
 Build the docker image

 ```shell
 docker build \
-  -f Dockerfile.build_from_pip \
-  -t litellm-proxy-with-pip-5 .
+  -f Dockerfile \
+  -t litellm-proxy-from-package-5 .
 ```

 Run the docker image
@ -258,7 +247,7 @@ docker run \
    -e OPENAI_API_KEY="sk-1222" \
    -e DATABASE_URL="postgresql://xxxxxxxxx \
    -p 4000:4000 \
-    litellm-proxy-with-pip-5 \
+    litellm-proxy-from-package-5 \
    --config /app/config.yaml --detailed_debug
 ```

@ -760,7 +749,7 @@ RUN chmod +x ./docker/entrypoint.sh
 EXPOSE 4000/tcp

 # 👉 Key Change: Install hypercorn
-RUN pip install hypercorn
+RUN uv add hypercorn

 # Override the CMD instruction with your desired command and arguments
 # WARNING: FOR PROD DO NOT USE `--detailed_debug` it slows down response times, instead use the following CMD
--- a/docs/my-website/docs/proxy/docker_quick_start.md
+++ b/docs/my-website/docs/proxy/docker_quick_start.md
@ -70,15 +70,15 @@ curl -X POST 'http://0.0.0.0:4000/chat/completions' \
 }'
 ```

-:::tip Already have pip installed?
-You can skip the curl install and run `litellm --setup` directly after `pip install 'litellm[proxy]'`.
+:::tip Already have uv installed?
+You can skip the curl install and run `litellm --setup` directly after `uv tool install 'litellm[proxy]'`.
 :::

 ---

 ## Pre-Requisites 

-Choose your install method. **Docker Compose** users complete their full setup inside the tab and are done. **Docker** and **pip** users continue with the steps below the tabs.
+Choose your install method. **Docker Compose** users complete their full setup inside the tab and are done. **Docker** and **LiteLLM CLI** users continue with the steps below the tabs.

 <Tabs>

@ -92,10 +92,10 @@ docker pull docker.litellm.ai/berriai/litellm:main-latest

 </TabItem>

-<TabItem value="pip" label="LiteLLM CLI (pip package)">
+<TabItem value="cli" label="LiteLLM CLI">

 ```shell
-$ pip install 'litellm[proxy]'
+$ uv tool install 'litellm[proxy]'
 ```

 </TabItem>
@ -269,7 +269,7 @@ Virtual keys let you track spend, set rate limits, and control model access per
 </Tabs>

 :::note Docker Compose users
-Your setup is complete — the steps below are for **Docker** and **pip** users only.
+Your setup is complete — the steps below are for **Docker** and **LiteLLM CLI** users only.
 :::

 ---
@ -336,7 +336,7 @@ docker run \

 </TabItem>

-<TabItem value="pip" label="LiteLLM CLI (pip package)">
+<TabItem value="cli" label="LiteLLM CLI">

 ```shell
 $ litellm --config /app/config.yaml --detailed_debug
@ -463,7 +463,7 @@ Track spend and control model access via virtual keys for the proxy.
 Your Postgres container is already running — skip ahead to [Create Key w/ RPM Limit](#create-key-w-rpm-limit) below.
 :::

-**Docker / pip users** — you need a Postgres database (e.g. [Supabase](https://supabase.com/), [Neon](https://neon.tech/), or self-hosted). Add `general_settings` to your `config.yaml`:
+**Docker / LiteLLM CLI users** — you need a Postgres database (e.g. [Supabase](https://supabase.com/), [Neon](https://neon.tech/), or self-hosted). Add `general_settings` to your `config.yaml`:

 ```yaml
 model_list:
--- a/docs/my-website/docs/proxy/guardrails/lasso_security.md
+++ b/docs/my-website/docs/proxy/guardrails/lasso_security.md
@ -11,7 +11,7 @@ Use [Lasso Security](https://www.lasso.security/) to protect your LLM applicatio
 The Lasso guardrail requires the `ulid-py` package (version 1.1.0 or higher) for generating unique conversation identifiers:

 ```shell
-pip install ulid-py>=1.1.0
+uv add "ulid-py>=1.1.0"
 ```

 This package is used to create lexicographically sortable identifiers for tracking conversations and sessions in the Lasso Security platform.
--- a/docs/my-website/docs/proxy/logging.md
+++ b/docs/my-website/docs/proxy/logging.md
@ -351,7 +351,7 @@ We will use the `--config` to set `litellm.success_callback = ["langfuse"]` this
 **Step 1** Install langfuse

 ```shell
-pip install langfuse>=2.0.0
+uv add "langfuse>=2.0.0"
 ```

 **Step 2**: Create a `config.yaml` file and set `litellm_settings`: `success_callback`
@ -982,7 +982,7 @@ OTEL_ENDPOINT="http:/0.0.0.0:4317"
 OTEL_HEADERS="x-honeycomb-team=<your-api-key>" # Optional
 ```

-> Note: OTLP gRPC requires `grpcio`. Install via `pip install "litellm[grpc]"` (or `grpcio`).
+> Note: OTLP gRPC requires `grpcio`. Install via `uv add "litellm[grpc]"` (or `grpcio`).

 Add `otel` as a callback on your `litellm_config.yaml`

@ -1587,7 +1587,7 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
 #### Step1: Install dependencies and set your environment variables 
 Install the dependencies
 ```shell
-pip install litellm lunary
+uv add litellm lunary
 ```

 Get you Lunary public key from from https://app.lunary.ai/settings 
@ -2516,7 +2516,7 @@ If api calls fail (llm/database) you can log those to Sentry:
 **Step 1** Install Sentry

 ```shell
-pip install --upgrade sentry-sdk
+uv add --upgrade sentry-sdk
 ```

 **Step 2**: Save your Sentry_DSN and add `litellm_settings`: `failure_callback`
--- a/docs/my-website/docs/proxy/prometheus.md
+++ b/docs/my-website/docs/proxy/prometheus.md
@ -9,7 +9,7 @@ LiteLLM Exposes a `/metrics` endpoint for Prometheus to Poll

 ## Quick Start

-If you're using the LiteLLM CLI with `litellm --config proxy_config.yaml` then you need to `pip install prometheus_client==0.20.0`. **This is already pre-installed on the litellm Docker image**
+If you're using the LiteLLM CLI with `litellm --config proxy_config.yaml` then you need to `uv add prometheus_client==0.20.0`. **This is already pre-installed on the litellm Docker image**

 Add this to your proxy config.yaml 
 ```yaml
--- a/docs/my-website/docs/proxy/pyroscope_profiling.md
+++ b/docs/my-website/docs/proxy/pyroscope_profiling.md
@ -7,13 +7,13 @@ LiteLLM proxy can send continuous CPU profiles to [Grafana Pyroscope](https://gr
 1. **Install the optional dependency** (required only when enabling Pyroscope):

   ```bash
-   pip install pyroscope-io
+   uv add pyroscope-io
   ```

   Or install the proxy extra:

   ```bash
-   pip install "litellm[proxy]"
+   uv add "litellm[proxy]"
   ```

 2. **Set environment variables** before starting the proxy:
--- a/docs/my-website/docs/proxy/quick_start.md
+++ b/docs/my-website/docs/proxy/quick_start.md
@ -13,7 +13,7 @@ LiteLLM Server (LLM Gateway) manages:
 * **Load Balancing**: between [Multiple Models](#multiple-models---quick-start) + [Deployments of the same model](#multiple-instances-of-1-model) - LiteLLM proxy can handle 1.5k+ requests/second during load tests.

 ```shell
-$ pip install 'litellm[proxy]'
+$ uv tool install 'litellm[proxy]'
 ```

 ## Quick Start - LiteLLM Proxy CLI
--- a/docs/my-website/docs/proxy/user_keys.md
+++ b/docs/my-website/docs/proxy/user_keys.md
@ -881,7 +881,7 @@ Credits [@vividfog](https://github.com/ollama/ollama/issues/305#issuecomment-175
 <TabItem value="aider" label="Aider">

 ```shell
-$ pip install aider 
+$ uv add aider 

 $ aider --openai-api-base http://0.0.0.0:4000 --openai-api-key fake-key
 ```
@ -889,7 +889,7 @@ $ aider --openai-api-base http://0.0.0.0:4000 --openai-api-key fake-key
 <TabItem value="autogen" label="AutoGen">

 ```python
-pip install pyautogen
+uv add pyautogen
 ```

 ```python
--- a/docs/my-website/docs/proxy_api.md
+++ b/docs/my-website/docs/proxy_api.md
@ -66,16 +66,16 @@ git clone https://github.com/krrishdholakia/open-interpreter-litellm-fork
 ```
 To run it do: 
 ```
-poetry build 
+uv build

 # call gpt-4 - always add 'litellm_proxy/' in front of the model name
-poetry run interpreter --model litellm_proxy/gpt-4
+uv run interpreter --model litellm_proxy/gpt-4

 # call llama-70b - always add 'litellm_proxy/' in front of the model name
-poetry run interpreter --model litellm_proxy/togethercomputer/llama-2-70b-chat
+uv run interpreter --model litellm_proxy/togethercomputer/llama-2-70b-chat

 # call claude-2 - always add 'litellm_proxy/' in front of the model name
-poetry run interpreter --model litellm_proxy/claude-2
+uv run interpreter --model litellm_proxy/claude-2
 ```

 And that's it! 
@ -83,4 +83,4 @@ And that's it!
 Now you can call any model you like!


-Want us to add more models? [Let us know!](https://github.com/BerriAI/litellm/issues/new/choose)
+Want us to add more models? [Let us know!](https://github.com/BerriAI/litellm/issues/new/choose)