litellm/litellm
Yassin Kortam 014cb8fa9d
feat: add componentized proxy deployment with gateway, backend, ui, and migrations (#27557)
Split the monolithic LiteLLM proxy into independently scalable Kubernetes components so that the LLM data plane and the management API surface can each be scaled horizontally on their own.

- Add DatabaseURLSettings pydantic-settings model that assembles DATABASE_URL (and optional DATABASE_URL_READ_REPLICA) from discrete DATABASE_* env vars before Prisma initializes, supporting both IAM token auth (minting short-lived RDS tokens) and password auth; replaces the CLI-only path that componentized entrypoints bypass
- Add gateway component (port 4000) that trims the proxy route table to the LLM data-plane surface (chat, embeddings, completions, audio, realtime, provider passthroughs, health/metrics) via an allowlist applied inside the lifespan context so plugin-registered routes are captured
- Add backend component (port 4001) that exposes the management/admin surface (keys, users, teams, orgs, spend analytics, model management, SSO, audit logs) with a complementary allowlist
- Add ui component: a Next.js static export served by nginx (port 3000) with RSC payload routing, asset prefix aliasing, and SPA fallback for dashboard routes
- Add migrations component with dedicated Dockerfile that runs prisma migrate deploy via a Helm pre-install/pre-upgrade Job, eliminating per-pod schema contention on the Prisma advisory lock
- Add Helm chart (helm/litellm) with separate Deployments, Services, HPAs, and ConfigMap for each component; shared _helpers.tpl emits DATABASE_*, IAM_TOKEN_DB_AUTH, REDIS_*, and DISABLE_SCHEMA_UPDATE env vars from chart values; ingress template routes traffic to the correct component by path prefix
- Add comprehensive tests for DatabaseURLSettings covering IAM auth, password auth, read replica fallbacks, operator-pinned URL preservation, and percent-encoding; add coverage test asserting gateway + backend allowlist union equals the full proxy route set
- Add pydantic-settings>=2.14.1 as a proxy extra dependency and update liccheck allowlist
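
To illustrate the password-auth case of the DATABASE_URL assembly described in the first bullet, here is a minimal standard-library sketch. It is not the DatabaseURLSettings model itself: the function name build_database_url and the variable names DATABASE_USER, DATABASE_PASSWORD, DATABASE_HOST, DATABASE_PORT, and DATABASE_NAME are assumptions chosen for illustration, since the commit message only says "discrete DATABASE_* env vars". The sketch shows two behaviours the tests are said to cover: an operator-pinned URL is kept untouched, and credentials are percent-encoded.

```python
from urllib.parse import quote


def build_database_url(env):
    """Assemble a PostgreSQL URL from discrete parts.

    All variable names read here are hypothetical; the real model may use
    different ones.
    """
    # An operator-pinned DATABASE_URL is preserved as-is.
    pinned = env.get("DATABASE_URL")
    if pinned:
        return pinned

    # Percent-encode every reserved character so that '@', '/' or ':' inside
    # a credential cannot be mistaken for a URL delimiter.
    user = quote(env["DATABASE_USER"], safe="")
    password = quote(env["DATABASE_PASSWORD"], safe="")
    name = quote(env["DATABASE_NAME"], safe="")
    host = env["DATABASE_HOST"]
    port = env.get("DATABASE_PORT", "5432")  # assumed default port

    return f"postgresql://{user}:{password}@{host}:{port}/{name}"
```

In a real entrypoint the mapping passed in would presumably be os.environ, and the result would be exported before Prisma initializes; IAM token auth would replace the password with a freshly minted token, which this sketch does not attempt.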

Co-authored-by: Yassin Kortam <yassinkortam@g.ucla.edu>
2026-05-16 09:25:17 -07:00
a2a_protocol
anthropic_interface
assistants
batch_completion
batches fix(vertex-ai): fix zero cost/usage on completed Vertex AI batch jobs (#27912) 2026-05-15 04:47:02 -07:00
caching chore(caching): remove allow_legacy_unscoped_cache_hits opt-in 2026-05-04 22:16:30 +00:00
completion_extras fix(responses): normalize chat tool_choice for completions→responses bridge (#27634) 2026-05-11 10:24:34 -07:00
compression
containers
endpoints/speech/speech_to_completion_bridge
evals
experimental_mcp_client feat(mcp): add OBO MCP Auth (#27421) 2026-05-07 15:35:21 -07:00
files thread trusted params through vertex file content 2026-05-01 18:24:22 -07:00
fine_tuning
google_genai feat(proxy): LiteLLM headers on Google native generateContent routes (#25500) 2026-04-29 12:34:14 -07:00
images
integrations feat: add OTEL GenAI latest-experimental semantic convention support (#27418) 2026-05-15 17:20:04 -07:00
interactions
litellm_core_utils fix(bedrock-converse): drop blank-text fallback for empty thinking blocks (#27850) 2026-05-13 10:04:59 -07:00
llms fix(bedrock-mantle): use /anthropic/v1/messages path for Mantle endpo… (#27976) 2026-05-15 13:31:59 -07:00
ocr chore: reject bare str at file-input sinks to prevent local-file read (#27762) 2026-05-12 16:40:07 -07:00
passthrough fix(passthrough): log when streaming spend-tracking flush fails to schedule 2026-04-30 02:39:29 +00:00
proxy feat: add componentized proxy deployment with gateway, backend, ui, and migrations (#27557) 2026-05-16 09:25:17 -07:00
proxy_auth
rag style: black formatting 2026-04-25 14:47:54 -07:00
realtime_api
rerank_api Fix review 2026-04-30 09:10:24 +05:30
responses fix(responses): preserve cache_control in Responses API -> Chat Completion transformation (#27727) 2026-05-13 12:17:06 -07:00
router_strategy feat: add weighted-routing failover (#27980) 2026-05-15 17:28:54 +00:00
router_utils fix(responses): register cooldowns on failure + fail fast on stale encrypted_content (#27820) 2026-05-13 09:03:13 -07:00
search
secret_managers Implement normalize_nonempty_secret_str function to trim whitespace from secrets and treat empty values as unset. Update proxy_server to use this function for Grafana credentials. Enhance tests to validate the new normalization behavior. 2026-05-04 18:17:31 +00:00
skills chore(proxy): scope skills and container resources 2026-04-30 18:23:58 -07:00
types feat(mcp): add delegate_auth_to_upstream flag for PKCE passthrough (#27834) 2026-05-13 12:06:13 -07:00
vector_store_files
vector_stores chore(vector stores): tighten managed store access 2026-04-30 15:04:25 -07:00
videos
__init__.py Add Bedrock Claude Platform route (#27678) 2026-05-11 15:50:54 -07:00
_internal_context.py
_lazy_imports_registry.py fix(utils): import get_secret at runtime (#28014) 2026-05-15 14:01:18 -07:00
_lazy_imports.py
_logging.py fix(bedrock-mantle): use /anthropic/v1/messages path for Mantle endpo… (#27976) 2026-05-15 13:31:59 -07:00
_redis_credential_provider.py feat: add ability to auth to azure with token (#27556) 2026-05-09 22:34:09 +00:00
_redis.py fix: Fix Redis Sentinel client handling to solve authentication error… (#26302) 2026-05-13 14:06:58 -07:00
_service_logger.py
_uuid.py
_version.py
anthropic_beta_headers_config.json fix(anthropic,bedrock,databricks): four reasoning_effort follow-ups 2026-05-03 10:03:53 -07:00
anthropic_beta_headers_manager.py
blog_posts.json
budget_manager.py docs(budget_manager): add docstring to BudgetManager.reset_cost (#27867) 2026-05-13 13:28:22 -07:00
constants.py feat(mcp): support MCP access group names in URL-based namespacing (#27726) 2026-05-13 20:20:38 -07:00
cost_calculator.py fix(vertex-ai): fix zero cost/usage on completed Vertex AI batch jobs (#27912) 2026-05-15 04:47:02 -07:00
cost.json
exceptions.py fix(proxy): invoke post-call guardrails on pass-through endpoint responses (#20270) (#26262) 2026-04-27 08:58:22 +05:30
main.py Match litellm.completion supported model parameters with proxy model info (#27720) 2026-05-12 08:25:01 -07:00
model_prices_and_context_window_backup.json fix(pricing): GPT-4o-Transcribe Pricing (#27875) 2026-05-13 17:42:05 -07:00
mypy.ini
policy_templates_backup.json
provider_endpoints_support_backup.json
py.typed
router.py feat: add weighted-routing failover (#27980) 2026-05-15 17:28:54 +00:00
scheduler.py
setup_wizard.py
timeout.py docs: add class docstring to _LoopWrapper (#27870) 2026-05-13 13:54:00 -07:00
utils.py fix(utils): import get_secret at runtime (#28014) 2026-05-15 14:01:18 -07:00