* feat(xai): add grok-4.20 beta 2 models with pricing (#23900)
Add three grok-4.20 beta 2 model variants from xAI:
- grok-4.20-multi-agent-beta-0309 (reasoning + multi-agent)
- grok-4.20-beta-0309-reasoning (reasoning)
- grok-4.20-beta-0309-non-reasoning
Pricing (from https://docs.x.ai/docs/models):
- Input: $2.00/1M tokens ($0.20/1M cached)
- Output: $6.00/1M tokens
- Context: 2M tokens
All variants support vision, function calling, tool choice, and web search.
Closes LIT-2171
* docs: add Quick Install section for litellm --setup wizard (#23905)
* docs: add Quick Install section for litellm --setup wizard
* docs: clarify setup wizard is for local/beginner use
* feat(setup): interactive setup wizard + install.sh (#23644)
* feat(setup): add interactive setup wizard + install.sh
Adds `litellm --setup` — a Claude Code-style TUI onboarding wizard that
guides users through provider selection, API key entry, and proxy config
generation, then optionally starts the proxy immediately.
- litellm/setup_wizard.py: wizard with ASCII art, numbered provider menu
(OpenAI, Anthropic, Azure, Gemini, Bedrock, Ollama), API key prompts,
port/master-key config, and litellm_config.yaml generation
- litellm/proxy/proxy_cli.py: adds --setup flag that invokes the wizard
- scripts/install.sh: curl-installable script (detect OS/Python, pip
install litellm[proxy], launch wizard)
Usage:
curl -fsSL https://raw.githubusercontent.com/BerriAI/litellm/main/scripts/install.sh | sh
litellm --setup
* fix(install.sh): remove orange color, add LITELLM_BRANCH env var for branch installs
* fix(install.sh): install from git branch so --setup is available for QA
* fix(install.sh): remove stale LITELLM_BRANCH reference that caused unbound variable error
* fix(install.sh): force-reinstall from git to bypass cached PyPI version
* fix(install.sh): show pip progress bar during install
* fix(install.sh): always launch wizard via $PYTHON_BIN -m litellm, not PATH binary
* fix(install.sh): use litellm.proxy.proxy_cli module (no __main__.py exists)
* fix(install.sh): suppress RuntimeWarning from module invocation
* fix(install.sh): use Python bin-dir litellm binary to avoid CWD sys.path shadowing
* fix(install.sh): use sysconfig.get_path('scripts') to find pip-installed litellm binary
* fix(install.sh): redirect stdin from /dev/tty on exec so wizard gets terminal, not exhausted pipe
* fix(install.sh): warn about git clone duration, drop --no-cache-dir so re-runs are faster
* feat(setup_wizard): arrow-key selector, updated model names
* fix(setup_wizard): use sysconfig binary to start proxy, not python -m litellm
* feat(setup_wizard): credential validation after key entry + clear next-steps after proxy start
* style(install.sh): show git clone warning in blue
* refactor(setup_wizard): class with static methods, use check_valid_key from litellm.utils
* address greptile review: fix yaml escaping, port validation, display name collisions, tests
- setup_wizard.py: add _yaml_escape() for safe YAML embedding of API keys
- setup_wizard.py: add _styled_input() with readline ANSI ignore markers
- setup_wizard.py: change DIVIDER to _divider() fn to avoid import-time color capture
- setup_wizard.py: validate port range 1-65535, initialize before loop
- setup_wizard.py: qualify azure display names (azure-gpt-4o) to avoid collision with openai
- setup_wizard.py: work on env_copy in _build_config to avoid mutating caller's dict
- setup_wizard.py: skip model_list entries for providers with no credentials
- setup_wizard.py: prompt for azure deployment name
- setup_wizard.py: wrap os.execlp in try/except with friendly fallback
- setup_wizard.py: wrap config write in try/except OSError
- setup_wizard.py: fix _validate_and_report to use two print lines (no \r overwrite)
- setup_wizard.py: add .gitignore tip next to key storage notice
- setup_wizard.py: fix run_setup_wizard() return type annotation to None
- scripts/install.sh: drop pipefail (not supported by dash on Ubuntu when invoked as sh)
- scripts/install.sh: use litellm[proxy] from PyPI (not hardcoded dev branch)
- scripts/install.sh: guard /dev/tty read with -r check for Docker/CI compat
- scripts/install.sh: remove --force-reinstall to avoid downgrading dependencies
- tests/test_litellm/test_setup_wizard.py: 13 unit tests for _build_config and _yaml_escape
* style: black format setup_wizard.py
* fix: address remaining greptile issues - Windows compat, YAML quoting, credential flow
- guard termios/tty imports with try/except ImportError for Windows compat
- quote master_key as YAML double-quoted scalar (same as env vars)
- remove unused port param from _build_config signature
- _validate_and_report now returns the final key so re-entered creds are stored
- add test for master_key YAML quoting
* fix: add --port to suggested command, guard /dev/tty exec in install.sh
* fix: quote api_base in YAML, skip azure if no deployment, only redraw on state change
* fix: address greptile review comments
- _yaml_escape: add control character escaping (\n, \r, \t)
- test: fix tautological assertion in test_build_config_azure_no_deployment_skipped
- test: add tests for control character escaping in _yaml_escape
* feat(ui): remove Chat UI page link and banner from sidebar and playground (#23908)
* feat(guardrails): MCPJWTSigner - built-in guardrail for zero trust MCP auth (#23897)
* Allow pre_mcp_call guardrail hooks to mutate outbound MCP headers
* Enhance MCPServerManager to support hook-modified arguments and extra headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present.
* Refactor MCPServerManager to raise HTTPException for extra headers in OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings.
* Allow pre_mcp_call guardrail hooks to mutate outbound MCP headers
* Enhance MCPServerManager to support hook-modified arguments and extra headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present.
* Refactor MCPServerManager to raise HTTPException for extra headers in OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings.
* feat(guardrails): add MCPJWTSigner built-in guardrail for zero trust MCP auth
Signs outbound MCP tool calls with a LiteLLM-issued RS256 JWT so MCP servers
can trust a single signing authority instead of every upstream IdP.
Enable in config.yaml:
guardrails:
- guardrail_name: mcp-jwt-signer
litellm_params:
guardrail: mcp_jwt_signer
mode: pre_mcp_call
default_on: true
JWT carries sub (user_id), act.sub (team_id, RFC 8693), tool-level scope, iss,
aud, iat/exp/nbf. RSA-2048 keypair auto-generated at startup unless
MCP_JWT_SIGNING_KEY env var is set.
Adds /.well-known/jwks.json endpoint and jwks_uri to /.well-known/openid-configuration
so MCP servers can verify LiteLLM-issued tokens via OIDC discovery.
* Update MCPServerManager to raise HTTPException with status code 400 for extra headers in OpenAPI-backed servers. Adjust tests to verify the correct status code and exception message.
* fix: address P1 issues in MCPJWTSigner
- OpenAPI servers: warn + skip header injection instead of 500
- JWKS Cache-Control: 5min for auto-generated keys, 1h for persistent
- sub claim: fallback to apikey:{token_hash} for anonymous callers
- ttl_seconds: validate > 0 at init time
* docs: add MCP zero trust auth guide with architecture diagram
* docs: add FastMCP JWT verification guide to zero trust doc
* fix: address remaining Greptile review issues (round 2)
- mcp_server_manager: warn when hook Authorization overwrites existing header
- __init__: remove _mcp_jwt_signer_instance from __all__ (private internal)
- discoverable_endpoints: copy dict instead of mutating in-place on OIDC augmentation
- test docstring: reflect warn-and-continue behavior for OpenAPI servers
- test: update scope assertions for least-privilege (no mcp:tools/list on tool-call JWTs)
* fix: address Greptile round 3 feedback
- initialize_guardrail: validate mode='pre_mcp_call' at init time — misconfigured
mode silently bypasses JWT injection, which is a zero-trust bypass
- _build_claims: remove duplicate inline 'import re' (module-level import already present)
- _types.py: add TODO comment explaining jwt_claims is forward-compat plumbing
for a follow-up PR that will forward upstream IdP claims into outbound MCP JWTs
* feat(mcp_jwt_signer): add verify+re-sign, claim ops, two-token model, configurable scopes
Addresses all missing pieces from the scoping doc review:
FR-5 (Verify + re-sign): MCPJWTSigner now accepts access_token_discovery_uri
and token_introspection_endpoint. When set, the incoming Bearer token is
extracted from raw_headers (threaded through pre_call_tool_check), verified
against the IdP's JWKS (JWT) or introspected (opaque), and only re-signed if
valid. Falls back to user_api_key_dict.jwt_claims for LiteLLM JWT-auth mode.
FR-12 (Configurable end-user identity mapping): end_user_claim_sources
ordered list drives sub resolution — sources: token:<claim>, litellm:user_id,
litellm:email, litellm:end_user_id, litellm:team_id.
FR-13 (Claim operations): add_claims (insert-if-absent), set_claims (always
override), remove_claims (delete) applied in that order.
FR-14 (Two-token model): channel_token_audience + channel_token_ttl issue a
second JWT injected as x-mcp-channel-token: Bearer <token>.
FR-15 (Incoming claim validation): required_claims raises HTTP 403 when any
listed claim is absent; optional_claims passes listed claims from verified
token into the outbound JWT.
FR-9 (Debug headers): debug_headers: true emits x-litellm-mcp-debug with kid,
sub, iss, exp, scope.
FR-10 (Configurable scopes): allowed_scopes replaces auto-generation. Also
fixed: tool-call JWTs no longer grant mcp:tools/list (overpermission).
P1 fixes:
- proxy/utils.py: _convert_mcp_hook_response_to_kwargs merges rather than
replaces extra_headers, preserving headers from prior guardrails.
- mcp_server_manager.py: warns when hook injects Authorization alongside a
server-configured authentication_token (previously silent).
- mcp_server_manager.py: pre_call_tool_check now accepts raw_headers and
extracts incoming_bearer_token so FR-5 verification has the raw token.
- proxy/utils.py: remove stray inline import inspect inside loop (pre-existing
lint error, now cleaned up).
Tests: 43 passing (28 new tests covering all FR flags + P1 fixes).
* feat(mcp_jwt_signer): add verify+re-sign, claim ops, two-token model, configurable scopes (core)
Remaining files from the FR implementation:
mcp_jwt_signer.py — full rewrite with all new params:
FR-5: access_token_discovery_uri, token_introspection_endpoint,
verify_issuer, verify_audience + _verify_incoming_jwt(),
_introspect_opaque_token()
FR-12: end_user_claim_sources ordered resolution chain
FR-13: add_claims, set_claims, remove_claims
FR-14: channel_token_audience, channel_token_ttl → x-mcp-channel-token
FR-15: required_claims (raises 403), optional_claims (passthrough)
FR-9: debug_headers → x-litellm-mcp-debug
FR-10: allowed_scopes; tool-call JWTs no longer over-grant tools/list
mcp_server_manager.py:
- pre_call_tool_check gains raw_headers param to extract incoming_bearer_token
- Silent Authorization override warning fixed: now fires when server has
authentication_token AND hook injects Authorization
tests/test_mcp_jwt_signer.py:
28 new tests covering all FR flags + P1 fixes (43 total, all passing)
* fix(mcp_jwt_signer): address pre-landing review issues
- Remove stale TODO comment on UserAPIKeyAuth.jwt_claims — the field is
already populated and consumed by MCPJWTSigner in the same PR
- Fix _get_oidc_discovery to only cache the OIDC discovery doc when
jwks_uri is present; a malformed/empty doc now retries on the next
request instead of being permanently cached until proxy restart
- Add FR-5 test coverage for _fetch_jwks (cache hit/miss),
_get_oidc_discovery (cache/no-cache on bad doc), _verify_incoming_jwt
(valid token, expired token), _introspect_opaque_token (active,
inactive, no endpoint), and the end-to-end 401 hook path — 53 tests
total, all passing
* docs(mcp_zero_trust): rewrite as use-case guide covering all new JWT signer features
Add scenario-driven sections for each new config area:
- Verify+re-sign with Okta/Azure AD (access_token_discovery_uri,
end_user_claim_sources, token_introspection_endpoint)
- Enforcing caller attributes with required_claims / optional_claims
- Adding metadata via add_claims / set_claims / remove_claims
- Two-token model for AWS Bedrock AgentCore Gateway
(channel_token_audience / channel_token_ttl)
- Controlling scopes with allowed_scopes
- Debugging JWT rejections with debug_headers
Update JWT claims table to reflect configurable sub (end_user_claim_sources)
* fix(mcp_jwt_signer): wire all config.yaml params through initialize_guardrail
The factory was only passing issuer/audience/ttl_seconds to MCPJWTSigner.
All FR-5/9/10/12/13/14/15 params (access_token_discovery_uri,
end_user_claim_sources, add/set/remove_claims, channel_token_audience,
required/optional_claims, debug_headers, allowed_scopes, etc.) were
silently dropped, making every advertised advanced feature non-functional
when loaded from config.yaml.
Add regression test that asserts every param is wired through correctly.
* docs(mcp_zero_trust): add hero image
* docs(mcp_zero_trust): apply Linear-style edits
- Lead with the problem (unsigned direct calls bypass access controls)
- Shorter statement section headers instead of question-form headers
- Move diagram/OIDC discovery block after the reader is bought in
- Add 'read further only if you need to' callout after basic setup
- Two-token section now opens from the user problem not product jargon
- Add concrete 403 error response example in required_claims section
- Debug section opens from the symptom (MCP server returning 401)
- Lowercase claims reference header for consistency
* fix(mcp_jwt_signer): fix algorithm confusion attack + add OIDC discovery 24h TTL
- Remove alg from unverified JWT header; use signing_jwk.algorithm_name from JWKS key instead.
Reading alg from attacker-controlled headers enables alg:none / HS256 confusion attacks.
- Add _oidc_discovery_fetched_at timestamp and _OIDC_DISCOVERY_TTL = 86400 (24h).
Without a TTL the cached discovery doc never refreshes, so IdP key rotation is invisible.
---------
Co-authored-by: Noah Nistler <60981020+noahnistler@users.noreply.github.com>
* fix(ci): stabilize CI - formatting, type errors, test polling, security CVEs, router bug, batch resolution
Fix 1: Run Black formatter on 35 files
Fix 2: Fix MyPy type errors:
- setup_wizard.py: add type annotation for 'selected' set variable
- user_api_key_auth.py: remove redundant type annotation on jwt_claims reassignment
Fix 3: Fix spend accuracy test burst 2 polling to wait for expected total
spend instead of just 'any increase' from burst 2
Fix 4: Bump Next.js 16.1.6 -> 16.1.7 to fix CVE-2026-27978, CVE-2026-27979,
CVE-2026-27980, CVE-2026-29057
Fix 5: Fix router _pre_call_checks model variable being overwritten inside
loop, causing wrong model lookups on subsequent deployments. Use local
_deployment_model variable instead.
Fix 6: Add missing resolve_output_file_ids_to_unified call in batch retrieve
non-terminal-to-terminal path (matching the terminal path behavior)
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* chore: regenerate poetry.lock to sync with pyproject.toml
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: format merged files from main and regenerate poetry.lock
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix(mypy): annotate jwt_claims as Optional[dict] to fix type incompatibility
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix(ci): update router region test to use gpt-4.1-mini (fix flaky model lookup)
Replace deprecated gpt-3.5-turbo-1106 with gpt-4.1-mini + mock_response in
test_router_region_pre_call_check, following the same pattern used in commit
717d37cc5b for test_router_context_window_check_pre_call_check_out_group.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* ci: retry flaky logging_testing (async event loop race condition)
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix(ci): aggregate all mock calls in langfuse e2e test to fix race condition
The _verify_langfuse_call helper only inspected the last mock call
(mock_post.call_args), but the Langfuse SDK may split trace-create and
generation-create events across separate HTTP flush cycles. This caused
an IndexError when the last call's batch contained only one event type.
Fix: iterate over mock_post.call_args_list to collect batch items from
ALL calls. Also add a safety assertion after filtering by trace_id and
mark all langfuse e2e tests with @pytest.mark.flaky(retries=3) as an
extra safety net for any residual timing issues.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix(ci): black formatting + update OpenAPI compliance tests for spec changes
- Apply Black 26.x formatting to litellm_logging.py (parenthesized style)
- Update test_input_types_match_spec to follow $ref to InteractionsInput schema
(Google updated their OpenAPI spec to use $ref instead of inline oneOf)
- Update test_content_schema_uses_discriminator to handle discriminator without
explicit mapping (Google removed the mapping key from Content discriminator)
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* revert: undo incorrect Black 26.x formatting on litellm_logging.py
The file was correctly formatted for Black 23.12.1 (the version pinned
in pyproject.toml). The previous commit applied Black 26.x formatting
which was incompatible with the CI's Black version.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix(ci): deduplicate and sort langfuse batch events after aggregation
The Langfuse SDK may send the same event (e.g., trace-create) in
multiple flush cycles, causing duplicates when we aggregate from all
mock calls. After filtering by trace_id, deduplicate by keeping only
the first event of each type, then sort to ensure trace-create is at
index 0 and generation-create at index 1.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
---------
Co-authored-by: Noah Nistler <60981020+noahnistler@users.noreply.github.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
5428 lines
205 KiB
Python
5428 lines
205 KiB
Python
"""
|
||
KEY MANAGEMENT
|
||
|
||
All /key management endpoints
|
||
|
||
/key/generate
|
||
/key/info
|
||
/key/update
|
||
/key/delete
|
||
"""
|
||
|
||
import asyncio
|
||
import copy
|
||
import inspect
|
||
import json
|
||
import os
|
||
import re
|
||
import secrets
|
||
import traceback
|
||
from datetime import datetime, timedelta, timezone
|
||
from typing import Any, Dict, List, Literal, Optional, Tuple, cast
|
||
|
||
import fastapi
|
||
import yaml
|
||
from fastapi import APIRouter, Depends, Header, HTTPException, Query, Request, status
|
||
|
||
import litellm
|
||
from litellm._logging import verbose_proxy_logger
|
||
from litellm._uuid import uuid
|
||
from litellm.caching import DualCache
|
||
from litellm.constants import (
|
||
LENGTH_OF_LITELLM_GENERATED_KEY,
|
||
LITELLM_PROXY_ADMIN_NAME,
|
||
UI_SESSION_TOKEN_TEAM_ID,
|
||
)
|
||
from litellm.litellm_core_utils.duration_parser import duration_in_seconds
|
||
from litellm.litellm_core_utils.safe_json_dumps import safe_dumps
|
||
from litellm.proxy._experimental.mcp_server.db import (
|
||
rotate_mcp_server_credentials_master_key,
|
||
)
|
||
from litellm.proxy._types import *
|
||
from litellm.proxy._types import LiteLLM_VerificationToken
|
||
from litellm.proxy.auth.auth_checks import (
|
||
_cache_key_object,
|
||
_delete_cache_key_object,
|
||
can_team_access_model,
|
||
get_key_object,
|
||
get_org_object,
|
||
get_project_object,
|
||
get_team_object,
|
||
)
|
||
from litellm.proxy.auth.auth_utils import abbreviate_api_key
|
||
from litellm.proxy.auth.user_api_key_auth import user_api_key_auth
|
||
from litellm.proxy.common_utils.timezone_utils import get_budget_reset_time
|
||
from litellm.proxy.hooks.key_management_event_hooks import KeyManagementEventHooks
|
||
from litellm.proxy.management_endpoints.common_utils import (
|
||
_is_user_org_admin_for_team,
|
||
_is_user_team_admin,
|
||
_set_object_metadata_field,
|
||
)
|
||
from litellm.proxy.management_endpoints.model_management_endpoints import (
|
||
_add_model_to_db,
|
||
)
|
||
from litellm.proxy.management_helpers.object_permission_utils import (
|
||
_set_object_permission,
|
||
attach_object_permission_to_dict,
|
||
handle_update_object_permission_common,
|
||
validate_key_mcp_servers_against_team,
|
||
)
|
||
from litellm.proxy.management_helpers.team_member_permission_checks import (
|
||
TeamMemberPermissionChecks,
|
||
)
|
||
from litellm.proxy.management_helpers.utils import management_endpoint_wrapper
|
||
from litellm.proxy.spend_tracking.spend_tracking_utils import _is_master_key
|
||
from litellm.proxy.ui_crud_endpoints.proxy_setting_endpoints import (
|
||
get_ui_settings_cached,
|
||
)
|
||
from litellm.proxy.utils import (
|
||
PrismaClient,
|
||
ProxyLogging,
|
||
_hash_token_if_needed,
|
||
handle_exception_on_proxy,
|
||
is_valid_api_key,
|
||
)
|
||
from litellm.router import Router
|
||
from litellm.secret_managers.main import get_secret
|
||
from litellm.types.proxy.management_endpoints.key_management_endpoints import (
|
||
BulkUpdateKeyRequest,
|
||
BulkUpdateKeyRequestItem,
|
||
BulkUpdateKeyResponse,
|
||
FailedKeyUpdate,
|
||
SuccessfulKeyUpdate,
|
||
)
|
||
from litellm.types.router import Deployment
|
||
from litellm.types.utils import (
|
||
BudgetConfig,
|
||
PersonalUIKeyGenerationConfig,
|
||
TeamUIKeyGenerationConfig,
|
||
)
|
||
|
||
|
||
async def _check_custom_key_allowed(custom_key_value: Optional[str]) -> None:
|
||
"""Raise 403 if custom API keys are disabled and a custom key was provided."""
|
||
if custom_key_value is None:
|
||
return
|
||
|
||
ui_settings = await get_ui_settings_cached()
|
||
if ui_settings.get("disable_custom_api_keys", False) is True:
|
||
verbose_proxy_logger.warning(
|
||
"Custom API key rejected: disable_custom_api_keys is enabled"
|
||
)
|
||
raise HTTPException(
|
||
status_code=403,
|
||
detail={
|
||
"error": "Custom API key values are disabled by your administrator. Keys must be auto-generated."
|
||
},
|
||
)
|
||
|
||
|
||
def _is_team_key(data: Union[GenerateKeyRequest, LiteLLM_VerificationToken]):
|
||
return data.team_id is not None
|
||
|
||
|
||
def _get_user_in_team(
|
||
team_table: LiteLLM_TeamTableCachedObj, user_id: Optional[str]
|
||
) -> Optional[Member]:
|
||
if user_id is None:
|
||
return None
|
||
for member in team_table.members_with_roles:
|
||
if member.user_id is not None and member.user_id == user_id:
|
||
return member
|
||
|
||
return None
|
||
|
||
|
||
def _calculate_key_rotation_time(rotation_interval: str) -> datetime:
|
||
"""
|
||
Helper function to calculate the next rotation time for a key based on the rotation interval.
|
||
|
||
Args:
|
||
rotation_interval: String representing the rotation interval (e.g., '30d', '90d', '1h')
|
||
|
||
Returns:
|
||
datetime: The calculated next rotation time in UTC
|
||
"""
|
||
now = datetime.now(timezone.utc)
|
||
interval_seconds = duration_in_seconds(rotation_interval)
|
||
return now + timedelta(seconds=interval_seconds)
|
||
|
||
|
||
def _set_key_rotation_fields(
|
||
data: dict,
|
||
auto_rotate: bool,
|
||
rotation_interval: Optional[str],
|
||
existing_key_alias: Optional[str] = None,
|
||
) -> None:
|
||
"""
|
||
Helper function to set rotation fields in key data if auto_rotate is enabled.
|
||
|
||
Args:
|
||
data: Dictionary to update with rotation fields
|
||
auto_rotate: Whether auto rotation is enabled
|
||
rotation_interval: The rotation interval string (required if auto_rotate is True)
|
||
existing_key_alias: The existing key alias from the database (if any)
|
||
"""
|
||
if auto_rotate and rotation_interval:
|
||
if (
|
||
litellm._key_management_settings is not None
|
||
and litellm._key_management_settings.store_virtual_keys is True
|
||
and data.get("key_alias") is None
|
||
and existing_key_alias is None
|
||
):
|
||
raise ProxyException(
|
||
message="key_alias is required when auto_rotate=True and store_virtual_keys is enabled. This ensures stable secret naming during rotation.",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="key_alias",
|
||
code=400,
|
||
)
|
||
data.update(
|
||
{
|
||
"auto_rotate": auto_rotate,
|
||
"rotation_interval": rotation_interval,
|
||
"key_rotation_at": _calculate_key_rotation_time(rotation_interval),
|
||
}
|
||
)
|
||
|
||
|
||
def _is_allowed_to_make_key_request(
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
user_id: Optional[str],
|
||
team_id: Optional[str],
|
||
) -> bool:
|
||
"""
|
||
Assert user only creates/updates keys for themselves
|
||
|
||
Relevant issue: https://github.com/BerriAI/litellm/issues/7336
|
||
"""
|
||
## BASE CASE - PROXY ADMIN
|
||
if (
|
||
user_api_key_dict.user_role is not None
|
||
and user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
|
||
):
|
||
return True
|
||
|
||
if user_id is not None:
|
||
assert (
|
||
user_id == user_api_key_dict.user_id
|
||
), "User can only create keys for themselves. Got user_id={}, Your ID={}".format(
|
||
user_id, user_api_key_dict.user_id
|
||
)
|
||
|
||
if team_id is not None:
|
||
if (
|
||
user_api_key_dict.team_id is not None
|
||
and user_api_key_dict.team_id == UI_TEAM_ID
|
||
):
|
||
return True # handle https://github.com/BerriAI/litellm/issues/7482
|
||
|
||
return True
|
||
|
||
|
||
def _team_key_operation_team_member_check(
|
||
assigned_user_id: Optional[str],
|
||
team_table: LiteLLM_TeamTableCachedObj,
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
team_key_generation: TeamUIKeyGenerationConfig,
|
||
route: KeyManagementRoutes,
|
||
):
|
||
if assigned_user_id is not None:
|
||
key_assigned_user_in_team = _get_user_in_team(
|
||
team_table=team_table, user_id=assigned_user_id
|
||
)
|
||
|
||
if key_assigned_user_in_team is None:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"User={assigned_user_id} not assigned to team={team_table.team_id}",
|
||
)
|
||
|
||
team_member_object = _get_user_in_team(
|
||
team_table=team_table, user_id=user_api_key_dict.user_id
|
||
)
|
||
|
||
is_admin = (
|
||
user_api_key_dict.user_role is not None
|
||
and user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
|
||
)
|
||
|
||
if is_admin:
|
||
return True
|
||
elif team_member_object is None:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"User={user_api_key_dict.user_id} not assigned to team={team_table.team_id}",
|
||
)
|
||
elif (
|
||
"allowed_team_member_roles" in team_key_generation
|
||
and team_member_object.role
|
||
not in team_key_generation["allowed_team_member_roles"]
|
||
):
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Team member role {team_member_object.role} not in allowed_team_member_roles={team_key_generation['allowed_team_member_roles']}",
|
||
)
|
||
|
||
TeamMemberPermissionChecks.does_team_member_have_permissions_for_endpoint(
|
||
team_member_object=team_member_object,
|
||
team_table=team_table,
|
||
route=route,
|
||
)
|
||
return True
|
||
|
||
|
||
def _key_generation_required_param_check(
|
||
data: GenerateKeyRequest, required_params: Optional[List[str]]
|
||
):
|
||
if required_params is None:
|
||
return True
|
||
|
||
data_dict = data.model_dump(exclude_unset=True)
|
||
for param in required_params:
|
||
if param not in data_dict:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Required param {param} not in data",
|
||
)
|
||
return True
|
||
|
||
|
||
def _team_key_generation_check(
|
||
team_table: LiteLLM_TeamTableCachedObj,
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
data: GenerateKeyRequest,
|
||
route: KeyManagementRoutes,
|
||
):
|
||
if user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value:
|
||
return True
|
||
if (
|
||
litellm.key_generation_settings is not None
|
||
and "team_key_generation" in litellm.key_generation_settings
|
||
):
|
||
_team_key_generation = litellm.key_generation_settings["team_key_generation"]
|
||
else:
|
||
_team_key_generation = TeamUIKeyGenerationConfig(
|
||
allowed_team_member_roles=["admin", "user"],
|
||
)
|
||
|
||
_team_key_operation_team_member_check(
|
||
assigned_user_id=data.user_id,
|
||
team_table=team_table,
|
||
user_api_key_dict=user_api_key_dict,
|
||
team_key_generation=_team_key_generation,
|
||
route=route,
|
||
)
|
||
_key_generation_required_param_check(
|
||
data,
|
||
_team_key_generation.get("required_params"),
|
||
)
|
||
|
||
return True
|
||
|
||
|
||
def _personal_key_membership_check(
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
personal_key_generation: Optional[PersonalUIKeyGenerationConfig],
|
||
):
|
||
if (
|
||
personal_key_generation is None
|
||
or "allowed_user_roles" not in personal_key_generation
|
||
):
|
||
return True
|
||
|
||
if user_api_key_dict.user_role not in personal_key_generation["allowed_user_roles"]:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Personal key creation has been restricted by admin. Allowed roles={litellm.key_generation_settings['personal_key_generation']['allowed_user_roles']}. Your role={user_api_key_dict.user_role}", # type: ignore
|
||
)
|
||
|
||
return True
|
||
|
||
|
||
def _personal_key_generation_check(
|
||
user_api_key_dict: UserAPIKeyAuth, data: GenerateKeyRequest
|
||
):
|
||
if (
|
||
litellm.key_generation_settings is None
|
||
or litellm.key_generation_settings.get("personal_key_generation") is None
|
||
):
|
||
return True
|
||
|
||
_personal_key_generation = litellm.key_generation_settings["personal_key_generation"] # type: ignore
|
||
|
||
_personal_key_membership_check(
|
||
user_api_key_dict,
|
||
personal_key_generation=_personal_key_generation,
|
||
)
|
||
|
||
_key_generation_required_param_check(
|
||
data,
|
||
_personal_key_generation.get("required_params"),
|
||
)
|
||
|
||
return True
|
||
|
||
|
||
def key_generation_check(
|
||
team_table: Optional[LiteLLM_TeamTableCachedObj],
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
data: GenerateKeyRequest,
|
||
route: KeyManagementRoutes,
|
||
) -> bool:
|
||
"""
|
||
Check if admin has restricted key creation to certain roles for teams or individuals
|
||
"""
|
||
|
||
## check if key is for team or individual
|
||
is_team_key = _is_team_key(data=data)
|
||
_is_admin = (
|
||
user_api_key_dict.user_role is not None
|
||
and user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
|
||
)
|
||
if is_team_key:
|
||
if team_table is None and litellm.key_generation_settings is not None:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Unable to find team object in database. Team ID: {data.team_id}",
|
||
)
|
||
elif team_table is None:
|
||
if _is_admin:
|
||
return True # admins can assign team_id without team table
|
||
# Non-admin callers must have a valid team (LIT-1884)
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Unable to find team object in database. Team ID: {data.team_id}",
|
||
)
|
||
return _team_key_generation_check(
|
||
team_table=team_table,
|
||
user_api_key_dict=user_api_key_dict,
|
||
data=data,
|
||
route=route,
|
||
)
|
||
else:
|
||
return _personal_key_generation_check(
|
||
user_api_key_dict=user_api_key_dict, data=data
|
||
)
|
||
|
||
|
||
def common_key_access_checks(
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
data: Union[GenerateKeyRequest, UpdateKeyRequest],
|
||
llm_router: Optional[Router],
|
||
premium_user: bool,
|
||
user_id: Optional[str] = None,
|
||
) -> Literal[True]:
|
||
"""
|
||
Check if user is allowed to make a key request, for this key
|
||
"""
|
||
try:
|
||
_is_allowed_to_make_key_request(
|
||
user_api_key_dict=user_api_key_dict,
|
||
user_id=user_id or data.user_id,
|
||
team_id=data.team_id,
|
||
)
|
||
except AssertionError as e:
|
||
raise HTTPException(
|
||
status_code=403,
|
||
detail=str(e),
|
||
)
|
||
except Exception as e:
|
||
raise HTTPException(
|
||
status_code=500,
|
||
detail=str(e),
|
||
)
|
||
|
||
_check_model_access_group(
|
||
models=data.models,
|
||
llm_router=llm_router,
|
||
premium_user=premium_user,
|
||
)
|
||
return True
|
||
|
||
|
||
router = APIRouter()
|
||
|
||
|
||
def handle_key_type(data: GenerateKeyRequest, data_json: dict) -> dict:
|
||
"""
|
||
Handle the key type.
|
||
"""
|
||
key_type = data.key_type
|
||
data_json.pop("key_type", None)
|
||
if key_type == LiteLLMKeyType.LLM_API:
|
||
data_json["allowed_routes"] = ["llm_api_routes"]
|
||
elif key_type == LiteLLMKeyType.MANAGEMENT:
|
||
data_json["allowed_routes"] = ["management_routes"]
|
||
elif key_type == LiteLLMKeyType.READ_ONLY:
|
||
data_json["allowed_routes"] = ["info_routes"]
|
||
return data_json
|
||
|
||
|
||
async def validate_team_id_used_in_service_account_request(
|
||
team_id: Optional[str],
|
||
prisma_client: Optional[PrismaClient],
|
||
):
|
||
"""
|
||
Validate team_id is used in the request body for generating a service account key
|
||
"""
|
||
if team_id is None:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail="team_id is required for service account keys. Please specify `team_id` in the request body.",
|
||
)
|
||
|
||
if prisma_client is None:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail="prisma_client is required for service account keys. Please specify `prisma_client` in the request body.",
|
||
)
|
||
|
||
# check if team_id exists in the database
|
||
team = await prisma_client.db.litellm_teamtable.find_unique(
|
||
where={"team_id": team_id},
|
||
)
|
||
if team is None:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail="team_id does not exist in the database. Please specify a valid `team_id` in the request body.",
|
||
)
|
||
return True
|
||
|
||
|
||
async def _common_key_generation_helper( # noqa: PLR0915
|
||
data: GenerateKeyRequest,
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
litellm_changed_by: Optional[str],
|
||
team_table: Optional[LiteLLM_TeamTableCachedObj],
|
||
) -> GenerateKeyResponse:
|
||
from litellm.proxy.proxy_server import (
|
||
litellm_proxy_admin_name,
|
||
llm_router,
|
||
premium_user,
|
||
prisma_client,
|
||
)
|
||
|
||
common_key_access_checks(
|
||
user_api_key_dict=user_api_key_dict,
|
||
data=data,
|
||
llm_router=llm_router,
|
||
premium_user=premium_user,
|
||
)
|
||
|
||
if (
|
||
data.metadata is not None
|
||
and data.metadata.get("service_account_id") is not None
|
||
and data.team_id is None
|
||
):
|
||
await validate_team_id_used_in_service_account_request(
|
||
team_id=data.team_id,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
# check if user set default key/generate params on config.yaml
|
||
if litellm.default_key_generate_params is not None:
|
||
for elem in data:
|
||
key, value = elem
|
||
if value is None and key in [
|
||
"max_budget",
|
||
"user_id",
|
||
"team_id",
|
||
"max_parallel_requests",
|
||
"tpm_limit",
|
||
"rpm_limit",
|
||
"budget_duration",
|
||
"duration",
|
||
]:
|
||
setattr(data, key, litellm.default_key_generate_params.get(key, None))
|
||
elif key == "models" and value == []:
|
||
setattr(data, key, litellm.default_key_generate_params.get(key, []))
|
||
elif key == "metadata" and value == {}:
|
||
setattr(data, key, litellm.default_key_generate_params.get(key, {}))
|
||
|
||
# check if user set default key/generate params on config.yaml
|
||
if litellm.upperbound_key_generate_params is not None:
|
||
for elem in data:
|
||
key, value = elem
|
||
upperbound_value = getattr(
|
||
litellm.upperbound_key_generate_params, key, None
|
||
)
|
||
if upperbound_value is not None:
|
||
if value is None:
|
||
# Use the upperbound value if user didn't provide a value
|
||
setattr(data, key, upperbound_value)
|
||
else:
|
||
# Compare with upperbound for numeric fields
|
||
if key in [
|
||
"max_budget",
|
||
"max_parallel_requests",
|
||
"tpm_limit",
|
||
"rpm_limit",
|
||
]:
|
||
if value > upperbound_value:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": f"{key} is over max limit set in config - user_value={value}; max_value={upperbound_value}"
|
||
},
|
||
)
|
||
# Compare durations
|
||
elif key in ["budget_duration", "duration"]:
|
||
upperbound_duration = duration_in_seconds(
|
||
duration=upperbound_value
|
||
)
|
||
# Handle special case where duration is None or "-1" (never expires)
|
||
if value is None or value == "-1":
|
||
user_duration = float("inf") # Infinite duration
|
||
else:
|
||
user_duration = duration_in_seconds(duration=value)
|
||
if user_duration > upperbound_duration:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": f"{key} is over max limit set in config - user_value={value}; max_value={upperbound_value}"
|
||
},
|
||
)
|
||
|
||
# APPLY ENTERPRISE KEY MANAGEMENT PARAMS
|
||
try:
|
||
from litellm_enterprise.proxy.management_endpoints.key_management_endpoints import (
|
||
apply_enterprise_key_management_params,
|
||
)
|
||
|
||
data = apply_enterprise_key_management_params(data, team_table)
|
||
except Exception as e:
|
||
verbose_proxy_logger.debug(
|
||
"litellm.proxy.proxy_server.generate_key_fn(): Enterprise key management params not applied - {}".format(
|
||
str(e)
|
||
)
|
||
)
|
||
|
||
# TODO: @ishaan-jaff: Migrate all budget tracking to use LiteLLM_BudgetTable
|
||
_budget_id = data.budget_id
|
||
if prisma_client is not None and data.soft_budget is not None:
|
||
# create the Budget Row for the LiteLLM Verification Token
|
||
budget_row = LiteLLM_BudgetTable(
|
||
soft_budget=data.soft_budget,
|
||
model_max_budget=data.model_max_budget or {},
|
||
)
|
||
new_budget = prisma_client.jsonify_object(budget_row.json(exclude_none=True))
|
||
|
||
_budget = await prisma_client.db.litellm_budgettable.create(
|
||
data={
|
||
**new_budget, # type: ignore
|
||
"created_by": user_api_key_dict.user_id or litellm_proxy_admin_name,
|
||
"updated_by": user_api_key_dict.user_id or litellm_proxy_admin_name,
|
||
}
|
||
)
|
||
_budget_id = getattr(_budget, "budget_id", None)
|
||
|
||
# ADD METADATA FIELDS
|
||
# Set Management Endpoint Metadata Fields
|
||
for field in LiteLLM_ManagementEndpoint_MetadataFields_Premium:
|
||
if getattr(data, field, None) is not None:
|
||
_set_object_metadata_field(
|
||
object_data=data,
|
||
field_name=field,
|
||
value=getattr(data, field),
|
||
)
|
||
delattr(data, field)
|
||
|
||
for field in LiteLLM_ManagementEndpoint_MetadataFields:
|
||
if getattr(data, field, None) is not None:
|
||
_set_object_metadata_field(
|
||
object_data=data,
|
||
field_name=field,
|
||
value=getattr(data, field),
|
||
)
|
||
delattr(data, field)
|
||
|
||
data_json = data.model_dump(exclude_unset=True, exclude_none=True) # type: ignore
|
||
|
||
data_json = handle_key_type(data, data_json)
|
||
|
||
# if we get max_budget passed to /key/generate, then use it as key_max_budget. Since generate_key_helper_fn is used to make new users
|
||
if "max_budget" in data_json:
|
||
data_json["key_max_budget"] = data_json.pop("max_budget", None)
|
||
if _budget_id is not None:
|
||
data_json["budget_id"] = _budget_id
|
||
|
||
# Only set budget_duration on key when explicitly provided. Keys with budget_id
|
||
# but no explicit budget_duration follow their linked budget tier's schedule;
|
||
# reset_budget_for_keys_linked_to_budgets() resets them when the tier resets.
|
||
# This avoids duplicating budget_duration on keys so tier updates apply automatically.
|
||
if "budget_duration" in data_json:
|
||
data_json["key_budget_duration"] = data_json.pop("budget_duration", None)
|
||
|
||
if user_api_key_dict.user_id is not None:
|
||
data_json["created_by"] = user_api_key_dict.user_id
|
||
data_json["updated_by"] = user_api_key_dict.user_id
|
||
|
||
# Set tags on the new key
|
||
if "tags" in data_json:
|
||
from litellm.proxy.proxy_server import premium_user
|
||
|
||
if premium_user is not True and data_json["tags"] is not None:
|
||
raise ValueError(
|
||
f"Only premium users can add tags to keys. {CommonProxyErrors.not_premium_user.value}"
|
||
)
|
||
|
||
_metadata = data_json.get("metadata")
|
||
if not _metadata:
|
||
data_json["metadata"] = {"tags": data_json["tags"]}
|
||
else:
|
||
data_json["metadata"]["tags"] = data_json["tags"]
|
||
|
||
data_json.pop("tags")
|
||
|
||
# Validate MCP servers in object_permission are within team scope
|
||
await validate_key_mcp_servers_against_team(
|
||
object_permission=data_json.get("object_permission"),
|
||
team_obj=team_table,
|
||
)
|
||
|
||
data_json = await _set_object_permission(
|
||
data_json=data_json,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
_validate_key_alias_format(key_alias=data_json.get("key_alias", None))
|
||
|
||
await _enforce_unique_key_alias(
|
||
key_alias=data_json.get("key_alias", None),
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
# Reject custom key values if disabled by admin
|
||
await _check_custom_key_allowed(data.key)
|
||
|
||
# Validate user-provided key format
|
||
if data.key is not None and not data.key.startswith("sk-"):
|
||
_masked = (
|
||
"{}****{}".format(data.key[:4], data.key[-4:])
|
||
if len(data.key) > 8
|
||
else "****"
|
||
)
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": f"Invalid key format. LiteLLM Virtual Key must start with 'sk-'. Received: {_masked}"
|
||
},
|
||
)
|
||
|
||
# check org key limits - done here to handle inheriting org id from team
|
||
if data.organization_id is not None:
|
||
from litellm.proxy.proxy_server import prisma_client, user_api_key_cache
|
||
|
||
if prisma_client:
|
||
org_table = await get_org_object(
|
||
org_id=data.organization_id,
|
||
user_api_key_cache=user_api_key_cache,
|
||
prisma_client=prisma_client,
|
||
)
|
||
if org_table is None:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Organization not found for organization_id={data.organization_id}",
|
||
)
|
||
await _check_org_key_limits(
|
||
org_table=org_table,
|
||
data=data,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
response = await generate_key_helper_fn(
|
||
request_type="key", **data_json, table_name="key"
|
||
)
|
||
|
||
response[
|
||
"soft_budget"
|
||
] = data.soft_budget # include the user-input soft budget in the response
|
||
|
||
response = GenerateKeyResponse(**response)
|
||
|
||
response.token = (
|
||
response.token_id
|
||
) # remap token to use the hash, and leave the key in the `key` field [TODO]: clean up generate_key_helper_fn to do this
|
||
|
||
asyncio.create_task(
|
||
KeyManagementEventHooks.async_key_generated_hook(
|
||
data=data,
|
||
response=response,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
)
|
||
)
|
||
|
||
return response
|
||
|
||
|
||
def _check_key_model_specific_limits(
|
||
keys: List[LiteLLM_VerificationToken],
|
||
data: Union[GenerateKeyRequest, UpdateKeyRequest],
|
||
entity_rpm_limit: Optional[int],
|
||
entity_tpm_limit: Optional[int],
|
||
entity_model_rpm_limit_dict: Dict[str, int],
|
||
entity_model_tpm_limit_dict: Dict[str, int],
|
||
entity_type: str, # "team" or "organization"
|
||
) -> None:
|
||
"""
|
||
Generic function to check if a key is allocating model specific limits.
|
||
Raises an error if we're overallocating.
|
||
"""
|
||
model_rpm_limit = getattr(data, "model_rpm_limit", None) or (
|
||
data.metadata.get("model_rpm_limit", None) if data.metadata else None
|
||
)
|
||
model_tpm_limit = getattr(data, "model_tpm_limit", None) or (
|
||
data.metadata.get("model_tpm_limit", None) if data.metadata else None
|
||
)
|
||
if model_rpm_limit is None and model_tpm_limit is None:
|
||
return
|
||
|
||
# get total model specific tpm/rpm limit
|
||
model_specific_rpm_limit: Dict[str, int] = {}
|
||
model_specific_tpm_limit: Dict[str, int] = {}
|
||
|
||
for key in keys:
|
||
if key.metadata.get("model_rpm_limit", None) is not None:
|
||
for model, rpm_limit in key.metadata.get("model_rpm_limit", {}).items():
|
||
model_specific_rpm_limit[model] = (
|
||
model_specific_rpm_limit.get(model, 0) + rpm_limit
|
||
)
|
||
if key.metadata.get("model_tpm_limit", None) is not None:
|
||
for model, tpm_limit in key.metadata.get("model_tpm_limit", {}).items():
|
||
model_specific_tpm_limit[model] = (
|
||
model_specific_tpm_limit.get(model, 0) + tpm_limit
|
||
)
|
||
|
||
if model_rpm_limit is not None:
|
||
for model, rpm_limit in model_rpm_limit.items():
|
||
if (
|
||
entity_rpm_limit is not None
|
||
and model_specific_rpm_limit.get(model, 0) + rpm_limit
|
||
> entity_rpm_limit
|
||
):
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Allocated RPM limit={model_specific_rpm_limit.get(model, 0)} + Key RPM limit={rpm_limit} is greater than {entity_type} RPM limit={entity_rpm_limit}",
|
||
)
|
||
elif entity_model_rpm_limit_dict:
|
||
entity_model_specific_rpm_limit = entity_model_rpm_limit_dict.get(model)
|
||
if (
|
||
entity_model_specific_rpm_limit
|
||
and model_specific_rpm_limit.get(model, 0) + rpm_limit
|
||
> entity_model_specific_rpm_limit
|
||
):
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Allocated RPM limit={model_specific_rpm_limit.get(model, 0)} + Key RPM limit={rpm_limit} is greater than {entity_type} RPM limit={entity_model_specific_rpm_limit}",
|
||
)
|
||
|
||
if model_tpm_limit is not None:
|
||
for model, tpm_limit in model_tpm_limit.items():
|
||
if (
|
||
entity_tpm_limit is not None
|
||
and model_specific_tpm_limit.get(model, 0) + tpm_limit
|
||
> entity_tpm_limit
|
||
):
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Allocated TPM limit={model_specific_tpm_limit.get(model, 0)} + Key TPM limit={tpm_limit} is greater than {entity_type} TPM limit={entity_tpm_limit}",
|
||
)
|
||
elif entity_model_tpm_limit_dict:
|
||
entity_model_specific_tpm_limit = entity_model_tpm_limit_dict.get(model)
|
||
if (
|
||
entity_model_specific_tpm_limit
|
||
and model_specific_tpm_limit.get(model, 0) + tpm_limit
|
||
> entity_model_specific_tpm_limit
|
||
):
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Allocated TPM limit={model_specific_tpm_limit.get(model, 0)} + Key TPM limit={tpm_limit} is greater than {entity_type} TPM limit={entity_model_specific_tpm_limit}",
|
||
)
|
||
|
||
|
||
def _check_key_rpm_tpm_limits(
|
||
keys: List[LiteLLM_VerificationToken],
|
||
data: Union[GenerateKeyRequest, UpdateKeyRequest],
|
||
entity_rpm_limit: Optional[int],
|
||
entity_tpm_limit: Optional[int],
|
||
entity_type: str, # "team" or "organization"
|
||
) -> None:
|
||
"""
|
||
Generic function to check if a key is allocating rpm/tpm limits.
|
||
Raises an error if we're overallocating.
|
||
"""
|
||
if keys is not None and len(keys) > 0:
|
||
allocated_tpm = sum(key.tpm_limit for key in keys if key.tpm_limit is not None)
|
||
allocated_rpm = sum(key.rpm_limit for key in keys if key.rpm_limit is not None)
|
||
else:
|
||
allocated_tpm = 0
|
||
allocated_rpm = 0
|
||
|
||
if (
|
||
data.tpm_limit is not None
|
||
and entity_tpm_limit is not None
|
||
and data.tpm_limit + allocated_tpm > entity_tpm_limit
|
||
):
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Allocated TPM limit={allocated_tpm} + Key TPM limit={data.tpm_limit} is greater than {entity_type} TPM limit={entity_tpm_limit}",
|
||
)
|
||
if (
|
||
data.rpm_limit is not None
|
||
and entity_rpm_limit is not None
|
||
and data.rpm_limit + allocated_rpm > entity_rpm_limit
|
||
):
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Allocated RPM limit={allocated_rpm} + Key RPM limit={data.rpm_limit} is greater than {entity_type} RPM limit={entity_rpm_limit}",
|
||
)
|
||
|
||
|
||
def check_team_key_model_specific_limits(
|
||
keys: List[LiteLLM_VerificationToken],
|
||
team_table: LiteLLM_TeamTableCachedObj,
|
||
data: Union[GenerateKeyRequest, UpdateKeyRequest],
|
||
) -> None:
|
||
"""
|
||
Check if the team key is allocating model specific limits. If so, raise an error if we're overallocating.
|
||
"""
|
||
entity_model_rpm_limit_dict = {}
|
||
entity_model_tpm_limit_dict = {}
|
||
if team_table.metadata:
|
||
entity_model_rpm_limit_dict = team_table.metadata.get("model_rpm_limit", {})
|
||
entity_model_tpm_limit_dict = team_table.metadata.get("model_tpm_limit", {})
|
||
|
||
_check_key_model_specific_limits(
|
||
keys=keys,
|
||
data=data,
|
||
entity_rpm_limit=team_table.rpm_limit,
|
||
entity_tpm_limit=team_table.tpm_limit,
|
||
entity_model_rpm_limit_dict=entity_model_rpm_limit_dict,
|
||
entity_model_tpm_limit_dict=entity_model_tpm_limit_dict,
|
||
entity_type="team",
|
||
)
|
||
|
||
|
||
def check_team_key_rpm_tpm_limits(
|
||
keys: List[LiteLLM_VerificationToken],
|
||
team_table: LiteLLM_TeamTableCachedObj,
|
||
data: Union[GenerateKeyRequest, UpdateKeyRequest],
|
||
) -> None:
|
||
"""
|
||
Check if the team key is allocating rpm/tpm limits. If so, raise an error if we're overallocating.
|
||
"""
|
||
_check_key_rpm_tpm_limits(
|
||
keys=keys,
|
||
data=data,
|
||
entity_rpm_limit=team_table.rpm_limit,
|
||
entity_tpm_limit=team_table.tpm_limit,
|
||
entity_type="team",
|
||
)
|
||
|
||
|
||
async def _check_team_key_limits(
|
||
team_table: LiteLLM_TeamTableCachedObj,
|
||
data: Union[GenerateKeyRequest, UpdateKeyRequest],
|
||
prisma_client: PrismaClient,
|
||
) -> None:
|
||
"""
|
||
Check if the team key is allocating guaranteed throughput limits. If so, raise an error if we're overallocating.
|
||
|
||
Only runs check if tpm_limit_type or rpm_limit_type is "guaranteed_throughput"
|
||
"""
|
||
if (
|
||
data.tpm_limit_type != "guaranteed_throughput"
|
||
and data.rpm_limit_type != "guaranteed_throughput"
|
||
):
|
||
return
|
||
# get all team keys
|
||
# calculate allocated tpm/rpm limit
|
||
# check if specified tpm/rpm limit is greater than allocated tpm/rpm limit
|
||
|
||
keys = await prisma_client.db.litellm_verificationtoken.find_many(
|
||
where={"team_id": team_table.team_id},
|
||
)
|
||
# Exclude the key being updated to avoid double-counting its limits.
|
||
# key.token is the SHA-256 hash stored in DB; data.key is the raw key string.
|
||
if isinstance(data, UpdateKeyRequest):
|
||
hashed_key = hash_token(data.key)
|
||
keys = [key for key in keys if key.token != hashed_key]
|
||
check_team_key_model_specific_limits(
|
||
keys=keys,
|
||
team_table=team_table,
|
||
data=data,
|
||
)
|
||
check_team_key_rpm_tpm_limits(
|
||
keys=keys,
|
||
team_table=team_table,
|
||
data=data,
|
||
)
|
||
|
||
|
||
async def _check_project_key_limits(
|
||
project_id: str,
|
||
data: Union[GenerateKeyRequest, UpdateKeyRequest],
|
||
prisma_client: PrismaClient,
|
||
user_api_key_cache: DualCache,
|
||
) -> None:
|
||
"""
|
||
Validate that key's models and budget respect its project's limits.
|
||
|
||
- Key models must be a subset of project models
|
||
- Key max_budget must be <= project max_budget
|
||
"""
|
||
project_obj = await get_project_object(
|
||
project_id=project_id,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
)
|
||
|
||
if project_obj is None:
|
||
raise HTTPException(
|
||
status_code=404,
|
||
detail={"error": f"Project not found, project_id={project_id}"},
|
||
)
|
||
|
||
# Validate key models are a subset of project models
|
||
if data.models and len(project_obj.models) > 0:
|
||
for m in data.models:
|
||
if m not in project_obj.models:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": f"Model '{m}' not in project's allowed models. Project allowed models={project_obj.models}. Project: {project_id}"
|
||
},
|
||
)
|
||
|
||
# Validate key max_budget <= project max_budget
|
||
project_max_budget = None
|
||
if project_obj.litellm_budget_table is not None:
|
||
project_max_budget = getattr(
|
||
project_obj.litellm_budget_table, "max_budget", None
|
||
)
|
||
|
||
if (
|
||
data.max_budget is not None
|
||
and project_max_budget is not None
|
||
and data.max_budget > project_max_budget
|
||
):
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": f"Key max_budget ({data.max_budget}) exceeds project's max_budget ({project_max_budget}). Project: {project_id}"
|
||
},
|
||
)
|
||
|
||
|
||
def check_org_key_model_specific_limits(
|
||
keys: List[LiteLLM_VerificationToken],
|
||
org_table: LiteLLM_OrganizationTable,
|
||
data: Union[GenerateKeyRequest, UpdateKeyRequest],
|
||
) -> None:
|
||
"""
|
||
Check if the organization key is allocating model specific limits. If so, raise an error if we're overallocating.
|
||
"""
|
||
# Get org limits from budget table if available
|
||
entity_rpm_limit = None
|
||
entity_tpm_limit = None
|
||
entity_model_rpm_limit_dict = {}
|
||
entity_model_tpm_limit_dict = {}
|
||
|
||
if org_table.litellm_budget_table is not None:
|
||
entity_rpm_limit = org_table.litellm_budget_table.rpm_limit
|
||
entity_tpm_limit = org_table.litellm_budget_table.tpm_limit
|
||
|
||
if org_table.metadata:
|
||
entity_model_rpm_limit_dict = org_table.metadata.get("model_rpm_limit", {})
|
||
entity_model_tpm_limit_dict = org_table.metadata.get("model_tpm_limit", {})
|
||
|
||
_check_key_model_specific_limits(
|
||
keys=keys,
|
||
data=data,
|
||
entity_rpm_limit=entity_rpm_limit,
|
||
entity_tpm_limit=entity_tpm_limit,
|
||
entity_model_rpm_limit_dict=entity_model_rpm_limit_dict,
|
||
entity_model_tpm_limit_dict=entity_model_tpm_limit_dict,
|
||
entity_type="organization",
|
||
)
|
||
|
||
|
||
def check_org_key_rpm_tpm_limits(
|
||
keys: List[LiteLLM_VerificationToken],
|
||
org_table: LiteLLM_OrganizationTable,
|
||
data: Union[GenerateKeyRequest, UpdateKeyRequest],
|
||
) -> None:
|
||
"""
|
||
Check if the organization key is allocating rpm/tpm limits. If so, raise an error if we're overallocating.
|
||
"""
|
||
# Get org limits from budget table if available
|
||
entity_rpm_limit = None
|
||
entity_tpm_limit = None
|
||
|
||
if org_table.litellm_budget_table is not None:
|
||
entity_rpm_limit = org_table.litellm_budget_table.rpm_limit
|
||
entity_tpm_limit = org_table.litellm_budget_table.tpm_limit
|
||
|
||
_check_key_rpm_tpm_limits(
|
||
keys=keys,
|
||
data=data,
|
||
entity_rpm_limit=entity_rpm_limit,
|
||
entity_tpm_limit=entity_tpm_limit,
|
||
entity_type="organization",
|
||
)
|
||
|
||
|
||
async def _check_org_key_limits(
|
||
org_table: LiteLLM_OrganizationTable,
|
||
data: Union[GenerateKeyRequest, UpdateKeyRequest],
|
||
prisma_client: PrismaClient,
|
||
) -> None:
|
||
"""
|
||
Check if the organization key is allocating guaranteed throughput limits. If so, raise an error if we're overallocating.
|
||
|
||
Only runs check if tpm_limit_type or rpm_limit_type is "guaranteed_throughput"
|
||
"""
|
||
|
||
rpm_limit_type = getattr(data, "rpm_limit_type", None) or (
|
||
data.metadata.get("rpm_limit_type", None) if data.metadata else None
|
||
)
|
||
tpm_limit_type = getattr(data, "tpm_limit_type", None) or (
|
||
data.metadata.get("tpm_limit_type", None) if data.metadata else None
|
||
)
|
||
|
||
if (
|
||
tpm_limit_type != "guaranteed_throughput"
|
||
and rpm_limit_type != "guaranteed_throughput"
|
||
):
|
||
return
|
||
# get all organization keys
|
||
# calculate allocated tpm/rpm limit
|
||
# check if specified tpm/rpm limit is greater than allocated tpm/rpm limit
|
||
keys = await prisma_client.db.litellm_verificationtoken.find_many(
|
||
where={"organization_id": org_table.organization_id},
|
||
)
|
||
# Exclude the key being updated to avoid double-counting its limits.
|
||
# key.token is the SHA-256 hash stored in DB; data.key is the raw key string.
|
||
if isinstance(data, UpdateKeyRequest):
|
||
hashed_key = hash_token(data.key)
|
||
keys = [key for key in keys if key.token != hashed_key]
|
||
check_org_key_model_specific_limits(
|
||
keys=keys,
|
||
org_table=org_table,
|
||
data=data,
|
||
)
|
||
check_org_key_rpm_tpm_limits(
|
||
keys=keys,
|
||
org_table=org_table,
|
||
data=data,
|
||
)
|
||
|
||
|
||
@router.post(
|
||
"/key/generate",
|
||
tags=["key management"],
|
||
dependencies=[Depends(user_api_key_auth)],
|
||
response_model=GenerateKeyResponse,
|
||
)
|
||
@management_endpoint_wrapper
|
||
async def generate_key_fn(
|
||
data: GenerateKeyRequest,
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
litellm_changed_by: Optional[str] = Header(
|
||
None,
|
||
description="The litellm-changed-by header enables tracking of actions performed by authorized users on behalf of other users, providing an audit trail for accountability",
|
||
),
|
||
):
|
||
"""
|
||
Generate an API key based on the provided data.
|
||
|
||
Docs: https://docs.litellm.ai/docs/proxy/virtual_keys
|
||
|
||
Parameters:
|
||
- duration: Optional[str] - Specify the length of time the token is valid for. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").
|
||
- key_alias: Optional[str] - User defined key alias
|
||
- key: Optional[str] - User defined key value. If not set, a 16-digit unique sk-key is created for you.
|
||
- team_id: Optional[str] - The team id of the key
|
||
- user_id: Optional[str] - The user id of the key
|
||
- agent_id: Optional[str] - The agent id associated with the key.
|
||
- organization_id: Optional[str] - The organization id of the key. If not set, and team_id is set, the organization id will be the same as the team id. If conflict, an error will be raised.
|
||
- project_id: Optional[str] - The project id of the key. When set, models and max_budget are validated against the project's limits.
|
||
- budget_id: Optional[str] - The budget id associated with the key. Created by calling `/budget/new`.
|
||
- models: Optional[list] - Model_name's a user is allowed to call. (if empty, key is allowed to call all models)
|
||
- aliases: Optional[dict] - Any alias mappings, on top of anything in the config.yaml model list. - https://docs.litellm.ai/docs/proxy/virtual_keys#managing-auth---upgradedowngrade-models
|
||
- config: Optional[dict] - any key-specific configs, overrides config in config.yaml
|
||
- spend: Optional[int] - Amount spent by key. Default is 0. Will be updated by proxy whenever key is used. https://docs.litellm.ai/docs/proxy/virtual_keys#managing-auth---tracking-spend
|
||
- send_invite_email: Optional[bool] - Whether to send an invite email to the user_id, with the generate key
|
||
- max_budget: Optional[float] - Specify max budget for a given key.
|
||
- budget_duration: Optional[str] - Budget is reset at the end of specified duration. If not set, budget is never reset. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").
|
||
- max_parallel_requests: Optional[int] - Rate limit a user based on the number of parallel requests. Raises 429 error, if user's parallel requests > x.
|
||
- metadata: Optional[dict] - Metadata for key, store information for key. Example metadata = {"team": "core-infra", "app": "app2", "email": "ishaan@berri.ai" }
|
||
- guardrails: Optional[List[str]] - List of active guardrails for the key
|
||
- policies: Optional[List[str]] - List of policy names to apply to the key. Policies define guardrails, conditions, and inheritance rules.
|
||
- disable_global_guardrails: Optional[bool] - Whether to disable global guardrails for the key.
|
||
- permissions: Optional[dict] - key-specific permissions. Currently just used for turning off pii masking (if connected). Example - {"pii": false}
|
||
- model_max_budget: Optional[Dict[str, BudgetConfig]] - Model-specific budgets {"gpt-4": {"budget_limit": 0.0005, "time_period": "30d"}}}. IF null or {} then no model specific budget.
|
||
- model_rpm_limit: Optional[dict] - key-specific model rpm limit. Example - {"text-davinci-002": 1000, "gpt-3.5-turbo": 1000}. IF null or {} then no model specific rpm limit.
|
||
- model_tpm_limit: Optional[dict] - key-specific model tpm limit. Example - {"text-davinci-002": 1000, "gpt-3.5-turbo": 1000}. IF null or {} then no model specific tpm limit.
|
||
- tpm_limit_type: Optional[str] - Type of tpm limit. Options: "best_effort_throughput" (no error if we're overallocating tpm), "guaranteed_throughput" (raise an error if we're overallocating tpm), "dynamic" (dynamically exceed limit when no 429 errors). Defaults to "best_effort_throughput".
|
||
- rpm_limit_type: Optional[str] - Type of rpm limit. Options: "best_effort_throughput" (no error if we're overallocating rpm), "guaranteed_throughput" (raise an error if we're overallocating rpm), "dynamic" (dynamically exceed limit when no 429 errors). Defaults to "best_effort_throughput".
|
||
- allowed_cache_controls: Optional[list] - List of allowed cache control values. Example - ["no-cache", "no-store"]. See all values - https://docs.litellm.ai/docs/proxy/caching#turn-on--off-caching-per-request
|
||
- blocked: Optional[bool] - Whether the key is blocked.
|
||
- rpm_limit: Optional[int] - Specify rpm limit for a given key (Requests per minute)
|
||
- tpm_limit: Optional[int] - Specify tpm limit for a given key (Tokens per minute)
|
||
- soft_budget: Optional[float] - Specify soft budget for a given key. Will trigger a slack alert when this soft budget is reached.
|
||
- tags: Optional[List[str]] - Tags for [tracking spend](https://litellm.vercel.app/docs/proxy/enterprise#tracking-spend-for-custom-tags) and/or doing [tag-based routing](https://litellm.vercel.app/docs/proxy/tag_routing).
|
||
- prompts: Optional[List[str]] - List of prompts that the key is allowed to use.
|
||
- enforced_params: Optional[List[str]] - List of enforced params for the key (Enterprise only). [Docs](https://docs.litellm.ai/docs/proxy/enterprise#enforce-required-params-for-llm-requests)
|
||
- prompts: Optional[List[str]] - List of prompts that the key is allowed to use.
|
||
- allowed_routes: Optional[list] - List of allowed routes for the key. Store the actual route or store a wildcard pattern for a set of routes. Example - ["/chat/completions", "/embeddings", "/keys/*"]
|
||
- allowed_passthrough_routes: Optional[list] - List of allowed pass through endpoints for the key. Store the actual endpoint or store a wildcard pattern for a set of endpoints. Example - ["/my-custom-endpoint"]. Use this instead of allowed_routes, if you just want to specify which pass through endpoints the key can access, without specifying the routes. If allowed_routes is specified, allowed_pass_through_endpoints is ignored.
|
||
- object_permission: Optional[LiteLLM_ObjectPermissionBase] - key-specific object permission. Example - {"vector_stores": ["vector_store_1", "vector_store_2"], "agents": ["agent_1", "agent_2"], "agent_access_groups": ["dev_group"]}. IF null or {} then no object permission.
|
||
- key_type: Optional[str] - Type of key that determines default allowed routes. Options: "llm_api" (can call LLM API routes), "management" (can call management routes), "read_only" (can only call info/read routes), "default" (uses default allowed routes). Defaults to "default".
|
||
- prompts: Optional[List[str]] - List of allowed prompts for the key. If specified, the key will only be able to use these specific prompts.
|
||
- auto_rotate: Optional[bool] - Whether this key should be automatically rotated (regenerated)
|
||
- rotation_interval: Optional[str] - How often to auto-rotate this key (e.g., '30s', '30m', '30h', '30d'). Required if auto_rotate=True.
|
||
- allowed_vector_store_indexes: Optional[List[dict]] - List of allowed vector store indexes for the key. Example - [{"index_name": "my-index", "index_permissions": ["write", "read"]}]. If specified, the key will only be able to use these specific vector store indexes. Create index, using `/v1/indexes` endpoint.
|
||
- router_settings: Optional[UpdateRouterConfig] - key-specific router settings. Example - {"model_group_retry_policy": {"max_retries": 5}}. IF null or {} then no router settings.
|
||
- access_group_ids: Optional[List[str]] - List of access group IDs to associate with the key. Access groups define which models a key can access. Example - ["access_group_1", "access_group_2"].
|
||
|
||
Examples:
|
||
|
||
1. Allow users to turn on/off pii masking
|
||
|
||
```bash
|
||
curl --location 'http://0.0.0.0:4000/key/generate' \
|
||
--header 'Authorization: Bearer sk-1234' \
|
||
--header 'Content-Type: application/json' \
|
||
--data '{
|
||
"permissions": {"allow_pii_controls": true}
|
||
}'
|
||
```
|
||
|
||
Returns:
|
||
- key: (str) The generated api key
|
||
- expires: (datetime) Datetime object for when key expires.
|
||
- user_id: (str) Unique user id - used for tracking spend across multiple keys for same user id.
|
||
"""
|
||
try:
|
||
from litellm.proxy._types import CommonProxyErrors
|
||
from litellm.proxy.proxy_server import (
|
||
prisma_client,
|
||
user_api_key_cache,
|
||
user_custom_key_generate,
|
||
)
|
||
|
||
if prisma_client is None:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
||
detail={"error": CommonProxyErrors.db_not_connected_error.value},
|
||
)
|
||
|
||
verbose_proxy_logger.debug("entered /key/generate")
|
||
|
||
# Validate budget values are not negative
|
||
if data.max_budget is not None and data.max_budget < 0:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": f"max_budget cannot be negative. Received: {data.max_budget}"
|
||
},
|
||
)
|
||
if data.soft_budget is not None and data.soft_budget < 0:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": f"soft_budget cannot be negative. Received: {data.soft_budget}"
|
||
},
|
||
)
|
||
|
||
if user_custom_key_generate is not None:
|
||
if inspect.iscoroutinefunction(user_custom_key_generate):
|
||
result = await user_custom_key_generate(data) # type: ignore
|
||
else:
|
||
raise ValueError("user_custom_key_generate must be a coroutine")
|
||
decision = result.get("decision", True)
|
||
message = result.get("message", "Authentication Failed - Custom Auth Rule")
|
||
if not decision:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_403_FORBIDDEN, detail=message
|
||
)
|
||
# For non-admin internal users: auto-assign caller's user_id if not provided
|
||
# This prevents creating unbound keys with no user association (LIT-1884)
|
||
_is_proxy_admin = (
|
||
user_api_key_dict.user_role is not None
|
||
and user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
|
||
)
|
||
if not _is_proxy_admin and data.user_id is None:
|
||
data.user_id = user_api_key_dict.user_id
|
||
verbose_proxy_logger.warning(
|
||
"key/generate: auto-assigning user_id=%s for non-admin caller",
|
||
user_api_key_dict.user_id,
|
||
)
|
||
|
||
team_table: Optional[LiteLLM_TeamTableCachedObj] = None
|
||
if data.team_id is not None:
|
||
try:
|
||
team_table = await get_team_object(
|
||
team_id=data.team_id,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
parent_otel_span=user_api_key_dict.parent_otel_span,
|
||
check_db_only=True,
|
||
)
|
||
except Exception as e:
|
||
verbose_proxy_logger.debug(
|
||
f"Error getting team object in `/key/generate`: {e}"
|
||
)
|
||
# For non-admin callers, team must exist (LIT-1884)
|
||
if not _is_proxy_admin:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Team not found for team_id={data.team_id}. Non-admin users cannot create keys for non-existent teams.",
|
||
)
|
||
|
||
key_generation_check(
|
||
team_table=team_table,
|
||
user_api_key_dict=user_api_key_dict,
|
||
data=data,
|
||
route=KeyManagementRoutes.KEY_GENERATE,
|
||
)
|
||
|
||
if team_table is not None:
|
||
await _check_team_key_limits(
|
||
team_table=team_table,
|
||
data=data,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
# Validate key against project limits if project_id is set
|
||
if data.project_id is not None:
|
||
await _check_project_key_limits(
|
||
project_id=data.project_id,
|
||
data=data,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
)
|
||
|
||
return await _common_key_generation_helper(
|
||
data=data,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
team_table=team_table,
|
||
)
|
||
|
||
except Exception as e:
|
||
verbose_proxy_logger.exception(
|
||
"litellm.proxy.proxy_server.generate_key_fn(): Exception occured - {}".format(
|
||
str(e)
|
||
)
|
||
)
|
||
raise handle_exception_on_proxy(e)
|
||
|
||
|
||
@router.post(
|
||
"/key/service-account/generate",
|
||
tags=["key management"],
|
||
dependencies=[Depends(user_api_key_auth)],
|
||
)
|
||
@management_endpoint_wrapper
|
||
async def generate_service_account_key_fn(
|
||
data: GenerateKeyRequest,
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
litellm_changed_by: Optional[str] = Header(
|
||
None,
|
||
description="The litellm-changed-by header enables tracking of actions performed by authorized users on behalf of other users, providing an audit trail for accountability",
|
||
),
|
||
):
|
||
"""
|
||
Generate a Service Account API key based on the provided data. This key does not belong to any user. It belongs to the team.
|
||
|
||
Why use a service account key?
|
||
- Prevent key from being deleted when user is deleted.
|
||
- Apply team limits, not team member limits to key.
|
||
|
||
Docs: https://docs.litellm.ai/docs/proxy/virtual_keys
|
||
|
||
Parameters:
|
||
- duration: Optional[str] - Specify the length of time the token is valid for. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").
|
||
- key_alias: Optional[str] - User defined key alias
|
||
- key: Optional[str] - User defined key value. If not set, a 16-digit unique sk-key is created for you.
|
||
- team_id: Optional[str] - The team id of the key
|
||
- user_id: Optional[str] - [NON-FUNCTIONAL] THIS WILL BE IGNORED. The user id of the key
|
||
- budget_id: Optional[str] - The budget id associated with the key. Created by calling `/budget/new`.
|
||
- models: Optional[list] - Model_name's a user is allowed to call. (if empty, key is allowed to call all models)
|
||
- aliases: Optional[dict] - Any alias mappings, on top of anything in the config.yaml model list. - https://docs.litellm.ai/docs/proxy/virtual_keys#managing-auth---upgradedowngrade-models
|
||
- config: Optional[dict] - any key-specific configs, overrides config in config.yaml
|
||
- spend: Optional[int] - Amount spent by key. Default is 0. Will be updated by proxy whenever key is used. https://docs.litellm.ai/docs/proxy/virtual_keys#managing-auth---tracking-spend
|
||
- send_invite_email: Optional[bool] - Whether to send an invite email to the user_id, with the generate key
|
||
- max_budget: Optional[float] - Specify max budget for a given key.
|
||
- budget_duration: Optional[str] - Budget is reset at the end of specified duration. If not set, budget is never reset. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").
|
||
- max_parallel_requests: Optional[int] - Rate limit a user based on the number of parallel requests. Raises 429 error, if user's parallel requests > x.
|
||
- metadata: Optional[dict] - Metadata for key, store information for key. Example metadata = {"team": "core-infra", "app": "app2", "email": "ishaan@berri.ai" }
|
||
- guardrails: Optional[List[str]] - List of active guardrails for the key
|
||
- permissions: Optional[dict] - key-specific permissions. Currently just used for turning off pii masking (if connected). Example - {"pii": false}
|
||
- model_max_budget: Optional[Dict[str, BudgetConfig]] - Model-specific budgets {"gpt-4": {"budget_limit": 0.0005, "time_period": "30d"}}}. IF null or {} then no model specific budget.
|
||
- model_rpm_limit: Optional[dict] - key-specific model rpm limit. Example - {"text-davinci-002": 1000, "gpt-3.5-turbo": 1000}. IF null or {} then no model specific rpm limit.
|
||
- model_tpm_limit: Optional[dict] - key-specific model tpm limit. Example - {"text-davinci-002": 1000, "gpt-3.5-turbo": 1000}. IF null or {} then no model specific tpm limit.
|
||
- tpm_limit_type: Optional[str] - TPM rate limit type - "best_effort_throughput", "guaranteed_throughput", or "dynamic"
|
||
- rpm_limit_type: Optional[str] - RPM rate limit type - "best_effort_throughput", "guaranteed_throughput", or "dynamic"
|
||
- allowed_cache_controls: Optional[list] - List of allowed cache control values. Example - ["no-cache", "no-store"]. See all values - https://docs.litellm.ai/docs/proxy/caching#turn-on--off-caching-per-request
|
||
- blocked: Optional[bool] - Whether the key is blocked.
|
||
- rpm_limit: Optional[int] - Specify rpm limit for a given key (Requests per minute)
|
||
- tpm_limit: Optional[int] - Specify tpm limit for a given key (Tokens per minute)
|
||
- soft_budget: Optional[float] - Specify soft budget for a given key. Will trigger a slack alert when this soft budget is reached.
|
||
- tags: Optional[List[str]] - Tags for [tracking spend](https://litellm.vercel.app/docs/proxy/enterprise#tracking-spend-for-custom-tags) and/or doing [tag-based routing](https://litellm.vercel.app/docs/proxy/tag_routing).
|
||
- enforced_params: Optional[List[str]] - List of enforced params for the key (Enterprise only). [Docs](https://docs.litellm.ai/docs/proxy/enterprise#enforce-required-params-for-llm-requests)
|
||
- allowed_routes: Optional[list] - List of allowed routes for the key. Store the actual route or store a wildcard pattern for a set of routes. Example - ["/chat/completions", "/embeddings", "/keys/*"]
|
||
- object_permission: Optional[LiteLLM_ObjectPermissionBase] - key-specific object permission. Example - {"vector_stores": ["vector_store_1", "vector_store_2"], "agents": ["agent_1", "agent_2"], "agent_access_groups": ["dev_group"]}. IF null or {} then no object permission.
|
||
Examples:
|
||
- allowed_vector_store_indexes: Optional[List[dict]] - List of allowed vector store indexes for the key. Example - [{"index_name": "my-index", "index_permissions": ["write", "read"]}]. If specified, the key will only be able to use these specific vector store indexes. Create index, using `/v1/indexes` endpoint.
|
||
|
||
|
||
1. Allow users to turn on/off pii masking
|
||
|
||
```bash
|
||
curl --location 'http://0.0.0.0:4000/key/generate' \
|
||
--header 'Authorization: Bearer sk-1234' \
|
||
--header 'Content-Type: application/json' \
|
||
--data '{
|
||
"permissions": {"allow_pii_controls": true}
|
||
}'
|
||
```
|
||
|
||
Returns:
|
||
- key: (str) The generated api key
|
||
- expires: (datetime) Datetime object for when key expires.
|
||
- user_id: (str) Unique user id - used for tracking spend across multiple keys for same user id.
|
||
|
||
"""
|
||
from litellm.proxy._types import CommonProxyErrors
|
||
from litellm.proxy.proxy_server import (
|
||
prisma_client,
|
||
user_api_key_cache,
|
||
user_custom_key_generate,
|
||
)
|
||
|
||
if prisma_client is None:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
||
detail={"error": CommonProxyErrors.db_not_connected_error.value},
|
||
)
|
||
|
||
await validate_team_id_used_in_service_account_request(
|
||
team_id=data.team_id,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
verbose_proxy_logger.debug("entered /key/generate")
|
||
|
||
if user_custom_key_generate is not None:
|
||
if inspect.iscoroutinefunction(user_custom_key_generate):
|
||
result = await user_custom_key_generate(data) # type: ignore
|
||
else:
|
||
raise ValueError("user_custom_key_generate must be a coroutine")
|
||
decision = result.get("decision", True)
|
||
message = result.get("message", "Authentication Failed - Custom Auth Rule")
|
||
if not decision:
|
||
raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail=message)
|
||
team_table: Optional[LiteLLM_TeamTableCachedObj] = None
|
||
if data.team_id is not None:
|
||
try:
|
||
team_table = await get_team_object(
|
||
team_id=data.team_id,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
parent_otel_span=user_api_key_dict.parent_otel_span,
|
||
check_db_only=True,
|
||
)
|
||
except Exception as e:
|
||
verbose_proxy_logger.debug(
|
||
f"Error getting team object in `/key/generate`: {e}"
|
||
)
|
||
team_table = None
|
||
|
||
if team_table is not None:
|
||
await _check_team_key_limits(
|
||
team_table=team_table,
|
||
data=data,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
key_generation_check(
|
||
team_table=team_table,
|
||
user_api_key_dict=user_api_key_dict,
|
||
data=data,
|
||
route=KeyManagementRoutes.KEY_GENERATE_SERVICE_ACCOUNT,
|
||
)
|
||
|
||
data.user_id = None # do not allow user_id to be set for service account keys
|
||
|
||
return await _common_key_generation_helper(
|
||
data=data,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
team_table=team_table,
|
||
)
|
||
|
||
|
||
def prepare_metadata_fields(
|
||
data: BaseModel, non_default_values: dict, existing_metadata: dict
|
||
) -> dict:
|
||
"""
|
||
Check LiteLLM_ManagementEndpoint_MetadataFields (proxy/_types.py) for fields that are allowed to be updated
|
||
"""
|
||
if "metadata" not in non_default_values: # allow user to set metadata to none
|
||
non_default_values["metadata"] = existing_metadata.copy()
|
||
|
||
casted_metadata = cast(dict, non_default_values["metadata"])
|
||
|
||
data_json = data.model_dump(exclude_unset=True, exclude_none=True)
|
||
|
||
try:
|
||
for k, v in data_json.items():
|
||
if k in LiteLLM_ManagementEndpoint_MetadataFields:
|
||
if isinstance(v, datetime):
|
||
casted_metadata[k] = v.isoformat()
|
||
else:
|
||
casted_metadata[k] = v
|
||
if k in LiteLLM_ManagementEndpoint_MetadataFields_Premium:
|
||
from litellm.proxy.utils import _premium_user_check
|
||
|
||
_premium_user_check(k)
|
||
casted_metadata[k] = v
|
||
|
||
except Exception as e:
|
||
verbose_proxy_logger.exception(
|
||
"litellm.proxy.proxy_server.prepare_metadata_fields(): Exception occured - {}".format(
|
||
str(e)
|
||
)
|
||
)
|
||
|
||
non_default_values["metadata"] = casted_metadata
|
||
return non_default_values
|
||
|
||
|
||
async def prepare_key_update_data(
|
||
data: Union[UpdateKeyRequest, RegenerateKeyRequest],
|
||
existing_key_row: LiteLLM_VerificationToken,
|
||
):
|
||
data_json: dict = data.model_dump(exclude_unset=True)
|
||
data_json.pop("key", None)
|
||
data_json.pop("new_key", None)
|
||
data_json.pop("grace_period", None) # Request-only param, not a DB column
|
||
if (
|
||
data.metadata is not None
|
||
and data.metadata.get("service_account_id") is not None
|
||
and (data.team_id or existing_key_row.team_id) is None
|
||
):
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail="team_id is required for service account keys. Please specify `team_id` in the request body.",
|
||
)
|
||
non_default_values = {}
|
||
# ADD METADATA FIELDS
|
||
# Set Management Endpoint Metadata Fields
|
||
for field in LiteLLM_ManagementEndpoint_MetadataFields_Premium:
|
||
if getattr(data, field, None) is not None:
|
||
_set_object_metadata_field(
|
||
object_data=data,
|
||
field_name=field,
|
||
value=getattr(data, field),
|
||
)
|
||
for k, v in data_json.items():
|
||
if (
|
||
k in LiteLLM_ManagementEndpoint_MetadataFields
|
||
or k in LiteLLM_ManagementEndpoint_MetadataFields_Premium
|
||
):
|
||
continue
|
||
non_default_values[k] = v
|
||
|
||
if "duration" in non_default_values:
|
||
duration = non_default_values.pop("duration")
|
||
if duration is None or duration == "-1":
|
||
# Set expires to None to indicate the key never expires
|
||
non_default_values["expires"] = None
|
||
elif duration and (isinstance(duration, str)) and len(duration) > 0:
|
||
duration_s = duration_in_seconds(duration=duration)
|
||
expires = datetime.now(timezone.utc) + timedelta(seconds=duration_s)
|
||
non_default_values["expires"] = expires
|
||
|
||
if "budget_duration" in non_default_values:
|
||
budget_duration = non_default_values.pop("budget_duration")
|
||
if (
|
||
budget_duration
|
||
and (isinstance(budget_duration, str))
|
||
and len(budget_duration) > 0
|
||
):
|
||
from litellm.proxy.common_utils.timezone_utils import get_budget_reset_time
|
||
|
||
key_reset_at = get_budget_reset_time(budget_duration=budget_duration)
|
||
non_default_values["budget_reset_at"] = key_reset_at
|
||
non_default_values["budget_duration"] = budget_duration
|
||
|
||
if "object_permission" in non_default_values:
|
||
non_default_values = await _handle_update_object_permission(
|
||
data_json=non_default_values,
|
||
existing_key_row=existing_key_row,
|
||
)
|
||
|
||
_metadata = existing_key_row.metadata or {}
|
||
|
||
# validate model_max_budget
|
||
if "model_max_budget" in non_default_values:
|
||
validate_model_max_budget(non_default_values["model_max_budget"])
|
||
|
||
# Serialize router_settings to JSON if present
|
||
if (
|
||
"router_settings" in non_default_values
|
||
and non_default_values["router_settings"] is not None
|
||
):
|
||
non_default_values["router_settings"] = safe_dumps(
|
||
non_default_values["router_settings"]
|
||
)
|
||
|
||
non_default_values = prepare_metadata_fields(
|
||
data=data, non_default_values=non_default_values, existing_metadata=_metadata
|
||
)
|
||
|
||
return non_default_values
|
||
|
||
|
||
async def _handle_update_object_permission(
|
||
data_json: dict,
|
||
existing_key_row: LiteLLM_VerificationToken,
|
||
) -> dict:
|
||
"""
|
||
Handle the update of object permission.
|
||
"""
|
||
from litellm.proxy.proxy_server import prisma_client
|
||
|
||
# Use the common helper to handle the object permission update
|
||
object_permission_id = await handle_update_object_permission_common(
|
||
data_json=data_json,
|
||
existing_object_permission_id=existing_key_row.object_permission_id,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
# Add the object_permission_id to data_json if one was created/updated
|
||
if object_permission_id is not None:
|
||
data_json["object_permission_id"] = object_permission_id
|
||
verbose_proxy_logger.debug(
|
||
f"updated object_permission_id: {object_permission_id}"
|
||
)
|
||
|
||
return data_json
|
||
|
||
|
||
def is_different_team(
|
||
data: UpdateKeyRequest, existing_key_row: LiteLLM_VerificationToken
|
||
) -> bool:
|
||
if data.team_id is None:
|
||
return False
|
||
if existing_key_row.team_id is None:
|
||
return True
|
||
return data.team_id != existing_key_row.team_id
|
||
|
||
|
||
def _validate_max_budget(max_budget: Optional[float]) -> None:
|
||
"""
|
||
Validate that max_budget is not negative.
|
||
|
||
Args:
|
||
max_budget: The max_budget value to validate
|
||
|
||
Raises:
|
||
HTTPException: If max_budget is negative
|
||
"""
|
||
if max_budget is not None and max_budget < 0:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={"error": f"max_budget cannot be negative. Received: {max_budget}"},
|
||
)
|
||
|
||
|
||
async def _get_and_validate_existing_key(
|
||
token: str, prisma_client: Optional[PrismaClient]
|
||
) -> LiteLLM_VerificationToken:
|
||
"""
|
||
Get existing key from database and validate it exists.
|
||
|
||
Args:
|
||
token: The key token to look up
|
||
prisma_client: Prisma client instance
|
||
|
||
Returns:
|
||
LiteLLM_VerificationToken: The existing key row
|
||
|
||
Raises:
|
||
HTTPException: If key is not found
|
||
"""
|
||
if prisma_client is None:
|
||
raise HTTPException(
|
||
status_code=500,
|
||
detail={"error": "Database not connected"},
|
||
)
|
||
|
||
existing_key_row = await prisma_client.get_data(
|
||
token=token,
|
||
table_name="key",
|
||
query_type="find_unique",
|
||
)
|
||
|
||
if existing_key_row is None:
|
||
raise HTTPException(
|
||
status_code=404,
|
||
detail={"error": f"Key not found: {token}"},
|
||
)
|
||
|
||
return existing_key_row
|
||
|
||
|
||
async def _process_single_key_update(
|
||
key_update_item: BulkUpdateKeyRequestItem,
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
litellm_changed_by: Optional[str],
|
||
prisma_client: Optional[PrismaClient],
|
||
user_api_key_cache: DualCache,
|
||
proxy_logging_obj: Any,
|
||
llm_router: Optional[Router],
|
||
) -> Dict[str, Any]:
|
||
"""
|
||
Process a single key update with all validations and checks.
|
||
|
||
This function encapsulates all the logic for updating a single key,
|
||
including validation, permission checks, team checks, and database updates.
|
||
|
||
Args:
|
||
key_update_item: The key update request item
|
||
user_api_key_dict: The authenticated user's API key info
|
||
litellm_changed_by: Optional header for tracking who made the change
|
||
prisma_client: Prisma client instance
|
||
user_api_key_cache: User API key cache
|
||
proxy_logging_obj: Proxy logging object
|
||
llm_router: LLM router instance
|
||
|
||
Returns:
|
||
Dict containing the updated key information
|
||
|
||
Raises:
|
||
HTTPException: For various validation and permission errors
|
||
"""
|
||
# Validate max_budget
|
||
_validate_max_budget(key_update_item.max_budget)
|
||
|
||
# Get and validate existing key
|
||
existing_key_row = await _get_and_validate_existing_key(
|
||
token=key_update_item.key,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
# Check team member permissions
|
||
if prisma_client is not None:
|
||
await TeamMemberPermissionChecks.can_team_member_execute_key_management_endpoint(
|
||
user_api_key_dict=user_api_key_dict,
|
||
route=KeyManagementRoutes.KEY_UPDATE,
|
||
prisma_client=prisma_client,
|
||
existing_key_row=existing_key_row,
|
||
user_api_key_cache=user_api_key_cache,
|
||
)
|
||
|
||
# Create UpdateKeyRequest from BulkUpdateKeyRequestItem
|
||
update_key_request = UpdateKeyRequest(
|
||
key=key_update_item.key,
|
||
budget_id=key_update_item.budget_id,
|
||
max_budget=key_update_item.max_budget,
|
||
team_id=key_update_item.team_id,
|
||
tags=key_update_item.tags,
|
||
)
|
||
|
||
# Get team object and check team limits if team_id is provided
|
||
team_obj: Optional[LiteLLM_TeamTableCachedObj] = None
|
||
if update_key_request.team_id is not None:
|
||
team_obj = await get_team_object(
|
||
team_id=update_key_request.team_id,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
check_db_only=True,
|
||
)
|
||
|
||
if team_obj is not None and prisma_client is not None:
|
||
await _check_team_key_limits(
|
||
team_table=team_obj,
|
||
data=update_key_request,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
# Validate team change if team is being changed
|
||
if is_different_team(data=update_key_request, existing_key_row=existing_key_row):
|
||
if llm_router is None:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": "LLM router not found. Please set it up by passing in a valid config.yaml or adding models via the UI."
|
||
},
|
||
)
|
||
if team_obj is None:
|
||
raise HTTPException(
|
||
status_code=500,
|
||
detail={"error": "Team object not found for team change validation"},
|
||
)
|
||
await validate_key_team_change(
|
||
key=existing_key_row,
|
||
team=team_obj,
|
||
change_initiated_by=user_api_key_dict,
|
||
llm_router=llm_router,
|
||
)
|
||
|
||
# Prepare update data
|
||
non_default_values = await prepare_key_update_data(
|
||
data=update_key_request, existing_key_row=existing_key_row
|
||
)
|
||
|
||
# Update key in database
|
||
if prisma_client is None:
|
||
raise HTTPException(
|
||
status_code=500,
|
||
detail={"error": "Database not connected"},
|
||
)
|
||
|
||
_data = {**non_default_values, "token": key_update_item.key}
|
||
response = await prisma_client.update_data(token=key_update_item.key, data=_data)
|
||
|
||
# Delete cache
|
||
await _delete_cache_key_object(
|
||
hashed_token=hash_token(key_update_item.key),
|
||
user_api_key_cache=user_api_key_cache,
|
||
proxy_logging_obj=proxy_logging_obj,
|
||
)
|
||
|
||
# Trigger async hook
|
||
asyncio.create_task(
|
||
KeyManagementEventHooks.async_key_updated_hook(
|
||
data=update_key_request,
|
||
existing_key_row=existing_key_row,
|
||
response=response,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
)
|
||
)
|
||
|
||
if response is None:
|
||
raise ValueError("Failed to update key got response = None")
|
||
|
||
# Extract and format updated key info
|
||
updated_key_info = response.get("data", {})
|
||
if hasattr(updated_key_info, "model_dump"):
|
||
updated_key_info = updated_key_info.model_dump()
|
||
elif hasattr(updated_key_info, "dict"):
|
||
updated_key_info = updated_key_info.dict()
|
||
|
||
updated_key_info.pop("token", None)
|
||
|
||
return updated_key_info
|
||
|
||
|
||
async def _validate_mcp_servers_for_key_update(
|
||
data: "UpdateKeyRequest",
|
||
team_obj: Optional["LiteLLM_TeamTableCachedObj"],
|
||
existing_key_row: Any,
|
||
prisma_client: Any,
|
||
user_api_key_cache: Any,
|
||
) -> None:
|
||
"""Validate MCP servers in object_permission against the effective team."""
|
||
effective_team_obj = team_obj
|
||
# If team_id isn't being changed, resolve the existing key's team
|
||
if effective_team_obj is None and existing_key_row.team_id:
|
||
effective_team_obj = await get_team_object(
|
||
team_id=existing_key_row.team_id,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
check_db_only=True,
|
||
)
|
||
object_permission_dict: Optional[dict] = None
|
||
if data.object_permission is not None:
|
||
object_permission_dict = (
|
||
data.object_permission.model_dump()
|
||
if hasattr(data.object_permission, "model_dump")
|
||
else dict(data.object_permission) # type: ignore[arg-type]
|
||
)
|
||
await validate_key_mcp_servers_against_team(
|
||
object_permission=object_permission_dict,
|
||
team_obj=effective_team_obj,
|
||
)
|
||
|
||
|
||
async def _validate_update_key_data(
|
||
data: UpdateKeyRequest,
|
||
existing_key_row: Any,
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
llm_router: Any,
|
||
premium_user: bool,
|
||
prisma_client: Any,
|
||
user_api_key_cache: Any,
|
||
) -> None:
|
||
"""Validate permissions and constraints for key update."""
|
||
_is_proxy_admin = user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
|
||
|
||
# Prevent non-admin from removing user_id (setting to empty string) (LIT-1884)
|
||
if data.user_id is not None and data.user_id == "" and not _is_proxy_admin:
|
||
raise HTTPException(
|
||
status_code=403,
|
||
detail="Non-admin users cannot remove the user_id from a key.",
|
||
)
|
||
|
||
# sanity check - prevent non-proxy admin user from updating key to belong to a different user
|
||
if (
|
||
data.user_id is not None
|
||
and data.user_id != existing_key_row.user_id
|
||
and not _is_proxy_admin
|
||
):
|
||
raise HTTPException(
|
||
status_code=403,
|
||
detail=f"User={data.user_id} is not allowed to update key={data.key} to belong to user={existing_key_row.user_id}",
|
||
)
|
||
|
||
common_key_access_checks(
|
||
user_api_key_dict=user_api_key_dict,
|
||
data=data,
|
||
user_id=existing_key_row.user_id,
|
||
llm_router=llm_router,
|
||
premium_user=premium_user,
|
||
)
|
||
|
||
await TeamMemberPermissionChecks.can_team_member_execute_key_management_endpoint(
|
||
user_api_key_dict=user_api_key_dict,
|
||
route=KeyManagementRoutes.KEY_UPDATE,
|
||
prisma_client=prisma_client,
|
||
existing_key_row=existing_key_row,
|
||
user_api_key_cache=user_api_key_cache,
|
||
)
|
||
|
||
# Admin-only: only proxy admins, team admins, or org admins can modify max_budget
|
||
if data.max_budget is not None and data.max_budget != existing_key_row.max_budget:
|
||
if prisma_client is not None:
|
||
hashed_key = existing_key_row.token
|
||
await _check_key_admin_access(
|
||
user_api_key_dict=user_api_key_dict,
|
||
hashed_token=hashed_key,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
route="/key/update (max_budget)",
|
||
)
|
||
|
||
# Check team limits if key has a team_id (from request or existing key)
|
||
team_obj: Optional[LiteLLM_TeamTableCachedObj] = None
|
||
_team_id_to_check = data.team_id or getattr(existing_key_row, "team_id", None)
|
||
if _team_id_to_check is not None:
|
||
team_obj = await get_team_object(
|
||
team_id=_team_id_to_check,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
check_db_only=True,
|
||
)
|
||
|
||
# Validate team exists when non-admin sets a new team_id (LIT-1884)
|
||
if team_obj is None and data.team_id is not None and not _is_proxy_admin:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Team not found for team_id={data.team_id}. Non-admin users cannot set keys to non-existent teams.",
|
||
)
|
||
|
||
if team_obj is not None:
|
||
await _check_team_key_limits(
|
||
team_table=team_obj,
|
||
data=data,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
# Validate key against project limits if project_id is being set
|
||
_project_id_to_check = getattr(data, "project_id", None) or getattr(
|
||
existing_key_row, "project_id", None
|
||
)
|
||
if _project_id_to_check is not None and (
|
||
data.models is not None or data.max_budget is not None
|
||
):
|
||
await _check_project_key_limits(
|
||
project_id=_project_id_to_check,
|
||
data=data,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
)
|
||
|
||
# Check org key limits only when throughput-related fields or organization_id change
|
||
_org_id_to_check = data.organization_id or getattr(
|
||
existing_key_row, "organization_id", None
|
||
)
|
||
_throughput_fields_changed = (
|
||
data.organization_id is not None
|
||
or data.tpm_limit is not None
|
||
or data.rpm_limit is not None
|
||
or data.tpm_limit_type is not None
|
||
or data.rpm_limit_type is not None
|
||
)
|
||
if _org_id_to_check is not None and _throughput_fields_changed:
|
||
org_table = await get_org_object(
|
||
org_id=_org_id_to_check,
|
||
user_api_key_cache=user_api_key_cache,
|
||
prisma_client=prisma_client,
|
||
)
|
||
if org_table is None:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail=f"Organization not found for organization_id={_org_id_to_check}",
|
||
)
|
||
await _check_org_key_limits(
|
||
org_table=org_table,
|
||
data=data,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
# if team change - check if this is possible
|
||
if is_different_team(data=data, existing_key_row=existing_key_row):
|
||
if llm_router is None:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": "LLM router not found. Please set it up by passing in a valid config.yaml or adding models via the UI."
|
||
},
|
||
)
|
||
if team_obj is None:
|
||
raise HTTPException(
|
||
status_code=500,
|
||
detail={"error": "Team object not found for team change validation"},
|
||
)
|
||
await validate_key_team_change(
|
||
key=existing_key_row,
|
||
team=team_obj,
|
||
change_initiated_by=user_api_key_dict,
|
||
llm_router=llm_router,
|
||
)
|
||
|
||
# Validate MCP servers in object_permission against the effective team
|
||
if data.object_permission is not None:
|
||
await _validate_mcp_servers_for_key_update(
|
||
data=data,
|
||
team_obj=team_obj,
|
||
existing_key_row=existing_key_row,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
)
|
||
|
||
|
||
@router.post(
|
||
"/key/update", tags=["key management"], dependencies=[Depends(user_api_key_auth)]
|
||
)
|
||
@management_endpoint_wrapper
|
||
async def update_key_fn(
|
||
request: Request,
|
||
data: UpdateKeyRequest,
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
litellm_changed_by: Optional[str] = Header(
|
||
None,
|
||
description="The litellm-changed-by header enables tracking of actions performed by authorized users on behalf of other users, providing an audit trail for accountability",
|
||
),
|
||
):
|
||
"""
|
||
Update an existing API key's parameters.
|
||
|
||
Parameters:
|
||
- key: str - The key to update
|
||
- key_alias: Optional[str] - User-friendly key alias
|
||
- user_id: Optional[str] - User ID associated with key
|
||
- team_id: Optional[str] - Team ID associated with key
|
||
- agent_id: Optional[str] - The agent id associated with the key.
|
||
- organization_id: Optional[str] - The organization id of the key.
|
||
- budget_id: Optional[str] - The budget id associated with the key. Created by calling `/budget/new`.
|
||
- models: Optional[list] - Model_name's a user is allowed to call
|
||
- tags: Optional[List[str]] - Tags for organizing keys (Enterprise only)
|
||
- prompts: Optional[List[str]] - List of prompts that the key is allowed to use.
|
||
- enforced_params: Optional[List[str]] - List of enforced params for the key (Enterprise only). [Docs](https://docs.litellm.ai/docs/proxy/enterprise#enforce-required-params-for-llm-requests)
|
||
- spend: Optional[float] - Amount spent by key
|
||
- max_budget: Optional[float] - Max budget for key
|
||
- model_max_budget: Optional[Dict[str, BudgetConfig]] - Model-specific budgets {"gpt-4": {"budget_limit": 0.0005, "time_period": "30d"}}
|
||
- budget_duration: Optional[str] - Budget reset period ("30d", "1h", etc.)
|
||
- soft_budget: Optional[float] - [TODO] Soft budget limit (warning vs. hard stop). Will trigger a slack alert when this soft budget is reached.
|
||
- max_parallel_requests: Optional[int] - Rate limit for parallel requests
|
||
- metadata: Optional[dict] - Metadata for key. Example {"team": "core-infra", "app": "app2"}
|
||
- tpm_limit: Optional[int] - Tokens per minute limit
|
||
- rpm_limit: Optional[int] - Requests per minute limit
|
||
- model_rpm_limit: Optional[dict] - Model-specific RPM limits {"gpt-4": 100, "claude-v1": 200}
|
||
- model_tpm_limit: Optional[dict] - Model-specific TPM limits {"gpt-4": 100000, "claude-v1": 200000}
|
||
- tpm_limit_type: Optional[str] - TPM rate limit type - "best_effort_throughput", "guaranteed_throughput", or "dynamic"
|
||
- rpm_limit_type: Optional[str] - RPM rate limit type - "best_effort_throughput", "guaranteed_throughput", or "dynamic"
|
||
- allowed_cache_controls: Optional[list] - List of allowed cache control values
|
||
- duration: Optional[str] - Key validity duration ("30d", "1h", etc.), null to never expire, or "-1" to never expire (deprecated, use null)
|
||
- permissions: Optional[dict] - Key-specific permissions
|
||
- send_invite_email: Optional[bool] - Send invite email to user_id
|
||
- guardrails: Optional[List[str]] - List of active guardrails for the key
|
||
- policies: Optional[List[str]] - List of policy names to apply to the key. Policies define guardrails, conditions, and inheritance rules.
|
||
- disable_global_guardrails: Optional[bool] - Whether to disable global guardrails for the key.
|
||
- prompts: Optional[List[str]] - List of prompts that the key is allowed to use.
|
||
- blocked: Optional[bool] - Whether the key is blocked
|
||
- aliases: Optional[dict] - Model aliases for the key - [Docs](https://litellm.vercel.app/docs/proxy/virtual_keys#model-aliases)
|
||
- config: Optional[dict] - [DEPRECATED PARAM] Key-specific config.
|
||
- temp_budget_increase: Optional[float] - Temporary budget increase for the key (Enterprise only).
|
||
- temp_budget_expiry: Optional[str] - Expiry time for the temporary budget increase (Enterprise only).
|
||
- allowed_routes: Optional[list] - List of allowed routes for the key. Store the actual route or store a wildcard pattern for a set of routes. Example - ["/chat/completions", "/embeddings", "/keys/*"]
|
||
- allowed_passthrough_routes: Optional[list] - List of allowed pass through routes for the key. Store the actual route or store a wildcard pattern for a set of routes. Example - ["/my-custom-endpoint"]. Use this instead of allowed_routes, if you just want to specify which pass through routes the key can access, without specifying the routes. If allowed_routes is specified, allowed_passthrough_routes is ignored.
|
||
- prompts: Optional[List[str]] - List of allowed prompts for the key. If specified, the key will only be able to use these specific prompts.
|
||
- object_permission: Optional[LiteLLM_ObjectPermissionBase] - key-specific object permission. Example - {"vector_stores": ["vector_store_1", "vector_store_2"], "agents": ["agent_1", "agent_2"], "agent_access_groups": ["dev_group"]}. IF null or {} then no object permission.
|
||
- auto_rotate: Optional[bool] - Whether this key should be automatically rotated
|
||
- rotation_interval: Optional[str] - How often to rotate this key (e.g., '30d', '90d'). Required if auto_rotate=True
|
||
- allowed_vector_store_indexes: Optional[List[dict]] - List of allowed vector store indexes for the key. Example - [{"index_name": "my-index", "index_permissions": ["write", "read"]}]. If specified, the key will only be able to use these specific vector store indexes. Create index, using `/v1/indexes` endpoint.
|
||
- router_settings: Optional[UpdateRouterConfig] - key-specific router settings. Example - {"model_group_retry_policy": {"max_retries": 5}}. IF null or {} then no router settings.
|
||
- access_group_ids: Optional[List[str]] - List of access group IDs to associate with the key. Access groups define which models a key can access. Example - ["access_group_1", "access_group_2"].
|
||
|
||
Example:
|
||
```bash
|
||
curl --location 'http://0.0.0.0:4000/key/update' \
|
||
--header 'Authorization: Bearer sk-1234' \
|
||
--header 'Content-Type: application/json' \
|
||
--data '{
|
||
"key": "sk-1234",
|
||
"key_alias": "my-key",
|
||
"user_id": "user-1234",
|
||
"team_id": "team-1234",
|
||
"max_budget": 100,
|
||
"metadata": {"any_key": "any-val"},
|
||
}'
|
||
```
|
||
"""
|
||
from litellm.proxy.proxy_server import (
|
||
llm_router,
|
||
premium_user,
|
||
prisma_client,
|
||
proxy_logging_obj,
|
||
user_api_key_cache,
|
||
)
|
||
|
||
try:
|
||
# Validate budget values are not negative
|
||
if data.max_budget is not None and data.max_budget < 0:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": f"max_budget cannot be negative. Received: {data.max_budget}"
|
||
},
|
||
)
|
||
|
||
data_json: dict = data.model_dump(exclude_unset=True, exclude_none=True)
|
||
key = data_json.pop("key")
|
||
|
||
# get the row from db
|
||
if prisma_client is None:
|
||
raise Exception("Not connected to DB!")
|
||
|
||
existing_key_row = await prisma_client.get_data(
|
||
token=data.key, table_name="key", query_type="find_unique"
|
||
)
|
||
|
||
if existing_key_row is None:
|
||
raise HTTPException(
|
||
status_code=404,
|
||
detail={"error": f"Team not found, passed team_id={data.team_id}"},
|
||
)
|
||
|
||
await _validate_update_key_data(
|
||
data=data,
|
||
existing_key_row=existing_key_row,
|
||
user_api_key_dict=user_api_key_dict,
|
||
llm_router=llm_router,
|
||
premium_user=premium_user,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
)
|
||
|
||
non_default_values = await prepare_key_update_data(
|
||
data=data, existing_key_row=existing_key_row
|
||
)
|
||
|
||
# Only validate key_alias format if it's actually being changed
|
||
new_key_alias = non_default_values.get("key_alias", None)
|
||
if new_key_alias != existing_key_row.key_alias:
|
||
_validate_key_alias_format(key_alias=new_key_alias)
|
||
|
||
await _enforce_unique_key_alias(
|
||
key_alias=non_default_values.get("key_alias", None),
|
||
prisma_client=prisma_client,
|
||
existing_key_token=existing_key_row.token,
|
||
)
|
||
|
||
# Handle rotation fields if auto_rotate is being enabled
|
||
_set_key_rotation_fields(
|
||
non_default_values,
|
||
non_default_values.get("auto_rotate", False),
|
||
non_default_values.get("rotation_interval"),
|
||
existing_key_alias=existing_key_row.key_alias,
|
||
)
|
||
|
||
_data = {**non_default_values, "token": key}
|
||
response = await prisma_client.update_data(token=key, data=_data)
|
||
|
||
# Delete - key from cache, since it's been updated!
|
||
# key updated - a new model could have been added to this key. it should not block requests after this is done
|
||
await _delete_cache_key_object(
|
||
hashed_token=hash_token(key),
|
||
user_api_key_cache=user_api_key_cache,
|
||
proxy_logging_obj=proxy_logging_obj,
|
||
)
|
||
|
||
asyncio.create_task(
|
||
KeyManagementEventHooks.async_key_updated_hook(
|
||
data=data,
|
||
existing_key_row=existing_key_row,
|
||
response=response,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
)
|
||
)
|
||
|
||
if response is None:
|
||
raise ValueError("Failed to update key got response = None")
|
||
|
||
return {"key": key, **response["data"]}
|
||
# update based on remaining passed in values
|
||
except Exception as e:
|
||
verbose_proxy_logger.exception(
|
||
"litellm.proxy.proxy_server.update_key_fn(): Exception occured - {}".format(
|
||
str(e)
|
||
)
|
||
)
|
||
if isinstance(e, HTTPException):
|
||
raise ProxyException(
|
||
message=getattr(e, "detail", f"Authentication Error({str(e)})"),
|
||
type=ProxyErrorTypes.auth_error,
|
||
param=getattr(e, "param", "None"),
|
||
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
|
||
)
|
||
elif isinstance(e, ProxyException):
|
||
raise e
|
||
raise ProxyException(
|
||
message="Authentication Error, " + str(e),
|
||
type=ProxyErrorTypes.auth_error,
|
||
param=getattr(e, "param", "None"),
|
||
code=status.HTTP_400_BAD_REQUEST,
|
||
)
|
||
|
||
|
||
@router.post(
|
||
"/key/bulk_update",
|
||
tags=["key management"],
|
||
dependencies=[Depends(user_api_key_auth)],
|
||
response_model=BulkUpdateKeyResponse,
|
||
)
|
||
@management_endpoint_wrapper
|
||
async def bulk_update_keys(
|
||
data: BulkUpdateKeyRequest,
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
litellm_changed_by: Optional[str] = Header(
|
||
None,
|
||
description="The litellm-changed-by header enables tracking of actions performed by authorized users on behalf of other users, providing an audit trail for accountability",
|
||
),
|
||
):
|
||
"""
|
||
Bulk update multiple keys at once.
|
||
|
||
This endpoint allows updating multiple keys in a single request. Each key update
|
||
is processed independently - if some updates fail, others will still succeed.
|
||
|
||
Parameters:
|
||
- keys: List[BulkUpdateKeyRequestItem] - List of key update requests, each containing:
|
||
- key: str - The key identifier (token) to update
|
||
- budget_id: Optional[str] - Budget ID associated with the key
|
||
- max_budget: Optional[float] - Max budget for key
|
||
- team_id: Optional[str] - Team ID associated with key
|
||
- tags: Optional[List[str]] - Tags for organizing keys
|
||
|
||
Returns:
|
||
- total_requested: int - Total number of keys requested for update
|
||
- successful_updates: List[SuccessfulKeyUpdate] - List of successfully updated keys with their updated info
|
||
- failed_updates: List[FailedKeyUpdate] - List of failed updates with key_info and failed_reason
|
||
|
||
Example request:
|
||
```bash
|
||
curl --location 'http://0.0.0.0:4000/key/bulk_update' \
|
||
--header 'Authorization: Bearer sk-1234' \
|
||
--header 'Content-Type: application/json' \
|
||
--data '{
|
||
"keys": [
|
||
{
|
||
"key": "sk-1234",
|
||
"max_budget": 100.0,
|
||
"team_id": "team-123",
|
||
"tags": ["production", "api"]
|
||
},
|
||
{
|
||
"key": "sk-5678",
|
||
"budget_id": "budget-456",
|
||
"tags": ["staging"]
|
||
}
|
||
]
|
||
}'
|
||
```
|
||
"""
|
||
from litellm.proxy.proxy_server import (
|
||
llm_router,
|
||
prisma_client,
|
||
proxy_logging_obj,
|
||
user_api_key_cache,
|
||
)
|
||
|
||
if user_api_key_dict.user_role != LitellmUserRoles.PROXY_ADMIN.value:
|
||
raise HTTPException(
|
||
status_code=403,
|
||
detail={"error": "Only proxy admins can perform bulk key updates"},
|
||
)
|
||
|
||
if prisma_client is None:
|
||
raise HTTPException(
|
||
status_code=500,
|
||
detail={"error": "Database not connected"},
|
||
)
|
||
|
||
if not data.keys:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={"error": "No keys provided for update"},
|
||
)
|
||
|
||
MAX_BATCH_SIZE = 500
|
||
if len(data.keys) > MAX_BATCH_SIZE:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": f"Maximum {MAX_BATCH_SIZE} keys can be updated at once. Found {len(data.keys)} keys."
|
||
},
|
||
)
|
||
|
||
successful_updates: List[SuccessfulKeyUpdate] = []
|
||
failed_updates: List[FailedKeyUpdate] = []
|
||
|
||
for key_update_item in data.keys:
|
||
try:
|
||
# Process single key update using reusable function
|
||
updated_key_info = await _process_single_key_update(
|
||
key_update_item=key_update_item,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
proxy_logging_obj=proxy_logging_obj,
|
||
llm_router=llm_router,
|
||
)
|
||
|
||
successful_updates.append(
|
||
SuccessfulKeyUpdate(
|
||
key=key_update_item.key,
|
||
key_info=updated_key_info,
|
||
)
|
||
)
|
||
|
||
except Exception as e:
|
||
verbose_proxy_logger.exception(
|
||
f"Failed to update key {key_update_item.key}: {e}"
|
||
)
|
||
|
||
if isinstance(e, HTTPException):
|
||
error_detail = e.detail
|
||
if isinstance(error_detail, dict):
|
||
error_message = error_detail.get("error", str(e))
|
||
else:
|
||
error_message = str(error_detail)
|
||
else:
|
||
error_message = str(e)
|
||
|
||
key_info = None
|
||
try:
|
||
existing_key_row = await prisma_client.get_data(
|
||
token=key_update_item.key,
|
||
table_name="key",
|
||
query_type="find_unique",
|
||
)
|
||
if existing_key_row is not None:
|
||
if hasattr(existing_key_row, "model_dump"):
|
||
key_info = existing_key_row.model_dump()
|
||
elif hasattr(existing_key_row, "dict"):
|
||
key_info = existing_key_row.dict()
|
||
if key_info:
|
||
key_info.pop("token", None)
|
||
except Exception:
|
||
pass
|
||
|
||
failed_updates.append(
|
||
FailedKeyUpdate(
|
||
key=key_update_item.key,
|
||
key_info=key_info,
|
||
failed_reason=error_message,
|
||
)
|
||
)
|
||
|
||
return BulkUpdateKeyResponse(
|
||
total_requested=len(data.keys),
|
||
successful_updates=successful_updates,
|
||
failed_updates=failed_updates,
|
||
)
|
||
|
||
|
||
async def validate_key_team_change(
|
||
key: LiteLLM_VerificationToken,
|
||
team: LiteLLM_TeamTable,
|
||
change_initiated_by: UserAPIKeyAuth,
|
||
llm_router: Router,
|
||
):
|
||
"""
|
||
Validate that a key can be moved to a new team.
|
||
|
||
- The team must have access to the key's models
|
||
- The key's user_id must be a member of the team
|
||
- The key's tpm/rpm limit must be less than the team's tpm/rpm limit
|
||
- The person initiating the change must be either Proxy Admin or Team Admin
|
||
"""
|
||
# Check if the team has access to the key's models
|
||
if len(key.models) > 0:
|
||
for model in key.models:
|
||
# Skip special sentinel values — "all-team-models" means
|
||
# "use whatever the team allows", so it's always valid.
|
||
if model == SpecialModelNames.all_team_models.value:
|
||
continue
|
||
await can_team_access_model(
|
||
model=model,
|
||
team_object=team,
|
||
llm_router=llm_router,
|
||
)
|
||
|
||
# Check if the key's tpm/rpm limit is less than the team's tpm/rpm limit
|
||
if key.tpm_limit is not None:
|
||
if team.tpm_limit and key.tpm_limit > team.tpm_limit:
|
||
raise HTTPException(
|
||
status_code=403,
|
||
detail=f"Key={key.token} has a tpm_limit={key.tpm_limit} which is greater than the team's tpm_limit={team.tpm_limit}.",
|
||
)
|
||
if team.rpm_limit and key.rpm_limit and key.rpm_limit > team.rpm_limit:
|
||
raise HTTPException(
|
||
status_code=403,
|
||
detail=f"Key={key.token} has a rpm_limit={key.rpm_limit} which is greater than the team's rpm_limit={team.rpm_limit}.",
|
||
)
|
||
|
||
# Check if the key's user_id is a member of the team
|
||
member_object = _get_user_in_team(
|
||
team_table=cast(LiteLLM_TeamTableCachedObj, team), user_id=key.user_id
|
||
)
|
||
if key.user_id is not None:
|
||
if not member_object:
|
||
raise HTTPException(
|
||
status_code=403,
|
||
detail=f"User={key.user_id} is not a member of the team={team.team_id}. Check team members via `/team/info`.",
|
||
)
|
||
|
||
# Check if the person initiating the change is a Proxy Admin or Team Admin
|
||
if change_initiated_by.user_role == LitellmUserRoles.PROXY_ADMIN.value:
|
||
return
|
||
elif _is_user_team_admin(
|
||
user_api_key_dict=change_initiated_by,
|
||
team_obj=team,
|
||
):
|
||
return
|
||
# this teams member permissions allow updating a
|
||
elif TeamMemberPermissionChecks.does_team_member_have_permissions_for_endpoint(
|
||
team_member_object=member_object,
|
||
team_table=cast(LiteLLM_TeamTableCachedObj, team),
|
||
route=KeyManagementRoutes.KEY_UPDATE.value,
|
||
):
|
||
return
|
||
else:
|
||
raise HTTPException(
|
||
status_code=403,
|
||
detail=f"User={change_initiated_by.user_id} is not a Proxy Admin or Team Admin for team={team.team_id}. Please ask your Proxy Admin to allow this action under 'Member Permissions' for this team.",
|
||
)
|
||
|
||
|
||
@router.post(
|
||
"/key/delete", tags=["key management"], dependencies=[Depends(user_api_key_auth)]
|
||
)
|
||
@management_endpoint_wrapper
|
||
async def delete_key_fn(
|
||
data: KeyRequest,
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
litellm_changed_by: Optional[str] = Header(
|
||
None,
|
||
description="The litellm-changed-by header enables tracking of actions performed by authorized users on behalf of other users, providing an audit trail for accountability",
|
||
),
|
||
):
|
||
"""
|
||
Delete a key from the key management system.
|
||
|
||
Parameters::
|
||
- keys (List[str]): A list of keys or hashed keys to delete. Example {"keys": ["sk-QWrxEynunsNpV1zT48HIrw", "837e17519f44683334df5291321d97b8bf1098cd490e49e215f6fea935aa28be"]}
|
||
- key_aliases (List[str]): A list of key aliases to delete. Can be passed instead of `keys`.Example {"key_aliases": ["alias1", "alias2"]}
|
||
|
||
Returns:
|
||
- deleted_keys (List[str]): A list of deleted keys. Example {"deleted_keys": ["sk-QWrxEynunsNpV1zT48HIrw", "837e17519f44683334df5291321d97b8bf1098cd490e49e215f6fea935aa28be"]}
|
||
|
||
Example:
|
||
```bash
|
||
curl --location 'http://0.0.0.0:4000/key/delete' \
|
||
--header 'Authorization: Bearer sk-1234' \
|
||
--header 'Content-Type: application/json' \
|
||
--data '{
|
||
"keys": ["sk-QWrxEynunsNpV1zT48HIrw"]
|
||
}'
|
||
```
|
||
|
||
Raises:
|
||
HTTPException: If an error occurs during key deletion.
|
||
"""
|
||
try:
|
||
from litellm.proxy.proxy_server import prisma_client, user_api_key_cache
|
||
|
||
if prisma_client is None:
|
||
raise Exception("Not connected to DB!")
|
||
|
||
# Normalize litellm_changed_by: if it's a Header object or not a string, convert to None
|
||
if litellm_changed_by is not None and not isinstance(litellm_changed_by, str):
|
||
litellm_changed_by = None
|
||
|
||
## only allow user to delete keys they own
|
||
verbose_proxy_logger.debug(
|
||
f"user_api_key_dict.user_role: {user_api_key_dict.user_role}"
|
||
)
|
||
|
||
num_keys_to_be_deleted = 0
|
||
deleted_keys = []
|
||
if data.keys:
|
||
number_deleted_keys, _keys_being_deleted = await delete_verification_tokens(
|
||
tokens=data.keys,
|
||
user_api_key_cache=user_api_key_cache,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
)
|
||
num_keys_to_be_deleted = len(data.keys)
|
||
deleted_keys = data.keys
|
||
elif data.key_aliases:
|
||
number_deleted_keys, _keys_being_deleted = await delete_key_aliases(
|
||
key_aliases=data.key_aliases,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
)
|
||
num_keys_to_be_deleted = len(data.key_aliases)
|
||
deleted_keys = data.key_aliases
|
||
else:
|
||
raise ValueError("Invalid request type")
|
||
|
||
if number_deleted_keys is None:
|
||
raise ProxyException(
|
||
message="Failed to delete keys got None response from delete_verification_token",
|
||
type=ProxyErrorTypes.internal_server_error,
|
||
param="keys",
|
||
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
||
)
|
||
verbose_proxy_logger.debug(f"/key/delete - deleted_keys={number_deleted_keys}")
|
||
|
||
try:
|
||
assert num_keys_to_be_deleted == len(deleted_keys)
|
||
except Exception:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": f"Not all keys passed in were deleted. This probably means you don't have access to delete all the keys passed in. Keys passed in={num_keys_to_be_deleted}, Deleted keys ={number_deleted_keys}"
|
||
},
|
||
)
|
||
|
||
verbose_proxy_logger.debug(
|
||
f"/keys/delete - cache after delete: {user_api_key_cache.in_memory_cache.cache_dict}"
|
||
)
|
||
|
||
asyncio.create_task(
|
||
KeyManagementEventHooks.async_key_deleted_hook(
|
||
data=data,
|
||
keys_being_deleted=_keys_being_deleted,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
response=number_deleted_keys,
|
||
)
|
||
)
|
||
|
||
return {"deleted_keys": deleted_keys}
|
||
except Exception as e:
|
||
verbose_proxy_logger.exception(
|
||
"litellm.proxy.proxy_server.delete_key_fn(): Exception occured - {}".format(
|
||
str(e)
|
||
)
|
||
)
|
||
raise handle_exception_on_proxy(e)
|
||
|
||
|
||
@router.post(
|
||
"/v2/key/info",
|
||
tags=["key management"],
|
||
dependencies=[Depends(user_api_key_auth)],
|
||
include_in_schema=False,
|
||
)
|
||
async def info_key_fn_v2(
|
||
data: Optional[KeyRequest] = None,
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
):
|
||
"""
|
||
Retrieve information about a list of keys.
|
||
|
||
**New endpoint**. Currently admin only.
|
||
Parameters:
|
||
keys: Optional[list] = body parameter representing the key(s) in the request
|
||
user_api_key_dict: UserAPIKeyAuth = Dependency representing the user's API key
|
||
Returns:
|
||
Dict containing the key and its associated information
|
||
|
||
Example Curl:
|
||
```
|
||
curl -X GET "http://0.0.0.0:4000/key/info" \
|
||
-H "Authorization: Bearer sk-1234" \
|
||
-d {"keys": ["sk-1", "sk-2", "sk-3"]}
|
||
```
|
||
"""
|
||
from litellm.proxy.proxy_server import prisma_client
|
||
|
||
try:
|
||
if prisma_client is None:
|
||
raise Exception(
|
||
"Database not connected. Connect a database to your proxy - https://docs.litellm.ai/docs/simple_proxy#managing-auth---virtual-keys"
|
||
)
|
||
if data is None:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
|
||
detail={"message": "Malformed request. No keys passed in."},
|
||
)
|
||
|
||
key_info = await prisma_client.get_data(
|
||
token=data.keys, table_name="key", query_type="find_all"
|
||
)
|
||
if key_info is None:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_404_NOT_FOUND,
|
||
detail={"message": "No keys found"},
|
||
)
|
||
filtered_key_info = []
|
||
for k in key_info:
|
||
try:
|
||
k = k.model_dump() # noqa
|
||
except Exception:
|
||
# if using pydantic v1
|
||
k = k.dict()
|
||
filtered_key_info.append(k)
|
||
return {"key": data.keys, "info": filtered_key_info}
|
||
|
||
except Exception as e:
|
||
raise handle_exception_on_proxy(e)
|
||
|
||
|
||
@router.get(
|
||
"/key/info", tags=["key management"], dependencies=[Depends(user_api_key_auth)]
|
||
)
|
||
async def info_key_fn(
|
||
key: Optional[str] = fastapi.Query(
|
||
default=None, description="Key in the request parameters"
|
||
),
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
):
|
||
"""
|
||
Retrieve information about a key.
|
||
Parameters:
|
||
key: Optional[str] = Query parameter representing the key in the request
|
||
user_api_key_dict: UserAPIKeyAuth = Dependency representing the user's API key
|
||
Returns:
|
||
Dict containing the key and its associated information
|
||
|
||
Example Curl:
|
||
```
|
||
curl -X GET "http://0.0.0.0:4000/key/info?key=sk-test-example-key-123" \
|
||
-H "Authorization: Bearer sk-1234"
|
||
```
|
||
|
||
Example Curl - if no key is passed, it will use the Key Passed in Authorization Header
|
||
```
|
||
curl -X GET "http://0.0.0.0:4000/key/info" \
|
||
-H "Authorization: Bearer sk-test-example-key-123"
|
||
```
|
||
"""
|
||
from litellm.proxy.proxy_server import prisma_client
|
||
|
||
try:
|
||
if prisma_client is None:
|
||
raise Exception(
|
||
"Database not connected. Connect a database to your proxy - https://docs.litellm.ai/docs/simple_proxy#managing-auth---virtual-keys"
|
||
)
|
||
|
||
# default to using Auth token if no key is passed in
|
||
key = key or user_api_key_dict.api_key
|
||
hashed_key: Optional[str] = key
|
||
if key is not None:
|
||
hashed_key = _hash_token_if_needed(token=key)
|
||
key_info = await prisma_client.db.litellm_verificationtoken.find_unique(
|
||
where={"token": hashed_key}, # type: ignore
|
||
include={"litellm_budget_table": True},
|
||
)
|
||
if key_info is None:
|
||
raise ProxyException(
|
||
message="Key not found in database",
|
||
type=ProxyErrorTypes.not_found_error,
|
||
param="key",
|
||
code=status.HTTP_404_NOT_FOUND,
|
||
)
|
||
|
||
if (
|
||
await _can_user_query_key_info(
|
||
user_api_key_dict=user_api_key_dict,
|
||
key=key,
|
||
key_info=key_info,
|
||
)
|
||
is not True
|
||
):
|
||
raise HTTPException(
|
||
status_code=status.HTTP_403_FORBIDDEN,
|
||
detail="You are not allowed to access this key's info. Your role={}".format(
|
||
user_api_key_dict.user_role
|
||
),
|
||
)
|
||
## REMOVE HASHED TOKEN INFO BEFORE RETURNING ##
|
||
try:
|
||
key_info = key_info.model_dump() # noqa
|
||
except Exception:
|
||
# if using pydantic v1
|
||
key_info = key_info.dict()
|
||
key_info.pop("token")
|
||
|
||
# Attach object_permission if object_permission_id is set
|
||
key_info = await attach_object_permission_to_dict(key_info, prisma_client)
|
||
|
||
return {"key": key, "info": key_info}
|
||
except Exception as e:
|
||
raise handle_exception_on_proxy(e)
|
||
|
||
|
||
def _check_model_access_group(
|
||
models: Optional[List[str]], llm_router: Optional[Router], premium_user: bool
|
||
) -> Literal[True]:
|
||
"""
|
||
if is_model_access_group is True + is_wildcard_route is True, check if user is a premium user
|
||
|
||
Return True if user is a premium user, False otherwise
|
||
"""
|
||
if models is None or llm_router is None:
|
||
return True
|
||
|
||
for model in models:
|
||
if llm_router._is_model_access_group_for_wildcard_route(
|
||
model_access_group=model
|
||
):
|
||
if not premium_user:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_403_FORBIDDEN,
|
||
detail={
|
||
"error": "Setting a model access group on a wildcard model is only available for LiteLLM Enterprise users.{}".format(
|
||
CommonProxyErrors.not_premium_user.value
|
||
)
|
||
},
|
||
)
|
||
|
||
return True
|
||
|
||
|
||
async def generate_key_helper_fn( # noqa: PLR0915
|
||
request_type: Literal[
|
||
"user", "key"
|
||
], # identifies if this request is from /user/new or /key/generate
|
||
duration: Optional[str] = None,
|
||
models: list = [],
|
||
aliases: dict = {},
|
||
config: dict = {},
|
||
spend: float = 0.0,
|
||
key_max_budget: Optional[float] = None, # key_max_budget is used to Budget Per key
|
||
key_budget_duration: Optional[str] = None,
|
||
budget_id: Optional[float] = None, # budget id <-> LiteLLM_BudgetTable
|
||
soft_budget: Optional[
|
||
float
|
||
] = None, # soft_budget is used to set soft Budgets Per user
|
||
max_budget: Optional[float] = None, # max_budget is used to Budget Per user
|
||
blocked: Optional[bool] = None,
|
||
budget_duration: Optional[str] = None, # max_budget is used to Budget Per user
|
||
token: Optional[str] = None,
|
||
key: Optional[
|
||
str
|
||
] = None, # dev-friendly alt param for 'token'. Exposed on `/key/generate` for setting key value yourself.
|
||
user_id: Optional[str] = None,
|
||
user_alias: Optional[str] = None,
|
||
team_id: Optional[str] = None,
|
||
agent_id: Optional[str] = None,
|
||
user_email: Optional[str] = None,
|
||
user_role: Optional[str] = None,
|
||
max_parallel_requests: Optional[int] = None,
|
||
metadata: Optional[dict] = {},
|
||
tpm_limit: Optional[int] = None,
|
||
rpm_limit: Optional[int] = None,
|
||
query_type: Literal["insert_data", "update_data"] = "insert_data",
|
||
update_key_values: Optional[dict] = None,
|
||
key_alias: Optional[str] = None,
|
||
allowed_cache_controls: Optional[list] = [],
|
||
permissions: Optional[dict] = {},
|
||
model_max_budget: Optional[dict] = {},
|
||
model_rpm_limit: Optional[dict] = None,
|
||
model_tpm_limit: Optional[dict] = None,
|
||
guardrails: Optional[list] = None,
|
||
policies: Optional[list] = None,
|
||
prompts: Optional[list] = None,
|
||
teams: Optional[list] = None,
|
||
organization_id: Optional[str] = None,
|
||
project_id: Optional[str] = None,
|
||
table_name: Optional[Literal["key", "user"]] = None,
|
||
send_invite_email: Optional[bool] = None,
|
||
created_by: Optional[str] = None,
|
||
updated_by: Optional[str] = None,
|
||
allowed_routes: Optional[list] = None,
|
||
sso_user_id: Optional[str] = None,
|
||
object_permission_id: Optional[
|
||
str
|
||
] = None, # object_permission_id <-> LiteLLM_ObjectPermissionTable
|
||
object_permission: Optional[LiteLLM_ObjectPermissionBase] = None,
|
||
auto_rotate: Optional[bool] = None,
|
||
rotation_interval: Optional[str] = None,
|
||
router_settings: Optional[dict] = None,
|
||
access_group_ids: Optional[list] = None,
|
||
):
|
||
from litellm.proxy.proxy_server import premium_user, prisma_client
|
||
|
||
if prisma_client is None:
|
||
raise Exception(
|
||
"Connect Proxy to database to generate keys - https://docs.litellm.ai/docs/proxy/virtual_keys "
|
||
)
|
||
|
||
if token is None:
|
||
if key is not None:
|
||
token = key
|
||
else:
|
||
token = f"sk-{secrets.token_urlsafe(LENGTH_OF_LITELLM_GENERATED_KEY)}"
|
||
|
||
if duration is None: # allow tokens that never expire
|
||
expires = None
|
||
else:
|
||
# Add duration to current time for exact expiration (not standardized reset time)
|
||
duration_seconds = duration_in_seconds(duration)
|
||
expires = datetime.now(timezone.utc) + timedelta(seconds=duration_seconds)
|
||
|
||
if key_budget_duration is None: # one-time budget
|
||
key_reset_at = None
|
||
else:
|
||
key_reset_at = get_budget_reset_time(budget_duration=key_budget_duration)
|
||
|
||
if budget_duration is None: # one-time budget
|
||
reset_at = None
|
||
else:
|
||
reset_at = get_budget_reset_time(budget_duration=budget_duration)
|
||
|
||
aliases_json = json.dumps(aliases)
|
||
config_json = json.dumps(config)
|
||
permissions_json = json.dumps(permissions)
|
||
router_settings_json = (
|
||
safe_dumps(router_settings) if router_settings is not None else safe_dumps({})
|
||
)
|
||
|
||
# Add model_rpm_limit and model_tpm_limit to metadata
|
||
if model_rpm_limit is not None:
|
||
metadata = metadata or {}
|
||
metadata["model_rpm_limit"] = model_rpm_limit
|
||
if model_tpm_limit is not None:
|
||
metadata = metadata or {}
|
||
metadata["model_tpm_limit"] = model_tpm_limit
|
||
if guardrails is not None:
|
||
metadata = metadata or {}
|
||
metadata["guardrails"] = guardrails
|
||
if policies is not None:
|
||
metadata = metadata or {}
|
||
metadata["policies"] = policies
|
||
if prompts is not None:
|
||
metadata = metadata or {}
|
||
metadata["prompts"] = prompts
|
||
|
||
metadata_json = json.dumps(metadata)
|
||
validate_model_max_budget(model_max_budget)
|
||
model_max_budget_json = json.dumps(model_max_budget)
|
||
user_role = user_role
|
||
tpm_limit = tpm_limit
|
||
rpm_limit = rpm_limit
|
||
allowed_cache_controls = allowed_cache_controls
|
||
|
||
try:
|
||
# Create a new verification token (you may want to enhance this logic based on your needs)
|
||
|
||
user_data = {
|
||
"max_budget": max_budget,
|
||
"user_email": user_email,
|
||
"user_id": user_id,
|
||
"user_alias": user_alias,
|
||
"team_id": team_id,
|
||
"organization_id": organization_id,
|
||
"user_role": user_role,
|
||
"spend": spend,
|
||
"models": models,
|
||
"metadata": metadata_json,
|
||
"max_parallel_requests": max_parallel_requests,
|
||
"tpm_limit": tpm_limit,
|
||
"rpm_limit": rpm_limit,
|
||
"budget_duration": budget_duration,
|
||
"budget_reset_at": reset_at,
|
||
"allowed_cache_controls": allowed_cache_controls,
|
||
"sso_user_id": sso_user_id,
|
||
"object_permission_id": object_permission_id,
|
||
}
|
||
if teams is not None:
|
||
user_data["teams"] = teams
|
||
key_data = {
|
||
"token": token,
|
||
"key_alias": key_alias,
|
||
"expires": expires,
|
||
"models": models,
|
||
"aliases": aliases_json,
|
||
"config": config_json,
|
||
"spend": spend,
|
||
"max_budget": key_max_budget,
|
||
"user_id": user_id,
|
||
"team_id": team_id,
|
||
"agent_id": agent_id,
|
||
"project_id": project_id,
|
||
"max_parallel_requests": max_parallel_requests,
|
||
"metadata": metadata_json,
|
||
"tpm_limit": tpm_limit,
|
||
"rpm_limit": rpm_limit,
|
||
"budget_duration": key_budget_duration,
|
||
"budget_reset_at": key_reset_at,
|
||
"allowed_cache_controls": allowed_cache_controls,
|
||
"permissions": permissions_json,
|
||
"model_max_budget": model_max_budget_json,
|
||
"organization_id": organization_id,
|
||
"budget_id": budget_id,
|
||
"blocked": blocked,
|
||
"created_by": created_by,
|
||
"updated_by": updated_by,
|
||
"allowed_routes": allowed_routes or [],
|
||
"object_permission_id": object_permission_id,
|
||
"router_settings": router_settings_json,
|
||
"access_group_ids": access_group_ids or [],
|
||
}
|
||
|
||
# Add rotation fields if auto_rotate is enabled
|
||
_set_key_rotation_fields(
|
||
data=key_data,
|
||
auto_rotate=auto_rotate or False,
|
||
rotation_interval=rotation_interval,
|
||
)
|
||
|
||
if (
|
||
get_secret("DISABLE_KEY_NAME", False) is True
|
||
): # allow user to disable storing abbreviated key name (shown in UI, to help figure out which key spent how much)
|
||
pass
|
||
else:
|
||
key_data["key_name"] = abbreviate_api_key(api_key=token)
|
||
saved_token = copy.deepcopy(key_data)
|
||
if isinstance(saved_token["aliases"], str):
|
||
saved_token["aliases"] = json.loads(saved_token["aliases"])
|
||
if isinstance(saved_token["config"], str):
|
||
saved_token["config"] = json.loads(saved_token["config"])
|
||
if isinstance(saved_token["metadata"], str):
|
||
saved_token["metadata"] = json.loads(saved_token["metadata"])
|
||
if isinstance(saved_token["permissions"], str):
|
||
if (
|
||
"get_spend_routes" in saved_token["permissions"]
|
||
and premium_user is not True
|
||
):
|
||
raise ValueError(
|
||
"get_spend_routes permission is only available for LiteLLM Enterprise users"
|
||
)
|
||
|
||
saved_token["permissions"] = json.loads(saved_token["permissions"])
|
||
if isinstance(saved_token["model_max_budget"], str):
|
||
saved_token["model_max_budget"] = json.loads(
|
||
saved_token["model_max_budget"]
|
||
)
|
||
router_settings = cast(Optional[dict], saved_token.get("router_settings"))
|
||
if router_settings is not None and isinstance(router_settings, str):
|
||
try:
|
||
saved_token["router_settings"] = yaml.safe_load(router_settings)
|
||
except yaml.YAMLError:
|
||
# If it's not valid JSON/YAML, keep as is or set to empty dict
|
||
saved_token["router_settings"] = {}
|
||
|
||
if saved_token.get("expires", None) is not None and isinstance(
|
||
saved_token["expires"], datetime
|
||
):
|
||
saved_token["expires"] = saved_token["expires"].isoformat()
|
||
if prisma_client is not None:
|
||
if (
|
||
table_name is None or table_name == "user"
|
||
): # do not auto-create users for `/key/generate`
|
||
## CREATE USER (If necessary)
|
||
if query_type == "insert_data":
|
||
user_row = await prisma_client.insert_data(
|
||
data=user_data, table_name="user"
|
||
)
|
||
|
||
if user_row is None:
|
||
raise Exception("Failed to create user")
|
||
## use default user model list if no key-specific model list provided
|
||
if len(user_row.models) > 0 and len(key_data["models"]) == 0: # type: ignore
|
||
key_data["models"] = user_row.models # type: ignore
|
||
elif query_type == "update_data":
|
||
user_row = await prisma_client.update_data(
|
||
data=user_data,
|
||
table_name="user",
|
||
update_key_values=update_key_values,
|
||
)
|
||
if table_name is not None and table_name == "user":
|
||
# do not create a key if table name is set to just 'user'
|
||
# we only need to ensure this exists in the user table
|
||
# the LiteLLM_VerificationToken table will increase in size if we don't do this check
|
||
return user_data
|
||
|
||
## CREATE KEY
|
||
verbose_proxy_logger.debug("prisma_client: Creating Key= %s", key_data)
|
||
create_key_response = await prisma_client.insert_data(
|
||
data=key_data, table_name="key"
|
||
)
|
||
|
||
key_data["token_id"] = getattr(create_key_response, "token", None)
|
||
key_data["litellm_budget_table"] = getattr(
|
||
create_key_response, "litellm_budget_table", None
|
||
)
|
||
key_data["created_at"] = getattr(create_key_response, "created_at", None)
|
||
key_data["updated_at"] = getattr(create_key_response, "updated_at", None)
|
||
|
||
# Deserialize router_settings from JSON string to dict for response
|
||
router_settings_value = key_data.get("router_settings")
|
||
if router_settings_value is not None and isinstance(
|
||
router_settings_value, str
|
||
):
|
||
try:
|
||
key_data["router_settings"] = yaml.safe_load(router_settings_value)
|
||
except yaml.YAMLError:
|
||
# If it's not valid JSON/YAML, keep as is or set to empty dict
|
||
key_data["router_settings"] = {}
|
||
except Exception as e:
|
||
verbose_proxy_logger.error(
|
||
"litellm.proxy.proxy_server.generate_key_helper_fn(): Exception occured - {}".format(
|
||
str(e)
|
||
)
|
||
)
|
||
verbose_proxy_logger.debug(traceback.format_exc())
|
||
if isinstance(e, HTTPException):
|
||
raise e
|
||
raise HTTPException(
|
||
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
||
detail={"error": "Internal Server Error."},
|
||
)
|
||
|
||
# Add budget related info in key_data - this ensures it's returned
|
||
key_data["budget_id"] = budget_id
|
||
|
||
if request_type == "user":
|
||
# if this is a /user/new request update the key_date with user_data fields
|
||
key_data.update(user_data)
|
||
|
||
return key_data
|
||
|
||
|
||
async def _team_key_deletion_check(
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
key_info: LiteLLM_VerificationToken,
|
||
prisma_client: PrismaClient,
|
||
user_api_key_cache: DualCache,
|
||
):
|
||
is_team_key = _is_team_key(data=key_info)
|
||
|
||
if is_team_key and key_info.team_id is not None:
|
||
team_table = await get_team_object(
|
||
team_id=key_info.team_id,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
check_db_only=True,
|
||
)
|
||
if (
|
||
litellm.key_generation_settings is not None
|
||
and "team_key_generation" in litellm.key_generation_settings
|
||
):
|
||
_team_key_generation = litellm.key_generation_settings[
|
||
"team_key_generation"
|
||
]
|
||
else:
|
||
_team_key_generation = TeamUIKeyGenerationConfig(
|
||
allowed_team_member_roles=["admin", "user"],
|
||
)
|
||
# check if user is team admin
|
||
if team_table is not None:
|
||
return _team_key_operation_team_member_check(
|
||
assigned_user_id=user_api_key_dict.user_id,
|
||
team_table=team_table,
|
||
user_api_key_dict=user_api_key_dict,
|
||
team_key_generation=_team_key_generation,
|
||
route=KeyManagementRoutes.KEY_DELETE,
|
||
)
|
||
else:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_404_NOT_FOUND,
|
||
detail={
|
||
"error": f"Team not found in db, and user not proxy admin. Team id = {key_info.team_id}"
|
||
},
|
||
)
|
||
return False
|
||
|
||
|
||
async def can_modify_verification_token(
|
||
key_info: LiteLLM_VerificationToken,
|
||
user_api_key_cache: DualCache,
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
prisma_client: PrismaClient,
|
||
) -> bool:
|
||
"""
|
||
Check if user has permission to modify (delete/regenerate) a verification token.
|
||
|
||
Rules:
|
||
- Proxy admin can modify any key
|
||
- Internal jobs service account can modify any key (for auto-rotation)
|
||
- For team keys: only team admin or key owner can modify
|
||
- For personal keys: only key owner can modify
|
||
|
||
Args:
|
||
key_info: The verification token to check
|
||
user_api_key_cache: Cache for user API keys
|
||
user_api_key_dict: The user making the request
|
||
prisma_client: Prisma client for database access
|
||
|
||
Returns:
|
||
True if user can modify the key, False otherwise
|
||
"""
|
||
from litellm.constants import LITELLM_INTERNAL_JOBS_SERVICE_ACCOUNT_NAME
|
||
|
||
is_team_key = _is_team_key(data=key_info)
|
||
|
||
# 1. Proxy admin can modify any key
|
||
if user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value:
|
||
return True
|
||
|
||
# 2. Internal jobs service account can modify any key (for auto-rotation)
|
||
if user_api_key_dict.api_key == LITELLM_INTERNAL_JOBS_SERVICE_ACCOUNT_NAME:
|
||
return True
|
||
|
||
# 3. For team keys: only team admin or key owner can modify
|
||
if is_team_key and key_info.team_id is not None:
|
||
# Get team object to check if user is team admin
|
||
team_table = await get_team_object(
|
||
team_id=key_info.team_id,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
check_db_only=True,
|
||
)
|
||
|
||
if team_table is None:
|
||
return False
|
||
|
||
# Check if user is team admin
|
||
if _is_user_team_admin(
|
||
user_api_key_dict=user_api_key_dict,
|
||
team_obj=team_table,
|
||
):
|
||
return True
|
||
|
||
# Check if the key belongs to the user (they own it)
|
||
if (
|
||
key_info.user_id is not None
|
||
and key_info.user_id == user_api_key_dict.user_id
|
||
):
|
||
return True
|
||
|
||
# Not team admin and doesn't own the key
|
||
return False
|
||
|
||
# 4. For personal keys: only key owner can modify
|
||
if key_info.user_id is not None and key_info.user_id == user_api_key_dict.user_id:
|
||
return True
|
||
|
||
# Default: deny
|
||
return False
|
||
|
||
|
||
async def delete_verification_tokens(
|
||
tokens: List,
|
||
user_api_key_cache: DualCache,
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
litellm_changed_by: Optional[str] = None,
|
||
) -> Tuple[Optional[Dict], List[LiteLLM_VerificationToken]]:
|
||
"""
|
||
Helper that deletes the list of tokens from the database
|
||
|
||
- check if user is proxy admin
|
||
- check if user is team admin and key is a team key
|
||
|
||
Args:
|
||
tokens: List of tokens to delete
|
||
user_id: Optional user_id to filter by
|
||
|
||
Returns:
|
||
Tuple[Optional[Dict], List[LiteLLM_VerificationToken]]:
|
||
Optional[Dict]:
|
||
- Number of deleted tokens
|
||
List[LiteLLM_VerificationToken]:
|
||
- List of keys being deleted, this contains information about the key_alias, token, and user_id being deleted,
|
||
this is passed down to the KeyManagementEventHooks to delete the keys from the secret manager and handle audit logs
|
||
"""
|
||
from litellm.proxy.proxy_server import prisma_client
|
||
|
||
failed_tokens: List = []
|
||
try:
|
||
if prisma_client:
|
||
tokens = [_hash_token_if_needed(token=key) for key in tokens]
|
||
_keys_being_deleted: List[
|
||
LiteLLM_VerificationToken
|
||
] = await prisma_client.db.litellm_verificationtoken.find_many(
|
||
where={"token": {"in": tokens}}
|
||
)
|
||
|
||
if len(_keys_being_deleted) == 0:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_404_NOT_FOUND,
|
||
detail={"error": "No keys found"},
|
||
)
|
||
|
||
if user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value:
|
||
authorized_keys = _keys_being_deleted
|
||
else:
|
||
authorized_keys = []
|
||
for key in _keys_being_deleted:
|
||
if await can_modify_verification_token(
|
||
key_info=key,
|
||
user_api_key_cache=user_api_key_cache,
|
||
user_api_key_dict=user_api_key_dict,
|
||
prisma_client=prisma_client,
|
||
):
|
||
authorized_keys.append(key)
|
||
else:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_403_FORBIDDEN,
|
||
detail={
|
||
"error": "You are not authorized to delete this key"
|
||
},
|
||
)
|
||
await _persist_deleted_verification_tokens(
|
||
keys=authorized_keys,
|
||
prisma_client=prisma_client,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
)
|
||
|
||
if user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value:
|
||
deleted_tokens = await prisma_client.delete_data(tokens=tokens)
|
||
if deleted_tokens is not None and len(deleted_tokens) != len(tokens):
|
||
failed_tokens = [
|
||
token for token in tokens if token not in deleted_tokens
|
||
]
|
||
else:
|
||
deletion_tasks = [
|
||
prisma_client.delete_data(tokens=[key.token])
|
||
for key in authorized_keys
|
||
]
|
||
await asyncio.gather(*deletion_tasks)
|
||
|
||
deleted_tokens = [key.token for key in authorized_keys]
|
||
if len(deleted_tokens) != len(tokens):
|
||
failed_tokens = [
|
||
token for token in tokens if token not in deleted_tokens
|
||
]
|
||
else:
|
||
raise Exception("DB not connected. prisma_client is None")
|
||
except Exception as e:
|
||
verbose_proxy_logger.exception(
|
||
"litellm.proxy.proxy_server.delete_verification_tokens(): Exception occured - {}".format(
|
||
str(e)
|
||
)
|
||
)
|
||
verbose_proxy_logger.debug(traceback.format_exc())
|
||
raise e
|
||
|
||
for key in tokens:
|
||
user_api_key_cache.delete_cache(key)
|
||
# remove hash token from cache
|
||
hashed_token = hash_token(cast(str, key))
|
||
user_api_key_cache.delete_cache(hashed_token)
|
||
|
||
return {
|
||
"deleted_keys": deleted_tokens,
|
||
"failed_tokens": failed_tokens,
|
||
}, _keys_being_deleted
|
||
|
||
|
||
def _transform_verification_tokens_to_deleted_records(
|
||
keys: List[LiteLLM_VerificationToken],
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
litellm_changed_by: Optional[str] = None,
|
||
) -> List[Dict[str, Any]]:
|
||
"""Transform verification tokens into deleted token records ready for persistence."""
|
||
if not keys:
|
||
return []
|
||
|
||
deleted_at = datetime.now(timezone.utc)
|
||
records = []
|
||
for key in keys:
|
||
key_payload = key.model_dump()
|
||
deleted_record = LiteLLM_DeletedVerificationToken(
|
||
**key_payload,
|
||
deleted_at=deleted_at,
|
||
deleted_by=user_api_key_dict.user_id,
|
||
deleted_by_api_key=user_api_key_dict.api_key,
|
||
litellm_changed_by=litellm_changed_by,
|
||
)
|
||
record = deleted_record.model_dump()
|
||
|
||
# Map org_id to organization_id (model uses org_id, but schema expects organization_id)
|
||
org_id_value = record.pop("org_id", None)
|
||
if org_id_value is not None:
|
||
record["organization_id"] = org_id_value
|
||
|
||
for json_field in [
|
||
"aliases",
|
||
"config",
|
||
"permissions",
|
||
"metadata",
|
||
"model_spend",
|
||
"model_max_budget",
|
||
"router_settings",
|
||
]:
|
||
if json_field in record and record[json_field] is not None:
|
||
record[json_field] = json.dumps(record[json_field])
|
||
|
||
for rel_key in (
|
||
"litellm_budget_table",
|
||
"litellm_organization_table",
|
||
"object_permission",
|
||
"id",
|
||
):
|
||
record.pop(rel_key, None)
|
||
|
||
records.append(record)
|
||
|
||
return records
|
||
|
||
|
||
async def _save_deleted_verification_token_records(
|
||
records: List[Dict[str, Any]],
|
||
prisma_client: PrismaClient,
|
||
) -> None:
|
||
"""Save deleted verification token records to the database."""
|
||
if not records:
|
||
return
|
||
await prisma_client.db.litellm_deletedverificationtoken.create_many(data=records)
|
||
|
||
|
||
async def _persist_deleted_verification_tokens(
|
||
keys: List[LiteLLM_VerificationToken],
|
||
prisma_client: PrismaClient,
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
litellm_changed_by: Optional[str] = None,
|
||
) -> None:
|
||
"""Persist deleted verification token records by transforming and saving them."""
|
||
records = _transform_verification_tokens_to_deleted_records(
|
||
keys=keys,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
)
|
||
await _save_deleted_verification_token_records(
|
||
records=records,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
|
||
async def delete_key_aliases(
|
||
key_aliases: List[str],
|
||
user_api_key_cache: DualCache,
|
||
prisma_client: PrismaClient,
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
litellm_changed_by: Optional[str] = None,
|
||
) -> Tuple[Optional[Dict], List[LiteLLM_VerificationToken]]:
|
||
_keys_being_deleted = await prisma_client.db.litellm_verificationtoken.find_many(
|
||
where={"key_alias": {"in": key_aliases}}
|
||
)
|
||
|
||
tokens = [key.token for key in _keys_being_deleted]
|
||
return await delete_verification_tokens(
|
||
tokens=tokens,
|
||
user_api_key_cache=user_api_key_cache,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
)
|
||
|
||
|
||
async def _rotate_master_key( # noqa: PLR0915
|
||
prisma_client: PrismaClient,
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
current_master_key: str,
|
||
new_master_key: str,
|
||
) -> None:
|
||
"""
|
||
Rotate the master key
|
||
|
||
1. Get the values from the DB
|
||
- Get models from DB
|
||
- Get config from DB
|
||
2. Decrypt the values
|
||
- ModelTable
|
||
- [{"model_name": "str", "litellm_params": {}}]
|
||
- ConfigTable
|
||
3. Encrypt the values with the new master key
|
||
4. Update the values in the DB
|
||
"""
|
||
import prisma
|
||
|
||
from litellm.proxy.proxy_server import proxy_config
|
||
|
||
try:
|
||
models: Optional[
|
||
List
|
||
] = await prisma_client.db.litellm_proxymodeltable.find_many()
|
||
except Exception:
|
||
models = None
|
||
# 2. process model table
|
||
if models:
|
||
decrypted_models = proxy_config.decrypt_model_list_from_db(new_models=models)
|
||
verbose_proxy_logger.debug(
|
||
"ABLE TO DECRYPT MODELS - len(decrypted_models): %s", len(decrypted_models)
|
||
)
|
||
new_models = []
|
||
for model in decrypted_models:
|
||
new_model = await _add_model_to_db(
|
||
model_params=Deployment(**model),
|
||
user_api_key_dict=user_api_key_dict,
|
||
prisma_client=prisma_client,
|
||
new_encryption_key=new_master_key,
|
||
should_create_model_in_db=False,
|
||
)
|
||
if new_model:
|
||
_dumped = new_model.model_dump(exclude_none=True)
|
||
_dumped["litellm_params"] = prisma.Json(_dumped["litellm_params"]) # type: ignore[attr-defined]
|
||
_dumped["model_info"] = prisma.Json(_dumped["model_info"]) # type: ignore[attr-defined]
|
||
new_models.append(_dumped)
|
||
verbose_proxy_logger.debug("Resetting proxy model table")
|
||
async with prisma_client.db.tx() as tx:
|
||
await tx.litellm_proxymodeltable.delete_many()
|
||
verbose_proxy_logger.debug("Creating %s models", len(new_models))
|
||
await tx.litellm_proxymodeltable.create_many(
|
||
data=new_models,
|
||
)
|
||
# 3. process config table
|
||
try:
|
||
config = await prisma_client.db.litellm_config.find_many()
|
||
except Exception:
|
||
config = None
|
||
|
||
if config:
|
||
"""If environment_variables is found, decrypt it and encrypt it with the new master key"""
|
||
environment_variables_dict = {}
|
||
for c in config:
|
||
if c.param_name == "environment_variables":
|
||
environment_variables_dict = c.param_value
|
||
|
||
if environment_variables_dict:
|
||
decrypted_env_vars = proxy_config._decrypt_and_set_db_env_variables(
|
||
environment_variables=environment_variables_dict
|
||
)
|
||
encrypted_env_vars = proxy_config._encrypt_env_variables(
|
||
environment_variables=decrypted_env_vars,
|
||
new_encryption_key=new_master_key,
|
||
)
|
||
|
||
if encrypted_env_vars:
|
||
await prisma_client.db.litellm_config.update(
|
||
where={"param_name": "environment_variables"},
|
||
data={"param_value": prisma.Json(encrypted_env_vars)}, # type: ignore[attr-defined]
|
||
)
|
||
|
||
# 4. process MCP server table
|
||
try:
|
||
await rotate_mcp_server_credentials_master_key(
|
||
prisma_client=prisma_client,
|
||
touched_by=user_api_key_dict.user_id or LITELLM_PROXY_ADMIN_NAME,
|
||
new_master_key=new_master_key,
|
||
)
|
||
except Exception as e:
|
||
verbose_proxy_logger.warning(
|
||
"Failed to rotate MCP server credentials: %s", str(e)
|
||
)
|
||
|
||
# 5. process credentials table
|
||
try:
|
||
credentials = await prisma_client.db.litellm_credentialstable.find_many()
|
||
except Exception:
|
||
credentials = None
|
||
if credentials:
|
||
from litellm.proxy.credential_endpoints.endpoints import update_db_credential
|
||
|
||
for cred in credentials:
|
||
try:
|
||
decrypted_cred = proxy_config.decrypt_credentials(cred)
|
||
encrypted_cred = update_db_credential(
|
||
db_credential=cred,
|
||
updated_patch=decrypted_cred,
|
||
new_encryption_key=new_master_key,
|
||
)
|
||
_cred_data = encrypted_cred.model_dump(exclude_none=True)
|
||
if "credential_values" in _cred_data:
|
||
_cred_data["credential_values"] = prisma.Json( # type: ignore[attr-defined]
|
||
_cred_data["credential_values"]
|
||
)
|
||
if "credential_info" in _cred_data:
|
||
_cred_data["credential_info"] = prisma.Json( # type: ignore[attr-defined]
|
||
_cred_data["credential_info"]
|
||
)
|
||
await prisma_client.db.litellm_credentialstable.update(
|
||
where={"credential_name": cred.credential_name},
|
||
data={
|
||
**_cred_data,
|
||
"updated_by": user_api_key_dict.user_id,
|
||
},
|
||
)
|
||
except Exception as e:
|
||
verbose_proxy_logger.error(
|
||
f"Failed to re-encrypt credential {cred.credential_name}: {str(e)}"
|
||
)
|
||
# Continue with next credential instead of failing entire rotation
|
||
continue
|
||
verbose_proxy_logger.debug(
|
||
f"Successfully re-encrypted {len(credentials)} credentials with new master key"
|
||
)
|
||
|
||
|
||
async def get_new_token(data: Optional[RegenerateKeyRequest]) -> str:
|
||
if data and data.new_key is not None:
|
||
# Reject custom key values if disabled by admin
|
||
await _check_custom_key_allowed(data.new_key)
|
||
new_token = data.new_key
|
||
if not data.new_key.startswith("sk-"):
|
||
raise HTTPException(
|
||
status_code=status.HTTP_400_BAD_REQUEST,
|
||
detail={
|
||
"error": "New key must start with 'sk-'. This is to distinguish a key hash (used by litellm for logging / internal logic) from the actual key."
|
||
},
|
||
)
|
||
else:
|
||
new_token = f"sk-{secrets.token_urlsafe(LENGTH_OF_LITELLM_GENERATED_KEY)}"
|
||
return new_token
|
||
|
||
|
||
async def _insert_deprecated_key(
|
||
prisma_client: "PrismaClient",
|
||
old_token_hash: str,
|
||
new_token_hash: str,
|
||
grace_period: Optional[str],
|
||
) -> None:
|
||
"""
|
||
Insert old key into deprecated table so it remains valid during grace period.
|
||
|
||
Uses upsert to handle concurrent rotations gracefully.
|
||
|
||
Parameters:
|
||
prisma_client: DB client
|
||
old_token_hash: Hash of the old key being rotated out
|
||
new_token_hash: Hash of the new replacement key
|
||
grace_period: Duration string (e.g. "24h", "2d") or None/empty for immediate revoke
|
||
"""
|
||
grace_period_value = grace_period or os.getenv(
|
||
"LITELLM_KEY_ROTATION_GRACE_PERIOD", ""
|
||
)
|
||
if not grace_period_value:
|
||
return
|
||
|
||
try:
|
||
grace_seconds = duration_in_seconds(grace_period_value)
|
||
except ValueError:
|
||
verbose_proxy_logger.warning(
|
||
"Invalid grace_period format: %s. Expected format like '24h', '2d'.",
|
||
grace_period_value,
|
||
)
|
||
return
|
||
|
||
if grace_seconds <= 0:
|
||
return
|
||
|
||
try:
|
||
revoke_at = datetime.now(timezone.utc) + timedelta(seconds=grace_seconds)
|
||
await prisma_client.db.litellm_deprecatedverificationtoken.upsert(
|
||
where={"token": old_token_hash},
|
||
data={
|
||
"create": {
|
||
"token": old_token_hash,
|
||
"active_token_id": new_token_hash,
|
||
"revoke_at": revoke_at,
|
||
},
|
||
"update": {
|
||
"active_token_id": new_token_hash,
|
||
"revoke_at": revoke_at,
|
||
},
|
||
},
|
||
)
|
||
verbose_proxy_logger.debug(
|
||
"Deprecated key retained for %s (revoke_at: %s)",
|
||
grace_period_value,
|
||
revoke_at,
|
||
)
|
||
except Exception as deprecated_err:
|
||
verbose_proxy_logger.warning(
|
||
"Failed to insert deprecated key for grace period: %s",
|
||
deprecated_err,
|
||
)
|
||
|
||
|
||
async def _execute_virtual_key_regeneration(
|
||
*,
|
||
prisma_client: PrismaClient,
|
||
key_in_db: LiteLLM_VerificationToken,
|
||
hashed_api_key: str,
|
||
key: str,
|
||
data: Optional[RegenerateKeyRequest],
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
litellm_changed_by: Optional[str],
|
||
user_api_key_cache: DualCache,
|
||
proxy_logging_obj: ProxyLogging,
|
||
) -> GenerateKeyResponse:
|
||
"""Generate new token, update DB, invalidate cache, and return response."""
|
||
from litellm.proxy.proxy_server import hash_token
|
||
|
||
new_token = await get_new_token(data=data)
|
||
new_token_hash = hash_token(new_token)
|
||
new_token_key_name = f"sk-...{new_token[-4:]}"
|
||
update_data = {"token": new_token_hash, "key_name": new_token_key_name}
|
||
|
||
non_default_values = {}
|
||
if data is not None:
|
||
non_default_values = await prepare_key_update_data(
|
||
data=data, existing_key_row=key_in_db
|
||
)
|
||
# Only validate key_alias format if it's actually being changed
|
||
new_key_alias = non_default_values.get("key_alias")
|
||
if new_key_alias != key_in_db.key_alias:
|
||
_validate_key_alias_format(key_alias=new_key_alias)
|
||
verbose_proxy_logger.debug("non_default_values: %s", non_default_values)
|
||
update_data.update(non_default_values)
|
||
update_data = prisma_client.jsonify_object(data=update_data)
|
||
|
||
# If grace period set, insert deprecated key so old key remains valid
|
||
await _insert_deprecated_key(
|
||
prisma_client=prisma_client,
|
||
old_token_hash=hashed_api_key,
|
||
new_token_hash=new_token_hash,
|
||
grace_period=data.grace_period if data else None,
|
||
)
|
||
|
||
updated_token = await prisma_client.db.litellm_verificationtoken.update(
|
||
where={"token": hashed_api_key},
|
||
data=update_data, # type: ignore
|
||
)
|
||
updated_token_dict = dict(updated_token) if updated_token is not None else {}
|
||
updated_token_dict["key"] = new_token
|
||
updated_token_dict["token_id"] = updated_token_dict.pop("token")
|
||
|
||
if hashed_api_key or key:
|
||
await _delete_cache_key_object(
|
||
hashed_token=hash_token(key),
|
||
user_api_key_cache=user_api_key_cache,
|
||
proxy_logging_obj=proxy_logging_obj,
|
||
)
|
||
|
||
response = GenerateKeyResponse(**updated_token_dict)
|
||
asyncio.create_task(
|
||
KeyManagementEventHooks.async_key_rotated_hook(
|
||
data=data,
|
||
existing_key_row=key_in_db,
|
||
response=response,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
)
|
||
)
|
||
return response
|
||
|
||
|
||
@router.post(
|
||
"/key/{key:path}/regenerate",
|
||
tags=["key management"],
|
||
dependencies=[Depends(user_api_key_auth)],
|
||
)
|
||
@router.post(
|
||
"/key/regenerate",
|
||
tags=["key management"],
|
||
dependencies=[Depends(user_api_key_auth)],
|
||
)
|
||
@management_endpoint_wrapper
|
||
async def regenerate_key_fn( # noqa: PLR0915
|
||
key: Optional[str] = None,
|
||
data: Optional[RegenerateKeyRequest] = None,
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
litellm_changed_by: Optional[str] = Header(
|
||
None,
|
||
description="The litellm-changed-by header enables tracking of actions performed by authorized users on behalf of other users, providing an audit trail for accountability",
|
||
),
|
||
) -> Optional[GenerateKeyResponse]:
|
||
"""
|
||
Regenerate an existing API key while optionally updating its parameters.
|
||
|
||
Parameters:
|
||
- key: str (path parameter) - The key to regenerate
|
||
- data: Optional[RegenerateKeyRequest] - Request body containing optional parameters to update
|
||
- key: Optional[str] - The key to regenerate.
|
||
- new_master_key: Optional[str] - The new master key to use, if key is the master key.
|
||
- new_key: Optional[str] - The new key to use, if key is not the master key. If both set, new_master_key will be used.
|
||
- key_alias: Optional[str] - User-friendly key alias
|
||
- user_id: Optional[str] - User ID associated with key
|
||
- team_id: Optional[str] - Team ID associated with key
|
||
- models: Optional[list] - Model_name's a user is allowed to call
|
||
- tags: Optional[List[str]] - Tags for organizing keys (Enterprise only)
|
||
- spend: Optional[float] - Amount spent by key
|
||
- max_budget: Optional[float] - Max budget for key
|
||
- model_max_budget: Optional[Dict[str, BudgetConfig]] - Model-specific budgets {"gpt-4": {"budget_limit": 0.0005, "time_period": "30d"}}
|
||
- budget_duration: Optional[str] - Budget reset period ("30d", "1h", etc.)
|
||
- soft_budget: Optional[float] - Soft budget limit (warning vs. hard stop). Will trigger a slack alert when this soft budget is reached.
|
||
- max_parallel_requests: Optional[int] - Rate limit for parallel requests
|
||
- metadata: Optional[dict] - Metadata for key. Example {"team": "core-infra", "app": "app2"}
|
||
- tpm_limit: Optional[int] - Tokens per minute limit
|
||
- rpm_limit: Optional[int] - Requests per minute limit
|
||
- model_rpm_limit: Optional[dict] - Model-specific RPM limits {"gpt-4": 100, "claude-v1": 200}
|
||
- model_tpm_limit: Optional[dict] - Model-specific TPM limits {"gpt-4": 100000, "claude-v1": 200000}
|
||
- allowed_cache_controls: Optional[list] - List of allowed cache control values
|
||
- duration: Optional[str] - Key validity duration ("30d", "1h", etc.)
|
||
- permissions: Optional[dict] - Key-specific permissions
|
||
- guardrails: Optional[List[str]] - List of active guardrails for the key
|
||
- blocked: Optional[bool] - Whether the key is blocked
|
||
- grace_period: Optional[str] - Duration to keep old key valid after rotation (e.g. "24h", "2d"). Omitted = immediate revoke. Env: LITELLM_KEY_ROTATION_GRACE_PERIOD
|
||
|
||
|
||
Returns:
|
||
- GenerateKeyResponse containing the new key and its updated parameters
|
||
|
||
Example:
|
||
```bash
|
||
curl --location --request POST 'http://localhost:4000/key/sk-1234/regenerate' \
|
||
--header 'Authorization: Bearer sk-1234' \
|
||
--header 'Content-Type: application/json' \
|
||
--data-raw '{
|
||
"max_budget": 100,
|
||
"metadata": {"team": "core-infra"},
|
||
"models": ["gpt-4", "gpt-3.5-turbo"]
|
||
}'
|
||
```
|
||
|
||
Note: This is an Enterprise feature. It requires a premium license to use.
|
||
"""
|
||
try:
|
||
from litellm.proxy.proxy_server import (
|
||
hash_token,
|
||
master_key,
|
||
premium_user,
|
||
prisma_client,
|
||
proxy_logging_obj,
|
||
user_api_key_cache,
|
||
)
|
||
|
||
is_master_key_regeneration = data and data.new_master_key is not None
|
||
|
||
if (
|
||
premium_user is not True and not is_master_key_regeneration
|
||
): # allow master key regeneration for non-premium users
|
||
raise ValueError(
|
||
f"Regenerating Virtual Keys is an Enterprise feature, {CommonProxyErrors.not_premium_user.value}"
|
||
)
|
||
|
||
# Check if key exists, raise exception if key is not in the DB
|
||
key = data.key if data and data.key else key
|
||
if not key:
|
||
raise HTTPException(status_code=400, detail={"error": "No key passed in."})
|
||
### 1. Create New copy that is duplicate of existing key
|
||
######################################################################
|
||
|
||
# create duplicate of existing key
|
||
# set token = new token generated
|
||
# insert new token in DB
|
||
|
||
# create hash of token
|
||
if prisma_client is None:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
||
detail={"error": "DB not connected. prisma_client is None"},
|
||
)
|
||
|
||
_is_master_key_valid = _is_master_key(api_key=key, _master_key=master_key)
|
||
|
||
if master_key is not None and data and _is_master_key_valid:
|
||
if data.new_master_key is None:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_400_BAD_REQUEST,
|
||
detail={"error": "New master key is required."},
|
||
)
|
||
await _rotate_master_key(
|
||
prisma_client=prisma_client,
|
||
user_api_key_dict=user_api_key_dict,
|
||
current_master_key=master_key,
|
||
new_master_key=data.new_master_key,
|
||
)
|
||
return GenerateKeyResponse(
|
||
key=data.new_master_key,
|
||
token=data.new_master_key,
|
||
key_name=data.new_master_key,
|
||
expires=None,
|
||
)
|
||
|
||
if "sk" not in key:
|
||
hashed_api_key = key
|
||
else:
|
||
hashed_api_key = hash_token(key)
|
||
|
||
_key_in_db = await prisma_client.db.litellm_verificationtoken.find_unique(
|
||
where={"token": hashed_api_key},
|
||
)
|
||
if _key_in_db is None:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_404_NOT_FOUND,
|
||
detail={"error": f"Key {key} not found."},
|
||
)
|
||
|
||
# check if user has permission to regenerate key
|
||
await TeamMemberPermissionChecks.can_team_member_execute_key_management_endpoint(
|
||
user_api_key_dict=user_api_key_dict,
|
||
route=KeyManagementRoutes.KEY_REGENERATE,
|
||
prisma_client=prisma_client,
|
||
existing_key_row=_key_in_db,
|
||
user_api_key_cache=user_api_key_cache,
|
||
)
|
||
|
||
# check if user has ownership permission to regenerate key
|
||
if not await can_modify_verification_token(
|
||
key_info=_key_in_db,
|
||
user_api_key_cache=user_api_key_cache,
|
||
user_api_key_dict=user_api_key_dict,
|
||
prisma_client=prisma_client,
|
||
):
|
||
raise HTTPException(
|
||
status_code=status.HTTP_403_FORBIDDEN,
|
||
detail={"error": "You are not authorized to regenerate this key"},
|
||
)
|
||
|
||
verbose_proxy_logger.info(
|
||
"Key regeneration requested: key_alias=%s",
|
||
getattr(_key_in_db, "key_alias", None),
|
||
)
|
||
verbose_proxy_logger.debug("key_in_db: %s", _key_in_db)
|
||
|
||
# Normalize litellm_changed_by: if it's a Header object or not a string, convert to None
|
||
if litellm_changed_by is not None and not isinstance(litellm_changed_by, str):
|
||
litellm_changed_by = None
|
||
|
||
# Save the old key record to deleted table before regeneration.
|
||
# This preserves key_alias and team_id metadata for historical spend records.
|
||
# If this fails, abort the regeneration to avoid permanently losing the
|
||
# old hash→metadata mapping.
|
||
await _persist_deleted_verification_tokens(
|
||
keys=[_key_in_db],
|
||
prisma_client=prisma_client,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
)
|
||
|
||
return await _execute_virtual_key_regeneration(
|
||
prisma_client=prisma_client,
|
||
key_in_db=_key_in_db,
|
||
hashed_api_key=hashed_api_key,
|
||
key=key,
|
||
data=data,
|
||
user_api_key_dict=user_api_key_dict,
|
||
litellm_changed_by=litellm_changed_by,
|
||
user_api_key_cache=user_api_key_cache,
|
||
proxy_logging_obj=proxy_logging_obj,
|
||
)
|
||
except Exception as e:
|
||
verbose_proxy_logger.exception("Error regenerating key: %s", e)
|
||
raise handle_exception_on_proxy(e)
|
||
|
||
|
||
async def _check_proxy_or_team_admin_for_key(
|
||
key_in_db: LiteLLM_VerificationToken,
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
prisma_client: PrismaClient,
|
||
user_api_key_cache: DualCache,
|
||
) -> None:
|
||
if user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value:
|
||
return
|
||
|
||
if key_in_db.team_id is not None:
|
||
team_table = await get_team_object(
|
||
team_id=key_in_db.team_id,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
check_db_only=True,
|
||
)
|
||
if team_table is not None:
|
||
if _is_user_team_admin(
|
||
user_api_key_dict=user_api_key_dict,
|
||
team_obj=team_table,
|
||
):
|
||
return
|
||
|
||
raise HTTPException(
|
||
status_code=status.HTTP_403_FORBIDDEN,
|
||
detail={"error": "You must be a proxy admin or team admin to reset key spend"},
|
||
)
|
||
|
||
|
||
def _validate_reset_spend_value(
|
||
reset_to: Any, key_in_db: LiteLLM_VerificationToken
|
||
) -> float:
|
||
if not isinstance(reset_to, (int, float)):
|
||
raise HTTPException(
|
||
status_code=status.HTTP_400_BAD_REQUEST,
|
||
detail={"error": "reset_to must be a float"},
|
||
)
|
||
|
||
reset_to = float(reset_to)
|
||
|
||
if reset_to < 0:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_400_BAD_REQUEST,
|
||
detail={"error": "reset_to must be >= 0"},
|
||
)
|
||
|
||
current_spend = key_in_db.spend or 0.0
|
||
if reset_to > current_spend:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_400_BAD_REQUEST,
|
||
detail={
|
||
"error": f"reset_to ({reset_to}) must be <= current spend ({current_spend})"
|
||
},
|
||
)
|
||
|
||
max_budget = key_in_db.max_budget
|
||
if key_in_db.litellm_budget_table is not None:
|
||
budget_max_budget = getattr(key_in_db.litellm_budget_table, "max_budget", None)
|
||
if budget_max_budget is not None:
|
||
if max_budget is None or budget_max_budget < max_budget:
|
||
max_budget = budget_max_budget
|
||
|
||
if max_budget is not None and reset_to > max_budget:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_400_BAD_REQUEST,
|
||
detail={"error": f"reset_to ({reset_to}) must be <= budget ({max_budget})"},
|
||
)
|
||
|
||
return reset_to
|
||
|
||
|
||
@router.post(
|
||
"/key/{key:path}/reset_spend",
|
||
tags=["key management"],
|
||
dependencies=[Depends(user_api_key_auth)],
|
||
)
|
||
@management_endpoint_wrapper
|
||
async def reset_key_spend_fn(
|
||
key: str,
|
||
data: ResetSpendRequest,
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
litellm_changed_by: Optional[str] = Header(
|
||
None,
|
||
description="The litellm-changed-by header enables tracking of actions performed by authorized users on behalf of other users, providing an audit trail for accountability",
|
||
),
|
||
) -> Dict[str, Any]:
|
||
try:
|
||
from litellm.proxy.proxy_server import (
|
||
hash_token,
|
||
prisma_client,
|
||
proxy_logging_obj,
|
||
user_api_key_cache,
|
||
)
|
||
|
||
if prisma_client is None:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
||
detail={"error": "DB not connected. prisma_client is None"},
|
||
)
|
||
|
||
if "sk" not in key:
|
||
hashed_api_key = key
|
||
else:
|
||
hashed_api_key = hash_token(key)
|
||
|
||
_key_in_db = await prisma_client.db.litellm_verificationtoken.find_unique(
|
||
where={"token": hashed_api_key},
|
||
include={"litellm_budget_table": True},
|
||
)
|
||
if _key_in_db is None:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_404_NOT_FOUND,
|
||
detail={"error": f"Key {key} not found."},
|
||
)
|
||
|
||
current_spend = _key_in_db.spend or 0.0
|
||
reset_to = _validate_reset_spend_value(data.reset_to, _key_in_db)
|
||
|
||
await _check_proxy_or_team_admin_for_key(
|
||
key_in_db=_key_in_db,
|
||
user_api_key_dict=user_api_key_dict,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
)
|
||
|
||
updated_key = await prisma_client.db.litellm_verificationtoken.update(
|
||
where={"token": hashed_api_key},
|
||
data={"spend": reset_to},
|
||
)
|
||
|
||
if updated_key is None:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
||
detail={"error": "Failed to update key spend"},
|
||
)
|
||
|
||
await _delete_cache_key_object(
|
||
hashed_token=hashed_api_key,
|
||
user_api_key_cache=user_api_key_cache,
|
||
proxy_logging_obj=proxy_logging_obj,
|
||
)
|
||
|
||
max_budget = updated_key.max_budget
|
||
budget_reset_at = updated_key.budget_reset_at
|
||
|
||
return {
|
||
"key_hash": hashed_api_key,
|
||
"spend": reset_to,
|
||
"previous_spend": current_spend,
|
||
"max_budget": max_budget,
|
||
"budget_reset_at": budget_reset_at,
|
||
}
|
||
except HTTPException:
|
||
raise
|
||
except Exception as e:
|
||
verbose_proxy_logger.exception("Error resetting key spend: %s", e)
|
||
raise handle_exception_on_proxy(e)
|
||
|
||
|
||
async def validate_key_list_check(
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
user_id: Optional[str],
|
||
team_id: Optional[str],
|
||
organization_id: Optional[str],
|
||
key_alias: Optional[str],
|
||
key_hash: Optional[str],
|
||
prisma_client: PrismaClient,
|
||
) -> Optional[LiteLLM_UserTable]:
|
||
if user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value:
|
||
return None
|
||
|
||
if user_api_key_dict.user_id is None:
|
||
raise ProxyException(
|
||
message="You are not authorized to access this endpoint. No 'user_id' is associated with your API key.",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="user_id",
|
||
code=status.HTTP_403_FORBIDDEN,
|
||
)
|
||
complete_user_info_db_obj: Optional[
|
||
BaseModel
|
||
] = await prisma_client.db.litellm_usertable.find_unique(
|
||
where={"user_id": user_api_key_dict.user_id},
|
||
include={"organization_memberships": True},
|
||
)
|
||
|
||
if complete_user_info_db_obj is None:
|
||
raise ProxyException(
|
||
message="You are not authorized to access this endpoint. No 'user_id' is associated with your API key.",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="user_id",
|
||
code=status.HTTP_403_FORBIDDEN,
|
||
)
|
||
|
||
complete_user_info = LiteLLM_UserTable(**complete_user_info_db_obj.model_dump())
|
||
|
||
# internal user can only see their own keys
|
||
if user_id:
|
||
if complete_user_info.user_id != user_id:
|
||
raise ProxyException(
|
||
message="You are not authorized to check another user's keys",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="user_id",
|
||
code=status.HTTP_403_FORBIDDEN,
|
||
)
|
||
|
||
if team_id:
|
||
if team_id not in complete_user_info.teams:
|
||
raise ProxyException(
|
||
message="You are not authorized to check this team's keys",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="team_id",
|
||
code=status.HTTP_403_FORBIDDEN,
|
||
)
|
||
|
||
if organization_id:
|
||
if (
|
||
complete_user_info.organization_memberships is None
|
||
or organization_id
|
||
not in [
|
||
membership.organization_id
|
||
for membership in complete_user_info.organization_memberships
|
||
]
|
||
):
|
||
raise ProxyException(
|
||
message="You are not authorized to check this organization's keys",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="organization_id",
|
||
code=status.HTTP_403_FORBIDDEN,
|
||
)
|
||
|
||
if key_hash:
|
||
try:
|
||
key_info = await prisma_client.db.litellm_verificationtoken.find_unique(
|
||
where={"token": key_hash},
|
||
)
|
||
except Exception:
|
||
raise ProxyException(
|
||
message="Key Hash not found.",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="key_hash",
|
||
code=status.HTTP_403_FORBIDDEN,
|
||
)
|
||
can_user_query_key_info = await _can_user_query_key_info(
|
||
user_api_key_dict=user_api_key_dict,
|
||
key=key_hash,
|
||
key_info=key_info,
|
||
)
|
||
if not can_user_query_key_info:
|
||
raise HTTPException(
|
||
status_code=status.HTTP_403_FORBIDDEN,
|
||
detail="You are not allowed to access this key's info. Your role={}".format(
|
||
user_api_key_dict.user_role
|
||
),
|
||
)
|
||
return complete_user_info
|
||
|
||
|
||
async def _fetch_user_team_objects(
|
||
complete_user_info: Optional[LiteLLM_UserTable],
|
||
prisma_client: PrismaClient,
|
||
) -> List[LiteLLM_TeamTable]:
|
||
"""Fetch team objects for all teams a user belongs to (single DB query)."""
|
||
if complete_user_info is None or not complete_user_info.teams:
|
||
return []
|
||
|
||
teams: Optional[
|
||
List[BaseModel]
|
||
] = await prisma_client.db.litellm_teamtable.find_many(
|
||
where={"team_id": {"in": complete_user_info.teams}}
|
||
)
|
||
if teams is None:
|
||
return []
|
||
|
||
return [LiteLLM_TeamTable(**team.model_dump()) for team in teams]
|
||
|
||
|
||
def _get_admin_team_ids_from_objects(
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
team_objects: List[LiteLLM_TeamTable],
|
||
) -> List[str]:
|
||
"""Filter team objects to those where the user is an admin."""
|
||
return [
|
||
team.team_id
|
||
for team in team_objects
|
||
if _is_user_team_admin(user_api_key_dict=user_api_key_dict, team_obj=team)
|
||
]
|
||
|
||
|
||
def _get_member_team_ids_from_objects(
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
team_objects: List[LiteLLM_TeamTable],
|
||
) -> List[str]:
|
||
"""Filter team objects to those where the user is a member (any role)."""
|
||
return [
|
||
team.team_id
|
||
for team in team_objects
|
||
if any(
|
||
member.user_id is not None and member.user_id == user_api_key_dict.user_id
|
||
for member in team.members_with_roles
|
||
)
|
||
]
|
||
|
||
|
||
async def get_admin_team_ids(
|
||
complete_user_info: Optional[LiteLLM_UserTable],
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
prisma_client: PrismaClient,
|
||
) -> List[str]:
|
||
"""Get all team IDs where the user is an admin."""
|
||
team_objects = await _fetch_user_team_objects(complete_user_info, prisma_client)
|
||
return _get_admin_team_ids_from_objects(user_api_key_dict, team_objects)
|
||
|
||
|
||
async def get_member_team_ids(
|
||
complete_user_info: Optional[LiteLLM_UserTable],
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
prisma_client: PrismaClient,
|
||
) -> List[str]:
|
||
"""
|
||
Get all team IDs where the user is a member (any role, including admin).
|
||
|
||
Used to determine which teams' service accounts (keys with user_id=NULL)
|
||
a regular team member can see.
|
||
"""
|
||
team_objects = await _fetch_user_team_objects(complete_user_info, prisma_client)
|
||
return _get_member_team_ids_from_objects(user_api_key_dict, team_objects)
|
||
|
||
|
||
@router.get(
|
||
"/key/list",
|
||
tags=["key management"],
|
||
dependencies=[Depends(user_api_key_auth)],
|
||
)
|
||
@management_endpoint_wrapper
|
||
async def list_keys(
|
||
request: Request,
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
page: int = Query(1, description="Page number", ge=1),
|
||
size: int = Query(10, description="Page size", ge=1, le=100),
|
||
user_id: Optional[str] = Query(None, description="Filter keys by user ID"),
|
||
team_id: Optional[str] = Query(None, description="Filter keys by team ID"),
|
||
organization_id: Optional[str] = Query(
|
||
None, description="Filter keys by organization ID"
|
||
),
|
||
key_hash: Optional[str] = Query(None, description="Filter keys by key hash"),
|
||
key_alias: Optional[str] = Query(None, description="Filter keys by key alias"),
|
||
return_full_object: bool = Query(False, description="Return full key object"),
|
||
include_team_keys: bool = Query(
|
||
False, description="Include all keys for teams that user is an admin of."
|
||
),
|
||
include_created_by_keys: bool = Query(
|
||
False, description="Include keys created by the user"
|
||
),
|
||
sort_by: Optional[str] = Query(
|
||
default=None,
|
||
description="Column to sort by (e.g. 'user_id', 'created_at', 'spend')",
|
||
),
|
||
sort_order: str = Query(default="desc", description="Sort order ('asc' or 'desc')"),
|
||
expand: Optional[List[str]] = Query(
|
||
None, description="Expand related objects (e.g. 'user')"
|
||
),
|
||
status: Optional[str] = Query(
|
||
None, description="Filter by status (e.g. 'deleted')"
|
||
),
|
||
project_id: Optional[str] = Query(None, description="Filter keys by project ID"),
|
||
access_group_id: Optional[str] = Query(
|
||
None, description="Filter keys by access group ID"
|
||
),
|
||
) -> KeyListResponseObject:
|
||
"""
|
||
List all keys for a given user / team / organization.
|
||
|
||
Parameters:
|
||
expand: Optional[List[str]] - Expand related objects (e.g. 'user' to include user information)
|
||
status: Optional[str] - Filter by status. Currently supports "deleted" to query deleted keys.
|
||
|
||
Returns:
|
||
{
|
||
"keys": List[str] or List[UserAPIKeyAuth],
|
||
"total_count": int,
|
||
"current_page": int,
|
||
"total_pages": int,
|
||
}
|
||
|
||
When expand includes "user", each key object will include a "user" field with the associated user object.
|
||
Note: When expand=user is specified, full key objects are returned regardless of the return_full_object parameter.
|
||
"""
|
||
try:
|
||
from litellm.proxy.proxy_server import prisma_client
|
||
|
||
verbose_proxy_logger.debug("Entering list_keys function")
|
||
|
||
if prisma_client is None:
|
||
verbose_proxy_logger.error("Database not connected")
|
||
raise Exception("Database not connected")
|
||
|
||
# Validate status parameter
|
||
if status is not None and status != "deleted":
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": "Invalid status value. Currently only 'deleted' is supported."
|
||
},
|
||
)
|
||
|
||
complete_user_info = await validate_key_list_check(
|
||
user_api_key_dict=user_api_key_dict,
|
||
user_id=user_id,
|
||
team_id=team_id,
|
||
organization_id=organization_id,
|
||
key_alias=key_alias,
|
||
key_hash=key_hash,
|
||
prisma_client=prisma_client,
|
||
)
|
||
|
||
# Fetch team objects once when needed for either admin or member filtering.
|
||
# This avoids duplicate DB queries for the same team data.
|
||
if include_team_keys or include_created_by_keys:
|
||
team_objects = await _fetch_user_team_objects(
|
||
complete_user_info=complete_user_info,
|
||
prisma_client=prisma_client,
|
||
)
|
||
member_team_ids = _get_member_team_ids_from_objects(
|
||
user_api_key_dict=user_api_key_dict,
|
||
team_objects=team_objects,
|
||
)
|
||
else:
|
||
team_objects = []
|
||
member_team_ids = None
|
||
|
||
if include_team_keys:
|
||
admin_team_ids = _get_admin_team_ids_from_objects(
|
||
user_api_key_dict=user_api_key_dict,
|
||
team_objects=team_objects,
|
||
)
|
||
else:
|
||
admin_team_ids = None
|
||
|
||
if not user_id and user_api_key_dict.user_role not in [
|
||
LitellmUserRoles.PROXY_ADMIN.value,
|
||
LitellmUserRoles.PROXY_ADMIN_VIEW_ONLY.value,
|
||
]:
|
||
user_id = user_api_key_dict.user_id
|
||
|
||
response = await _list_key_helper(
|
||
prisma_client=prisma_client,
|
||
page=page,
|
||
size=size,
|
||
user_id=user_id,
|
||
team_id=team_id,
|
||
key_alias=key_alias,
|
||
key_hash=key_hash,
|
||
return_full_object=return_full_object,
|
||
organization_id=organization_id,
|
||
admin_team_ids=admin_team_ids,
|
||
member_team_ids=member_team_ids,
|
||
include_created_by_keys=include_created_by_keys,
|
||
sort_by=sort_by,
|
||
sort_order=sort_order,
|
||
expand=expand,
|
||
status=status,
|
||
project_id=project_id,
|
||
access_group_id=access_group_id,
|
||
)
|
||
|
||
verbose_proxy_logger.debug("Successfully prepared response")
|
||
|
||
return response
|
||
|
||
except Exception as e:
|
||
verbose_proxy_logger.exception(f"Error in list_keys: {e}")
|
||
if isinstance(e, HTTPException):
|
||
raise ProxyException(
|
||
message=getattr(e, "detail", f"error({str(e)})"),
|
||
type=ProxyErrorTypes.internal_server_error,
|
||
param=getattr(e, "param", "None"),
|
||
code=getattr(
|
||
e, "status_code", fastapi.status.HTTP_500_INTERNAL_SERVER_ERROR
|
||
),
|
||
)
|
||
elif isinstance(e, ProxyException):
|
||
raise e
|
||
raise ProxyException(
|
||
message="Authentication Error, " + str(e),
|
||
type=ProxyErrorTypes.internal_server_error,
|
||
param=getattr(e, "param", "None"),
|
||
code=fastapi.status.HTTP_500_INTERNAL_SERVER_ERROR,
|
||
)
|
||
|
||
|
||
@router.get(
|
||
"/key/aliases",
|
||
tags=["key management"],
|
||
dependencies=[Depends(user_api_key_auth)],
|
||
)
|
||
@management_endpoint_wrapper
|
||
async def key_aliases(
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
page: int = Query(1, ge=1, description="Page number"),
|
||
size: int = Query(50, ge=1, le=100, description="Page size"),
|
||
search: Optional[str] = Query(
|
||
None, description="Search key aliases (case-insensitive partial match)"
|
||
),
|
||
) -> Dict[str, Any]:
|
||
"""
|
||
Lists key aliases with pagination and optional search.
|
||
|
||
Non-admin users only see aliases for keys they own or keys belonging to
|
||
their teams.
|
||
|
||
Returns:
|
||
{
|
||
"aliases": List[str],
|
||
"total_count": int,
|
||
"current_page": int,
|
||
"total_pages": int,
|
||
"size": int,
|
||
}
|
||
"""
|
||
try:
|
||
from litellm.proxy.proxy_server import prisma_client
|
||
|
||
verbose_proxy_logger.debug("Entering key_aliases function")
|
||
|
||
if prisma_client is None:
|
||
verbose_proxy_logger.error("Database not connected")
|
||
raise Exception("Database not connected")
|
||
|
||
# Build a parameterized WHERE clause to avoid loading full rows into
|
||
# memory. Raw SQL is used because the Prisma client wrapper does not
|
||
# support column-level SELECT projection on find_many.
|
||
#
|
||
# $1 is always UI_SESSION_TOKEN_TEAM_ID (filters out UI session tokens).
|
||
query_params: List[Any] = [UI_SESSION_TOKEN_TEAM_ID]
|
||
where_parts = [
|
||
"key_alias IS NOT NULL",
|
||
"key_alias != ''",
|
||
"(team_id IS NULL OR team_id != $1)",
|
||
]
|
||
|
||
# Scope results for non-admin users: only show aliases for keys the
|
||
# user owns or keys belonging to teams they are a member of.
|
||
is_proxy_admin = user_api_key_dict.user_role in [
|
||
LitellmUserRoles.PROXY_ADMIN.value,
|
||
LitellmUserRoles.PROXY_ADMIN_VIEW_ONLY.value,
|
||
]
|
||
if not is_proxy_admin:
|
||
scope_conditions: List[str] = []
|
||
if user_api_key_dict.user_id:
|
||
query_params.append(user_api_key_dict.user_id)
|
||
scope_conditions.append(f"user_id = ${len(query_params)}")
|
||
|
||
# Look up the user's teams from the user table
|
||
user_teams: List[str] = []
|
||
if user_api_key_dict.user_id:
|
||
user_row = await prisma_client.db.litellm_usertable.find_unique(
|
||
where={"user_id": user_api_key_dict.user_id}
|
||
)
|
||
if user_row is not None:
|
||
user_teams = getattr(user_row, "teams", []) or []
|
||
|
||
if user_teams:
|
||
team_placeholders = ", ".join(
|
||
f"${len(query_params) + i + 1}" for i in range(len(user_teams))
|
||
)
|
||
query_params.extend(user_teams)
|
||
scope_conditions.append(f"team_id IN ({team_placeholders})")
|
||
|
||
if scope_conditions:
|
||
where_parts.append(f"({' OR '.join(scope_conditions)})")
|
||
else:
|
||
# No user_id and no teams — return nothing
|
||
where_parts.append("FALSE")
|
||
|
||
if search:
|
||
query_params.append(f"%{search}%")
|
||
where_parts.append(f"key_alias ILIKE ${len(query_params)}")
|
||
|
||
where_sql = " AND ".join(where_parts)
|
||
|
||
count_sql = f'SELECT COUNT(*) AS count FROM "LiteLLM_VerificationToken" WHERE {where_sql}'
|
||
count_rows = await prisma_client.db.query_raw(count_sql, *query_params)
|
||
total_count = int(count_rows[0]["count"]) if count_rows else 0
|
||
|
||
aliases_params = query_params + [size, (page - 1) * size]
|
||
limit_idx = len(aliases_params) - 1
|
||
offset_idx = len(aliases_params)
|
||
aliases_sql = (
|
||
f"SELECT key_alias"
|
||
f' FROM "LiteLLM_VerificationToken"'
|
||
f" WHERE {where_sql}"
|
||
f" ORDER BY key_alias ASC"
|
||
f" LIMIT ${limit_idx} OFFSET ${offset_idx}"
|
||
)
|
||
alias_rows = await prisma_client.db.query_raw(aliases_sql, *aliases_params)
|
||
aliases: List[str] = [
|
||
row["key_alias"] for row in alias_rows if row.get("key_alias")
|
||
]
|
||
|
||
total_pages = -(-total_count // size) if total_count > 0 else 0
|
||
verbose_proxy_logger.debug(
|
||
f"key_aliases: page={page}, size={size}, search={search!r}, "
|
||
f"total_count={total_count}, total_pages={total_pages}"
|
||
)
|
||
|
||
return {
|
||
"aliases": aliases,
|
||
"total_count": total_count,
|
||
"current_page": page,
|
||
"total_pages": total_pages,
|
||
"size": size,
|
||
}
|
||
|
||
except Exception as e:
|
||
verbose_proxy_logger.exception(f"Error in key_aliases: {e}")
|
||
if isinstance(e, HTTPException):
|
||
raise ProxyException(
|
||
message=getattr(e, "detail", f"error({str(e)})"),
|
||
type=ProxyErrorTypes.internal_server_error,
|
||
param=getattr(e, "param", "None"),
|
||
code=getattr(e, "status_code", status.HTTP_500_INTERNAL_SERVER_ERROR),
|
||
)
|
||
elif isinstance(e, ProxyException):
|
||
raise e
|
||
raise ProxyException(
|
||
message="Authentication Error, " + str(e),
|
||
type=ProxyErrorTypes.internal_server_error,
|
||
param=getattr(e, "param", "None"),
|
||
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
||
)
|
||
|
||
|
||
def _validate_sort_params(
|
||
sort_by: Optional[str], sort_order: str
|
||
) -> Optional[Dict[str, str]]:
|
||
order_by: Dict[str, str] = {}
|
||
|
||
if sort_by is None:
|
||
return None
|
||
# Validate sort_by is a valid column
|
||
valid_columns = [
|
||
"spend",
|
||
"max_budget",
|
||
"created_at",
|
||
"updated_at",
|
||
"token",
|
||
"key_alias",
|
||
]
|
||
if sort_by not in valid_columns:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={
|
||
"error": f"Invalid sort column. Must be one of: {', '.join(valid_columns)}"
|
||
},
|
||
)
|
||
|
||
# Validate sort_order
|
||
if sort_order.lower() not in ["asc", "desc"]:
|
||
raise HTTPException(
|
||
status_code=400,
|
||
detail={"error": "Invalid sort order. Must be 'asc' or 'desc'"},
|
||
)
|
||
|
||
order_by[sort_by] = sort_order.lower()
|
||
|
||
return order_by
|
||
|
||
|
||
def _build_key_filter_conditions(
|
||
user_id: Optional[str],
|
||
team_id: Optional[str],
|
||
organization_id: Optional[str],
|
||
key_alias: Optional[str],
|
||
key_hash: Optional[str],
|
||
exclude_team_id: Optional[str],
|
||
admin_team_ids: Optional[List[str]],
|
||
member_team_ids: Optional[List[str]] = None,
|
||
include_created_by_keys: bool = False,
|
||
project_id: Optional[str] = None,
|
||
access_group_id: Optional[str] = None,
|
||
) -> Dict[str, Union[str, Dict[str, Any], List[Dict[str, Any]]]]:
|
||
"""Build filter conditions for key listing.
|
||
|
||
Visibility rules:
|
||
- Users always see their own keys (user_id match)
|
||
- Team admins see ALL keys for their admin teams (via admin_team_ids)
|
||
- Regular team members see only service accounts (user_id=NULL) for their
|
||
teams (via member_team_ids). This prevents leaking other members' spend data.
|
||
- created_by visibility is scoped to teams the user currently belongs to,
|
||
so former members cannot see service accounts they created after leaving.
|
||
"""
|
||
# Prepare filter conditions
|
||
where: Dict[str, Union[str, Dict[str, Any], List[Dict[str, Any]]]] = {}
|
||
where.update(_get_condition_to_filter_out_ui_session_tokens())
|
||
|
||
# Build the OR conditions for user's keys and admin team keys
|
||
or_conditions: List[Dict[str, Any]] = []
|
||
|
||
# Base conditions for user's own keys
|
||
user_condition: Dict[str, Any] = {}
|
||
if user_id and isinstance(user_id, str):
|
||
user_condition["user_id"] = user_id
|
||
if key_alias and isinstance(key_alias, str):
|
||
user_condition["key_alias"] = key_alias
|
||
if exclude_team_id and isinstance(exclude_team_id, str):
|
||
user_condition["team_id"] = {"not": exclude_team_id}
|
||
if organization_id and isinstance(organization_id, str):
|
||
user_condition["organization_id"] = organization_id
|
||
if key_hash and isinstance(key_hash, str):
|
||
user_condition["token"] = key_hash
|
||
|
||
if user_condition:
|
||
or_conditions.append(user_condition)
|
||
|
||
# Add condition for created_by keys, scoped to user's current teams
|
||
if include_created_by_keys and user_id:
|
||
if member_team_ids is not None:
|
||
if member_team_ids:
|
||
# Scope created_by keys to teams user is still a member of,
|
||
# or keys that have no team (personal keys)
|
||
or_conditions.append(
|
||
{
|
||
"AND": [
|
||
{"created_by": user_id},
|
||
{
|
||
"OR": [
|
||
{"team_id": {"in": member_team_ids}},
|
||
{"team_id": None},
|
||
]
|
||
},
|
||
]
|
||
}
|
||
)
|
||
else:
|
||
# User is not a member of any team, only show non-team created_by keys
|
||
or_conditions.append(
|
||
{"AND": [{"created_by": user_id}, {"team_id": None}]}
|
||
)
|
||
else:
|
||
# No team membership info provided (backward compatibility for
|
||
# direct _list_key_helper callers like Prometheus)
|
||
or_conditions.append({"created_by": user_id})
|
||
|
||
# Add condition for admin team keys (admins see ALL team keys)
|
||
if admin_team_ids:
|
||
or_conditions.append({"team_id": {"in": admin_team_ids}})
|
||
|
||
# Add condition for member team service accounts (members only see keys with user_id=NULL)
|
||
if member_team_ids:
|
||
# Exclude teams where user is already admin (those are covered above with full visibility)
|
||
member_only_team_ids = [
|
||
tid for tid in member_team_ids if tid not in (admin_team_ids or [])
|
||
]
|
||
if member_only_team_ids:
|
||
or_conditions.append(
|
||
{
|
||
"AND": [
|
||
{"team_id": {"in": member_only_team_ids}},
|
||
{"user_id": None},
|
||
]
|
||
}
|
||
)
|
||
|
||
# Combine conditions with OR if we have multiple conditions
|
||
if len(or_conditions) > 1:
|
||
where = {"AND": [where, {"OR": or_conditions}]}
|
||
elif len(or_conditions) == 1:
|
||
where.update(or_conditions[0])
|
||
|
||
# Apply team_id, project_id and access_group_id as global AND filters so they
|
||
# narrow results across all visibility conditions (own keys, team keys, etc.)
|
||
if team_id and isinstance(team_id, str):
|
||
where = {"AND": [where, {"team_id": team_id}]}
|
||
if project_id:
|
||
where = {"AND": [where, {"project_id": project_id}]}
|
||
if access_group_id:
|
||
where = {"AND": [where, {"access_group_ids": {"hasSome": [access_group_id]}}]}
|
||
|
||
verbose_proxy_logger.debug(f"Filter conditions: {where}")
|
||
return where
|
||
|
||
|
||
async def _list_key_helper(
|
||
prisma_client: PrismaClient,
|
||
page: int,
|
||
size: int,
|
||
user_id: Optional[str],
|
||
team_id: Optional[str],
|
||
organization_id: Optional[str],
|
||
key_alias: Optional[str],
|
||
key_hash: Optional[str],
|
||
exclude_team_id: Optional[str] = None,
|
||
return_full_object: bool = False,
|
||
admin_team_ids: Optional[
|
||
List[str]
|
||
] = None, # New parameter for teams where user is admin
|
||
member_team_ids: Optional[
|
||
List[str]
|
||
] = None, # Team IDs where user is a member (any role) - for service account visibility
|
||
include_created_by_keys: bool = False,
|
||
sort_by: Optional[str] = None,
|
||
sort_order: str = "desc",
|
||
expand: Optional[List[str]] = None,
|
||
status: Optional[str] = None,
|
||
project_id: Optional[str] = None,
|
||
access_group_id: Optional[str] = None,
|
||
) -> KeyListResponseObject:
|
||
"""
|
||
Helper function to list keys
|
||
Args:
|
||
page: int
|
||
size: int
|
||
user_id: Optional[str]
|
||
team_id: Optional[str]
|
||
key_alias: Optional[str]
|
||
exclude_team_id: Optional[str] # exclude a specific team_id
|
||
return_full_object: bool # when true, will return UserAPIKeyAuth objects instead of just the token
|
||
admin_team_ids: Optional[List[str]] # list of team IDs where the user is an admin
|
||
member_team_ids: Optional[List[str]] # list of team IDs where user is a member (for service account visibility)
|
||
|
||
Returns:
|
||
KeyListResponseObject
|
||
{
|
||
"keys": List[str] or List[UserAPIKeyAuth], # Updated to reflect possible return types
|
||
"total_count": int,
|
||
"current_page": int,
|
||
"total_pages": int,
|
||
}
|
||
"""
|
||
where = _build_key_filter_conditions(
|
||
user_id=user_id,
|
||
team_id=team_id,
|
||
organization_id=organization_id,
|
||
key_alias=key_alias,
|
||
key_hash=key_hash,
|
||
exclude_team_id=exclude_team_id,
|
||
admin_team_ids=admin_team_ids,
|
||
member_team_ids=member_team_ids,
|
||
include_created_by_keys=include_created_by_keys,
|
||
project_id=project_id,
|
||
access_group_id=access_group_id,
|
||
)
|
||
|
||
# Calculate skip for pagination
|
||
skip = (page - 1) * size
|
||
|
||
verbose_proxy_logger.debug(f"Pagination: skip={skip}, take={size}")
|
||
|
||
order_by: Optional[Dict[str, str]] = (
|
||
_validate_sort_params(sort_by, sort_order)
|
||
if sort_by is not None and isinstance(sort_by, str)
|
||
else None
|
||
)
|
||
|
||
# Determine which table to query based on status
|
||
use_deleted_table = status == "deleted"
|
||
|
||
# Fetch keys with pagination
|
||
if use_deleted_table:
|
||
keys = await prisma_client.db.litellm_deletedverificationtoken.find_many(
|
||
where=where, # type: ignore
|
||
skip=skip, # type: ignore
|
||
take=size, # type: ignore
|
||
order=(
|
||
order_by
|
||
if order_by
|
||
else [
|
||
{"created_at": "desc"},
|
||
{"token": "desc"}, # fallback sort
|
||
]
|
||
),
|
||
)
|
||
else:
|
||
keys = await prisma_client.db.litellm_verificationtoken.find_many(
|
||
where=where, # type: ignore
|
||
skip=skip, # type: ignore
|
||
take=size, # type: ignore
|
||
order=(
|
||
order_by
|
||
if order_by
|
||
else [
|
||
{"created_at": "desc"},
|
||
{"token": "desc"}, # fallback sort
|
||
]
|
||
),
|
||
include={"object_permission": True},
|
||
)
|
||
|
||
verbose_proxy_logger.debug(f"Fetched {len(keys)} keys")
|
||
|
||
# Get total count of keys
|
||
if use_deleted_table:
|
||
total_count = await prisma_client.db.litellm_deletedverificationtoken.count(
|
||
where=where # type: ignore
|
||
)
|
||
else:
|
||
total_count = await prisma_client.db.litellm_verificationtoken.count(
|
||
where=where # type: ignore
|
||
)
|
||
|
||
verbose_proxy_logger.debug(f"Total count of keys: {total_count}")
|
||
|
||
# Calculate total pages
|
||
total_pages = -(-total_count // size) # Ceiling division
|
||
|
||
# Fetch user information if expand includes "user"
|
||
user_map = {}
|
||
if expand and "user" in expand:
|
||
user_ids = [key.user_id for key in keys if key.user_id]
|
||
created_by_ids = [key.created_by for key in keys if key.created_by]
|
||
all_ids = list(set(user_ids + created_by_ids)) # Remove duplicates
|
||
if all_ids:
|
||
users = await prisma_client.db.litellm_usertable.find_many(
|
||
where={"user_id": {"in": all_ids}}
|
||
)
|
||
user_map = {user.user_id: user for user in users}
|
||
|
||
# Prepare response
|
||
key_list: List[Union[str, UserAPIKeyAuth, LiteLLM_DeletedVerificationToken]] = []
|
||
for key in keys:
|
||
# Convert Prisma model to dict (supports both Pydantic v1 and v2)
|
||
try:
|
||
key_dict = key.model_dump()
|
||
except Exception:
|
||
# Fallback for Pydantic v1 compatibility
|
||
key_dict = key.dict()
|
||
# Attach object_permission if object_permission_id is set (only for non-deleted keys)
|
||
if not use_deleted_table:
|
||
key_dict = await attach_object_permission_to_dict(key_dict, prisma_client)
|
||
|
||
# Include user information if expand includes "user"
|
||
if expand and "user" in expand:
|
||
if key.user_id and key.user_id in user_map:
|
||
try:
|
||
key_dict["user"] = user_map[key.user_id].model_dump()
|
||
except Exception:
|
||
key_dict["user"] = user_map[key.user_id].dict()
|
||
if key.created_by and key.created_by in user_map:
|
||
created_by_user = user_map[key.created_by]
|
||
key_dict["created_by_user"] = {
|
||
"user_id": created_by_user.user_id,
|
||
"user_email": created_by_user.user_email,
|
||
"user_alias": created_by_user.user_alias,
|
||
}
|
||
|
||
if return_full_object is True or (expand and "user" in expand):
|
||
if use_deleted_table:
|
||
# Use deleted key type to preserve deleted_at, deleted_by, etc.
|
||
key_list.append(LiteLLM_DeletedVerificationToken(**key_dict))
|
||
else:
|
||
key_list.append(UserAPIKeyAuth(**key_dict)) # Return full key object
|
||
else:
|
||
_token = key_dict.get("token")
|
||
key_list.append(cast(str, _token)) # Return only the token
|
||
|
||
return KeyListResponseObject(
|
||
keys=key_list,
|
||
total_count=total_count,
|
||
current_page=page,
|
||
total_pages=total_pages,
|
||
)
|
||
|
||
|
||
def _get_condition_to_filter_out_ui_session_tokens() -> Dict[str, Any]:
|
||
"""
|
||
Condition to filter out UI session tokens
|
||
"""
|
||
return {
|
||
"OR": [
|
||
{"team_id": None}, # Include records where team_id is null
|
||
{
|
||
"team_id": {"not": UI_SESSION_TOKEN_TEAM_ID}
|
||
}, # Include records where team_id != UI_SESSION_TOKEN_TEAM_ID
|
||
]
|
||
}
|
||
|
||
|
||
async def _check_key_admin_access(
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
hashed_token: str,
|
||
prisma_client: Any,
|
||
user_api_key_cache: DualCache,
|
||
route: str,
|
||
) -> None:
|
||
"""
|
||
Check that the caller has admin privileges for the target key.
|
||
|
||
Allowed callers:
|
||
- Proxy admin
|
||
- Team admin for the key's team
|
||
- Org admin for the key's team's organization
|
||
|
||
Raises HTTPException(403) if the caller is not authorized.
|
||
"""
|
||
|
||
if user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value:
|
||
return
|
||
|
||
# Look up the target key to find its team
|
||
target_key_row = await prisma_client.db.litellm_verificationtoken.find_unique(
|
||
where={"token": hashed_token}
|
||
)
|
||
if target_key_row is None:
|
||
raise HTTPException(
|
||
status_code=404,
|
||
detail={"error": f"Key not found: {hashed_token}"},
|
||
)
|
||
|
||
# If the key belongs to a team, check team admin / org admin
|
||
if target_key_row.team_id:
|
||
team_obj = await get_team_object(
|
||
team_id=target_key_row.team_id,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
check_db_only=True,
|
||
)
|
||
if team_obj is not None:
|
||
if _is_user_team_admin(
|
||
user_api_key_dict=user_api_key_dict, team_obj=team_obj
|
||
):
|
||
return
|
||
if await _is_user_org_admin_for_team(
|
||
user_api_key_dict=user_api_key_dict, team_obj=team_obj
|
||
):
|
||
return
|
||
|
||
raise HTTPException(
|
||
status_code=403,
|
||
detail={
|
||
"error": f"Only proxy admins, team admins, or org admins can call {route}. "
|
||
f"user_role={user_api_key_dict.user_role}, user_id={user_api_key_dict.user_id}"
|
||
},
|
||
)
|
||
|
||
|
||
@router.post(
|
||
"/key/block", tags=["key management"], dependencies=[Depends(user_api_key_auth)]
|
||
)
|
||
@management_endpoint_wrapper
|
||
async def block_key(
|
||
data: BlockKeyRequest,
|
||
http_request: Request,
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
litellm_changed_by: Optional[str] = Header(
|
||
None,
|
||
description="The litellm-changed-by header enables tracking of actions performed by authorized users on behalf of other users, providing an audit trail for accountability",
|
||
),
|
||
) -> Optional[LiteLLM_VerificationToken]:
|
||
"""
|
||
Block an Virtual key from making any requests.
|
||
|
||
Parameters:
|
||
- key: str - The key to block. Can be either the unhashed key (sk-...) or the hashed key value
|
||
|
||
Example:
|
||
```bash
|
||
curl --location 'http://0.0.0.0:4000/key/block' \
|
||
--header 'Authorization: Bearer sk-1234' \
|
||
--header 'Content-Type: application/json' \
|
||
--data '{
|
||
"key": "sk-Fn8Ej39NxjAXrvpUGKghGw"
|
||
}'
|
||
```
|
||
|
||
Note: This is an admin-only endpoint. Only proxy admins, team admins, or org admins can block keys.
|
||
"""
|
||
from litellm.proxy.proxy_server import (
|
||
create_audit_log_for_update,
|
||
hash_token,
|
||
litellm_proxy_admin_name,
|
||
prisma_client,
|
||
proxy_logging_obj,
|
||
user_api_key_cache,
|
||
)
|
||
|
||
if prisma_client is None:
|
||
raise Exception("{}".format(CommonProxyErrors.db_not_connected_error.value))
|
||
|
||
if not is_valid_api_key(data.key):
|
||
raise ProxyException(
|
||
message="Invalid key format.",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="key",
|
||
code=status.HTTP_400_BAD_REQUEST,
|
||
)
|
||
if data.key.startswith("sk-"):
|
||
hashed_token = hash_token(token=data.key)
|
||
else:
|
||
hashed_token = data.key
|
||
|
||
# Admin-only: only proxy admins, team admins, or org admins can block keys
|
||
await _check_key_admin_access(
|
||
user_api_key_dict=user_api_key_dict,
|
||
hashed_token=hashed_token,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
route="/key/block",
|
||
)
|
||
|
||
if litellm.store_audit_logs is True:
|
||
# make an audit log for key update
|
||
record = await prisma_client.db.litellm_verificationtoken.find_unique(
|
||
where={"token": hashed_token}
|
||
)
|
||
if record is None:
|
||
raise ProxyException(
|
||
message=f"Key {data.key} not found",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="key",
|
||
code=status.HTTP_404_NOT_FOUND,
|
||
)
|
||
asyncio.create_task(
|
||
create_audit_log_for_update(
|
||
request_data=LiteLLM_AuditLogs(
|
||
id=str(uuid.uuid4()),
|
||
updated_at=datetime.now(timezone.utc),
|
||
changed_by=litellm_changed_by
|
||
or user_api_key_dict.user_id
|
||
or litellm_proxy_admin_name,
|
||
changed_by_api_key=user_api_key_dict.api_key,
|
||
table_name=LitellmTableNames.KEY_TABLE_NAME,
|
||
object_id=hashed_token,
|
||
action="blocked",
|
||
updated_values="{}",
|
||
before_value=record.model_dump_json(),
|
||
)
|
||
)
|
||
)
|
||
|
||
record = await prisma_client.db.litellm_verificationtoken.update(
|
||
where={"token": hashed_token}, data={"blocked": True} # type: ignore
|
||
)
|
||
|
||
## UPDATE KEY CACHE
|
||
|
||
### get cached object ###
|
||
key_object = await get_key_object(
|
||
hashed_token=hashed_token,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
parent_otel_span=None,
|
||
proxy_logging_obj=proxy_logging_obj,
|
||
)
|
||
|
||
### update cached object ###
|
||
key_object.blocked = True
|
||
|
||
### store cached object ###
|
||
await _cache_key_object(
|
||
hashed_token=hashed_token,
|
||
user_api_key_obj=key_object,
|
||
user_api_key_cache=user_api_key_cache,
|
||
proxy_logging_obj=proxy_logging_obj,
|
||
)
|
||
|
||
return record
|
||
|
||
|
||
@router.post(
|
||
"/key/unblock", tags=["key management"], dependencies=[Depends(user_api_key_auth)]
|
||
)
|
||
@management_endpoint_wrapper
|
||
async def unblock_key(
|
||
data: BlockKeyRequest,
|
||
http_request: Request,
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
litellm_changed_by: Optional[str] = Header(
|
||
None,
|
||
description="The litellm-changed-by header enables tracking of actions performed by authorized users on behalf of other users, providing an audit trail for accountability",
|
||
),
|
||
):
|
||
"""
|
||
Unblock a Virtual key to allow it to make requests again.
|
||
|
||
Parameters:
|
||
- key: str - The key to unblock. Can be either the unhashed key (sk-...) or the hashed key value
|
||
|
||
Example:
|
||
```bash
|
||
curl --location 'http://0.0.0.0:4000/key/unblock' \
|
||
--header 'Authorization: Bearer sk-1234' \
|
||
--header 'Content-Type: application/json' \
|
||
--data '{
|
||
"key": "sk-Fn8Ej39NxjAXrvpUGKghGw"
|
||
}'
|
||
```
|
||
|
||
Note: This is an admin-only endpoint. Only proxy admins, team admins, or org admins can unblock keys.
|
||
"""
|
||
from litellm.proxy.proxy_server import (
|
||
create_audit_log_for_update,
|
||
hash_token,
|
||
litellm_proxy_admin_name,
|
||
prisma_client,
|
||
proxy_logging_obj,
|
||
user_api_key_cache,
|
||
)
|
||
|
||
if prisma_client is None:
|
||
raise Exception("{}".format(CommonProxyErrors.db_not_connected_error.value))
|
||
|
||
if not is_valid_api_key(data.key):
|
||
raise ProxyException(
|
||
message="Invalid key format.",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="key",
|
||
code=status.HTTP_400_BAD_REQUEST,
|
||
)
|
||
if data.key.startswith("sk-"):
|
||
hashed_token = hash_token(token=data.key)
|
||
else:
|
||
hashed_token = data.key
|
||
|
||
# Admin-only: only proxy admins, team admins, or org admins can unblock keys
|
||
await _check_key_admin_access(
|
||
user_api_key_dict=user_api_key_dict,
|
||
hashed_token=hashed_token,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
route="/key/unblock",
|
||
)
|
||
|
||
if litellm.store_audit_logs is True:
|
||
# make an audit log for key update
|
||
record = await prisma_client.db.litellm_verificationtoken.find_unique(
|
||
where={"token": hashed_token}
|
||
)
|
||
if record is None:
|
||
raise ProxyException(
|
||
message=f"Key {data.key} not found",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="key",
|
||
code=status.HTTP_404_NOT_FOUND,
|
||
)
|
||
asyncio.create_task(
|
||
create_audit_log_for_update(
|
||
request_data=LiteLLM_AuditLogs(
|
||
id=str(uuid.uuid4()),
|
||
updated_at=datetime.now(timezone.utc),
|
||
changed_by=litellm_changed_by
|
||
or user_api_key_dict.user_id
|
||
or litellm_proxy_admin_name,
|
||
changed_by_api_key=user_api_key_dict.api_key,
|
||
table_name=LitellmTableNames.KEY_TABLE_NAME,
|
||
object_id=hashed_token,
|
||
action="blocked",
|
||
updated_values="{}",
|
||
before_value=record.model_dump_json(),
|
||
)
|
||
)
|
||
)
|
||
|
||
record = await prisma_client.db.litellm_verificationtoken.update(
|
||
where={"token": hashed_token}, data={"blocked": False} # type: ignore
|
||
)
|
||
|
||
## UPDATE KEY CACHE
|
||
|
||
### get cached object ###
|
||
key_object = await get_key_object(
|
||
hashed_token=hashed_token,
|
||
prisma_client=prisma_client,
|
||
user_api_key_cache=user_api_key_cache,
|
||
parent_otel_span=None,
|
||
proxy_logging_obj=proxy_logging_obj,
|
||
)
|
||
|
||
### update cached object ###
|
||
key_object.blocked = False
|
||
|
||
### store cached object ###
|
||
await _cache_key_object(
|
||
hashed_token=hashed_token,
|
||
user_api_key_obj=key_object,
|
||
user_api_key_cache=user_api_key_cache,
|
||
proxy_logging_obj=proxy_logging_obj,
|
||
)
|
||
|
||
return record
|
||
|
||
|
||
@router.post(
|
||
"/key/health",
|
||
tags=["key management"],
|
||
dependencies=[Depends(user_api_key_auth)],
|
||
response_model=KeyHealthResponse,
|
||
)
|
||
@management_endpoint_wrapper
|
||
async def key_health(
|
||
request: Request,
|
||
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
|
||
):
|
||
"""
|
||
Check the health of the key
|
||
|
||
Checks:
|
||
- If key based logging is configured correctly - sends a test log
|
||
|
||
Usage
|
||
|
||
Pass the key in the request header
|
||
|
||
```bash
|
||
curl -X POST "http://localhost:4000/key/health" \
|
||
-H "Authorization: Bearer sk-1234" \
|
||
-H "Content-Type: application/json"
|
||
```
|
||
|
||
Response when logging callbacks are setup correctly:
|
||
|
||
```json
|
||
{
|
||
"key": "healthy",
|
||
"logging_callbacks": {
|
||
"callbacks": [
|
||
"gcs_bucket"
|
||
],
|
||
"status": "healthy",
|
||
"details": "No logger exceptions triggered, system is healthy. Manually check if logs were sent to ['gcs_bucket']"
|
||
}
|
||
}
|
||
```
|
||
|
||
|
||
Response when logging callbacks are not setup correctly:
|
||
```json
|
||
{
|
||
"key": "unhealthy",
|
||
"logging_callbacks": {
|
||
"callbacks": [
|
||
"gcs_bucket"
|
||
],
|
||
"status": "unhealthy",
|
||
"details": "Logger exceptions triggered, system is unhealthy: Failed to load vertex credentials. Check to see if credentials containing partial/invalid information."
|
||
}
|
||
}
|
||
```
|
||
"""
|
||
try:
|
||
# Get the key's metadata
|
||
key_metadata = user_api_key_dict.metadata
|
||
|
||
health_status: KeyHealthResponse = KeyHealthResponse(
|
||
key="healthy",
|
||
logging_callbacks=None,
|
||
)
|
||
|
||
# Check if logging is configured in metadata
|
||
if key_metadata and "logging" in key_metadata:
|
||
logging_statuses = await test_key_logging(
|
||
user_api_key_dict=user_api_key_dict,
|
||
request=request,
|
||
key_logging=key_metadata["logging"],
|
||
)
|
||
health_status["logging_callbacks"] = logging_statuses
|
||
|
||
# Check if any logging callback is unhealthy
|
||
if logging_statuses.get("status") == "unhealthy":
|
||
health_status["key"] = "unhealthy"
|
||
|
||
return KeyHealthResponse(**health_status)
|
||
|
||
except Exception as e:
|
||
raise ProxyException(
|
||
message=f"Key health check failed: {str(e)}",
|
||
type=ProxyErrorTypes.internal_server_error,
|
||
param=getattr(e, "param", "None"),
|
||
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
||
)
|
||
|
||
|
||
async def _can_user_query_key_info(
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
key: Optional[str],
|
||
key_info: LiteLLM_VerificationToken,
|
||
) -> bool:
|
||
"""
|
||
Helper to check if the user has access to the key's info
|
||
"""
|
||
if (
|
||
user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
|
||
or user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN_VIEW_ONLY.value
|
||
):
|
||
return True
|
||
elif user_api_key_dict.api_key == key:
|
||
return True
|
||
# user can query their own key info
|
||
elif key_info.user_id == user_api_key_dict.user_id:
|
||
return True
|
||
elif await TeamMemberPermissionChecks.user_belongs_to_keys_team(
|
||
user_api_key_dict=user_api_key_dict,
|
||
existing_key_row=key_info,
|
||
):
|
||
return True
|
||
return False
|
||
|
||
|
||
async def test_key_logging(
|
||
user_api_key_dict: UserAPIKeyAuth,
|
||
request: Request,
|
||
key_logging: List[Dict[str, Any]],
|
||
) -> LoggingCallbackStatus:
|
||
"""
|
||
Test the key-based logging
|
||
|
||
- Test that key logging is correctly formatted and all args are passed correctly
|
||
- Make a mock completion call -> user can check if it's correctly logged
|
||
- Check if any logger.exceptions were triggered -> if they were then returns it to the user client side
|
||
"""
|
||
import logging
|
||
from io import StringIO
|
||
|
||
from litellm.proxy.litellm_pre_call_utils import add_litellm_data_to_request
|
||
from litellm.proxy.proxy_server import general_settings, proxy_config
|
||
|
||
logging_callbacks: List[str] = []
|
||
for callback in key_logging:
|
||
if callback.get("callback_name") is not None:
|
||
logging_callbacks.append(callback["callback_name"])
|
||
else:
|
||
raise ValueError("callback_name is required in key_logging")
|
||
|
||
log_capture_string = StringIO()
|
||
ch = logging.StreamHandler(log_capture_string)
|
||
ch.setLevel(logging.ERROR)
|
||
logger = logging.getLogger()
|
||
logger.addHandler(ch)
|
||
|
||
try:
|
||
data = {
|
||
"model": "openai/litellm-key-health-test",
|
||
"messages": [
|
||
{
|
||
"role": "user",
|
||
"content": "Hello, this is a test from litellm /key/health. No LLM API call was made for this",
|
||
}
|
||
],
|
||
"mock_response": "test response",
|
||
}
|
||
data = await add_litellm_data_to_request(
|
||
data=data,
|
||
user_api_key_dict=user_api_key_dict,
|
||
proxy_config=proxy_config,
|
||
general_settings=general_settings,
|
||
request=request,
|
||
)
|
||
await litellm.acompletion(
|
||
**data
|
||
) # make mock completion call to trigger key based callbacks
|
||
except Exception as e:
|
||
return LoggingCallbackStatus(
|
||
callbacks=logging_callbacks,
|
||
status="unhealthy",
|
||
details=f"Logging test failed: {str(e)}",
|
||
)
|
||
|
||
await asyncio.sleep(
|
||
2
|
||
) # wait for callbacks to run, callbacks use batching so wait for the flush event
|
||
|
||
# Check if any logger exceptions were triggered
|
||
log_contents = log_capture_string.getvalue()
|
||
logger.removeHandler(ch)
|
||
if log_contents:
|
||
return LoggingCallbackStatus(
|
||
callbacks=logging_callbacks,
|
||
status="unhealthy",
|
||
details=f"Logger exceptions triggered, system is unhealthy: {log_contents}",
|
||
)
|
||
else:
|
||
return LoggingCallbackStatus(
|
||
callbacks=logging_callbacks,
|
||
status="healthy",
|
||
details=f"No logger exceptions triggered, system is healthy. Manually check if logs were sent to {logging_callbacks} ",
|
||
)
|
||
|
||
|
||
_KEY_ALIAS_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9_\-/\.@]{0,253}[a-zA-Z0-9]$")
|
||
|
||
|
||
def _validate_key_alias_format(key_alias: Optional[str]) -> None:
|
||
"""
|
||
Validate the format of the key_alias.
|
||
|
||
Gated behind ``litellm.enable_key_alias_format_validation`` (default **False**).
|
||
When disabled, no validation is performed so existing workflows are not broken.
|
||
|
||
Rules (when enabled):
|
||
- None is OK (no alias).
|
||
- Otherwise must be 2–255 chars
|
||
- start/end with alphanumeric
|
||
- only allow a-zA-Z0-9_-/.@
|
||
"""
|
||
if not litellm.enable_key_alias_format_validation:
|
||
return
|
||
|
||
if key_alias is None:
|
||
return
|
||
|
||
if not _KEY_ALIAS_PATTERN.match(key_alias):
|
||
raise ProxyException(
|
||
message="Invalid key_alias format. Must be 2-255 characters, start/end with alphanumeric, and only contain a-zA-Z0-9_-/.@.",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="key_alias",
|
||
code=400,
|
||
)
|
||
|
||
|
||
async def _enforce_unique_key_alias(
|
||
key_alias: Optional[str],
|
||
prisma_client: Any,
|
||
existing_key_token: Optional[str] = None,
|
||
) -> None:
|
||
"""
|
||
Helper to enforce unique key aliases across all keys.
|
||
|
||
Args:
|
||
key_alias (Optional[str]): The key alias to check
|
||
prisma_client (Any): Prisma client instance
|
||
existing_key_token (Optional[str]): ID of existing key being updated, to exclude from uniqueness check
|
||
(The Admin UI passes key_alias, in all Edit key requests. So we need to be sure that if we find a key with the same alias, it's not the same key we're updating)
|
||
|
||
Raises:
|
||
ProxyException: If key alias already exists on a different key
|
||
"""
|
||
if key_alias is not None and prisma_client is not None:
|
||
where_clause: dict[str, Any] = {"key_alias": key_alias}
|
||
if existing_key_token:
|
||
# Exclude the current key from the uniqueness check
|
||
where_clause["NOT"] = {"token": existing_key_token}
|
||
|
||
existing_key = await prisma_client.db.litellm_verificationtoken.find_first(
|
||
where=where_clause
|
||
)
|
||
if existing_key is not None:
|
||
raise ProxyException(
|
||
message=f"Key with alias '{key_alias}' already exists. Unique key aliases across all keys are required.",
|
||
type=ProxyErrorTypes.bad_request_error,
|
||
param="key_alias",
|
||
code=status.HTTP_400_BAD_REQUEST,
|
||
)
|
||
|
||
|
||
def validate_model_max_budget(model_max_budget: Optional[Dict]) -> None:
|
||
"""
|
||
Validate the model_max_budget is GenericBudgetConfigType + enforce user has an enterprise license
|
||
|
||
Raises:
|
||
Exception: If model_max_budget is not a valid GenericBudgetConfigType
|
||
"""
|
||
try:
|
||
if model_max_budget is None:
|
||
return
|
||
if len(model_max_budget) == 0:
|
||
return
|
||
if model_max_budget is not None:
|
||
from litellm.proxy.proxy_server import CommonProxyErrors, premium_user
|
||
|
||
if premium_user is not True:
|
||
raise ValueError(
|
||
f"You must have an enterprise license to set model_max_budget. {CommonProxyErrors.not_premium_user.value}"
|
||
)
|
||
for _model, _budget_info in model_max_budget.items():
|
||
assert isinstance(_model, str)
|
||
|
||
# Normalize to dict (Pydantic may already parse nested values as BudgetConfig)
|
||
_info = (
|
||
_budget_info.model_dump()
|
||
if hasattr(_budget_info, "model_dump")
|
||
else dict(_budget_info)
|
||
)
|
||
# /CRUD endpoints can pass budget_limit as a string, so we need to convert it to a float
|
||
if "budget_limit" in _info:
|
||
_info["budget_limit"] = float(_info["budget_limit"])
|
||
BudgetConfig(**_info)
|
||
except Exception as e:
|
||
raise ValueError(
|
||
f"Invalid model_max_budget: {str(e)}. Example of valid model_max_budget: https://docs.litellm.ai/docs/proxy/users"
|
||
)
|