diff --git a/docs/my-website/blog/april_townhall_announcement/index.md b/docs/my-website/blog/april_townhall_announcement/index.md
index 466d9e845f..1f842536f8 100644
--- a/docs/my-website/blog/april_townhall_announcement/index.md
+++ b/docs/my-website/blog/april_townhall_announcement/index.md
@@ -4,6 +4,7 @@ title: "April Townhall: Security + Product Roadmap"
date: 2026-04-02T07:30:00
authors:
- krrish
+ - ishaan-alt
description: "Join the LiteLLM April townhall on Friday, 10 April at 7:30 AM to learn about LiteLLM's security and product roadmap."
tags: [announcement, townhall]
hide_table_of_contents: true
diff --git a/docs/my-website/blog/april_townhall_updates/index.md b/docs/my-website/blog/april_townhall_updates/index.md
new file mode 100644
index 0000000000..c726d1b7f8
--- /dev/null
+++ b/docs/my-website/blog/april_townhall_updates/index.md
@@ -0,0 +1,162 @@
+---
+slug: april-townhall-updates
+title: "April Townhall Updates: CI/CD v2, Stability, and Product Roadmap"
+date: 2026-04-10T12:00:00
+authors:
+ - krrish
+ - ishaan-alt
+description: "A recap of the April LiteLLM town hall covering CI/CD v2, product stability work, and the near-term roadmap."
+tags: [townhall, security, reliability, product]
+hide_table_of_contents: false
+---
+
+import Image from '@theme/IdealImage';
+
+Thank you to everyone who joined our April town hall.
+
+We used the session to share our CI/CD v2 improvements, our product stability work, and what we are prioritizing next across reliability and the product roadmap.
+
+{/* truncate */}
+
+## CI/CD v2 improvements
+
+Our CI/CD v2 work is centered around four goals:
+
+1. **Limit** what each package can access
+2. **Reduce** the number of sensitive environment variables
+3. **Avoid** compromised packages
+4. **Reduce the risk of** release tampering
+
+#### New architecture: isolated environments
+
+We have begun moving to isolated environments for distinct CI/CD stages to reduce the chance that a single compromised step can inherit broad access across the entire pipeline.
+
+
+
+#### Current rollout status
+
+These changes are deployed in our current release workflow. [See the release tags](https://github.com/BerriAI/litellm/tags).
+
+#### Independently verify releases
+
+A key part of CI/CD v2 is supporting independent verification of release artifacts using our published verification process, while reducing reliance on any single credential or release path.
+
+[**Learn more about how to verify releases**](https://docs.litellm.ai/docs/proxy/docker_image_security)
+
+
+
+## Stability improvements
+
+### SDLC improvements
+
+This month, we're focusing on process stability improvements around:
+- Improving main-branch stability
+- Mapping UI QA to built Docker images for 1:1 environment parity
+- Consistent release tags across PyPI and Docker
+- Fixing release notes publication
+
+#### Improving main-branch stability
+
+We're introducing a staging-gated flow:
+
+
+
+- Only an internal staging branch can push to `main`.
+- PRs to that staging branch must pass CircleCI LLM API testing.
+- Collision handling happens on staging, which is designed to reduce unstable changes reaching `main`.
+
+#### UI QA in Docker environment
+
+Moving forward, all UI QA will be performed in the built Docker image that users run.
+
+Previously, some UI QA paths were run in local environments that did not fully replicate Docker runtime conditions.
+
+That contributed to release-specific issues, including MCP registration problems in `v1.82.3`.
+
+#### Consistent release tags
+
+Today we publish releases for multiple scenarios:
+- Dev (Built from a PR for a customer-specific scenario)
+- Nightly (Passes all CI/CD checks)
+- Release Candidate (Passes all CI/CD checks + manual UI QA)
+- Stable (intended to pass all CI/CD checks + manual UI QA + 7 days of production testing)
+
+We are targeting a consistent naming convention across PyPI and Docker by the end of April.
+
+#### Release notes
+
+CI/CD v2 changes moved release notes to a manual path. This is a temporary solution while we investigate a better automated workflow. We are targeting a more consistent process by the end of April.
+
+### Product stability improvements
+
+#### Stable Prisma migrations
+
+Today, we have observed several migration failure classes:
+- Migration not applied
+- Migration marked applied but incomplete
+- Migration not applied due to non-root image issues
+
+We're prioritizing this work this month and have assigned an engineering owner to the effort. Our target is to resolve these error classes by the end of April.
+
+#### UI type safety
+
+Another area of focus is improving the stability of the UI. Today, one cause of errors is that the UI maintains its own assumptions about backend API types. This can lead to issues when backend responses differ from UI assumptions.
+
+We aim to keep the UI and backend in sync with each other, and are exploring OpenAPI-driven mapping to achieve this.
+
+## Product roadmap
+
+### Our Assumptions
+
+Over the next few years, we expect:
+- Companies will give employees more AI tools.
+- More AI agents will move into production workflows across HR, finance, support, and operations.
+
+### Our Inferences
+#### Near-term
+
+- AI spend will increase.
+- Uptime and latency will become even more important.
+- More AI resources (skills, CLIs, and related assets) will require governance.
+- Agent and MCP usage patterns will require deeper controls.
+- Broader developer adoption will increase the need for simpler, more discoverable tooling.
+
+#### Long-term
+
+- We expect many organizations to treat agent auditability (how decisions were made across LLM + MCP + sub-agent inputs/outputs) as a compliance expectation.
+- Permission management will get more complex as user-agent interaction chains deepen.
+
+Roadmap timelines in this post are targets and may evolve based on validation and user feedback.
+
+## April investments
+
+### Reliability
+
+- Increase uptime for 10k+ RPS scenarios.
+- Investigate latency overhead for long-running Claude Code requests.
+
+### Feature reliability
+
+- Polish MCP authentication.
+- Better understand how teams are using agents through LiteLLM.
+
+### Governance
+
+- Launch Skills as a first-class citizen in LiteLLM.
+
+## Q&A
+
+Thank you again for all the questions and direct feedback. We will keep sharing concrete progress updates as these efforts ship.
+
+## Hiring
+
+We are actively hiring across several roles. If you're interested, please apply [here](https://jobs.ashbyhq.com/litellm)!
\ No newline at end of file
diff --git a/docs/my-website/blog/authors.yml b/docs/my-website/blog/authors.yml
index 1b1ef4d34c..c8a1bab7ed 100644
--- a/docs/my-website/blog/authors.yml
+++ b/docs/my-website/blog/authors.yml
@@ -24,7 +24,7 @@ ishaan:
# Alias for typo in name
ishaan-alt:
- name: Ishaan Jaff
+ name: Ishaan Jaffer
title: CTO, LiteLLM
url: https://www.linkedin.com/in/reffajnaahsi/
image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
diff --git a/docs/my-website/img/april_townhall_isolated_environments.png b/docs/my-website/img/april_townhall_isolated_environments.png
new file mode 100644
index 0000000000..120e5cec9b
Binary files /dev/null and b/docs/my-website/img/april_townhall_isolated_environments.png differ
diff --git a/docs/my-website/img/stable_main.png b/docs/my-website/img/stable_main.png
new file mode 100644
index 0000000000..f050b54f6e
Binary files /dev/null and b/docs/my-website/img/stable_main.png differ
diff --git a/docs/my-website/img/verify_releases.png b/docs/my-website/img/verify_releases.png
new file mode 100644
index 0000000000..270a999d8d
Binary files /dev/null and b/docs/my-website/img/verify_releases.png differ
diff --git a/litellm/integrations/dotprompt/prompt_manager.py b/litellm/integrations/dotprompt/prompt_manager.py
index 997a40d545..6407a18d0b 100644
--- a/litellm/integrations/dotprompt/prompt_manager.py
+++ b/litellm/integrations/dotprompt/prompt_manager.py
@@ -7,7 +7,8 @@ from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple, Union
import yaml
-from jinja2 import DictLoader, Environment, select_autoescape
+from jinja2 import DictLoader, select_autoescape
+from jinja2.sandbox import ImmutableSandboxedEnvironment
class PromptTemplate:
@@ -59,7 +60,10 @@ class PromptManager:
self.prompt_directory = Path(prompt_directory) if prompt_directory else None
self.prompts: Dict[str, PromptTemplate] = {}
self.prompt_file = prompt_file
- self.jinja_env = Environment(
+ # Sandboxed env: templates can come from user input via /prompts/test,
+ # so we must block access to unsafe Python attributes and mutation of
+ # caller-supplied mutables.
+ self.jinja_env = ImmutableSandboxedEnvironment(
loader=DictLoader({}),
autoescape=select_autoescape(["html", "xml"]),
# Use Handlebars-style delimiters to match Dotprompt spec
diff --git a/litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py b/litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py
index c1eccaebd0..3ac579dae1 100644
--- a/litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py
+++ b/litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py
@@ -538,9 +538,9 @@ class AmazonAnthropicClaudeMessagesConfig(
merges usage from message_start and message_delta but ignores
message_stop. This method buffers message_delta and, when
message_stop arrives with cache usage, merges those fields into the
- message_delta usage and also updates the input_tokens on
- message_delta to include the full count (uncached + cache_creation +
- cache_read).
+ message_delta usage. input_tokens is kept as the uncached-only
+ count; downstream calculate_usage adds cache tokens to
+ prompt_tokens.
"""
_CACHE_FIELDS = ("cache_creation_input_tokens", "cache_read_input_tokens")
pending_delta = None
@@ -569,12 +569,7 @@ class AmazonAnthropicClaudeMessagesConfig(
raw_input = stop_usage.get("input_tokens")
if raw_input is not None:
- uncached = raw_input if isinstance(raw_input, int) else 0
- raw_cc = delta_usage.get("cache_creation_input_tokens", 0)
- cache_creation = raw_cc if isinstance(raw_cc, int) else 0
- raw_cr = delta_usage.get("cache_read_input_tokens", 0)
- cache_read = raw_cr if isinstance(raw_cr, int) else 0
- delta_usage["input_tokens"] = uncached + cache_creation + cache_read
+ delta_usage["input_tokens"] = raw_input if isinstance(raw_input, int) else 0
if delta_usage:
pending_delta["usage"] = delta_usage # type: ignore[arg-type]
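The updated docstring says downstream usage calculation adds cache tokens to `prompt_tokens`. If that is the case, folding them into `input_tokens` first would count them twice. A minimal sketch of that arithmetic, assuming a simplified downstream step: `prompt_tokens_from_usage` is a hypothetical stand-in, not LiteLLM's actual `calculate_usage`, and the token counts are made up.

```python
from typing import Dict

CACHE_FIELDS = ("cache_creation_input_tokens", "cache_read_input_tokens")


def prompt_tokens_from_usage(usage: Dict[str, int]) -> int:
    """Hypothetical downstream step: uncached input plus both cache fields."""
    return usage.get("input_tokens", 0) + sum(usage.get(f, 0) for f in CACHE_FIELDS)


# Usage as merged by the new code: input_tokens is the uncached-only count.
new_usage = {
    "input_tokens": 100,
    "cache_creation_input_tokens": 20,
    "cache_read_input_tokens": 30,
}

# Usage as merged by the old code: cache tokens already folded into input_tokens.
old_usage = dict(new_usage, input_tokens=100 + 20 + 30)

print(prompt_tokens_from_usage(new_usage))  # 150
print(prompt_tokens_from_usage(old_usage))  # 200, cache tokens counted twice
```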
diff --git a/litellm/llms/litellm_proxy/skills/prompt_injection.py b/litellm/llms/litellm_proxy/skills/prompt_injection.py
index 2b86f74122..86b6e22351 100644
--- a/litellm/llms/litellm_proxy/skills/prompt_injection.py
+++ b/litellm/llms/litellm_proxy/skills/prompt_injection.py
@@ -5,6 +5,7 @@ Handles extraction of skill content (SKILL.md) from stored ZIP files
and injection into the system prompt for non-Anthropic models.
"""
+import posixpath
import zipfile
from io import BytesIO
from typing import Any, Dict, List, Optional
@@ -103,8 +104,18 @@ class SkillPromptInjectionHandler:
else:
clean_path = name
- if clean_path:
- files[clean_path] = zf.read(name)
+ if not clean_path:
+ continue
+
+ # Ensure the path stays within the intended directory
+ normalized = posixpath.normpath(clean_path)
+ if normalized.startswith("..") or posixpath.isabs(normalized):
+ verbose_logger.warning(
+ f"SkillPromptInjectionHandler: Skipping entry with invalid path in skill {skill.skill_id}: {name}"
+ )
+ continue
+
+ files[normalized] = zf.read(name)
except Exception as e:
verbose_logger.warning(
f"SkillPromptInjectionHandler: Error extracting files from skill {skill.skill_id}: {e}"
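The path check added above can be exercised in isolation. The sketch below reproduces the same `posixpath` logic in a standalone helper; the name `safe_zip_entry_path` and the sample entries are illustrative and do not appear in the codebase.

```python
import posixpath
from typing import Optional


def safe_zip_entry_path(clean_path: str) -> Optional[str]:
    """Return the normalized path, or None if it would escape the skill root."""
    if not clean_path:
        return None
    normalized = posixpath.normpath(clean_path)
    if normalized.startswith("..") or posixpath.isabs(normalized):
        return None
    return normalized


for entry in ("SKILL.md", "docs/./guide.md", "a/../../etc/passwd", "/etc/passwd"):
    print(f"{entry!r} -> {safe_zip_entry_path(entry)!r}")
```

Because `startswith("..")` is a plain prefix test, an entry whose first component merely begins with two dots (for example `..notes/file.md`) is also skipped. That errs on the safe side, at the cost of rejecting an unusual but legitimate name.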
diff --git a/litellm/llms/litellm_proxy/skills/sandbox_executor.py b/litellm/llms/litellm_proxy/skills/sandbox_executor.py
index 42e5b941bc..4514512fc5 100644
--- a/litellm/llms/litellm_proxy/skills/sandbox_executor.py
+++ b/litellm/llms/litellm_proxy/skills/sandbox_executor.py
@@ -94,9 +94,15 @@ class SkillsSandboxExecutor:
# Create a temp directory to stage files
with tempfile.TemporaryDirectory() as tmpdir:
+ tmpdir_abs = os.path.abspath(tmpdir)
for path, content in skill_files.items():
# Create the file in temp directory
- local_path = os.path.join(tmpdir, path)
+ local_path = os.path.abspath(os.path.join(tmpdir, path))
+ if not local_path.startswith(tmpdir_abs + os.sep):
+ verbose_logger.warning(
+ f"SkillsSandboxExecutor: Skipping file with invalid path: {path}"
+ )
+ continue
os.makedirs(os.path.dirname(local_path), exist_ok=True)
with open(local_path, "wb") as f:
f.write(content)
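The containment check in this hunk can be sketched as a standalone helper. `resolve_inside` is a hypothetical name used only for illustration; the logic mirrors the `abspath` plus `startswith` comparison above.

```python
import os
import tempfile
from typing import Optional


def resolve_inside(base_dir: str, relative_path: str) -> Optional[str]:
    """Return the absolute target path, or None if it falls outside base_dir."""
    base_abs = os.path.abspath(base_dir)
    target = os.path.abspath(os.path.join(base_abs, relative_path))
    # Appending os.sep stops a sibling such as "/tmp/abc-evil" from
    # matching the base directory "/tmp/abc".
    if not target.startswith(base_abs + os.sep):
        return None
    return target


with tempfile.TemporaryDirectory() as tmpdir:
    for path in ("scripts/run.py", "../outside.txt", "/etc/passwd"):
        resolved = resolve_inside(tmpdir, path)
        print(path, "->", "staged" if resolved else "skipped")
```

`os.path.abspath` normalizes `..` segments but does not resolve symlinks. Since the files here are written into a fresh temporary directory, that is probably sufficient; `os.path.realpath` would be the stricter choice if symlinks could already exist under the base directory.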
diff --git a/litellm/proxy/_types.py b/litellm/proxy/_types.py
index 793742891f..96e221a9ac 100644
--- a/litellm/proxy/_types.py
+++ b/litellm/proxy/_types.py
@@ -494,10 +494,12 @@ class LiteLLMRoutes(enum.Enum):
"/v2/key/info",
"/model_group/info",
"/health",
+ "/health/services",
"/key/list",
"/user/filter/ui",
"/models",
"/v1/models",
+ "/sso/get/ui_settings",
]
# NOTE: ROUTES ONLY FOR MASTER KEY - only the Master Key should be able to Reset Spend
@@ -566,6 +568,8 @@ class LiteLLMRoutes(enum.Enum):
"/spend/tags",
"/spend/calculate",
"/spend/logs",
+ "/spend/logs/ui",
+ "/spend/logs/session/ui",
"/cost/estimate",
]
@@ -581,6 +585,7 @@ class LiteLLMRoutes(enum.Enum):
"/global/spend/report",
"/global/spend/provider",
"/global/spend/tags",
+ "/global/spend/all_tag_names",
]
public_routes = set(
@@ -602,6 +607,9 @@ class LiteLLMRoutes(enum.Enum):
]
)
+ # Retained for backwards compatibility with JWT auth configs that reference
+ # "ui_routes" in admin_allowed_routes. Not used by the proxy's own route
+ # authorization — UI tokens now go through the same RBAC path as API tokens.
ui_routes = [
"/sso",
"/sso/get/ui_settings",
@@ -627,19 +635,16 @@ class LiteLLMRoutes(enum.Enum):
internal_user_routes = (
[
- "/global/spend/tags",
- "/global/spend/keys",
- "/global/spend/models",
- "/global/spend/provider",
- "/global/spend/end_users",
"/global/activity",
"/global/activity/model",
+ "/global/activity/cache_hits",
"/v1/models/{model_id}",
"/models/{model_id}",
"/guardrails/list",
"/v2/guardrails/list",
]
+ spend_tracking_routes
+ + global_spend_tracking_routes
+ key_management_routes
)
@@ -694,6 +699,9 @@ class LiteLLMRoutes(enum.Enum):
"/tag/list",
"/audit",
"/audit/{id}",
+ "/global/activity",
+ "/global/activity/model",
+ "/global/activity/cache_hits",
] + info_routes
# All routes accesible by an Org Admin
@@ -892,9 +900,9 @@ class GenerateRequestBase(LiteLLMPydanticObjectBase):
allowed_cache_controls: Optional[list] = []
config: Optional[dict] = {}
permissions: Optional[dict] = {}
- model_max_budget: Optional[dict] = (
- {}
- ) # {"gpt-4": 5.0, "gpt-3.5-turbo": 5.0}, defaults to {}
+ model_max_budget: Optional[
+ dict
+ ] = {} # {"gpt-4": 5.0, "gpt-3.5-turbo": 5.0}, defaults to {}
model_config = ConfigDict(protected_namespaces=())
model_rpm_limit: Optional[dict] = None
@@ -1036,9 +1044,9 @@ class RegenerateKeyRequest(GenerateKeyRequest):
spend: Optional[float] = None
metadata: Optional[dict] = None
new_master_key: Optional[str] = None
- grace_period: Optional[str] = (
- None # Duration to keep old key valid (e.g. "24h", "2d"); None = immediate revoke
- )
+ grace_period: Optional[
+ str
+ ] = None # Duration to keep old key valid (e.g. "24h", "2d"); None = immediate revoke
class ResetSpendRequest(LiteLLMPydanticObjectBase):
@@ -1562,12 +1570,12 @@ class NewCustomerRequest(BudgetNewRequest):
blocked: bool = False # allow/disallow requests for this end-user
budget_id: Optional[str] = None # give either a budget_id or max_budget
spend: Optional[float] = None
- allowed_model_region: Optional[AllowedModelRegion] = (
- None # require all user requests to use models in this specific region
- )
- default_model: Optional[str] = (
- None # if no equivalent model in allowed region - default all requests to this model
- )
+ allowed_model_region: Optional[
+ AllowedModelRegion
+ ] = None # require all user requests to use models in this specific region
+ default_model: Optional[
+ str
+ ] = None # if no equivalent model in allowed region - default all requests to this model
object_permission: Optional[LiteLLM_ObjectPermissionBase] = None
@model_validator(mode="before")
@@ -1590,12 +1598,12 @@ class UpdateCustomerRequest(LiteLLMPydanticObjectBase):
blocked: bool = False # allow/disallow requests for this end-user
max_budget: Optional[float] = None
budget_id: Optional[str] = None # give either a budget_id or max_budget
- allowed_model_region: Optional[AllowedModelRegion] = (
- None # require all user requests to use models in this specific region
- )
- default_model: Optional[str] = (
- None # if no equivalent model in allowed region - default all requests to this model
- )
+ allowed_model_region: Optional[
+ AllowedModelRegion
+ ] = None # require all user requests to use models in this specific region
+ default_model: Optional[
+ str
+ ] = None # if no equivalent model in allowed region - default all requests to this model
object_permission: Optional[LiteLLM_ObjectPermissionBase] = None
@@ -1685,15 +1693,15 @@ class NewTeamRequest(TeamBase):
] = None # raise an error if 'guaranteed_throughput' is set and we're overallocating tpm
model_tpm_limit: Optional[Dict[str, int]] = None
- team_member_budget: Optional[float] = (
- None # allow user to set a budget for all team members
- )
- team_member_rpm_limit: Optional[int] = (
- None # allow user to set RPM limit for all team members
- )
- team_member_tpm_limit: Optional[int] = (
- None # allow user to set TPM limit for all team members
- )
+ team_member_budget: Optional[
+ float
+ ] = None # allow user to set a budget for all team members
+ team_member_rpm_limit: Optional[
+ int
+ ] = None # allow user to set RPM limit for all team members
+ team_member_tpm_limit: Optional[
+ int
+ ] = None # allow user to set TPM limit for all team members
team_member_key_duration: Optional[str] = None # e.g. "1d", "1w", "1m"
team_member_budget_duration: Optional[str] = None # e.g. "30d", "1mo"
allowed_vector_store_indexes: Optional[List[AllowedVectorStoreIndexItem]] = None
@@ -1790,9 +1798,9 @@ class BlockKeyRequest(LiteLLMPydanticObjectBase):
class AddTeamCallback(LiteLLMPydanticObjectBase):
callback_name: str
- callback_type: Optional[Literal["success", "failure", "success_and_failure"]] = (
- "success_and_failure"
- )
+ callback_type: Optional[
+ Literal["success", "failure", "success_and_failure"]
+ ] = "success_and_failure"
callback_vars: Dict[str, str]
@model_validator(mode="before")
@@ -2134,9 +2142,9 @@ class ConfigList(LiteLLMPydanticObjectBase):
stored_in_db: Optional[bool]
field_default_value: Any
premium_field: bool = False
- nested_fields: Optional[List[FieldDetail]] = (
- None # For nested dictionary or Pydantic fields
- )
+ nested_fields: Optional[
+ List[FieldDetail]
+ ] = None # For nested dictionary or Pydantic fields
class UserHeaderMapping(LiteLLMPydanticObjectBase):
@@ -2495,9 +2503,9 @@ class UserAPIKeyAuth(
user_max_budget: Optional[float] = None
request_route: Optional[str] = None
user: Optional[Any] = None # Expanded user object when expand=user is used
- created_by_user: Optional[Any] = (
- None # Expanded created_by user when expand=user is used
- )
+ created_by_user: Optional[
+ Any
+ ] = None # Expanded created_by user when expand=user is used
end_user_object_permission: Optional[LiteLLM_ObjectPermissionTable] = None
# Decoded upstream IdP claims (groups, roles, etc.) propagated by JWT auth machinery
# and forwarded into outbound tokens by guardrails such as MCPJWTSigner.
@@ -2636,9 +2644,9 @@ class LiteLLM_OrganizationMembershipTable(LiteLLMPydanticObjectBase):
budget_id: Optional[str] = None
created_at: datetime
updated_at: datetime
- user: Optional[Any] = (
- None # You might want to replace 'Any' with a more specific type if available
- )
+ user: Optional[
+ Any
+ ] = None # You might want to replace 'Any' with a more specific type if available
litellm_budget_table: Optional[LiteLLM_BudgetTable] = None
user_email: Optional[str] = None
@@ -3793,9 +3801,9 @@ class TeamModelDeleteRequest(BaseModel):
# Organization Member Requests
class OrganizationMemberAddRequest(OrgMemberAddRequest):
organization_id: str
- max_budget_in_organization: Optional[float] = (
- None # Users max budget within the organization
- )
+ max_budget_in_organization: Optional[
+ float
+ ] = None # Users max budget within the organization
class OrganizationMemberDeleteRequest(MemberDeleteRequest):
@@ -4050,9 +4058,9 @@ class ProviderBudgetResponse(LiteLLMPydanticObjectBase):
Maps provider names to their budget configs.
"""
- providers: Dict[str, ProviderBudgetResponseObject] = (
- {}
- ) # Dictionary mapping provider names to their budget configurations
+ providers: Dict[
+ str, ProviderBudgetResponseObject
+ ] = {} # Dictionary mapping provider names to their budget configurations
class ProxyStateVariables(TypedDict):
@@ -4214,9 +4222,9 @@ class LiteLLM_JWTAuth(LiteLLMPydanticObjectBase):
enforce_rbac: bool = False
roles_jwt_field: Optional[str] = None # v2 on role mappings
role_mappings: Optional[List[RoleMapping]] = None
- object_id_jwt_field: Optional[str] = (
- None # can be either user / team, inferred from the role mapping
- )
+ object_id_jwt_field: Optional[
+ str
+ ] = None # can be either user / team, inferred from the role mapping
scope_mappings: Optional[List[ScopeMapping]] = None
enforce_scope_based_access: bool = False
enforce_team_based_model_access: bool = False
diff --git a/litellm/proxy/agent_endpoints/endpoints.py b/litellm/proxy/agent_endpoints/endpoints.py
index 6e5d4562b5..64c20d5ed5 100644
--- a/litellm/proxy/agent_endpoints/endpoints.py
+++ b/litellm/proxy/agent_endpoints/endpoints.py
@@ -28,6 +28,7 @@ from litellm.types.agents import (
MakeAgentsPublicRequest,
PatchAgentRequest,
)
+from litellm.litellm_core_utils.litellm_logging import _get_masked_values
from litellm.types.llms.custom_http import httpxSpecialProvider
from litellm.types.proxy.management_endpoints.common_daily_activity import (
SpendAnalyticsPaginatedResponse,
@@ -36,6 +37,28 @@ from litellm.types.proxy.management_endpoints.common_daily_activity import (
router = APIRouter()
+def _redact_sensitive_agent_fields(
+ agents: List[AgentResponse],
+) -> List[AgentResponse]:
+ """
+ Return copies of the given agents with sensitive configuration fields
+ redacted. The original objects are not modified.
+ """
+ redacted: List[AgentResponse] = []
+ for agent in agents:
+ copy = agent.model_copy(deep=True)
+ copy.static_headers = None
+ copy.extra_headers = None
+ if copy.litellm_params:
+ copy.litellm_params = _get_masked_values(
+ copy.litellm_params,
+ unmasked_length=4,
+ number_of_asterisks=4,
+ )
+ redacted.append(copy)
+ return redacted
+
+
def _check_agent_management_permission(user_api_key_dict: UserAPIKeyAuth) -> None:
"""
Raises HTTP 403 if the caller does not have permission to create, update,
@@ -183,6 +206,14 @@ async def get_agents(
agent.agent_id in litellm.public_agent_groups
)
+ # Redact sensitive fields for non-admin users
+ is_admin = (
+ user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN
+ or user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
+ )
+ if not is_admin:
+ returned_agents = _redact_sensitive_agent_fields(returned_agents)
+
if health_check:
agents_with_url = [
agent
@@ -399,6 +430,14 @@ async def get_agent_by_id(
status_code=404, detail=f"Agent with ID {agent_id} not found"
)
+ # Redact sensitive fields for non-admin users
+ is_admin = (
+ user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN
+ or user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
+ )
+ if not is_admin:
+ agent = _redact_sensitive_agent_fields([agent])[0]
+
return agent
except HTTPException:
raise
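The copy-then-redact pattern used for non-admin callers can be sketched without Pydantic. Everything below is a simplified assumption: `Agent` stands in for `AgentResponse`, and `mask` is a hypothetical rule that masks every string value, which is not necessarily how `_get_masked_values` decides what to mask.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Agent:  # hypothetical stand-in for AgentResponse
    agent_id: str
    static_headers: Optional[Dict[str, str]] = None
    litellm_params: Dict[str, Any] = field(default_factory=dict)


def mask(value: str, keep: int = 4) -> str:
    """Hypothetical masking rule: keep the first `keep` characters only."""
    return value[:keep] + "****"


def redact(agents: List[Agent]) -> List[Agent]:
    redacted = []
    for agent in agents:
        clone = copy.deepcopy(agent)  # never mutate the stored object
        clone.static_headers = None
        clone.litellm_params = {
            k: mask(v) if isinstance(v, str) else v
            for k, v in clone.litellm_params.items()
        }
        redacted.append(clone)
    return redacted


def agents_for_caller(agents: List[Agent], is_admin: bool) -> List[Agent]:
    """Admins see the stored objects; everyone else sees redacted copies."""
    return agents if is_admin else redact(agents)


stored = [Agent("a1", {"Authorization": "Bearer secret"}, {"api_key": "sk-1234567890"})]
print(agents_for_caller(stored, is_admin=False)[0])
print(stored[0].litellm_params["api_key"])  # original is unchanged
```

Redacting a deep copy matters because the same agent objects may be held in an in-memory registry; mutating them in place would strip the headers for every later caller, including admins.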
diff --git a/litellm/proxy/auth/auth_checks.py b/litellm/proxy/auth/auth_checks.py
index 68bde8434a..56958a88f6 100644
--- a/litellm/proxy/auth/auth_checks.py
+++ b/litellm/proxy/auth/auth_checks.py
@@ -196,9 +196,7 @@ def _is_model_cost_zero(
return True
-def _is_cost_explicitly_configured(
- model: str, llm_router: "Router"
-) -> bool:
+def _is_cost_explicitly_configured(model: str, llm_router: "Router") -> bool:
"""
Check if any deployment in the model group has cost fields explicitly
set in its litellm.model_cost entry.
@@ -215,10 +213,7 @@ def _is_cost_explicitly_configured(
if model_id is None:
continue
raw_entry = litellm.model_cost.get(model_id, {})
- if (
- "input_cost_per_token" in raw_entry
- or "output_cost_per_token" in raw_entry
- ):
+ if "input_cost_per_token" in raw_entry or "output_cost_per_token" in raw_entry:
return True
return False
@@ -596,17 +591,12 @@ async def common_checks( # noqa: PLR0915
user_object=user_object, route=route, request_body=request_body
)
- token_team = getattr(valid_token, "team_id", None)
- token_type: Literal["ui", "api"] = (
- "ui" if token_team is not None and token_team == "litellm-dashboard" else "api"
- )
- _is_route_allowed = _is_allowed_route(
+ _is_route_allowed = _is_api_route_allowed(
route=route,
- token_type=token_type,
- user_obj=user_object,
request=request,
request_data=request_body,
valid_token=valid_token,
+ user_obj=user_object,
)
# 11. [OPTIONAL] Vector store checks - is the object allowed to access the vector store
@@ -629,31 +619,6 @@ async def common_checks( # noqa: PLR0915
return True
-def _is_ui_route(
- route: str,
- user_obj: Optional[LiteLLM_UserTable] = None,
-) -> bool:
- """
- - Check if the route is a UI used route
- """
- # this token is only used for managing the ui
- allowed_routes = LiteLLMRoutes.ui_routes.value
- # check if the current route startswith any of the allowed routes
- if (
- route is not None
- and isinstance(route, str)
- and any(route.startswith(allowed_route) for allowed_route in allowed_routes)
- ):
- # Do something if the current route starts with any of the allowed routes
- return True
- elif any(
- RouteChecks._route_matches_pattern(route=route, pattern=allowed_route)
- for allowed_route in allowed_routes
- ):
- return True
- return False
-
-
def _get_user_role(
user_obj: Optional[LiteLLM_UserTable],
) -> Optional[LitellmUserRoles]:
@@ -717,30 +682,6 @@ def _is_user_proxy_admin(user_obj: Optional[LiteLLM_UserTable]):
return False
-def _is_allowed_route(
- route: str,
- token_type: Literal["ui", "api"],
- request: Request,
- request_data: dict,
- valid_token: Optional[UserAPIKeyAuth],
- user_obj: Optional[LiteLLM_UserTable] = None,
-) -> bool:
- """
- - Route b/w ui token check and normal token check
- """
-
- if token_type == "ui" and _is_ui_route(route=route, user_obj=user_obj):
- return True
- else:
- return _is_api_route_allowed(
- route=route,
- request=request,
- request_data=request_data,
- valid_token=valid_token,
- user_obj=user_obj,
- )
-
-
def _allowed_routes_check(user_route: str, allowed_routes: list) -> bool:
"""
Return if a user is allowed to access route. Helper function for `allowed_routes_check`.
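The effect of removing the UI-token branch can be sketched as follows. This is a simplified model under stated assumptions: `UI_ROUTES` is a two-entry subset, `/sso/admin-only` is a made-up route, `deny_all` stands in for a role with no access, and the removed pattern-matching fallback is omitted.

```python
from typing import Callable, List

UI_ROUTES: List[str] = ["/sso", "/sso/get/ui_settings"]  # illustrative subset


def old_is_allowed(route: str, team_id: str, rbac_check: Callable[[str], bool]) -> bool:
    """Pre-change flow: dashboard tokens short-circuit on a prefix match."""
    if team_id == "litellm-dashboard" and any(route.startswith(r) for r in UI_ROUTES):
        return True
    return rbac_check(route)


def new_is_allowed(route: str, team_id: str, rbac_check: Callable[[str], bool]) -> bool:
    """Post-change flow: every token goes through the same RBAC check."""
    return rbac_check(route)


def deny_all(route: str) -> bool:
    """Stand-in for a role that has access to nothing."""
    return False


print(old_is_allowed("/sso/admin-only", "litellm-dashboard", deny_all))  # True
print(new_is_allowed("/sso/admin-only", "litellm-dashboard", deny_all))  # False
```

In the old flow the prefix match alone was enough for a dashboard token, regardless of the caller's role. After the change the role-based check is the only gate.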
diff --git a/litellm/proxy/guardrails/guardrail_endpoints.py b/litellm/proxy/guardrails/guardrail_endpoints.py
index 422bdc1378..6814729258 100644
--- a/litellm/proxy/guardrails/guardrail_endpoints.py
+++ b/litellm/proxy/guardrails/guardrail_endpoints.py
@@ -60,12 +60,20 @@ def _get_guardrails_list_response(
"""
Helper function to get the guardrails list response
"""
+ from litellm.litellm_core_utils.litellm_logging import _get_masked_values
+
guardrail_configs: List[GuardrailInfoResponse] = []
for guardrail in guardrails_config:
+ litellm_params = guardrail.get("litellm_params") or {}
+ masked_params = _get_masked_values(
+ litellm_params,
+ unmasked_length=4,
+ number_of_asterisks=4,
+ )
guardrail_configs.append(
GuardrailInfoResponse(
guardrail_name=guardrail.get("guardrail_name"),
- litellm_params=guardrail.get("litellm_params"),
+ litellm_params=masked_params,
guardrail_info=guardrail.get("guardrail_info"),
)
)
diff --git a/litellm/proxy/management_endpoints/key_management_endpoints.py b/litellm/proxy/management_endpoints/key_management_endpoints.py
index 323ff7fd53..6e8a691ce9 100644
--- a/litellm/proxy/management_endpoints/key_management_endpoints.py
+++ b/litellm/proxy/management_endpoints/key_management_endpoints.py
@@ -456,6 +456,34 @@ def handle_key_type(data: GenerateKeyRequest, data_json: dict) -> dict:
return data_json
+def _check_allowed_routes_caller_permission(
+ allowed_routes: Optional[list],
+ user_api_key_dict: UserAPIKeyAuth,
+) -> None:
+ """
+ Only proxy admins may set `allowed_routes` on a key.
+
+ `allowed_routes` bypasses the standard role-based route gate in
+ RouteChecks.non_proxy_admin_allowed_routes_check, so if a non-admin is
+ allowed to set it they can grant themselves access to any endpoint.
+ Non-admins should use `key_type` to pick a preset route bucket instead.
+ """
+ # Empty list is the default on GenerateKeyRequest — treat as "not set".
+ if not allowed_routes:
+ return
+ if user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value:
+ return
+ raise HTTPException(
+ status_code=403,
+ detail={
+ "error": (
+ "Only proxy admins can set `allowed_routes` on a key. "
+ "Use `key_type` to pick a preset route bucket instead."
+ )
+ },
+ )
+
+
async def validate_team_id_used_in_service_account_request(
team_id: Optional[str],
prisma_client: Optional[PrismaClient],
@@ -740,9 +768,9 @@ async def _common_key_generation_helper( # noqa: PLR0915
request_type="key", **data_json, table_name="key"
)
- response[
- "soft_budget"
- ] = data.soft_budget # include the user-input soft budget in the response
+ response["soft_budget"] = (
+ data.soft_budget
+ ) # include the user-input soft budget in the response
response = GenerateKeyResponse(**response)
@@ -1254,6 +1282,12 @@ async def generate_key_fn(
raise HTTPException(
status_code=status.HTTP_403_FORBIDDEN, detail=message
)
+
+ _check_allowed_routes_caller_permission(
+ allowed_routes=data.allowed_routes,
+ user_api_key_dict=user_api_key_dict,
+ )
+
# For non-admin internal users: auto-assign caller's user_id if not provided
# This prevents creating unbound keys with no user association (LIT-1884)
_is_proxy_admin = (
@@ -1888,6 +1922,11 @@ async def _validate_update_key_data(
"""Validate permissions and constraints for key update."""
_is_proxy_admin = user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
+ _check_allowed_routes_caller_permission(
+ allowed_routes=data.allowed_routes,
+ user_api_key_dict=user_api_key_dict,
+ )
+
# Prevent non-admin from removing user_id (setting to empty string) (LIT-1884)
if data.user_id is not None and data.user_id == "" and not _is_proxy_admin:
raise HTTPException(
@@ -3233,10 +3272,10 @@ async def delete_verification_tokens(
try:
if prisma_client:
tokens = [_hash_token_if_needed(token=key) for key in tokens]
- _keys_being_deleted: List[
- LiteLLM_VerificationToken
- ] = await prisma_client.db.litellm_verificationtoken.find_many(
- where={"token": {"in": tokens}}
+ _keys_being_deleted: List[LiteLLM_VerificationToken] = (
+ await prisma_client.db.litellm_verificationtoken.find_many(
+ where={"token": {"in": tokens}}
+ )
)
if len(_keys_being_deleted) == 0:
@@ -3436,9 +3475,9 @@ async def _rotate_master_key( # noqa: PLR0915
from litellm.proxy.proxy_server import proxy_config
try:
- models: Optional[
- List
- ] = await prisma_client.db.litellm_proxymodeltable.find_many()
+ models: Optional[List] = (
+ await prisma_client.db.litellm_proxymodeltable.find_many()
+ )
except Exception:
models = None
# 2. process model table
@@ -4078,11 +4117,11 @@ async def validate_key_list_check(
param="user_id",
code=status.HTTP_403_FORBIDDEN,
)
- complete_user_info_db_obj: Optional[
- BaseModel
- ] = await prisma_client.db.litellm_usertable.find_unique(
- where={"user_id": user_api_key_dict.user_id},
- include={"organization_memberships": True},
+ complete_user_info_db_obj: Optional[BaseModel] = (
+ await prisma_client.db.litellm_usertable.find_unique(
+ where={"user_id": user_api_key_dict.user_id},
+ include={"organization_memberships": True},
+ )
)
if complete_user_info_db_obj is None:
@@ -4165,10 +4204,10 @@ async def _fetch_user_team_objects(
if complete_user_info is None or not complete_user_info.teams:
return []
- teams: Optional[
- List[BaseModel]
- ] = await prisma_client.db.litellm_teamtable.find_many(
- where={"team_id": {"in": complete_user_info.teams}}
+ teams: Optional[List[BaseModel]] = (
+ await prisma_client.db.litellm_teamtable.find_many(
+ where={"team_id": {"in": complete_user_info.teams}}
+ )
)
if teams is None:
return []
diff --git a/litellm/proxy/ui_crud_endpoints/proxy_setting_endpoints.py b/litellm/proxy/ui_crud_endpoints/proxy_setting_endpoints.py
index 60bf41709e..0349f289b4 100644
--- a/litellm/proxy/ui_crud_endpoints/proxy_setting_endpoints.py
+++ b/litellm/proxy/ui_crud_endpoints/proxy_setting_endpoints.py
@@ -1,6 +1,7 @@
#### CRUD ENDPOINTS for UI Settings #####
import json
from typing import Any, Dict, List, Optional, Union
+from urllib.parse import urlparse
from fastapi import APIRouter, Depends, File, HTTPException, UploadFile
@@ -817,6 +818,29 @@ async def get_ui_theme_settings():
)
+def _validate_public_image_url(value: Optional[str], field_name: str) -> None:
+ """
+ Reject anything that isn't a plain http(s) URL with a host. This value is
+ later served via the unauthenticated /get_image endpoint, so local paths
+ like "/etc/passwd" or "file://..." must not be accepted.
+ """
+ if value is None:
+ return
+ if not isinstance(value, str) or not value.strip():
+ return
+ parsed = urlparse(value.strip())
+ if parsed.scheme not in ("http", "https") or not parsed.netloc:
+ raise HTTPException(
+ status_code=400,
+ detail={
+ "error": (
+ f"Invalid {field_name}: must be an http(s) URL with a host. "
+ "Local filesystem paths and non-http schemes are not allowed."
+ )
+ },
+ )
+
+
@router.patch(
"/update/ui_theme_settings",
tags=["UI Theme Settings"],
@@ -831,6 +855,9 @@ async def update_ui_theme_settings(theme_config: UIThemeConfig):
from litellm.proxy.proxy_server import proxy_config, store_model_in_db
+ _validate_public_image_url(theme_config.logo_url, "logo_url")
+ _validate_public_image_url(theme_config.favicon_url, "favicon_url")
+
if store_model_in_db is not True:
raise HTTPException(
status_code=500,
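The `_validate_public_image_url` change above can be tried in isolation. The following is a minimal sketch of the same scheme-and-host check, assuming a boolean return instead of an `HTTPException`; the name `is_acceptable_image_url` is hypothetical and not part of the patch.

```python
from typing import Optional
from urllib.parse import urlparse


def is_acceptable_image_url(value: Optional[str]) -> bool:
    """Hypothetical helper: True for unset values and for http(s) URLs with a host."""
    if value is None or not value.strip():
        # Mirrors the no-op branch in the patch: empty means "not configured".
        return True
    parsed = urlparse(value.strip())
    # A local path such as "/etc/passwd" parses with an empty scheme and netloc,
    # and "file:///etc/passwd" parses with scheme "file", so both are rejected.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in ("/etc/passwd", "file:///etc/passwd", "https://example.com/logo.png"):
        print(candidate, is_acceptable_image_url(candidate))
```

Checking `netloc` as well as `scheme` matters because a string like `http:///logo.png` has a valid scheme but no host.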
diff --git a/litellm/proxy/utils.py b/litellm/proxy/utils.py
index 635204f336..88a2e1e95c 100644
--- a/litellm/proxy/utils.py
+++ b/litellm/proxy/utils.py
@@ -2645,7 +2645,7 @@ class PrismaClient:
raise e
async def _query_first_with_cached_plan_fallback(
- self, sql_query: str
+ self, sql_query: str, *args
) -> Optional[dict]:
"""
Execute a query with automatic fallback for PostgreSQL cached plan errors.
@@ -2664,7 +2664,7 @@ class PrismaClient:
Original exception if not a cached plan error
"""
try:
- return await self.db.query_first(query=sql_query)
+ return await self.db.query_first(sql_query, *args)
except Exception as e:
error_str = str(e)
if "cached plan must not change result type" in error_str:
@@ -2679,7 +2679,7 @@ class PrismaClient:
"retrying with fresh plan. This may occur during rolling deployments "
"when schema changes are applied."
)
- return await self.db.query_first(query=sql_query_retry)
+ return await self.db.query_first(sql_query_retry, *args)
else:
raise
@@ -2978,7 +2978,7 @@ class PrismaClient:
detail={"error": f"No token passed in. Token={token}"},
)
- sql_query = f"""
+ sql_query = """
SELECT
v.*,
t.spend AS team_spend,
@@ -3016,11 +3016,11 @@ class PrismaClient:
LEFT JOIN "LiteLLM_ProjectTable" AS p ON v.project_id = p.project_id
LEFT JOIN "LiteLLM_OrganizationTable" AS o ON v.organization_id = o.organization_id
LEFT JOIN "LiteLLM_BudgetTable" AS b2 ON o.budget_id = b2.budget_id
- WHERE v.token = '{token}'
+ WHERE v.token = $1
"""
response = await self._query_first_with_cached_plan_fallback(
- sql_query
+ sql_query, hashed_token
)
# If not found in main table, check deprecated keys (grace period)
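The hunk above replaces f-string interpolation of the token with a bound parameter (`$1`). The sketch below illustrates why, using the standard-library `sqlite3` module as a stand-in; this is an assumption for illustration only, since the proxy talks to PostgreSQL through Prisma and `sqlite3` uses `?` rather than `$1`. The table and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (token TEXT PRIMARY KEY, owner TEXT)")
conn.execute("INSERT INTO tokens VALUES ('abc', 'alice'), ('def', 'bob')")


def find_owner_interpolated(token: str) -> list:
    # Unsafe: the token text becomes part of the SQL statement itself.
    query = f"SELECT owner FROM tokens WHERE token = '{token}'"
    return conn.execute(query).fetchall()


def find_owner_parameterized(token: str) -> list:
    # Safe: the token is sent as a bound value and never parsed as SQL.
    return conn.execute(
        "SELECT owner FROM tokens WHERE token = ?", (token,)
    ).fetchall()


if __name__ == "__main__":
    payload = "x' OR '1'='1"
    print(len(find_owner_interpolated(payload)))   # every row matches
    print(len(find_owner_parameterized(payload)))  # no row matches
```

With the interpolated query the payload rewrites the `WHERE` clause, while the parameterized query compares the whole payload against the `token` column as a literal string.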
diff --git a/litellm/responses/main.py b/litellm/responses/main.py
index c82574278b..bcc3f7f05e 100644
--- a/litellm/responses/main.py
+++ b/litellm/responses/main.py
@@ -1967,11 +1967,18 @@ async def _aresponses_websocket(
)
# Extract params that we're passing explicitly to avoid duplicates in **kwargs
- remaining_kwargs = {
- k: v
- for k, v in kwargs.items()
- if k not in {"user_api_key_dict", "litellm_metadata"}
+ _explicit_keys = {
+ "user_api_key_dict",
+ "litellm_metadata",
+ "custom_llm_provider",
+ "model",
+ "websocket",
+ "litellm_logging_obj",
+ "api_base",
+ "api_key",
+ "timeout",
}
+ remaining_kwargs = {k: v for k, v in kwargs.items() if k not in _explicit_keys}
await base_llm_http_handler.async_responses_websocket(
model=model,
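The `_explicit_keys` filter above exists because Python rejects a call that receives the same keyword both explicitly and through `**kwargs`. A minimal sketch of the failure and the fix follows; `handler`, `forward_unfiltered`, and `forward_filtered` are hypothetical stand-ins, not LiteLLM functions.

```python
EXPLICIT_KEYS = {"model", "api_key"}


def handler(*, model, api_key=None, **extra):
    """Stand-in for a downstream function that takes some keywords explicitly."""
    return {"model": model, "api_key": api_key, "extra": extra}


def forward_unfiltered(model, kwargs):
    # Raises TypeError if kwargs also carries "model".
    return handler(model=model, **kwargs)


def forward_filtered(model, kwargs):
    # Drop every key that is already passed explicitly before unpacking.
    remaining = {k: v for k, v in kwargs.items() if k not in EXPLICIT_KEYS}
    return handler(model=model, api_key=kwargs.get("api_key"), **remaining)


if __name__ == "__main__":
    incoming = {"model": "gpt-x", "api_key": "sk-test", "timeout": 30}
    print(forward_filtered("gpt-x", incoming))
    try:
        forward_unfiltered("gpt-x", incoming)
    except TypeError as err:
        print("unfiltered call failed:", err)
```

Keeping the explicit names in one set means a newly added explicit argument only has to be listed once.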
diff --git a/pyproject.toml b/pyproject.toml
index 2cce0e30ee..6e901682c1 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "litellm"
-version = "1.83.5"
+version = "1.83.6"
description = "Library to easily interface with LLM API providers"
readme = "README.md"
requires-python = ">=3.9"
@@ -238,7 +238,7 @@ source-exclude = [
profile = "black"
[tool.commitizen]
-version = "1.83.5"
+version = "1.83.6"
version_files = [
"pyproject.toml:^version",
]
diff --git a/tests/proxy_unit_tests/test_jwt.py b/tests/proxy_unit_tests/test_jwt.py
index a5be1a3a42..9a8d6d3702 100644
--- a/tests/proxy_unit_tests/test_jwt.py
+++ b/tests/proxy_unit_tests/test_jwt.py
@@ -934,10 +934,7 @@ async def mock_user_object(*args, **kwargs):
user_id = kwargs.get("user_id")
user_email = kwargs.get("user_email")
return LiteLLM_UserTable(
- spend=0,
- user_id=user_id,
- max_budget=None,
- user_email=user_email
+ spend=0, user_id=user_id, max_budget=None, user_email=user_email
)
@@ -1170,15 +1167,13 @@ async def test_end_user_jwt_auth(monkeypatch):
# use generated key to auth in
from litellm import Router
from litellm.types.router import RouterGeneralSettings
-
+
# Create a router with pass_through_all_models enabled
router = Router(
model_list=[],
- router_general_settings=RouterGeneralSettings(
- pass_through_all_models=True
- ),
+ router_general_settings=RouterGeneralSettings(pass_through_all_models=True),
)
-
+
setattr(litellm.proxy.proxy_server, "premium_user", True)
setattr(
litellm.proxy.proxy_server,
@@ -1196,7 +1191,7 @@ async def test_end_user_jwt_auth(monkeypatch):
cost_tracking()
result = await user_api_key_auth(request=request, api_key=bearer_token)
-
+
# Assert that end_user_id is correctly extracted from JWT token's 'sub' field
assert result.end_user_id == "81b3e52a-67a6-4efb-9645-70527e101479"
@@ -1228,7 +1223,9 @@ async def test_end_user_jwt_auth(monkeypatch):
),
)
- with patch("litellm.acompletion", new=AsyncMock(return_value=mock_response)) as mock_completion:
+ with patch(
+ "litellm.acompletion", new=AsyncMock(return_value=mock_response)
+ ) as mock_completion:
resp = await chat_completion(
request=request,
fastapi_response=temp_response,
@@ -1243,10 +1240,13 @@ async def test_end_user_jwt_auth(monkeypatch):
# Verify the completion was called with correct end_user_id
mock_completion.assert_called_once()
call_kwargs = mock_completion.call_args.kwargs
-
+
# end_user_id is passed in metadata as 'user_api_key_end_user_id'
metadata = call_kwargs.get("metadata", {})
- assert metadata.get("user_api_key_end_user_id") == "81b3e52a-67a6-4efb-9645-70527e101479"
+ assert (
+ metadata.get("user_api_key_end_user_id")
+ == "81b3e52a-67a6-4efb-9645-70527e101479"
+ )
def test_can_rbac_role_call_route():
@@ -1278,13 +1278,13 @@ def test_user_api_key_auth_jwt_hashing():
"""
from litellm.proxy._types import UserAPIKeyAuth
from litellm.proxy.auth.handle_jwt import JWTHandler
-
+
# Test with a JWT token (3 parts separated by dots)
jwt_token = "test-jwt-token-header.payload.signature"
-
+
# Create UserAPIKeyAuth instance with JWT
user_auth = UserAPIKeyAuth(api_key=jwt_token)
-
+
# Verify that the API key is hashed with "hashed-jwt-" prefix
# critical - the raw JWT token should not be in the api_key or token
assert user_auth.api_key.startswith("hashed-jwt-")
@@ -1292,19 +1292,18 @@ def test_user_api_key_auth_jwt_hashing():
assert jwt_token not in user_auth.api_key
assert jwt_token not in user_auth.token
-
# Test with a regular API key (should not be hashed)
regular_api_key = "sk-1234567890abcdef"
user_auth_regular = UserAPIKeyAuth(api_key=regular_api_key)
-
+
# Verify that regular API key is hashed normally (without "hashed-jwt-" prefix)
assert not user_auth_regular.api_key.startswith("hashed-jwt-")
assert not user_auth_regular.token.startswith("hashed-jwt-")
-
+
# Test with a non-JWT, non-sk string (should not be hashed)
non_jwt_key = "some-random-key"
user_auth_non_jwt = UserAPIKeyAuth(api_key=non_jwt_key)
-
+
# Verify that non-JWT key is not hashed
assert user_auth_non_jwt.api_key == non_jwt_key
assert user_auth_non_jwt.token == non_jwt_key
@@ -1315,19 +1314,19 @@ def test_jwt_handler_is_jwt_static_method():
Test that JWTHandler.is_jwt is a static method and works correctly
"""
from litellm.proxy.auth.handle_jwt import JWTHandler
-
+
# Test with valid JWT format
valid_jwt = "test-jwt-token-header.payload.signature"
assert JWTHandler.is_jwt(valid_jwt) == True
-
+
# Test with invalid JWT format (only 2 parts)
invalid_jwt = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ"
assert JWTHandler.is_jwt(invalid_jwt) == False
-
+
# Test with regular API key
regular_key = "sk-1234567890abcdef"
assert JWTHandler.is_jwt(regular_key) == False
-
+
# Test with empty string
assert JWTHandler.is_jwt("") == False
@@ -1461,7 +1460,13 @@ async def test_auth_jwt_es256_jwk_path(monkeypatch):
now = int(time.time())
token = jwt.encode(
- {"sub": "alice", "aud": "litellm-proxy", "iss": "http://example", "iat": now, "exp": now + 300},
+ {
+ "sub": "alice",
+ "aud": "litellm-proxy",
+ "iss": "http://example",
+ "iat": now,
+ "exp": now + 300,
+ },
ec_priv_pem,
algorithm="ES256",
headers={"kid": "ec1"},
@@ -1508,7 +1513,13 @@ async def test_auth_jwt_rs256_regression(monkeypatch):
now = int(time.time())
token = jwt.encode(
- {"sub": "bob", "aud": "litellm-proxy", "iss": "http://example", "iat": now, "exp": now + 300},
+ {
+ "sub": "bob",
+ "aud": "litellm-proxy",
+ "iss": "http://example",
+ "iat": now,
+ "exp": now + 300,
+ },
rsa_priv_pem,
algorithm="RS256",
headers={"kid": "rsa1"},
@@ -1540,7 +1551,13 @@ async def test_auth_jwt_mismatched_key_fails(monkeypatch):
)
now = int(time.time())
token = jwt.encode(
- {"sub": "mallory", "aud": "litellm-proxy", "iss": "http://example", "iat": now, "exp": now + 300},
+ {
+ "sub": "mallory",
+ "aud": "litellm-proxy",
+ "iss": "http://example",
+ "iat": now,
+ "exp": now + 300,
+ },
ec_priv_pem,
algorithm="ES256",
headers={"kid": "ec1"},
@@ -1566,4 +1583,4 @@ async def test_auth_jwt_mismatched_key_fails(monkeypatch):
with patch.object(h, "get_public_key", new=AsyncMock(return_value=rsa_jwk)):
with pytest.raises(Exception) as exc:
await h.auth_jwt(token)
- assert "Validation fails" in str(exc.value)
\ No newline at end of file
+ assert "Validation fails" in str(exc.value)
diff --git a/tests/proxy_unit_tests/test_user_api_key_auth.py b/tests/proxy_unit_tests/test_user_api_key_auth.py
index 1a6e2eda9a..75f0d5e319 100644
--- a/tests/proxy_unit_tests/test_user_api_key_auth.py
+++ b/tests/proxy_unit_tests/test_user_api_key_auth.py
@@ -359,27 +359,38 @@ async def test_auth_with_allowed_routes(route, should_raise_error):
@pytest.mark.parametrize(
- "route, user_role, expected_result",
+ "route, user_role, should_be_allowed",
[
- # Proxy Admin checks
+ # Admin can access everything
+ ("/config/update", "proxy_admin", True),
("/global/spend/logs", "proxy_admin", True),
- ("/key/delete", "proxy_admin", False),
- ("/key/generate", "proxy_admin", False),
- ("/key/regenerate", "proxy_admin", False),
- # Internal User checks - allowed routes
+ ("/global/activity/cache_hits", "proxy_admin", True),
+ # Internal User - allowed read-only routes
("/global/spend/logs", "internal_user", True),
- ("/key/delete", "internal_user", False),
- ("/key/generate", "internal_user", False),
- ("/key/82akk800000000jjsk/regenerate", "internal_user", False),
- # Internal User Viewer
- ("/key/generate", "internal_user_viewer", False),
- # Internal User checks - disallowed routes
+ ("/spend/logs/ui", "internal_user", True),
+ ("/global/activity/cache_hits", "internal_user", True),
+ ("/health/services", "internal_user", True),
+ # Internal User - BLOCKED from admin routes (security fix)
+ ("/config/update", "internal_user", False),
+ ("/config/pass_through_endpoint", "internal_user", False),
+ ("/config/field/update", "internal_user", False),
("/organization/member_add", "internal_user", False),
+ # Internal User Viewer - allowed spend routes only
+ ("/spend/logs/ui", "internal_user_viewer", True),
+ ("/global/spend/all_tag_names", "internal_user_viewer", True),
+ # Internal User Viewer - blocked from admin routes
+ ("/config/update", "internal_user_viewer", False),
+ ("/key/generate", "internal_user_viewer", False),
],
)
-def test_is_ui_route_allowed(route, user_role, expected_result):
- from litellm.proxy.auth.auth_checks import _is_ui_route
- from litellm.proxy._types import LiteLLM_UserTable
+def test_ui_token_route_access(route, user_role, should_be_allowed):
+ """
+ Verify that UI tokens (team_id=litellm-dashboard) go through the same
+ RBAC checks as API tokens. Non-admin dashboard users must not be able
+ to access admin-only routes like /config/update.
+ """
+ from litellm.proxy.auth.auth_checks import _is_api_route_allowed
+ from litellm.proxy._types import LiteLLM_UserTable, UserAPIKeyAuth
user_obj = LiteLLM_UserTable(
user_id="3b803c0e-666e-4e99-bd5c-6e534c07e297",
@@ -395,18 +406,36 @@ def test_is_ui_route_allowed(route, user_role, expected_result):
organization_memberships=[],
)
- received_args: dict = {
- "route": route,
- "user_obj": user_obj,
- }
- try:
- assert _is_ui_route(**received_args) == expected_result
- except Exception as e:
- # If expected result is False, we expect an error
- if expected_result is False:
- pass
- else:
- raise e
+ valid_token = UserAPIKeyAuth(
+ user_id="3b803c0e-666e-4e99-bd5c-6e534c07e297",
+ team_id="litellm-dashboard",
+ user_role=user_role,
+ )
+
+ from starlette.datastructures import URL
+ from fastapi import Request
+
+ request = Request(scope={"type": "http"})
+ request._url = URL(url=route)
+
+ if should_be_allowed:
+ result = _is_api_route_allowed(
+ route=route,
+ request=request,
+ request_data={},
+ valid_token=valid_token,
+ user_obj=user_obj,
+ )
+ assert result is True
+ else:
+ with pytest.raises(Exception):
+ _is_api_route_allowed(
+ route=route,
+ request=request,
+ request_data={},
+ valid_token=valid_token,
+ user_obj=user_obj,
+ )
@pytest.mark.parametrize(
@@ -684,7 +713,7 @@ async def test_soft_budget_alert():
def test_is_allowed_route():
- from litellm.proxy.auth.auth_checks import _is_allowed_route
+ from litellm.proxy.auth.auth_checks import _is_api_route_allowed
from litellm.proxy._types import UserAPIKeyAuth
import datetime
@@ -692,7 +721,6 @@ def test_is_allowed_route():
args = {
"route": "/embeddings",
- "token_type": "api",
"request": request,
"request_data": {"input": ["hello world"], "model": "embedding-small"},
"valid_token": UserAPIKeyAuth(
@@ -752,7 +780,7 @@ def test_is_allowed_route():
"user_obj": None,
}
- assert _is_allowed_route(**args)
+ assert _is_api_route_allowed(**args)
@pytest.mark.parametrize(
@@ -836,7 +864,6 @@ async def test_user_api_key_auth_websocket():
with patch(
"litellm.proxy.auth.user_api_key_auth.user_api_key_auth", autospec=True
) as mock_user_api_key_auth:
-
# Make the call to the WebSocket function
await user_api_key_auth_websocket(mock_websocket)
@@ -845,10 +872,14 @@ async def test_user_api_key_auth_websocket():
# Get the request object that was passed to user_api_key_auth
request_arg = mock_user_api_key_auth.call_args.kwargs["request"]
-
+
# Verify that the request has headers set
- assert hasattr(request_arg, "headers"), "Request object should have headers attribute"
- assert "authorization" in request_arg.headers, "Request headers should contain authorization"
+ assert hasattr(
+ request_arg, "headers"
+ ), "Request object should have headers attribute"
+ assert (
+ "authorization" in request_arg.headers
+ ), "Request headers should contain authorization"
assert request_arg.headers["authorization"] == "Bearer some_api_key"
assert (
@@ -1036,7 +1067,10 @@ async def test_jwt_non_admin_team_route_access(monkeypatch):
# Create request
request = Request(
- scope={"type": "http", "headers": [(b"authorization", b"Bearer fake.jwt.token")]}
+ scope={
+ "type": "http",
+ "headers": [(b"authorization", b"Bearer fake.jwt.token")],
+ }
)
request._url = URL(url="/team/new")
@@ -1101,14 +1135,14 @@ async def test_x_litellm_api_key():
ignored_key = "aj12445"
# Create request with headers as bytes
- request = Request(
- scope={
- "type": "http"
- }
- )
+ request = Request(scope={"type": "http"})
request._url = URL(url="/chat/completions")
- valid_token = await user_api_key_auth(request=request, api_key="Bearer " + ignored_key, custom_litellm_key_header=master_key)
+ valid_token = await user_api_key_auth(
+ request=request,
+ api_key="Bearer " + ignored_key,
+ custom_litellm_key_header=master_key,
+ )
assert valid_token.token == hash_token(master_key)
@@ -1123,7 +1157,9 @@ async def test_user_api_key_from_query_param():
from litellm.proxy.proxy_server import hash_token, user_api_key_cache
user_key = "sk-query-1234"
- user_api_key_cache.set_cache(key=hash_token(user_key), value=UserAPIKeyAuth(token=hash_token(user_key)))
+ user_api_key_cache.set_cache(
+ key=hash_token(user_key), value=UserAPIKeyAuth(token=hash_token(user_key))
+ )
setattr(litellm.proxy.proxy_server, "user_api_key_cache", user_api_key_cache)
setattr(litellm.proxy.proxy_server, "master_key", "sk-1234")
@@ -1136,7 +1172,9 @@ async def test_user_api_key_from_query_param():
"query_string": f"alt=sse&key={user_key}".encode(),
}
)
- request._url = URL(url=f"/v1beta/models/gemini:streamGenerateContent?alt=sse&key={user_key}")
+ request._url = URL(
+ url=f"/v1beta/models/gemini:streamGenerateContent?alt=sse&key={user_key}"
+ )
async def return_body():
return b"{}"
@@ -1145,4 +1183,3 @@ async def test_user_api_key_from_query_param():
valid_token = await user_api_key_auth(request=request, api_key="")
assert valid_token.token == hash_token(user_key)
-
diff --git a/tests/test_litellm/llms/bedrock/messages/invoke_transformations/test_anthropic_claude3_transformation.py b/tests/test_litellm/llms/bedrock/messages/invoke_transformations/test_anthropic_claude3_transformation.py
index ea208007cd..570e11e1bb 100644
--- a/tests/test_litellm/llms/bedrock/messages/invoke_transformations/test_anthropic_claude3_transformation.py
+++ b/tests/test_litellm/llms/bedrock/messages/invoke_transformations/test_anthropic_claude3_transformation.py
@@ -3,6 +3,7 @@ import json
import os
import sys
from datetime import datetime
+from unittest.mock import Mock
import pytest
@@ -125,7 +126,8 @@ async def test_bedrock_sse_wrapper_keeps_usage_in_message_start_and_message_delt
assert "usage" in delta_json
assert delta_json["usage"]["cache_creation_input_tokens"] == 1562
assert delta_json["usage"]["cache_read_input_tokens"] == 32392
- assert delta_json["usage"]["input_tokens"] == 3 + 1562 + 32392
+ assert delta_json["usage"]["input_tokens"] == 3
+ assert delta_json["usage"]["output_tokens"] == 8
def test_chunk_parser_usage_transformation():
@@ -402,3 +404,111 @@ def test_bedrock_messages_strips_output_config_with_output_format():
assert "output_config" not in result
assert "output_format" not in result
+
+
+@pytest.mark.asyncio
+async def test_promote_message_stop_usage_preserves_message_delta_output_tokens():
+ """
+ Bedrock unified /messages streaming can send full usage on message_delta and a
+ conflicting smaller usage on message_stop (e.g. output_tokens 9 vs 12).
+ _promote_message_stop_usage must not replace message_delta output_tokens.
+ """
+ cfg = AmazonAnthropicClaudeMessagesConfig()
+
+ async def _stream(): # type: ignore[return-type]
+ yield {
+ "type": "message_delta",
+ "delta": {"stop_reason": "end_turn", "stop_sequence": None},
+ "usage": {
+ "input_tokens": 3,
+ "cache_creation_input_tokens": 10553,
+ "cache_read_input_tokens": 25490,
+ "output_tokens": 12,
+ },
+ }
+ yield {
+ "type": "message_stop",
+ "usage": {"input_tokens": 3, "output_tokens": 9},
+ }
+
+ merged: list[dict] = []
+ async for chunk in cfg._promote_message_stop_usage(_stream()):
+ if isinstance(chunk, dict):
+ merged.append(chunk)
+
+ assert len(merged) >= 1
+ delta_out = merged[0]
+ assert delta_out["type"] == "message_delta"
+ assert delta_out["usage"]["output_tokens"] == 12
+ assert delta_out["usage"]["cache_creation_input_tokens"] == 10553
+ assert delta_out["usage"]["cache_read_input_tokens"] == 25490
+ assert delta_out["usage"]["input_tokens"] == 3
+
+
+@pytest.mark.asyncio
+async def test_unified_bedrock_messages_sse_usage_and_cost_claude_sonnet_46():
+ """
+ End-to-end for Bedrock Invoke Anthropic Messages (unified) streaming path:
+ dict chunks -> _promote_message_stop_usage -> bedrock_sse_wrapper SSE bytes ->
+ same logging reconstruction as Anthropic /messages. Ensures token counts and
+ completion_cost match model_prices for us.anthropic.claude-sonnet-4-6.
+ """
+ from litellm import completion_cost
+ from litellm.proxy.pass_through_endpoints.llm_provider_handlers.anthropic_passthrough_logging_handler import (
+ AnthropicPassthroughLoggingHandler,
+ )
+
+ cfg = AmazonAnthropicClaudeMessagesConfig()
+
+ async def _stream(): # type: ignore[return-type]
+ yield {
+ "type": "message_delta",
+ "delta": {"stop_reason": "end_turn", "stop_sequence": None},
+ "usage": {
+ "input_tokens": 3,
+ "cache_creation_input_tokens": 10553,
+ "cache_read_input_tokens": 25490,
+ "output_tokens": 12,
+ },
+ }
+ yield {
+ "type": "message_stop",
+ "usage": {"input_tokens": 3, "output_tokens": 9},
+ }
+
+ logging_obj = LiteLLMLoggingObj(
+ model="bedrock/us.anthropic.claude-sonnet-4-6",
+ messages=[{"role": "user", "content": "Hello"}],
+ stream=True,
+ call_type="chat",
+ start_time=datetime.now(),
+ litellm_call_id="test_unified_bedrock_messages_sse_cost",
+ function_id="test_unified_bedrock_messages_sse_cost",
+ )
+
+ collected: list[bytes] = []
+ async for sse in cfg.bedrock_sse_wrapper(
+ completion_stream=_stream(),
+ litellm_logging_obj=logging_obj,
+ request_body={"model": "us.anthropic.claude-sonnet-4-6"},
+ ):
+ collected.append(sse)
+
+ built = AnthropicPassthroughLoggingHandler._build_complete_streaming_response(
+ all_chunks=collected,
+ model="us.anthropic.claude-sonnet-4-6",
+ litellm_logging_obj=Mock(),
+ )
+ assert built.usage is not None
+ assert built.usage.completion_tokens == 12
+ assert built.usage.prompt_tokens == 36046
+ assert built.usage.total_tokens == 36058
+ assert built.usage.cache_creation_input_tokens == 10553
+ assert built.usage.cache_read_input_tokens == 25490
+
+ cost = completion_cost(
+ completion_response=built,
+ model="bedrock/us.anthropic.claude-sonnet-4-6",
+ custom_llm_provider="bedrock",
+ )
+ assert cost == pytest.approx(0.052150725, rel=0, abs=1e-9)
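The token assertions in the test above follow from simple arithmetic on the `message_delta` usage block. The sketch below reproduces it, assuming that `prompt_tokens` is the sum of uncached input, cache-creation, and cache-read tokens (which is consistent with the asserted values); `summarize_usage` is a hypothetical helper, not a LiteLLM function.

```python
def summarize_usage(usage: dict) -> dict:
    """Fold Anthropic-style usage fields into prompt/completion/total counts."""
    prompt_tokens = (
        usage.get("input_tokens", 0)
        + usage.get("cache_creation_input_tokens", 0)
        + usage.get("cache_read_input_tokens", 0)
    )
    completion_tokens = usage.get("output_tokens", 0)
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }


if __name__ == "__main__":
    # Values copied from the message_delta chunk in the test above.
    delta_usage = {
        "input_tokens": 3,
        "cache_creation_input_tokens": 10553,
        "cache_read_input_tokens": 25490,
        "output_tokens": 12,
    }
    print(summarize_usage(delta_usage))
```

This also shows why `input_tokens` must stay at 3 in the streamed chunk: if the cache counts were already folded into it, summing again downstream would double-count them.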
diff --git a/tests/test_litellm/proxy/management_endpoints/test_key_management_endpoints.py b/tests/test_litellm/proxy/management_endpoints/test_key_management_endpoints.py
index 72092a97f5..096e0b2bc4 100644
--- a/tests/test_litellm/proxy/management_endpoints/test_key_management_endpoints.py
+++ b/tests/test_litellm/proxy/management_endpoints/test_key_management_endpoints.py
@@ -8579,3 +8579,170 @@ def test_enforce_upperbound_no_config_is_noop():
assert data.tpm_limit == 999999
finally:
litellm.upperbound_key_generate_params = original
+
+
+class TestAllowedRoutesCallerPermission:
+ """
+ Non-admins must not be able to set `allowed_routes` on a key. The field
+ bypasses the role-based route gate in
+ RouteChecks.non_proxy_admin_allowed_routes_check, so allowing a non-admin
+ to populate it grants them arbitrary endpoint access.
+ """
+
+ @pytest.mark.asyncio
+ async def test_non_admin_generate_key_with_allowed_routes_rejected(self):
+ data = GenerateKeyRequest(
+ key_alias="escalate",
+ allowed_routes=["/*"],
+ )
+ user_api_key_dict = UserAPIKeyAuth(
+ user_id="internal-user-123",
+ user_role=LitellmUserRoles.INTERNAL_USER,
+ )
+ mock_prisma_client = AsyncMock()
+
+ with patch("litellm.proxy.proxy_server.prisma_client", mock_prisma_client), patch(
+ "litellm.proxy.proxy_server.user_api_key_cache", MagicMock()
+ ), patch("litellm.proxy.proxy_server.user_custom_key_generate", None), patch(
+ "litellm.proxy.management_endpoints.key_management_endpoints._common_key_generation_helper",
+ new_callable=AsyncMock,
+ return_value=MagicMock(),
+ ):
+ with pytest.raises(ProxyException) as exc_info:
+ await generate_key_fn(
+ data=data,
+ user_api_key_dict=user_api_key_dict,
+ litellm_changed_by=None,
+ )
+ assert str(exc_info.value.code) == "403"
+ assert "allowed_routes" in str(exc_info.value.message)
+
+ @pytest.mark.asyncio
+ async def test_admin_generate_key_with_allowed_routes_allowed(self):
+ data = GenerateKeyRequest(
+ key_alias="admin-key",
+ allowed_routes=["/chat/completions"],
+ user_id="admin-user",
+ )
+ user_api_key_dict = UserAPIKeyAuth(
+ user_id="admin-user",
+ user_role=LitellmUserRoles.PROXY_ADMIN,
+ )
+ mock_prisma_client = AsyncMock()
+ stub_response = MagicMock()
+
+ with patch("litellm.proxy.proxy_server.prisma_client", mock_prisma_client), patch(
+ "litellm.proxy.proxy_server.user_api_key_cache", MagicMock()
+ ), patch("litellm.proxy.proxy_server.user_custom_key_generate", None), patch(
+ "litellm.proxy.management_endpoints.key_management_endpoints._common_key_generation_helper",
+ new_callable=AsyncMock,
+ return_value=stub_response,
+ ):
+ result = await generate_key_fn(
+ data=data,
+ user_api_key_dict=user_api_key_dict,
+ litellm_changed_by=None,
+ )
+ assert result is stub_response
+
+ @pytest.mark.asyncio
+ async def test_non_admin_generate_key_default_empty_allowed_routes_ok(self):
+ """
+ Regression guard: GenerateKeyRequest.allowed_routes defaults to [], so
+ the helper must treat empty-list as "not set" or every non-admin key
+ creation breaks.
+ """
+ data = GenerateKeyRequest(key_alias="plain-key")
+ user_api_key_dict = UserAPIKeyAuth(
+ user_id="internal-user-123",
+ user_role=LitellmUserRoles.INTERNAL_USER,
+ )
+ mock_prisma_client = AsyncMock()
+ stub_response = MagicMock()
+
+ with patch("litellm.proxy.proxy_server.prisma_client", mock_prisma_client), patch(
+ "litellm.proxy.proxy_server.user_api_key_cache", MagicMock()
+ ), patch("litellm.proxy.proxy_server.user_custom_key_generate", None), patch(
+ "litellm.proxy.management_endpoints.key_management_endpoints._common_key_generation_helper",
+ new_callable=AsyncMock,
+ return_value=stub_response,
+ ):
+ result = await generate_key_fn(
+ data=data,
+ user_api_key_dict=user_api_key_dict,
+ litellm_changed_by=None,
+ )
+ assert result is stub_response
+
+ @pytest.mark.asyncio
+ async def test_non_admin_update_key_with_allowed_routes_rejected(self):
+ from litellm.proxy.management_endpoints.key_management_endpoints import (
+ update_key_fn,
+ )
+
+ data = UpdateKeyRequest(key="sk-test", allowed_routes=["/*"])
+ user_api_key_dict = UserAPIKeyAuth(
+ user_id="internal-user-123",
+ user_role=LitellmUserRoles.INTERNAL_USER,
+ )
+ mock_prisma_client = AsyncMock()
+
+ with patch("litellm.proxy.proxy_server.prisma_client", mock_prisma_client), patch(
+ "litellm.proxy.proxy_server.user_api_key_cache", MagicMock()
+ ), patch("litellm.proxy.proxy_server.user_custom_key_update", None), patch(
+ "litellm.proxy.proxy_server.llm_router", None
+ ), patch("litellm.proxy.proxy_server.premium_user", True), patch(
+ "litellm.proxy.proxy_server.proxy_logging_obj", MagicMock()
+ ), patch(
+ "litellm.proxy.management_endpoints.key_management_endpoints._get_and_validate_existing_key",
+ new_callable=AsyncMock,
+ return_value=MagicMock(),
+ ):
+ with pytest.raises(ProxyException) as exc_info:
+ await update_key_fn(
+ request=MagicMock(),
+ data=data,
+ user_api_key_dict=user_api_key_dict,
+ litellm_changed_by=None,
+ )
+ assert str(exc_info.value.code) == "403"
+ assert "allowed_routes" in str(exc_info.value.message)
+
+
+def test_jinja_prompt_manager_is_sandboxed():
+ """
+ PromptManager renders user-supplied templates via /prompts/test, so its
+ jinja env must reject access to unsafe Python attributes like
+ ``__class__`` and ``__mro__``.
+ """
+ from jinja2.exceptions import SecurityError
+
+ from litellm.integrations.dotprompt.prompt_manager import PromptManager
+
+ pm = PromptManager()
+ template = pm.jinja_env.from_string("{{ ''.__class__.__mro__ }}")
+ with pytest.raises(SecurityError):
+ template.render()
+
+
+def test_validate_public_image_url_rejects_local_paths():
+ from litellm.proxy.ui_crud_endpoints.proxy_setting_endpoints import (
+ _validate_public_image_url,
+ )
+
+ for bad in ("/etc/passwd", "file:///etc/passwd", "../../etc/passwd"):
+ with pytest.raises(HTTPException) as exc_info:
+ _validate_public_image_url(bad, "logo_url")
+ assert exc_info.value.status_code == 400
+
+
+def test_validate_public_image_url_accepts_http_and_noop_empty():
+ from litellm.proxy.ui_crud_endpoints.proxy_setting_endpoints import (
+ _validate_public_image_url,
+ )
+
+ _validate_public_image_url("https://example.com/logo.png", "logo_url")
+ _validate_public_image_url("http://cdn.internal/logo.svg", "logo_url")
+ _validate_public_image_url(None, "logo_url")
+ _validate_public_image_url("", "logo_url")
+ _validate_public_image_url(" ", "logo_url")
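The behavior pinned down by `TestAllowedRoutesCallerPermission` above can be summarized in a few lines. This is a sketch under stated assumptions: the body of `_check_allowed_routes_caller_permission` is not shown in this section, so the function, exception class, and role strings below are hypothetical and only mirror what the tests assert (403 for non-admins, empty list treated as "not set").

```python
from typing import List, Optional


class PermissionDenied(Exception):
    """Hypothetical stand-in for the proxy's 403 error."""

    status_code = 403


def check_allowed_routes_permission(
    allowed_routes: Optional[List[str]], caller_role: str
) -> None:
    # None and [] both mean the caller did not set the field, so there is
    # nothing to authorize; the request default is an empty list.
    if not allowed_routes:
        return
    # Any non-empty value bypasses role-based route checks on the new key,
    # so only admins may supply it.
    if caller_role != "proxy_admin":
        raise PermissionDenied("Only proxy admins can set allowed_routes on a key.")


if __name__ == "__main__":
    check_allowed_routes_permission([], "internal_user")
    check_allowed_routes_permission(["/chat/completions"], "proxy_admin")
    try:
        check_allowed_routes_permission(["/*"], "internal_user")
    except PermissionDenied as err:
        print(err.status_code, err)
```

Treating the empty list as unset is the regression guard the third test describes: without it, every key created by a non-admin would be rejected.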
diff --git a/ui/litellm-dashboard/src/components/user_edit_view.test.tsx b/ui/litellm-dashboard/src/components/user_edit_view.test.tsx
index 7f78ef4512..7aeaae94fa 100644
--- a/ui/litellm-dashboard/src/components/user_edit_view.test.tsx
+++ b/ui/litellm-dashboard/src/components/user_edit_view.test.tsx
@@ -1,6 +1,6 @@
-import { screen, waitFor } from "@testing-library/react";
+import { cleanup, screen, waitFor } from "@testing-library/react";
import userEvent from "@testing-library/user-event";
-import { beforeEach, describe, expect, it, vi } from "vitest";
+import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { renderWithProviders } from "../../tests/test-utils";
import { UserEditView } from "./user_edit_view";
@@ -140,6 +140,15 @@ describe("UserEditView", () => {
vi.clearAllMocks();
});
+ afterEach(() => {
+ // Tremor's internal Tooltip sets a setTimeout that fires after teardown,
+ // causing "window is not defined". Flush pending timers before cleanup.
+ vi.useFakeTimers();
+ vi.runAllTimers();
+ vi.useRealTimers();
+ cleanup();
+ });
+
it("should render", async () => {
renderWithProviders();