Merge remote-tracking branch 'origin' into litellm_oss_staging_04_09_2026

2026-04-10 16:41:27 -07:00 · 2026-04-10 16:41:27 -07:00 · 9a0487553d
commit 9a0487553d
parent bb7ac7c4ca 4e12d3c562
24 changed files with 826 additions and 238 deletions
--- a/docs/my-website/blog/april_townhall_announcement/index.md
+++ b/docs/my-website/blog/april_townhall_announcement/index.md
@ -4,6 +4,7 @@ title: "April Townhall: Security + Product Roadmap"
 date: 2026-04-02T07:30:00
 authors:
  - krrish
+  - ishaan-alt
 description: "Join the LiteLLM April townhall on Friday, 10 April at 7:30 AM to learn about LiteLLM's security and product roadmap."
 tags: [announcement, townhall]
 hide_table_of_contents: true
--- a/docs/my-website/blog/april_townhall_updates/index.md
+++ b/docs/my-website/blog/april_townhall_updates/index.md
@ -0,0 +1,162 @@
+---
+slug: april-townhall-updates
+title: "April Townhall Updates: CI/CD v2, Stability, and Product Roadmap"
+date: 2026-04-10T12:00:00
+authors:
+  - krrish
+  - ishaan-alt
+description: "A recap of the April LiteLLM town hall covering CI/CD v2, product stability work, and the near-term roadmap."
+tags: [townhall, security, reliability, product]
+hide_table_of_contents: false
+---
+
+import Image from '@theme/IdealImage';
+
+Thank you to everyone who joined our April town hall.
+
+We used the session to share our CI/CD v2 improvements, product stability work, and what we are prioritizing next across reliability and product roadmap.
+
+{/* truncate */}
+
+## CI/CD v2 improvements
+
+Our CI/CD v2 work is centered around four goals:
+
+1. **Limit** what each package can access
+2. **Reduce** the number of sensitive environment variables
+3. **Avoid** compromised packages
+4. **Reduce the risk of** release tampering
+
+#### New architecture: isolated environments
+
+We have begun moving to isolated environments for distinct CI/CD stages to reduce the chance that a single compromised step can inherit broad access across the entire pipeline.
+
+<Image
+  img={require('../../img/april_townhall_isolated_environments.png')}
+  style={{width: '900px', height: 'auto', display: 'block'}}
+/>
+
+#### Current rollout status
+
+These changes are deployed in our current release workflow. [See the release tags](https://github.com/BerriAI/litellm/tags).
+
+#### Independently verify releases
+
+A key part of CI/CD v2 is supporting independent verification of release artifacts using our published verification process, while reducing reliance on any single credential or release path.
+
+[**Learn more about how to verify releases**](https://docs.litellm.ai/docs/proxy/docker_image_security)
+
+<Image
+  img={require('../../img/verify_releases.png')}
+  style={{width: '900px', height: 'auto', display: 'block'}}
+/>
+
+## Stability improvements
+
+### SDLC improvements
+
+This month, we're focusing on process stability improvements around:
+- Improving main-branch stability
+- Mapping UI QA to built Docker images for 1:1 environment parity
+- Keeping release tags consistent across PyPI and Docker
+- Fixing release notes publication
+
+#### Improving main-branch stability
+
+We're introducing a staging-gated flow:
+
+<Image
+  img={require('../../img/stable_main.png')}
+  style={{width: '900px', height: 'auto', display: 'block'}}
+/>
+
+- Only an internal staging branch can push to `main`.
+- PRs to that staging branch must pass CircleCI LLM API testing.
+- Collision handling happens on staging, which is designed to reduce unstable changes reaching `main`.
+
+#### UI QA in Docker environment
+
+Moving forward, all UI QA will be performed in the built Docker image that users run.
+
+Previously, some UI QA paths were run in local environments that did not fully replicate Docker runtime conditions.
+
+That contributed to release-specific issues, including MCP registration problems in `v1.82.3`.
+
+#### Consistent release tags
+
+Today we publish releases for multiple scenarios:
+- Dev (built off a PR for a customer-specific scenario)
+- Nightly (Passes all CI/CD checks)
+- Release Candidate (Passes all CI/CD checks + manual UI QA)
+- Stable (intended to pass all CI/CD checks + manual UI QA + 7 days of production testing)
+
+We are targeting a consistent naming convention across PyPI and Docker by the end of April.
+
+#### Release notes
+
+CI/CD v2 changes moved release notes to a manual path. This is a temporary solution while we investigate a better automated workflow. We are targeting a more consistent process by the end of April.
+
+### Product stability improvements
+
+#### Stable Prisma migrations
+
+We have observed several classes of migration failure:
+- Migration not applied
+- Migration marked applied but incomplete
+- Migration not applied due to non-root image issues
+
+We're prioritizing this work this month and have assigned an engineering owner to the effort. Our target is to resolve these error classes by the end of April.
+
+#### UI type safety
+
+Another area of focus is improving the stability of the UI. Today, one cause of errors is that the UI maintains its own assumptions about backend API types. This can lead to issues when backend responses differ from UI assumptions.
+
+We aim to keep the UI and backend types in sync, and are exploring OpenAPI-driven mapping to achieve this.
+
+## Product roadmap
+
+### Our Assumptions
+
+Over the next few years, we expect:
+- Companies will give employees more AI tools.
+- More AI agents will move into production workflows across HR, finance, support, and operations.
+
+### Our Inferences
+#### Near-term
+
+- AI spend will increase.
+- Uptime and latency will become even more important.
+- More AI resources (skills, CLIs, and related assets) will require governance.
+- Agent and MCP usage patterns will require deeper controls.
+- Broader developer adoption will increase the need for simpler, more discoverable tooling.
+
+#### Long-term
+
+- We expect many organizations to treat agent auditability (how decisions were made across LLM + MCP + sub-agent inputs/outputs) as a compliance expectation.
+- Permission management will get more complex as user-agent interaction chains deepen.
+
+Roadmap timelines in this post are targets and may evolve based on validation and user feedback.
+
+## April investments
+
+### Reliability
+
+- Increase uptime for 10k+ RPS scenarios.
+- Investigate latency overhead for long-running Claude Code requests.
+
+### Feature reliability
+
+- Polish MCP authentication.
+- Better understand how teams are using agents through LiteLLM.
+
+### Governance
+
+- Launch Skills as a first-class citizen in LiteLLM.
+
+## Q&A
+
+Thank you again for all the questions and direct feedback. We will keep sharing concrete progress updates as these efforts ship.
+
+## Hiring
+
+We are actively hiring across several roles. If you're interested, please apply [here](https://jobs.ashbyhq.com/litellm)!
--- a/docs/my-website/blog/authors.yml
+++ b/docs/my-website/blog/authors.yml
@ -24,7 +24,7 @@ ishaan:

 # Alias for typo in name
 ishaan-alt:
-  name: Ishaan Jaff
+  name: Ishaan Jaffer
  title: CTO, LiteLLM
  url: https://www.linkedin.com/in/reffajnaahsi/
  image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
--- a/docs/my-website/img/april_townhall_isolated_environments.png
+++ b/docs/my-website/img/april_townhall_isolated_environments.png
--- a/docs/my-website/img/stable_main.png
+++ b/docs/my-website/img/stable_main.png
--- a/docs/my-website/img/verify_releases.png
+++ b/docs/my-website/img/verify_releases.png
--- a/litellm/integrations/dotprompt/prompt_manager.py
+++ b/litellm/integrations/dotprompt/prompt_manager.py
@ -7,7 +7,8 @@ from pathlib import Path
 from typing import Any, Dict, List, Optional, Tuple, Union

 import yaml
-from jinja2 import DictLoader, Environment, select_autoescape
+from jinja2 import DictLoader, select_autoescape
+from jinja2.sandbox import ImmutableSandboxedEnvironment


 class PromptTemplate:
@ -59,7 +60,10 @@ class PromptManager:
        self.prompt_directory = Path(prompt_directory) if prompt_directory else None
        self.prompts: Dict[str, PromptTemplate] = {}
        self.prompt_file = prompt_file
-        self.jinja_env = Environment(
+        # Sandboxed env: templates can come from user input via /prompts/test,
+        # so we must block access to unsafe Python attributes and mutation of
+        # caller-supplied mutables.
+        self.jinja_env = ImmutableSandboxedEnvironment(
            loader=DictLoader({}),
            autoescape=select_autoescape(["html", "xml"]),
            # Use Handlebars-style delimiters to match Dotprompt spec
--- a/litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py
+++ b/litellm/llms/bedrock/messages/invoke_transformations/anthropic_claude3_transformation.py
@ -538,9 +538,9 @@ class AmazonAnthropicClaudeMessagesConfig(
        merges usage from message_start and message_delta but ignores
        message_stop. This method buffers message_delta and, when
        message_stop arrives with cache usage, merges those fields into the
-        message_delta usage and also updates the input_tokens on
-        message_delta to include the full count (uncached + cache_creation +
-        cache_read).
+        message_delta usage. input_tokens is kept as the uncached-only
+        count; downstream calculate_usage adds cache tokens to
+        prompt_tokens.
        """
        _CACHE_FIELDS = ("cache_creation_input_tokens", "cache_read_input_tokens")
        pending_delta = None
@ -569,12 +569,7 @@ class AmazonAnthropicClaudeMessagesConfig(

                raw_input = stop_usage.get("input_tokens")
                if raw_input is not None:
-                    uncached = raw_input if isinstance(raw_input, int) else 0
-                    raw_cc = delta_usage.get("cache_creation_input_tokens", 0)
-                    cache_creation = raw_cc if isinstance(raw_cc, int) else 0
-                    raw_cr = delta_usage.get("cache_read_input_tokens", 0)
-                    cache_read = raw_cr if isinstance(raw_cr, int) else 0
-                    delta_usage["input_tokens"] = uncached + cache_creation + cache_read
+                    delta_usage["input_tokens"] = raw_input if isinstance(raw_input, int) else 0

                if delta_usage:
                    pending_delta["usage"] = delta_usage  # type: ignore[arg-type]
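Note on the hunk above: the updated docstring says the downstream `calculate_usage` step adds cache tokens to `prompt_tokens`. Under that assumption, folding the cache tokens into `input_tokens` here would count them twice. The sketch below illustrates the arithmetic only; `calculate_prompt_tokens` is a hypothetical stand-in for the downstream step, and the token counts are invented.

```python
from typing import Dict

_CACHE_FIELDS = ("cache_creation_input_tokens", "cache_read_input_tokens")


def calculate_prompt_tokens(usage: Dict[str, int]) -> int:
    """Hypothetical downstream step: uncached input plus both cache fields."""
    return usage.get("input_tokens", 0) + sum(usage.get(f, 0) for f in _CACHE_FIELDS)


# Illustrative numbers only.
stop_usage = {"input_tokens": 100}
delta_usage = {"cache_creation_input_tokens": 20, "cache_read_input_tokens": 300}

# Removed behaviour: input_tokens already includes the cache tokens,
# so the downstream step adds them a second time.
old_delta = dict(delta_usage, input_tokens=100 + 20 + 300)

# New behaviour: input_tokens stays the uncached-only count.
new_delta = dict(delta_usage, input_tokens=stop_usage["input_tokens"])

old_total = calculate_prompt_tokens(old_delta)  # 420 + 320
new_total = calculate_prompt_tokens(new_delta)  # 100 + 320
```

With these numbers the removed logic reports 740 prompt tokens where 420 were actually sent.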
--- a/litellm/llms/litellm_proxy/skills/prompt_injection.py
+++ b/litellm/llms/litellm_proxy/skills/prompt_injection.py
@ -5,6 +5,7 @@ Handles extraction of skill content (SKILL.md) from stored ZIP files
 and injection into the system prompt for non-Anthropic models.
 """

+import posixpath
 import zipfile
 from io import BytesIO
 from typing import Any, Dict, List, Optional
@ -103,8 +104,18 @@ class SkillPromptInjectionHandler:
                    else:
                        clean_path = name

-                    if clean_path:
-                        files[clean_path] = zf.read(name)
+                    if not clean_path:
+                        continue
+
+                    # Ensure the path stays within the intended directory
+                    normalized = posixpath.normpath(clean_path)
+                    if normalized.startswith("..") or posixpath.isabs(normalized):
+                        verbose_logger.warning(
+                            f"SkillPromptInjectionHandler: Skipping entry with invalid path in skill {skill.skill_id}: {name}"
+                        )
+                        continue
+
+                    files[normalized] = zf.read(name)
        except Exception as e:
            verbose_logger.warning(
                f"SkillPromptInjectionHandler: Error extracting files from skill {skill.skill_id}: {e}"
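Note on the hunk above: the added check normalizes each archive entry with `posixpath.normpath` and skips anything that is absolute or that still starts with `..` after normalization. The following is a minimal standalone sketch of the same idea, not the handler's code; `read_safe_entries` and `build_zip` are hypothetical helpers and the archive contents are invented.

```python
import posixpath
import zipfile
from io import BytesIO
from typing import Dict


def read_safe_entries(zip_bytes: bytes) -> Dict[str, bytes]:
    """Return {normalized_path: content}, skipping entries that escape the root."""
    files: Dict[str, bytes] = {}
    with zipfile.ZipFile(BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # directory entry, nothing to read
            normalized = posixpath.normpath(name)
            # Same test as the diff: reject parent-directory and absolute paths.
            if normalized.startswith("..") or posixpath.isabs(normalized):
                continue
            files[normalized] = zf.read(name)
    return files


def build_zip(entries: Dict[str, bytes]) -> bytes:
    """Build an in-memory ZIP for the demo, keeping entry names verbatim."""
    buf = BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    return buf.getvalue()


archive = build_zip(
    {
        "skill/SKILL.md": b"# demo",
        "skill/../../etc/passwd": b"x",  # normalizes to ../etc/passwd
        "/abs/path.txt": b"y",           # absolute path
        "a/./b/../c.txt": b"z",          # normalizes to a/c.txt
    }
)
safe = read_safe_entries(archive)
```

One side effect worth knowing: because the test is a plain `startswith("..")`, a harmless top-level name such as `..notes.md` is also skipped. Checking `normalized == ".."` or `normalized.startswith("../")` would be a narrower alternative.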
--- a/litellm/llms/litellm_proxy/skills/sandbox_executor.py
+++ b/litellm/llms/litellm_proxy/skills/sandbox_executor.py
@ -94,9 +94,15 @@ class SkillsSandboxExecutor:

                # Create a temp directory to stage files
                with tempfile.TemporaryDirectory() as tmpdir:
+                    tmpdir_abs = os.path.abspath(tmpdir)
                    for path, content in skill_files.items():
                        # Create the file in temp directory
-                        local_path = os.path.join(tmpdir, path)
+                        local_path = os.path.abspath(os.path.join(tmpdir, path))
+                        if not local_path.startswith(tmpdir_abs + os.sep):
+                            verbose_logger.warning(
+                                f"SkillsSandboxExecutor: Skipping file with invalid path: {path}"
+                            )
+                            continue
                        os.makedirs(os.path.dirname(local_path), exist_ok=True)
                        with open(local_path, "wb") as f:
                            f.write(content)
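Note on the hunk above: the staging loop now resolves each target to an absolute path and only writes it if that path sits under the temporary directory. The sketch below reproduces the containment test in isolation; `stage_files` is a hypothetical helper and the file names are invented.

```python
import os
import tempfile
from typing import Dict, List


def stage_files(tmpdir: str, skill_files: Dict[str, bytes]) -> List[str]:
    """Write files under tmpdir, skipping paths that resolve outside it."""
    tmpdir_abs = os.path.abspath(tmpdir)
    written: List[str] = []
    for path, content in skill_files.items():
        local_path = os.path.abspath(os.path.join(tmpdir, path))
        # Appending os.sep stops a sibling such as "<tmpdir>-evil" from
        # passing a bare prefix comparison.
        if not local_path.startswith(tmpdir_abs + os.sep):
            continue
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        with open(local_path, "wb") as f:
            f.write(content)
        written.append(os.path.relpath(local_path, tmpdir_abs))
    return written


with tempfile.TemporaryDirectory() as tmpdir:
    staged = stage_files(
        tmpdir,
        {
            "scripts/run.py": b"print('ok')",
            "../escape.txt": b"nope",    # resolves to the parent directory
            "/etc/escape.txt": b"nope",  # os.path.join discards tmpdir for absolute paths
        },
    )
```

This check backs up the archive-side filter in `prompt_injection.py`, so a bad path that reaches the executor by another route is still rejected.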
--- a/litellm/proxy/_types.py
+++ b/litellm/proxy/_types.py
@ -494,10 +494,12 @@ class LiteLLMRoutes(enum.Enum):
        "/v2/key/info",
        "/model_group/info",
        "/health",
+        "/health/services",
        "/key/list",
        "/user/filter/ui",
        "/models",
        "/v1/models",
+        "/sso/get/ui_settings",
    ]

    # NOTE: ROUTES ONLY FOR MASTER KEY - only the Master Key should be able to Reset Spend
@ -566,6 +568,8 @@ class LiteLLMRoutes(enum.Enum):
        "/spend/tags",
        "/spend/calculate",
        "/spend/logs",
+        "/spend/logs/ui",
+        "/spend/logs/session/ui",
        "/cost/estimate",
    ]

@ -581,6 +585,7 @@ class LiteLLMRoutes(enum.Enum):
        "/global/spend/report",
        "/global/spend/provider",
        "/global/spend/tags",
+        "/global/spend/all_tag_names",
    ]

    public_routes = set(
@ -602,6 +607,9 @@ class LiteLLMRoutes(enum.Enum):
        ]
    )

+    # Retained for backwards compatibility with JWT auth configs that reference
+    # "ui_routes" in admin_allowed_routes. Not used by the proxy's own route
+    # authorization — UI tokens now go through the same RBAC path as API tokens.
    ui_routes = [
        "/sso",
        "/sso/get/ui_settings",
@ -627,19 +635,16 @@ class LiteLLMRoutes(enum.Enum):

    internal_user_routes = (
        [
-            "/global/spend/tags",
-            "/global/spend/keys",
-            "/global/spend/models",
-            "/global/spend/provider",
-            "/global/spend/end_users",
            "/global/activity",
            "/global/activity/model",
+            "/global/activity/cache_hits",
            "/v1/models/{model_id}",
            "/models/{model_id}",
            "/guardrails/list",
            "/v2/guardrails/list",
        ]
        + spend_tracking_routes
+        + global_spend_tracking_routes
        + key_management_routes
    )

@ -694,6 +699,9 @@ class LiteLLMRoutes(enum.Enum):
        "/tag/list",
        "/audit",
        "/audit/{id}",
+        "/global/activity",
+        "/global/activity/model",
+        "/global/activity/cache_hits",
    ] + info_routes

    # All routes accessible by an Org Admin
@ -892,9 +900,9 @@ class GenerateRequestBase(LiteLLMPydanticObjectBase):
    allowed_cache_controls: Optional[list] = []
    config: Optional[dict] = {}
    permissions: Optional[dict] = {}
-    model_max_budget: Optional[dict] = (
-        {}
-    )  # {"gpt-4": 5.0, "gpt-3.5-turbo": 5.0}, defaults to {}
+    model_max_budget: Optional[
+        dict
+    ] = {}  # {"gpt-4": 5.0, "gpt-3.5-turbo": 5.0}, defaults to {}

    model_config = ConfigDict(protected_namespaces=())
    model_rpm_limit: Optional[dict] = None
@ -1036,9 +1044,9 @@ class RegenerateKeyRequest(GenerateKeyRequest):
    spend: Optional[float] = None
    metadata: Optional[dict] = None
    new_master_key: Optional[str] = None
-    grace_period: Optional[str] = (
-        None  # Duration to keep old key valid (e.g. "24h", "2d"); None = immediate revoke
-    )
+    grace_period: Optional[
+        str
+    ] = None  # Duration to keep old key valid (e.g. "24h", "2d"); None = immediate revoke


 class ResetSpendRequest(LiteLLMPydanticObjectBase):
@ -1562,12 +1570,12 @@ class NewCustomerRequest(BudgetNewRequest):
    blocked: bool = False  # allow/disallow requests for this end-user
    budget_id: Optional[str] = None  # give either a budget_id or max_budget
    spend: Optional[float] = None
-    allowed_model_region: Optional[AllowedModelRegion] = (
-        None  # require all user requests to use models in this specific region
-    )
-    default_model: Optional[str] = (
-        None  # if no equivalent model in allowed region - default all requests to this model
-    )
+    allowed_model_region: Optional[
+        AllowedModelRegion
+    ] = None  # require all user requests to use models in this specific region
+    default_model: Optional[
+        str
+    ] = None  # if no equivalent model in allowed region - default all requests to this model
    object_permission: Optional[LiteLLM_ObjectPermissionBase] = None

    @model_validator(mode="before")
@ -1590,12 +1598,12 @@ class UpdateCustomerRequest(LiteLLMPydanticObjectBase):
    blocked: bool = False  # allow/disallow requests for this end-user
    max_budget: Optional[float] = None
    budget_id: Optional[str] = None  # give either a budget_id or max_budget
-    allowed_model_region: Optional[AllowedModelRegion] = (
-        None  # require all user requests to use models in this specific region
-    )
-    default_model: Optional[str] = (
-        None  # if no equivalent model in allowed region - default all requests to this model
-    )
+    allowed_model_region: Optional[
+        AllowedModelRegion
+    ] = None  # require all user requests to use models in this specific region
+    default_model: Optional[
+        str
+    ] = None  # if no equivalent model in allowed region - default all requests to this model
    object_permission: Optional[LiteLLM_ObjectPermissionBase] = None


@ -1685,15 +1693,15 @@ class NewTeamRequest(TeamBase):
    ] = None  # raise an error if 'guaranteed_throughput' is set and we're overallocating tpm

    model_tpm_limit: Optional[Dict[str, int]] = None
-    team_member_budget: Optional[float] = (
-        None  # allow user to set a budget for all team members
-    )
-    team_member_rpm_limit: Optional[int] = (
-        None  # allow user to set RPM limit for all team members
-    )
-    team_member_tpm_limit: Optional[int] = (
-        None  # allow user to set TPM limit for all team members
-    )
+    team_member_budget: Optional[
+        float
+    ] = None  # allow user to set a budget for all team members
+    team_member_rpm_limit: Optional[
+        int
+    ] = None  # allow user to set RPM limit for all team members
+    team_member_tpm_limit: Optional[
+        int
+    ] = None  # allow user to set TPM limit for all team members
    team_member_key_duration: Optional[str] = None  # e.g. "1d", "1w", "1m"
    team_member_budget_duration: Optional[str] = None  # e.g. "30d", "1mo"
    allowed_vector_store_indexes: Optional[List[AllowedVectorStoreIndexItem]] = None
@ -1790,9 +1798,9 @@ class BlockKeyRequest(LiteLLMPydanticObjectBase):

 class AddTeamCallback(LiteLLMPydanticObjectBase):
    callback_name: str
-    callback_type: Optional[Literal["success", "failure", "success_and_failure"]] = (
-        "success_and_failure"
-    )
+    callback_type: Optional[
+        Literal["success", "failure", "success_and_failure"]
+    ] = "success_and_failure"
    callback_vars: Dict[str, str]

    @model_validator(mode="before")
@ -2134,9 +2142,9 @@ class ConfigList(LiteLLMPydanticObjectBase):
    stored_in_db: Optional[bool]
    field_default_value: Any
    premium_field: bool = False
-    nested_fields: Optional[List[FieldDetail]] = (
-        None  # For nested dictionary or Pydantic fields
-    )
+    nested_fields: Optional[
+        List[FieldDetail]
+    ] = None  # For nested dictionary or Pydantic fields


 class UserHeaderMapping(LiteLLMPydanticObjectBase):
@ -2495,9 +2503,9 @@ class UserAPIKeyAuth(
    user_max_budget: Optional[float] = None
    request_route: Optional[str] = None
    user: Optional[Any] = None  # Expanded user object when expand=user is used
-    created_by_user: Optional[Any] = (
-        None  # Expanded created_by user when expand=user is used
-    )
+    created_by_user: Optional[
+        Any
+    ] = None  # Expanded created_by user when expand=user is used
    end_user_object_permission: Optional[LiteLLM_ObjectPermissionTable] = None
    # Decoded upstream IdP claims (groups, roles, etc.) propagated by JWT auth machinery
    # and forwarded into outbound tokens by guardrails such as MCPJWTSigner.
@ -2636,9 +2644,9 @@ class LiteLLM_OrganizationMembershipTable(LiteLLMPydanticObjectBase):
    budget_id: Optional[str] = None
    created_at: datetime
    updated_at: datetime
-    user: Optional[Any] = (
-        None  # You might want to replace 'Any' with a more specific type if available
-    )
+    user: Optional[
+        Any
+    ] = None  # You might want to replace 'Any' with a more specific type if available
    litellm_budget_table: Optional[LiteLLM_BudgetTable] = None
    user_email: Optional[str] = None

@ -3793,9 +3801,9 @@ class TeamModelDeleteRequest(BaseModel):
 # Organization Member Requests
 class OrganizationMemberAddRequest(OrgMemberAddRequest):
    organization_id: str
-    max_budget_in_organization: Optional[float] = (
-        None  # Users max budget within the organization
-    )
+    max_budget_in_organization: Optional[
+        float
+    ] = None  # Users max budget within the organization


 class OrganizationMemberDeleteRequest(MemberDeleteRequest):
@ -4050,9 +4058,9 @@ class ProviderBudgetResponse(LiteLLMPydanticObjectBase):
    Maps provider names to their budget configs.
    """

-    providers: Dict[str, ProviderBudgetResponseObject] = (
-        {}
-    )  # Dictionary mapping provider names to their budget configurations
+    providers: Dict[
+        str, ProviderBudgetResponseObject
+    ] = {}  # Dictionary mapping provider names to their budget configurations


 class ProxyStateVariables(TypedDict):
@ -4214,9 +4222,9 @@ class LiteLLM_JWTAuth(LiteLLMPydanticObjectBase):
    enforce_rbac: bool = False
    roles_jwt_field: Optional[str] = None  # v2 on role mappings
    role_mappings: Optional[List[RoleMapping]] = None
-    object_id_jwt_field: Optional[str] = (
-        None  # can be either user / team, inferred from the role mapping
-    )
+    object_id_jwt_field: Optional[
+        str
+    ] = None  # can be either user / team, inferred from the role mapping
    scope_mappings: Optional[List[ScopeMapping]] = None
    enforce_scope_based_access: bool = False
    enforce_team_based_model_access: bool = False
--- a/litellm/proxy/agent_endpoints/endpoints.py
+++ b/litellm/proxy/agent_endpoints/endpoints.py
@ -28,6 +28,7 @@ from litellm.types.agents import (
    MakeAgentsPublicRequest,
    PatchAgentRequest,
 )
+from litellm.litellm_core_utils.litellm_logging import _get_masked_values
 from litellm.types.llms.custom_http import httpxSpecialProvider
 from litellm.types.proxy.management_endpoints.common_daily_activity import (
    SpendAnalyticsPaginatedResponse,
@ -36,6 +37,28 @@ from litellm.types.proxy.management_endpoints.common_daily_activity import (
 router = APIRouter()


+def _redact_sensitive_agent_fields(
+    agents: List[AgentResponse],
+) -> List[AgentResponse]:
+    """
+    Return copies of the given agents with sensitive configuration fields
+    redacted.  The original objects are not modified.
+    """
+    redacted: List[AgentResponse] = []
+    for agent in agents:
+        copy = agent.model_copy(deep=True)
+        copy.static_headers = None
+        copy.extra_headers = None
+        if copy.litellm_params:
+            copy.litellm_params = _get_masked_values(
+                copy.litellm_params,
+                unmasked_length=4,
+                number_of_asterisks=4,
+            )
+        redacted.append(copy)
+    return redacted
+
+
 def _check_agent_management_permission(user_api_key_dict: UserAPIKeyAuth) -> None:
    """
    Raises HTTP 403 if the caller does not have permission to create, update,
@ -183,6 +206,14 @@ async def get_agents(
                agent.agent_id in litellm.public_agent_groups
            )

+        # Redact sensitive fields for non-admin users
+        is_admin = (
+            user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN
+            or user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
+        )
+        if not is_admin:
+            returned_agents = _redact_sensitive_agent_fields(returned_agents)
+
        if health_check:
            agents_with_url = [
                agent
@ -399,6 +430,14 @@ async def get_agent_by_id(
                status_code=404, detail=f"Agent with ID {agent_id} not found"
            )

+        # Redact sensitive fields for non-admin users
+        is_admin = (
+            user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN
+            or user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
+        )
+        if not is_admin:
+            agent = _redact_sensitive_agent_fields([agent])[0]
+
        return agent
    except HTTPException:
        raise
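Note on the hunks above: non-admin callers now get deep copies of agents with `static_headers` and `extra_headers` cleared and `litellm_params` masked, while the stored objects stay unchanged. The sketch below shows that copy-then-redact pattern with standard-library dataclasses. `Agent`, `mask_params` and `SENSITIVE_HINTS` are hypothetical. In particular, `mask_params` is an assumed masking rule and does not reproduce what `_get_masked_values` actually does.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List, Optional

SENSITIVE_HINTS = ("key", "secret", "token", "password")  # assumption for this sketch


def mask_params(
    params: Dict[str, str], unmasked_length: int = 4, number_of_asterisks: int = 4
) -> Dict[str, str]:
    """Hypothetical masking rule: keep a short prefix of sensitive-looking values."""
    masked: Dict[str, str] = {}
    for name, value in params.items():
        if any(hint in name.lower() for hint in SENSITIVE_HINTS):
            masked[name] = value[:unmasked_length] + "*" * number_of_asterisks
        else:
            masked[name] = value
    return masked


@dataclass
class Agent:
    agent_id: str
    static_headers: Optional[Dict[str, str]] = None
    extra_headers: Optional[List[str]] = None
    litellm_params: Dict[str, str] = field(default_factory=dict)


def redact_agents(agents: List[Agent]) -> List[Agent]:
    """Return redacted deep copies; the input objects are left untouched."""
    redacted: List[Agent] = []
    for agent in agents:
        clone = copy.deepcopy(agent)
        clone.static_headers = None
        clone.extra_headers = None
        if clone.litellm_params:
            clone.litellm_params = mask_params(clone.litellm_params)
        redacted.append(clone)
    return redacted


original = Agent(
    agent_id="a1",
    static_headers={"Authorization": "Bearer abc"},
    litellm_params={"api_key": "sk-1234567890", "model": "gpt-4"},
)
visible = redact_agents([original])[0]
```

Working on a deep copy matters because the same agent objects may be served to an admin later, and redacting them in place would alter what the admin sees.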
--- a/litellm/proxy/auth/auth_checks.py
+++ b/litellm/proxy/auth/auth_checks.py
@ -196,9 +196,7 @@ def _is_model_cost_zero(
    return True


-def _is_cost_explicitly_configured(
-    model: str, llm_router: "Router"
-) -> bool:
+def _is_cost_explicitly_configured(model: str, llm_router: "Router") -> bool:
    """
    Check if any deployment in the model group has cost fields explicitly
    set in its litellm.model_cost entry.
@ -215,10 +213,7 @@ def _is_cost_explicitly_configured(
        if model_id is None:
            continue
        raw_entry = litellm.model_cost.get(model_id, {})
-        if (
-            "input_cost_per_token" in raw_entry
-            or "output_cost_per_token" in raw_entry
-        ):
+        if "input_cost_per_token" in raw_entry or "output_cost_per_token" in raw_entry:
            return True
    return False

@ -596,17 +591,12 @@ async def common_checks(  # noqa: PLR0915
        user_object=user_object, route=route, request_body=request_body
    )

-    token_team = getattr(valid_token, "team_id", None)
-    token_type: Literal["ui", "api"] = (
-        "ui" if token_team is not None and token_team == "litellm-dashboard" else "api"
-    )
-    _is_route_allowed = _is_allowed_route(
+    _is_route_allowed = _is_api_route_allowed(
        route=route,
-        token_type=token_type,
-        user_obj=user_object,
        request=request,
        request_data=request_body,
        valid_token=valid_token,
+        user_obj=user_object,
    )

    # 11. [OPTIONAL] Vector store checks - is the object allowed to access the vector store
@ -629,31 +619,6 @@ async def common_checks(  # noqa: PLR0915
    return True


-def _is_ui_route(
-    route: str,
-    user_obj: Optional[LiteLLM_UserTable] = None,
-) -> bool:
-    """
-    - Check if the route is a UI used route
-    """
-    # this token is only used for managing the ui
-    allowed_routes = LiteLLMRoutes.ui_routes.value
-    # check if the current route startswith any of the allowed routes
-    if (
-        route is not None
-        and isinstance(route, str)
-        and any(route.startswith(allowed_route) for allowed_route in allowed_routes)
-    ):
-        # Do something if the current route starts with any of the allowed routes
-        return True
-    elif any(
-        RouteChecks._route_matches_pattern(route=route, pattern=allowed_route)
-        for allowed_route in allowed_routes
-    ):
-        return True
-    return False
-
-
 def _get_user_role(
    user_obj: Optional[LiteLLM_UserTable],
 ) -> Optional[LitellmUserRoles]:
@ -717,30 +682,6 @@ def _is_user_proxy_admin(user_obj: Optional[LiteLLM_UserTable]):
    return False


-def _is_allowed_route(
-    route: str,
-    token_type: Literal["ui", "api"],
-    request: Request,
-    request_data: dict,
-    valid_token: Optional[UserAPIKeyAuth],
-    user_obj: Optional[LiteLLM_UserTable] = None,
-) -> bool:
-    """
-    - Route b/w ui token check and normal token check
-    """
-
-    if token_type == "ui" and _is_ui_route(route=route, user_obj=user_obj):
-        return True
-    else:
-        return _is_api_route_allowed(
-            route=route,
-            request=request,
-            request_data=request_data,
-            valid_token=valid_token,
-            user_obj=user_obj,
-        )
-
-
 def _allowed_routes_check(user_route: str, allowed_routes: list) -> bool:
    """
    Return if a user is allowed to access route. Helper function for `allowed_routes_check`.
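Note on the hunks above: the removed `_is_ui_route` accepted any route that started with an entry in `ui_routes`, and `_is_allowed_route` returned early for dashboard tokens whenever that prefix test passed. The diff does not state the motivation for removing them. The sketch below only shows what a prefix test of that shape accepts. `UI_ROUTES` holds the two entries visible in the context lines (the full list is not shown in this diff), and the probe routes are hypothetical.

```python
from typing import List

# Two entries visible in the diff context; the real list is longer.
UI_ROUTES: List[str] = ["/sso", "/sso/get/ui_settings"]


def old_ui_route_check(route: str, allowed_routes: List[str]) -> bool:
    """Prefix match as in the removed helper (pattern matching omitted)."""
    return any(route.startswith(allowed) for allowed in allowed_routes)


probes = {
    "/sso/get/ui_settings": old_ui_route_check("/sso/get/ui_settings", UI_ROUTES),
    "/sso/hypothetical/admin": old_ui_route_check("/sso/hypothetical/admin", UI_ROUTES),
    "/ssomething": old_ui_route_check("/ssomething", UI_ROUTES),
    "/key/generate": old_ui_route_check("/key/generate", UI_ROUTES),
}
```

After this change, `common_checks` calls `_is_api_route_allowed` directly for every token, so UI tokens go through the same check as API tokens.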
--- a/litellm/proxy/guardrails/guardrail_endpoints.py
+++ b/litellm/proxy/guardrails/guardrail_endpoints.py
@ -60,12 +60,20 @@ def _get_guardrails_list_response(
    """
    Helper function to get the guardrails list response
    """
+    from litellm.litellm_core_utils.litellm_logging import _get_masked_values
+
    guardrail_configs: List[GuardrailInfoResponse] = []
    for guardrail in guardrails_config:
+        litellm_params = guardrail.get("litellm_params") or {}
+        masked_params = _get_masked_values(
+            litellm_params,
+            unmasked_length=4,
+            number_of_asterisks=4,
+        )
        guardrail_configs.append(
            GuardrailInfoResponse(
                guardrail_name=guardrail.get("guardrail_name"),
-                litellm_params=guardrail.get("litellm_params"),
+                litellm_params=masked_params,
                guardrail_info=guardrail.get("guardrail_info"),
            )
        )
--- a/litellm/proxy/management_endpoints/key_management_endpoints.py
+++ b/litellm/proxy/management_endpoints/key_management_endpoints.py
@ -456,6 +456,34 @@ def handle_key_type(data: GenerateKeyRequest, data_json: dict) -> dict:
    return data_json


+def _check_allowed_routes_caller_permission(
+    allowed_routes: Optional[list],
+    user_api_key_dict: UserAPIKeyAuth,
+) -> None:
+    """
+    Only proxy admins may set `allowed_routes` on a key.
+
+    `allowed_routes` bypasses the standard role-based route gate in
+    RouteChecks.non_proxy_admin_allowed_routes_check, so if a non-admin is
+    allowed to set it they can grant themselves access to any endpoint.
+    Non-admins should use `key_type` to pick a preset route bucket instead.
+    """
+    # Empty list is the default on GenerateKeyRequest — treat as "not set".
+    if not allowed_routes:
+        return
+    if user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value:
+        return
+    raise HTTPException(
+        status_code=403,
+        detail={
+            "error": (
+                "Only proxy admins can set `allowed_routes` on a key. "
+                "Use `key_type` to pick a preset route bucket instead."
+            )
+        },
+    )
+
+
 async def validate_team_id_used_in_service_account_request(
    team_id: Optional[str],
    prisma_client: Optional[PrismaClient],
@ -740,9 +768,9 @@ async def _common_key_generation_helper(  # noqa: PLR0915
        request_type="key", **data_json, table_name="key"
    )

-    response[
-        "soft_budget"
-    ] = data.soft_budget  # include the user-input soft budget in the response
+    response["soft_budget"] = (
+        data.soft_budget
+    )  # include the user-input soft budget in the response

    response = GenerateKeyResponse(**response)

@ -1254,6 +1282,12 @@ async def generate_key_fn(
                raise HTTPException(
                    status_code=status.HTTP_403_FORBIDDEN, detail=message
                )
+
+        _check_allowed_routes_caller_permission(
+            allowed_routes=data.allowed_routes,
+            user_api_key_dict=user_api_key_dict,
+        )
+
        # For non-admin internal users: auto-assign caller's user_id if not provided
        # This prevents creating unbound keys with no user association (LIT-1884)
        _is_proxy_admin = (
@ -1888,6 +1922,11 @@ async def _validate_update_key_data(
    """Validate permissions and constraints for key update."""
    _is_proxy_admin = user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value

+    _check_allowed_routes_caller_permission(
+        allowed_routes=data.allowed_routes,
+        user_api_key_dict=user_api_key_dict,
+    )
+
    # Prevent non-admin from removing user_id (setting to empty string) (LIT-1884)
    if data.user_id is not None and data.user_id == "" and not _is_proxy_admin:
        raise HTTPException(
@ -3233,10 +3272,10 @@ async def delete_verification_tokens(
    try:
        if prisma_client:
            tokens = [_hash_token_if_needed(token=key) for key in tokens]
-            _keys_being_deleted: List[
-                LiteLLM_VerificationToken
-            ] = await prisma_client.db.litellm_verificationtoken.find_many(
-                where={"token": {"in": tokens}}
+            _keys_being_deleted: List[LiteLLM_VerificationToken] = (
+                await prisma_client.db.litellm_verificationtoken.find_many(
+                    where={"token": {"in": tokens}}
+                )
            )

            if len(_keys_being_deleted) == 0:
@ -3436,9 +3475,9 @@ async def _rotate_master_key(  # noqa: PLR0915
    from litellm.proxy.proxy_server import proxy_config

    try:
-        models: Optional[
-            List
-        ] = await prisma_client.db.litellm_proxymodeltable.find_many()
+        models: Optional[List] = (
+            await prisma_client.db.litellm_proxymodeltable.find_many()
+        )
    except Exception:
        models = None
    # 2. process model table
@ -4078,11 +4117,11 @@ async def validate_key_list_check(
            param="user_id",
            code=status.HTTP_403_FORBIDDEN,
        )
-    complete_user_info_db_obj: Optional[
-        BaseModel
-    ] = await prisma_client.db.litellm_usertable.find_unique(
-        where={"user_id": user_api_key_dict.user_id},
-        include={"organization_memberships": True},
+    complete_user_info_db_obj: Optional[BaseModel] = (
+        await prisma_client.db.litellm_usertable.find_unique(
+            where={"user_id": user_api_key_dict.user_id},
+            include={"organization_memberships": True},
+        )
    )

    if complete_user_info_db_obj is None:
@ -4165,10 +4204,10 @@ async def _fetch_user_team_objects(
    if complete_user_info is None or not complete_user_info.teams:
        return []

-    teams: Optional[
-        List[BaseModel]
-    ] = await prisma_client.db.litellm_teamtable.find_many(
-        where={"team_id": {"in": complete_user_info.teams}}
+    teams: Optional[List[BaseModel]] = (
+        await prisma_client.db.litellm_teamtable.find_many(
+            where={"team_id": {"in": complete_user_info.teams}}
+        )
    )
    if teams is None:
        return []
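
Note on the `_check_allowed_routes_caller_permission` guard added in key_management_endpoints.py above: a minimal standalone sketch of the same decision logic follows. It assumes the role string `"proxy_admin"` (the value used in the parametrized tests later in this commit) and uses a hypothetical `PermissionDenied` exception in place of FastAPI's `HTTPException`, so it runs without the proxy installed. It is an illustration of the rule, not the project's implementation.

```python
from typing import List, Optional

# Assumed role string; the diff compares against LitellmUserRoles.PROXY_ADMIN.value.
PROXY_ADMIN = "proxy_admin"


class PermissionDenied(Exception):
    """Hypothetical stand-in for the HTTPException(status_code=403) in the diff."""


def check_allowed_routes_permission(
    allowed_routes: Optional[List[str]], user_role: Optional[str]
) -> None:
    # None or an empty list means the caller did not set the field.
    if not allowed_routes:
        return
    # Proxy admins may set any route list.
    if user_role == PROXY_ADMIN:
        return
    # Everyone else is rejected before the key is created or updated.
    raise PermissionDenied("Only proxy admins can set `allowed_routes` on a key.")
```

Treating the empty list as "not set" matters because the request model defaults the field to `[]`; without that early return every non-admin key creation would be rejected.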
--- a/litellm/proxy/ui_crud_endpoints/proxy_setting_endpoints.py
+++ b/litellm/proxy/ui_crud_endpoints/proxy_setting_endpoints.py
@ -1,6 +1,7 @@
 #### CRUD ENDPOINTS for UI Settings #####
 import json
 from typing import Any, Dict, List, Optional, Union
+from urllib.parse import urlparse

 from fastapi import APIRouter, Depends, File, HTTPException, UploadFile

@ -817,6 +818,29 @@ async def get_ui_theme_settings():
    )


+def _validate_public_image_url(value: Optional[str], field_name: str) -> None:
+    """
+    Reject any non-empty value that isn't a plain http(s) URL with a host.
+    None, non-string, and blank values are treated as "not set" and skipped.
+    This value is later served via the unauthenticated /get_image endpoint,
+    so local paths like "/etc/passwd" or "file://..." must not be accepted.
+    """
+    if value is None:
+        return
+    if not isinstance(value, str) or not value.strip():
+        return
+    parsed = urlparse(value.strip())
+    if parsed.scheme not in ("http", "https") or not parsed.netloc:
+        raise HTTPException(
+            status_code=400,
+            detail={
+                "error": (
+                    f"Invalid {field_name}: must be an http(s) URL with a host. "
+                    "Local filesystem paths and non-http schemes are not allowed."
+                )
+            },
+        )
+
+
 @router.patch(
    "/update/ui_theme_settings",
    tags=["UI Theme Settings"],
@ -831,6 +855,9 @@ async def update_ui_theme_settings(theme_config: UIThemeConfig):

    from litellm.proxy.proxy_server import proxy_config, store_model_in_db

+    _validate_public_image_url(theme_config.logo_url, "logo_url")
+    _validate_public_image_url(theme_config.favicon_url, "favicon_url")
+
    if store_model_in_db is not True:
        raise HTTPException(
            status_code=500,
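
Note on `_validate_public_image_url` above: the scheme and host test can be tried in isolation with only the standard library. The helper name `is_http_url_with_host` is hypothetical and returns a boolean instead of raising, purely to make the behaviour easy to inspect.

```python
from urllib.parse import urlparse


def is_http_url_with_host(value: str) -> bool:
    """Return True only for http(s) URLs that carry a network location."""
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


samples = [
    "https://example.com/logo.png",  # accepted: https scheme with a host
    "/etc/passwd",                   # rejected: no scheme, no host
    "file:///etc/passwd",            # rejected: non-http scheme
    "http://",                       # rejected: scheme but empty host
]
results = {s: is_http_url_with_host(s) for s in samples}
```

A check of this shape only inspects the scheme and the presence of a host. It does not decide whether the host is an internal address, so `http://localhost/x` would still pass; any such restriction would need separate handling.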
--- a/litellm/proxy/utils.py
+++ b/litellm/proxy/utils.py
@ -2645,7 +2645,7 @@ class PrismaClient:
            raise e

    async def _query_first_with_cached_plan_fallback(
-        self, sql_query: str
+        self, sql_query: str, *args
    ) -> Optional[dict]:
        """
        Execute a query with automatic fallback for PostgreSQL cached plan errors.
@ -2664,7 +2664,7 @@ class PrismaClient:
            Original exception if not a cached plan error
        """
        try:
-            return await self.db.query_first(query=sql_query)
+            return await self.db.query_first(sql_query, *args)
        except Exception as e:
            error_str = str(e)
            if "cached plan must not change result type" in error_str:
@ -2679,7 +2679,7 @@ class PrismaClient:
                    "retrying with fresh plan. This may occur during rolling deployments "
                    "when schema changes are applied."
                )
-                return await self.db.query_first(query=sql_query_retry)
+                return await self.db.query_first(sql_query_retry, *args)
            else:
                raise

@ -2978,7 +2978,7 @@ class PrismaClient:
                            detail={"error": f"No token passed in. Token={token}"},
                        )

-                    sql_query = f"""
+                    sql_query = """
                        SELECT 
                            v.*,
                            t.spend AS team_spend, 
@ -3016,11 +3016,11 @@ class PrismaClient:
                        LEFT JOIN "LiteLLM_ProjectTable" AS p ON v.project_id = p.project_id
                        LEFT JOIN "LiteLLM_OrganizationTable" AS o ON v.organization_id = o.organization_id
                        LEFT JOIN "LiteLLM_BudgetTable" AS b2 ON o.budget_id = b2.budget_id
-                        WHERE v.token = '{token}'
+                        WHERE v.token = $1
                    """

                    response = await self._query_first_with_cached_plan_fallback(
-                        sql_query
+                        sql_query, hashed_token
                    )

                    # If not found in main table, check deprecated keys (grace period)
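
Note on the `WHERE v.token = $1` change above: the sketch below contrasts string interpolation with a bound parameter. It uses the standard-library `sqlite3` module with an in-memory table as a stand-in for the Prisma/PostgreSQL call, so the placeholder is `?` rather than `$1`, and the table and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (token TEXT PRIMARY KEY, spend REAL)")
conn.execute("INSERT INTO tokens VALUES (?, ?)", ("abc123", 1.5))


def find_token(db: sqlite3.Connection, token: str):
    # Bound parameter: the driver passes the value separately from the SQL text.
    query = "SELECT token, spend FROM tokens WHERE token = ?"
    return db.execute(query, (token,)).fetchone()


def find_token_unsafe(db: sqlite3.Connection, token: str):
    # Interpolated value: quotes inside `token` become part of the SQL itself.
    query = f"SELECT token, spend FROM tokens WHERE token = '{token}'"
    return db.execute(query).fetchone()


malicious = "x' OR '1'='1"
safe_result = find_token(conn, malicious)           # no row matches the literal string
unsafe_result = find_token_unsafe(conn, malicious)  # the OR clause matches every row
```

With the bound parameter the crafted input is compared as an ordinary string and matches nothing, while the interpolated version returns a row the caller never named.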
--- a/litellm/responses/main.py
+++ b/litellm/responses/main.py
@ -1967,11 +1967,18 @@ async def _aresponses_websocket(
    )

    # Extract params that we're passing explicitly to avoid duplicates in **kwargs
-    remaining_kwargs = {
-        k: v
-        for k, v in kwargs.items()
-        if k not in {"user_api_key_dict", "litellm_metadata"}
+    _explicit_keys = {
+        "user_api_key_dict",
+        "litellm_metadata",
+        "custom_llm_provider",
+        "model",
+        "websocket",
+        "litellm_logging_obj",
+        "api_base",
+        "api_key",
+        "timeout",
    }
+    remaining_kwargs = {k: v for k, v in kwargs.items() if k not in _explicit_keys}

    await base_llm_http_handler.async_responses_websocket(
        model=model,
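
Note on the `_explicit_keys` filter above: when a value is passed both as a named argument and again inside `**kwargs`, Python raises `TypeError` at call time. The sketch below reproduces that with hypothetical `handler` and `forward` functions and a shortened key set; it is not the websocket handler itself.

```python
from typing import Any, Dict

# Keys the caller passes explicitly; anything else is forwarded via **kwargs.
EXPLICIT_KEYS = {"model", "api_key", "timeout"}


def handler(model: str, api_key: str = "", timeout: float = 0.0, **extra: Any) -> Dict[str, Any]:
    """Hypothetical downstream function with named and catch-all parameters."""
    return {"model": model, "api_key": api_key, "timeout": timeout, "extra": extra}


def forward(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    # Drop keys that are passed by name so they are not supplied twice.
    remaining = {k: v for k, v in kwargs.items() if k not in EXPLICIT_KEYS}
    return handler(
        model=kwargs["model"],
        api_key=kwargs.get("api_key", ""),
        timeout=kwargs.get("timeout", 0.0),
        **remaining,
    )


incoming = {"model": "gpt-x", "api_key": "sk-test", "timeout": 5.0, "metadata": {"a": 1}}
forwarded = forward(incoming)

try:
    # Without filtering, "model" arrives twice and the call fails.
    handler(model=incoming["model"], **incoming)
    duplicate_error = None
except TypeError as exc:
    duplicate_error = exc
```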
--- a/pyproject.toml
+++ b/pyproject.toml
@ -1,6 +1,6 @@
 [project]
 name = "litellm"
-version = "1.83.5"
+version = "1.83.6"
 description = "Library to easily interface with LLM API providers"
 readme = "README.md"
 requires-python = ">=3.9"
@ -238,7 +238,7 @@ source-exclude = [
 profile = "black"

 [tool.commitizen]
-version = "1.83.5"
+version = "1.83.6"
 version_files = [
    "pyproject.toml:^version",
 ]
--- a/tests/proxy_unit_tests/test_jwt.py
+++ b/tests/proxy_unit_tests/test_jwt.py
@ -934,10 +934,7 @@ async def mock_user_object(*args, **kwargs):
    user_id = kwargs.get("user_id")
    user_email = kwargs.get("user_email")
    return LiteLLM_UserTable(
-        spend=0, 
-        user_id=user_id, 
-        max_budget=None, 
-        user_email=user_email
+        spend=0, user_id=user_id, max_budget=None, user_email=user_email
    )


@ -1170,15 +1167,13 @@ async def test_end_user_jwt_auth(monkeypatch):
    # use generated key to auth in
    from litellm import Router
    from litellm.types.router import RouterGeneralSettings
-    
+
    # Create a router with pass_through_all_models enabled
    router = Router(
        model_list=[],
-        router_general_settings=RouterGeneralSettings(
-            pass_through_all_models=True
-        ),
+        router_general_settings=RouterGeneralSettings(pass_through_all_models=True),
    )
-    
+
    setattr(litellm.proxy.proxy_server, "premium_user", True)
    setattr(
        litellm.proxy.proxy_server,
@ -1196,7 +1191,7 @@ async def test_end_user_jwt_auth(monkeypatch):

    cost_tracking()
    result = await user_api_key_auth(request=request, api_key=bearer_token)
-    
+
    # Assert that end_user_id is correctly extracted from JWT token's 'sub' field
    assert result.end_user_id == "81b3e52a-67a6-4efb-9645-70527e101479"

@ -1228,7 +1223,9 @@ async def test_end_user_jwt_auth(monkeypatch):
        ),
    )

-    with patch("litellm.acompletion", new=AsyncMock(return_value=mock_response)) as mock_completion:
+    with patch(
+        "litellm.acompletion", new=AsyncMock(return_value=mock_response)
+    ) as mock_completion:
        resp = await chat_completion(
            request=request,
            fastapi_response=temp_response,
@ -1243,10 +1240,13 @@ async def test_end_user_jwt_auth(monkeypatch):
        # Verify the completion was called with correct end_user_id
        mock_completion.assert_called_once()
        call_kwargs = mock_completion.call_args.kwargs
-        
+
        # end_user_id is passed in metadata as 'user_api_key_end_user_id'
        metadata = call_kwargs.get("metadata", {})
-        assert metadata.get("user_api_key_end_user_id") == "81b3e52a-67a6-4efb-9645-70527e101479"
+        assert (
+            metadata.get("user_api_key_end_user_id")
+            == "81b3e52a-67a6-4efb-9645-70527e101479"
+        )


 def test_can_rbac_role_call_route():
@ -1278,13 +1278,13 @@ def test_user_api_key_auth_jwt_hashing():
    """
    from litellm.proxy._types import UserAPIKeyAuth
    from litellm.proxy.auth.handle_jwt import JWTHandler
-    
+
    # Test with a JWT token (3 parts separated by dots)
    jwt_token = "test-jwt-token-header.payload.signature"
-    
+
    # Create UserAPIKeyAuth instance with JWT
    user_auth = UserAPIKeyAuth(api_key=jwt_token)
-    
+
    # Verify that the API key is hashed with "hashed-jwt-" prefix
    # critical - the raw JWT token should not be in the api_key or token
    assert user_auth.api_key.startswith("hashed-jwt-")
@ -1292,19 +1292,18 @@ def test_user_api_key_auth_jwt_hashing():
    assert jwt_token not in user_auth.api_key
    assert jwt_token not in user_auth.token

-    
    # Test with a regular API key (should not get the "hashed-jwt-" prefix)
    regular_api_key = "sk-1234567890abcdef"
    user_auth_regular = UserAPIKeyAuth(api_key=regular_api_key)
-    
+
    # Verify that regular API key is hashed normally (without "hashed-jwt-" prefix)
    assert not user_auth_regular.api_key.startswith("hashed-jwt-")
    assert not user_auth_regular.token.startswith("hashed-jwt-")
-    
+
    # Test with a non-JWT, non-sk string (should not be hashed)
    non_jwt_key = "some-random-key"
    user_auth_non_jwt = UserAPIKeyAuth(api_key=non_jwt_key)
-    
+
    # Verify that non-JWT key is not hashed
    assert user_auth_non_jwt.api_key == non_jwt_key
    assert user_auth_non_jwt.token == non_jwt_key
@ -1315,19 +1314,19 @@ def test_jwt_handler_is_jwt_static_method():
    Test that JWTHandler.is_jwt is a static method and works correctly
    """
    from litellm.proxy.auth.handle_jwt import JWTHandler
-    
+
    # Test with valid JWT format
    valid_jwt = "test-jwt-token-header.payload.signature"
    assert JWTHandler.is_jwt(valid_jwt) == True
-    
+
    # Test with invalid JWT format (only 2 parts)
    invalid_jwt = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ"
    assert JWTHandler.is_jwt(invalid_jwt) == False
-    
+
    # Test with regular API key
    regular_key = "sk-1234567890abcdef"
    assert JWTHandler.is_jwt(regular_key) == False
-    
+
    # Test with empty string
    assert JWTHandler.is_jwt("") == False

@ -1461,7 +1460,13 @@ async def test_auth_jwt_es256_jwk_path(monkeypatch):

    now = int(time.time())
    token = jwt.encode(
-        {"sub": "alice", "aud": "litellm-proxy", "iss": "http://example", "iat": now, "exp": now + 300},
+        {
+            "sub": "alice",
+            "aud": "litellm-proxy",
+            "iss": "http://example",
+            "iat": now,
+            "exp": now + 300,
+        },
        ec_priv_pem,
        algorithm="ES256",
        headers={"kid": "ec1"},
@ -1508,7 +1513,13 @@ async def test_auth_jwt_rs256_regression(monkeypatch):

    now = int(time.time())
    token = jwt.encode(
-        {"sub": "bob", "aud": "litellm-proxy", "iss": "http://example", "iat": now, "exp": now + 300},
+        {
+            "sub": "bob",
+            "aud": "litellm-proxy",
+            "iss": "http://example",
+            "iat": now,
+            "exp": now + 300,
+        },
        rsa_priv_pem,
        algorithm="RS256",
        headers={"kid": "rsa1"},
@ -1540,7 +1551,13 @@ async def test_auth_jwt_mismatched_key_fails(monkeypatch):
    )
    now = int(time.time())
    token = jwt.encode(
-        {"sub": "mallory", "aud": "litellm-proxy", "iss": "http://example", "iat": now, "exp": now + 300},
+        {
+            "sub": "mallory",
+            "aud": "litellm-proxy",
+            "iss": "http://example",
+            "iat": now,
+            "exp": now + 300,
+        },
        ec_priv_pem,
        algorithm="ES256",
        headers={"kid": "ec1"},
@ -1566,4 +1583,4 @@ async def test_auth_jwt_mismatched_key_fails(monkeypatch):
    with patch.object(h, "get_public_key", new=AsyncMock(return_value=rsa_jwk)):
        with pytest.raises(Exception) as exc:
            await h.auth_jwt(token)
-        assert "Validation fails" in str(exc.value)
+        assert "Validation fails" in str(exc.value)
--- a/tests/proxy_unit_tests/test_user_api_key_auth.py
+++ b/tests/proxy_unit_tests/test_user_api_key_auth.py
@ -359,27 +359,38 @@ async def test_auth_with_allowed_routes(route, should_raise_error):


 @pytest.mark.parametrize(
-    "route, user_role, expected_result",
+    "route, user_role, should_be_allowed",
    [
-        # Proxy Admin checks
+        # Admin can access everything
+        ("/config/update", "proxy_admin", True),
        ("/global/spend/logs", "proxy_admin", True),
-        ("/key/delete", "proxy_admin", False),
-        ("/key/generate", "proxy_admin", False),
-        ("/key/regenerate", "proxy_admin", False),
-        # Internal User checks - allowed routes
+        ("/global/activity/cache_hits", "proxy_admin", True),
+        # Internal User - allowed read-only routes
        ("/global/spend/logs", "internal_user", True),
-        ("/key/delete", "internal_user", False),
-        ("/key/generate", "internal_user", False),
-        ("/key/82akk800000000jjsk/regenerate", "internal_user", False),
-        # Internal User Viewer
-        ("/key/generate", "internal_user_viewer", False),
-        # Internal User checks - disallowed routes
+        ("/spend/logs/ui", "internal_user", True),
+        ("/global/activity/cache_hits", "internal_user", True),
+        ("/health/services", "internal_user", True),
+        # Internal User - BLOCKED from admin routes (security fix)
+        ("/config/update", "internal_user", False),
+        ("/config/pass_through_endpoint", "internal_user", False),
+        ("/config/field/update", "internal_user", False),
        ("/organization/member_add", "internal_user", False),
+        # Internal User Viewer - allowed spend routes only
+        ("/spend/logs/ui", "internal_user_viewer", True),
+        ("/global/spend/all_tag_names", "internal_user_viewer", True),
+        # Internal User Viewer - blocked from admin routes
+        ("/config/update", "internal_user_viewer", False),
+        ("/key/generate", "internal_user_viewer", False),
    ],
 )
-def test_is_ui_route_allowed(route, user_role, expected_result):
-    from litellm.proxy.auth.auth_checks import _is_ui_route
-    from litellm.proxy._types import LiteLLM_UserTable
+def test_ui_token_route_access(route, user_role, should_be_allowed):
+    """
+    Verify that UI tokens (team_id=litellm-dashboard) go through the same
+    RBAC checks as API tokens. Non-admin dashboard users must not be able
+    to access admin-only routes like /config/update.
+    """
+    from litellm.proxy.auth.auth_checks import _is_api_route_allowed
+    from litellm.proxy._types import LiteLLM_UserTable, UserAPIKeyAuth

    user_obj = LiteLLM_UserTable(
        user_id="3b803c0e-666e-4e99-bd5c-6e534c07e297",
@ -395,18 +406,36 @@ def test_is_ui_route_allowed(route, user_role, expected_result):
        organization_memberships=[],
    )

-    received_args: dict = {
-        "route": route,
-        "user_obj": user_obj,
-    }
-    try:
-        assert _is_ui_route(**received_args) == expected_result
-    except Exception as e:
-        # If expected result is False, we expect an error
-        if expected_result is False:
-            pass
-        else:
-            raise e
+    valid_token = UserAPIKeyAuth(
+        user_id="3b803c0e-666e-4e99-bd5c-6e534c07e297",
+        team_id="litellm-dashboard",
+        user_role=user_role,
+    )
+
+    from starlette.datastructures import URL
+    from fastapi import Request
+
+    request = Request(scope={"type": "http"})
+    request._url = URL(url=route)
+
+    if should_be_allowed:
+        result = _is_api_route_allowed(
+            route=route,
+            request=request,
+            request_data={},
+            valid_token=valid_token,
+            user_obj=user_obj,
+        )
+        assert result is True
+    else:
+        with pytest.raises(Exception):
+            _is_api_route_allowed(
+                route=route,
+                request=request,
+                request_data={},
+                valid_token=valid_token,
+                user_obj=user_obj,
+            )


 @pytest.mark.parametrize(
@ -684,7 +713,7 @@ async def test_soft_budget_alert():


 def test_is_allowed_route():
-    from litellm.proxy.auth.auth_checks import _is_allowed_route
+    from litellm.proxy.auth.auth_checks import _is_api_route_allowed
    from litellm.proxy._types import UserAPIKeyAuth
    import datetime

@ -692,7 +721,6 @@ def test_is_allowed_route():

    args = {
        "route": "/embeddings",
-        "token_type": "api",
        "request": request,
        "request_data": {"input": ["hello world"], "model": "embedding-small"},
        "valid_token": UserAPIKeyAuth(
@ -752,7 +780,7 @@ def test_is_allowed_route():
        "user_obj": None,
    }

-    assert _is_allowed_route(**args)
+    assert _is_api_route_allowed(**args)


 @pytest.mark.parametrize(
@ -836,7 +864,6 @@ async def test_user_api_key_auth_websocket():
    with patch(
        "litellm.proxy.auth.user_api_key_auth.user_api_key_auth", autospec=True
    ) as mock_user_api_key_auth:
-
        # Make the call to the WebSocket function
        await user_api_key_auth_websocket(mock_websocket)

@ -845,10 +872,14 @@ async def test_user_api_key_auth_websocket():

        # Get the request object that was passed to user_api_key_auth
        request_arg = mock_user_api_key_auth.call_args.kwargs["request"]
-        
+
        # Verify that the request has headers set
-        assert hasattr(request_arg, "headers"), "Request object should have headers attribute"
-        assert "authorization" in request_arg.headers, "Request headers should contain authorization"
+        assert hasattr(
+            request_arg, "headers"
+        ), "Request object should have headers attribute"
+        assert (
+            "authorization" in request_arg.headers
+        ), "Request headers should contain authorization"
        assert request_arg.headers["authorization"] == "Bearer some_api_key"

        assert (
@ -1036,7 +1067,10 @@ async def test_jwt_non_admin_team_route_access(monkeypatch):

    # Create request
    request = Request(
-        scope={"type": "http", "headers": [(b"authorization", b"Bearer fake.jwt.token")]}
+        scope={
+            "type": "http",
+            "headers": [(b"authorization", b"Bearer fake.jwt.token")],
+        }
    )
    request._url = URL(url="/team/new")

@ -1101,14 +1135,14 @@ async def test_x_litellm_api_key():
    ignored_key = "aj12445"

    # Create request with headers as bytes
-    request = Request(
-        scope={
-            "type": "http"
-        }
-    )
+    request = Request(scope={"type": "http"})
    request._url = URL(url="/chat/completions")

-    valid_token = await user_api_key_auth(request=request, api_key="Bearer " + ignored_key, custom_litellm_key_header=master_key)
+    valid_token = await user_api_key_auth(
+        request=request,
+        api_key="Bearer " + ignored_key,
+        custom_litellm_key_header=master_key,
+    )
    assert valid_token.token == hash_token(master_key)


@ -1123,7 +1157,9 @@ async def test_user_api_key_from_query_param():
    from litellm.proxy.proxy_server import hash_token, user_api_key_cache

    user_key = "sk-query-1234"
-    user_api_key_cache.set_cache(key=hash_token(user_key), value=UserAPIKeyAuth(token=hash_token(user_key)))
+    user_api_key_cache.set_cache(
+        key=hash_token(user_key), value=UserAPIKeyAuth(token=hash_token(user_key))
+    )

    setattr(litellm.proxy.proxy_server, "user_api_key_cache", user_api_key_cache)
    setattr(litellm.proxy.proxy_server, "master_key", "sk-1234")
@ -1136,7 +1172,9 @@ async def test_user_api_key_from_query_param():
            "query_string": f"alt=sse&key={user_key}".encode(),
        }
    )
-    request._url = URL(url=f"/v1beta/models/gemini:streamGenerateContent?alt=sse&key={user_key}")
+    request._url = URL(
+        url=f"/v1beta/models/gemini:streamGenerateContent?alt=sse&key={user_key}"
+    )

    async def return_body():
        return b"{}"
@ -1145,4 +1183,3 @@ async def test_user_api_key_from_query_param():

    valid_token = await user_api_key_auth(request=request, api_key="")
    assert valid_token.token == hash_token(user_key)
-
--- a/tests/test_litellm/llms/bedrock/messages/invoke_transformations/test_anthropic_claude3_transformation.py
+++ b/tests/test_litellm/llms/bedrock/messages/invoke_transformations/test_anthropic_claude3_transformation.py
@ -3,6 +3,7 @@ import json
 import os
 import sys
 from datetime import datetime
+from unittest.mock import Mock

 import pytest

@ -125,7 +126,8 @@ async def test_bedrock_sse_wrapper_keeps_usage_in_message_start_and_message_delt
    assert "usage" in delta_json
    assert delta_json["usage"]["cache_creation_input_tokens"] == 1562
    assert delta_json["usage"]["cache_read_input_tokens"] == 32392
-    assert delta_json["usage"]["input_tokens"] == 3 + 1562 + 32392
+    assert delta_json["usage"]["input_tokens"] == 3
+    assert delta_json["usage"]["output_tokens"] == 8


 def test_chunk_parser_usage_transformation():
@ -402,3 +404,111 @@ def test_bedrock_messages_strips_output_config_with_output_format():

    assert "output_config" not in result
    assert "output_format" not in result
+
+
+@pytest.mark.asyncio
+async def test_promote_message_stop_usage_preserves_message_delta_output_tokens():
+    """
+    Bedrock unified /messages streaming can send full usage on message_delta and a
+    conflicting smaller usage on message_stop (e.g. output_tokens 9 vs 12).
+    _promote_message_stop_usage must not replace message_delta output_tokens.
+    """
+    cfg = AmazonAnthropicClaudeMessagesConfig()
+
+    async def _stream():  # type: ignore[return-type]
+        yield {
+            "type": "message_delta",
+            "delta": {"stop_reason": "end_turn", "stop_sequence": None},
+            "usage": {
+                "input_tokens": 3,
+                "cache_creation_input_tokens": 10553,
+                "cache_read_input_tokens": 25490,
+                "output_tokens": 12,
+            },
+        }
+        yield {
+            "type": "message_stop",
+            "usage": {"input_tokens": 3, "output_tokens": 9},
+        }
+
+    merged: list[dict] = []
+    async for chunk in cfg._promote_message_stop_usage(_stream()):
+        if isinstance(chunk, dict):
+            merged.append(chunk)
+
+    assert len(merged) >= 1
+    delta_out = merged[0]
+    assert delta_out["type"] == "message_delta"
+    assert delta_out["usage"]["output_tokens"] == 12
+    assert delta_out["usage"]["cache_creation_input_tokens"] == 10553
+    assert delta_out["usage"]["cache_read_input_tokens"] == 25490
+    assert delta_out["usage"]["input_tokens"] == 3
+
+
+@pytest.mark.asyncio
+async def test_unified_bedrock_messages_sse_usage_and_cost_claude_sonnet_46():
+    """
+    End-to-end for Bedrock Invoke Anthropic Messages (unified) streaming path:
+    dict chunks -> _promote_message_stop_usage -> bedrock_sse_wrapper SSE bytes ->
+    same logging reconstruction as Anthropic /messages. Ensures token counts and
+    completion_cost match model_prices for us.anthropic.claude-sonnet-4-6.
+    """
+    from litellm import completion_cost
+    from litellm.proxy.pass_through_endpoints.llm_provider_handlers.anthropic_passthrough_logging_handler import (
+        AnthropicPassthroughLoggingHandler,
+    )
+
+    cfg = AmazonAnthropicClaudeMessagesConfig()
+
+    async def _stream():  # type: ignore[return-type]
+        yield {
+            "type": "message_delta",
+            "delta": {"stop_reason": "end_turn", "stop_sequence": None},
+            "usage": {
+                "input_tokens": 3,
+                "cache_creation_input_tokens": 10553,
+                "cache_read_input_tokens": 25490,
+                "output_tokens": 12,
+            },
+        }
+        yield {
+            "type": "message_stop",
+            "usage": {"input_tokens": 3, "output_tokens": 9},
+        }
+
+    logging_obj = LiteLLMLoggingObj(
+        model="bedrock/us.anthropic.claude-sonnet-4-6",
+        messages=[{"role": "user", "content": "Hello"}],
+        stream=True,
+        call_type="chat",
+        start_time=datetime.now(),
+        litellm_call_id="test_unified_bedrock_messages_sse_cost",
+        function_id="test_unified_bedrock_messages_sse_cost",
+    )
+
+    collected: list[bytes] = []
+    async for sse in cfg.bedrock_sse_wrapper(
+        completion_stream=_stream(),
+        litellm_logging_obj=logging_obj,
+        request_body={"model": "us.anthropic.claude-sonnet-4-6"},
+    ):
+        collected.append(sse)
+
+    built = AnthropicPassthroughLoggingHandler._build_complete_streaming_response(
+        all_chunks=collected,
+        model="us.anthropic.claude-sonnet-4-6",
+        litellm_logging_obj=Mock(),
+    )
+    assert built.usage is not None
+    assert built.usage.completion_tokens == 12
+    assert built.usage.prompt_tokens == 36046
+    assert built.usage.total_tokens == 36058
+    assert built.usage.cache_creation_input_tokens == 10553
+    assert built.usage.cache_read_input_tokens == 25490
+
+    cost = completion_cost(
+        completion_response=built,
+        model="bedrock/us.anthropic.claude-sonnet-4-6",
+        custom_llm_provider="bedrock",
+    )
+    assert cost == pytest.approx(0.052150725, rel=0, abs=1e-9)
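
Note on the token assertions above: the expected `prompt_tokens` and `total_tokens` are consistent with adding the cache token counts to `input_tokens`. The worked arithmetic below uses the numbers from the `message_delta` chunk in the test; the helper `summarize_usage` is hypothetical, and the aggregation rule is inferred from the asserted values rather than taken from the logging handler's code.

```python
# Usage numbers copied from the message_delta chunk in the test above.
usage = {
    "input_tokens": 3,
    "cache_creation_input_tokens": 10553,
    "cache_read_input_tokens": 25490,
    "output_tokens": 12,
}


def summarize_usage(raw: dict) -> dict:
    """Hypothetical helper: fold cache tokens into the prompt total."""
    prompt_tokens = (
        raw["input_tokens"]
        + raw.get("cache_creation_input_tokens", 0)
        + raw.get("cache_read_input_tokens", 0)
    )
    completion_tokens = raw["output_tokens"]
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }


summary = summarize_usage(usage)
```

The completion count comes from `message_delta` (12), not from the smaller `message_stop` value (9), which is the behaviour the first new test pins down.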
--- a/tests/test_litellm/proxy/management_endpoints/test_key_management_endpoints.py
+++ b/tests/test_litellm/proxy/management_endpoints/test_key_management_endpoints.py
@ -8579,3 +8579,170 @@ def test_enforce_upperbound_no_config_is_noop():
        assert data.tpm_limit == 999999
    finally:
        litellm.upperbound_key_generate_params = original
+
+
+class TestAllowedRoutesCallerPermission:
+    """
+    Non-admins must not be able to set `allowed_routes` on a key. The field
+    bypasses the role-based route gate in
+    RouteChecks.non_proxy_admin_allowed_routes_check, so allowing a non-admin
+    to populate it grants them arbitrary endpoint access.
+    """
+
+    @pytest.mark.asyncio
+    async def test_non_admin_generate_key_with_allowed_routes_rejected(self):
+        data = GenerateKeyRequest(
+            key_alias="escalate",
+            allowed_routes=["/*"],
+        )
+        user_api_key_dict = UserAPIKeyAuth(
+            user_id="internal-user-123",
+            user_role=LitellmUserRoles.INTERNAL_USER,
+        )
+        mock_prisma_client = AsyncMock()
+
+        with patch("litellm.proxy.proxy_server.prisma_client", mock_prisma_client), patch(
+            "litellm.proxy.proxy_server.user_api_key_cache", MagicMock()
+        ), patch("litellm.proxy.proxy_server.user_custom_key_generate", None), patch(
+            "litellm.proxy.management_endpoints.key_management_endpoints._common_key_generation_helper",
+            new_callable=AsyncMock,
+            return_value=MagicMock(),
+        ):
+            with pytest.raises(ProxyException) as exc_info:
+                await generate_key_fn(
+                    data=data,
+                    user_api_key_dict=user_api_key_dict,
+                    litellm_changed_by=None,
+                )
+        assert str(exc_info.value.code) == "403"
+        assert "allowed_routes" in str(exc_info.value.message)
+
+    @pytest.mark.asyncio
+    async def test_admin_generate_key_with_allowed_routes_allowed(self):
+        data = GenerateKeyRequest(
+            key_alias="admin-key",
+            allowed_routes=["/chat/completions"],
+            user_id="admin-user",
+        )
+        user_api_key_dict = UserAPIKeyAuth(
+            user_id="admin-user",
+            user_role=LitellmUserRoles.PROXY_ADMIN,
+        )
+        mock_prisma_client = AsyncMock()
+        stub_response = MagicMock()
+
+        with patch("litellm.proxy.proxy_server.prisma_client", mock_prisma_client), patch(
+            "litellm.proxy.proxy_server.user_api_key_cache", MagicMock()
+        ), patch("litellm.proxy.proxy_server.user_custom_key_generate", None), patch(
+            "litellm.proxy.management_endpoints.key_management_endpoints._common_key_generation_helper",
+            new_callable=AsyncMock,
+            return_value=stub_response,
+        ):
+            result = await generate_key_fn(
+                data=data,
+                user_api_key_dict=user_api_key_dict,
+                litellm_changed_by=None,
+            )
+        assert result is stub_response
+
+    @pytest.mark.asyncio
+    async def test_non_admin_generate_key_default_empty_allowed_routes_ok(self):
+        """
+        Regression guard: GenerateKeyRequest.allowed_routes defaults to [], so
+        the helper must treat empty-list as "not set" or every non-admin key
+        creation breaks.
+        """
+        data = GenerateKeyRequest(key_alias="plain-key")
+        user_api_key_dict = UserAPIKeyAuth(
+            user_id="internal-user-123",
+            user_role=LitellmUserRoles.INTERNAL_USER,
+        )
+        mock_prisma_client = AsyncMock()
+        stub_response = MagicMock()
+
+        with patch("litellm.proxy.proxy_server.prisma_client", mock_prisma_client), patch(
+            "litellm.proxy.proxy_server.user_api_key_cache", MagicMock()
+        ), patch("litellm.proxy.proxy_server.user_custom_key_generate", None), patch(
+            "litellm.proxy.management_endpoints.key_management_endpoints._common_key_generation_helper",
+            new_callable=AsyncMock,
+            return_value=stub_response,
+        ):
+            result = await generate_key_fn(
+                data=data,
+                user_api_key_dict=user_api_key_dict,
+                litellm_changed_by=None,
+            )
+        assert result is stub_response
+
+    @pytest.mark.asyncio
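+    async def test_non_admin_update_key_with_allowed_routes_rejected(self):
+        """
+        A non-admin caller (INTERNAL_USER) that sets allowed_routes on a key
+        update must be rejected with a 403 whose message names allowed_routes.
+        """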
+        from litellm.proxy.management_endpoints.key_management_endpoints import (
+            update_key_fn,
+        )
+
+        data = UpdateKeyRequest(key="sk-test", allowed_routes=["/*"])
+        user_api_key_dict = UserAPIKeyAuth(
+            user_id="internal-user-123",
+            user_role=LitellmUserRoles.INTERNAL_USER,
+        )
+        mock_prisma_client = AsyncMock()
+
+        with patch("litellm.proxy.proxy_server.prisma_client", mock_prisma_client), patch(
+            "litellm.proxy.proxy_server.user_api_key_cache", MagicMock()
+        ), patch("litellm.proxy.proxy_server.user_custom_key_update", None), patch(
+            "litellm.proxy.proxy_server.llm_router", None
+        ), patch("litellm.proxy.proxy_server.premium_user", True), patch(
+            "litellm.proxy.proxy_server.proxy_logging_obj", MagicMock()
+        ), patch(
+            "litellm.proxy.management_endpoints.key_management_endpoints._get_and_validate_existing_key",
+            new_callable=AsyncMock,
+            return_value=MagicMock(),
+        ):
+            with pytest.raises(ProxyException) as exc_info:
+                await update_key_fn(
+                    request=MagicMock(),
+                    data=data,
+                    user_api_key_dict=user_api_key_dict,
+                    litellm_changed_by=None,
+                )
+        assert str(exc_info.value.code) == "403"
+        assert "allowed_routes" in str(exc_info.value.message)
+
+
+def test_jinja_prompt_manager_is_sandboxed():
+    """
+    PromptManager renders user-supplied templates via /prompts/test, so its
+    jinja env must reject access to unsafe Python attributes like
+    ``__class__`` and ``__mro__``.
+    """
+    from jinja2.exceptions import SecurityError
+
+    from litellm.integrations.dotprompt.prompt_manager import PromptManager
+
+    pm = PromptManager()
+    template = pm.jinja_env.from_string("{{ ''.__class__.__mro__ }}")
+    with pytest.raises(SecurityError):
+        template.render()
+
+
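+def test_validate_public_image_url_rejects_local_paths():
+    """
+    Absolute filesystem paths, ``file://`` URLs, and relative traversal paths
+    must each be rejected with an HTTP 400.
+    """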
+    from litellm.proxy.ui_crud_endpoints.proxy_setting_endpoints import (
+        _validate_public_image_url,
+    )
+
+    for bad in ("/etc/passwd", "file:///etc/passwd", "../../etc/passwd"):
+        with pytest.raises(HTTPException) as exc_info:
+            _validate_public_image_url(bad, "logo_url")
+        assert exc_info.value.status_code == 400
+
+
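+def test_validate_public_image_url_accepts_http_and_noop_empty():
+    """
+    ``http``/``https`` URLs pass validation, and ``None``, empty, or
+    whitespace-only values are a no-op (no exception is raised).
+    """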
+    from litellm.proxy.ui_crud_endpoints.proxy_setting_endpoints import (
+        _validate_public_image_url,
+    )
+
+    _validate_public_image_url("https://example.com/logo.png", "logo_url")
+    _validate_public_image_url("http://cdn.internal/logo.svg", "logo_url")
+    _validate_public_image_url(None, "logo_url")
+    _validate_public_image_url("", "logo_url")
+    _validate_public_image_url("   ", "logo_url")
--- a/ui/litellm-dashboard/src/components/user_edit_view.test.tsx
+++ b/ui/litellm-dashboard/src/components/user_edit_view.test.tsx
@ -1,6 +1,6 @@
-import { screen, waitFor } from "@testing-library/react";
+import { cleanup, screen, waitFor } from "@testing-library/react";
 import userEvent from "@testing-library/user-event";
-import { beforeEach, describe, expect, it, vi } from "vitest";
+import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
 import { renderWithProviders } from "../../tests/test-utils";
 import { UserEditView } from "./user_edit_view";

@ -140,6 +140,15 @@ describe("UserEditView", () => {
    vi.clearAllMocks();
  });

+  afterEach(() => {
+    // Tremor's internal Tooltip sets a setTimeout that fires after teardown,
+    // causing "window is not defined". Flush pending timers before cleanup.
+    vi.useFakeTimers();
+    vi.runAllTimers();
+    vi.useRealTimers();
+    cleanup();
+  });
+
  it("should render", async () => {
    renderWithProviders(<UserEditView {...defaultProps} />);