Merge remote-tracking branch 'origin' into litellm_oss_staging_04_09_2026
commit 9a0487553d
@@ -4,6 +4,7 @@ title: "April Townhall: Security + Product Roadmap"
date: 2026-04-02T07:30:00
authors:
  - krrish
  - ishaan-alt
description: "Join the LiteLLM April townhall on Friday, 10 April at 7:30 AM to learn about LiteLLM's security and product roadmap."
tags: [announcement, townhall]
hide_table_of_contents: true
162
docs/my-website/blog/april_townhall_updates/index.md
Normal file
@@ -0,0 +1,162 @@
---
slug: april-townhall-updates
title: "April Townhall Updates: CI/CD v2, Stability, and Product Roadmap"
date: 2026-04-10T12:00:00
authors:
  - krrish
  - ishaan-alt
description: "A recap of the April LiteLLM town hall covering CI/CD v2, product stability work, and the near-term roadmap."
tags: [townhall, security, reliability, product]
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';

Thank you to everyone who joined our April town hall.

We used the session to share our CI/CD v2 improvements, product stability work, and what we are prioritizing next across reliability and the product roadmap.

{/* truncate */}

## CI/CD v2 improvements

Our CI/CD v2 work is centered around four goals:

1. **Limit** what each package can access
2. **Reduce** the number of sensitive environment variables
3. **Avoid** compromised packages
4. **Reduce the risk of** release tampering

#### New architecture: isolated environments

We have begun moving to isolated environments for distinct CI/CD stages to reduce the chance that a single compromised step can inherit broad access across the entire pipeline.

<Image
  img={require('../../img/april_townhall_isolated_environments.png')}
  style={{width: '900px', height: 'auto', display: 'block'}}
/>

#### Current rollout status

These changes are deployed in our current release workflow. [See here](https://github.com/BerriAI/litellm/tags)

#### Independently verify releases

A key part of CI/CD v2 is supporting independent verification of release artifacts using our published verification process, while reducing reliance on any single credential or release path.

[**Learn more about how to verify releases**](https://docs.litellm.ai/docs/proxy/docker_image_security)

<Image
  img={require('../../img/verify_releases.png')}
  style={{width: '900px', height: 'auto', display: 'block'}}
/>

## Stability improvements

### SDLC improvements

This month, we're focusing on process stability improvements around:
- Improving main-branch stability
- Mapping UI QA to built Docker images for 1:1 environment parity
- Consistent release tags across PyPI and Docker
- Fixing release notes publication

#### Improving main-branch stability

We're introducing a staging-gated flow:

<Image
  img={require('../../img/stable_main.png')}
  style={{width: '900px', height: 'auto', display: 'block'}}
/>

- Only an internal staging branch can push to `main`.
- PRs to that staging branch must pass CircleCI LLM API testing.
- Collision handling happens on staging, which is designed to reduce unstable changes reaching `main`.

#### UI QA in Docker environment

Moving forward, all UI QA will be performed in the built Docker image that users run.

Previously, some UI QA paths were run in local environments that did not fully replicate Docker runtime conditions.

That contributed to release-specific issues, including MCP registration problems in `v1.82.3`.

#### Consistent release tags

Today we publish releases for multiple scenarios:
- Dev (built off a PR for a customer-specific scenario)
- Nightly (passes all CI/CD checks)
- Release Candidate (passes all CI/CD checks + manual UI QA)
- Stable (intended to pass all CI/CD checks + manual UI QA + 7 days of production testing)

We are targeting a consistent naming convention across PyPI and Docker by the end of April.

#### Release notes

CI/CD v2 changes moved release notes to a manual path. This is a temporary solution while we investigate a better automated workflow. We are targeting a more consistent process by the end of April.

### Product stability improvements

#### Stable Prisma migrations

To date, we have observed several classes of migration failure:
- Migration not applied
- Migration marked applied but incomplete
- Migration not applied due to non-root image issues

We're prioritizing this work this month and have assigned an engineering owner to the effort. Our target is to resolve these error classes by the end of April.

#### UI type safety

Another area of focus is improving the stability of the UI. Today, one cause of errors is that the UI maintains its own assumptions about backend API types. This can lead to issues when backend responses differ from those assumptions.

We aim to keep the UI and backend in sync, and are exploring OpenAPI-driven mapping to achieve this.

## Product roadmap

### Our Assumptions

Over the next few years, we expect:
- Companies will give employees more AI tools.
- More AI agents will move into production workflows across HR, finance, support, and operations.

### Our Inferences
#### Near-term

- AI spend will increase.
- Uptime and latency will become even more important.
- More AI resources (skills, CLIs, and related assets) will require governance.
- Agent and MCP usage patterns will require deeper controls.
- Broader developer adoption will increase the need for simpler, more discoverable tooling.

#### Long-term

- We expect many organizations to treat agent auditability (how decisions were made across LLM + MCP + sub-agent inputs/outputs) as a compliance expectation.
- Permission management will get more complex as user-agent interaction chains deepen.

Roadmap timelines in this post are targets and may evolve based on validation and user feedback.

## April investments

### Reliability

- Increase uptime for 10k+ RPS scenarios.
- Investigate latency overhead for long-running Claude Code requests.

### Feature reliability

- Polish MCP authentication.
- Better understand how teams are using agents through LiteLLM.

### Governance

- Launch Skills as a first-class citizen in LiteLLM.

## Q&A

Thank you again for all the questions and direct feedback. We will keep sharing concrete progress updates as these efforts ship.

## Hiring

We are actively hiring across several roles. Please apply [here](https://jobs.ashbyhq.com/litellm) if you're interested!
@@ -24,7 +24,7 @@ ishaan:

 # Alias for typo in name
 ishaan-alt:
-  name: Ishaan Jaff
+  name: Ishaan Jaffer
   title: CTO, LiteLLM
   url: https://www.linkedin.com/in/reffajnaahsi/
   image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
BIN
docs/my-website/img/april_townhall_isolated_environments.png
Normal file
Binary file not shown. Size: 312 KiB
BIN
docs/my-website/img/stable_main.png
Normal file
Binary file not shown. Size: 236 KiB
BIN
docs/my-website/img/verify_releases.png
Normal file
Binary file not shown. Size: 1.3 MiB
@@ -7,7 +7,8 @@ from pathlib import Path
 from typing import Any, Dict, List, Optional, Tuple, Union

 import yaml
-from jinja2 import DictLoader, Environment, select_autoescape
+from jinja2 import DictLoader, select_autoescape
+from jinja2.sandbox import ImmutableSandboxedEnvironment


 class PromptTemplate:
@@ -59,7 +60,10 @@ class PromptManager:
         self.prompt_directory = Path(prompt_directory) if prompt_directory else None
         self.prompts: Dict[str, PromptTemplate] = {}
         self.prompt_file = prompt_file
-        self.jinja_env = Environment(
+        # Sandboxed env: templates can come from user input via /prompts/test,
+        # so we must block access to unsafe Python attributes and mutation of
+        # caller-supplied mutables.
+        self.jinja_env = ImmutableSandboxedEnvironment(
             loader=DictLoader({}),
             autoescape=select_autoescape(["html", "xml"]),
             # Use Handlebars-style delimiters to match Dotprompt spec
@@ -538,9 +538,9 @@ class AmazonAnthropicClaudeMessagesConfig(
         merges usage from message_start and message_delta but ignores
         message_stop. This method buffers message_delta and, when
         message_stop arrives with cache usage, merges those fields into the
-        message_delta usage and also updates the input_tokens on
-        message_delta to include the full count (uncached + cache_creation +
-        cache_read).
+        message_delta usage. input_tokens is kept as the uncached-only
+        count; downstream calculate_usage adds cache tokens to
+        prompt_tokens.
         """
         _CACHE_FIELDS = ("cache_creation_input_tokens", "cache_read_input_tokens")
         pending_delta = None
@@ -569,12 +569,7 @@ class AmazonAnthropicClaudeMessagesConfig(

                     raw_input = stop_usage.get("input_tokens")
                     if raw_input is not None:
-                        uncached = raw_input if isinstance(raw_input, int) else 0
-                        raw_cc = delta_usage.get("cache_creation_input_tokens", 0)
-                        cache_creation = raw_cc if isinstance(raw_cc, int) else 0
-                        raw_cr = delta_usage.get("cache_read_input_tokens", 0)
-                        cache_read = raw_cr if isinstance(raw_cr, int) else 0
-                        delta_usage["input_tokens"] = uncached + cache_creation + cache_read
+                        delta_usage["input_tokens"] = raw_input if isinstance(raw_input, int) else 0

                     if delta_usage:
                         pending_delta["usage"] = delta_usage  # type: ignore[arg-type]
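Reviewer note: the arithmetic behind this change can be illustrated with made-up numbers. The function below is a hypothetical stand-in for the downstream step the docstring mentions (it is not `calculate_usage` itself); it only assumes that step adds the two cache fields to `input_tokens`.

```python
from typing import Any, Dict

CACHE_FIELDS = ("cache_creation_input_tokens", "cache_read_input_tokens")


def _as_int(value: Any) -> int:
    """Mirror the hunk's defensive handling: non-int values count as 0."""
    return value if isinstance(value, int) else 0


def prompt_tokens_from_usage(usage: Dict[str, Any]) -> int:
    """Hypothetical downstream step: uncached input plus both cache fields."""
    total = _as_int(usage.get("input_tokens"))
    for field in CACHE_FIELDS:
        total += _as_int(usage.get(field))
    return total


# New behavior: input_tokens stays the uncached-only count.
usage_new = {
    "input_tokens": 100,
    "cache_creation_input_tokens": 20,
    "cache_read_input_tokens": 300,
}

# Old behavior: cache tokens were already folded into input_tokens.
usage_old = dict(usage_new, input_tokens=100 + 20 + 300)

new_total = prompt_tokens_from_usage(usage_new)  # 100 + 20 + 300
old_total = prompt_tokens_from_usage(usage_old)  # 420 + 20 + 300, cache counted twice
```

Under that assumption the old merge would report 740 prompt tokens for a request that used 420, which is presumably the double counting this hunk removes.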
@@ -5,6 +5,7 @@ Handles extraction of skill content (SKILL.md) from stored ZIP files
 and injection into the system prompt for non-Anthropic models.
 """

+import posixpath
 import zipfile
 from io import BytesIO
 from typing import Any, Dict, List, Optional
@@ -103,8 +104,18 @@ class SkillPromptInjectionHandler:
                     else:
                         clean_path = name

-                    if clean_path:
-                        files[clean_path] = zf.read(name)
+                    if not clean_path:
+                        continue
+
+                    # Ensure the path stays within the intended directory
+                    normalized = posixpath.normpath(clean_path)
+                    if normalized.startswith("..") or posixpath.isabs(normalized):
+                        verbose_logger.warning(
+                            f"SkillPromptInjectionHandler: Skipping entry with invalid path in skill {skill.skill_id}: {name}"
+                        )
+                        continue
+
+                    files[normalized] = zf.read(name)
         except Exception as e:
             verbose_logger.warning(
                 f"SkillPromptInjectionHandler: Error extracting files from skill {skill.skill_id}: {e}"
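Reviewer note: the path check added above can be exercised on its own. The sketch below lifts the same `posixpath` test into a standalone function; the name `safe_archive_path` is made up for this note and does not exist in the codebase.

```python
import posixpath
from typing import Optional


def safe_archive_path(entry_name: str) -> Optional[str]:
    """Return the normalized entry path, or None if it should be skipped.

    Applies the same test as the hunk above: normalize with posixpath, then
    reject anything that starts with ".." or is absolute.
    """
    if not entry_name:
        return None
    normalized = posixpath.normpath(entry_name)
    if normalized.startswith("..") or posixpath.isabs(normalized):
        return None
    return normalized


kept = safe_archive_path("docs/./a/../SKILL.md")      # collapses to docs/SKILL.md
traversal = safe_archive_path("a/../../etc/passwd")   # normalizes to ../etc/passwd
absolute = safe_archive_path("/etc/passwd")           # absolute path
```

One property worth noting: because the test is a plain `startswith("..")`, a legitimate top-level entry whose name begins with two dots (for example `..notes.md`) would also be skipped. That errs on the safe side.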
@@ -94,9 +94,15 @@ class SkillsSandboxExecutor:

         # Create a temp directory to stage files
         with tempfile.TemporaryDirectory() as tmpdir:
+            tmpdir_abs = os.path.abspath(tmpdir)
             for path, content in skill_files.items():
                 # Create the file in temp directory
-                local_path = os.path.join(tmpdir, path)
+                local_path = os.path.abspath(os.path.join(tmpdir, path))
+                if not local_path.startswith(tmpdir_abs + os.sep):
+                    verbose_logger.warning(
+                        f"SkillsSandboxExecutor: Skipping file with invalid path: {path}"
+                    )
+                    continue
                 os.makedirs(os.path.dirname(local_path), exist_ok=True)
                 with open(local_path, "wb") as f:
                     f.write(content)
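Reviewer note: a minimal sketch of the containment check above, runnable outside the executor. `stage_files` is a hypothetical helper written for this note; it follows the same steps (resolve to an absolute path, require the staging-directory prefix plus a separator, then write) and returns the relative paths it actually wrote.

```python
import os
import tempfile
from typing import Dict, List


def stage_files(root: str, files: Dict[str, bytes]) -> List[str]:
    """Write files under root, skipping any path that would land outside it."""
    root_abs = os.path.abspath(root)
    written: List[str] = []
    for rel_path, content in files.items():
        local_path = os.path.abspath(os.path.join(root_abs, rel_path))
        # The trailing separator stops sibling directories such as
        # "<root>-evil" from passing a bare prefix comparison.
        if not local_path.startswith(root_abs + os.sep):
            continue
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        with open(local_path, "wb") as f:
            f.write(content)
        written.append(rel_path)
    return written


with tempfile.TemporaryDirectory() as tmpdir:
    staged = stage_files(
        tmpdir,
        {
            "skill/SKILL.md": b"# demo skill",
            "../escape.txt": b"outside the staging dir",
            "/abs/path.txt": b"absolute path",
        },
    )
    skill_exists = os.path.isfile(os.path.join(tmpdir, "skill", "SKILL.md"))
```

Only the first entry is staged. `os.path.join` discards the staging directory when the second argument is absolute, which is why the check runs on the resolved path and not on the raw input.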
@ -494,10 +494,12 @@ class LiteLLMRoutes(enum.Enum):
|
||||
"/v2/key/info",
|
||||
"/model_group/info",
|
||||
"/health",
|
||||
"/health/services",
|
||||
"/key/list",
|
||||
"/user/filter/ui",
|
||||
"/models",
|
||||
"/v1/models",
|
||||
"/sso/get/ui_settings",
|
||||
]
|
||||
|
||||
# NOTE: ROUTES ONLY FOR MASTER KEY - only the Master Key should be able to Reset Spend
|
||||
@ -566,6 +568,8 @@ class LiteLLMRoutes(enum.Enum):
|
||||
"/spend/tags",
|
||||
"/spend/calculate",
|
||||
"/spend/logs",
|
||||
"/spend/logs/ui",
|
||||
"/spend/logs/session/ui",
|
||||
"/cost/estimate",
|
||||
]
|
||||
|
||||
@ -581,6 +585,7 @@ class LiteLLMRoutes(enum.Enum):
|
||||
"/global/spend/report",
|
||||
"/global/spend/provider",
|
||||
"/global/spend/tags",
|
||||
"/global/spend/all_tag_names",
|
||||
]
|
||||
|
||||
public_routes = set(
|
||||
@ -602,6 +607,9 @@ class LiteLLMRoutes(enum.Enum):
|
||||
]
|
||||
)
|
||||
|
||||
# Retained for backwards compatibility with JWT auth configs that reference
|
||||
# "ui_routes" in admin_allowed_routes. Not used by the proxy's own route
|
||||
# authorization — UI tokens now go through the same RBAC path as API tokens.
|
||||
ui_routes = [
|
||||
"/sso",
|
||||
"/sso/get/ui_settings",
|
||||
@ -627,19 +635,16 @@ class LiteLLMRoutes(enum.Enum):
|
||||
|
||||
internal_user_routes = (
|
||||
[
|
||||
"/global/spend/tags",
|
||||
"/global/spend/keys",
|
||||
"/global/spend/models",
|
||||
"/global/spend/provider",
|
||||
"/global/spend/end_users",
|
||||
"/global/activity",
|
||||
"/global/activity/model",
|
||||
"/global/activity/cache_hits",
|
||||
"/v1/models/{model_id}",
|
||||
"/models/{model_id}",
|
||||
"/guardrails/list",
|
||||
"/v2/guardrails/list",
|
||||
]
|
||||
+ spend_tracking_routes
|
||||
+ global_spend_tracking_routes
|
||||
+ key_management_routes
|
||||
)
|
||||
|
||||
@ -694,6 +699,9 @@ class LiteLLMRoutes(enum.Enum):
|
||||
"/tag/list",
|
||||
"/audit",
|
||||
"/audit/{id}",
|
||||
"/global/activity",
|
||||
"/global/activity/model",
|
||||
"/global/activity/cache_hits",
|
||||
] + info_routes
|
||||
|
||||
# All routes accesible by an Org Admin
|
||||
@ -892,9 +900,9 @@ class GenerateRequestBase(LiteLLMPydanticObjectBase):
|
||||
allowed_cache_controls: Optional[list] = []
|
||||
config: Optional[dict] = {}
|
||||
permissions: Optional[dict] = {}
|
||||
model_max_budget: Optional[dict] = (
|
||||
{}
|
||||
) # {"gpt-4": 5.0, "gpt-3.5-turbo": 5.0}, defaults to {}
|
||||
model_max_budget: Optional[
|
||||
dict
|
||||
] = {} # {"gpt-4": 5.0, "gpt-3.5-turbo": 5.0}, defaults to {}
|
||||
|
||||
model_config = ConfigDict(protected_namespaces=())
|
||||
model_rpm_limit: Optional[dict] = None
|
||||
@ -1036,9 +1044,9 @@ class RegenerateKeyRequest(GenerateKeyRequest):
|
||||
spend: Optional[float] = None
|
||||
metadata: Optional[dict] = None
|
||||
new_master_key: Optional[str] = None
|
||||
grace_period: Optional[str] = (
|
||||
None # Duration to keep old key valid (e.g. "24h", "2d"); None = immediate revoke
|
||||
)
|
||||
grace_period: Optional[
|
||||
str
|
||||
] = None # Duration to keep old key valid (e.g. "24h", "2d"); None = immediate revoke
|
||||
|
||||
|
||||
class ResetSpendRequest(LiteLLMPydanticObjectBase):
|
||||
@ -1562,12 +1570,12 @@ class NewCustomerRequest(BudgetNewRequest):
|
||||
blocked: bool = False # allow/disallow requests for this end-user
|
||||
budget_id: Optional[str] = None # give either a budget_id or max_budget
|
||||
spend: Optional[float] = None
|
||||
allowed_model_region: Optional[AllowedModelRegion] = (
|
||||
None # require all user requests to use models in this specific region
|
||||
)
|
||||
default_model: Optional[str] = (
|
||||
None # if no equivalent model in allowed region - default all requests to this model
|
||||
)
|
||||
allowed_model_region: Optional[
|
||||
AllowedModelRegion
|
||||
] = None # require all user requests to use models in this specific region
|
||||
default_model: Optional[
|
||||
str
|
||||
] = None # if no equivalent model in allowed region - default all requests to this model
|
||||
object_permission: Optional[LiteLLM_ObjectPermissionBase] = None
|
||||
|
||||
@model_validator(mode="before")
|
||||
@ -1590,12 +1598,12 @@ class UpdateCustomerRequest(LiteLLMPydanticObjectBase):
|
||||
blocked: bool = False # allow/disallow requests for this end-user
|
||||
max_budget: Optional[float] = None
|
||||
budget_id: Optional[str] = None # give either a budget_id or max_budget
|
||||
allowed_model_region: Optional[AllowedModelRegion] = (
|
||||
None # require all user requests to use models in this specific region
|
||||
)
|
||||
default_model: Optional[str] = (
|
||||
None # if no equivalent model in allowed region - default all requests to this model
|
||||
)
|
||||
allowed_model_region: Optional[
|
||||
AllowedModelRegion
|
||||
] = None # require all user requests to use models in this specific region
|
||||
default_model: Optional[
|
||||
str
|
||||
] = None # if no equivalent model in allowed region - default all requests to this model
|
||||
object_permission: Optional[LiteLLM_ObjectPermissionBase] = None
|
||||
|
||||
|
||||
@ -1685,15 +1693,15 @@ class NewTeamRequest(TeamBase):
|
||||
] = None # raise an error if 'guaranteed_throughput' is set and we're overallocating tpm
|
||||
|
||||
model_tpm_limit: Optional[Dict[str, int]] = None
|
||||
team_member_budget: Optional[float] = (
|
||||
None # allow user to set a budget for all team members
|
||||
)
|
||||
team_member_rpm_limit: Optional[int] = (
|
||||
None # allow user to set RPM limit for all team members
|
||||
)
|
||||
team_member_tpm_limit: Optional[int] = (
|
||||
None # allow user to set TPM limit for all team members
|
||||
)
|
||||
team_member_budget: Optional[
|
||||
float
|
||||
] = None # allow user to set a budget for all team members
|
||||
team_member_rpm_limit: Optional[
|
||||
int
|
||||
] = None # allow user to set RPM limit for all team members
|
||||
team_member_tpm_limit: Optional[
|
||||
int
|
||||
] = None # allow user to set TPM limit for all team members
|
||||
team_member_key_duration: Optional[str] = None # e.g. "1d", "1w", "1m"
|
||||
team_member_budget_duration: Optional[str] = None # e.g. "30d", "1mo"
|
||||
allowed_vector_store_indexes: Optional[List[AllowedVectorStoreIndexItem]] = None
|
||||
@ -1790,9 +1798,9 @@ class BlockKeyRequest(LiteLLMPydanticObjectBase):
|
||||
|
||||
class AddTeamCallback(LiteLLMPydanticObjectBase):
|
||||
callback_name: str
|
||||
callback_type: Optional[Literal["success", "failure", "success_and_failure"]] = (
|
||||
"success_and_failure"
|
||||
)
|
||||
callback_type: Optional[
|
||||
Literal["success", "failure", "success_and_failure"]
|
||||
] = "success_and_failure"
|
||||
callback_vars: Dict[str, str]
|
||||
|
||||
@model_validator(mode="before")
|
||||
@ -2134,9 +2142,9 @@ class ConfigList(LiteLLMPydanticObjectBase):
|
||||
stored_in_db: Optional[bool]
|
||||
field_default_value: Any
|
||||
premium_field: bool = False
|
||||
nested_fields: Optional[List[FieldDetail]] = (
|
||||
None # For nested dictionary or Pydantic fields
|
||||
)
|
||||
nested_fields: Optional[
|
||||
List[FieldDetail]
|
||||
] = None # For nested dictionary or Pydantic fields
|
||||
|
||||
|
||||
class UserHeaderMapping(LiteLLMPydanticObjectBase):
|
||||
@ -2495,9 +2503,9 @@ class UserAPIKeyAuth(
|
||||
user_max_budget: Optional[float] = None
|
||||
request_route: Optional[str] = None
|
||||
user: Optional[Any] = None # Expanded user object when expand=user is used
|
||||
created_by_user: Optional[Any] = (
|
||||
None # Expanded created_by user when expand=user is used
|
||||
)
|
||||
created_by_user: Optional[
|
||||
Any
|
||||
] = None # Expanded created_by user when expand=user is used
|
||||
end_user_object_permission: Optional[LiteLLM_ObjectPermissionTable] = None
|
||||
# Decoded upstream IdP claims (groups, roles, etc.) propagated by JWT auth machinery
|
||||
# and forwarded into outbound tokens by guardrails such as MCPJWTSigner.
|
||||
@ -2636,9 +2644,9 @@ class LiteLLM_OrganizationMembershipTable(LiteLLMPydanticObjectBase):
|
||||
budget_id: Optional[str] = None
|
||||
created_at: datetime
|
||||
updated_at: datetime
|
||||
user: Optional[Any] = (
|
||||
None # You might want to replace 'Any' with a more specific type if available
|
||||
)
|
||||
user: Optional[
|
||||
Any
|
||||
] = None # You might want to replace 'Any' with a more specific type if available
|
||||
litellm_budget_table: Optional[LiteLLM_BudgetTable] = None
|
||||
user_email: Optional[str] = None
|
||||
|
||||
@ -3793,9 +3801,9 @@ class TeamModelDeleteRequest(BaseModel):
|
||||
# Organization Member Requests
|
||||
class OrganizationMemberAddRequest(OrgMemberAddRequest):
|
||||
organization_id: str
|
||||
max_budget_in_organization: Optional[float] = (
|
||||
None # Users max budget within the organization
|
||||
)
|
||||
max_budget_in_organization: Optional[
|
||||
float
|
||||
] = None # Users max budget within the organization
|
||||
|
||||
|
||||
class OrganizationMemberDeleteRequest(MemberDeleteRequest):
|
||||
@ -4050,9 +4058,9 @@ class ProviderBudgetResponse(LiteLLMPydanticObjectBase):
|
||||
Maps provider names to their budget configs.
|
||||
"""
|
||||
|
||||
providers: Dict[str, ProviderBudgetResponseObject] = (
|
||||
{}
|
||||
) # Dictionary mapping provider names to their budget configurations
|
||||
providers: Dict[
|
||||
str, ProviderBudgetResponseObject
|
||||
] = {} # Dictionary mapping provider names to their budget configurations
|
||||
|
||||
|
||||
class ProxyStateVariables(TypedDict):
|
||||
@ -4214,9 +4222,9 @@ class LiteLLM_JWTAuth(LiteLLMPydanticObjectBase):
|
||||
enforce_rbac: bool = False
|
||||
roles_jwt_field: Optional[str] = None # v2 on role mappings
|
||||
role_mappings: Optional[List[RoleMapping]] = None
|
||||
object_id_jwt_field: Optional[str] = (
|
||||
None # can be either user / team, inferred from the role mapping
|
||||
)
|
||||
object_id_jwt_field: Optional[
|
||||
str
|
||||
] = None # can be either user / team, inferred from the role mapping
|
||||
scope_mappings: Optional[List[ScopeMapping]] = None
|
||||
enforce_scope_based_access: bool = False
|
||||
enforce_team_based_model_access: bool = False
|
||||
|
||||
@@ -28,6 +28,7 @@ from litellm.types.agents import (
     MakeAgentsPublicRequest,
     PatchAgentRequest,
 )
+from litellm.litellm_core_utils.litellm_logging import _get_masked_values
 from litellm.types.llms.custom_http import httpxSpecialProvider
 from litellm.types.proxy.management_endpoints.common_daily_activity import (
     SpendAnalyticsPaginatedResponse,
@@ -36,6 +37,28 @@ from litellm.types.proxy.management_endpoints.common_daily_activity import (
 router = APIRouter()


+def _redact_sensitive_agent_fields(
+    agents: List[AgentResponse],
+) -> List[AgentResponse]:
+    """
+    Return copies of the given agents with sensitive configuration fields
+    redacted. The original objects are not modified.
+    """
+    redacted: List[AgentResponse] = []
+    for agent in agents:
+        copy = agent.model_copy(deep=True)
+        copy.static_headers = None
+        copy.extra_headers = None
+        if copy.litellm_params:
+            copy.litellm_params = _get_masked_values(
+                copy.litellm_params,
+                unmasked_length=4,
+                number_of_asterisks=4,
+            )
+        redacted.append(copy)
+    return redacted
+
+
 def _check_agent_management_permission(user_api_key_dict: UserAPIKeyAuth) -> None:
     """
     Raises HTTP 403 if the caller does not have permission to create, update,
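Reviewer note: the redaction helper above returns masked copies and leaves the stored objects alone. The sketch below shows that copy-then-mask pattern with plain dictionaries. It does not call or reproduce `_get_masked_values`; `mask_value`, `redact_params`, and the `SENSITIVE_HINTS` key patterns are assumptions made up for this illustration, and the real masking rules may differ.

```python
from typing import Any, Dict

# Assumed key patterns for this sketch only.
SENSITIVE_HINTS = ("key", "secret", "token", "password")


def mask_value(value: str, unmasked_length: int = 4, number_of_asterisks: int = 4) -> str:
    """Keep a few leading and trailing characters and hide the rest."""
    if unmasked_length <= 0 or len(value) <= unmasked_length:
        return "*" * number_of_asterisks
    head = unmasked_length // 2
    tail = unmasked_length - head
    return value[:head] + "*" * number_of_asterisks + value[-tail:]


def redact_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with sensitive-looking string values masked."""
    redacted: Dict[str, Any] = {}
    for name, value in params.items():
        looks_sensitive = any(hint in name.lower() for hint in SENSITIVE_HINTS)
        if isinstance(value, str) and looks_sensitive:
            redacted[name] = mask_value(value)
        else:
            redacted[name] = value
    return redacted


original = {"api_key": "sk-1234567890", "model": "gpt-4"}
masked = redact_params(original)
```

Building a new dictionary, like the `model_copy(deep=True)` call in the hunk, means an admin request served later still sees the unmasked configuration.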
@ -183,6 +206,14 @@ async def get_agents(
|
||||
agent.agent_id in litellm.public_agent_groups
|
||||
)
|
||||
|
||||
# Redact sensitive fields for non-admin users
|
||||
is_admin = (
|
||||
user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN
|
||||
or user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
|
||||
)
|
||||
if not is_admin:
|
||||
returned_agents = _redact_sensitive_agent_fields(returned_agents)
|
||||
|
||||
if health_check:
|
||||
agents_with_url = [
|
||||
agent
|
||||
@ -399,6 +430,14 @@ async def get_agent_by_id(
|
||||
status_code=404, detail=f"Agent with ID {agent_id} not found"
|
||||
)
|
||||
|
||||
# Redact sensitive fields for non-admin users
|
||||
is_admin = (
|
||||
user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN
|
||||
or user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
|
||||
)
|
||||
if not is_admin:
|
||||
agent = _redact_sensitive_agent_fields([agent])[0]
|
||||
|
||||
return agent
|
||||
except HTTPException:
|
||||
raise
|
||||
|
||||
@ -196,9 +196,7 @@ def _is_model_cost_zero(
|
||||
return True
|
||||
|
||||
|
||||
def _is_cost_explicitly_configured(
|
||||
model: str, llm_router: "Router"
|
||||
) -> bool:
|
||||
def _is_cost_explicitly_configured(model: str, llm_router: "Router") -> bool:
|
||||
"""
|
||||
Check if any deployment in the model group has cost fields explicitly
|
||||
set in its litellm.model_cost entry.
|
||||
@ -215,10 +213,7 @@ def _is_cost_explicitly_configured(
|
||||
if model_id is None:
|
||||
continue
|
||||
raw_entry = litellm.model_cost.get(model_id, {})
|
||||
if (
|
||||
"input_cost_per_token" in raw_entry
|
||||
or "output_cost_per_token" in raw_entry
|
||||
):
|
||||
if "input_cost_per_token" in raw_entry or "output_cost_per_token" in raw_entry:
|
||||
return True
|
||||
return False
|
||||
|
||||
@@ -596,17 +591,12 @@ async def common_checks(  # noqa: PLR0915
         user_object=user_object, route=route, request_body=request_body
     )

-    token_team = getattr(valid_token, "team_id", None)
-    token_type: Literal["ui", "api"] = (
-        "ui" if token_team is not None and token_team == "litellm-dashboard" else "api"
-    )
-    _is_route_allowed = _is_allowed_route(
+    _is_route_allowed = _is_api_route_allowed(
         route=route,
-        token_type=token_type,
-        user_obj=user_object,
         request=request,
         request_data=request_body,
         valid_token=valid_token,
+        user_obj=user_object,
     )

     # 11. [OPTIONAL] Vector store checks - is the object allowed to access the vector store
@ -629,31 +619,6 @@ async def common_checks( # noqa: PLR0915
|
||||
return True
|
||||
|
||||
|
||||
def _is_ui_route(
|
||||
route: str,
|
||||
user_obj: Optional[LiteLLM_UserTable] = None,
|
||||
) -> bool:
|
||||
"""
|
||||
- Check if the route is a UI used route
|
||||
"""
|
||||
# this token is only used for managing the ui
|
||||
allowed_routes = LiteLLMRoutes.ui_routes.value
|
||||
# check if the current route startswith any of the allowed routes
|
||||
if (
|
||||
route is not None
|
||||
and isinstance(route, str)
|
||||
and any(route.startswith(allowed_route) for allowed_route in allowed_routes)
|
||||
):
|
||||
# Do something if the current route starts with any of the allowed routes
|
||||
return True
|
||||
elif any(
|
||||
RouteChecks._route_matches_pattern(route=route, pattern=allowed_route)
|
||||
for allowed_route in allowed_routes
|
||||
):
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
def _get_user_role(
|
||||
user_obj: Optional[LiteLLM_UserTable],
|
||||
) -> Optional[LitellmUserRoles]:
|
||||
@ -717,30 +682,6 @@ def _is_user_proxy_admin(user_obj: Optional[LiteLLM_UserTable]):
|
||||
return False
|
||||
|
||||
|
||||
def _is_allowed_route(
|
||||
route: str,
|
||||
token_type: Literal["ui", "api"],
|
||||
request: Request,
|
||||
request_data: dict,
|
||||
valid_token: Optional[UserAPIKeyAuth],
|
||||
user_obj: Optional[LiteLLM_UserTable] = None,
|
||||
) -> bool:
|
||||
"""
|
||||
- Route b/w ui token check and normal token check
|
||||
"""
|
||||
|
||||
if token_type == "ui" and _is_ui_route(route=route, user_obj=user_obj):
|
||||
return True
|
||||
else:
|
||||
return _is_api_route_allowed(
|
||||
route=route,
|
||||
request=request,
|
||||
request_data=request_data,
|
||||
valid_token=valid_token,
|
||||
user_obj=user_obj,
|
||||
)
|
||||
|
||||
|
||||
def _allowed_routes_check(user_route: str, allowed_routes: list) -> bool:
|
||||
"""
|
||||
Return if a user is allowed to access route. Helper function for `allowed_routes_check`.
|
||||
|
||||
@@ -60,12 +60,20 @@ def _get_guardrails_list_response(
     """
     Helper function to get the guardrails list response
     """
+    from litellm.litellm_core_utils.litellm_logging import _get_masked_values
+
     guardrail_configs: List[GuardrailInfoResponse] = []
     for guardrail in guardrails_config:
+        litellm_params = guardrail.get("litellm_params") or {}
+        masked_params = _get_masked_values(
+            litellm_params,
+            unmasked_length=4,
+            number_of_asterisks=4,
+        )
         guardrail_configs.append(
             GuardrailInfoResponse(
                 guardrail_name=guardrail.get("guardrail_name"),
-                litellm_params=guardrail.get("litellm_params"),
+                litellm_params=masked_params,
                 guardrail_info=guardrail.get("guardrail_info"),
             )
         )
@@ -456,6 +456,34 @@ def handle_key_type(data: GenerateKeyRequest, data_json: dict) -> dict:
     return data_json


+def _check_allowed_routes_caller_permission(
+    allowed_routes: Optional[list],
+    user_api_key_dict: UserAPIKeyAuth,
+) -> None:
+    """
+    Only proxy admins may set `allowed_routes` on a key.
+
+    `allowed_routes` bypasses the standard role-based route gate in
+    RouteChecks.non_proxy_admin_allowed_routes_check, so if a non-admin is
+    allowed to set it they can grant themselves access to any endpoint.
+    Non-admins should use `key_type` to pick a preset route bucket instead.
+    """
+    # Empty list is the default on GenerateKeyRequest — treat as "not set".
+    if not allowed_routes:
+        return
+    if user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value:
+        return
+    raise HTTPException(
+        status_code=403,
+        detail={
+            "error": (
+                "Only proxy admins can set `allowed_routes` on a key. "
+                "Use `key_type` to pick a preset route bucket instead."
+            )
+        },
+    )
+
+
 async def validate_team_id_used_in_service_account_request(
     team_id: Optional[str],
     prisma_client: Optional[PrismaClient],
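Reviewer note: the decision logic of the new permission check is small enough to test without FastAPI. The sketch below swaps `HTTPException` for a local `Forbidden` exception and takes the caller role as a plain string; the role strings, `Forbidden`, and `is_rejected` are illustrative assumptions, not names from the codebase.

```python
from typing import List, Optional

PROXY_ADMIN = "proxy_admin"      # assumed role value for this sketch
INTERNAL_USER = "internal_user"  # assumed role value for this sketch


class Forbidden(Exception):
    """Stand-in for the HTTP 403 raised by the real helper."""


def check_allowed_routes_permission(
    allowed_routes: Optional[List[str]], caller_role: Optional[str]
) -> None:
    # None and [] both mean "not set", so any caller passes.
    if not allowed_routes:
        return
    if caller_role == PROXY_ADMIN:
        return
    raise Forbidden("Only proxy admins can set `allowed_routes` on a key.")


def is_rejected(allowed_routes: Optional[List[str]], caller_role: Optional[str]) -> bool:
    """Return True if the check raises for this combination."""
    try:
        check_allowed_routes_permission(allowed_routes, caller_role)
    except Forbidden:
        return True
    return False


admin_sets_routes = is_rejected(["/spend/logs"], PROXY_ADMIN)
user_sets_routes = is_rejected(["/spend/logs"], INTERNAL_USER)
user_leaves_default = is_rejected([], INTERNAL_USER)
```

Treating the empty list as "not set" matters because, per the comment in the hunk, it is the default on `GenerateKeyRequest`; without that early return every non-admin key request would be rejected.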
@ -740,9 +768,9 @@ async def _common_key_generation_helper( # noqa: PLR0915
|
||||
request_type="key", **data_json, table_name="key"
|
||||
)
|
||||
|
||||
response[
|
||||
"soft_budget"
|
||||
] = data.soft_budget # include the user-input soft budget in the response
|
||||
response["soft_budget"] = (
|
||||
data.soft_budget
|
||||
) # include the user-input soft budget in the response
|
||||
|
||||
response = GenerateKeyResponse(**response)
|
||||
|
||||
@ -1254,6 +1282,12 @@ async def generate_key_fn(
|
||||
raise HTTPException(
|
||||
status_code=status.HTTP_403_FORBIDDEN, detail=message
|
||||
)
|
||||
|
||||
_check_allowed_routes_caller_permission(
|
||||
allowed_routes=data.allowed_routes,
|
||||
user_api_key_dict=user_api_key_dict,
|
||||
)
|
||||
|
||||
# For non-admin internal users: auto-assign caller's user_id if not provided
|
||||
# This prevents creating unbound keys with no user association (LIT-1884)
|
||||
_is_proxy_admin = (
|
||||
@ -1888,6 +1922,11 @@ async def _validate_update_key_data(
|
||||
"""Validate permissions and constraints for key update."""
|
||||
_is_proxy_admin = user_api_key_dict.user_role == LitellmUserRoles.PROXY_ADMIN.value
|
||||
|
||||
_check_allowed_routes_caller_permission(
|
||||
allowed_routes=data.allowed_routes,
|
||||
user_api_key_dict=user_api_key_dict,
|
||||
)
|
||||
|
||||
# Prevent non-admin from removing user_id (setting to empty string) (LIT-1884)
|
||||
if data.user_id is not None and data.user_id == "" and not _is_proxy_admin:
|
||||
raise HTTPException(
|
||||
@ -3233,10 +3272,10 @@ async def delete_verification_tokens(
|
||||
try:
|
||||
if prisma_client:
|
||||
tokens = [_hash_token_if_needed(token=key) for key in tokens]
|
||||
_keys_being_deleted: List[
|
||||
LiteLLM_VerificationToken
|
||||
] = await prisma_client.db.litellm_verificationtoken.find_many(
|
||||
where={"token": {"in": tokens}}
|
||||
_keys_being_deleted: List[LiteLLM_VerificationToken] = (
|
||||
await prisma_client.db.litellm_verificationtoken.find_many(
|
||||
where={"token": {"in": tokens}}
|
||||
)
|
||||
)
|
||||
|
||||
if len(_keys_being_deleted) == 0:
|
||||
@ -3436,9 +3475,9 @@ async def _rotate_master_key( # noqa: PLR0915
|
||||
from litellm.proxy.proxy_server import proxy_config
|
||||
|
||||
try:
|
||||
models: Optional[
|
||||
List
|
||||
] = await prisma_client.db.litellm_proxymodeltable.find_many()
|
||||
models: Optional[List] = (
|
||||
await prisma_client.db.litellm_proxymodeltable.find_many()
|
||||
)
|
||||
except Exception:
|
||||
models = None
|
||||
# 2. process model table
|
||||
@ -4078,11 +4117,11 @@ async def validate_key_list_check(
|
||||
param="user_id",
|
||||
code=status.HTTP_403_FORBIDDEN,
|
||||
)
|
||||
complete_user_info_db_obj: Optional[
|
||||
BaseModel
|
||||
] = await prisma_client.db.litellm_usertable.find_unique(
|
||||
where={"user_id": user_api_key_dict.user_id},
|
||||
include={"organization_memberships": True},
|
||||
complete_user_info_db_obj: Optional[BaseModel] = (
|
||||
await prisma_client.db.litellm_usertable.find_unique(
|
||||
where={"user_id": user_api_key_dict.user_id},
|
||||
include={"organization_memberships": True},
|
||||
)
|
||||
)
|
||||
|
||||
if complete_user_info_db_obj is None:
|
||||
@ -4165,10 +4204,10 @@ async def _fetch_user_team_objects(
|
||||
if complete_user_info is None or not complete_user_info.teams:
|
||||
return []
|
||||
|
||||
teams: Optional[
|
||||
List[BaseModel]
|
||||
] = await prisma_client.db.litellm_teamtable.find_many(
|
||||
where={"team_id": {"in": complete_user_info.teams}}
|
||||
teams: Optional[List[BaseModel]] = (
|
||||
await prisma_client.db.litellm_teamtable.find_many(
|
||||
where={"team_id": {"in": complete_user_info.teams}}
|
||||
)
|
||||
)
|
||||
if teams is None:
|
||||
return []
|
||||
|
||||
@ -1,6 +1,7 @@
|
||||
#### CRUD ENDPOINTS for UI Settings #####
|
||||
import json
|
||||
from typing import Any, Dict, List, Optional, Union
|
||||
from urllib.parse import urlparse
|
||||
|
||||
from fastapi import APIRouter, Depends, File, HTTPException, UploadFile
|
||||
|
||||
@ -817,6 +818,29 @@ async def get_ui_theme_settings():
|
||||
)
|
||||
|
||||
|
||||
+def _validate_public_image_url(value: Optional[str], field_name: str) -> None:
+    """
+    Reject anything that isn't a plain http(s) URL with a host. This value is
+    later served via the unauthenticated /get_image endpoint, so local paths
+    like "/etc/passwd" or "file://..." must not be accepted.
+    """
+    if value is None:
+        return
+    if not isinstance(value, str) or not value.strip():
+        return
+    parsed = urlparse(value.strip())
+    if parsed.scheme not in ("http", "https") or not parsed.netloc:
+        raise HTTPException(
+            status_code=400,
+            detail={
+                "error": (
+                    f"Invalid {field_name}: must be an http(s) URL with a host. "
+                    "Local filesystem paths and non-http schemes are not allowed."
+                )
+            },
+        )
|
||||
|
||||
|
||||
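To see which inputs the scheme-and-host check accepts, the same `urlparse` test can be run against a few sample values. This is a sketch under assumptions: `is_public_image_url` is a hypothetical boolean variant of the validator above, and the candidate strings are illustrative.

```python
from typing import Optional
from urllib.parse import urlparse


def is_public_image_url(value: Optional[str]) -> bool:
    """Return True for None/blank (treated as unset) or an http(s) URL with a host."""
    if value is None or not value.strip():
        return True  # nothing to validate; mirrors the early returns in the diff
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


candidates = [
    "https://example.com/logo.png",  # http(s) scheme and a host
    "/etc/passwd",                   # no scheme
    "file:///etc/passwd",            # non-http scheme
    "../../etc/passwd",              # relative path, no scheme
    "http:///no-host.png",           # http scheme but empty host
]
results = {c: is_public_image_url(c) for c in candidates}
```

Only the first candidate passes. Blank values also pass, matching the source, where they mean the field is unset.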
@router.patch(
|
||||
"/update/ui_theme_settings",
|
||||
tags=["UI Theme Settings"],
|
||||
@ -831,6 +855,9 @@ async def update_ui_theme_settings(theme_config: UIThemeConfig):
|
||||
|
||||
from litellm.proxy.proxy_server import proxy_config, store_model_in_db
|
||||
|
||||
_validate_public_image_url(theme_config.logo_url, "logo_url")
|
||||
_validate_public_image_url(theme_config.favicon_url, "favicon_url")
|
||||
|
||||
if store_model_in_db is not True:
|
||||
raise HTTPException(
|
||||
status_code=500,
|
||||
|
||||
@ -2645,7 +2645,7 @@ class PrismaClient:
|
||||
raise e
|
||||
|
||||
     async def _query_first_with_cached_plan_fallback(
-        self, sql_query: str
+        self, sql_query: str, *args
     ) -> Optional[dict]:
         """
         Execute a query with automatic fallback for PostgreSQL cached plan errors.
|
||||
@ -2664,7 +2664,7 @@ class PrismaClient:
|
||||
Original exception if not a cached plan error
|
||||
"""
|
||||
         try:
-            return await self.db.query_first(query=sql_query)
+            return await self.db.query_first(sql_query, *args)
         except Exception as e:
             error_str = str(e)
             if "cached plan must not change result type" in error_str:
|
||||
@ -2679,7 +2679,7 @@ class PrismaClient:
|
||||
"retrying with fresh plan. This may occur during rolling deployments "
|
||||
"when schema changes are applied."
|
||||
                 )
-                return await self.db.query_first(query=sql_query_retry)
+                return await self.db.query_first(sql_query_retry, *args)
             else:
                 raise
|
||||
|
||||
@ -2978,7 +2978,7 @@ class PrismaClient:
|
||||
detail={"error": f"No token passed in. Token={token}"},
|
||||
)
|
||||
|
||||
sql_query = f"""
|
||||
sql_query = """
|
||||
SELECT
|
||||
v.*,
|
||||
t.spend AS team_spend,
|
||||
@ -3016,11 +3016,11 @@ class PrismaClient:
|
||||
             LEFT JOIN "LiteLLM_ProjectTable" AS p ON v.project_id = p.project_id
             LEFT JOIN "LiteLLM_OrganizationTable" AS o ON v.organization_id = o.organization_id
             LEFT JOIN "LiteLLM_BudgetTable" AS b2 ON o.budget_id = b2.budget_id
-            WHERE v.token = '{token}'
+            WHERE v.token = $1
         """

         response = await self._query_first_with_cached_plan_fallback(
-            sql_query
+            sql_query, hashed_token
         )

         # If not found in main table, check deprecated keys (grace period)
|
||||
|
||||
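The hunk above replaces string interpolation (`'{token}'`) with a bound parameter (`$1`). The difference can be sketched with the standard-library `sqlite3` module as a stand-in. The `tokens` table and its row are hypothetical, and SQLite uses `?` placeholders where the diff uses PostgreSQL-style `$1`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (token TEXT PRIMARY KEY, spend REAL)")
conn.execute("INSERT INTO tokens VALUES ('abc123', 1.5)")

malicious = "nope' OR '1'='1"

# Interpolated: the value becomes part of the SQL text, so the embedded
# quote closes the literal and the OR clause matches every row.
unsafe_sql = f"SELECT token FROM tokens WHERE token = '{malicious}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Parameterized: the value is sent separately and compared as a plain string.
safe_rows = conn.execute(
    "SELECT token FROM tokens WHERE token = ?", (malicious,)
).fetchall()
```

With the interpolated query the crafted value returns the stored row; with the bound parameter it matches nothing.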
@ -1967,11 +1967,18 @@ async def _aresponses_websocket(
|
||||
)
|
||||
|
||||
     # Extract params that we're passing explicitly to avoid duplicates in **kwargs
-    remaining_kwargs = {
-        k: v
-        for k, v in kwargs.items()
-        if k not in {"user_api_key_dict", "litellm_metadata"}
+    _explicit_keys = {
+        "user_api_key_dict",
+        "litellm_metadata",
+        "custom_llm_provider",
+        "model",
+        "websocket",
+        "litellm_logging_obj",
+        "api_base",
+        "api_key",
+        "timeout",
     }
+    remaining_kwargs = {k: v for k, v in kwargs.items() if k not in _explicit_keys}

     await base_llm_http_handler.async_responses_websocket(
         model=model,
|
||||
|
||||
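The reason for filtering out explicitly passed keys is that Python raises `TypeError` when the same keyword arrives both explicitly and through `**kwargs`. The sketch below reproduces that failure and the fix. `handler`, `call_unfiltered`, `call_filtered`, and the three-key `EXPLICIT_KEYS` set are hypothetical and much smaller than the real call.

```python
from typing import Any, Dict

EXPLICIT_KEYS = {"model", "api_key", "timeout"}  # hypothetical subset for the demo


def handler(*, model: str, api_key: str, timeout: float, **extra: Any) -> Dict[str, Any]:
    """Hypothetical callee that takes some arguments explicitly."""
    return {"model": model, "api_key": api_key, "timeout": timeout, "extra": extra}


def call_unfiltered(**kwargs: Any) -> Dict[str, Any]:
    # "model" is passed explicitly and may also be inside kwargs -> TypeError.
    return handler(model="gpt-x", api_key="k", timeout=1.0, **kwargs)


def call_filtered(**kwargs: Any) -> Dict[str, Any]:
    # Drop every key that is already passed explicitly before forwarding.
    remaining = {k: v for k, v in kwargs.items() if k not in EXPLICIT_KEYS}
    return handler(model="gpt-x", api_key="k", timeout=1.0, **remaining)


incoming = {"model": "other", "timeout": 5.0, "metadata": {"trace": "t1"}}

try:
    call_unfiltered(**incoming)
    unfiltered_error = None
except TypeError as exc:
    unfiltered_error = str(exc)

filtered = call_filtered(**incoming)
```

The filtered call forwards only `metadata`, and the explicit arguments win for the keys that were dropped.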
@ -1,6 +1,6 @@
|
||||
 [project]
 name = "litellm"
-version = "1.83.5"
+version = "1.83.6"
 description = "Library to easily interface with LLM API providers"
 readme = "README.md"
 requires-python = ">=3.9"
|
||||
@ -238,7 +238,7 @@ source-exclude = [
|
||||
 profile = "black"

 [tool.commitizen]
-version = "1.83.5"
+version = "1.83.6"
 version_files = [
     "pyproject.toml:^version",
 ]
|
||||
|
||||
@ -934,10 +934,7 @@ async def mock_user_object(*args, **kwargs):
|
||||
user_id = kwargs.get("user_id")
|
||||
user_email = kwargs.get("user_email")
|
||||
return LiteLLM_UserTable(
|
||||
spend=0,
|
||||
user_id=user_id,
|
||||
max_budget=None,
|
||||
user_email=user_email
|
||||
spend=0, user_id=user_id, max_budget=None, user_email=user_email
|
||||
)
|
||||
|
||||
|
||||
@ -1170,15 +1167,13 @@ async def test_end_user_jwt_auth(monkeypatch):
|
||||
# use generated key to auth in
|
||||
from litellm import Router
|
||||
from litellm.types.router import RouterGeneralSettings
|
||||
|
||||
|
||||
# Create a router with pass_through_all_models enabled
|
||||
router = Router(
|
||||
model_list=[],
|
||||
router_general_settings=RouterGeneralSettings(
|
||||
pass_through_all_models=True
|
||||
),
|
||||
router_general_settings=RouterGeneralSettings(pass_through_all_models=True),
|
||||
)
|
||||
|
||||
|
||||
setattr(litellm.proxy.proxy_server, "premium_user", True)
|
||||
setattr(
|
||||
litellm.proxy.proxy_server,
|
||||
@ -1196,7 +1191,7 @@ async def test_end_user_jwt_auth(monkeypatch):
|
||||
|
||||
cost_tracking()
|
||||
result = await user_api_key_auth(request=request, api_key=bearer_token)
|
||||
|
||||
|
||||
# Assert that end_user_id is correctly extracted from JWT token's 'sub' field
|
||||
assert result.end_user_id == "81b3e52a-67a6-4efb-9645-70527e101479"
|
||||
|
||||
@ -1228,7 +1223,9 @@ async def test_end_user_jwt_auth(monkeypatch):
|
||||
),
|
||||
)
|
||||
|
||||
with patch("litellm.acompletion", new=AsyncMock(return_value=mock_response)) as mock_completion:
|
||||
with patch(
|
||||
"litellm.acompletion", new=AsyncMock(return_value=mock_response)
|
||||
) as mock_completion:
|
||||
resp = await chat_completion(
|
||||
request=request,
|
||||
fastapi_response=temp_response,
|
||||
@ -1243,10 +1240,13 @@ async def test_end_user_jwt_auth(monkeypatch):
|
||||
# Verify the completion was called with correct end_user_id
|
||||
mock_completion.assert_called_once()
|
||||
call_kwargs = mock_completion.call_args.kwargs
|
||||
|
||||
|
||||
# end_user_id is passed in metadata as 'user_api_key_end_user_id'
|
||||
metadata = call_kwargs.get("metadata", {})
|
||||
assert metadata.get("user_api_key_end_user_id") == "81b3e52a-67a6-4efb-9645-70527e101479"
|
||||
assert (
|
||||
metadata.get("user_api_key_end_user_id")
|
||||
== "81b3e52a-67a6-4efb-9645-70527e101479"
|
||||
)
|
||||
|
||||
|
||||
def test_can_rbac_role_call_route():
|
||||
@ -1278,13 +1278,13 @@ def test_user_api_key_auth_jwt_hashing():
|
||||
"""
|
||||
from litellm.proxy._types import UserAPIKeyAuth
|
||||
from litellm.proxy.auth.handle_jwt import JWTHandler
|
||||
|
||||
|
||||
# Test with a JWT token (3 parts separated by dots)
|
||||
jwt_token = "test-jwt-token-header.payload.signature"
|
||||
|
||||
|
||||
# Create UserAPIKeyAuth instance with JWT
|
||||
user_auth = UserAPIKeyAuth(api_key=jwt_token)
|
||||
|
||||
|
||||
# Verify that the API key is hashed with "hashed-jwt-" prefix
|
||||
# critical - the raw JWT token should not be in the api_key or token
|
||||
assert user_auth.api_key.startswith("hashed-jwt-")
|
||||
@ -1292,19 +1292,18 @@ def test_user_api_key_auth_jwt_hashing():
|
||||
assert jwt_token not in user_auth.api_key
|
||||
assert jwt_token not in user_auth.token
|
||||
|
||||
|
||||
# Test with a regular API key (should not be hashed)
|
||||
regular_api_key = "sk-1234567890abcdef"
|
||||
user_auth_regular = UserAPIKeyAuth(api_key=regular_api_key)
|
||||
|
||||
|
||||
# Verify that regular API key is hashed normally (without "hashed-jwt-" prefix)
|
||||
assert not user_auth_regular.api_key.startswith("hashed-jwt-")
|
||||
assert not user_auth_regular.token.startswith("hashed-jwt-")
|
||||
|
||||
|
||||
# Test with a non-JWT, non-sk string (should not be hashed)
|
||||
non_jwt_key = "some-random-key"
|
||||
user_auth_non_jwt = UserAPIKeyAuth(api_key=non_jwt_key)
|
||||
|
||||
|
||||
# Verify that non-JWT key is not hashed
|
||||
assert user_auth_non_jwt.api_key == non_jwt_key
|
||||
assert user_auth_non_jwt.token == non_jwt_key
|
||||
@ -1315,19 +1314,19 @@ def test_jwt_handler_is_jwt_static_method():
|
||||
Test that JWTHandler.is_jwt is a static method and works correctly
|
||||
"""
|
||||
from litellm.proxy.auth.handle_jwt import JWTHandler
|
||||
|
||||
|
||||
# Test with valid JWT format
|
||||
valid_jwt = "test-jwt-token-header.payload.signature"
|
||||
assert JWTHandler.is_jwt(valid_jwt) == True
|
||||
|
||||
|
||||
# Test with invalid JWT format (only 2 parts)
|
||||
invalid_jwt = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ"
|
||||
assert JWTHandler.is_jwt(invalid_jwt) == False
|
||||
|
||||
|
||||
# Test with regular API key
|
||||
regular_key = "sk-1234567890abcdef"
|
||||
assert JWTHandler.is_jwt(regular_key) == False
|
||||
|
||||
|
||||
# Test with empty string
|
||||
assert JWTHandler.is_jwt("") == False
|
||||
|
||||
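The four assertions in this test describe a purely structural check: a string counts as a JWT when it has exactly three dot-separated parts. The implementation of `JWTHandler.is_jwt` is not part of this diff, so the function below is a hypothetical sketch that merely satisfies the same cases. It checks shape only and verifies nothing about the signature.

```python
def looks_like_jwt(token: str) -> bool:
    """Hypothetical shape check: exactly three dot-separated segments."""
    return len(token.split(".")) == 3


samples = {
    "three_parts": looks_like_jwt("test-jwt-token-header.payload.signature"),
    "two_parts": looks_like_jwt("header.payload"),
    "api_key": looks_like_jwt("sk-1234567890abcdef"),
    "empty": looks_like_jwt(""),
}
```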
@ -1461,7 +1460,13 @@ async def test_auth_jwt_es256_jwk_path(monkeypatch):
|
||||
|
||||
now = int(time.time())
|
||||
token = jwt.encode(
|
||||
{"sub": "alice", "aud": "litellm-proxy", "iss": "http://example", "iat": now, "exp": now + 300},
|
||||
{
|
||||
"sub": "alice",
|
||||
"aud": "litellm-proxy",
|
||||
"iss": "http://example",
|
||||
"iat": now,
|
||||
"exp": now + 300,
|
||||
},
|
||||
ec_priv_pem,
|
||||
algorithm="ES256",
|
||||
headers={"kid": "ec1"},
|
||||
@ -1508,7 +1513,13 @@ async def test_auth_jwt_rs256_regression(monkeypatch):
|
||||
|
||||
now = int(time.time())
|
||||
token = jwt.encode(
|
||||
{"sub": "bob", "aud": "litellm-proxy", "iss": "http://example", "iat": now, "exp": now + 300},
|
||||
{
|
||||
"sub": "bob",
|
||||
"aud": "litellm-proxy",
|
||||
"iss": "http://example",
|
||||
"iat": now,
|
||||
"exp": now + 300,
|
||||
},
|
||||
rsa_priv_pem,
|
||||
algorithm="RS256",
|
||||
headers={"kid": "rsa1"},
|
||||
@ -1540,7 +1551,13 @@ async def test_auth_jwt_mismatched_key_fails(monkeypatch):
|
||||
)
|
||||
now = int(time.time())
|
||||
token = jwt.encode(
|
||||
{"sub": "mallory", "aud": "litellm-proxy", "iss": "http://example", "iat": now, "exp": now + 300},
|
||||
{
|
||||
"sub": "mallory",
|
||||
"aud": "litellm-proxy",
|
||||
"iss": "http://example",
|
||||
"iat": now,
|
||||
"exp": now + 300,
|
||||
},
|
||||
ec_priv_pem,
|
||||
algorithm="ES256",
|
||||
headers={"kid": "ec1"},
|
||||
@ -1566,4 +1583,4 @@ async def test_auth_jwt_mismatched_key_fails(monkeypatch):
|
||||
with patch.object(h, "get_public_key", new=AsyncMock(return_value=rsa_jwk)):
|
||||
with pytest.raises(Exception) as exc:
|
||||
await h.auth_jwt(token)
|
||||
assert "Validation fails" in str(exc.value)
|
||||
assert "Validation fails" in str(exc.value)
|
||||
|
||||
@ -359,27 +359,38 @@ async def test_auth_with_allowed_routes(route, should_raise_error):
|
||||
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
"route, user_role, expected_result",
|
||||
"route, user_role, should_be_allowed",
|
||||
[
|
||||
# Proxy Admin checks
|
||||
# Admin can access everything
|
||||
("/config/update", "proxy_admin", True),
|
||||
("/global/spend/logs", "proxy_admin", True),
|
||||
("/key/delete", "proxy_admin", False),
|
||||
("/key/generate", "proxy_admin", False),
|
||||
("/key/regenerate", "proxy_admin", False),
|
||||
# Internal User checks - allowed routes
|
||||
("/global/activity/cache_hits", "proxy_admin", True),
|
||||
# Internal User - allowed read-only routes
|
||||
("/global/spend/logs", "internal_user", True),
|
||||
("/key/delete", "internal_user", False),
|
||||
("/key/generate", "internal_user", False),
|
||||
("/key/82akk800000000jjsk/regenerate", "internal_user", False),
|
||||
# Internal User Viewer
|
||||
("/key/generate", "internal_user_viewer", False),
|
||||
# Internal User checks - disallowed routes
|
||||
("/spend/logs/ui", "internal_user", True),
|
||||
("/global/activity/cache_hits", "internal_user", True),
|
||||
("/health/services", "internal_user", True),
|
||||
# Internal User - BLOCKED from admin routes (security fix)
|
||||
("/config/update", "internal_user", False),
|
||||
("/config/pass_through_endpoint", "internal_user", False),
|
||||
("/config/field/update", "internal_user", False),
|
||||
("/organization/member_add", "internal_user", False),
|
||||
# Internal User Viewer - allowed spend routes only
|
||||
("/spend/logs/ui", "internal_user_viewer", True),
|
||||
("/global/spend/all_tag_names", "internal_user_viewer", True),
|
||||
# Internal User Viewer - blocked from admin routes
|
||||
("/config/update", "internal_user_viewer", False),
|
||||
("/key/generate", "internal_user_viewer", False),
|
||||
],
|
||||
)
|
||||
def test_is_ui_route_allowed(route, user_role, expected_result):
|
||||
from litellm.proxy.auth.auth_checks import _is_ui_route
|
||||
from litellm.proxy._types import LiteLLM_UserTable
|
||||
def test_ui_token_route_access(route, user_role, should_be_allowed):
|
||||
"""
|
||||
Verify that UI tokens (team_id=litellm-dashboard) go through the same
|
||||
RBAC checks as API tokens. Non-admin dashboard users must not be able
|
||||
to access admin-only routes like /config/update.
|
||||
"""
|
||||
from litellm.proxy.auth.auth_checks import _is_api_route_allowed
|
||||
from litellm.proxy._types import LiteLLM_UserTable, UserAPIKeyAuth
|
||||
|
||||
user_obj = LiteLLM_UserTable(
|
||||
user_id="3b803c0e-666e-4e99-bd5c-6e534c07e297",
|
||||
@ -395,18 +406,36 @@ def test_is_ui_route_allowed(route, user_role, expected_result):
|
||||
organization_memberships=[],
|
||||
)
|
||||
|
||||
received_args: dict = {
|
||||
"route": route,
|
||||
"user_obj": user_obj,
|
||||
}
|
||||
try:
|
||||
assert _is_ui_route(**received_args) == expected_result
|
||||
except Exception as e:
|
||||
# If expected result is False, we expect an error
|
||||
if expected_result is False:
|
||||
pass
|
||||
else:
|
||||
raise e
|
||||
valid_token = UserAPIKeyAuth(
|
||||
user_id="3b803c0e-666e-4e99-bd5c-6e534c07e297",
|
||||
team_id="litellm-dashboard",
|
||||
user_role=user_role,
|
||||
)
|
||||
|
||||
from starlette.datastructures import URL
|
||||
from fastapi import Request
|
||||
|
||||
request = Request(scope={"type": "http"})
|
||||
request._url = URL(url=route)
|
||||
|
||||
if should_be_allowed:
|
||||
result = _is_api_route_allowed(
|
||||
route=route,
|
||||
request=request,
|
||||
request_data={},
|
||||
valid_token=valid_token,
|
||||
user_obj=user_obj,
|
||||
)
|
||||
assert result is True
|
||||
else:
|
||||
with pytest.raises(Exception):
|
||||
_is_api_route_allowed(
|
||||
route=route,
|
||||
request=request,
|
||||
request_data={},
|
||||
valid_token=valid_token,
|
||||
user_obj=user_obj,
|
||||
)
|
||||
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
@ -684,7 +713,7 @@ async def test_soft_budget_alert():
|
||||
|
||||
|
||||
def test_is_allowed_route():
|
||||
from litellm.proxy.auth.auth_checks import _is_allowed_route
|
||||
from litellm.proxy.auth.auth_checks import _is_api_route_allowed
|
||||
from litellm.proxy._types import UserAPIKeyAuth
|
||||
import datetime
|
||||
|
||||
@ -692,7 +721,6 @@ def test_is_allowed_route():
|
||||
|
||||
args = {
|
||||
"route": "/embeddings",
|
||||
"token_type": "api",
|
||||
"request": request,
|
||||
"request_data": {"input": ["hello world"], "model": "embedding-small"},
|
||||
"valid_token": UserAPIKeyAuth(
|
||||
@ -752,7 +780,7 @@ def test_is_allowed_route():
|
||||
"user_obj": None,
|
||||
}
|
||||
|
||||
assert _is_allowed_route(**args)
|
||||
assert _is_api_route_allowed(**args)
|
||||
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
@ -836,7 +864,6 @@ async def test_user_api_key_auth_websocket():
|
||||
with patch(
|
||||
"litellm.proxy.auth.user_api_key_auth.user_api_key_auth", autospec=True
|
||||
) as mock_user_api_key_auth:
|
||||
|
||||
# Make the call to the WebSocket function
|
||||
await user_api_key_auth_websocket(mock_websocket)
|
||||
|
||||
@ -845,10 +872,14 @@ async def test_user_api_key_auth_websocket():
|
||||
|
||||
# Get the request object that was passed to user_api_key_auth
|
||||
request_arg = mock_user_api_key_auth.call_args.kwargs["request"]
|
||||
|
||||
|
||||
# Verify that the request has headers set
|
||||
assert hasattr(request_arg, "headers"), "Request object should have headers attribute"
|
||||
assert "authorization" in request_arg.headers, "Request headers should contain authorization"
|
||||
assert hasattr(
|
||||
request_arg, "headers"
|
||||
), "Request object should have headers attribute"
|
||||
assert (
|
||||
"authorization" in request_arg.headers
|
||||
), "Request headers should contain authorization"
|
||||
assert request_arg.headers["authorization"] == "Bearer some_api_key"
|
||||
|
||||
assert (
|
||||
@ -1036,7 +1067,10 @@ async def test_jwt_non_admin_team_route_access(monkeypatch):
|
||||
|
||||
# Create request
|
||||
request = Request(
|
||||
scope={"type": "http", "headers": [(b"authorization", b"Bearer fake.jwt.token")]}
|
||||
scope={
|
||||
"type": "http",
|
||||
"headers": [(b"authorization", b"Bearer fake.jwt.token")],
|
||||
}
|
||||
)
|
||||
request._url = URL(url="/team/new")
|
||||
|
||||
@ -1101,14 +1135,14 @@ async def test_x_litellm_api_key():
|
||||
ignored_key = "aj12445"
|
||||
|
||||
# Create request with headers as bytes
|
||||
request = Request(
|
||||
scope={
|
||||
"type": "http"
|
||||
}
|
||||
)
|
||||
request = Request(scope={"type": "http"})
|
||||
request._url = URL(url="/chat/completions")
|
||||
|
||||
valid_token = await user_api_key_auth(request=request, api_key="Bearer " + ignored_key, custom_litellm_key_header=master_key)
|
||||
valid_token = await user_api_key_auth(
|
||||
request=request,
|
||||
api_key="Bearer " + ignored_key,
|
||||
custom_litellm_key_header=master_key,
|
||||
)
|
||||
assert valid_token.token == hash_token(master_key)
|
||||
|
||||
|
||||
@ -1123,7 +1157,9 @@ async def test_user_api_key_from_query_param():
|
||||
from litellm.proxy.proxy_server import hash_token, user_api_key_cache
|
||||
|
||||
user_key = "sk-query-1234"
|
||||
user_api_key_cache.set_cache(key=hash_token(user_key), value=UserAPIKeyAuth(token=hash_token(user_key)))
|
||||
user_api_key_cache.set_cache(
|
||||
key=hash_token(user_key), value=UserAPIKeyAuth(token=hash_token(user_key))
|
||||
)
|
||||
|
||||
setattr(litellm.proxy.proxy_server, "user_api_key_cache", user_api_key_cache)
|
||||
setattr(litellm.proxy.proxy_server, "master_key", "sk-1234")
|
||||
@ -1136,7 +1172,9 @@ async def test_user_api_key_from_query_param():
|
||||
"query_string": f"alt=sse&key={user_key}".encode(),
|
||||
}
|
||||
)
|
||||
request._url = URL(url=f"/v1beta/models/gemini:streamGenerateContent?alt=sse&key={user_key}")
|
||||
request._url = URL(
|
||||
url=f"/v1beta/models/gemini:streamGenerateContent?alt=sse&key={user_key}"
|
||||
)
|
||||
|
||||
async def return_body():
|
||||
return b"{}"
|
||||
@ -1145,4 +1183,3 @@ async def test_user_api_key_from_query_param():
|
||||
|
||||
valid_token = await user_api_key_auth(request=request, api_key="")
|
||||
assert valid_token.token == hash_token(user_key)
|
||||
|
||||
|
||||
@ -3,6 +3,7 @@ import json
|
||||
import os
|
||||
import sys
|
||||
from datetime import datetime
|
||||
from unittest.mock import Mock
|
||||
|
||||
import pytest
|
||||
|
||||
@ -125,7 +126,8 @@ async def test_bedrock_sse_wrapper_keeps_usage_in_message_start_and_message_delt
|
||||
assert "usage" in delta_json
|
||||
assert delta_json["usage"]["cache_creation_input_tokens"] == 1562
|
||||
assert delta_json["usage"]["cache_read_input_tokens"] == 32392
|
||||
assert delta_json["usage"]["input_tokens"] == 3 + 1562 + 32392
|
||||
assert delta_json["usage"]["input_tokens"] == 3
|
||||
assert delta_json["usage"]["output_tokens"] == 8
|
||||
|
||||
|
||||
def test_chunk_parser_usage_transformation():
|
||||
@ -402,3 +404,111 @@ def test_bedrock_messages_strips_output_config_with_output_format():
|
||||
|
||||
assert "output_config" not in result
|
||||
assert "output_format" not in result
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_promote_message_stop_usage_preserves_message_delta_output_tokens():
|
||||
"""
|
||||
Bedrock unified /messages streaming can send full usage on message_delta and a
|
||||
conflicting smaller usage on message_stop (e.g. output_tokens 9 vs 12).
|
||||
_promote_message_stop_usage must not replace message_delta output_tokens.
|
||||
"""
|
||||
cfg = AmazonAnthropicClaudeMessagesConfig()
|
||||
|
||||
async def _stream(): # type: ignore[return-type]
|
||||
yield {
|
||||
"type": "message_delta",
|
||||
"delta": {"stop_reason": "end_turn", "stop_sequence": None},
|
||||
"usage": {
|
||||
"input_tokens": 3,
|
||||
"cache_creation_input_tokens": 10553,
|
||||
"cache_read_input_tokens": 25490,
|
||||
"output_tokens": 12,
|
||||
},
|
||||
}
|
||||
yield {
|
||||
"type": "message_stop",
|
||||
"usage": {"input_tokens": 3, "output_tokens": 9},
|
||||
}
|
||||
|
||||
merged: list[dict] = []
|
||||
async for chunk in cfg._promote_message_stop_usage(_stream()):
|
||||
if isinstance(chunk, dict):
|
||||
merged.append(chunk)
|
||||
|
||||
assert len(merged) >= 1
|
||||
delta_out = merged[0]
|
||||
assert delta_out["type"] == "message_delta"
|
||||
assert delta_out["usage"]["output_tokens"] == 12
|
||||
assert delta_out["usage"]["cache_creation_input_tokens"] == 10553
|
||||
assert delta_out["usage"]["cache_read_input_tokens"] == 25490
|
||||
assert delta_out["usage"]["input_tokens"] == 3
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_unified_bedrock_messages_sse_usage_and_cost_claude_sonnet_46():
|
||||
"""
|
||||
End-to-end for Bedrock Invoke Anthropic Messages (unified) streaming path:
|
||||
dict chunks -> _promote_message_stop_usage -> bedrock_sse_wrapper SSE bytes ->
|
||||
same logging reconstruction as Anthropic /messages. Ensures token counts and
|
||||
completion_cost match model_prices for us.anthropic.claude-sonnet-4-6.
|
||||
"""
|
||||
from litellm import completion_cost
|
||||
from litellm.proxy.pass_through_endpoints.llm_provider_handlers.anthropic_passthrough_logging_handler import (
|
||||
AnthropicPassthroughLoggingHandler,
|
||||
)
|
||||
|
||||
cfg = AmazonAnthropicClaudeMessagesConfig()
|
||||
|
||||
async def _stream(): # type: ignore[return-type]
|
||||
yield {
|
||||
"type": "message_delta",
|
||||
"delta": {"stop_reason": "end_turn", "stop_sequence": None},
|
||||
"usage": {
|
||||
"input_tokens": 3,
|
||||
"cache_creation_input_tokens": 10553,
|
||||
"cache_read_input_tokens": 25490,
|
||||
"output_tokens": 12,
|
||||
},
|
||||
}
|
||||
yield {
|
||||
"type": "message_stop",
|
||||
"usage": {"input_tokens": 3, "output_tokens": 9},
|
||||
}
|
||||
|
||||
logging_obj = LiteLLMLoggingObj(
|
||||
model="bedrock/us.anthropic.claude-sonnet-4-6",
|
||||
messages=[{"role": "user", "content": "Hello"}],
|
||||
stream=True,
|
||||
call_type="chat",
|
||||
start_time=datetime.now(),
|
||||
litellm_call_id="test_unified_bedrock_messages_sse_cost",
|
||||
function_id="test_unified_bedrock_messages_sse_cost",
|
||||
)
|
||||
|
||||
collected: list[bytes] = []
|
||||
async for sse in cfg.bedrock_sse_wrapper(
|
||||
completion_stream=_stream(),
|
||||
litellm_logging_obj=logging_obj,
|
||||
request_body={"model": "us.anthropic.claude-sonnet-4-6"},
|
||||
):
|
||||
collected.append(sse)
|
||||
|
||||
built = AnthropicPassthroughLoggingHandler._build_complete_streaming_response(
|
||||
all_chunks=collected,
|
||||
model="us.anthropic.claude-sonnet-4-6",
|
||||
litellm_logging_obj=Mock(),
|
||||
)
|
||||
assert built.usage is not None
|
||||
assert built.usage.completion_tokens == 12
|
||||
assert built.usage.prompt_tokens == 36046
|
||||
assert built.usage.total_tokens == 36058
|
||||
assert built.usage.cache_creation_input_tokens == 10553
|
||||
assert built.usage.cache_read_input_tokens == 25490
|
||||
|
||||
cost = completion_cost(
|
||||
completion_response=built,
|
||||
model="bedrock/us.anthropic.claude-sonnet-4-6",
|
||||
custom_llm_provider="bedrock",
|
||||
)
|
||||
assert cost == pytest.approx(0.052150725, rel=0, abs=1e-9)
|
||||
|
||||
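The token counts asserted above are consistent with prompt tokens being the sum of uncached, cache-creation, and cache-read input tokens from the `message_delta` usage. That relationship is inferred from the numbers in the test, not from the handler's code, so the sketch below is only a check of the arithmetic.

```python
# Usage block from the message_delta chunk in the test stream.
usage = {
    "input_tokens": 3,
    "cache_creation_input_tokens": 10553,
    "cache_read_input_tokens": 25490,
    "output_tokens": 12,
}

# Assumed relationship: prompt tokens include both cache categories.
prompt_tokens = (
    usage["input_tokens"]
    + usage["cache_creation_input_tokens"]
    + usage["cache_read_input_tokens"]
)
total_tokens = prompt_tokens + usage["output_tokens"]
```

The completion count of 12 comes from `message_delta`; the smaller value of 9 on `message_stop` is the one the test expects to be ignored.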
@ -8579,3 +8579,170 @@ def test_enforce_upperbound_no_config_is_noop():
|
||||
assert data.tpm_limit == 999999
|
||||
finally:
|
||||
litellm.upperbound_key_generate_params = original
|
||||
|
||||
|
||||
class TestAllowedRoutesCallerPermission:
|
||||
"""
|
||||
Non-admins must not be able to set `allowed_routes` on a key. The field
|
||||
bypasses the role-based route gate in
|
||||
RouteChecks.non_proxy_admin_allowed_routes_check, so allowing a non-admin
|
||||
to populate it grants them arbitrary endpoint access.
|
||||
"""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_non_admin_generate_key_with_allowed_routes_rejected(self):
|
||||
data = GenerateKeyRequest(
|
||||
key_alias="escalate",
|
||||
allowed_routes=["/*"],
|
||||
)
|
||||
user_api_key_dict = UserAPIKeyAuth(
|
||||
user_id="internal-user-123",
|
||||
user_role=LitellmUserRoles.INTERNAL_USER,
|
||||
)
|
||||
mock_prisma_client = AsyncMock()
|
||||
|
||||
with patch("litellm.proxy.proxy_server.prisma_client", mock_prisma_client), patch(
|
||||
"litellm.proxy.proxy_server.user_api_key_cache", MagicMock()
|
||||
), patch("litellm.proxy.proxy_server.user_custom_key_generate", None), patch(
|
||||
"litellm.proxy.management_endpoints.key_management_endpoints._common_key_generation_helper",
|
||||
new_callable=AsyncMock,
|
||||
return_value=MagicMock(),
|
||||
):
|
||||
with pytest.raises(ProxyException) as exc_info:
|
||||
await generate_key_fn(
|
||||
data=data,
|
||||
user_api_key_dict=user_api_key_dict,
|
||||
litellm_changed_by=None,
|
||||
)
|
||||
assert str(exc_info.value.code) == "403"
|
||||
assert "allowed_routes" in str(exc_info.value.message)
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_admin_generate_key_with_allowed_routes_allowed(self):
|
||||
data = GenerateKeyRequest(
|
||||
key_alias="admin-key",
|
||||
allowed_routes=["/chat/completions"],
|
||||
user_id="admin-user",
|
||||
)
|
||||
user_api_key_dict = UserAPIKeyAuth(
|
||||
user_id="admin-user",
|
||||
user_role=LitellmUserRoles.PROXY_ADMIN,
|
||||
)
|
||||
mock_prisma_client = AsyncMock()
|
||||
stub_response = MagicMock()
|
||||
|
||||
with patch("litellm.proxy.proxy_server.prisma_client", mock_prisma_client), patch(
|
||||
"litellm.proxy.proxy_server.user_api_key_cache", MagicMock()
|
||||
), patch("litellm.proxy.proxy_server.user_custom_key_generate", None), patch(
|
||||
"litellm.proxy.management_endpoints.key_management_endpoints._common_key_generation_helper",
|
||||
new_callable=AsyncMock,
|
||||
return_value=stub_response,
|
||||
):
|
||||
result = await generate_key_fn(
|
||||
data=data,
|
||||
user_api_key_dict=user_api_key_dict,
|
||||
litellm_changed_by=None,
|
||||
)
|
||||
assert result is stub_response
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_non_admin_generate_key_default_empty_allowed_routes_ok(self):
|
||||
"""
|
||||
Regression guard: GenerateKeyRequest.allowed_routes defaults to [], so
|
||||
the helper must treat empty-list as "not set" or every non-admin key
|
||||
creation breaks.
|
||||
"""
|
||||
data = GenerateKeyRequest(key_alias="plain-key")
|
||||
user_api_key_dict = UserAPIKeyAuth(
|
||||
user_id="internal-user-123",
|
||||
user_role=LitellmUserRoles.INTERNAL_USER,
|
||||
)
|
||||
mock_prisma_client = AsyncMock()
|
||||
stub_response = MagicMock()
|
||||
|
||||
with patch("litellm.proxy.proxy_server.prisma_client", mock_prisma_client), patch(
|
||||
"litellm.proxy.proxy_server.user_api_key_cache", MagicMock()
|
||||
), patch("litellm.proxy.proxy_server.user_custom_key_generate", None), patch(
|
||||
"litellm.proxy.management_endpoints.key_management_endpoints._common_key_generation_helper",
|
||||
new_callable=AsyncMock,
|
||||
return_value=stub_response,
|
||||
):
|
||||
result = await generate_key_fn(
|
||||
data=data,
|
||||
user_api_key_dict=user_api_key_dict,
|
||||
litellm_changed_by=None,
|
||||
)
|
||||
assert result is stub_response

    @pytest.mark.asyncio
    async def test_non_admin_update_key_with_allowed_routes_rejected(self):
        from litellm.proxy.management_endpoints.key_management_endpoints import (
            update_key_fn,
        )

        data = UpdateKeyRequest(key="sk-test", allowed_routes=["/*"])
        user_api_key_dict = UserAPIKeyAuth(
            user_id="internal-user-123",
            user_role=LitellmUserRoles.INTERNAL_USER,
        )
        mock_prisma_client = AsyncMock()

        with patch("litellm.proxy.proxy_server.prisma_client", mock_prisma_client), patch(
            "litellm.proxy.proxy_server.user_api_key_cache", MagicMock()
        ), patch("litellm.proxy.proxy_server.user_custom_key_update", None), patch(
            "litellm.proxy.proxy_server.llm_router", None
        ), patch("litellm.proxy.proxy_server.premium_user", True), patch(
            "litellm.proxy.proxy_server.proxy_logging_obj", MagicMock()
        ), patch(
            "litellm.proxy.management_endpoints.key_management_endpoints._get_and_validate_existing_key",
            new_callable=AsyncMock,
            return_value=MagicMock(),
        ):
            with pytest.raises(ProxyException) as exc_info:
                await update_key_fn(
                    request=MagicMock(),
                    data=data,
                    user_api_key_dict=user_api_key_dict,
                    litellm_changed_by=None,
                )
        assert str(exc_info.value.code) == "403"
        assert "allowed_routes" in str(exc_info.value.message)


def test_jinja_prompt_manager_is_sandboxed():
    """
    PromptManager renders user-supplied templates via /prompts/test, so its
    jinja env must reject access to unsafe Python attributes like
    ``__class__`` and ``__mro__``.
    """
    from jinja2.exceptions import SecurityError

    from litellm.integrations.dotprompt.prompt_manager import PromptManager

    pm = PromptManager()
    template = pm.jinja_env.from_string("{{ ''.__class__.__mro__ }}")
    with pytest.raises(SecurityError):
        template.render()


def test_validate_public_image_url_rejects_local_paths():
    from litellm.proxy.ui_crud_endpoints.proxy_setting_endpoints import (
        _validate_public_image_url,
    )

    for bad in ("/etc/passwd", "file:///etc/passwd", "../../etc/passwd"):
        with pytest.raises(HTTPException) as exc_info:
            _validate_public_image_url(bad, "logo_url")
        assert exc_info.value.status_code == 400


def test_validate_public_image_url_accepts_http_and_noop_empty():
    from litellm.proxy.ui_crud_endpoints.proxy_setting_endpoints import (
        _validate_public_image_url,
    )

    _validate_public_image_url("https://example.com/logo.png", "logo_url")
    _validate_public_image_url("http://cdn.internal/logo.svg", "logo_url")
    _validate_public_image_url(None, "logo_url")
    _validate_public_image_url("", "logo_url")
    _validate_public_image_url(" ", "logo_url")
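
The two tests above pin down the validator's contract: local paths and `file://` URLs are rejected, `http`/`https` URLs pass, and `None`, empty, or whitespace-only values are a no-op. One way that contract could be met is sketched below using only the standard library. This is an assumption about the shape of the check, not the code in `proxy_setting_endpoints`: the function name is hypothetical, and it raises `ValueError` where the real helper raises an `HTTPException` with status 400.

```python
from typing import Optional
from urllib.parse import urlparse


def validate_public_image_url_sketch(value: Optional[str], field_name: str) -> None:
    # None, empty, and whitespace-only values mean "no image configured".
    if value is None or not value.strip():
        return
    parsed = urlparse(value.strip())
    # Require an explicit http(s) scheme and a host. This rejects bare
    # filesystem paths, relative paths, and file:// URLs.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"{field_name} must be an http(s) URL")
```

An allow-list of schemes is used here rather than a deny-list of path patterns, so schemes that were not anticipated are rejected by default.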
@@ -1,6 +1,6 @@
-import { screen, waitFor } from "@testing-library/react";
+import { cleanup, screen, waitFor } from "@testing-library/react";
 import userEvent from "@testing-library/user-event";
-import { beforeEach, describe, expect, it, vi } from "vitest";
+import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
 import { renderWithProviders } from "../../tests/test-utils";
 import { UserEditView } from "./user_edit_view";

@@ -140,6 +140,15 @@ describe("UserEditView", () => {
     vi.clearAllMocks();
   });

+  afterEach(() => {
+    // Tremor's internal Tooltip sets a setTimeout that fires after teardown,
+    // causing "window is not defined". Flush pending timers before cleanup.
+    vi.useFakeTimers();
+    vi.runAllTimers();
+    vi.useRealTimers();
+    cleanup();
+  });
+
   it("should render", async () => {
     renderWithProviders(<UserEditView {...defaultProps} />);