litellm/schema.prisma
Sameer Kankute d671a09c20
Litellm oss staging 050626 (#29774)
* Mark xAI models retiring on 2026-05-15 (#28788)

Per https://docs.x.ai/developers/migration/may-15-retirement, xAI is
retiring the following slugs on 2026-05-15 (auto-redirect to grok-4.3
with various reasoning efforts; callers continuing to use the old slugs
will be billed at grok-4.3 pricing):

  grok-4-1-fast-reasoning{,-latest}      -> grok-4.3 (low effort)
  grok-4-1-fast-non-reasoning{,-latest}  -> grok-4.3 (none)
  grok-4-fast-reasoning                  -> grok-4.3 (low effort)
  grok-4-fast-non-reasoning              -> grok-4.3 (none)
  grok-4-0709                            -> grok-4.3 (low effort)
  grok-code-fast-1{,-0825}               -> grok-build-0.1
  grok-3                                 -> grok-4.3 (none)

Only the direct xai/ slugs are tagged; third-party hosts (azure_ai,
oci, vercel_ai_gateway, perplexity/xai) run their own schedules. The
grok-3 retirement list explicitly names only the base grok-3 slug — the
-mini / -fast / -beta / -latest variants are not listed, so they remain
untouched.

* feat(moonshot): advertise json_schema response support on live models (#29683)

litellm.responses() already routes Moonshot through the responses->chat-completions
bridge, and Moonshot honors response_format json_schema on chat completions. The
cost-map entries left supports_response_schema unset, so discovery layers that gate
on that flag dropped Moonshot from structured-output / responses listings even though
the capability works end to end.

Set supports_response_schema on the nine models currently live on api.moonshot.ai:
kimi-k2.5, kimi-k2.6, the moonshot-v1 8k/32k/128k text and vision-preview variants,
and moonshot-v1-auto. Verified against the live API that each honors json_schema and
that litellm.responses() returns schema-valid structured output through the bridge.

* chore(moonshot): mark models retired from api.moonshot.ai as deprecated (#29685)

Thirteen Moonshot/Kimi models in the cost map no longer resolve on
api.moonshot.ai (all return 404). Stamp each with its deprecation_date from
platform.kimi.ai/docs/models rather than deleting the entries, so historical
cost calculation keeps resolving the names while tooling can surface the
retirement.

Dates: kimi-thinking-preview 2025-11-11; kimi-latest and its 8k/32k/128k context
variants 2026-01-28; the kimi-k2 preview/turbo/thinking series 2026-05-25; the
moonshot-v1 -0430 snapshots use their own 2024-04-30 snapshot date (Moonshot
publishes no discontinuation date for them).
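
A minimal sketch of how tooling might surface the retirement from a stamped entry. Only the `deprecation_date` key comes from the description above; `is_retired` and the dict shape are hypothetical and not the actual LiteLLM helper.

```python
from datetime import date
from typing import Any, Dict, Optional


def is_retired(entry: Dict[str, Any], today: Optional[date] = None) -> bool:
    """Return True when the entry's deprecation_date is on or before `today`."""
    raw = entry.get("deprecation_date")
    if not raw:
        # Entries without a stamp are treated as still live.
        return False
    today = today or date.today()
    return date.fromisoformat(raw) <= today
```

The entry stays in the map either way, so cost lookups by name keep working; only the retirement flag changes.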

* fix(moonshot): drop temperature for reasoning models (kimi-k2.5/k2.6) (#29687)

Kimi reasoning models reject every temperature except 1; a request with
temperature=0.2 returns "invalid temperature: only 1 is allowed for this model".
litellm only clamped temperature into [0.3, 1], so any value below 1 still 400'd.

Drop the temperature param entirely for reasoning models (gated on
supports_reasoning, the same signal transform_request already uses) so the model
default is used; the non-reasoning moonshot-v1 models keep the existing clamp.
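
A minimal sketch of the gating described above, not the actual transformation code. `map_temperature` is a hypothetical name; the [0.3, 1] clamp and the `supports_reasoning` signal are taken from the description.

```python
from typing import Optional


def map_temperature(temperature: Optional[float], supports_reasoning: bool) -> Optional[float]:
    """Return the temperature to send upstream, or None to omit the param."""
    if temperature is None:
        return None
    if supports_reasoning:
        # Reasoning models only accept temperature=1, so omit the param
        # entirely and let the model default apply.
        return None
    # Non-reasoning moonshot-v1 models keep the existing clamp into [0.3, 1].
    return min(max(temperature, 0.3), 1.0)
```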

Co-authored-by: Sameer Kankute <sameer@berri.ai>

* feat(mcp): add per-server timeout configuration (#29672)

* feat(mcp): add per-server timeout configuration

* fix(mcp): address timeout field review comments

- use an `is not None` guard instead of `or` so a 0.0 timeout is not treated as unset
- copy timeout in both LiteLLM_MCPServerTable constructions (health check path + _build_mcp_server_table)
- add timeout Float? column to all three schema.prisma files
- extend round-trip test to cover _build_mcp_server_table direction
- add test for zero timeout not treated as falsy
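
A short sketch of the 0.0 edge case from the first bullet. `DEFAULT_TIMEOUT` and both function names are hypothetical and only illustrate the difference between the two guards.

```python
from typing import Optional

DEFAULT_TIMEOUT = 30.0  # hypothetical fallback, for illustration only


def resolve_with_or(timeout: Optional[float]) -> float:
    # Buggy: 0.0 is falsy, so an explicit zero is silently replaced.
    return timeout or DEFAULT_TIMEOUT


def resolve_with_guard(timeout: Optional[float]) -> float:
    # Only a missing value (None) falls back to the default.
    return timeout if timeout is not None else DEFAULT_TIMEOUT
```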

* fix(mcp): forward timeout in _build_temporary_mcp_server_record

* fix(mcp): return 504 instead of 500 when per-server timeout fires

* test(mcp): add 504 timeout regression test; fix black formatting

* Add jp. Bedrock cross-region inference profile for claude-opus-4-7 (#28567)

* fix(thinking): handle None thinking param in is_thinking_enabled (#28598)

Squash-merged by litellm-agent from Terrajlz's PR.

* feat(helm): support tpl rendering in podAnnotations (#28609)

Squash-merged by litellm-agent from devauxbr's PR.

* Forward custom_llm_provider through the Responses API bridge (Fixes #28505) (#28575)

* Forward custom_llm_provider through the Responses API bridge (Fixes #28505)

When a Chat Completions request to a GPT-5.4+ model contains both
`tools` and `reasoning_effort`, `completion()` auto-routes through
`responses_api_bridge`. The bridge handler called
`litellm.responses()` / `litellm.aresponses()` without forwarding the
already-resolved `custom_llm_provider`, so the downstream call
re-invoked `get_llm_provider()` with `custom_llm_provider=None` and
stripped a second provider prefix from a `provider/provider/model`
deployment string.

For a deployment configured as `openai/openai/openai/gpt-5.5`,
the bridge flow sent `openai/gpt-5.5` to the upstream API instead of
the correct `openai/openai/gpt-5.5`. Upstream APIs that enforce
model-name allow-lists rejected this as `key_model_access_denied`.

Fix: pass the locally-resolved `custom_llm_provider` into both the
sync `responses()` and async `aresponses()` calls so the downstream
`_resolve_model_provider_for_responses` sees an explicit provider
and skips the second prefix-strip.

New regression test
`tests/test_litellm/completion_extras/test_responses_bridge_provider_propagation.py`
pins both call sites: each must forward `custom_llm_provider`.

* fix(28505): set custom_llm_provider on request_data instead of as duplicate kwarg

Greptile flagged that the previous patch passed custom_llm_provider as an
explicit kwarg to responses()/aresponses() while request_data already
carried it via the spread of sanitized_litellm_params, which would raise
TypeError: got multiple values for keyword argument on every real bridge
call.

Switches to assigning request_data['custom_llm_provider'] before the call
so the resolved provider wins over whatever sanitized_litellm_params spread
in, without duplicating the kwarg.
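
A self-contained sketch of the conflict and the fix. The `responses` function here is a stand-in stub, not `litellm.responses()`, and the dict contents are made up for illustration.

```python
from typing import Any, Dict, Optional


def responses(model: str, custom_llm_provider: Optional[str] = None, **kwargs: Any) -> Dict[str, Any]:
    """Stand-in for the downstream call; echoes what it received."""
    return {"model": model, "custom_llm_provider": custom_llm_provider}


# request_data already carries custom_llm_provider from the params spread.
request_data = {"model": "openai/openai/gpt-5.5", "custom_llm_provider": "stale"}
resolved_provider = "openai"

# First patch: explicit kwarg plus the same key in the spread raises TypeError.
try:
    responses(custom_llm_provider=resolved_provider, **request_data)
    duplicate_kwarg_error = None
except TypeError as exc:
    duplicate_kwarg_error = str(exc)

# Follow-up fix: overwrite the key on request_data, then spread once.
request_data["custom_llm_provider"] = resolved_provider
result = responses(**request_data)
```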

Updates the regression test to seed request_data with a sentinel
custom_llm_provider so it actually exercises the overwrite path (the
previous test mocked transform_request with a minimal dict and never hit
the conflict).

* chore: trigger shin-agent re-eval on retargeted staging base

* chore: trigger shin-agent re-eval against updated Greptile state

* Add jp. Bedrock cross-region inference profile for claude-opus-4-7

AWS Bedrock documents jp.anthropic.claude-opus-4-7 alongside the
existing us./eu./au./global. profiles for Claude Opus 4.7
(ap-northeast-1 Tokyo / ap-northeast-3 Osaka), but the entry is
missing from model_prices_and_context_window.json. Tokyo-region
users currently get an "unknown model" error when routing through
the JP geo profile.

Adds the entry to both the canonical file and the bundled backup,
mirroring the recent pattern for sonnet-4-6 (#27831). Pricing matches
the other regional profiles (10% premium over base/global).

Regression test pins all six documented profiles (base, global, us, eu,
au, jp) and asserts pricing parity between jp. and au. variants.

Source: https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-anthropic-claude-opus-4-7.html

---------

Co-authored-by: Terrajlz <info@jouleselectrictech.com>
Co-authored-by: Bruno Devaux <devaux.br@gmail.com>
Co-authored-by: Sameer Kankute <sameer@berri.ai>

* feat(soniox): add soniox audio transcription integration (#29508)

* feat(openmeter): add OPENMETER_TRUST_REQUEST_USER to prevent forged attribution (#29650)

The OpenMeter callback resolves the CloudEvent subject from kwargs["user"]
first, then falls back to the key-bound user_api_key_user_id. For
multi-tenant proxy deployments, a client can set `"user": "..."` in the
request body and cause their usage to be attributed to that arbitrary
string — a billing-attribution forgery risk.

Adds OPENMETER_TRUST_REQUEST_USER env var (default "true" for backward
compatibility). When set to "false", the request-supplied `user` field is
ignored and the subject is resolved solely from user_api_key_user_id.
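
A minimal sketch of the subject resolution described above. `resolve_subject` and its arguments are hypothetical, and how the real callback parses the flag is an assumption; only the env var name, its default, and the fallback order come from the description.

```python
import os
from typing import Mapping, Optional


def resolve_subject(
    request_user: Optional[str],
    key_user_id: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Pick the CloudEvent subject for a usage event."""
    env = os.environ if env is None else env
    # Assumed parsing: anything other than "false" keeps the legacy behavior.
    flag = env.get("OPENMETER_TRUST_REQUEST_USER", "true").strip().lower()
    if flag != "false" and request_user:
        return request_user
    # With the flag off, only the key-bound user id is used.
    return key_user_id
```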

Matches the existing env-var-driven config pattern in this file
(OPENMETER_API_KEY, OPENMETER_API_ENDPOINT, OPENMETER_EVENT_TYPE).

* feat(search): add you_com as a search provider (#28370)

* feat(search): add you_com as a search provider

Registers You.com Search API as a first-class `search_provider` in the
`search_tools` registry, alongside Tavily, Exa, Perplexity, etc.

- New adapter: litellm/llms/you_com/search/transformation.py
  - POSTs to https://ydc-index.io/v1/search
  - Auth: X-API-Key from YOUCOM_API_KEY (or explicit api_key)
  - Maps Perplexity unified spec: max_results -> count,
    search_domain_filter -> include_domains, country -> country
  - Flattens results.web + results.news into a single SearchResult list;
    snippet prefers snippets[0], falls back to description; page_age -> date
- Registry: SearchProviders.YOU_COM in litellm/types/utils.py and wired
  into ProviderConfigManager.get_provider_search_config()
- Pricing entry: model_prices_and_context_window.json (placeholder $0.0;
  happy to adjust to maintainers' preferred public number)
- Docs: example router config snippet and example proxy yaml updated
- Tests: tests/search_tests/test_you_com_search.py - 5 mocked tests
  (payload shape, domain filter mapping, snippet fallback, news flattening,
  missing-api-key error)
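
A rough sketch of the flattening rule from the adapter bullet above, using plain dicts instead of the real `SearchResult` type. `flatten_results` is a hypothetical name; the field names follow the response shape described in this PR.

```python
from typing import Any, Dict, List


def flatten_results(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Merge results.web and results.news into one flat list."""
    results = payload.get("results") or {}
    items = (results.get("web") or []) + (results.get("news") or [])
    flattened = []
    for item in items:
        snippets = item.get("snippets") or []
        flattened.append(
            {
                "title": item.get("title", ""),
                "url": item.get("url", ""),
                # Prefer the first snippet, fall back to the description.
                "snippet": snippets[0] if snippets else item.get("description", ""),
                "date": item.get("page_age"),
            }
        )
    return flattened
```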

Refs upstream expansion signal: #15942

* review fixups: normalize api_base, lowercase country, scope env-var to test

Addresses Greptile inline review comments on #28370:

- get_complete_url: strip trailing slashes from api_base *before* the
  endswith("/v1/search") check, so a custom base like ".../v1/search/"
  doesn't become ".../v1/search/v1/search".
- transform_search_request: .lower() country before sending, matching
  Tavily's convention so callers using the unified spec form ("US") get
  consistent behavior across providers.
- Tests: replace direct os.environ writes with an autouse monkeypatch
  fixture so YOUCOM_API_KEY is set per-test and removed afterwards.
  The missing-key test now uses monkeypatch.delenv. New test asserts the
  trailing-slash normalization above.
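
A minimal sketch of the trailing-slash fix in the first bullet. `normalize_search_url` is a hypothetical stand-in for the relevant part of get_complete_url.

```python
SEARCH_PATH = "/v1/search"


def normalize_search_url(api_base: str) -> str:
    """Strip trailing slashes first, then append the path only if missing."""
    base = api_base.rstrip("/")
    if base.endswith(SEARCH_PATH):
        return base
    return base + SEARCH_PATH
```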

Reverts the ARCHITECTURE.md / example yaml edits per the reviewer note
that documentation changes belong in the litellm-docs repo.

* support keyless free tier (api.you.com/v1/agents/search) as default

You.com offers an IP-throttled keyless endpoint that returns the same
response shape as the keyed one (~100 queries/day, no signup). This is a
significant onboarding lever - mirrors the keyless DuckDuckGo/SearXNG
providers already in the search_tools registry.

Behavior:
- YOUCOM_API_KEY set        -> keyed:  POST https://ydc-index.io/v1/search
                                       (X-API-Key header)
- no key                    -> free:   POST https://api.you.com/v1/agents/search
                                       (no auth)
- YOUCOM_API_BASE override  -> honored as-is
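
The selection rules above can be sketched as follows. `select_endpoint` is a hypothetical helper that takes the already-resolved key and base as arguments; the two URLs and the X-API-Key header come from the list above.

```python
from typing import Dict, Optional, Tuple

KEYED_URL = "https://ydc-index.io/v1/search"
FREE_URL = "https://api.you.com/v1/agents/search"


def select_endpoint(
    api_key: Optional[str], api_base: Optional[str] = None
) -> Tuple[str, Dict[str, str]]:
    """Return (url, auth headers) for a search request."""
    headers: Dict[str, str] = {}
    if api_key:
        headers["X-API-Key"] = api_key
    if api_base:
        # An explicit override is honored as-is.
        return api_base, headers
    # No override: a key selects the keyed endpoint, otherwise the free tier.
    return (KEYED_URL if api_key else FREE_URL), headers
```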

Tests:
- New: test_you_com_search_keyless_free_tier - asserts URL + absence of
  X-API-Key when no key is configured.
- New: test_you_com_search_validate_environment_keyless - asserts the
  config no longer raises when the key is absent.
- Removed: test_you_com_search_raises_without_api_key (the precondition
  no longer holds).
- Existing payload/domain-filter/etc tests still cover keyed mode via
  the autouse YOUCOM_API_KEY fixture.

Verified both endpoints accept POST + return identical JSON shape:
  results.web[] / results.news[] with title, url, snippets, description,
  page_age.

* register you_com in provider_endpoints_support.json

Adding `litellm/llms/you_com/` requires a corresponding entry in
provider_endpoints_support.json or the
code-quality/check_provider_folders_documented CI check fails.

Follows the compact tavily/serper pattern - endpoints: { search: true }.
Local run of the check now reports "All 114 provider folders are documented".

* move tests under tests/test_litellm/llms/ so CI exercises them

The litellm CI workflows scope unit tests to `tests/test_litellm/...`
(see test-unit-llm-providers.yml: `tests/test_litellm/llms` path), so
tests living under `tests/search_tests/` are never run in CI - which is
why codecov reports 0% patch coverage for the new adapter even though
the unit tests exist and pass locally.

Move test_you_com_search.py into `tests/test_litellm/llms/you_com/` so
the test-unit-llm-providers job picks it up. 7/7 tests still pass at
the new location.

(Sibling search-only providers - tavily, exa_ai, brave, etc. - still
live only in `tests/search_tests/` and would benefit from the same
move, but that is out of scope for this PR.)

* fix(you_com): pin Accept-Encoding: identity to dodge keyless gzip bug

The keyless free-tier endpoint (api.you.com/v1/agents/search) advertises
Content-Encoding: gzip but returns a body that httpx's decoder rejects
with `zlib.error: Error -3 while decompressing data: incorrect header
check`, surfacing as litellm.APIConnectionError in user code. curl works
because it doesn't request compression by default.

Pin Accept-Encoding: identity in validate_environment so the upstream
server skips compression entirely. Harmless on the keyed endpoint
(ydc-index.io/v1/search) which negotiates content-encoding correctly.

The header uses setdefault so a caller-supplied Accept-Encoding still
takes precedence. (Server-side bug has been flagged to the You.com team
separately - once fixed there, this workaround can be removed.)
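
A small sketch of the setdefault behavior described above. `with_identity_encoding` is a hypothetical helper; note that a plain dict is case-sensitive, so this only respects a caller header spelled exactly `Accept-Encoding`.

```python
from typing import Dict, Optional


def with_identity_encoding(headers: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of headers that asks the server to skip compression."""
    merged = dict(headers or {})
    # setdefault leaves a caller-supplied Accept-Encoding untouched.
    merged.setdefault("Accept-Encoding", "identity")
    return merged
```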

New unit test: test_you_com_search_pins_identity_accept_encoding.

---------

Co-authored-by: Sameer Kankute <sameer@berri.ai>

* docs: fix README typo (#29419)

Correct clear spelling mistakes in documentation without changing behavior.

Confidence: high
Scope-risk: narrow
Tested: git diff --check; uvx codespell on changed files
Not-tested: Full docs build not run; text-only changes

* Fix(langfuse): pass httpx_client to Langfuse in langfuse_prompt_management to respect SSL_VERIFY (#29480)

* fix(langfuse): pass ssl_verify to Langfuse httpx client

* fix_langfuse_

* add unit tests

* addressed comments

---------

Co-authored-by: shin-berri <shin-laptop@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>

* feat(models): add minimax/MiniMax-M3 to model cost map (#29412)

Add MiniMax's new flagship MiniMax-M3 to the native minimax provider:
512K context, 128K max output, native multimodal (supports_vision),
reasoning, prompt caching. Pricing (USD/M tokens): input 0.6 / output
2.4 / cache read 0.12. M3 has no active prompt-cache-write tier, so
cache_creation_input_token_cost is omitted.

Updated both the root model_prices_and_context_window.json (remote
source) and the bundled litellm/model_prices_and_context_window_backup.json
(local fallback), keeping them in sync.

* fix(logging): handle ResponseCompletedEvent in anthropic_messages streaming spend log (#29394)

* fix(logging): handle ResponseCompletedEvent in anthropic_messages streaming spend log

* fix(logging): extend terminal event handling to ResponseIncompleteEvent and ResponseFailedEvent; fix return type annotation

* feat(provider): Add Neosantara provider as OpenAI Compatible (#29646)

* Add Neosantara provider

* Register Neosantara provider enum

* Address Neosantara provider review feedback

* Add Neosantara packaged endpoint support

---------

Co-authored-by: shin-berri <shin-laptop@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>

* fix: address greptile and veria review feedback

- langfuse: guard httpx_client injection behind version check (>= 2.7.3)
- soniox: propagate audio_transcription_duration in _hidden_params for spend tracking
- soniox: give SONIOX_API_BASE env var priority over caller-supplied api_base
- mcp: replace CancelledError catch with asyncio.wait_for + TimeoutError
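
A minimal sketch of the `asyncio.wait_for` + TimeoutError pattern from the mcp bullet, mapped to a 504. `GatewayTimeout`, `call_with_timeout`, and the two demo coroutines are hypothetical; the real handler's exception type is not shown here.

```python
import asyncio
from typing import Any, Awaitable, Optional


class GatewayTimeout(Exception):
    """Hypothetical error carrying the HTTP status to return."""

    status_code = 504


async def call_with_timeout(call: Awaitable[Any], timeout: Optional[float]) -> Any:
    """Await `call`, converting a timeout into a 504-style error."""
    try:
        # timeout=None means wait without a limit.
        return await asyncio.wait_for(call, timeout=timeout)
    except asyncio.TimeoutError as exc:
        raise GatewayTimeout(f"MCP call exceeded {timeout}s") from exc


async def slow_tool() -> str:
    await asyncio.sleep(1)
    return "done"


async def fast_tool() -> str:
    return "done"
```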

* chore(mcp): add migration for per-server timeout column

* fix(test): add tool_use_system_prompt_tokens to model prices schema validator

* fix: mcp timeout test uses real asyncio.wait_for timeout; you_com get_complete_url respects resolved api_key

* fix: forward resolved api_key into you_com endpoint selection and apply timeout to soniox polling GETs

The search flow resolves api_key in validate_environment but never passed it
into get_complete_url, so a programmatic api_key (with no YOUCOM_API_KEY in the
env) set the X-API-Key header yet still selected the keyless free-tier endpoint.
Forward api_key through both the search entrypoint and the http handler so the
keyed endpoint is chosen.

HTTPHandler.get/AsyncHTTPHandler.get had no timeout parameter, so the Soniox
poll and transcript-fetch GETs silently used the client global default instead
of the caller timeout. Add a per-request timeout to get() and forward the
configured timeout from the Soniox handler.

* fix(soniox): price stt-async-v4 per second so transcriptions are billed

The handler stores audio_transcription_duration in _hidden_params, but the
model carried only token cost fields and the response has no token usage, so
the transcription cost path fell through to cost_per_second and returned $0.
An authenticated caller could transcribe Soniox audio without decrementing
their budget. Switch the entry to output_cost_per_second at Soniox's published
$0.10/hour async rate so the stored duration produces a real charge.
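
The arithmetic behind the per-second rate, as a sketch. `transcription_cost` is a hypothetical helper and not the actual cost calculator; the $0.10/hour figure is the one quoted above.

```python
HOURLY_RATE_USD = 0.10
# $0.10 per hour spread over 3600 seconds.
OUTPUT_COST_PER_SECOND = HOURLY_RATE_USD / 3600


def transcription_cost(duration_seconds: float, cost_per_second: float = OUTPUT_COST_PER_SECOND) -> float:
    """Charge for a transcription from its stored audio duration."""
    return duration_seconds * cost_per_second
```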

* fix(langfuse): use a dedicated httpx client for the SDK injection

The httpx_client handed to the Langfuse SDK came from _get_httpx_client(),
which returns LiteLLM's globally cached HTTPHandler. If Langfuse closed that
client on teardown it would invalidate the shared client used by every other
LiteLLM HTTP call. Build a dedicated httpx.Client instead, still resolving SSL
verification and client certificate from LiteLLM's configuration.

* fix(soniox): prefer caller-supplied api_base over SONIOX_API_BASE env var

* fix(cohere): support max_completion_tokens on cohere v2 chat (default route) (#29779)

* fix(cohere): support max_completion_tokens on cohere v2 chat

The default cohere_chat route resolves to CohereV2ChatConfig, which did not
list or map max_completion_tokens, so get_optional_params raised
UnsupportedParamsError for the standard OpenAI parameter (the modern
replacement for the deprecated max_tokens). The v1 config already maps it to
cohere's max_tokens; mirror that in v2 and add v2 regression tests.

* fix(cohere): make max_completion_tokens take precedence over max_tokens on v2

When both max_tokens and max_completion_tokens are supplied, prefer
max_completion_tokens explicitly rather than relying on dict iteration order,
and cover both orderings with a regression test.
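
A minimal sketch of the explicit precedence, independent of dict order. `map_cohere_v2_max_tokens` is a hypothetical name and not the actual CohereV2ChatConfig method.

```python
from typing import Any, Dict


def map_cohere_v2_max_tokens(params: Dict[str, Any]) -> Dict[str, Any]:
    """Map OpenAI-style token limits onto cohere's max_tokens."""
    mapped: Dict[str, Any] = {}
    if params.get("max_tokens") is not None:
        mapped["max_tokens"] = params["max_tokens"]
    # Checked last on purpose so it always wins when both are supplied.
    if params.get("max_completion_tokens") is not None:
        mapped["max_tokens"] = params["max_completion_tokens"]
    return mapped
```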

---------

Co-authored-by: Daniel Yudelevich <4537920+yudelevi@users.noreply.github.com>
Co-authored-by: hectorc98 <hector.chamorroalvarez@adyen.com>
Co-authored-by: Filippo Menghi <113345637+Cyberfilo@users.noreply.github.com>
Co-authored-by: Terrajlz <info@jouleselectrictech.com>
Co-authored-by: Bruno Devaux <devaux.br@gmail.com>
Co-authored-by: Dan Lemon <dan@danlemon.com>
Co-authored-by: Saswat <saswatds@users.noreply.github.com>
Co-authored-by: Brian Sparker <brainsparker@users.noreply.github.com>
Co-authored-by: Zhao73 <156770117+Zhao73@users.noreply.github.com>
Co-authored-by: Urain Ahmad Shah <60431964+urainshah@users.noreply.github.com>
Co-authored-by: shin-berri <shin-laptop@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>
Co-authored-by: kape <168134658+kapelame@users.noreply.github.com>
Co-authored-by: danisalvaa <159898202+danisalvaa@users.noreply.github.com>
Co-authored-by: Just R <remixingmagelang@gmail.com>
Co-authored-by: mateo-berri <277851410+mateo-berri@users.noreply.github.com>
Co-authored-by: abhay23-AI <abhaytrivedi22@gmail.com>
2026-06-05 13:51:51 -07:00

1382 lines
53 KiB
Plaintext

datasource client {
provider = "postgresql"
url = env("DATABASE_URL")
}
generator client {
provider = "prisma-client-py"
binaryTargets = ["native", "debian-openssl-1.1.x", "debian-openssl-3.0.x", "linux-musl", "linux-musl-openssl-3.0.x"]
}
// Budget / Rate Limits for an org
model LiteLLM_BudgetTable {
budget_id String @id @default(uuid())
max_budget Float?
soft_budget Float?
max_parallel_requests Int?
tpm_limit BigInt?
rpm_limit BigInt?
model_max_budget Json?
budget_duration String?
budget_reset_at DateTime?
allowed_models String[] @default([]) // per-member model scope; empty = inherit team models
created_at DateTime @default(now()) @map("created_at")
created_by String
updated_at DateTime @default(now()) @updatedAt @map("updated_at")
updated_by String
organization LiteLLM_OrganizationTable[] // multiple orgs can have the same budget
projects LiteLLM_ProjectTable[] // multiple projects can have the same budget
keys LiteLLM_VerificationToken[] // multiple keys can have the same budget
end_users LiteLLM_EndUserTable[] // multiple end-users can have the same budget
tags LiteLLM_TagTable[] // multiple tags can have the same budget
team_membership LiteLLM_TeamMembership[] // budgets of Users within a Team
organization_membership LiteLLM_OrganizationMembership[] // budgets of Users within an Organization
}
// Credentials on proxy
model LiteLLM_CredentialsTable {
credential_id String @id @default(uuid())
credential_name String @unique
credential_values Json
credential_info Json?
created_at DateTime @default(now()) @map("created_at")
created_by String
updated_at DateTime @default(now()) @updatedAt @map("updated_at")
updated_by String
}
// Models on proxy
model LiteLLM_ProxyModelTable {
model_id String @id @default(uuid())
model_name String
litellm_params Json
model_info Json?
blocked Boolean @default(false)
created_at DateTime @default(now()) @map("created_at")
created_by String
updated_at DateTime @default(now()) @updatedAt @map("updated_at")
updated_by String
}
// Agents on proxy
model LiteLLM_AgentsTable {
agent_id String @id @default(uuid())
agent_name String @unique
litellm_params Json?
agent_card_params Json
static_headers Json? @default("{}")
extra_headers String[] @default([])
agent_access_groups String[] @default([])
object_permission_id String?
object_permission LiteLLM_ObjectPermissionTable? @relation(fields: [object_permission_id], references: [object_permission_id])
spend Float @default(0.0)
tpm_limit Int?
rpm_limit Int?
session_tpm_limit Int?
session_rpm_limit Int?
created_at DateTime @default(now()) @map("created_at")
created_by String
updated_at DateTime @default(now()) @updatedAt @map("updated_at")
updated_by String
}
model LiteLLM_OrganizationTable {
organization_id String @id @default(uuid())
organization_alias String
budget_id String
metadata Json @default("{}")
models String[]
spend Float @default(0.0)
model_spend Json @default("{}")
object_permission_id String?
created_at DateTime @default(now()) @map("created_at")
created_by String
updated_at DateTime @default(now()) @updatedAt @map("updated_at")
updated_by String
litellm_budget_table LiteLLM_BudgetTable? @relation(fields: [budget_id], references: [budget_id])
teams LiteLLM_TeamTable[]
users LiteLLM_UserTable[]
keys LiteLLM_VerificationToken[]
members LiteLLM_OrganizationMembership[] @relation("OrganizationToMembership")
object_permission LiteLLM_ObjectPermissionTable? @relation(fields: [object_permission_id], references: [object_permission_id])
}
// Model info for teams, just has model aliases for now.
model LiteLLM_ModelTable {
id Int @id @default(autoincrement())
model_aliases Json? @map("aliases")
created_at DateTime @default(now()) @map("created_at")
created_by String
updated_at DateTime @default(now()) @updatedAt @map("updated_at")
updated_by String
team LiteLLM_TeamTable?
}
// Assign prod keys to groups, not individuals
model LiteLLM_TeamTable {
team_id String @id @default(uuid())
team_alias String?
organization_id String?
object_permission_id String?
admins String[]
members String[]
members_with_roles Json @default("{}")
metadata Json @default("{}")
max_budget Float?
soft_budget Float?
spend Float @default(0.0)
models String[]
max_parallel_requests Int?
tpm_limit BigInt?
rpm_limit BigInt?
budget_duration String?
budget_reset_at DateTime?
blocked Boolean @default(false)
created_at DateTime @default(now()) @map("created_at")
updated_at DateTime @default(now()) @updatedAt @map("updated_at")
model_spend Json @default("{}")
model_max_budget Json @default("{}")
router_settings Json? @default("{}")
team_member_permissions String[] @default([])
access_group_ids String[] @default([])
policies String[] @default([])
default_team_member_models String[] @default([]) // default allowed_models for newly added team members; empty = no per-member restriction
budget_limits Json? // per-model budget limits for the team
model_id Int? @unique // id for LiteLLM_ModelTable -> stores team-level model aliases
allow_team_guardrail_config Boolean @default(false) // if true, team admin can configure guardrails for this team
litellm_organization_table LiteLLM_OrganizationTable? @relation(fields: [organization_id], references: [organization_id])
litellm_model_table LiteLLM_ModelTable? @relation(fields: [model_id], references: [id])
object_permission LiteLLM_ObjectPermissionTable? @relation(fields: [object_permission_id], references: [object_permission_id])
projects LiteLLM_ProjectTable[]
@@index([organization_id])
@@index([team_alias])
@@index([created_at])
}
// Projects sit between teams and keys for use-case management
model LiteLLM_ProjectTable {
project_id String @id @default(uuid())
project_alias String?
description String?
team_id String?
budget_id String?
metadata Json @default("{}")
models String[]
spend Float @default(0.0)
model_spend Json @default("{}")
model_rpm_limit Json @default("{}")
model_tpm_limit Json @default("{}")
blocked Boolean @default(false)
object_permission_id String?
created_at DateTime @default(now()) @map("created_at")
created_by String
updated_at DateTime @default(now()) @updatedAt @map("updated_at")
updated_by String
// Relations
litellm_team_table LiteLLM_TeamTable? @relation(fields: [team_id], references: [team_id])
litellm_budget_table LiteLLM_BudgetTable? @relation(fields: [budget_id], references: [budget_id])
keys LiteLLM_VerificationToken[]
object_permission LiteLLM_ObjectPermissionTable? @relation(fields: [object_permission_id], references: [object_permission_id])
}
// Audit table for deleted teams - preserves spend and team information for historical tracking
model LiteLLM_DeletedTeamTable {
id String @id @default(uuid())
team_id String // Original team_id
team_alias String?
organization_id String?
object_permission_id String?
admins String[]
members String[]
members_with_roles Json @default("{}")
metadata Json @default("{}")
max_budget Float?
soft_budget Float?
spend Float @default(0.0)
models String[]
max_parallel_requests Int?
tpm_limit BigInt?
rpm_limit BigInt?
budget_duration String?
budget_reset_at DateTime?
blocked Boolean @default(false)
model_spend Json @default("{}")
model_max_budget Json @default("{}")
router_settings Json? @default("{}")
team_member_permissions String[] @default([])
access_group_ids String[] @default([])
policies String[] @default([])
model_id Int? // id for LiteLLM_ModelTable -> stores team-level model aliases
allow_team_guardrail_config Boolean @default(false)
// Original timestamps from team creation/updates
created_at DateTime? @map("created_at")
updated_at DateTime? @map("updated_at")
// Deletion metadata
deleted_at DateTime @default(now()) @map("deleted_at")
deleted_by String? @map("deleted_by") // User who deleted the team
deleted_by_api_key String? @map("deleted_by_api_key") // API key hash that performed the deletion
litellm_changed_by String? @map("litellm_changed_by") // From litellm-changed-by header if provided
@@index([team_id])
@@index([deleted_at])
@@index([organization_id])
@@index([team_alias])
@@index([created_at])
}
// Track spend, rate limits, and budgets for Users
model LiteLLM_UserTable {
user_id String @id
user_alias String?
team_id String?
sso_user_id String? @unique
organization_id String?
object_permission_id String?
password String?
teams String[] @default([])
user_role String?
max_budget Float?
spend Float @default(0.0)
user_email String?
models String[]
metadata Json @default("{}")
max_parallel_requests Int?
tpm_limit BigInt?
rpm_limit BigInt?
budget_duration String?
budget_reset_at DateTime?
allowed_cache_controls String[] @default([])
policies String[] @default([])
model_spend Json @default("{}")
model_max_budget Json @default("{}")
created_at DateTime? @default(now()) @map("created_at")
updated_at DateTime? @default(now()) @updatedAt @map("updated_at")
// relations
litellm_organization_table LiteLLM_OrganizationTable? @relation(fields: [organization_id], references: [organization_id])
organization_memberships LiteLLM_OrganizationMembership[]
invitations_created LiteLLM_InvitationLink[] @relation("CreatedBy")
invitations_updated LiteLLM_InvitationLink[] @relation("UpdatedBy")
invitations_user LiteLLM_InvitationLink[] @relation("UserId")
object_permission LiteLLM_ObjectPermissionTable? @relation(fields: [object_permission_id], references: [object_permission_id])
}
model LiteLLM_ObjectPermissionTable {
object_permission_id String @id @default(uuid())
mcp_servers String[] @default([])
mcp_access_groups String[] @default([])
mcp_tool_permissions Json? // Tool-level permissions for MCP servers. Format: {"server_id": ["tool_name_1", "tool_name_2"]}
vector_stores String[] @default([])
agents String[] @default([])
agent_access_groups String[] @default([])
models String[] @default([])
blocked_tools String[] @default([]) // Tool names blocked for any key/team/user with this permission
mcp_toolsets String[] @default([]) // Toolset IDs granted to this key/team/user
search_tools String[] @default([]) // search_tool_name values this key/team/user may call
teams LiteLLM_TeamTable[]
projects LiteLLM_ProjectTable[]
verification_tokens LiteLLM_VerificationToken[]
organizations LiteLLM_OrganizationTable[]
users LiteLLM_UserTable[]
end_users LiteLLM_EndUserTable[]
agents_table LiteLLM_AgentsTable[]
}
// Holds the MCP server configuration
model LiteLLM_MCPServerTable {
server_id String @id @default(uuid())
server_name String?
alias String?
description String?
instructions String?
url String?
spec_path String?
transport String @default("sse")
auth_type String?
credentials Json? @default("{}")
created_at DateTime? @default(now()) @map("created_at")
created_by String?
updated_at DateTime? @default(now()) @updatedAt @map("updated_at")
updated_by String?
mcp_info Json? @default("{}")
mcp_access_groups String[]
allowed_tools String[] @default([])
tool_name_to_display_name Json? @default("{}")
tool_name_to_description Json? @default("{}")
extra_headers String[] @default([])
static_headers Json? @default("{}")
// Health check status
status String? @default("unknown")
last_health_check DateTime?
health_check_error String?
// Stdio-specific fields
command String?
args String[] @default([])
env Json? @default("{}")
authorization_url String?
token_url String?
registration_url String?
oauth2_flow String?
allow_all_keys Boolean @default(false)
available_on_public_internet Boolean @default(true)
delegate_auth_to_upstream Boolean @default(false)
oauth_passthrough Boolean @default(false)
is_byok Boolean @default(false)
byok_description String[] @default([])
byok_api_key_help_url String?
source_url String?
timeout Float?
// BYOM submission lifecycle
approval_status String? @default("active")
submitted_by String?
submitted_at DateTime?
reviewed_at DateTime?
review_notes String?
@@index([approval_status])
}
// Named collection of {server_id, tool_name} pairs that can be granted to keys/teams
model LiteLLM_MCPToolsetTable {
toolset_id String @id @default(uuid())
toolset_name String @unique
description String?
tools Json @default("[]") // [{server_id: string, tool_name: string}]
created_at DateTime @default(now())
created_by String?
updated_at DateTime @default(now()) @updatedAt
updated_by String?
}
// Per-user BYOK credentials for MCP servers
model LiteLLM_MCPUserCredentials {
id String @id @default(uuid())
user_id String
server_id String
credential_b64 String
created_at DateTime @default(now()) @map("created_at")
updated_at DateTime @default(now()) @updatedAt @map("updated_at")
@@unique([user_id, server_id])
}
// Virtual keys (tokens) issued by the proxy
model LiteLLM_VerificationToken {
token String @id
key_name String?
key_alias String?
soft_budget_cooldown Boolean @default(false) // key-level state: whether budget alerts should be cooled down
spend Float @default(0.0)
expires DateTime?
models String[]
aliases Json @default("{}")
config Json @default("{}")
router_settings Json? @default("{}")
user_id String?
team_id String?
agent_id String?
project_id String?
permissions Json @default("{}")
max_parallel_requests Int?
metadata Json @default("{}")
blocked Boolean?
tpm_limit BigInt?
rpm_limit BigInt?
max_budget Float?
budget_duration String?
budget_reset_at DateTime?
allowed_cache_controls String[] @default([])
allowed_routes String[] @default([])
policies String[] @default([])
access_group_ids String[] @default([])
model_spend Json @default("{}")
model_max_budget Json @default("{}")
budget_id String?
organization_id String?
object_permission_id String?
created_at DateTime? @default(now()) @map("created_at")
created_by String?
updated_at DateTime? @default(now()) @updatedAt @map("updated_at")
updated_by String?
last_active DateTime? // When this key was last used
rotation_count Int? @default(0) // Number of times key has been rotated
auto_rotate Boolean? @default(false) // Whether this key should be auto-rotated
rotation_interval String? // How often to rotate (e.g., "30d", "90d")
last_rotation_at DateTime? // When this key was last rotated
key_rotation_at DateTime? // When this key should next be rotated
budget_limits Json? // per-model budget limits for the key
litellm_budget_table LiteLLM_BudgetTable? @relation(fields: [budget_id], references: [budget_id])
litellm_organization_table LiteLLM_OrganizationTable? @relation(fields: [organization_id], references: [organization_id])
litellm_project_table LiteLLM_ProjectTable? @relation(fields: [project_id], references: [project_id])
object_permission LiteLLM_ObjectPermissionTable? @relation(fields: [object_permission_id], references: [object_permission_id])
jwt_key_mappings LiteLLM_JWTKeyMapping[]
// SELECT COUNT(*) FROM (SELECT "public"."LiteLLM_VerificationToken"."token" FROM "public"."LiteLLM_VerificationToken" WHERE ("public"."LiteLLM_VerificationToken"."user_id" = $1 AND ("public"."LiteLLM_VerificationToken"."team_id" IS NULL OR "public"."LiteLLM_VerificationToken"."team_id" <> $2)) OFFSET $3 ) AS "sub"
// SELECT ... FROM "public"."LiteLLM_VerificationToken" WHERE "public"."LiteLLM_VerificationToken"."user_id" = $1 OFFSET $2
@@index([user_id, team_id])
// SELECT ... FROM "public"."LiteLLM_VerificationToken" WHERE "public"."LiteLLM_VerificationToken"."team_id" = $1 OFFSET $2
@@index([team_id])
// SELECT ... FROM "public"."LiteLLM_VerificationToken" WHERE (("public"."LiteLLM_VerificationToken"."expires" IS NULL OR "public"."LiteLLM_VerificationToken"."expires" > $1) AND "public"."LiteLLM_VerificationToken"."budget_reset_at" < $2) OFFSET $3
@@index([budget_reset_at, expires])
}
model LiteLLM_JWTKeyMapping {
id String @id @default(uuid())
jwt_claim_name String // e.g. "sub", "email"
jwt_claim_value String // The claim value to match
token String // Hashed virtual key (FK)
description String?
is_active Boolean @default(true)
created_at DateTime @default(now())
created_by String?
updated_at DateTime @default(now()) @updatedAt
updated_by String?
litellm_verification_token LiteLLM_VerificationToken @relation(fields: [token], references: [token])
@@unique([jwt_claim_name, jwt_claim_value])
@@index([jwt_claim_name, jwt_claim_value, is_active])
}
// Deprecated keys during grace period - allows old key to work until revoke_at
model LiteLLM_DeprecatedVerificationToken {
id String @id @default(uuid())
token String // Hashed old key
active_token_id String // Current token hash in LiteLLM_VerificationToken
revoke_at DateTime // When the old key stops working
created_at DateTime @default(now()) @map("created_at")
@@unique([token])
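// Sketch of the grace-period lookup the composite index below is meant to serve (assumed query shape, not copied from the codebase):
// SELECT "active_token_id" FROM "LiteLLM_DeprecatedVerificationToken" WHERE "token" = $1 AND "revoke_at" > $2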
@@index([token, revoke_at])
@@index([revoke_at])
}
// Audit table for deleted keys - preserves spend and key information for historical tracking
model LiteLLM_DeletedVerificationToken {
id String @id @default(uuid())
token String // Original token (hashed)
key_name String?
key_alias String?
soft_budget_cooldown Boolean @default(false)
spend Float @default(0.0)
expires DateTime?
models String[]
aliases Json @default("{}")
config Json @default("{}")
user_id String?
team_id String?
agent_id String?
project_id String?
permissions Json @default("{}")
max_parallel_requests Int?
metadata Json @default("{}")
blocked Boolean?
tpm_limit BigInt?
rpm_limit BigInt?
max_budget Float?
budget_duration String?
budget_reset_at DateTime?
allowed_cache_controls String[] @default([])
allowed_routes String[] @default([])
policies String[] @default([])
access_group_ids String[] @default([])
model_spend Json @default("{}")
model_max_budget Json @default("{}")
router_settings Json? @default("{}")
budget_id String?
organization_id String?
object_permission_id String?
created_at DateTime? // Original creation timestamp
created_by String? // Original creator
updated_at DateTime? // Last update timestamp before deletion
updated_by String? // Last user who updated before deletion
last_active DateTime? // When this key was last used before deletion
rotation_count Int? @default(0)
auto_rotate Boolean? @default(false)
rotation_interval String?
last_rotation_at DateTime?
key_rotation_at DateTime?
// Deletion metadata
deleted_at DateTime @default(now()) @map("deleted_at")
deleted_by String? @map("deleted_by") // User who deleted the key
deleted_by_api_key String? @map("deleted_by_api_key") // API key hash that performed the deletion
litellm_changed_by String? @map("litellm_changed_by") // From litellm-changed-by header if provided
@@index([token])
@@index([deleted_at])
@@index([user_id])
@@index([team_id])
@@index([organization_id])
@@index([key_alias])
@@index([created_at])
}
model LiteLLM_EndUserTable {
user_id String @id
alias String? // admin-facing alias
spend Float @default(0.0)
allowed_model_region String? // require all user requests to use models in this specific region
default_model String? // use along with 'allowed_model_region'. if no available model in region, default to this model.
budget_id String?
object_permission_id String?
litellm_budget_table LiteLLM_BudgetTable? @relation(fields: [budget_id], references: [budget_id])
object_permission LiteLLM_ObjectPermissionTable? @relation(fields: [object_permission_id], references: [object_permission_id])
blocked Boolean @default(false)
}
// Track tags with budgets and spend
model LiteLLM_TagTable {
tag_name String @id
description String?
models String[]
model_info Json? // maps model_id to model_name
spend Float @default(0.0)
budget_id String?
litellm_budget_table LiteLLM_BudgetTable? @relation(fields: [budget_id], references: [budget_id])
created_at DateTime @default(now()) @map("created_at")
created_by String?
updated_at DateTime @default(now()) @updatedAt @map("updated_at")
}
// store proxy config.yaml
model LiteLLM_Config {
param_name String @id
param_value Json?
}
// View spend, model, api_key per request
model LiteLLM_SpendLogs {
request_id String @id
call_type String
api_key String @default("") // Hashed API token, not the actual virtual key. Equivalent to the 'token' column in LiteLLM_VerificationToken
spend Float @default(0.0)
total_tokens Int @default(0)
prompt_tokens Int @default(0)
completion_tokens Int @default(0)
startTime DateTime // Request start time
endTime DateTime // Request end time
request_duration_ms Int?
completionStartTime DateTime? // When the completion started, if recorded
model String @default("")
model_id String? @default("") // the model id stored in proxy model db
model_group String? @default("") // public model_name / model_group
custom_llm_provider String? @default("") // custom_llm_provider used by litellm
api_base String? @default("")
user String? @default("")
metadata Json? @default("{}") // project_id stored here
cache_hit String? @default("")
cache_key String? @default("")
request_tags Json? @default("[]")
team_id String?
organization_id String?
end_user String?
requester_ip_address String?
messages Json? @default("{}")
response Json? @default("{}")
session_id String?
status String?
mcp_namespaced_tool_name String?
agent_id String?
proxy_server_request Json? @default("{}")
@@index([startTime])
@@index([startTime, request_id])
@@index([end_user])
@@index([session_id])
}
// View error details (exception type, message, status code) per failed request
model LiteLLM_ErrorLogs {
request_id String @id @default(uuid())
startTime DateTime // Request start time
endTime DateTime // Request end time
api_base String @default("")
model_group String @default("") // public model_name / model_group
litellm_model_name String @default("") // model passed to litellm
model_id String @default("") // ID of model in ProxyModelTable
request_kwargs Json @default("{}")
exception_type String @default("")
exception_string String @default("")
status_code String @default("")
}
// Beta - allow team members to request access to a model
model LiteLLM_UserNotifications {
request_id String @id
user_id String
models String[]
justification String
status String // approved, disapproved, pending
}
model LiteLLM_TeamMembership {
// Use this table to track the Internal User's Spend within a Team + Set Budgets, rpm limits for the user within the team
user_id String
team_id String
spend Float @default(0.0)
total_spend Float @default(0.0)
budget_id String?
litellm_budget_table LiteLLM_BudgetTable? @relation(fields: [budget_id], references: [budget_id])
@@id([user_id, team_id])
}
model LiteLLM_OrganizationMembership {
// Use this table to track Internal User and Organization membership. Helps track a user's role within an Organization
user_id String
organization_id String
user_role String?
spend Float? @default(0.0)
budget_id String?
created_at DateTime? @default(now()) @map("created_at")
updated_at DateTime? @default(now()) @updatedAt @map("updated_at")
// relations
user LiteLLM_UserTable @relation(fields: [user_id], references: [user_id])
organization LiteLLM_OrganizationTable @relation("OrganizationToMembership", fields: [organization_id], references: [organization_id])
litellm_budget_table LiteLLM_BudgetTable? @relation(fields: [budget_id], references: [budget_id])
@@id([user_id, organization_id])
@@unique([user_id, organization_id])
}
model LiteLLM_InvitationLink {
// use this table to track invite links sent by admin for people to join the proxy
id String @id @default(uuid())
user_id String
is_accepted Boolean @default(false)
accepted_at DateTime? // when link is claimed (user successfully onboards via link)
expires_at DateTime // when the link expires
created_at DateTime // when the admin created the link
created_by String // who created the link
updated_at DateTime // when the invite status was last updated
updated_by String // who updated the status (admin/user who accepted invite)
// Relations
liteLLM_user_table_user LiteLLM_UserTable @relation("UserId", fields: [user_id], references: [user_id])
liteLLM_user_table_created LiteLLM_UserTable @relation("CreatedBy", fields: [created_by], references: [user_id])
liteLLM_user_table_updated LiteLLM_UserTable @relation("UpdatedBy", fields: [updated_by], references: [user_id])
}
model LiteLLM_AuditLog {
id String @id @default(uuid())
updated_at DateTime @default(now())
changed_by String @default("") // user or system that performed the action
changed_by_api_key String @default("") // api key hash that performed the action
action String // create, update, delete
table_name String // one of LitellmTableNames.TEAM_TABLE_NAME, LitellmTableNames.USER_TABLE_NAME, LitellmTableNames.PROXY_MODEL_TABLE_NAME
object_id String // id of the object being audited. This can be the key id, team id, user id, model id
before_value Json? // value of the row before the change
updated_values Json? // value of the row after change
}
// Track daily user spend metrics per model and key
model LiteLLM_DailyUserSpend {
id String @id @default(uuid())
user_id String?
date String
api_key String
model String?
model_group String?
custom_llm_provider String?
mcp_namespaced_tool_name String?
endpoint String?
prompt_tokens BigInt @default(0)
completion_tokens BigInt @default(0)
cache_read_input_tokens BigInt @default(0)
cache_creation_input_tokens BigInt @default(0)
spend Float @default(0.0)
api_requests BigInt @default(0)
successful_requests BigInt @default(0)
failed_requests BigInt @default(0)
created_at DateTime @default(now())
updated_at DateTime @updatedAt
@@unique([user_id, date, api_key, model, custom_llm_provider, mcp_namespaced_tool_name, endpoint])
@@index([date])
@@index([user_id, date])
@@index([api_key])
@@index([model])
@@index([mcp_namespaced_tool_name])
@@index([endpoint])
}
// Track daily organization spend metrics per model and key
model LiteLLM_DailyOrganizationSpend {
id String @id @default(uuid())
organization_id String?
date String
api_key String
model String?
model_group String?
custom_llm_provider String?
mcp_namespaced_tool_name String?
endpoint String?
prompt_tokens BigInt @default(0)
completion_tokens BigInt @default(0)
cache_read_input_tokens BigInt @default(0)
cache_creation_input_tokens BigInt @default(0)
spend Float @default(0.0)
api_requests BigInt @default(0)
successful_requests BigInt @default(0)
failed_requests BigInt @default(0)
created_at DateTime @default(now())
updated_at DateTime @updatedAt
@@unique([organization_id, date, api_key, model, custom_llm_provider, mcp_namespaced_tool_name, endpoint])
@@index([date])
@@index([organization_id, date])
@@index([api_key])
@@index([model])
@@index([mcp_namespaced_tool_name])
@@index([endpoint])
}
// Track daily end user (customer) spend metrics per model and key
model LiteLLM_DailyEndUserSpend {
id String @id @default(uuid())
end_user_id String?
date String
api_key String
model String?
model_group String?
custom_llm_provider String?
mcp_namespaced_tool_name String?
endpoint String?
prompt_tokens BigInt @default(0)
completion_tokens BigInt @default(0)
cache_read_input_tokens BigInt @default(0)
cache_creation_input_tokens BigInt @default(0)
spend Float @default(0.0)
api_requests BigInt @default(0)
successful_requests BigInt @default(0)
failed_requests BigInt @default(0)
created_at DateTime @default(now())
updated_at DateTime @updatedAt
@@unique([end_user_id, date, api_key, model, custom_llm_provider, mcp_namespaced_tool_name, endpoint])
@@index([date])
@@index([end_user_id, date])
@@index([api_key])
@@index([model])
@@index([mcp_namespaced_tool_name])
@@index([endpoint])
}
// Track daily agent spend metrics per model and key
model LiteLLM_DailyAgentSpend {
id String @id @default(uuid())
agent_id String?
date String
api_key String
model String?
model_group String?
custom_llm_provider String?
mcp_namespaced_tool_name String?
endpoint String?
prompt_tokens BigInt @default(0)
completion_tokens BigInt @default(0)
cache_read_input_tokens BigInt @default(0)
cache_creation_input_tokens BigInt @default(0)
spend Float @default(0.0)
api_requests BigInt @default(0)
successful_requests BigInt @default(0)
failed_requests BigInt @default(0)
created_at DateTime @default(now())
updated_at DateTime @updatedAt
@@unique([agent_id, date, api_key, model, custom_llm_provider, mcp_namespaced_tool_name, endpoint])
@@index([date])
@@index([agent_id, date])
@@index([api_key])
@@index([model])
@@index([mcp_namespaced_tool_name])
@@index([endpoint])
}
// Track daily team spend metrics per model and key
model LiteLLM_DailyTeamSpend {
id String @id @default(uuid())
team_id String?
date String
api_key String
model String?
model_group String?
custom_llm_provider String?
mcp_namespaced_tool_name String?
endpoint String?
prompt_tokens BigInt @default(0)
completion_tokens BigInt @default(0)
cache_read_input_tokens BigInt @default(0)
cache_creation_input_tokens BigInt @default(0)
spend Float @default(0.0)
api_requests BigInt @default(0)
successful_requests BigInt @default(0)
failed_requests BigInt @default(0)
created_at DateTime @default(now())
updated_at DateTime @updatedAt
@@unique([team_id, date, api_key, model, custom_llm_provider, mcp_namespaced_tool_name, endpoint])
@@index([date])
@@index([team_id, date])
@@index([api_key])
@@index([model])
@@index([mcp_namespaced_tool_name])
@@index([endpoint])
}
// Track daily tag spend metrics per model and key
model LiteLLM_DailyTagSpend {
id String @id @default(uuid())
request_id String?
tag String?
date String
api_key String
model String?
model_group String?
custom_llm_provider String?
mcp_namespaced_tool_name String?
endpoint String?
prompt_tokens BigInt @default(0)
completion_tokens BigInt @default(0)
cache_read_input_tokens BigInt @default(0)
cache_creation_input_tokens BigInt @default(0)
spend Float @default(0.0)
api_requests BigInt @default(0)
successful_requests BigInt @default(0)
failed_requests BigInt @default(0)
created_at DateTime @default(now())
updated_at DateTime @updatedAt
@@unique([tag, date, api_key, model, custom_llm_provider, mcp_namespaced_tool_name, endpoint])
@@index([date])
@@index([tag, date])
@@index([api_key])
@@index([model])
@@index([mcp_namespaced_tool_name])
@@index([endpoint])
}
// Track the status of cron jobs running. Only allow one pod to run the job at a time
model LiteLLM_CronJob {
cronjob_id String @id @default(cuid()) // Unique ID for the record
pod_id String // Unique identifier for the pod acting as the leader
status JobStatus @default(INACTIVE) // Status of the cron job (active or inactive)
last_updated DateTime @default(now()) // Timestamp for the last update of the cron job record
ttl DateTime // Time when the leader's lease expires
}
enum JobStatus {
ACTIVE
INACTIVE
}
model LiteLLM_ManagedFileTable {
id String @id @default(uuid())
unified_file_id String @unique // The base64 encoded unified file ID
file_object Json? // Stores the OpenAIFileObject
model_mappings Json
flat_model_file_ids String[] @default([]) // Flat list of model file IDs, for faster querying of model id -> unified file id
storage_backend String? // Storage backend name (e.g., "azure_storage", "gcs", "default")
storage_url String? // The actual storage URL where the file is stored
created_at DateTime @default(now())
created_by String?
team_id String? // Team that owns the resource; populated for service-account keys without a user_id so listings can isolate by team.
updated_at DateTime @updatedAt
updated_by String?
@@index([unified_file_id])
@@index([team_id, created_at(sort: Desc)])
}
model LiteLLM_ManagedObjectTable { // for batches or fine-tuning jobs
id String @id @default(uuid())
unified_object_id String @unique // The base64 encoded unified object ID
model_object_id String @unique // the id returned by the backend API provider
file_object Json // Stores the OpenAIFileObject
file_purpose String // either 'batch' or 'fine-tune'
status String? // check if batch cost has been tracked
batch_processed Boolean @default(false) // set to true by CheckBatchCost after cost is computed
created_at DateTime @default(now())
created_by String?
team_id String?
updated_at DateTime @updatedAt
updated_by String?
@@index([unified_object_id])
@@index([model_object_id])
@@index([team_id, created_at(sort: Desc)])
}
model LiteLLM_ManagedVectorStoreTable {
id String @id @default(uuid())
unified_resource_id String @unique // The base64 encoded unified vector store ID
resource_object Json? // Stores the VectorStoreCreateResponse
model_mappings Json // Maps model_id -> provider_vector_store_id
flat_model_resource_ids String[] @default([]) // Flat list of provider vector store IDs for faster querying
storage_backend String? // Storage backend name (if applicable)
storage_url String? // Storage URL (if applicable)
created_at DateTime @default(now())
created_by String?
team_id String?
updated_at DateTime @updatedAt
updated_by String?
@@index([unified_resource_id])
@@index([team_id, created_at(sort: Desc)])
}
model LiteLLM_ManagedVectorStoresTable {
vector_store_id String @id
custom_llm_provider String
vector_store_name String?
vector_store_description String?
vector_store_metadata Json?
created_at DateTime @default(now())
updated_at DateTime @updatedAt
litellm_credential_name String?
litellm_params Json?
team_id String?
user_id String?
@@index([team_id])
@@index([user_id])
}
// Guardrails table for storing guardrail configurations
model LiteLLM_GuardrailsTable {
guardrail_id String @id @default(uuid())
guardrail_name String @unique
litellm_params Json
guardrail_info Json?
team_id String?
created_at DateTime @default(now())
updated_at DateTime @updatedAt
// Submission lifecycle. Possible values: pending_review (team-registered, awaiting approval), active (approved), rejected
status String @default("active")
submitted_at DateTime?
reviewed_at DateTime?
// submitted_by_user_id and submitted_by_email live in guardrail_info JSON
@@index([status])
}
// Daily guardrail metrics for usage dashboard (one row per guardrail per day)
model LiteLLM_DailyGuardrailMetrics {
guardrail_id String // logical id; not a foreign key, since the guardrail may be defined in config
date String // YYYY-MM-DD
requests_evaluated BigInt @default(0)
passed_count BigInt @default(0)
blocked_count BigInt @default(0)
flagged_count BigInt @default(0)
avg_score Float?
avg_latency_ms Float?
created_at DateTime @default(now())
updated_at DateTime @updatedAt
@@id([guardrail_id, date])
@@index([date])
@@index([guardrail_id])
}
// Daily policy metrics for usage dashboard (one row per policy per day)
model LiteLLM_DailyPolicyMetrics {
policy_id String
date String // YYYY-MM-DD
requests_evaluated BigInt @default(0)
passed_count BigInt @default(0)
blocked_count BigInt @default(0)
flagged_count BigInt @default(0)
avg_score Float?
avg_latency_ms Float?
created_at DateTime @default(now())
updated_at DateTime @updatedAt
@@id([policy_id, date])
@@index([date])
@@index([policy_id])
}
// Index for fast "last N logs for guardrail/policy" from SpendLogs
model LiteLLM_SpendLogGuardrailIndex {
request_id String
guardrail_id String
policy_id String? // set when run as part of a policy pipeline
start_time DateTime
@@id([request_id, guardrail_id])
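// Sketch of the "last N logs for guardrail" lookup the indexes below are meant to serve (assumed query shape, not copied from the codebase):
// SELECT "request_id" FROM "LiteLLM_SpendLogGuardrailIndex" WHERE "guardrail_id" = $1 ORDER BY "start_time" DESC LIMIT $2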
@@index([guardrail_id, start_time])
@@index([policy_id, start_time])
}
// Index for fast "last N logs for tool" from SpendLogs; shows how a tool is called in production
model LiteLLM_SpendLogToolIndex {
request_id String
tool_name String // matches LiteLLM_ToolTable.tool_name; join for input_policy/output_policy etc.
start_time DateTime
@@id([request_id, tool_name])
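// Sketch of the "last N logs for tool" lookup the index below is meant to serve (assumed query shape, not copied from the codebase):
// SELECT "request_id" FROM "LiteLLM_SpendLogToolIndex" WHERE "tool_name" = $1 ORDER BY "start_time" DESC LIMIT $2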
@@index([tool_name, start_time])
}
// Prompt table for storing prompt configurations
model LiteLLM_PromptTable {
id String @id @default(uuid())
prompt_id String
version Int @default(1)
environment String @default("development")
created_by String?
litellm_params Json
prompt_info Json?
created_at DateTime @default(now())
updated_at DateTime @updatedAt
@@unique([prompt_id, version, environment])
@@index([prompt_id, environment])
@@index([prompt_id])
}
model LiteLLM_HealthCheckTable {
health_check_id String @id @default(uuid())
model_name String
model_id String?
status String
healthy_count Int @default(0)
unhealthy_count Int @default(0)
error_message String?
response_time_ms Float?
details Json?
checked_by String?
checked_at DateTime @default(now())
created_at DateTime @default(now())
updated_at DateTime @updatedAt
@@index([model_name])
@@index([checked_at])
@@index([status])
@@index([model_id, model_name, checked_at(sort: Desc)], map: "LiteLLM_HealthCheckTable_model_id_model_name_checked_at_idx")
}
// Search Tools table for storing search tool configurations
model LiteLLM_SearchToolsTable {
search_tool_id String @id @default(uuid())
search_tool_name String @unique
litellm_params Json
search_tool_info Json?
created_at DateTime @default(now())
updated_at DateTime @updatedAt
}
// SSO configuration table
model LiteLLM_SSOConfig {
id String @id @default("sso_config")
sso_settings Json
created_at DateTime @default(now())
updated_at DateTime @updatedAt
}
model LiteLLM_ManagedVectorStoreIndexTable {
id String @id @default(uuid())
index_name String @unique
litellm_params Json
index_info Json?
created_at DateTime @default(now())
created_by String?
updated_at DateTime @updatedAt
updated_by String?
}
// Cache configuration table
model LiteLLM_CacheConfig {
id String @id @default("cache_config")
cache_settings Json
created_at DateTime @default(now())
updated_at DateTime @updatedAt
}
// UI Settings configuration table
model LiteLLM_UISettings {
id String @id @default("ui_settings")
ui_settings Json
created_at DateTime @default(now())
updated_at DateTime @updatedAt
}
// Generic config overrides table - one row per config_type
model LiteLLM_ConfigOverrides {
config_type String @id
config_value Json
created_at DateTime @default(now())
updated_at DateTime @updatedAt
}
// Skills table for storing LiteLLM-managed skills
model LiteLLM_SkillsTable {
skill_id String @id @default(uuid())
display_title String?
description String?
instructions String? // The skill instructions/prompt (from SKILL.md)
source String @default("custom") // "custom" or "anthropic"
latest_version String?
file_content Bytes? // Binary content of the skill files (zip)
file_name String? // Original filename
file_type String? // MIME type (e.g., "application/zip")
metadata Json? @default("{}")
created_at DateTime @default(now())
created_by String?
updated_at DateTime @default(now()) @updatedAt
updated_by String?
}
// Policy table for storing guardrail policies (versioned)
model LiteLLM_PolicyTable {
policy_id String @id @default(uuid())
policy_name String // No longer @unique; use @@unique([policy_name, version_number])
version_number Int @default(1)
version_status String @default("production") // "draft" | "published" | "production"
parent_version_id String?
is_latest Boolean @default(true)
published_at DateTime?
production_at DateTime?
inherit String? // Name of parent policy to inherit from
description String?
guardrails_add String[] @default([])
guardrails_remove String[] @default([])
condition Json? @default("{}") // Policy conditions (e.g., model matching)
pipeline Json? // Optional guardrail pipeline (mode + steps[])
created_at DateTime @default(now())
created_by String?
updated_at DateTime @default(now()) @updatedAt
updated_by String?
@@unique([policy_name, version_number])
@@index([policy_name, version_status])
}
// Policy attachment table for defining where policies apply
model LiteLLM_PolicyAttachmentTable {
attachment_id String @id @default(uuid())
policy_name String // Name of the policy to attach
scope String? // Use '*' for global scope
teams String[] @default([]) // Team aliases or patterns
keys String[] @default([]) // Key aliases or patterns
models String[] @default([]) // Model names or patterns
tags String[] @default([]) // Tag patterns (e.g., ["healthcare", "prod-*"])
created_at DateTime @default(now())
created_by String?
updated_at DateTime @default(now()) @updatedAt
updated_by String?
}
// Global tool registry - auto-discovered from LLM responses; admins set input/output policies here
model LiteLLM_ToolTable {
tool_id String @id @default(uuid())
tool_name String @unique // e.g. "huggingface_remote-mcp__dynamic_space"
origin String? // MCP server name or "user_defined"
input_policy String @default("untrusted") // "trusted" | "untrusted" | "blocked"
output_policy String @default("untrusted") // "trusted" | "untrusted"
call_count Int @default(0) // cumulative number of times this tool was seen
assignments Json? @default("{}")
key_hash String? // hash of the virtual key that first called this tool
team_id String? // team that first called this tool
key_alias String? // human-readable alias of the virtual key
user_agent String? // user-agent of the first request that discovered this tool
last_used_at DateTime? // timestamp of the most recent call
created_at DateTime @default(now())
created_by String?
updated_at DateTime @default(now()) @updatedAt
updated_by String?
@@index([input_policy])
@@index([output_policy])
@@index([team_id])
}
// Per-(tool, team/key) policy overrides. When present, override replaces global tool policy for that scope.
// Unified access groups table
model LiteLLM_AccessGroupTable {
access_group_id String @id @default(uuid())
access_group_name String @unique
description String?
// Resource memberships - explicit arrays per type
access_model_names String[] @default([])
access_mcp_server_ids String[] @default([])
access_agent_ids String[] @default([])
assigned_team_ids String[] @default([])
assigned_key_ids String[] @default([])
created_at DateTime @default(now())
created_by String?
updated_at DateTime @default(now()) @updatedAt
updated_by String?
}
// Claude Code Plugin Marketplace table
model LiteLLM_ClaudeCodePluginTable {
id String @id @default(uuid())
name String @unique
version String?
description String?
manifest_json String?
files_json String? @default("{}")
enabled Boolean @default(true)
created_at DateTime? @default(now())
updated_at DateTime? @default(now()) @updatedAt
created_by String?
@@map("LiteLLM_ClaudeCodePluginTable")
}
// User/team-scoped memory store with a GLOBAL unique key.
// `value` is a string (typically markdown/text meant for LLM context);
// `metadata` is an optional JSON envelope for structured tags without schema changes.
// Note: `key` is globally unique across all users/teams — callers should
// namespace their keys (e.g. `user:123:notes`) if they need per-user isolation.
// `user_id` / `team_id` stamp ownership for visibility filtering, but do NOT
// participate in the unique constraint.
model LiteLLM_MemoryTable {
memory_id String @id @default(uuid())
key String @unique
value String
metadata Json?
user_id String?
team_id String?
created_at DateTime @default(now())
created_by String?
updated_at DateTime @default(now()) @updatedAt
updated_by String?
@@index([user_id])
@@index([team_id])
}
// Per-(router, request_type, model) Beta posterior for the adaptive router.
model LiteLLM_AdaptiveRouterState {
router_name String
request_type String
model_name String
alpha Float
beta Float
total_samples Int @default(0)
last_updated_at DateTime @default(now()) @updatedAt
@@id([router_name, request_type, model_name])
}
// Per-(session, router, model) signal counters for the adaptive router.
model LiteLLM_AdaptiveRouterSession {
session_id String
router_name String
model_name String
classified_type String
misalignment_count Int @default(0)
stagnation_count Int @default(0)
disengagement_count Int @default(0)
satisfaction_count Int @default(0)
failure_count Int @default(0)
loop_count Int @default(0)
exhaustion_count Int @default(0)
last_user_content String?
last_assistant_content String?
tool_call_history Json @default("[]")
pending_tool_calls Json @default("{}")
turn_count Int @default(0)
last_processed_turn Int @default(-1)
clean_credit_awarded Boolean @default(false)
terminal_status Int?
last_activity_at DateTime @default(now()) @updatedAt
@@id([session_id, router_name, model_name])
@@index([last_activity_at], map: "idx_adaptive_router_session_activity")
}
// ---------------------------------------------------------------------------
// Workflow Run Tracking
//
// Generic durable state tracking for any agent or automated workflow.
// Design: three tables — run (header + materialized status), event (append-only
// source of truth for state transitions), message (conversation inbox/outbox).
//
// Usage:
// - Set `workflow_type` to identify the owning system (e.g. "shin-builder").
// - Store domain-specific fields in `metadata` (worktree_path, pr_url, etc.).
// - `session_id` on WorkflowRun matches the `x-litellm-session-id` header sent to
//   the proxy, so all spend logs for this run are tagged automatically.
// ---------------------------------------------------------------------------
// One instance of work being done. `status` is a materialized cache of the
// latest event; the event log is the authoritative source of truth.
model LiteLLM_WorkflowRun {
run_id String @id @default(uuid())
session_id String @unique @default(uuid())
workflow_type String
status String @default("pending")
created_by String? // user_id of the key that created this run; null = created by master key
created_at DateTime @default(now())
updated_at DateTime @updatedAt
input Json?
output Json?
metadata Json?
events LiteLLM_WorkflowEvent[]
messages LiteLLM_WorkflowMessage[]
@@index([workflow_type, status])
@@index([session_id])
@@index([created_at])
@@index([created_by])
}
// Append-only log of state transitions. Never mutate rows here.
// `step_name` and `event_type` are caller-defined strings — no hardcoded enums.
// Status auto-update rules (applied by the append endpoint):
// step.started → run.status = running
// step.failed → run.status = failed
// hook.waiting → run.status = paused
// hook.received → run.status = running
model LiteLLM_WorkflowEvent {
event_id String @id @default(uuid())
run_id String
event_type String
step_name String
sequence_number Int
data Json?
created_at DateTime @default(now())
run LiteLLM_WorkflowRun @relation(fields: [run_id], references: [run_id])
@@unique([run_id, sequence_number])
@@index([run_id])
}
// Conversation inbox/outbox — full message content, separate from the durable
// event log. Spend logs truncate messages; this table stores them in full.
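// `session_id` here is the agent's own resume ID (e.g. the Claude `--resume`
// session ID, or similar), as opposed to the proxy-facing `session_id` on
// LiteLLM_WorkflowRun.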
model LiteLLM_WorkflowMessage {
message_id String @id @default(uuid())
run_id String
role String
content String
sequence_number Int
session_id String?
created_at DateTime @default(now())
run LiteLLM_WorkflowRun @relation(fields: [run_id], references: [run_id])
@@unique([run_id, sequence_number])
@@index([run_id])
}