litellm

Author	SHA1	Message	Date
ishaan-berri	f9ba70d357	fix(bedrock-mantle): use /anthropic/v1/messages path for Mantle endpo… (#27976) * fix(bedrock-mantle): use /anthropic/v1/messages path for Mantle endpoint (#27943) * docs: add one-line docstring to _disable_debugging (#27894) Squash-merged by litellm-agent from oss-agent-shin's PR. * Add jp. Bedrock cross-region inference profile for claude-sonnet-4-6 (#27831) Squash-merged by litellm-agent from Cyberfilo's PR. * Sanitize empty text content blocks on /v1/messages (#27832) Squash-merged by litellm-agent from Cyberfilo's PR. * fix(bedrock-mantle): use /anthropic/v1/messages path for Mantle endpoint The bedrock-mantle gateway (Claude Mythos Preview) serves the Anthropic Messages API at /anthropic/v1/messages; /v1/messages returns 404 Not Found. Both AmazonMantleConfig (chat/completions caller route) and AmazonMantleMessagesConfig (anthropic-messages caller route) hardcoded the wrong path, so every Mantle request 404'd before reaching the model. Per the Anthropic docs: "[Claude in Amazon Bedrock] uses the Messages API at /anthropic/v1/messages with SSE streaming." https://platform.claude.com/docs/en/api/claude-on-amazon-bedrock Confirmed independently against the live endpoint: /v1/chat/completions -> 200 OK /v1/messages -> 404 Not Found (what litellm used) /anthropic/v1/messages -> 200 OK (Claude only) Adds a regression test asserting both Mantle configs build the /anthropic/v1/messages path, and updates the existing assertions that encoded the wrong path.
--------- Co-authored-by: oss-agent-shin <ext-agent-shin@berri.ai> Co-authored-by: Filippo Menghi <113345637+Cyberfilo@users.noreply.github.com> * fix: sanitize empty text blocks in sync anthropic_messages_handler path Co-authored-by: Yassin Kortam <yassin@berri.ai> --------- Co-authored-by: João Costa <13508071+jpv-costa@users.noreply.github.com> Co-authored-by: oss-agent-shin <ext-agent-shin@berri.ai> Co-authored-by: Filippo Menghi <113345637+Cyberfilo@users.noreply.github.com> Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Yassin Kortam <yassin@berri.ai>	2026-05-15 13:31:59 -07:00
lmcdonald-godaddy	baa68ebb12	fix(pricing): GPT-4o-Transcribe Pricing (#27875) * Update gpt-4o-transcribe price * Update test for gpt-4o-transcribe pricing fix * Update gpt-4o-mini-transcribe price	2026-05-13 17:42:05 -07:00
Sameer Kankute	a74e269f7d	fix(cost): align vertex_ai/gemini-embedding-2-preview with Vertex multimodal pricing (#27848) * fix(cost): align vertex_ai/gemini-embedding-2-preview with Vertex multimodal pricing Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cost): align vertex_ai/gemini-embedding-2 GA source URL with preview Per Greptile review on #27848: GA entry referenced ai.google.dev while the preview entry was updated to the canonical Vertex AI pricing page. Both share identical pricing values; sync the source URL for consistency. https://claude.ai/code/session_01W8jRwstnmduadGw8Z8egxe --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Claude <noreply@anthropic.com>	2026-05-13 19:05:53 +00:00
superpoussin22	4801425336	Add gpt-realtime-2 model pricing	2026-05-11 17:49:53 +02:00
oss-agent-shin	f2e97380d2	Add OpenRouter Qwen 3.6 Plus metadata (#27486) Co-authored-by: oss-agent-shin <279349115+oss-agent-shin@users.noreply.github.com> Co-authored-by: ishaan-berri <ishaan-berri@users.noreply.github.com>	2026-05-08 16:25:45 -07:00
ishaan-berri	fee5900acc	feat(xai): add grok-4.3 and grok-4.3-latest to model_prices_and_conte… (#27154) * feat(xai): add grok-4.3 and grok-4.3-latest to model_prices_and_context_window.json xAI's docs page now lists grok-4.3 as the recommended chat / coding model: "We strongly recommend all API callers use grok-4.3. It is the most intelligent and fastest model we've built." (https://docs.x.ai/docs/models) Pricing/specs sourced from xAI's published model metadata: - input: $1.25 / 1M tokens (<=200k), $2.50 / 1M tokens (>200k) - output: $2.50 / 1M tokens (<=200k), $5.00 / 1M tokens (>200k) - cached: $0.20 / 1M tokens (<=200k), $0.40 / 1M tokens (>200k) - context: 1,000,000 tokens - capabilities: vision, reasoning, function calling, structured outputs, prompt caching, web search Adds two entries: `xai/grok-4.3` (canonical) and `xai/grok-4.3-latest` (alias), mirroring the pattern used for the rest of the xAI/Grok-4 family. * test(xai): add model_info test for grok-4.3 + sync backup cost map - Mirror xai/grok-4.3 and xai/grok-4.3-latest entries into litellm/model_prices_and_context_window_backup.json so the bundled model cost map matches the canonical model_prices_and_context_window.json. - Add tests/test_litellm/test_xai_grok_4_3_model_metadata.py covering pricing tiers, capability flags, context window, provider routing, and parity between the main and backup cost maps. - Point 'source' at the live xAI models page (the per-model URL https://docs.x.ai/docs/models/grok-4.3 currently 404s). Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com> --------- Co-authored-by: shin-watcher <shin-watcher@berri.ai> Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>	2026-05-07 09:06:56 -07:00
ishaan-berri	924c141843	Add new chat model metadata (#27313) * add new model metadata Co-authored-by: ishaan-berri <ishaan-berri@users.noreply.github.com> * address review feedback Co-authored-by: ishaan-berri <ishaan-berri@users.noreply.github.com> --------- Co-authored-by: oss-agent-shin <279349115+oss-agent-shin@users.noreply.github.com> Co-authored-by: ishaan-berri <ishaan-berri@users.noreply.github.com>	2026-05-06 15:15:21 -07:00
Cursor Agent	98ced0ae43	refactor(anthropic): drive adaptive-thinking gate via supports_adaptive_thinking flag Three of greptile's open comments on #27074 (P2 converse:512, P1 databricks:361, and the underlying capability-flag policy rule) flagged the same pattern: _is_claude_4_6_model(...) or _is_claude_4_7_model(...) used inline as a runtime 'is this an adaptive-thinking model?' check. That requires a code release each time a new adaptive Claude lands. Consolidate the inline gating to AnthropicModelInfo._is_adaptive_thinking_model, and switch the helper itself to read a new supports_adaptive_thinking flag from `model_prices_and_context_window.json` via `_supports_factory`, falling back to the family pattern only when the model-map entry doesn't carry the flag (preserves OpenRouter / Vercel / Bedrock-prefixed variants that route through the same code path with non-canonical ids). Adds `supports_adaptive_thinking: true` to the four 4.6/4.7 anthropic entries (opus-4-6 + dated, opus-4-7 + dated, sonnet-4-6). Bedrock-prefixed and Vertex-prefixed entries don't need the flag because both fall back through the family pattern (the helper short-circuits early on True from either path) and the bedrock/vertex Claude IDs all match the existing opus-4-{6,7} / sonnet-4-{6,7} pattern. Affected call sites: - `bedrock/chat/converse_transformation.py:_handle_reasoning_effort_parameter` - `anthropic/chat/transformation.py:_map_reasoning_effort` - `anthropic/chat/transformation.py:map_openai_params` (output_config branch) - `databricks/chat/transformation.py:map_openai_params` (output_config branch) The remaining `_is_claude_4_6_model` / `_is_claude_4_7_model` references in `AnthropicConfig._validate_effort_for_model` and `AnthropicConfig.get_supported_openai_params` are intentionally retained: they're per-model gating fallbacks for variants whose model-map entries don't yet carry the `supports_max_reasoning_effort` / `supports_reasoning` flag. Those are documented in-place. 
Tests: 537 anthropic/bedrock/databricks/vertex/messages tests pass. Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>	2026-05-04 18:58:22 +00:00
mateo-berri	108b87fb24	fix(anthropic,bedrock,databricks): four reasoning_effort follow-ups - claude-sonnet-4-6 + reasoning_effort=max no longer 400s. Renamed _is_opus_4_6_model to _is_claude_4_6_model at three sites and added supports_max_reasoning_effort: true to 12 model entries in the JSON cost map (10 sonnet 4.6 ids + OpenRouter opus 4.6/4.7). - _map_reasoning_effort now raises BadRequestError(400) directly with llm_provider, instead of letting Databricks (and similar callers) surface its raw ValueError as a 500. - output_config.effort on Opus 4.5 over Bedrock no longer 400s for missing effort-2025-11-24 beta. Flipped JSON to "effort-2025-11-24" for bedrock + bedrock_converse and added an auto-attach branch in _process_tools_and_beta for non-adaptive Anthropic + output_config on Converse. - reasoning_effort=xhigh / =max on legacy budget-mode models (Haiku 4.5, Sonnet 4.5, Opus 4.5) now map to thinking.budget_tokens 8192 / 16384 instead of returning 400. Added two constants in litellm/constants.py. Tests updated for all four flips. Validated end-to-end via 306-cell live proxy matrix (6 model families x 3 routes x 17 effort cases), all pass.	2026-05-03 10:03:53 -07:00
mateo-berri	36f1f13925	fix(anthropic): drive output_config.effort support from model map flags Replace hardcoded _EFFORT_SUPPORTING_MODEL_PATTERNS with a JSON-backed check that uses supports_*_reasoning_effort flags from the model map. Add supports_minimal_reasoning_effort: true to opus-4-5 and mythos-preview entries (which previously only carried supports_reasoning) so the JSON remains the single source of truth for effort capability.	2026-05-03 11:47:19 +00:00
Cursor Agent	a6c673e7b9	fix(anthropic,bedrock,vertex): forward output_config.effort + 400 on garbage reasoning_effort Follow-up bugs surfaced by the QA sweep on PR #27039 (https://github.com/BerriAI/litellm/pull/27039#issuecomment-4363363610). 1. Stop stripping output_config.effort on Bedrock + Vertex adaptive routes. - Vertex AI Claude 4.6/4.7 accepts output_config.effort on rawPredict (verified end-to-end against us-east5 / global). The strip helper now no-ops for effort. - Bedrock Converse routes output_config into additionalModelRequestFields for anthropic base models so the requested adaptive tier (low/medium/high/xhigh/max) actually reaches the wire instead of all collapsing to identical thinking. - Bedrock Invoke chat transformation (AmazonAnthropicClaudeConfig) stops popping output_config from the post-AnthropicConfig request body. - Bedrock Invoke /v1/messages allowlist (BedrockInvokeAnthropicMessagesRequest) now lists output_config so the runtime allowlist filter forwards it. 2. Validate effort across Bedrock Converse so 'disabled' / 'invalid' / '' / unsupported tiers (xhigh/max on Sonnet 4.6 or budget-mode 4.5 models) surface as a clean 400 BadRequestError instead of 500. 3. ValueError -> BadRequestError throughout (AnthropicConfig.map_openai_params, _apply_output_config, AmazonConverseConfig._handle_reasoning_effort_parameter). Empty-string effort is now rejected (was silently passing the 'if effort and ...' short-circuit). 4. Floor reasoning_effort='minimal' at the Anthropic provider minimum (1024 budget_tokens) via new ANTHROPIC_MIN_THINKING_BUDGET_TOKENS so it's a usable tier on direct Anthropic / Azure AI Anthropic / Vertex AI Anthropic / Bedrock Invoke (all of which 400 below 1024). 5. model_prices: dedupe duplicate supports_max_reasoning_effort key on claude-opus-4-7 / claude-opus-4-7-20260416.
Adds regression tests across all five affected paths; existing tests asserting the silent-strip behavior were updated to reflect the new pass-through and clean 400 surfaces. Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>	2026-05-03 04:18:50 -07:00
Cursor Agent	a30bcc9a41	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_hotfix_gpt-5.5-minimal-flag # Conflicts: # tests/test_litellm/llms/vertex_ai/test_vertex_ai_common_utils.py Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>	2026-05-02 05:55:51 +00:00
mateo-berri	04e96a9bdc	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_clean_litellm_oss_staging_04_01_2026	2026-05-01 15:54:10 -07:00
yuneng-jiang	02582466c4	Merge pull request #24340 from BerriAI/litellm_staging_03_21_2026 Litellm staging 03 21 2026	2026-05-01 11:57:44 -07:00
Sameer Kankute	e656b2a47b	correct model map	2026-05-01 18:07:33 +05:30
Sameer Kankute	19813527fa	feat(vertex_ai): Model Garden OpenAPI for publisher model ids - Route publisher/model ids (e.g. xai/grok) to .../endpoints/openapi; keep model in JSON body - Add model_prices keys for vertex_ai/openai/xai/grok-* - Document xAI Grok on vertex_partner (aligned with GPT-OSS) - Add tests for create_vertex_url and body-model heuristic Made-with: Cursor	2026-05-01 18:05:08 +05:30
Emmanuel Acheampong	f8ba2d750b	fix(crusoe): fix streaming doc model typo and add supports_vision for Gemma 3 - Streaming example referenced Llama-3.1 instead of Llama-3.3 - Add supports_vision: true for gemma-3-12b-it in both JSON files, matching other providers (bedrock, novita)	2026-05-01 17:27:52 +05:30
Emmanuel Acheampong	51f8e5a57b	feat(crusoe): add supports_reasoning flag for DeepSeek-R1 and Kimi-K2-Thinking These are reasoning/thinking models but were missing the flag, causing litellm.supports_reasoning() to return False and reasoning-token handling to not activate. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-01 17:27:52 +05:30
Emmanuel Acheampong	caa0db3843	adding crusoe to litellm	2026-05-01 17:27:34 +05:30
Cursor Agent	3f5c589255	fix(bedrock): add 1-hour cache write tier for Claude 4.5/4.6/4.7 (Global, US) AWS Bedrock pricing publishes a separate 1-hour prompt-cache write rate for Claude 4.5 / 4.6 / 4.7 (1.6x the 5-minute rate). Without `cache_creation_input_token_cost_above_1hr`, cost tracking for 1-hour-TTL prompt caching on Bedrock falls back to the 5-minute rate and undercounts spend by ~60%. Adds the field to the spot-checked Global and US-region entries: - anthropic.claude-opus-4-7 (Global $10.00 / MTok) - anthropic.claude-opus-4-6-v1 (Global $10.00 / MTok) - anthropic.claude-opus-4-5-... (Global $10.00 / MTok) - anthropic.claude-sonnet-4-6 (Global $6.00 / MTok) - anthropic.claude-sonnet-4-5-... (Global $6.00 / MTok regular, $12.00 / MTok long-context >200K) - anthropic.claude-haiku-4-5-... (Global $2.00 / MTok) - global.anthropic.* mirrors of the above - us.anthropic.* mirrors at the US +10% premium Also updates the long-context (>200K) variants of Sonnet 4.5 with `cache_creation_input_token_cost_above_1hr_above_200k_tokens`. The mirrored entries in `litellm/model_prices_and_context_window_backup.json` are updated in lockstep. EU / AU / APAC / JP / us-gov regional variants are out of scope for this change pending separate verification against AWS Bedrock pricing for those regions. Adds tests/test_litellm/test_bedrock_anthropic_1hr_cache_pricing.py to lock in the expected values and the 1.6x ratio invariant. Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>	2026-04-29 19:21:57 +00:00
ishaan-berri	4ae2996f08	Add gpt-image-2 support (#26644) (#26705) * Add gpt-image-2 support * Address gpt-image-2 PR feedback Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>	2026-04-28 20:10:42 -07:00
Liam McDonald	503c3921c8	Fix gpt-5.5-pro pricing	2026-04-27 15:33:59 -07:00
Mateo Wang	319193604c	[Feat] Add azure/gpt-5.5 + azure/gpt-5.5-pro entries (+ dated variants) (#26361) * feat(azure): add azure/gpt-5.5 + azure/gpt-5.5-pro entries (+ dated variants) Azure variants of OpenAI's GPT-5.5 family. Microsoft has not yet shipped GPT-5.5 on Azure OpenAI (latest GA on the Foundry models page is GPT-5.4 as of 2026-04-24), but adding the entries day-0 mirrors the established precedent for azure/gpt-5.4* (which were in the cost map before the Azure rollout) so cost tracking and capability flags work the moment customers deploy. Schema follows the existing azure/gpt-5.4* shape: - Same base/long-context pricing as openai/gpt-5.5: $5/$30 chat, $60/$360 pro per 1M, with priority tier 2x base - Azure variants drop the flex/batches keys (Azure has no flex tier) but keep priority pricing, matching gpt-5.4 precedent - mode=chat for the thinking model, mode=responses for pro reasoning_effort capability flags mirror the OpenAI variants exactly since Azure proxies the same API contract: minimal rejection on both chat and pro, low/none rejection on pro. Once #26456 (which sets supports_low_reasoning_effort + minimal=false on openai/gpt-5.5) lands, OpenAI and Azure flag profiles align. Tests pin entry presence + pricing for all four Azure variants and verify the live-API-derived reasoning_effort flags. test: register supports_low_reasoning_effort in cost-map JSON schema azure/gpt-5.5-pro and azure/gpt-5.5-pro-2026-04-23 added in this branch carry supports_low_reasoning_effort=false. The strict 'additionalProperties: false' schema in test_aaamodel_prices_and_context_window_json_is_valid rejected the new key. Register it alongside the other supports_*_reasoning_effort entries. Note: the runtime side of this flag (code that reads it) lands in #26456. Until that PR merges the flag is inert for both Azure and OpenAI pro entries, but having the schema accept it lets cost-map tests pass on either merge order.	2026-04-25 14:19:59 -07:00
Chesars	91e78eca3d	Merge remote-tracking branch 'upstream/litellm_internal_staging' into upstream-litellm_staging_03_21_2026 # Conflicts: # .circleci/config.yml # .circleci/requirements.txt # .github/workflows/_test-unit-base.yml # .github/workflows/_test-unit-services-base.yml # .github/workflows/auto_update_price_and_context_window.yml # .github/workflows/create-release.yml # .github/workflows/llm-translation-testing.yml # .github/workflows/publish_to_pypi.yml # .github/workflows/scan_duplicate_issues.yml # .github/workflows/test-linting.yml # .github/workflows/test-litellm-matrix.yml # .github/workflows/test-litellm.yml # .github/workflows/test-mcp.yml # .github/workflows/test-model-map.yaml # .github/workflows/test-proxy-e2e-azure-batches.yml # .github/workflows/test-unit-core-utils.yml # .github/workflows/test-unit-documentation.yml # .github/workflows/test-unit-enterprise-routing.yml # .github/workflows/test-unit-integrations.yml # .github/workflows/test-unit-llm-providers.yml # .github/workflows/test-unit-misc.yml # .github/workflows/test-unit-proxy-auth.yml # .github/workflows/test-unit-proxy-db.yml # .github/workflows/test-unit-proxy-endpoints.yml # .github/workflows/test-unit-proxy-infra.yml # .github/workflows/test-unit-proxy-legacy.yml # .github/workflows/test-unit-responses-caching-types.yml # .github/workflows/test-unit-security.yml # .github/workflows/test_server_root_path.yml # docs/my-website/docs/embedding/supported_embedding.md # litellm/litellm_core_utils/get_llm_provider_logic.py # litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_transformation.py # litellm/proxy/_experimental/out/404/index.html # litellm/proxy/_experimental/out/__next.__PAGE__.txt # litellm/proxy/_experimental/out/__next._full.txt # litellm/proxy/_experimental/out/__next._head.txt # litellm/proxy/_experimental/out/__next._index.txt # litellm/proxy/_experimental/out/__next._tree.txt # 
litellm/proxy/_experimental/out/_next/static/3qyC5Vtvhd5fSC6sPp1iW/_buildManifest.js # litellm/proxy/_experimental/out/_next/static/3qyC5Vtvhd5fSC6sPp1iW/_clientMiddlewareManifest.json # litellm/proxy/_experimental/out/_next/static/3qyC5Vtvhd5fSC6sPp1iW/_ssgManifest.js # litellm/proxy/_experimental/out/_next/static/aKKihXXKRJWLQThZgi8Rq/_buildManifest.js # litellm/proxy/_experimental/out/_next/static/aKKihXXKRJWLQThZgi8Rq/_clientMiddlewareManifest.json # litellm/proxy/_experimental/out/_next/static/aKKihXXKRJWLQThZgi8Rq/_ssgManifest.js # litellm/proxy/_experimental/out/_next/static/bmMTxs1O5fQKYcsMNTRMT/_buildManifest.js # litellm/proxy/_experimental/out/_next/static/bmMTxs1O5fQKYcsMNTRMT/_clientMiddlewareManifest.json # litellm/proxy/_experimental/out/_next/static/bmMTxs1O5fQKYcsMNTRMT/_ssgManifest.js # litellm/proxy/_experimental/out/_next/static/chunks/11362340846735c3.js # litellm/proxy/_experimental/out/_next/static/chunks/1a04d31843c96649.js # litellm/proxy/_experimental/out/_next/static/chunks/342c7d7210247a5e.js # litellm/proxy/_experimental/out/_next/static/chunks/39768ec0eebd2554.js # litellm/proxy/_experimental/out/_next/static/chunks/3b3c0b070b14da06.js # litellm/proxy/_experimental/out/_next/static/chunks/3bddc72a3ecc2253.js # litellm/proxy/_experimental/out/_next/static/chunks/4472ece1be7379b3.js # litellm/proxy/_experimental/out/_next/static/chunks/54e29148cb2f2582.js # litellm/proxy/_experimental/out/_next/static/chunks/67ddb5107368a659.js # litellm/proxy/_experimental/out/_next/static/chunks/6a167cef4b09b496.js # litellm/proxy/_experimental/out/_next/static/chunks/7174130ddef406dd.js # litellm/proxy/_experimental/out/_next/static/chunks/7c36bfe1ba5e3ba8.js # litellm/proxy/_experimental/out/_next/static/chunks/7e5fe5584502da06.js # litellm/proxy/_experimental/out/_next/static/chunks/8dda507c226082ca.js # litellm/proxy/_experimental/out/_next/static/chunks/8dfde809dc4ad794.js # litellm/proxy/_experimental/out/_next/static/chunks/99109c78121231a0.js # 
litellm/proxy/_experimental/out/_next/static/chunks/9dd55e1f36a7225c.js # litellm/proxy/_experimental/out/_next/static/chunks/a230559fcabaea23.js # litellm/proxy/_experimental/out/_next/static/chunks/a6c7f80b3968f639.js # litellm/proxy/_experimental/out/_next/static/chunks/ac9e96d21c200b48.js # litellm/proxy/_experimental/out/_next/static/chunks/ae9cf43b8c0c76aa.js # litellm/proxy/_experimental/out/_next/static/chunks/cf06797ce4e438f9.js # litellm/proxy/_experimental/out/_next/static/chunks/d069df5baead6d90.js # litellm/proxy/_experimental/out/_next/static/chunks/d2e3b7dd6499c245.js # litellm/proxy/_experimental/out/_next/static/chunks/d44e73d8ebac5747.js # litellm/proxy/_experimental/out/_next/static/chunks/dc8a270fee94ced6.js # litellm/proxy/_experimental/out/_next/static/chunks/df6546cd8a44d3b3.js # litellm/proxy/_experimental/out/_next/static/chunks/ea0f22bd4b3393bd.js # litellm/proxy/_experimental/out/_next/static/chunks/eaa9f9b9bb3e054b.js # litellm/proxy/_experimental/out/_next/static/chunks/turbopack-901b35f89c1f6751.js # litellm/proxy/_experimental/out/_next/static/chunks/turbopack-d1b22f5e0bd58c57.js # litellm/proxy/_experimental/out/_next/static/chunks/turbopack-ddedb29a5eb0118f.js # litellm/proxy/_experimental/out/_not-found.txt # litellm/proxy/_experimental/out/_not-found/__next._full.txt # litellm/proxy/_experimental/out/_not-found/__next._head.txt # litellm/proxy/_experimental/out/_not-found/__next._index.txt # litellm/proxy/_experimental/out/_not-found/__next._not-found.__PAGE__.txt # litellm/proxy/_experimental/out/_not-found/__next._not-found.txt # litellm/proxy/_experimental/out/_not-found/__next._tree.txt # litellm/proxy/_experimental/out/_not-found/index.html # litellm/proxy/_experimental/out/api-reference.html # litellm/proxy/_experimental/out/api-reference.txt # litellm/proxy/_experimental/out/api-reference/__next.!KGRhc2hib2FyZCk.api-reference.__PAGE__.txt # 
litellm/proxy/_experimental/out/api-reference/__next.!KGRhc2hib2FyZCk.api-reference.txt # litellm/proxy/_experimental/out/api-reference/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/api-reference/__next._full.txt # litellm/proxy/_experimental/out/api-reference/__next._head.txt # litellm/proxy/_experimental/out/api-reference/__next._index.txt # litellm/proxy/_experimental/out/api-reference/__next._tree.txt # litellm/proxy/_experimental/out/chat.html # litellm/proxy/_experimental/out/chat.txt # litellm/proxy/_experimental/out/chat/__next._full.txt # litellm/proxy/_experimental/out/chat/__next._head.txt # litellm/proxy/_experimental/out/chat/__next._index.txt # litellm/proxy/_experimental/out/chat/__next._tree.txt # litellm/proxy/_experimental/out/chat/__next.chat.__PAGE__.txt # litellm/proxy/_experimental/out/chat/__next.chat.txt # litellm/proxy/_experimental/out/experimental/api-playground.html # litellm/proxy/_experimental/out/experimental/api-playground.txt # litellm/proxy/_experimental/out/experimental/api-playground/__next.!KGRhc2hib2FyZCk.experimental.api-playground.__PAGE__.txt # litellm/proxy/_experimental/out/experimental/api-playground/__next.!KGRhc2hib2FyZCk.experimental.api-playground.txt # litellm/proxy/_experimental/out/experimental/api-playground/__next.!KGRhc2hib2FyZCk.experimental.txt # litellm/proxy/_experimental/out/experimental/api-playground/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/experimental/api-playground/__next._full.txt # litellm/proxy/_experimental/out/experimental/api-playground/__next._head.txt # litellm/proxy/_experimental/out/experimental/api-playground/__next._index.txt # litellm/proxy/_experimental/out/experimental/api-playground/__next._tree.txt # litellm/proxy/_experimental/out/experimental/budgets.html # litellm/proxy/_experimental/out/experimental/budgets.txt # litellm/proxy/_experimental/out/experimental/budgets/__next.!KGRhc2hib2FyZCk.experimental.budgets.__PAGE__.txt # 
litellm/proxy/_experimental/out/experimental/budgets/__next.!KGRhc2hib2FyZCk.experimental.budgets.txt # litellm/proxy/_experimental/out/experimental/budgets/__next.!KGRhc2hib2FyZCk.experimental.txt # litellm/proxy/_experimental/out/experimental/budgets/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/experimental/budgets/__next._full.txt # litellm/proxy/_experimental/out/experimental/budgets/__next._head.txt # litellm/proxy/_experimental/out/experimental/budgets/__next._index.txt # litellm/proxy/_experimental/out/experimental/budgets/__next._tree.txt # litellm/proxy/_experimental/out/experimental/caching.html # litellm/proxy/_experimental/out/experimental/caching.txt # litellm/proxy/_experimental/out/experimental/caching/__next.!KGRhc2hib2FyZCk.experimental.caching.__PAGE__.txt # litellm/proxy/_experimental/out/experimental/caching/__next.!KGRhc2hib2FyZCk.experimental.caching.txt # litellm/proxy/_experimental/out/experimental/caching/__next.!KGRhc2hib2FyZCk.experimental.txt # litellm/proxy/_experimental/out/experimental/caching/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/experimental/caching/__next._full.txt # litellm/proxy/_experimental/out/experimental/caching/__next._head.txt # litellm/proxy/_experimental/out/experimental/caching/__next._index.txt # litellm/proxy/_experimental/out/experimental/caching/__next._tree.txt # litellm/proxy/_experimental/out/experimental/claude-code-plugins.html # litellm/proxy/_experimental/out/experimental/claude-code-plugins.txt # litellm/proxy/_experimental/out/experimental/claude-code-plugins/__next.!KGRhc2hib2FyZCk.experimental.claude-code-plugins.__PAGE__.txt # litellm/proxy/_experimental/out/experimental/claude-code-plugins/__next.!KGRhc2hib2FyZCk.experimental.claude-code-plugins.txt # litellm/proxy/_experimental/out/experimental/claude-code-plugins/__next.!KGRhc2hib2FyZCk.experimental.txt # litellm/proxy/_experimental/out/experimental/claude-code-plugins/__next.!KGRhc2hib2FyZCk.txt # 
litellm/proxy/_experimental/out/experimental/claude-code-plugins/__next._full.txt # litellm/proxy/_experimental/out/experimental/claude-code-plugins/__next._head.txt # litellm/proxy/_experimental/out/experimental/claude-code-plugins/__next._index.txt # litellm/proxy/_experimental/out/experimental/claude-code-plugins/__next._tree.txt # litellm/proxy/_experimental/out/experimental/old-usage.html # litellm/proxy/_experimental/out/experimental/old-usage.txt # litellm/proxy/_experimental/out/experimental/old-usage/__next.!KGRhc2hib2FyZCk.experimental.old-usage.__PAGE__.txt # litellm/proxy/_experimental/out/experimental/old-usage/__next.!KGRhc2hib2FyZCk.experimental.old-usage.txt # litellm/proxy/_experimental/out/experimental/old-usage/__next.!KGRhc2hib2FyZCk.experimental.txt # litellm/proxy/_experimental/out/experimental/old-usage/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/experimental/old-usage/__next._full.txt # litellm/proxy/_experimental/out/experimental/old-usage/__next._head.txt # litellm/proxy/_experimental/out/experimental/old-usage/__next._index.txt # litellm/proxy/_experimental/out/experimental/old-usage/__next._tree.txt # litellm/proxy/_experimental/out/experimental/prompts.html # litellm/proxy/_experimental/out/experimental/prompts.txt # litellm/proxy/_experimental/out/experimental/prompts/__next.!KGRhc2hib2FyZCk.experimental.prompts.__PAGE__.txt # litellm/proxy/_experimental/out/experimental/prompts/__next.!KGRhc2hib2FyZCk.experimental.prompts.txt # litellm/proxy/_experimental/out/experimental/prompts/__next.!KGRhc2hib2FyZCk.experimental.txt # litellm/proxy/_experimental/out/experimental/prompts/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/experimental/prompts/__next._full.txt # litellm/proxy/_experimental/out/experimental/prompts/__next._head.txt # litellm/proxy/_experimental/out/experimental/prompts/__next._index.txt # litellm/proxy/_experimental/out/experimental/prompts/__next._tree.txt # 
litellm/proxy/_experimental/out/experimental/tag-management.html # litellm/proxy/_experimental/out/experimental/tag-management.txt # litellm/proxy/_experimental/out/experimental/tag-management/__next.!KGRhc2hib2FyZCk.experimental.tag-management.__PAGE__.txt # litellm/proxy/_experimental/out/experimental/tag-management/__next.!KGRhc2hib2FyZCk.experimental.tag-management.txt # litellm/proxy/_experimental/out/experimental/tag-management/__next.!KGRhc2hib2FyZCk.experimental.txt # litellm/proxy/_experimental/out/experimental/tag-management/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/experimental/tag-management/__next._full.txt # litellm/proxy/_experimental/out/experimental/tag-management/__next._head.txt # litellm/proxy/_experimental/out/experimental/tag-management/__next._index.txt # litellm/proxy/_experimental/out/experimental/tag-management/__next._tree.txt # litellm/proxy/_experimental/out/guardrails.html # litellm/proxy/_experimental/out/guardrails.txt # litellm/proxy/_experimental/out/guardrails/__next.!KGRhc2hib2FyZCk.guardrails.__PAGE__.txt # litellm/proxy/_experimental/out/guardrails/__next.!KGRhc2hib2FyZCk.guardrails.txt # litellm/proxy/_experimental/out/guardrails/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/guardrails/__next._full.txt # litellm/proxy/_experimental/out/guardrails/__next._head.txt # litellm/proxy/_experimental/out/guardrails/__next._index.txt # litellm/proxy/_experimental/out/guardrails/__next._tree.txt # litellm/proxy/_experimental/out/index.html # litellm/proxy/_experimental/out/index.txt # litellm/proxy/_experimental/out/login.html # litellm/proxy/_experimental/out/login.txt # litellm/proxy/_experimental/out/login/__next._full.txt # litellm/proxy/_experimental/out/login/__next._head.txt # litellm/proxy/_experimental/out/login/__next._index.txt # litellm/proxy/_experimental/out/login/__next._tree.txt # litellm/proxy/_experimental/out/login/__next.login.__PAGE__.txt # 
litellm/proxy/_experimental/out/login/__next.login.txt # litellm/proxy/_experimental/out/logs.html # litellm/proxy/_experimental/out/logs.txt # litellm/proxy/_experimental/out/logs/__next.!KGRhc2hib2FyZCk.logs.__PAGE__.txt # litellm/proxy/_experimental/out/logs/__next.!KGRhc2hib2FyZCk.logs.txt # litellm/proxy/_experimental/out/logs/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/logs/__next._full.txt # litellm/proxy/_experimental/out/logs/__next._head.txt # litellm/proxy/_experimental/out/logs/__next._index.txt # litellm/proxy/_experimental/out/logs/__next._tree.txt # litellm/proxy/_experimental/out/mcp/oauth/callback.txt # litellm/proxy/_experimental/out/mcp/oauth/callback/__next._full.txt # litellm/proxy/_experimental/out/mcp/oauth/callback/__next._head.txt # litellm/proxy/_experimental/out/mcp/oauth/callback/__next._index.txt # litellm/proxy/_experimental/out/mcp/oauth/callback/__next._tree.txt # litellm/proxy/_experimental/out/mcp/oauth/callback/__next.mcp.oauth.callback.__PAGE__.txt # litellm/proxy/_experimental/out/mcp/oauth/callback/__next.mcp.oauth.callback.txt # litellm/proxy/_experimental/out/mcp/oauth/callback/__next.mcp.oauth.txt # litellm/proxy/_experimental/out/mcp/oauth/callback/__next.mcp.txt # litellm/proxy/_experimental/out/mcp/oauth/callback/index.html # litellm/proxy/_experimental/out/model-hub.html # litellm/proxy/_experimental/out/model-hub.txt # litellm/proxy/_experimental/out/model-hub/__next.!KGRhc2hib2FyZCk.model-hub.__PAGE__.txt # litellm/proxy/_experimental/out/model-hub/__next.!KGRhc2hib2FyZCk.model-hub.txt # litellm/proxy/_experimental/out/model-hub/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/model-hub/__next._full.txt # litellm/proxy/_experimental/out/model-hub/__next._head.txt # litellm/proxy/_experimental/out/model-hub/__next._index.txt # litellm/proxy/_experimental/out/model-hub/__next._tree.txt # litellm/proxy/_experimental/out/model_hub.html # litellm/proxy/_experimental/out/model_hub.txt # 
litellm/proxy/_experimental/out/model_hub/__next._full.txt # litellm/proxy/_experimental/out/model_hub/__next._head.txt # litellm/proxy/_experimental/out/model_hub/__next._index.txt # litellm/proxy/_experimental/out/model_hub/__next._tree.txt # litellm/proxy/_experimental/out/model_hub/__next.model_hub.__PAGE__.txt # litellm/proxy/_experimental/out/model_hub/__next.model_hub.txt # litellm/proxy/_experimental/out/model_hub_table.html # litellm/proxy/_experimental/out/model_hub_table.txt # litellm/proxy/_experimental/out/model_hub_table/__next._full.txt # litellm/proxy/_experimental/out/model_hub_table/__next._head.txt # litellm/proxy/_experimental/out/model_hub_table/__next._index.txt # litellm/proxy/_experimental/out/model_hub_table/__next._tree.txt # litellm/proxy/_experimental/out/model_hub_table/__next.model_hub_table.__PAGE__.txt # litellm/proxy/_experimental/out/model_hub_table/__next.model_hub_table.txt # litellm/proxy/_experimental/out/models-and-endpoints.html # litellm/proxy/_experimental/out/models-and-endpoints.txt # litellm/proxy/_experimental/out/models-and-endpoints/__next.!KGRhc2hib2FyZCk.models-and-endpoints.__PAGE__.txt # litellm/proxy/_experimental/out/models-and-endpoints/__next.!KGRhc2hib2FyZCk.models-and-endpoints.txt # litellm/proxy/_experimental/out/models-and-endpoints/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/models-and-endpoints/__next._full.txt # litellm/proxy/_experimental/out/models-and-endpoints/__next._head.txt # litellm/proxy/_experimental/out/models-and-endpoints/__next._index.txt # litellm/proxy/_experimental/out/models-and-endpoints/__next._tree.txt # litellm/proxy/_experimental/out/onboarding.html # litellm/proxy/_experimental/out/onboarding.txt # litellm/proxy/_experimental/out/onboarding/__next._full.txt # litellm/proxy/_experimental/out/onboarding/__next._head.txt # litellm/proxy/_experimental/out/onboarding/__next._index.txt # litellm/proxy/_experimental/out/onboarding/__next._tree.txt # 
litellm/proxy/_experimental/out/onboarding/__next.onboarding.__PAGE__.txt # litellm/proxy/_experimental/out/onboarding/__next.onboarding.txt # litellm/proxy/_experimental/out/organizations.html # litellm/proxy/_experimental/out/organizations.txt # litellm/proxy/_experimental/out/organizations/__next.!KGRhc2hib2FyZCk.organizations.__PAGE__.txt # litellm/proxy/_experimental/out/organizations/__next.!KGRhc2hib2FyZCk.organizations.txt # litellm/proxy/_experimental/out/organizations/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/organizations/__next._full.txt # litellm/proxy/_experimental/out/organizations/__next._head.txt # litellm/proxy/_experimental/out/organizations/__next._index.txt # litellm/proxy/_experimental/out/organizations/__next._tree.txt # litellm/proxy/_experimental/out/playground.html # litellm/proxy/_experimental/out/playground.txt # litellm/proxy/_experimental/out/playground/__next.!KGRhc2hib2FyZCk.playground.__PAGE__.txt # litellm/proxy/_experimental/out/playground/__next.!KGRhc2hib2FyZCk.playground.txt # litellm/proxy/_experimental/out/playground/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/playground/__next._full.txt # litellm/proxy/_experimental/out/playground/__next._head.txt # litellm/proxy/_experimental/out/playground/__next._index.txt # litellm/proxy/_experimental/out/playground/__next._tree.txt # litellm/proxy/_experimental/out/policies.html # litellm/proxy/_experimental/out/policies.txt # litellm/proxy/_experimental/out/policies/__next.!KGRhc2hib2FyZCk.policies.__PAGE__.txt # litellm/proxy/_experimental/out/policies/__next.!KGRhc2hib2FyZCk.policies.txt # litellm/proxy/_experimental/out/policies/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/policies/__next._full.txt # litellm/proxy/_experimental/out/policies/__next._head.txt # litellm/proxy/_experimental/out/policies/__next._index.txt # litellm/proxy/_experimental/out/policies/__next._tree.txt # 
litellm/proxy/_experimental/out/settings/admin-settings.html # litellm/proxy/_experimental/out/settings/admin-settings.txt # litellm/proxy/_experimental/out/settings/admin-settings/__next.!KGRhc2hib2FyZCk.settings.admin-settings.__PAGE__.txt # litellm/proxy/_experimental/out/settings/admin-settings/__next.!KGRhc2hib2FyZCk.settings.admin-settings.txt # litellm/proxy/_experimental/out/settings/admin-settings/__next.!KGRhc2hib2FyZCk.settings.txt # litellm/proxy/_experimental/out/settings/admin-settings/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/settings/admin-settings/__next._full.txt # litellm/proxy/_experimental/out/settings/admin-settings/__next._head.txt # litellm/proxy/_experimental/out/settings/admin-settings/__next._index.txt # litellm/proxy/_experimental/out/settings/admin-settings/__next._tree.txt # litellm/proxy/_experimental/out/settings/logging-and-alerts.html # litellm/proxy/_experimental/out/settings/logging-and-alerts.txt # litellm/proxy/_experimental/out/settings/logging-and-alerts/__next.!KGRhc2hib2FyZCk.settings.logging-and-alerts.__PAGE__.txt # litellm/proxy/_experimental/out/settings/logging-and-alerts/__next.!KGRhc2hib2FyZCk.settings.logging-and-alerts.txt # litellm/proxy/_experimental/out/settings/logging-and-alerts/__next.!KGRhc2hib2FyZCk.settings.txt # litellm/proxy/_experimental/out/settings/logging-and-alerts/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/settings/logging-and-alerts/__next._full.txt # litellm/proxy/_experimental/out/settings/logging-and-alerts/__next._head.txt # litellm/proxy/_experimental/out/settings/logging-and-alerts/__next._index.txt # litellm/proxy/_experimental/out/settings/logging-and-alerts/__next._tree.txt # litellm/proxy/_experimental/out/settings/router-settings.html # litellm/proxy/_experimental/out/settings/router-settings.txt # litellm/proxy/_experimental/out/settings/router-settings/__next.!KGRhc2hib2FyZCk.settings.router-settings.__PAGE__.txt # 
litellm/proxy/_experimental/out/settings/router-settings/__next.!KGRhc2hib2FyZCk.settings.router-settings.txt # litellm/proxy/_experimental/out/settings/router-settings/__next.!KGRhc2hib2FyZCk.settings.txt # litellm/proxy/_experimental/out/settings/router-settings/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/settings/router-settings/__next._full.txt # litellm/proxy/_experimental/out/settings/router-settings/__next._head.txt # litellm/proxy/_experimental/out/settings/router-settings/__next._index.txt # litellm/proxy/_experimental/out/settings/router-settings/__next._tree.txt # litellm/proxy/_experimental/out/settings/ui-theme.html # litellm/proxy/_experimental/out/settings/ui-theme.txt # litellm/proxy/_experimental/out/settings/ui-theme/__next.!KGRhc2hib2FyZCk.settings.txt # litellm/proxy/_experimental/out/settings/ui-theme/__next.!KGRhc2hib2FyZCk.settings.ui-theme.__PAGE__.txt # litellm/proxy/_experimental/out/settings/ui-theme/__next.!KGRhc2hib2FyZCk.settings.ui-theme.txt # litellm/proxy/_experimental/out/settings/ui-theme/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/settings/ui-theme/__next._full.txt # litellm/proxy/_experimental/out/settings/ui-theme/__next._head.txt # litellm/proxy/_experimental/out/settings/ui-theme/__next._index.txt # litellm/proxy/_experimental/out/settings/ui-theme/__next._tree.txt # litellm/proxy/_experimental/out/teams.html # litellm/proxy/_experimental/out/teams.txt # litellm/proxy/_experimental/out/teams/__next.!KGRhc2hib2FyZCk.teams.__PAGE__.txt # litellm/proxy/_experimental/out/teams/__next.!KGRhc2hib2FyZCk.teams.txt # litellm/proxy/_experimental/out/teams/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/teams/__next._full.txt # litellm/proxy/_experimental/out/teams/__next._head.txt # litellm/proxy/_experimental/out/teams/__next._index.txt # litellm/proxy/_experimental/out/teams/__next._tree.txt # litellm/proxy/_experimental/out/test-key.html # litellm/proxy/_experimental/out/test-key.txt 
# litellm/proxy/_experimental/out/test-key/__next.!KGRhc2hib2FyZCk.test-key.__PAGE__.txt # litellm/proxy/_experimental/out/test-key/__next.!KGRhc2hib2FyZCk.test-key.txt # litellm/proxy/_experimental/out/test-key/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/test-key/__next._full.txt # litellm/proxy/_experimental/out/test-key/__next._head.txt # litellm/proxy/_experimental/out/test-key/__next._index.txt # litellm/proxy/_experimental/out/test-key/__next._tree.txt # litellm/proxy/_experimental/out/tools/mcp-servers.html # litellm/proxy/_experimental/out/tools/mcp-servers.txt # litellm/proxy/_experimental/out/tools/mcp-servers/__next.!KGRhc2hib2FyZCk.tools.mcp-servers.__PAGE__.txt # litellm/proxy/_experimental/out/tools/mcp-servers/__next.!KGRhc2hib2FyZCk.tools.mcp-servers.txt # litellm/proxy/_experimental/out/tools/mcp-servers/__next.!KGRhc2hib2FyZCk.tools.txt # litellm/proxy/_experimental/out/tools/mcp-servers/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/tools/mcp-servers/__next._full.txt # litellm/proxy/_experimental/out/tools/mcp-servers/__next._head.txt # litellm/proxy/_experimental/out/tools/mcp-servers/__next._index.txt # litellm/proxy/_experimental/out/tools/mcp-servers/__next._tree.txt # litellm/proxy/_experimental/out/tools/vector-stores.html # litellm/proxy/_experimental/out/tools/vector-stores.txt # litellm/proxy/_experimental/out/tools/vector-stores/__next.!KGRhc2hib2FyZCk.tools.txt # litellm/proxy/_experimental/out/tools/vector-stores/__next.!KGRhc2hib2FyZCk.tools.vector-stores.__PAGE__.txt # litellm/proxy/_experimental/out/tools/vector-stores/__next.!KGRhc2hib2FyZCk.tools.vector-stores.txt # litellm/proxy/_experimental/out/tools/vector-stores/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/tools/vector-stores/__next._full.txt # litellm/proxy/_experimental/out/tools/vector-stores/__next._head.txt # litellm/proxy/_experimental/out/tools/vector-stores/__next._index.txt # 
litellm/proxy/_experimental/out/tools/vector-stores/__next._tree.txt # litellm/proxy/_experimental/out/usage.html # litellm/proxy/_experimental/out/usage.txt # litellm/proxy/_experimental/out/usage/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/usage/__next.!KGRhc2hib2FyZCk.usage.__PAGE__.txt # litellm/proxy/_experimental/out/usage/__next.!KGRhc2hib2FyZCk.usage.txt # litellm/proxy/_experimental/out/usage/__next._full.txt # litellm/proxy/_experimental/out/usage/__next._head.txt # litellm/proxy/_experimental/out/usage/__next._index.txt # litellm/proxy/_experimental/out/usage/__next._tree.txt # litellm/proxy/_experimental/out/users.html # litellm/proxy/_experimental/out/users.txt # litellm/proxy/_experimental/out/users/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/users/__next.!KGRhc2hib2FyZCk.users.__PAGE__.txt # litellm/proxy/_experimental/out/users/__next.!KGRhc2hib2FyZCk.users.txt # litellm/proxy/_experimental/out/users/__next._full.txt # litellm/proxy/_experimental/out/users/__next._head.txt # litellm/proxy/_experimental/out/users/__next._index.txt # litellm/proxy/_experimental/out/users/__next._tree.txt # litellm/proxy/_experimental/out/virtual-keys.html # litellm/proxy/_experimental/out/virtual-keys.txt # litellm/proxy/_experimental/out/virtual-keys/__next.!KGRhc2hib2FyZCk.txt # litellm/proxy/_experimental/out/virtual-keys/__next.!KGRhc2hib2FyZCk.virtual-keys.__PAGE__.txt # litellm/proxy/_experimental/out/virtual-keys/__next.!KGRhc2hib2FyZCk.virtual-keys.txt # litellm/proxy/_experimental/out/virtual-keys/__next._full.txt # litellm/proxy/_experimental/out/virtual-keys/__next._head.txt # litellm/proxy/_experimental/out/virtual-keys/__next._index.txt # litellm/proxy/_experimental/out/virtual-keys/__next._tree.txt # scripts/install.sh # tests/local_testing/test_get_llm_provider.py	2026-04-25 17:15:24 -03:00
Chesars	ebe16072f2	Merge remote-tracking branch 'upstream/litellm_internal_staging' into litellm_staging_03_23_2026 # Conflicts: # model_prices_and_context_window.json # tests/test_litellm/llms/vertex_ai/multimodal_embeddings/test_vertex_ai_multimodal_embedding_transformation.py	2026-04-25 15:16:13 -03:00
Chesars	384cfdad47	Revert "Merge pull request #24164 from dongyu-turo/feat/update-bedrock-claude-price-above-200k" This reverts commit `b8189ea1de`, reversing changes made to `19c8f3d565`.	2026-04-25 15:04:05 -03:00
Krrish Dholakia	70492cee42	feat(proxy): add /v1/memory CRUD endpoints (#26218 ) * feat(proxy): add /v1/memory CRUD endpoints with user/team scoping New LiteLLM_MemoryTable stores user/team-scoped key/value entries with optional JSON metadata. Value is a String (LLM-readable text) and metadata is an optional Json? envelope, matching the Letta + mem0 hybrid model so future structured fields can be added without a schema migration. Endpoints: POST /v1/memory - create GET /v1/memory - list (caller-scoped; admins see all) GET /v1/memory/{key} - fetch one PUT /v1/memory/{key} - upsert DELETE /v1/memory/{key} - delete Non-admin callers cannot set a user_id/team_id other than their own. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(proxy/memory): omit metadata field when None on create Prisma's Python client rejects `metadata=None` on a `Json?` field with "A value is required but not set" — the field must be omitted from the `data` dict entirely to store SQL NULL. Build the create payload conditionally in both `create_memory` and the PUT-create branch of `upsert_memory`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(ui): add Memory page to view/manage /v1/memory entries Adds a new "Memory" sidebar item under Tools so users can see what their agents have stored. Lists all memories visible to the caller (scoped by the backend), with a key-search filter, preview column, scope tags, and view/edit/delete actions. Create modal accepts optional JSON metadata. - networking.tsx: fetchMemoryList / createMemory / updateMemory / deleteMemory wired to the /v1/memory CRUD endpoints. - MemoryView + MemoryEditModal: new antd-based components (per CLAUDE.md: use antd for new UI, not tremor). - page.tsx + leftnav.tsx: wire the "memory" route + sidebar entry. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(memory): add key_prefix filter + promote Memory to AI GATEWAY nav Backend: - GET /v1/memory now accepts `key_prefix` for Redis-style namespace scans (e.g. `?key_prefix=user:`). When both `key` and `key_prefix` are passed, `key_prefix` wins. - Prefix filter sits under the visibility filter in the Prisma where clause, so it can never leak rows across user/team scopes. - New tests: prefix match, and cross-scope isolation (another user's `user:` rows must not appear in the caller's results). UI: - Memory moved from a Tools submenu to a top-level AI GATEWAY item (alongside Agents, MCP Servers, Skills) — it's an API primitive, not a tool-management surface. - Search box now drives prefix search, matching the Redis mental model ("type the namespace, see everything under it"). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> fix(memory): enforce unique key per scope by using NULLS NOT DISTINCT The unique constraint `(key, user_id, team_id)` on LiteLLM_MemoryTable silently allowed duplicates when user_id or team_id was NULL, because Postgres treats every NULL as distinct by default (ANSI semantics). A caller with no team_id could POST the same key three times and get three rows. Migration: 1. Dedupe existing rows, keeping the most recent per (key, user_id, team_id), using `IS NOT DISTINCT FROM` so NULL == NULL. 2. Drop the old unique index. 3. Recreate it with `NULLS NOT DISTINCT` (Postgres 15+). No code change: POST already returns 409 on unique-violation error messages — it just wasn't firing before because the constraint didn't catch the NULL-team case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(memory): make key globally unique, 409 on any duplicate Switches from the compound unique `(key, user_id, team_id)` to a simple `key @unique`. 
The compound form silently allowed duplicates when user_id or team_id was NULL (Postgres treats each NULL as distinct), so callers could POST the same key repeatedly. Globally-unique key means one row per key, period — any duplicate create → 409. - schema.prisma (×3): `key String @unique`, drop `@@unique(...)`. - initial add_memory_table migration: unique index on (key) only. - Remove the now-unused follow-up NULLS NOT DISTINCT migration. - Endpoint error message simplified ("already exists" — no "for this scope"). - Test fake's create() now enforces global key uniqueness. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ui/memory): full-width layout + user/teams-style columns - Add `w-full` to the MemoryView outer div so the page fills the flex-flex-1 container (was collapsing to intrinsic width). - Replace the combined "Scope" column with separate User ID / Team ID columns, matching the layout of the Users / Teams pages: ID, Name, Preview, User ID, Team ID, Updated, Actions. - IDs render with a truncated mono label + copy-to-clipboard button, same pattern as view_users. - Detail drawer now shows Memory ID / User ID / Team ID as separate fields instead of stacked color tags. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ui/memory): use clean MCP-style ID pill, drop copy icons The ID / User ID / Team ID columns showed a mono text blob with a copy-to-clipboard icon next to each value — too busy compared to the MCP Servers page. Swap the renderer for MCP's pill style: - Truncated mono ID inside a blue Tailwind pill (`font-mono text-blue-600 bg-blue-50 ... rounded-md border`). - No copy icon. Full ID surfaces via tooltip. - ID column is a button that opens the detail drawer on click; user/team ID pills are static (not clickable). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(memory): address greptile review feedback Addresses 5 greptile findings (3/5 → higher confidence target): 1. 
Identity-less orphan rows (P1): non-admin callers with no user_id AND no team_id could create rows that the visibility filter would never match again. Now rejected up front with 400 — caller must authenticate with a scoped key or act as PROXY_ADMIN. 2. Upsert race returning 500 (P1): PUT's check-then-create isn't atomic; a concurrent writer could slip a row in between the 404-check and the create call. Now catch unique-violation on create, re-read, and fall through to update — PUT stays idempotent. If the conflicting row belongs to a different scope, surface a 409 instead of 500. 3. PUT-create scope inconsistency (P2): PUT's create branch always used the caller's own user_id/team_id, so admins couldn't bootstrap rows scoped elsewhere via PUT (only POST). Now PUT-create calls the shared `_resolve_scope()` helper, matching POST semantics. 4. Stale schema comment (P2): schema said "Keyed by (key, user_id, team_id)" but `key` is globally unique. Updated all three schema copies to reflect the actual design. 5. UI silently truncated at 200 (P2): MemoryView fetched pageSize=200 with no load-more. Swapped to real server-side pagination driven by `data.total`; page size is now 50 and the pager is a real AntD control. Also extracts a shared `_resolve_scope()` helper and `_is_unique_violation()` from create_memory so POST and PUT don't drift on the scope/error logic. Tests: +3 new (identity-less 400, PUT admin bootstrap, PUT race → update), 18/18 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(memory): typed Prisma error + explicit-null metadata on PUT Two more greptile threads from the last review: - Unique-violation detection was string-matching "Unique"/"UniqueViolation" in the exception message, fragile across Prisma/driver versions. Now check the typed error `code == "P2002"` first, with string fallback. 
- PUT could not distinguish "metadata omitted" from "metadata: null" — both parsed as `None`, so callers had no way to clear stored metadata. Switch to Pydantic v2's `model_fields_set` to tell which fields the caller actually sent; explicit null now clears the column. New tests: - explicit null clears metadata - omitted metadata preserves existing value Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ui/memory): send explicit null when user clears metadata Addresses the remaining P1 from the last greptile review: When the edit modal's metadata textarea was cleared and saved, `metadataParsed` stayed `undefined`, `JSON.stringify` dropped the key entirely, and the backend's `model_fields_set` guard therefore left the stored metadata untouched — UI showed success but nothing changed. Now: empty textarea on edit → send explicit `null` so the backend sees `metadata` in `model_fields_set` and clears the column. Empty textarea on create still maps to `undefined` (field omitted) to avoid Prisma's `Json? = None` quirk on insert. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ui/memory): preserve slashes in key path encoding The backend route `/v1/memory/{key:path}` supports keys with slashes, but `encodeURIComponent` encoded `/` as `%2F`. Some proxies (nginx default, CloudFlare, AWS ALB) reject or re-decode `%2F` mid-flight, so UI update/delete calls on slash-containing keys could fail or silently misroute. New helper `encodeMemoryKeyForPath` splits by `/`, URL-encodes each segment, then rejoins with literal `/`. Every other unsafe char (spaces, `?`, `#`, `%`) stays encoded per-segment; slashes stay as path delimiters, matching what the `:path` converter expects. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ui/memory): drop misleading client-side column sorters With server-side pagination, client sorters on `key` and `updated_at` only reorder the current page while pretending to sort the full dataset — users would see "sorted by name" but only the visible 50 rows would actually be sorted. Remove the sorters. The backend already returns rows in `updated_at DESC` order (sensible default for a memory view), and users can narrow the result with the key-prefix filter. Greptile also flagged missing `@@map` on the new model as a "consistency" issue, but only 1 of 59 tables in this repo uses `@@map` — the dominant pattern is to rely on Prisma's default (model name == table name). Skipping that finding as a false-positive on convention. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(memory): compose visibility + key filters via explicit AND Greptile P1 (filter-fragility): `where.update(vis)` was semantically correct today, but dict-merging by key meant any future visibility filter that grew a new top-level "OR" would silently clobber the existing key filter. Compose explicitly instead: where = {"AND": [key_filter, vis]} Applied to both `list_memory` and `_find_memory_for_caller`. When either side is empty (admin has no visibility filter; list has no key filter), skip the wrapper and use the non-empty side directly to keep the generated SQL clean. Test fake's `_matches` now understands top-level `AND` too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(ui/memory): wrap write helpers with react-query useMutation Previously the Memory view read via `useQuery` but called the raw create/update/delete fetch helpers directly in handlers, tracking loading state with a local `submitting` flag and invalidating state via `refetch()`. 
That mixes two concerns: - it skips react-query's mutation state (isPending / isError / isSuccess) - `refetch()` only retouches the currently-mounted query instance, not other cached pages, so navigating back to an older page could show stale rows Switch the three write paths to `useMutation`: - `createMutation`, `updateMutation`, `deleteMutation` — each owns the mutation fn, success toast, and error toast. - Success handlers invalidate the whole `["memoryList", ...]` prefix via `queryClient.invalidateQueries`, so every cached page refetches (pagination + filter-aware). - Refresh button now invalidates instead of `refetch()`, keeping all behavior consistent. - handleSave/handleDelete become thin adapters that call `.mutateAsync`; their errors are swallowed locally since the mutation's onError has already surfaced the toast. Also tightened the edit modal's key-field tooltip to reflect the actual global-unique semantics (was "Unique per user/team scope"). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(memory): close cross-user write gap + sanitize 500 errors (Veria) Addresses two Veria findings: High — cross-user memory tampering via team membership. The visibility filter uses an OR (`user_id == caller OR team_id == caller`) so team members can SEE each other's team-scoped rows. That's intentional for list/get. But because PUT/DELETE used the same filter to find the target row, any team member could overwrite or delete a teammate's personal row whenever both `user_id` and `team_id` were stamped on it — broader visibility was being silently treated as broader authority. New `_assert_write_access(row, caller)` enforces ownership for mutations. Non-admin rules: - The row's `user_id` must match the caller (personal ownership), OR - The row has no `user_id` and its `team_id` matches the caller's team (a "pure team row" intended for shared writes). Admins bypass the check. 
The same gate runs in PUT (both regular and post-race-recovery branches) and DELETE. Medium — DB internals leaked through 500 detail. Every `except` block was raising `HTTPException(500, detail=str(e))`, which surfaces Prisma error strings (table/column names, host:port, error class names) to API callers. New `_internal_error()` helper logs the real exception server-side and returns a generic, caller-safe `detail`. Applied to create, list, upsert (general fallthrough), and delete. Also tightened the race-recovery 409 message to drop the "in a different scope" wording — the caller never needs to know whose scope it lives in. Tests (+5): - teammate cannot overwrite personal row → 403 - teammate cannot delete personal row → 403 - teammate CAN modify pure team row (no user_id stamped) → 200 - admin bypasses write-auth → 200 - 500 response never echoes Prisma internals (table/host/class names) 25/25 unit tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(memory): require team admin to modify pure team rows Tightens the write-authorization rule for "pure team rows" (rows with no user_id stamped, only team_id) to match the pattern used by team-management endpoints (`_is_user_team_admin` + `_is_user_org_admin_for_team`): - Plain team members can READ team rows via the OR visibility filter (intentional, unchanged). - Only PROXY_ADMIN, team admins of the row's team_id, or org admins for the team's organization may MODIFY them. Plain members get 403. `_assert_write_access` is now async and takes the prisma_client so it can fetch the team and run the existing `_is_user_team_admin` / `_is_user_org_admin_for_team` helpers from `litellm.proxy.management_endpoints.common_utils`. The org-admin path is best-effort: it calls `get_user_object`, which depends on the proxy_server module being initialized, so any exception there is treated as "not an org admin" rather than crashing the request. 
Tests: - team admin can modify pure team row → 200 - plain team member cannot modify pure team row → 403 - plain team member cannot delete pure team row → 403 Updates the test fake to add a tiny `litellm_teamtable.find_unique` implementation and a `_make_team(team_id, admin_user_ids=[...])` helper. 27/27 unit tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: mypy + UI page-metadata sync for memory page Two CI failures: 1. mypy: `_find_memory_for_caller` had `key_filter` inferred as `dict[str, str]` (literal type) and the conditional `{"AND": [key_filter, vis]}` returned `dict[str, list[...]]`, so the join site failed `dict-item` typing. Annotate both intermediates as `dict` so mypy widens the value type. 2. UI test (`page_utils.test.ts > should have descriptions for all pages`): every leftnav entry must have a description in `page_metadata.ts`, and `memory` was missing. Added a one-line description, matching the style of neighboring entries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [Feat] Day-0 support for GPT-5.5 and GPT-5.5 Pro (#26449) * feat(openai): day-0 support for GPT-5.5 and GPT-5.5 Pro Add pricing + capability entries for the new GPT-5.5 family launched by OpenAI on 2026-04-24: - gpt-5.5 / gpt-5.5-2026-04-23 (chat): $5/$30/$0.50 per 1M input/output/cached input - gpt-5.5-pro / gpt-5.5-pro-2026-04-23 (responses-only): $60/$360/$6 per 1M input/output/cached input Other fees (long-context >272k, flex, batches, priority, cache discounts) follow the same ratios as GPT-5.4, with context window retained at 1.05M input / 128K output. No transformation / classifier code changes are required: OpenAIGPT5Config.is_model_gpt_5_4_plus_model() already matches 5.5+ via numeric version parsing, and model registration is driven from the JSON. The existing responses-API bridge for tools + reasoning_effort (litellm/main.py:970) already covers gpt-5.5-pro. 
Tests: - GPT5_MODELS regression list now covers gpt-5.5-pro and dated variants - New test_generic_cost_per_token_gpt55_pro cost-calc test - Updated test_generic_cost_per_token_gpt55 for long-context fields * fix(openai): mirror reasoning_effort flags onto gpt-5.5 dated variants gpt-5.5-2026-04-23 and gpt-5.5-pro-2026-04-23 were missing the supports_none_reasoning_effort, supports_xhigh_reasoning_effort, and supports_minimal_reasoning_effort flags that their non-dated counterparts define. Reasoning-effort routing in OpenAIGPT5Config is fully capability-driven from these JSON flags — since an absent flag is treated as False for opt-in levels (xhigh), users pinning to a dated snapshot would silently lose xhigh support and diverge from the base alias on logprobs + flexible temperature handling. Copy the flags onto both dated variants so every dated snapshot inherits the base model's reasoning-effort capability profile. Adds a parametrized regression test that asserts supports_{none,minimal,xhigh}_reasoning_effort parity between each dated variant and its non-dated counterpart, preventing future drift when new snapshots are added. * fix(schema): close LiteLLM_MemoryTable model brace dropped during merge The rebase against `litellm_internal_staging` (which added `LiteLLM_AdaptiveRouterState` / `LiteLLM_AdaptiveRouterSession`) left the closing brace of `LiteLLM_MemoryTable` missing in all three schema copies — the next model declaration ended up parsed as a field of the memory table, surfacing as the CI prisma error: error: This line is not a valid field or attribute definition. --> schema.prisma:1250 \| 1249 \| // Per-(router, request_type, model) Beta posterior for the adaptive router. 1250 \| model LiteLLM_AdaptiveRouterState { Add the missing `}` (and the standard blank line) after the memory table's `@@index([team_id])` in `schema.prisma`, `litellm/proxy/schema.prisma`, and `litellm-proxy-extras/litellm_proxy_extras/schema.prisma`. 
`prisma generate --schema litellm/proxy/schema.prisma` now runs clean; 27/27 memory unit tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Mateo Wang <277851410+mateo-berri@users.noreply.github.com>	2026-04-24 18:38:07 -07:00
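The "compose visibility + key filters via explicit AND" change in the commit above can be illustrated with a minimal sketch. This is not the code from the commit: `compose_where` and the sample filters are hypothetical names, and the dicts are only shaped like Prisma `where` clauses.

```python
from typing import Any, Dict, Optional

Where = Dict[str, Any]


def compose_where(key_filter: Optional[Where], visibility: Optional[Where]) -> Where:
    """Combine two filters under an explicit AND instead of dict-merging them.

    Hypothetical helper: when one side is empty, the other side is returned
    unwrapped, mirroring the "skip the wrapper" behaviour the commit describes.
    """
    parts = [f for f in (key_filter, visibility) if f]
    if not parts:
        return {}
    if len(parts) == 1:
        return parts[0]
    return {"AND": parts}


# Hypothetical filters for illustration only.
key_filter: Where = {"key": {"startswith": "user:"}}
visibility: Where = {"OR": [{"user_id": "u1"}, {"team_id": "t1"}]}

# Both sides present: wrapped in an explicit AND.
both = compose_where(key_filter, visibility)

# Admin case (no visibility filter): the key filter is used directly.
admin_only = compose_where(key_filter, None)

# Why dict-merging is fragile: two filters sharing a top-level key collide,
# and the later one silently replaces the earlier one.
key_or: Where = {"OR": [{"key": "a"}, {"key": "b"}]}
merged = dict(key_or)
merged.update(visibility)  # the key filter's "OR" is overwritten here

print(both)
print(admin_only)
print(merged)
```

With the explicit wrapper, neither side can overwrite the other, whatever top-level keys each filter uses.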
mateo-berri	94f8f12a00	feat(openai): add supports_low_reasoning_effort flag; reject low on gpt-5.5-pro gpt-5.5-pro only accepts reasoning_effort in {medium, high, xhigh} (verified live against OpenAI's API on 2026-04-24). LiteLLM previously had no way to express this constraint — the existing JSON schema covered none/minimal/xhigh but not low. Result: drop_params=true users saw an avoidable 400 from OpenAI. Add supports_low_reasoning_effort following the existing opt-out pattern (default-allow, explicit false to block). Mirror the minimal branch in OpenAIGPT5Config.map_openai_params so 'low' goes through the same _is_reasoning_effort_level_explicitly_disabled gate. Set the flag to false on gpt-5.5-pro and gpt-5.5-pro-2026-04-23 in both model_prices JSON files (kept in sync). Other models leave the key absent so behavior is unchanged. Tests cover: rejection on pro variants (no drop_params), drop on pro with drop_params=True, passthrough on gpt-5.5 chat, passthrough on unknown models, and the helper-level _is_reasoning_effort_level_explicitly_disabled contract.	2026-04-24 15:05:43 -07:00
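The difference between the opt-out flag added above (`supports_low_reasoning_effort`: allowed by default, blocked only by an explicit `false`) and an opt-in flag (`xhigh`: an absent flag is treated as unsupported) can be sketched as follows. `MODEL_INFO`, `OPT_IN_LEVELS` and `is_level_allowed` are hypothetical names for illustration, not LiteLLM's API, and the entries only approximate the JSON described in the commit messages.

```python
from typing import Any, Dict

# Hypothetical capability entries, loosely shaped like the model-prices JSON.
MODEL_INFO: Dict[str, Dict[str, Any]] = {
    "gpt-5.5": {
        "supports_minimal_reasoning_effort": False,
        "supports_xhigh_reasoning_effort": True,
    },
    "gpt-5.5-pro": {
        "supports_low_reasoning_effort": False,
        "supports_minimal_reasoning_effort": False,
        "supports_xhigh_reasoning_effort": True,
    },
}

# Assumption: only "xhigh" is treated as opt-in in this sketch.
OPT_IN_LEVELS = {"xhigh"}


def is_level_allowed(model: str, level: str) -> bool:
    """Return True if `level` may be sent for `model` under the flag rules."""
    flag = MODEL_INFO.get(model, {}).get(f"supports_{level}_reasoning_effort")
    if level in OPT_IN_LEVELS:
        # Opt-in: the flag must be present and true.
        return flag is True
    # Opt-out: anything except an explicit False is allowed.
    return flag is not False


for model in ("gpt-5.5", "gpt-5.5-pro", "some-unknown-model"):
    for level in ("low", "xhigh"):
        print(model, level, is_level_allowed(model, level))
```

Under the opt-out rule, models that never mention the new key keep their previous behaviour, which matches the commit's note that other models leave the key absent.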
mateo-berri	34c93645e9	fix(openai): gpt-5.5 does not support reasoning_effort=minimal Verified against OpenAI's live Chat Completions API on 2026-04-24: POST /v1/chat/completions {"model": "gpt-5.5", "reasoning_effort": "minimal", ...} -> 400 Unsupported value: 'reasoning_effort' does not support 'minimal' with this model. Supported values are: 'none', 'low', 'medium', 'high', and 'xhigh'. POST /v1/chat/completions {"model": "gpt-5.5-pro", "reasoning_effort": "minimal", ...} -> 400 Unsupported value: 'minimal' is not supported with the 'gpt-5.5-pro' model. Supported values are: 'medium', 'high', and 'xhigh'. Set supports_minimal_reasoning_effort=false on all four entries (gpt-5.5, gpt-5.5-2026-04-23, gpt-5.5-pro, gpt-5.5-pro-2026-04-23) so OpenAIGPT5Config._is_reasoning_effort_level_explicitly_disabled fires and LiteLLM either drops the param (drop_params=True) or raises a local UnsupportedParamsError, instead of round-tripping to OpenAI for a 400. Adds a parametrized test_gpt55_reasoning_effort_flags_match_live_openai_api test that pins supports_{none,minimal,xhigh}_reasoning_effort on each entry to OpenAI's actual API contract. Note: gpt-5.5-pro additionally rejects 'none' and 'low'. 'none' is already handled (supports_none_reasoning_effort=false). 'low' is not representable in the current JSON schema (no supports_low flag); filing separately.	2026-04-24 14:30:41 -07:00
Mateo Wang	d21e90f683	[Feat] Day-0 support for GPT-5.5 and GPT-5.5 Pro (#26449) * feat(openai): day-0 support for GPT-5.5 and GPT-5.5 Pro Add pricing + capability entries for the new GPT-5.5 family launched by OpenAI on 2026-04-24: - gpt-5.5 / gpt-5.5-2026-04-23 (chat): $5/$30/$0.50 per 1M input/output/cached input - gpt-5.5-pro / gpt-5.5-pro-2026-04-23 (responses-only): $60/$360/$6 per 1M input/output/cached input Other fees (long-context >272k, flex, batches, priority, cache discounts) follow the same ratios as GPT-5.4, with context window retained at 1.05M input / 128K output. No transformation / classifier code changes are required: OpenAIGPT5Config.is_model_gpt_5_4_plus_model() already matches 5.5+ via numeric version parsing, and model registration is driven from the JSON. The existing responses-API bridge for tools + reasoning_effort (litellm/main.py:970) already covers gpt-5.5-pro. Tests: - GPT5_MODELS regression list now covers gpt-5.5-pro and dated variants - New test_generic_cost_per_token_gpt55_pro cost-calc test - Updated test_generic_cost_per_token_gpt55 for long-context fields * fix(openai): mirror reasoning_effort flags onto gpt-5.5 dated variants gpt-5.5-2026-04-23 and gpt-5.5-pro-2026-04-23 were missing the supports_none_reasoning_effort, supports_xhigh_reasoning_effort, and supports_minimal_reasoning_effort flags that their non-dated counterparts define. Reasoning-effort routing in OpenAIGPT5Config is fully capability-driven from these JSON flags; since an absent flag is treated as False for opt-in levels (xhigh), users pinning to a dated snapshot would silently lose xhigh support and diverge from the base alias on logprobs + flexible temperature handling. Copy the flags onto both dated variants so every dated snapshot inherits the base model's reasoning-effort capability profile.
Adds a parametrized regression test that asserts supports_{none,minimal,xhigh}_reasoning_effort parity between each dated variant and its non-dated counterpart, preventing future drift when new snapshots are added.	2026-04-24 14:10:42 -07:00
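The parity check described above can be sketched roughly as follows. This is not the regression test from the commit; `find_flag_drift` and the sample cost map are hypothetical, and the date-suffix pattern is an assumption based on the `-2026-04-23` naming seen in the message.

```python
import re

# The three flags named in the commit message.
FLAGS = tuple(
    f"supports_{level}_reasoning_effort" for level in ("none", "minimal", "xhigh")
)

# Assumed naming convention: a dated snapshot is "<base>-YYYY-MM-DD".
DATED = re.compile(r"^(?P<base>.+)-\d{4}-\d{2}-\d{2}$")


def find_flag_drift(cost_map: dict) -> list:
    """Return (model, flag) pairs where a dated variant differs from its base."""
    drift = []
    for name, entry in cost_map.items():
        match = DATED.match(name)
        if not match or match.group("base") not in cost_map:
            continue  # not a dated variant, or no base entry to compare with
        base = cost_map[match.group("base")]
        for flag in FLAGS:
            # .get() makes an absent flag compare unequal to an explicit value.
            if entry.get(flag) != base.get(flag):
                drift.append((name, flag))
    return drift


# Illustrative data only: the dated entry is missing the xhigh flag.
sample_map = {
    "gpt-5.5": {
        "supports_none_reasoning_effort": True,
        "supports_minimal_reasoning_effort": False,
        "supports_xhigh_reasoning_effort": True,
    },
    "gpt-5.5-2026-04-23": {
        "supports_none_reasoning_effort": True,
        "supports_minimal_reasoning_effort": False,
    },
}
```

Comparing with `.get()` on both sides is a deliberate choice in this sketch: it flags a missing key on a dated snapshot, which is the drift the commit describes.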
shin-berri	ca443a957c	Merge pull request #24374 from BerriAI/litellm_staging_03_22_2026 Litellm staging 03 22 2026	2026-04-24 12:38:47 -07:00
yuneng-jiang	d73b790cae	Merge pull request #26248 from BerriAI/litellm_anthropic_messages_call_type_fix fix(proxy): preserve anthropic_messages call type for /v1/messages logging	2026-04-24 09:42:36 -07:00
Yuneng Jiang	55ea431c05	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_gpt54_mini_nano_versioned_models	2026-04-24 09:28:54 -07:00
Sameer Kankute	e1466be825	feat(pricing): gemini-embedding-2 GA cost map, blog, and test (#26391) * feat(pricing): gemini-embedding-2 GA cost map, blog, and test - Add model_prices entries for gemini-embedding-2 (Gemini + Vertex paths) - Add docs blog gemini_embedding_2_ga with LiteLLM proxy curl examples - Add test_gemini_embedding_2_ga_in_cost_map in test_utils Made-with: Cursor * Fix greptile reviews	2026-04-24 09:28:18 -07:00
Cesar Garcia	8bd58fb82d	Merge branch 'litellm_internal_staging' into litellm_staging_03_22_2026	2026-04-24 13:12:19 -03:00
Mateo Wang	3950f5ea72	feat: add gpt-5.5 to model cost map (#26345) * feat: add gpt-5.5 to model cost map Add gpt-5.5 entry with pricing from OpenAI flagship page: input $5/1M, cached input $0.50/1M, output $30/1M, 272K context. * test: add gpt-5.5 coverage for model cost map and gpt-5 routing - Add gpt-5.5 to GPT5_MODELS parametrized list so both OpenAIGPT5Config and AzureOpenAIGPT5Config routing tests cover the new model. - Add test_generic_cost_per_token_gpt55 verifying the new entry's cost-map values ($5/$0.50/$30 per 1M) and that generic_cost_per_token returns the expected prompt/completion costs.	2026-04-23 14:05:22 -07:00
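As a worked example of the per-1M pricing quoted above ($5 input, $0.50 cached input, $30 output), the arithmetic can be sketched like this. The function below is a hypothetical illustration, not LiteLLM's `generic_cost_per_token`; it assumes cached tokens are billed at the cached rate instead of the regular input rate and ignores long-context, flex, batch, and priority tiers.

```python
# Prices in USD per 1M tokens, as quoted in the commit message for gpt-5.5.
INPUT_PER_1M = 5.00
CACHED_INPUT_PER_1M = 0.50
OUTPUT_PER_1M = 30.00


def estimate_cost_usd(
    prompt_tokens: int, completion_tokens: int, cached_tokens: int = 0
) -> float:
    """Estimate request cost; cached_tokens is the cached share of the prompt."""
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached = prompt_tokens - cached_tokens
    total_per_1m = (
        uncached * INPUT_PER_1M
        + cached_tokens * CACHED_INPUT_PER_1M
        + completion_tokens * OUTPUT_PER_1M
    )
    return total_per_1m / 1_000_000


# 1,000 prompt tokens (200 of them cached) and 500 completion tokens:
# (800 * 5 + 200 * 0.5 + 500 * 30) / 1e6 = 19,100 / 1e6 USD
example_cost = estimate_cost_usd(1000, 500, cached_tokens=200)
```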
Sameer Kankute	d5449f5b1a	Merge pull request #26300 from BerriAI/litellm_oss_staging_04_22_2026 Litellm oss staging 04 22 2026	2026-04-23 18:53:58 +05:30
Zark .	fcf917df6d	Feat(dashscope): add image generation support for qwen-image-2.0 and qwen-image-2.0-pro (#25672) * feat: add dashscope/qwen-image-2.0 and qwen-image-2.0-pro to model cost map * feat: implement DashScope image generation transformation class * feat: register DashScope in ProviderConfigManager for image generation * feat: add DashScope to image generation provider routing * feat: auto-route qwen-image /chat/completions requests to /images/generations * test: add unit tests for DashScope image generation (22 cases) * refactor: remove proxy-layer qwen-image auto-routing * feat: auto-redirect image_generation models in acompletion() * test: add acompletion auto-redirect test for image_generation models * fix: remove unused Union import in DashScope transformation * fix: scope acompletion redirect to dashscope and narrow exception handler * fix: move get_str_from_messages to module-level import and forward n param to aimage_generation * refactor: remove acompletion image_generation auto-redirect for dashscope * test: remove acompletion auto-redirect test for dashscope image models --------- Co-authored-by: zark.lin <zark.lin@thinkchina.com>	2026-04-22 20:03:46 -07:00
Vigilans	b42b86df7a	fix(adapter): normalize reasoning effort with graceful degradation (#26111) * fix(model-info): include reasoning effort support fields in get_model_info _get_model_info_helper constructs ModelInfoBase explicitly but never reads supports_xhigh/minimal/none_reasoning_effort from the cost map JSON. Add the three fields so get_model_info() returns them correctly. Also add supports_minimal_reasoning_effort to the ModelInfo TypedDict (xhigh and none were already declared, minimal was missing). * fix(model-registry): add missing reasoning effort fields for claude 4.6/4.7 Claude Opus 4.7 supports max reasoning effort (above xhigh). The field was present for Opus 4.6 but missing for all Opus 4.7 entries (base, dated, Bedrock, Vertex AI, Azure AI). All Claude 4.6/4.7 models (Opus 4.6, Sonnet 4.6, Opus 4.7) support minimal reasoning effort via adaptive thinking. Add the field to all provider variants. * fix(adapter): map output_config.effort to reasoning_effort (#25079) Anthropic's adaptive thinking (thinking.type="adaptive") and output_config.effort were silently dropped when translating to OpenAI format, resulting in no reasoning_effort on the outgoing request.
Adapter changes (format translation): - adapters/transformation.py: add "adaptive" branch to translate_anthropic_thinking_to_reasoning_effort(); pass through output_config.effort as-is in _translate_thinking_to_openai(); add "output_config" to translatable_anthropic_params - adapters/handler.py: extract output_config from extra_kwargs into request_data so it reaches the translation layer - responses_adapters/transformation.py: add "adaptive" branch and output_config param to translate_thinking_to_reasoning() Handler changes (model-aware normalization): - utils.py: add normalize_reasoning_effort_value() that uses get_model_info() to map "max" → "xhigh"/"high" and "minimal" → "minimal"/"low" based on model capabilities - adapters/handler.py: call normalization before responses routing - responses_adapters/handler.py: call normalization after translation Relates to BerriAI/litellm#25079 * test(reasoning-effort): add tests for effort capability fields and normalize logic Test coverage for: - get_model_info returning supports_minimal/max_reasoning_effort fields - JSON registry entries for claude 4.6/4.7 across all providers - normalize_reasoning_effort_value degradation chains and exception fallback - Adapter translation of adaptive thinking + output_config.effort * fix: forward custom_llm_provider to normalize_reasoning_effort_value in responses adapter	2026-04-22 19:19:54 -07:00
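The degradation chains named above ("max" → "xhigh"/"high" and "minimal" → "minimal"/"low") can be sketched as a small pure function. This is an illustration under stated assumptions, not the body of `normalize_reasoning_effort_value()`: the capability booleans are passed in explicitly here, whereas the commit says the real helper looks them up via `get_model_info()`, and how it treats absent flags or lookup failures is not shown.

```python
def degrade_effort(effort: str, supports_xhigh: bool, supports_minimal: bool) -> str:
    """Map an effort level onto the closest level the target model accepts.

    Hypothetical sketch: only the two chains named in the commit are handled.
    """
    if effort == "max":
        # "max" sits above "xhigh"; fall back one step at a time.
        return "xhigh" if supports_xhigh else "high"
    if effort == "minimal":
        # Keep "minimal" when supported, otherwise use the next level up.
        return "minimal" if supports_minimal else "low"
    # Every other level passes through unchanged.
    return effort
```

Taking the capabilities as arguments keeps the sketch free of any registry lookup, so the mapping itself is easy to test in isolation.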
Cesar Garcia	25c0aa8bfd	Merge pull request #26283 from BerriAI/litellm_internal_staging Sync litellm_staging_03_22_2026 with litellm_internal_staging	2026-04-22 19:55:27 -03:00
Sameer Kankute	6ebbfe5190	fix(anthropic): allow output_config effort max for Opus 4.7 and model map - Validate max effort like xhigh: Opus 4.6/4.7 id patterns or supports_max_reasoning_effort - Set supports_max_reasoning_effort on claude-opus-4-7 entries in model cost JSON - Update tests and add test_max_effort_accepted_for_opus_47 Made-with: Cursor	2026-04-22 22:06:07 +05:30
ishaan-berri	0e42d4cb08	April 21st Ishaan Branch (#26213) * fix(otel): preserve Splunk Observability Cloud trace OTLP endpoint (#26183) * fix(otel): preserve Splunk Observability Cloud trace OTLP URL Splunk ingest uses /v2/trace/otlp; _normalize_otel_endpoint must not append /v1/traces. - Return trace endpoints unchanged when they match Splunk OTLP path patterns - Add unit tests for observability.splunkcloud.com, signalfx.com, and /trace/otlp suffix - Set OTEL_EXPORTER_OTLP_PROTOCOL in protocol selection tests (from_env precedence over OTEL_EXPORTER) Made-with: Cursor * test(otel): use parameterized.expand for Splunk OTLP URL cases Made-with: Cursor * fix(otel): narrow Splunk trace URL guard to /v2/trace/otlp only Made-with: Cursor * test(otel): cover OTEL_EXPORTER fallback when OTLP protocol env unset Made-with: Cursor * Add Openrouter Opus 4.7 Entry (#26130) --------- Co-authored-by: milan-berri <milan@berri.ai> Co-authored-by: Matt Greathouse <matt5316@gmail.com>	2026-04-21 20:18:56 -07:00
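The endpoint guard described in the otel fix above can be sketched as follows. This is a simplified illustration, not the actual `_normalize_otel_endpoint`: the function name is hypothetical, the example hostname is made up, and the sketch assumes endpoints carry no query string.

```python
from urllib.parse import urlparse

SPLUNK_TRACE_SUFFIX = "/v2/trace/otlp"
OTLP_TRACE_SUFFIX = "/v1/traces"


def normalize_trace_endpoint(endpoint: str) -> str:
    """Append /v1/traces unless the URL already names a trace path."""
    path = urlparse(endpoint).path.rstrip("/")
    if path.endswith(SPLUNK_TRACE_SUFFIX):
        # Splunk ingest path: return unchanged, do not append /v1/traces.
        return endpoint
    if path.endswith(OTLP_TRACE_SUFFIX):
        # Already a standard OTLP trace endpoint.
        return endpoint
    return endpoint.rstrip("/") + OTLP_TRACE_SUFFIX


# Illustrative URLs only.
splunk_url = "https://ingest.example.signalfx.com/v2/trace/otlp"
collector_url = "http://localhost:4318"
```

Matching on the parsed path instead of the hostname mirrors the narrowed guard in the commit, which keys on `/v2/trace/otlp` only.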
ishaan-berri	e6897f5510	add moonshot/kimi-k2.6 to model registry (#26203) * add moonshot/kimi-k2.6 to model registry * add moonshot/kimi-k2.6 to backup model registry * add tests for moonshot/kimi-k2.6 model registry * fix moonshot/kimi-k2.6 pricing and add reasoning support * fix moonshot/kimi-k2.6 pricing and add reasoning support in backup * update kimi-k2.6 tests: fix pricing, add tool_choice and reasoning checks * fix: load kimi-k2.6 registry tests from local backup instead of remote cost map	2026-04-21 19:58:43 -07:00
ishaan-berri	a302613eb5	feat(bedrock): add support for bedrock-mantle endpoint (Claude Mythos Preview) (#26196) * add anthropic.claude-mythos-preview to model_prices_and_context_window.json * add mantle route to bedrock common_utils: route detection, chat config, messages config dispatch * add AmazonMantleConfig for bedrock/mantle /chat/completions endpoint * add AmazonMantleMessagesConfig for bedrock/mantle /messages endpoint * register AmazonMantleMessagesConfig in __init__.py and lazy imports registry * add unit tests for bedrock mantle route and config dispatch * add e2e tests for bedrock mantle: URL, body, SigV4 header, region routing	2026-04-21 15:41:58 -07:00
Michael-RZ-Berri	4f823cedac	Add supported providers to prompt caching doc (#26124) * Add supported providers to prompt caching doc * Move Z.ai / GLM to cache_control marker list * Mark xAI models as supporting prompt caching * Narrow xAI prompt caching flag to models with documented cache pricing * Add prompt caching flag to grok-4, grok-4-0709, grok-4-latest --------- Co-authored-by: Michael Riad Zaky <michaelr@Michaels-MacBook-Air.local>	2026-04-20 15:25:21 -07:00
Sameer Kankute	d5cfdcc6ee	feat(models): add versioned GPT-5.4 mini and nano aliases Add dated snapshot entries for GPT-5.4 mini and nano (including Azure-prefixed aliases) so users can pin to the 2026-03-17 model versions.	2026-04-20 21:02:28 +05:30
Sameer Kankute	57eae8d01c	Merge branch 'litellm_internal_staging' into litellm_staging_03_22_2026	2026-04-20 19:56:00 +05:30
Sameer Kankute	3ef362289b	Add support for grok-4.20-0309-reasoning model	2026-04-17 08:49:59 +05:30
ishaan-berri	44c992416c	Merge pull request #25867 from BerriAI/litellm_day_0_opus_4.7_support Litellm day 0 opus 4.7 support	2026-04-16 09:42:11 -07:00
Sameer Kankute	07d863b8e7	Remove max support for opus 4.7	2026-04-16 21:58:03 +05:30

1567 Commits