litellm

Author	SHA1	Message	Date
Mateo Wang	f27df8d516	docs: hand-written CLAUDE.md; point GEMINI.md and AGENTS.md at it (#29252) * docs: replace generated CLAUDE.md with hand-written guidance, remove AGENTS.md Swap the auto-generated CLAUDE.md for a concise hand-written version that captures how we actually want agents to work in this repo: minimal comments, simplicity first, meaningful tests with a high mutation kill rate, PRs based off litellm_internal_staging rather than main, and curl against a live proxy as proof of fix instead of pasted pytest output. Remove AGENTS.md so there is one source of truth for agent guidance. The customer and company name confidentiality policy, along with the MCP available_on_public_internet note, are carried over from the previous CLAUDE.md. * fix: further clarify communication guidelines * docs: point GEMINI.md at CLAUDE.md instead of duplicating guidance Replace the standalone GEMINI.md copy, which had already drifted from the new CLAUDE.md, with a one-line pointer so Gemini reads the same single source of truth. * docs: simplify PR template test checklist item Replace the rigid "at least 1 test is a hard requirement" checklist line with "I have added meaningful tests", which matches the testing guidance in CLAUDE.md, and tidy a comma into a semicolon in the scope-isolation item. * docs: point AGENTS.md at CLAUDE.md instead of deleting it Keep AGENTS.md so tools that read it still resolve guidance, but collapse it to the same one-line pointer to CLAUDE.md used by GEMINI.md, keeping a single source of truth. * fix: make AI-generated rules more concise * fix: spelling Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix: make the .env usage more careful * docs: restore MCP available_on_public_internet note to CLAUDE.md The PR description states this note was carried over verbatim from the previous CLAUDE.md, but it was dropped in the rewrite. Restore it so the file matches the description and the team guidance is not lost. * docs: restore browser storage and CI supply-chain safety notes to CLAUDE.md These security-relevant rules were dropped in the rewrite. Restore the sessionStorage-over-localStorage (XSS) guidance and the CI supply-chain rules (no curl|bash, pin versions, verify checksums) so agents editing UI or CI code are still steered away from those pitfalls. * docs: move area-specific guidance into nested CLAUDE.md files The MCP, browser-storage, and CI supply-chain notes are scoped to particular parts of the tree, so move each into a nested CLAUDE.md that Claude Code loads on demand when those files are touched: the MCP note under the mcp_server gateway, the browser-storage rule under the UI dashboard, and the CI supply-chain rules under .circleci. Keeps the root CLAUDE.md focused on general guidance while the area notes surface where they are relevant. * docs: keep CI supply-chain note in root CLAUDE.md CI guidance applies beyond .circleci (it also covers downloads in GitHub workflows and any CI script), and CI work does not reliably touch a single subtree, so a nested file under .circleci would not surface it dependably. Keep it in the always-loaded root instead. The MCP and browser-storage notes stay nested where they map cleanly to one area of the tree. * fix: make it clear we prefer httpOnly * chore: make ci rule more concise * chore: make concise Fix formatting and punctuation in MCP note. * fix: don't include Claude attribution --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-05-29 00:05:05 -07:00
yuneng-jiang	b6fd7f7746	docs(agents): require consent before writing new third-party names (#28908) Adds a rule to CLAUDE.md and AGENTS.md instructing AI coding agents to pause and ask the user before introducing any third-party organization name that does not already appear in the repository. Names already established in the codebase (existing LLM providers, etc.) remain fine to use without prompting.	2026-05-26 16:40:07 -07:00
Yassin Kortam	fa5eae8bc9	chore: remove legacy deployment artifacts and litellm-js packages (#27541) - Remove litellm-js/proxy and litellm-js/spend-logs TypeScript packages that provided Cloudflare Worker proxy and Node.js spend logging services, as these are no longer maintained - Remove deprecated Docker variants (Dockerfile.alpine, Dockerfile.dev, Dockerfile.custom_ui, Dockerfile.health_check, Dockerfile.ghcr_base) that have been superseded by the primary Dockerfile - Remove legacy Kubernetes manifests (kub.yaml, service.yaml) from deploy/kubernetes in favor of the Helm chart - Remove stale index.yaml Helm chart index pinned to an old version (v1.43.18) - Remove dev_config.yaml development configuration file that contained hardcoded credentials and example endpoints - Clean up ~3,500 lines of unused code and configuration to reduce repository maintenance burden Co-authored-by: Yassin Kortam <yassinkortam@g.ucla.edu>	2026-05-09 20:51:34 +00:00
Yuneng Jiang	c35f3a50ae	docs: remove docs/my-website, point contributors to litellm-docs The documentation source has moved to a separate repository, BerriAI/litellm-docs, served at docs.litellm.ai. This PR removes docs/my-website/ from this repo and updates README.md, AGENTS.md, and CLAUDE.md to direct doc contributions to the new repo. Also fixes a broken relative link in litellm/integrations/levo/README.md. The existing CI symlink in .github/workflows/test-code-quality.yml (which clones litellm-docs and symlinks docs/my-website to it for tests/documentation_tests/*) continues to work without change.	2026-04-24 14:17:46 -07:00
yuneng-jiang	a306092d47	Merge pull request #25463 from BerriAI/litellm_oss_staging_04_09_2026 Litellm oss staging 04 09 2026	2026-04-13 17:25:53 -07:00
Ryan Crabbe	842523a918	chore(ui): use antd in GuardrailSettingsView and document the rule Converts GuardrailSettingsView from @tremor/react (Badge, Text) to antd (Tag, plain spans) as part of the Tremor migration. Also captures the "no new Tremor imports" rule in CLAUDE.md and expands the existing note in AGENTS.md with the specific antd equivalents and the yellow→gold gotcha.	2026-04-13 11:49:39 -07:00
stuxf	a6c30b30bf	build: migrate packaging, CI, and Docker from Poetry to uv (#25007) * build: migrate packaging metadata to uv * ci: move automation and local tooling to uv * docker: migrate image builds and runtime setup to uv * docs: update install and deployment guidance for uv * chore: align auxiliary scripts and tests with uv * test: harden test_litellm isolation * fix: keep release and health check images self-contained * build: pin uv tooling and health check deps * test: isolate bedrock image request formatting from suite state * test: cover sandbox executor requirements flow * ci: fix circleci no-op command steps * ci: fix circleci publish workflow parsing * fix: stabilize remaining uv migration CI checks * ci: increase matrix test timeout headroom * fix: restore published docker and license coverage * fix: restore proxy runtime build parity * fix: restore proxy extras parity and venv migrations * ci: persist uv path across circleci steps * fix: keep psycopg binary in default test env * docker: preserve prisma cache across stages * test: run local proxy checks through uv python * build: restore runtime deps moved into ci * build: refresh uv lock after upstream merge * fix: restore module import in test_check_migration after merge The conflict resolution imported only the function but the test body references check_migration as a module throughout. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching - Move google-generativeai, Pillow, tenacity back to ci group (they are lazily imported and bloat the base SDK install needlessly) - Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant in Docker where system Node.js is already installed via apk) - Remove all nodejs-wheel node replacement and venv npm patching blocks from Dockerfiles since the wheel is no longer installed - Add --no-default-groups to CodSpeed benchmark workflow so the benchmark environment matches the old minimal pip install footprint - Apply standard uv two-phase Docker pattern: copy metadata first, install deps (cached layer), then copy source and install project - Replace CircleCI enterprise no-op with proper uv sync command Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate uv.lock after removing nodejs-wheel-binaries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): use cache/restore instead of cache to prevent cache poisoning The old workflow used actions/cache/restore (read-only). The uv migration changed it to actions/cache (read-write), which zizmor flags as a cache poisoning risk. Restore the safer read-only variant. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert The setup-uv action enables caching by default, which zizmor flags as a cache poisoning risk. Disable it since we already use a read-only cache/restore step. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): disable setup-uv cache in publish workflow Silences zizmor cache-poisoning alert. Publishing workflow runs infrequently on protected branches so caching adds no real benefit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(test): remove duplicate verbose_logger mock in test_check_migration The logger was patched twice — first via mocker.patch() then via mocker.patch.object(autospec=True). The second call fails because autospec cannot inspect an already-mocked attribute. Remove the redundant first patch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): free disk space before Docker build in test-server-root-path The Dockerfile.non_root build ran out of disk on the CI runner. Remove Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 11:46:23 -07:00
yuneng-jiang	d2d99aa082	[Docs] Enforce Black Formatting in Contributor Docs (#25135) * [Docs] Enforce Black formatting in contributor docs Black formatting is now enforced in CI. Update CLAUDE.md, AGENTS.md, and CONTRIBUTING.md to instruct contributors and AI agents to run `poetry run black .` before committing, and add VS Code setup guidance. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: fixes --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 16:51:25 -07:00
Ishaan Jaff	cfd0e2cf99	[Feat] UI Polish - MCP Servers page - show transport type (#23051) * Update AGENTS.md with additional Cursor Cloud setup notes - Add note about openapi-core dependency needed for OpenAPI compliance tests - Add note about poetry lock fallback when lock file is out of sync Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * Sync lock files with current dependency specs - poetry.lock: regenerated to match pyproject.toml (litellm-proxy-extras 0.4.50 -> 0.4.51) - package-lock.json: updated from npm install Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * Polish MCP Servers UI for enterprise-grade look and feel 10 improvements to the MCP Servers table and related components: 1. Remove debug console.logs from mcp_servers.tsx 2. Fix health status icons: distinct ✓/✗/? per state instead of identical dots 3. Health status badges: proper pill styling with rounded-full and borders 4. Health loading state: subtle pulsing dot instead of raw SVG spinner 5. Transport column: color-coded badges (HTTP=blue, SSE=purple, STDIO=amber, OPENAPI=teal) 6. Auth type column: color-coded badges (oauth2=indigo, bearer_token=sky, api_key=emerald) 7. Server ID chip: rounded corners, border, and transition effect 8. Filter bar: lighter border, cleaner labels, vertical divider between filters 9. Network Access: pill badges with colored dots (Public/Internal) 10. Date columns: shorter headers, dash for missing values, tooltip with full datetime Also: - Improved delete modal: cleaner layout, neutral background instead of red - Access Groups column: shows first group with +N count instead of truncated text - Empty state message includes CTA guidance - Updated test to match renamed filter label Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * Polish MCP server detail views and table refinements (round 2) 10 more enterprise polish improvements: 1. Overview cards: use color-coded badges for Transport and Auth Type values 2. Overview cards: fix 'Host Url' typo -> 'Host URL', uppercase card labels 3. Settings tab: show em-dash placeholder for empty/missing values 4. Settings tab: use consistent Transport/Auth/Network badge styling matching table 5. Settings tab: definition-list layout with label/value grid columns 6. Server detail header: show server name prominently with alias as badge 7. Server detail header: show description below name, smaller server ID 8. Actions column: improved hover states with background color transitions 9. Credential column: pill badge for Connected state, shadow on Connect button 10. Table header: server count badge next to title, CTA button moved right Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * Revert colorful transport/auth badges to neutral gray Color should only carry semantic meaning. Transport type (HTTP/SSE) and auth type (oauth2/bearer_token) are informational labels, not status indicators, so they use a uniform gray badge. Color remains on: - Health status: green (healthy), red (unhealthy) - Network access: green (public), orange (internal) - Credential: green (connected) Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>	2026-03-07 13:05:46 -08:00
Ishaan Jaff	503eb2fd4c	fix: don't close HTTP/SDK clients on LLMClientCache eviction (#22925) * fix: don't close HTTP/SDK clients on LLMClientCache eviction Removing the _remove_key override that eagerly called aclose()/close() on evicted clients. Evicted clients may still be held by in-flight streaming requests; closing them causes: RuntimeError: Cannot send a request, as the client has been closed. This is a regression from commit `fb72979432`. Clients that are no longer referenced will be garbage-collected naturally. Explicit shutdown cleanup happens via close_litellm_async_clients(). Fixes production crashes after the 1-hour cache TTL expires. * test: update LLMClientCache unit tests for no-close-on-eviction behavior Flip the assertions: evicted clients must NOT be closed. Replace test_remove_key_closes_async_client → test_remove_key_does_not_close_async_client and equivalents for sync/eviction paths. Add test_remove_key_removes_plain_values for non-client cache entries. Remove test_background_tasks_cleaned_up_after_completion (no more _background_tasks). Remove test_remove_key_no_event_loop variant that depended on old behavior. * test: add e2e tests for OpenAI SDK client surviving cache eviction Add two new e2e tests using real AsyncOpenAI clients: - test_evicted_openai_sdk_client_stays_usable: verifies size-based eviction doesn't close the client - test_ttl_expired_openai_sdk_client_stays_usable: verifies TTL expiry eviction doesn't close the client Both tests sleep after eviction so any create_task()-based close would have time to run, making the regression detectable. Also expand the module docstring to explain why the sleep is required. * docs(AGENTS.md): add rule — never close HTTP/SDK clients on cache eviction * docs(CLAUDE.md): add HTTP client cache safety guideline	2026-03-05 12:00:38 -08:00
Ishaan Jaff	1f412bc6d8	[Feat] Add Tool Policies for AI Gateway (#22732) * fix: fix ui render * fix: fix minor bugs * refactor: use prisma functions instead of raw sql (safer) * fix(add-new-tiles-to-tool-policies): allow developer to see what's available * feat: ensure tool allowlist runs correctly for tool names + mcp's * refactor: more ui improvements * feat: working key tool blocking * feat(tools): show tool logs * refactor: backend code improvements * refactor: improve log viewer for tools * fix: address PR review feedback for tool access control - Add missing blocked_tools column to root schema.prisma (schema drift) - Invalidate ToolPolicyRegistry after policy mutations so changes take effect immediately - Remove dead code: unused get_effective_policies, get_tool_policies_cached, and helpers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: race condition in permission resolution and remove duplicate allowlist check - Use atomic update_many with object_permission_id=None to prevent concurrent requests from creating orphaned permission rows and losing tool blocks - Remove duplicate allowed_tools enforcement from guardrail (already enforced in auth layer via check_tools_allowlist) - Move inline uuid import to module level Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * update to account for userAgent * UI - Add ToolDetails * input/output policy * LiteLLM_PolicyAttachmentTable * LiteLLM_PolicyAttachmentTable * fix: add _enqueue_tool_registry_upsert * fix: tool mgmt endpoints * tool mgmt endpoints * Update tests/test_litellm/proxy/db/test_tool_registry_writer.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update tests/test_litellm/proxy/db/test_tool_registry_writer.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update tests/test_litellm/proxy/db/test_tool_registry_writer.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix: sync root schema.prisma and fix test_tool_registry_writer for input/output policy - Migrate root schema.prisma LiteLLM_ToolTable from call_policy to input_policy/output_policy, add missing user_agent and last_used_at columns (now consistent with litellm/proxy/schema.prisma and litellm-proxy-extras) - Fix SpendLogToolIndex comment across all three schema files - Fix all call_policy references in test_tool_registry_writer.py: swapped update_tool_policy arguments, wrong get_tools_by_names return type assertions, _mock_tool_row setting call_policy instead of input_policy Addresses Greptile review feedback on PR #22732. Made-with: Cursor --------- Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-03-03 20:22:20 -08:00
Ishaan Jaff	500a88f01b	[UI QA] - Add all provider models + providers on ui (#22461) * feat(ui): add missing provider logos and map all backend providers to UI - Downloaded 26 SVG logos from lobehub/lobe-icons for providers that were missing visual branding (AI21, Baseten, Cloudflare, GitHub, Huggingface, Hyperbolic, Lambda, LM Studio, Meta Llama, Moonshot, Nebius, Novita, Nvidia NIM, Replicate, Recraft, Topaz, V0, Vercel, Watsonx/IBM, Xinference, Friendli, Morph, Cometapi, Featherless, Langfuse, GitHub Copilot) - Extended Providers enum from 47 to 107 entries to cover all backend providers from provider_create_fields.json - Extended provider_map to map all new enum keys to litellm_provider values - Extended providerLogoMap to assign logos to all providers where available, reusing parent logos for variants (e.g. Anthropic Text -> anthropic.svg) - Fixed SVG currentColor issue: replaced fill='currentColor' with explicit colors since CSS inheritance doesn't work in <img> elements - Updated test reference from Providers.Watsonx to Providers.WATSONX Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * docs(agents): add UI dashboard dev notes to Cursor Cloud instructions Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * refactor(ui): remove non-LLM providers from Add Model dropdown Remove Custom, Custom OpenAI, GitHub, Humanloop, Langfuse, Litellm Proxy, and Milvus from the Providers enum, provider_map, and providerLogoMap. These are not LLM API providers (they are internal tools, vector stores, or observability platforms) and should not appear in the Add Model form. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>	2026-02-28 17:35:08 -08:00
Krrish Dholakia	b232e2f564	docs(agents.md): update agents.md	2026-02-28 15:18:13 -08:00
Ishaan Jaff	adba088df2	Realtime API: spend log storage, playground UI, tools logging, and guardrail support (#22105) Backend - Spend Log Storage for Realtime Calls: - Collect user voice transcripts and text input during WebSocket sessions - Store collected messages in spend logs when store_prompts_in_spend_logs enabled - Capture tool definitions from session.update and tool calls from response.done - Enrich proxy_server_request with tools and response with tool_calls for UI Backend - WebSocket Auth: - Support browser-based auth via Sec-WebSocket-Protocol subprotocol - Echo back subprotocol on WebSocket accept UI - Realtime Playground: - New RealtimePlayground component with WebSocket voice+text chat - Mic recording (PCM16 24kHz), server VAD, audio playback, text input - Handle binary WebSocket frames (Blob/ArrayBuffer decoding) - Add /v1/realtime endpoint option to playground endpoint selector UI - Tools Section for Realtime Logs: - Extract tool calls from realtime response format (response.tool_calls and response.results[].response.output[].type=function_call) Tests: - 15 new backend tests for realtime streaming and spend log storage - 4 new UI tests for realtime tool call extraction Fixes pre-existing build errors: - ToolPolicies.tsx: duplicate import, antd styles type - create_key_button.tsx: missing message import Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>	2026-02-25 14:55:27 -08:00
Krish Dholakia	5662228e20	feat(ui): add user filtering to usage page (#22059) * feat(ui): add user filtering to usage page Adds "User Usage" as a new view option in the usage page dropdown, allowing admins to view and filter usage data by individual users via the existing /user/daily/activity backend endpoint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(ui/): working usage filtering * fix(ui): use single-select for user filter and add tests The user entity type's backend endpoint only accepts a single user_id, so the filter now uses single-select mode instead of multi-select. Added tests for the new user entity type in EntityUsage and UsageViewSelect. Updated CLAUDE.md and AGENTS.md with guidance on UI/backend contract consistency and test coverage for new entity types. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * revert: remove unintended package-lock.json changes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * revert: restore package-lock.json to merge base state Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 10:37:45 -08:00
yuneng-jiang	9791293fb0	Add light/dark mode slider for dev	2026-01-26 11:22:14 -08:00
yuneng-jiang	901d145b1a	Adding UI portion for Agents MD	2025-12-20 17:37:13 -08:00
Cesar Garcia	5b0729034c	docs: cleanup README and improve agent guides (#17003) * docs: cleanup README and improve AI agent guides - Remove obsolete version warnings (openai>=1.0.0, pydantic>=2.0.0) - Add note about Responses API in README - Add GitHub templates section to CLAUDE.md, GEMINI.md, and AGENTS.md - Remove temporary test file test_pydantic_fields.py * update files * update Gemini file	2025-11-23 21:53:53 -08:00
Cesar Garcia	0908208076	chore: cleanup repo and improve AI docs (#16775) * chore: remove development setup files from repository Removes VERTEX_ENV_SETUP.md and setup_vertex_env.sh from the repository as they are not referenced in documentation or tests. These files were added in PR #15824 alongside the VertexAI Search feature. The setup information is already well-documented in the official docs at: docs/my-website/docs/pass_through/vertex_ai_search_datastores.md * docs: add poetry run python usage to CLAUDE.md and AGENTS.md	2025-11-18 11:36:27 -08:00
Cole McIntosh	a3da7f1876	Add AGENTS.md (#11461)	2025-06-05 16:29:28 -07:00

20 Commits