Converts GuardrailSettingsView from @tremor/react (Badge, Text) to antd (Tag, plain spans) as part of the Tremor migration. Also captures the "no new Tremor imports" rule in CLAUDE.md and expands the existing note in AGENTS.md with the specific antd equivalents and the yellow→gold gotcha.
INSTRUCTIONS FOR LITELLM
This document provides comprehensive instructions for AI agents working in the LiteLLM repository.
OVERVIEW
LiteLLM is a unified interface for 100+ LLMs that:
- Translates inputs to provider-specific completion, embedding, and image generation endpoints
- Provides consistent OpenAI-format output across all providers
- Includes retry/fallback logic across multiple deployments (Router)
- Offers a proxy server (LLM Gateway) with budgets, rate limits, and authentication
- Supports advanced features like function calling, streaming, caching, and observability
REPOSITORY STRUCTURE
Core Components
- litellm/ - Main library code
  - llms/ - Provider-specific implementations (OpenAI, Anthropic, Azure, etc.)
  - proxy/ - Proxy server implementation (LLM Gateway)
  - router_utils/ - Load balancing and fallback logic
  - types/ - Type definitions and schemas
  - integrations/ - Third-party integrations (observability, caching, etc.)
Key Directories
- tests/ - Comprehensive test suites
- docs/my-website/ - Documentation website
- ui/litellm-dashboard/ - Admin dashboard UI
- enterprise/ - Enterprise-specific features
DEVELOPMENT GUIDELINES
MAKING CODE CHANGES
- Provider Implementations: When adding/modifying LLM providers:
  - Follow existing patterns in litellm/llms/{provider}/
  - Implement proper transformation classes that inherit from BaseConfig
  - Support both sync and async operations
  - Handle streaming responses appropriately
  - Include proper error handling with provider-specific exceptions
- Type Safety:
  - Use proper type hints throughout
  - Update type definitions in litellm/types/
  - Ensure compatibility with both Pydantic v1 and v2
- Testing:
  - Add tests in appropriate tests/ subdirectories
  - Include both unit tests and integration tests
  - Test provider-specific functionality thoroughly
  - Consider adding load tests for performance-critical changes
MAKING CODE CHANGES FOR THE UI (IGNORE FOR BACKEND)
- Always use antd for new UI components; Tremor is DEPRECATED
  - We are migrating off of @tremor/react. Do not introduce new Badge, Text, Card, Grid, Title, or other imports from @tremor/react in any new or modified file.
  - Use antd equivalents: Tag for labels, plain <span>/<div> with Tailwind classes (or Typography.Text) for text, Card from antd, etc. Note that antd has no "yellow" Tag color; use "gold" for amber/yellow.
  - The only exception is the Tremor Table component and its required Tremor Table sub components.
- Use Common Components as much as possible:
  - These are usually defined in the common_components directory
  - Use these components as much as possible and avoid building new components unless needed
- Testing:
  - The codebase uses Vitest and React Testing Library
  - Query Priority Order: Use query methods in this order: getByRole, getByLabelText, getByPlaceholderText, getByText, getByTestId
  - Always use screen instead of destructuring from render() (e.g., use screen.getByText() not getByText)
  - Wrap user interactions in act(): Always wrap fireEvent calls with act() to ensure React state updates are properly handled
  - Use query methods for absence checks: Use queryBy* methods (not getBy*) when expecting an element to NOT be present
  - Test names must start with "should": All test names should follow the pattern it("should ...")
  - Mock external dependencies: Check setupTests.ts for global mocks and mock child components/networking calls as needed
  - Structure tests properly:
    - First test should verify the component renders successfully
    - Subsequent tests should focus on functionality and user interactions
    - Use waitFor for async operations that aren't already awaited
  - Avoid using querySelector: Prefer React Testing Library queries over direct DOM manipulation
IMPORTANT PATTERNS
- Function/Tool Calling:
  - LiteLLM standardizes tool calling across providers
  - OpenAI format is the standard, with transformations for other providers
  - See litellm/llms/anthropic/chat/transformation.py for complex tool handling
- Streaming:
  - All providers should support streaming where possible
  - Use consistent chunk formatting across providers
  - Handle both sync and async streaming
- Error Handling:
  - Use provider-specific exception classes
  - Maintain consistent error formats across providers
  - Include proper retry logic and fallback mechanisms
- Configuration:
  - Support both environment variables and programmatic configuration
  - Use BaseConfig classes for provider configurations
  - Allow dynamic parameter passing
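The "environment variables and programmatic configuration" point can be illustrated with a small resolver. This is a minimal sketch, not LiteLLM's actual lookup code: the function name resolve_setting, the variable name EXAMPLE_PROVIDER_API_KEY, and the precedence order (explicit argument, then environment, then default) are all assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    name: str,
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    default: Optional[str] = None,
) -> Optional[str]:
    """Hypothetical helper: pick a setting from code first, then the environment."""
    # 1. A value passed programmatically wins (assumed precedence).
    if explicit is not None:
        return explicit
    # 2. Otherwise fall back to the environment; `env` is injectable for tests.
    source = os.environ if env is None else env
    value = source.get(name)
    if value:
        return value
    # 3. Finally use the default, which may be None.
    return default


if __name__ == "__main__":
    # EXAMPLE_PROVIDER_API_KEY is a made-up variable name.
    print(resolve_setting("EXAMPLE_PROVIDER_API_KEY", default="<unset>"))
```

Passing the environment as a parameter keeps the helper easy to test without mutating os.environ.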
PROXY SERVER (LLM GATEWAY)
The proxy server is a critical component that provides:
- Authentication and authorization
- Rate limiting and budget management
- Load balancing across multiple models/deployments
- Observability and logging
- Admin dashboard UI
- Enterprise features
Key files:
- litellm/proxy/proxy_server.py - Main server implementation
- litellm/proxy/auth/ - Authentication logic
- litellm/proxy/management_endpoints/ - Admin API endpoints
Database (proxy): Use Prisma model methods (prisma_client.db.<model>.upsert, .find_many, .find_unique, etc.), not raw SQL (execute_raw/query_raw). See COMMON PITFALLS for details.
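One reason to prefer model methods is that they are easy to fake in tests. The sketch below shows a write going through litellm_tooltable.upsert and a test double standing in for the client. It is a sketch under assumptions: the column names tool_name and description, the helper save_tool, and the FakeToolTable class are hypothetical, and the fake returns plain dicts where a real Prisma client would return model objects.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, Dict, Optional


class FakeToolTable:
    """In-memory stand-in for a Prisma model (hypothetical, for tests only)."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}

    async def upsert(self, where: Dict[str, Any], data: Dict[str, Any]) -> Dict[str, Any]:
        key = where["tool_name"]
        if key in self.rows:
            self.rows[key].update(data["update"])
        else:
            self.rows[key] = dict(data["create"])
        return self.rows[key]

    async def find_unique(self, where: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        return self.rows.get(where["tool_name"])


async def save_tool(prisma_client: Any, tool_name: str, description: str) -> Any:
    # Model-method call instead of execute_raw/query_raw.
    # Column names here are assumptions, not the real schema.
    return await prisma_client.db.litellm_tooltable.upsert(
        where={"tool_name": tool_name},
        data={
            "create": {"tool_name": tool_name, "description": description},
            "update": {"description": description},
        },
    )


def make_fake_client() -> Any:
    # Mirrors the prisma_client.db.<model> attribute path used above.
    return SimpleNamespace(db=SimpleNamespace(litellm_tooltable=FakeToolTable()))


if __name__ == "__main__":
    client = make_fake_client()
    print(asyncio.run(save_tool(client, "search", "first version")))
```

Because save_tool only touches the model-method surface, the same function works against the fake here and against a real client in the proxy.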
MCP (MODEL CONTEXT PROTOCOL) SUPPORT
LiteLLM supports MCP for agent workflows:
- MCP server integration for tool calling
- Transformation between OpenAI and MCP tool formats
- Support for external MCP servers (Zapier, Jira, Linear, etc.)
- See litellm/experimental_mcp_client/ and litellm/proxy/_experimental/mcp_server/
RUNNING SCRIPTS
Use poetry run python script.py to run Python scripts in the project environment (for non-test files).
GITHUB TEMPLATES
When opening issues or pull requests, follow these templates:
Bug Reports (.github/ISSUE_TEMPLATE/bug_report.yml)
- Describe what happened vs. expected behavior
- Include relevant log output
- Specify LiteLLM version
- Indicate if you're part of an ML Ops team (helps with prioritization)
Feature Requests (.github/ISSUE_TEMPLATE/feature_request.yml)
- Clearly describe the feature
- Explain motivation and use case with concrete examples
Pull Requests (.github/pull_request_template.md)
- Add at least 1 test in tests/litellm/
- Ensure make test-unit passes
TESTING CONSIDERATIONS
- Provider Tests: Test against real provider APIs when possible
- Proxy Tests: Include authentication, rate limiting, and routing tests
- Performance Tests: Load testing for high-throughput scenarios
- Integration Tests: End-to-end workflows including tool calling
DOCUMENTATION
- Keep documentation in sync with code changes
- Update provider documentation when adding new providers
- Include code examples for new features
- Update changelog and release notes
SECURITY CONSIDERATIONS
- Handle API keys securely
- Validate all inputs, especially for proxy endpoints
- Consider rate limiting and abuse prevention
- Follow security best practices for authentication
ENTERPRISE FEATURES
- Some features are enterprise-only
- Check enterprise/ directory for enterprise-specific code
- Maintain compatibility between open-source and enterprise versions
COMMON PITFALLS TO AVOID
- Breaking Changes: LiteLLM has many users - avoid breaking existing APIs
- Provider Specifics: Each provider has unique quirks - handle them properly
- Rate Limits: Respect provider rate limits in tests
- Memory Usage: Be mindful of memory usage in streaming scenarios
- Dependencies: Keep dependencies minimal and well-justified
- UI/Backend Contract Mismatch: When adding a new entity type to the UI, always check whether the backend endpoint accepts a single value or an array. Match the UI control accordingly (single-select vs. multi-select) to avoid silently dropping user selections
- Missing Tests for New Entity Types: When adding a new entity type (e.g., in EntityUsage, UsageViewSelect), always add corresponding tests in the existing test files and update any icon/component mocks
- Raw SQL in proxy DB code: Do not use execute_raw or query_raw for proxy database access. Use Prisma model methods (e.g. prisma_client.db.litellm_tooltable.upsert(), .find_many(), .find_unique()) so behavior stays consistent with the schema, the client stays mockable in tests, and you avoid the pitfalls of hand-written SQL (parameter ordering, type casting, schema drift)
- Do not hardcode model-specific flags: Put model-specific capability flags in model_prices_and_context_window.json and read them via get_model_info (or existing helpers like supports_reasoning). This prevents users from needing to upgrade LiteLLM each time a new model supports a feature.

  Example of BAD (hardcoded model checks):

      @staticmethod
      def _is_effort_supported_model(model: str) -> bool:
          """Check if the model supports the output_config.effort parameter..."""
          model_lower = model.lower()
          if AnthropicConfig._is_claude_4_6_model(model):
              return True
          return any(
              v in model_lower for v in ("opus-4-5", "opus_4_5", "opus-4.5", "opus_4.5")
          )

  Example of GOOD (config-driven or helper that reads from config):

      if (
          "claude-3-7-sonnet" in model
          or AnthropicConfig._is_claude_4_6_model(model)
          or supports_reasoning(
              model=model,
              custom_llm_provider=self.custom_llm_provider,
          )
      ):
          ...

  Using helpers like supports_reasoning (which read from model_prices_and_context_window.json / get_model_info) allows future model updates to "just work" without code changes.
- Never close HTTP/SDK clients on cache eviction: Do not add close(), aclose(), or create_task(close_fn()) inside LLMClientCache._remove_key() or any cache eviction path. Evicted clients may still be held by in-flight requests; closing them causes RuntimeError: Cannot send a request, as the client has been closed. in production after the cache TTL (1 hour) expires. Connection cleanup is handled at shutdown by close_litellm_async_clients(). See PR #22247 for the full incident history.
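The cache-eviction pitfall can be reproduced in miniature. The following is a simplified sketch, not the real LLMClientCache: the ClientCache and FakeClient classes, the close_all method, and the injectable clock are hypothetical stand-ins that only illustrate the rule "drop the reference on eviction, close at shutdown".

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FakeClient:
    """Stand-in for an HTTP/SDK client (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.closed = False

    def send(self, payload: str) -> str:
        if self.closed:
            raise RuntimeError("Cannot send a request, as the client has been closed.")
        return f"{self.name}:{payload}"

    def close(self) -> None:
        self.closed = True


class ClientCache:
    """Toy TTL cache; the real cache differs in detail."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, FakeClient]] = {}

    def set(self, key: str, client: FakeClient) -> None:
        self._entries[key] = (self._clock() + self._ttl, client)

    def get(self, key: str) -> Optional[FakeClient]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, client = entry
        if self._clock() >= expires_at:
            self._remove_key(key)
            return None
        return client

    def _remove_key(self, key: str) -> None:
        # Eviction only drops the cache's reference. It must NOT call close():
        # an in-flight request may still be using the evicted client.
        self._entries.pop(key, None)

    def close_all(self) -> None:
        # Shutdown-time cleanup for clients that are still cached.
        for _, client in self._entries.values():
            client.close()
        self._entries.clear()
```

In this sketch a request that obtained its client before the TTL expired can still call send() afterwards, which is the behavior the rule protects. Evicted clients are simply left to garbage collection once nothing references them.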
HELPFUL RESOURCES
- Main documentation: https://docs.litellm.ai/
- Provider-specific docs in docs/my-website/docs/providers/
- Admin UI for testing proxy features
WHEN IN DOUBT
- Follow existing patterns in the codebase
- Check similar provider implementations
- Ensure comprehensive test coverage
- Update documentation appropriately
- Consider backward compatibility impact
Cursor Cloud specific instructions
Environment
- Poetry is installed in ~/.local/bin; the update script ensures it is on PATH.
- Python 3.12, Node 22 are pre-installed.
- The virtual environment lives under ~/.cache/pypoetry/virtualenvs/.
Running the proxy server
Start the proxy with a config file:
poetry run litellm --config dev_config.yaml --port 4000
The proxy takes ~15-20 seconds to fully start (it runs Prisma migrations on boot). Wait for /health to return before sending requests. Without a PostgreSQL DATABASE_URL, the proxy connects to a default Neon dev database embedded in the litellm-proxy-extras package.
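Waiting for /health can be scripted with a small polling loop. This is a minimal sketch using only the standard library: the helper names http_probe and wait_until_ready, the 60-second timeout, and the choice to treat any HTTP response (even an error status) as "the server has returned" are assumptions, so adjust them if your setup needs an auth header or a strict 2xx check.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True once the server answers at all (assumed readiness signal)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a 2xx status.
        return True
    except OSError:
        # Connection refused, timeout, DNS failure: not up yet.
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` until it succeeds or `timeout_s` elapses."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Port 4000 matches the start command above.
    ok = wait_until_ready(lambda: http_probe("http://localhost:4000/health"))
    print("proxy ready" if ok else "proxy did not come up in time")
```

The probe, sleep, and clock are injectable so the loop can be exercised in tests without a running proxy.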
Running tests
See CLAUDE.md and the Makefile for standard commands. Key notes:
- psycopg-binary must be installed (poetry run pip install psycopg-binary) because the pytest-postgresql plugin requires it and the lock file only includes psycopg (no binary).
- openapi-core must be installed (poetry run pip install openapi-core) for the OpenAPI compliance tests in tests/test_litellm/interactions/.
- The --timeout pytest flag is NOT available; don't pass it.
- Unit tests: poetry run pytest tests/test_litellm/ -x -vv -n 4
- Before committing, always run poetry run black . to format your code. Black formatting is enforced in CI.
- If poetry install fails with "pyproject.toml changed significantly since poetry.lock was last generated", run poetry lock first to regenerate the lock file.
Lint
cd litellm && poetry run ruff check .
Ruff is the primary fast linter. For the full lint suite (including mypy, black, circular imports), run make lint per CLAUDE.md.
UI Dashboard development
- The UI is at ui/litellm-dashboard/. Run npm run dev from that directory for the Next.js dev server on port 3000.
- The proxy at port 4000 serves a pre-built static UI from litellm/proxy/_experimental/out/. After making UI code changes, you must run npm run build in the dashboard directory and copy the output: cp -r ui/litellm-dashboard/out/* litellm/proxy/_experimental/out/ for the proxy to serve the updated UI.
- SVGs used as provider logos (loaded via <img> tags) must NOT use fill="currentColor"; replace it with an explicit color like #000000 or use the -color variant from lobehub icons, since CSS color inheritance does not work inside <img> elements.
- Provider logos live in ui/litellm-dashboard/public/assets/logos/ (source) and litellm/proxy/_experimental/out/assets/logos/ (pre-built). Both locations must have the file for it to work in dev and proxy-served modes.
- UI Vitest tests: cd ui/litellm-dashboard && npx vitest run