* fix(spend-tracking): fall back to direct spend-counter increment when reservation reconcile fails

  When the reservation-reconcile path in `_reconcile_budget_reservation_for_counter_update` hits a Redis error, it now correctly returns an empty set so that `increment_spend_counters` re-runs the direct increment for the affected counters. Previously, the function logged the failure, invalidated the reserved counters, and still returned the reserved counter keys, which caused the caller to skip the direct increment. With the increment skipped and the counter deleted, the next request reseeded the counter from `LiteLLM_VerificationToken.spend`, a column the batched flusher only updates every few seconds, so the enforced cross-pod spend value collapsed to a stale snapshot and budget gating stopped firing for affected keys.

  Adds a regression test that exercises the failure path with a flaky redis backend and asserts the actual response cost lands in the shared counter.

* fix(register_model): preserve built-in cache pricing when registering custom overrides under unmapped keys

  When a custom-priced model is registered under a key shape that get_model_info cannot resolve (e.g. litellm_params.model set to bedrock/bedrock/us.anthropic.claude-sonnet-4-6 or another non-canonical alias), register_model previously fell back to an empty existing_model. The merged entry then carried only the fields the user set explicitly (input/output cost, provider) and dropped cache pricing. Downstream the cost calculator defaulted cache_creation_input_token_cost and cache_read_input_token_cost to 0, silently dropping the bulk of the bill for cache-heavy Anthropic traffic.

  register_model now attempts to resolve a canonical built-in entry by stripping provider prefixes, region prefixes, and provider-specific suffixes before giving up. When a variant resolves, its defaults (notably cache pricing) are inherited while the user's explicit overrides still win. When nothing resolves and the user supplied no cache pricing, it logs a warning instead of silently under-billing.

* fix(router): inherit built-in cache pricing on deployments with partial custom pricing

  A deployment configured with only input_cost_per_token and output_cost_per_token under model_info was being registered under its model_info.id with no cache cost fields. The cost calculator then defaulted cache_creation_input_token_cost and cache_read_input_token_cost to 0, silently billing cache_read and cache_creation tokens at zero. For cache-heavy Anthropic traffic this drops the bulk of the bill.

  When the deployment's litellm_params.model resolves to a built-in cost-map entry, pull the cache pricing fields from there before registering. User-specified cache fields still win on merge; only missing fields are inherited.

  Pairs with the register_model fallback added earlier in this branch: that handles unmapped key shapes like bedrock/bedrock/x, this handles deploy-id keys whose backend model is mapped.

* fix(register_model): inherit only cache pricing on unmapped-key fallback, not provider

  The unmapped-key fallback in register_model copied the entire resolved built-in entry, so registering openai/command-r-plus inherited the cohere built-in's litellm_provider and get_model_info(custom_llm_provider=openai) could no longer resolve it. Restrict the fallback to the cache-pricing fields, matching the router-side _inherit_builtin_cache_pricing, so the cache-cost dropout stays fixed without clobbering the registered provider.

  Add a direct unit test for Router._inherit_builtin_cache_pricing so the router coverage check sees it, and pin the fixed spend-counter contract: when reservation reconcile fails the counter must hold the directly incremented cost rather than being left at None.

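
The merge rule described in the commit message (inherit only missing cache-pricing fields from a built-in entry, while explicit user values and the registered provider are left alone) can be sketched as below. This is a minimal illustration, not LiteLLM's code: `inherit_cache_pricing` and `CACHE_PRICING_FIELDS` are hypothetical names, the two field names are the ones quoted in the commit message, and all prices are made-up numbers.

```python
from typing import Any, Dict, Optional

# Field names quoted in the commit message; the constant itself is hypothetical.
CACHE_PRICING_FIELDS = (
    "cache_creation_input_token_cost",
    "cache_read_input_token_cost",
)


def inherit_cache_pricing(
    custom_entry: Dict[str, Any],
    builtin_entry: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Return a copy of custom_entry with missing cache-pricing fields
    filled in from builtin_entry.

    Hypothetical helper for illustration only. Nothing except the
    cache-pricing fields is copied, so the provider set by the user
    is never overwritten.
    """
    merged = dict(custom_entry)
    if builtin_entry is None:
        # Nothing resolved: leave the entry as the user supplied it.
        return merged
    for field in CACHE_PRICING_FIELDS:
        # Inherit only when the user did not set the field explicitly.
        if merged.get(field) is None and builtin_entry.get(field) is not None:
            merged[field] = builtin_entry[field]
    return merged


# Illustrative entries; these are not real prices.
builtin = {
    "litellm_provider": "bedrock",
    "input_cost_per_token": 3e-06,
    "output_cost_per_token": 1.5e-05,
    "cache_creation_input_token_cost": 3.75e-06,
    "cache_read_input_token_cost": 3e-07,
}
custom = {
    "litellm_provider": "openai",
    "input_cost_per_token": 1e-06,
    "output_cost_per_token": 2e-06,
    "cache_read_input_token_cost": 5e-07,  # explicit override, must win
}

merged = inherit_cache_pricing(custom, builtin)
```

In this sketch `merged` keeps the user's provider, input/output costs, and explicit `cache_read_input_token_cost`, and picks up only `cache_creation_input_token_cost` from the built-in entry. Copying the whole built-in entry instead would also overwrite `litellm_provider`, which is the regression the last commit bullet describes.
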
| | | |
|---|---|---|
| agent_tests | ||
| audio_tests | ||
| basic_proxy_startup_tests | ||
| batches_tests | ||
| benchmarks | ||
| code_coverage_tests | ||
| documentation_tests | ||
| enterprise | ||
| guardrails_tests | ||
| image_gen_tests | ||
| integration | ||
| litellm | ||
| litellm_core_utils | ||
| litellm_utils_tests | ||
| litellm-proxy-extras | ||
| llm_responses_api_testing | ||
| llm_translation | ||
| load_tests | ||
| local_testing | ||
| logging_callback_tests | ||
| mcp_tests | ||
| multi_instance_e2e_tests | ||
| ocr_tests | ||
| old_proxy_tests/tests | ||
| openai_endpoints_tests | ||
| otel_tests | ||
| pass_through_tests | ||
| pass_through_unit_tests | ||
| proxy_admin_ui_tests | ||
| proxy_behavior | ||
| proxy_e2e_anthropic_messages_tests | ||
| proxy_migration_tests | ||
| proxy_security_tests | ||
| proxy_unit_tests | ||
| router_unit_tests | ||
| scim_tests | ||
| search_tests | ||
| spend_tracking_tests | ||
| store_model_in_db_tests | ||
| test_litellm | ||
| unified_google_tests | ||
| vector_store_tests | ||
| windows_tests | ||
| __init__.py | ||
| _flush_vcr_cache.py | ||
| _live_test_helpers.py | ||
| _openai_record_replay_proxy.py | ||
| _vcr_conftest_common.py | ||
| _vcr_redis_persister.py | ||
| eval_swe_bench.py | ||
| gettysburg.wav | ||
| large_text.py | ||
| openai_batch_completions.jsonl | ||
| README.MD | ||
| test_budget_management.py | ||
| test_callbacks_on_proxy.py | ||
| test_config.py | ||
| test_debug_warning.py | ||
| test_default_encoding_non_root.py | ||
| test_end_users.py | ||
| test_entrypoint.py | ||
| test_fallbacks.py | ||
| test_gpt5_azure_temperature_support.py | ||
| test_health.py | ||
| test_keys.py | ||
| test_litellm_proxy_responses_config.py | ||
| test_logging.conf | ||
| test_models.py | ||
| test_new_vector_store_endpoints.py | ||
| test_openai_endpoints.py | ||
| test_organizations.py | ||
| test_otel_thread_leak.py | ||
| test_passthrough_endpoints.py | ||
| test_presidio_latency.py | ||
| test_proxy_server_non_root.py | ||
| test_ratelimit.py | ||
| test_resource_cleanup.py | ||
| test_service_logger_otel.py | ||
| test_spend_logs.py | ||
| test_team_logging.py | ||
| test_team_members.py | ||
| test_team.py | ||
| test_users.py | ||

In total, litellm runs 1000+ tests.

[02/20/2025] Update:
To make it easier to contribute and to map what behavior is tested,
we've started mapping the litellm directory in tests/test_litellm.
This folder can only run mock tests.