litellm


Mateo Wang e15b37a18e Add Claude Fable 5 across Anthropic, Bedrock, Vertex AI, and Azure AI (#30064 )		2026-06-10 08:50:15 +05:30
* Add Claude Fable 5 across Anthropic, Bedrock, Vertex AI, and Azure AI
  Adds cost map entries for claude-fable-5 ($10/$50 per MTok, 1M context, 128K output, adaptive thinking only) on the Anthropic API, Bedrock converse (base, global, and us/eu geo inference profiles at the 10% regional premium), Vertex AI, and Azure AI (Microsoft Foundry, which serves Fable 5 with the full 1M context window unlike Opus 4.8). Registers anthropic.claude-fable-5 in BEDROCK_CONVERSE_MODELS, lists the model in the setup wizard, and extends the reasoning effort e2e grid. The Bedrock, Vertex, and Azure grid cells carry fail_reason markers until the CI accounts are provisioned: Bedrock needs the provider data sharing opt-in Fable 5 requires, and the Foundry resource needs a claude-fable-5 deployment. The first-party entry carries provider_specific_entry {us: 1.1} for the inference_geo premium and deliberately no fast multiplier since Fable 5 has no fast mode.
  https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm
* Drop removed sampling params for Claude 4.7+ when drop_params is set
  Fable 5, Opus 4.7, and Opus 4.8 removed sampling params: the API rejects top_p, top_k, and any temperature other than 1 with a 400. LiteLLM was forwarding them even with drop_params enabled because the Anthropic and Bedrock converse transformations passed temperature/top_p through unconditionally. Mirror the GPT-5/o-series handling: temperature=1 still passes through, other values and any top_p are dropped when drop_params is set, and without drop_params a clean client-side UnsupportedParamsError tells the caller how to opt in, instead of surfacing the raw provider error.
  https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm
* Drive sampling param gating from the cost map and cover top_k
  Greptile review follow-ups on the sampling param fix: the restriction for Fable 5 / Opus 4.7 / 4.8 is now declared as supports_sampling_params: false on every affected cost map entry (perplexity excluded; that route is OpenAI-compatible and maps sampling params upstream) and read back through a tri-state map lookup, keeping the name check only as a fallback for provider-routed ids whose hosted map entries predate the flag, the same layering supports_adaptive_thinking uses. top_k bypasses map_openai_params as a provider-specific kwarg, so it is gated at the shared AnthropicConfig.transform_request boundary (direct, Bedrock invoke, Vertex, Azure) and in the Bedrock converse _handle_top_k_value path, with drop_params threaded through the converse transform helpers. Also updates the reasoning effort grid cell count assertion for the four Fable 5 rows added on this branch (29 x 11 cells).
  https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm
* Declare supports_sampling_params in the cost map schema
  The model map validation schema uses additionalProperties: false, so the new flag must be declared for the 28 entries that carry it; this was the one failing job (misc / Run tests) on the previous commit.
  https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm
* fix(bedrock): gate top_k=0 on converse to match Anthropic boundary
  Truthiness check let top_k=0 silently disappear on models that removed sampling params, while AnthropicConfig.transform_request treats 0 as present and raises UnsupportedParamsError (or drops when drop_params is set). Switch to 'is not None' so converse, direct Anthropic, invoke, Vertex, and Azure all behave the same for top_k=0.
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
fixtures	test fix	2025-10-16 18:00:46 -07:00
realtime	test: stabilize batch VCR coverage and stop live upload/network leaks (#29477 )	2026-06-02 16:11:52 -07:00
reasoning_effort_grid	Add Claude Fable 5 across Anthropic, Bedrock, Vertex AI, and Azure AI (#30064 )	2026-06-10 08:50:15 +05:30
test_llm_response_utils	Litellm oss staging (#29492 )	2026-06-02 08:48:10 -07:00
test_skills_data	Remove Apache 2 license from SKILL.md (#22322 )	2026-02-27 19:33:55 -08:00
test-skill	[Feat] New API - Claude Skills API (Anthropic) (#17042 )	2025-11-24 15:01:40 -08:00
base_audio_transcription_unit_tests.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
base_embedding_unit_tests.py	Litellm dev 12 25 2025 p2 (#7420 )	2024-12-25 18:35:34 -08:00
base_llm_unit_tests.py	test(vcr): close out the remaining VCR live-call leaks (#29603 )	2026-06-03 13:46:43 -07:00
base_rerank_unit_tests.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
conftest.py	test(vcr): close out the remaining VCR live-call leaks (#29603 )	2026-06-03 13:46:43 -07:00
dog.wav	(feat) Support audio param in responses streaming (#6312 )	2024-10-18 19:16:14 +05:30
duck.png	fix vertex ai multimodal embedding translation (#9471 )	2025-03-24 23:23:28 -07:00
gettysburg.wav	Litellm dev 12 25 2025 p2 (#7420 )	2024-12-25 18:35:34 -08:00
guinea.png	fix vertex ai multimodal embedding translation (#9471 )	2025-03-24 23:23:28 -07:00
log.xt	Litellm dev 04 05 2025 p2 (#9774 )	2025-04-07 21:02:52 -07:00
Readme.md	test: add 24hr Redis-backed VCR cache to additional test suites (#27159 )	2026-05-05 15:13:31 -07:00
test_a2a.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_anthropic_completion.py	chore(ci): modernize model references in tests and configs (#27856 )	2026-05-15 15:44:28 -07:00
test_aws_base_llm.py	Add support for AWS assume_role with a session token	2025-08-23 22:37:21 -07:00
test_azure_agents.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_azure_ai.py	test_completion_azure	2026-03-30 21:54:27 -07:00
test_azure_o_series.py	[Fix] CI: Enable VCR replay for test_azure_o_series	2026-05-04 20:48:26 -07:00
test_azure_openai.py	Litellm ishaan april15 2 (#25828 )	2026-04-15 18:42:23 -07:00
test_bedrock_agentcore.py	Revert "chore(tests): migrate Bedrock CI to AWS account 941277531214 (#28728 )" (#29326 )	2026-05-30 11:26:24 -07:00
test_bedrock_agents.py	test: skip test with invalid arn	2025-09-09 20:35:44 -07:00
test_bedrock_anthropic_regression.py	fix(tests): replace deprecated Bedrock Claude 3.7 Sonnet model ID	2026-04-28 14:24:19 -07:00
test_bedrock_common_utils.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_bedrock_completion.py	Litellm oss staging 080626 (#29932 )	2026-06-08 13:49:52 -07:00
test_bedrock_dynamic_auth_params_unit_tests.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_bedrock_embedding.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_bedrock_govcloud.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_bedrock_gpt_oss.py	[Test] add request-body mock test for bedrock gpt-oss tool schema	2026-04-14 19:36:57 -07:00
test_bedrock_invoke_tests.py	test(vcr): close out the remaining VCR live-call leaks (#29603 )	2026-06-03 13:46:43 -07:00
test_bedrock_llama.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_bedrock_mantle.py	fix(bedrock-mantle): use /anthropic/v1/messages path for Mantle endpo… (#27976 )	2026-05-15 13:31:59 -07:00
test_bedrock_moonshot.py	[Test] Mock remaining live Bedrock Moonshot tests	2026-04-16 17:43:43 -07:00
test_bedrock_nova_embedding.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_bedrock_nova_json.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_cloudflare.py	fix(cloudflare): support response_text in streaming chunk parser	2026-05-02 05:59:15 +00:00
test_cohere.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_containers_api.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_convert_dict_to_image.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_crusoe.py	fix(crusoe): remove trailing slashes from API base URLs and fix list indentation	2026-05-01 17:27:52 +05:30
test_databricks.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_deepgram.py	Litellm dev 12 28 2024 p3 (#7464 )	2024-12-28 19:18:58 -08:00
test_deepseek_completion.py	Litellm oss staging (#28161 )	2026-05-18 16:27:44 -07:00
test_elevenlabs.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_evals_api.py	[Fix] Tests: Reduce VCR cassette bloat and fix multipart caching	2026-05-07 11:54:19 -07:00
test_fireworks_ai_translation.py	Litellm oss staging 040626 (#29671 )	2026-06-04 11:07:20 -07:00
test_gemini_image_usage.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_gemini.py	Litellm oss staging 030626 (#29578 )	2026-06-03 11:01:51 -07:00
test_gigachat.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_gpt4o_audio.py	fix(tests): replace shut-down gpt-4o-audio-preview with gpt-audio-1.5 (#28281 )	2026-05-19 14:48:30 -07:00
test_groq.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_hosted_vllm_embedding_e2e.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_huggingface_chat_completion.py	Revert "Revert "fix tests (#12286 )""	2025-07-03 12:08:27 -07:00
test_hyperbolic.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_infinity.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_jina_ai.py	Litellm 12 02 2024 (#6994 )	2024-12-02 22:00:01 -08:00
test_lambda_ai.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_langgraph.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_litellm_proxy_provider.py	fix(tests): replace deprecated Bedrock Claude 3.7 Sonnet model ID	2026-04-28 14:24:19 -07:00
test_minimax_tts.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_mistral_api.py	test: reduce mistral direct tests b/c of rate limit errors	2025-08-23 11:15:03 -07:00
test_model_cost_map_resilience.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_morph.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_nvidia_nim.py	fix(tests): migrate realtime + rerank tests off shut-down upstream models (#28191 )	2026-05-18 15:41:51 -07:00
test_openai_o1.py	test(vcr): drop dead 'from respx import MockRouter' imports	2026-05-13 00:32:03 +00:00
test_openai_record_replay_proxy.py	Extend the record/replay proxy to chat, embeddings, moderations, rerank, and Anthropic (#29847 )	2026-06-06 14:33:42 -07:00
test_openai.py	test(vcr): drop dead 'from respx import MockRouter' imports	2026-05-13 00:32:03 +00:00
test_openrouter.py	Fix deprecated model test	2026-05-11 09:49:47 +05:30
test_optional_params.py	chore(ci): modernize model references in tests and configs (#27856 )	2026-05-15 15:44:28 -07:00
test_perplexity_reasoning.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_prompt_caching.py	test(vcr): drop dead 'from respx import MockRouter' imports	2026-05-13 00:32:03 +00:00
test_prompt_factory.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_replicate.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_rerank.py	refactor: refactor testing	2026-03-28 18:39:32 -07:00
test_router_llm_translation_tests.py	test: test	2026-03-28 19:17:38 -07:00
test_sambanova_chat_transformation.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_skills_api.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_skills_e2e.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_snowflake.py	Merge main and resolve Snowflake test conflict	2026-03-30 18:06:37 -07:00
test_text_completion_unit_tests.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_text_completion.py	Add inference providers support for Hugging Face (#8258 ) (#9738 ) (#9773 )	2025-04-05 10:50:15 -07:00
test_together_ai.py	[Fix] TogetherAIConfig.get_supported_openai_params recursion	2026-04-16 17:20:58 -07:00
test_triton.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_unit_test_bedrock_invoke.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_v0.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_vcr_classification.py	test: stabilize batch VCR coverage and stop live upload/network leaks (#29477 )	2026-06-02 16:11:52 -07:00
test_vcr_conftest_common_banner.py	fix(tests/vcr): make Redis cassette cache replay deterministically (zero VCR misses on consecutive runs) (#28826 )	2026-05-26 11:30:44 -07:00
test_vcr_filters.py	fix(tests/vcr): mint Google OAuth tokens live to prevent stale-token replay (#29229 )	2026-05-28 17:12:02 -07:00
test_vcr_redis_persister.py	test(vcr): stop refreshing cassette TTL on read so cassettes lapse after 24h (#29784 )	2026-06-05 10:22:41 -07:00
test_voyage_ai.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_watsonx.py	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
test_xai.py	test(vcr): drop dead 'from respx import MockRouter' imports	2026-05-13 00:32:03 +00:00

Readme.md

Unit tests for individual LLM providers.

Each test file is named after the LLM provider it covers; for example, test_openai.py covers OpenAI.

Redis-backed VCR cache

Every test in this directory is auto-decorated with @pytest.mark.vcr (via conftest.py). The first time a test runs we hit the live provider and record the HTTP exchange into Redis under litellm:vcr:cassette:<test_id>. Every subsequent run within 24h replays from Redis without touching the network. The 24h TTL means each new day's first run records again, so upstream API drift surfaces within a day.
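The record-then-replay behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the code in conftest.py: `InMemoryCassetteStore` and `run_with_cassette` are hypothetical names, and an in-memory dict with an injectable clock stands in for Redis so the TTL expiry can be shown without a server. Only the key prefix and the 24h TTL come from this Readme.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CASSETTE_TTL_SECONDS = 24 * 60 * 60  # the 24h TTL described above
KEY_PREFIX = "litellm:vcr:cassette:"  # key layout described above


class InMemoryCassetteStore:
    """Hypothetical stand-in for Redis: maps key -> (expiry time, payload)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, payload = entry
        if self._clock() >= expires_at:
            # Expired entries vanish, as they would in Redis.
            del self._data[key]
            return None
        # Reading does not extend the expiry time.
        return payload

    def set(self, key: str, payload: str, ttl: int) -> None:
        self._data[key] = (self._clock() + ttl, payload)


def run_with_cassette(
    store: InMemoryCassetteStore, test_id: str, call_live: Callable[[], str]
) -> Tuple[str, str]:
    """Replay a stored exchange if one exists, otherwise go live and record it."""
    key = KEY_PREFIX + test_id
    cached = store.get(key)
    if cached is not None:
        return cached, "replayed"
    payload = call_live()  # only a cache miss reaches the provider
    store.set(key, payload, CASSETTE_TTL_SECONDS)
    return payload, "recorded"
```

With a fake clock, the first call records, a second call inside the window replays, and a call after the TTL has passed records again.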

The persister, header scrubbing, and 2xx-only filtering are defined in tests/_vcr_redis_persister.py. Files that already use respx (which patches the same httpx transport vcrpy does) are excluded from the auto-marker — see _RESPX_CONFLICTING_FILES in conftest.py.
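As an illustration of what header scrubbing combined with 2xx-only filtering might look like, here is a minimal sketch. It is not the code in tests/_vcr_redis_persister.py. It assumes the response is a vcrpy-style dict with a `status.code` field and a `headers` mapping, and that returning `None` means "do not record". The function name `filter_response` and the `SENSITIVE_HEADERS` set are hypothetical.

```python
import copy
from typing import Any, Dict, Optional

# Hypothetical list for illustration; the real set lives in the persister module.
SENSITIVE_HEADERS = {"authorization", "x-api-key", "set-cookie"}


def filter_response(response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return a scrubbed copy of a 2xx response, or None for anything else."""
    status = response.get("status", {}).get("code")
    if not isinstance(status, int) or not 200 <= status < 300:
        # Errors and rate limits are not cached, so the next run retries live.
        return None
    cleaned = copy.deepcopy(response)  # leave the caller's object untouched
    cleaned["headers"] = {
        name: value
        for name, value in cleaned.get("headers", {}).items()
        if name.lower() not in SENSITIVE_HEADERS
    }
    return cleaned
```

Skipping non-2xx responses keeps a transient 429 or 500 from being replayed for the rest of the TTL window.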

The same VCR cache is used by other test directories that exercise live provider APIs. The reusable conftest plumbing lives in tests/_vcr_conftest_common.py and is wired into:

tests/llm_translation/
tests/llm_responses_api_testing/
tests/audio_tests/
tests/batches_tests/
tests/guardrails_tests/
tests/image_gen_tests/
tests/litellm_utils_tests/
tests/local_testing/ (covers local_testing_part1, local_testing_part2, litellm_router_testing, litellm_assistants_api_testing, langfuse_logging_unit_tests)
tests/logging_callback_tests/
tests/pass_through_unit_tests/
tests/router_unit_tests/
tests/unified_google_tests/

Test directories that run LiteLLM proxy in Docker (e.g. build_and_test, proxy_logging_guardrails_model_info_tests, proxy_store_model_in_db_tests) are intentionally not included: VCR.py patches the in-process httpx transport, so it cannot intercept the LLM calls that originate inside the Docker container.

Required environment

CASSETTE_REDIS_URL: points to a Redis instance separate from the application Redis (REDIS_URL/REDIS_HOST), so that test cassettes are not flushed by proxy tests. Provider credentials (ANTHROPIC_API_KEY, OPENAI_API_KEY, AWS_*, etc.) are needed only on a cache miss (the daily re-record), not on replay.

Flushing the cache

To make the next run re-record immediately instead of waiting for the 24h TTL to expire:

make test-llm-translation-flush-vcr-cache

Disabling VCR

Skip the cache entirely (every call goes live, no recording):

LITELLM_VCR_DISABLE=1 uv run pytest tests/llm_translation/test_<file>.py
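
How the two environment variables above might combine can be sketched as a small decision function. This is a hedged illustration, not the logic in tests/_vcr_conftest_common.py: `resolve_vcr_mode` is a hypothetical name, treating only the value `1` as "disabled" mirrors the command shown above, and falling back to live calls when CASSETTE_REDIS_URL is unset is an assumption.

```python
import os
from typing import Mapping


def resolve_vcr_mode(env: Mapping[str, str]) -> str:
    """Return "live" when calls should bypass the cache, otherwise "cached"."""
    if env.get("LITELLM_VCR_DISABLE") == "1":
        # Explicit opt-out: every call goes live and nothing is recorded.
        return "live"
    if not env.get("CASSETTE_REDIS_URL"):
        # Assumption: with no cassette Redis configured there is nothing to replay from.
        return "live"
    return "cached"


if __name__ == "__main__":
    print(resolve_vcr_mode(os.environ))
```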