* Add Claude Fable 5 across Anthropic, Bedrock, Vertex AI, and Azure AI
Adds cost map entries for claude-fable-5 ($10/$50 per MTok, 1M context,
128K output, adaptive thinking only) on the Anthropic API, Bedrock
converse (base, global, and us/eu geo inference profiles at the 10%
regional premium), Vertex AI, and Azure AI (Microsoft Foundry, which
serves Fable 5 with the full 1M context window unlike Opus 4.8).
Registers anthropic.claude-fable-5 in BEDROCK_CONVERSE_MODELS, lists the
model in the setup wizard, and extends the reasoning effort e2e grid.
The Bedrock, Vertex, and Azure grid cells carry fail_reason markers
until the CI accounts are provisioned: Bedrock needs the provider data
sharing opt-in Fable 5 requires, and the Foundry resource needs a
claude-fable-5 deployment.
The first-party entry carries provider_specific_entry {us: 1.1} for the
inference_geo premium and deliberately no fast multiplier since Fable 5
has no fast mode.
https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm
* Drop removed sampling params for Claude 4.7+ when drop_params is set
Fable 5, Opus 4.7, and Opus 4.8 removed sampling params: the API rejects
top_p, top_k, and any temperature other than 1 with a 400. LiteLLM was
forwarding them even with drop_params enabled because the Anthropic and
Bedrock converse transformations passed temperature/top_p through
unconditionally.
Mirror the GPT-5/o-series handling: temperature=1 still passes through,
other values and any top_p are dropped when drop_params is set, and
without drop_params a clean client-side UnsupportedParamsError tells the
caller how to opt in, instead of surfacing the raw provider error.
https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm
* Drive sampling param gating from the cost map and cover top_k
Greptile review follow-ups on the sampling param fix: the restriction for
Fable 5 / Opus 4.7 / 4.8 is now declared as supports_sampling_params: false
on every affected cost map entry (perplexity excluded; that route is
OpenAI-compatible and maps sampling params upstream) and read back through
a tri-state map lookup, keeping the name check only as a fallback for
provider-routed ids whose hosted map entries predate the flag, the same
layering supports_adaptive_thinking uses. top_k bypasses map_openai_params
as a provider-specific kwarg, so it is gated at the shared
AnthropicConfig.transform_request boundary (direct, Bedrock invoke, Vertex,
Azure) and in the Bedrock converse _handle_top_k_value path, with
drop_params threaded through the converse transform helpers.
Also updates the reasoning effort grid cell count assertion for the four
Fable 5 rows added on this branch (29 x 11 cells).
https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm
* Declare supports_sampling_params in the cost map schema
The model map validation schema uses additionalProperties: false, so the
new flag must be declared for the 28 entries that carry it; this was the
one failing job (misc / Run tests) on the previous commit.
https://claude.ai/code/session_01MZarYYT3aS7DxaNjoax6Gm
* fix(bedrock): gate top_k=0 on converse to match Anthropic boundary
Truthiness check let top_k=0 silently disappear on models that removed
sampling params, while AnthropicConfig.transform_request treats 0 as
present and raises UnsupportedParamsError (or drops when drop_params is
set). Switch to 'is not None' so converse, direct Anthropic, invoke,
Vertex, and Azure all behave the same for top_k=0.
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
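The sampling-parameter gating described in the commits above can be sketched roughly as follows. This is a minimal, self-contained illustration and not LiteLLM's actual code: `lookup_flag`, `sampling_params_removed`, `gate_sampling_params`, the local `UnsupportedParamsError` class, and the Opus name markers in the fallback list are hypothetical stand-ins. Only the `supports_sampling_params` flag, `drop_params`, and the `claude-fable-5` id come from the commit text.

```python
from typing import Any, Dict, Optional


class UnsupportedParamsError(ValueError):
    """Local stand-in for the client-side error, so the sketch runs on its own."""


# Hypothetical name markers, consulted only when the map entry has no flag.
_FALLBACK_NAME_MARKERS = ("claude-fable-5", "claude-opus-4-7", "claude-opus-4-8")

_SAMPLING_KEYS = ("temperature", "top_p", "top_k")


def lookup_flag(model: str, cost_map: Dict[str, Dict[str, Any]]) -> Optional[bool]:
    """Tri-state lookup: True/False if the entry declares the flag, else None."""
    entry = cost_map.get(model)
    if entry is None or "supports_sampling_params" not in entry:
        return None
    return bool(entry["supports_sampling_params"])


def sampling_params_removed(model: str, cost_map: Dict[str, Dict[str, Any]]) -> bool:
    """Prefer the declared flag; fall back to a name check when it is absent."""
    flag = lookup_flag(model, cost_map)
    if flag is not None:
        return not flag
    return any(marker in model for marker in _FALLBACK_NAME_MARKERS)


def gate_sampling_params(
    model: str,
    params: Dict[str, Any],
    cost_map: Dict[str, Dict[str, Any]],
    drop_params: bool = False,
) -> Dict[str, Any]:
    """Return the params that may be forwarded, dropping or rejecting the rest."""
    if not sampling_params_removed(model, cost_map):
        return dict(params)

    allowed: Dict[str, Any] = {}
    rejected = []
    for key, value in params.items():
        if key not in _SAMPLING_KEYS:
            allowed[key] = value
        elif value is None:
            continue  # an unset param is simply not sent
        elif key == "temperature" and value == 1:
            allowed[key] = value  # temperature=1 is still accepted
        else:
            # "is not None" semantics: top_k=0 counts as present and is gated.
            rejected.append(key)

    if rejected and not drop_params:
        raise UnsupportedParamsError(
            f"{model} does not support {rejected}; set drop_params to drop them."
        )
    return allowed
```

With `drop_params=True` the rejected keys are silently removed; without it the caller gets a client-side error that names the offending parameters instead of a raw provider 400.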
| Name | | |
|---|---|---|
| fixtures | ||
| realtime | ||
| reasoning_effort_grid | ||
| test_llm_response_utils | ||
| test_skills_data | ||
| test-skill | ||
| base_audio_transcription_unit_tests.py | ||
| base_embedding_unit_tests.py | ||
| base_llm_unit_tests.py | ||
| base_rerank_unit_tests.py | ||
| conftest.py | ||
| dog.wav | ||
| duck.png | ||
| gettysburg.wav | ||
| guinea.png | ||
| log.xt | ||
| Readme.md | ||
| test_a2a.py | ||
| test_anthropic_completion.py | ||
| test_aws_base_llm.py | ||
| test_azure_agents.py | ||
| test_azure_ai.py | ||
| test_azure_o_series.py | ||
| test_azure_openai.py | ||
| test_bedrock_agentcore.py | ||
| test_bedrock_agents.py | ||
| test_bedrock_anthropic_regression.py | ||
| test_bedrock_common_utils.py | ||
| test_bedrock_completion.py | ||
| test_bedrock_dynamic_auth_params_unit_tests.py | ||
| test_bedrock_embedding.py | ||
| test_bedrock_govcloud.py | ||
| test_bedrock_gpt_oss.py | ||
| test_bedrock_invoke_tests.py | ||
| test_bedrock_llama.py | ||
| test_bedrock_mantle.py | ||
| test_bedrock_moonshot.py | ||
| test_bedrock_nova_embedding.py | ||
| test_bedrock_nova_json.py | ||
| test_cloudflare.py | ||
| test_cohere.py | ||
| test_containers_api.py | ||
| test_convert_dict_to_image.py | ||
| test_crusoe.py | ||
| test_databricks.py | ||
| test_deepgram.py | ||
| test_deepseek_completion.py | ||
| test_elevenlabs.py | ||
| test_evals_api.py | ||
| test_fireworks_ai_translation.py | ||
| test_gemini_image_usage.py | ||
| test_gemini.py | ||
| test_gigachat.py | ||
| test_gpt4o_audio.py | ||
| test_groq.py | ||
| test_hosted_vllm_embedding_e2e.py | ||
| test_huggingface_chat_completion.py | ||
| test_hyperbolic.py | ||
| test_infinity.py | ||
| test_jina_ai.py | ||
| test_lambda_ai.py | ||
| test_langgraph.py | ||
| test_litellm_proxy_provider.py | ||
| test_minimax_tts.py | ||
| test_mistral_api.py | ||
| test_model_cost_map_resilience.py | ||
| test_morph.py | ||
| test_nvidia_nim.py | ||
| test_openai_o1.py | ||
| test_openai_record_replay_proxy.py | ||
| test_openai.py | ||
| test_openrouter.py | ||
| test_optional_params.py | ||
| test_perplexity_reasoning.py | ||
| test_prompt_caching.py | ||
| test_prompt_factory.py | ||
| test_replicate.py | ||
| test_rerank.py | ||
| test_router_llm_translation_tests.py | ||
| test_sambanova_chat_transformation.py | ||
| test_skills_api.py | ||
| test_skills_e2e.py | ||
| test_snowflake.py | ||
| test_text_completion_unit_tests.py | ||
| test_text_completion.py | ||
| test_together_ai.py | ||
| test_triton.py | ||
| test_unit_test_bedrock_invoke.py | ||
| test_v0.py | ||
| test_vcr_classification.py | ||
| test_vcr_conftest_common_banner.py | ||
| test_vcr_filters.py | ||
| test_vcr_redis_persister.py | ||
| test_voyage_ai.py | ||
| test_watsonx.py | ||
| test_xai.py | ||
Unit tests for individual LLM providers.
Each test file is named after the LLM provider it covers, e.g. test_openai.py is for OpenAI.
Redis-backed VCR cache
Every test in this directory is auto-decorated with @pytest.mark.vcr (via
conftest.py). The first time a test runs we hit the live provider and
record the HTTP exchange into Redis under
litellm:vcr:cassette:<test_id>. Every subsequent run within 24h replays
from Redis without touching the network. The 24h TTL means each new day's
first run records again, so upstream API drift surfaces within a day.
The persister, header scrubbing, and 2xx-only filtering are defined in
tests/_vcr_redis_persister.py. Files that already use respx (which
patches the same httpx transport vcrpy does) are excluded from the
auto-marker — see _RESPX_CONFLICTING_FILES in conftest.py.
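The record-then-replay behaviour described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in tests/_vcr_redis_persister.py: an in-memory dict stands in for Redis, and `CassetteStore`, `run_with_cache`, and the `SCRUBBED_HEADERS` set are hypothetical names. The key prefix, the 24h TTL, the header scrubbing, and the 2xx-only rule are taken from the text.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CASSETTE_TTL_SECONDS = 24 * 60 * 60  # 24h TTL, as described above

# Assumption: which headers are scrubbed is not stated; these are examples.
SCRUBBED_HEADERS = {"authorization", "x-api-key"}


class CassetteStore:
    """In-memory stand-in for the Redis persister (dict plus expiry times)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, dict]] = {}

    @staticmethod
    def key_for(test_id: str) -> str:
        return f"litellm:vcr:cassette:{test_id}"

    def load(self, test_id: str) -> Optional[dict]:
        """Return the recorded exchange, or None if missing or expired."""
        key = self.key_for(test_id)
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, exchange = item
        if self._clock() >= expires_at:
            del self._data[key]  # expired: the next run records again
            return None
        return exchange

    def save(self, test_id: str, exchange: dict) -> bool:
        """Store only 2xx exchanges, with sensitive headers removed."""
        if not 200 <= exchange["status"] < 300:
            return False
        headers = {
            name: value
            for name, value in exchange.get("headers", {}).items()
            if name.lower() not in SCRUBBED_HEADERS
        }
        stored = dict(exchange, headers=headers)
        self._data[self.key_for(test_id)] = (
            self._clock() + CASSETTE_TTL_SECONDS,
            stored,
        )
        return True


def run_with_cache(
    store: CassetteStore, test_id: str, call_live: Callable[[], dict]
) -> Tuple[dict, str]:
    """Replay from the store when possible; otherwise go live and record."""
    cached = store.load(test_id)
    if cached is not None:
        return cached, "replayed"
    exchange = call_live()
    saved = store.save(test_id, exchange)
    return exchange, "recorded" if saved else "live"
```

Injecting the clock keeps the expiry logic testable without waiting a day; in Redis itself the same effect would come from a key expiry rather than a stored timestamp.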
The same VCR cache is used by other test directories that exercise live
provider APIs. The reusable conftest plumbing lives in
tests/_vcr_conftest_common.py and is wired into:
* tests/llm_translation/
* tests/llm_responses_api_testing/
* tests/audio_tests/
* tests/batches_tests/
* tests/guardrails_tests/
* tests/image_gen_tests/
* tests/litellm_utils_tests/
* tests/local_testing/ (covers local_testing_part1, local_testing_part2, litellm_router_testing, litellm_assistants_api_testing, langfuse_logging_unit_tests)
* tests/logging_callback_tests/
* tests/pass_through_unit_tests/
* tests/router_unit_tests/
* tests/unified_google_tests/
Test directories that run LiteLLM proxy in Docker (e.g. build_and_test,
proxy_logging_guardrails_model_info_tests, proxy_store_model_in_db_tests)
are intentionally not included: VCR.py patches the in-process httpx
transport, so it cannot intercept the LLM calls that originate inside the
Docker container.
Required environment
CASSETTE_REDIS_URL: points at a Redis instance separate from the
application Redis (REDIS_URL/REDIS_HOST), so that proxy tests do not
flush the test cassettes. Provider credentials (ANTHROPIC_API_KEY,
OPENAI_API_KEY, AWS_*, etc.) are needed only on a cache miss (the daily
re-record), not on replay.
Flushing the cache
When you want the next run to re-record immediately instead of waiting for the 24h TTL:
make test-llm-translation-flush-vcr-cache
Disabling VCR
Skip the cache entirely (every call goes live, no recording):
LITELLM_VCR_DISABLE=1 uv run pytest tests/llm_translation/test_<file>.py