* test(ci): extend record/replay proxy to chat, embeddings, moderations, rerank, anthropic
The record/replay proxy that took the gpt-image-1 spend E2E off the live OpenAI
path now fronts every provider, so the other real-provider E2Es stop paying for
and depending on live calls each commit. It keys per upstream and selects a
non-OpenAI provider by a /__recorder_upstream/<host>/ path prefix carried on the
model's api_base, since some litellm handlers (cohere rerank) drop custom
request headers. Wired into build_and_test (chat, embeddings, moderations,
image), the otel job (cohere rerank), and the anthropic-messages job via a
reusable start_openai_record_replay_proxy command.
Dropped the time.time()/uuid prompt cache-busters in the build_and_test chat
tests, whose config has the response cache off, so identical requests are
recordable. The image spend test now asserts a repeat call still bills spend,
failing loudly if the proxy response cache is ever turned on.
Responses, the anthropic passthrough, bedrock, and fake-endpoint tests are left
live: their lifecycles, api_base assertions, providers, or fake targets make a
stateless body-keyed cache either break them or add nothing.
* docs(ci): note the recorder command's OpenAI default upstream and prefix override
Addresses a review note: the shared start_openai_record_replay_proxy command
defaults the upstream to OpenAI, so a non-OpenAI model must carry the
/__recorder_upstream/<host>/ prefix on its api_base. Document that in the
command description so a future caller does not assume the default follows the
provider.