From bb448b0031c8b5fc20db52383291798fd5ea001d Mon Sep 17 00:00:00 2001
From: Mateo Wang <277851410+mateo-berri@users.noreply.github.com>
Date: Mon, 18 May 2026 09:15:39 -0700
Subject: [PATCH] fix(tests): stabilize image-edit VCR cassettes to stop live gpt-image-1 spend (#28110)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* fix(tests): stabilize image-edit VCR cassettes to stop live gpt-image-1 spend

The image-edit cassettes for ``gpt-image-1`` were accumulating >50
episodes and being refused by the persister
(``tests/_vcr_redis_persister.py``), so every CI run was hitting the real
OpenAI endpoint. The async parametrize was the clearest tell:
``test_openai_image_edit_litellm_sdk[True]`` cached to 1 entry, but the
``[False]`` (async) sibling grew to 51 entries and never replayed.

Two non-deterministic sources were fueling the growth, both fixed here.
After this patch, the cassettes settle at one episode per unique call and
replay for the 24-hour TTL like every other suite.

1. Pin httpx's multipart boundary at the source. The existing
   ``_normalize_multipart_boundary`` rewrites the boundary in the
   ``Content-Type`` header reliably, but on the async transport path the
   body is not always a contiguous ``bytes`` object when
   ``before_record_request`` runs, so the body-side replacement silently
   no-ops and the recorded cassette retains the random ``boundary=``
   string. The next CI run gets a fresh random boundary, the ``safe_body``
   matcher misses, and ``record_mode="new_episodes"`` appends another
   episode. Wrapping ``httpx._multipart.MultipartStream.__init__`` so it
   always uses ``vcr-static-boundary`` when no boundary is supplied
   eliminates the variance for both sync and async paths and leaves the
   normalizer in place as a backstop. Exposed as
   ``pin_httpx_multipart_boundary`` so other multipart-heavy suites
   (audio, ocr, batches) can adopt the same fixture later.

2. Pass raw ``bytes`` (not ``BytesIO`` streams) through the image-edit
   fixtures. A ``BytesIO`` whose file pointer is at EOF after the first
   multipart upload silently encodes an empty image on the next SDK /
   Router retry -- yet another divergent body that VCR records as a new
   episode. ``bytes`` are immutable and position-less, so retries
   re-encode an identical payload every time. This is also a small
   production-correctness improvement: a customer passing ``BytesIO``
   today would hit the same empty-body retry bug. The BytesIO-specific
   smoke test (``test_openai_image_edit_with_bytesio``) is preserved by
   giving ``get_test_images_as_bytesio`` its own factory instead of
   aliasing the bytes one.

3. Add ``scripts/flush_image_edit_vcr_cassettes.py`` -- a one-shot Redis
   SCAN/DEL helper that clears the bloated pre-fix cassettes under
   ``litellm:vcr:cassette:tests/image_gen_tests/test_image_edits/*``.
   Without this, the next CI run still loads the existing 51-entry
   cassette, the new fixed-boundary body still doesn't match any of the
   stale entries, the persister still refuses to save, and the bleed
   continues. Run once with the production ``CASSETTE_REDIS_URL`` after
   merge (dry-run by default).

* DIAGNOSTIC: log VCR body mismatches + per-episode body hashes

Temporary observability boost so we can root-cause why
``test_image_edits.py`` async parametrizes still record fresh episodes on
every CI run even though the multipart boundary is now pinned (sync
parametrizes cache cleanly as VCR HIT). The matcher currently raises
``AssertionError("request bodies differ")`` with zero context, so we
cannot tell whether the live body genuinely varies, the matcher is
comparing a bytes object to a stream object, or the normalizer is silently
skipping the body because it is not bytes/str.

Three logs added; the first two are worth keeping permanently, the third
is intended to be reverted after the diagnosis lands:

1. ``_safe_body_matcher`` now emits a structured stderr block on mismatch
   (type of each side, length, SHA-256, first divergent byte offset,
   ±100-byte window). Always-on -- mismatches are signal, not noise, and
   the existing per-test verdict already logs once per test. PERMANENT.

2. ``_normalize_multipart_boundary`` now logs to stderr when the body type
   is not bytes/bytearray/str -- the silent ``else: return`` branch was
   masking exactly the case we suspect is firing on async (httpx
   ``MultipartStream`` handed to vcrpy before the body is read).
   PERMANENT.

3. ``_RedisPersister.save_cassette`` now logs every episode's body
   SHA-256, length, and 120-byte preview at save time. This lets two
   consecutive CI runs be diffed: if the same test records a different
   hash run-to-run, the live body genuinely varies; if both runs record
   the same hash but the matcher still misses, the bug is in the matcher
   itself. TEMPORARY -- revert once the async variance is identified and
   fixed.

Once a single ``image_gen_testing`` CI run produces these logs, revert
this commit (or just the persister hash block) with a force push so the
cassette save path is not noisy in steady-state.

* DIAGNOSTIC: route VCR diagnostics through per-PID files (bypass xdist capture)

Re-push of the diagnostic logging from the previous commit, this time
wired so the output actually survives to the CI log. xdist captures
stdout/stderr from every passing test in the worker process; the
body-matcher and normalizer-skip diagnostics fire from inside vcrpy
machinery during the test, so for any test that ultimately passes (which
is all of them once the cassettes are recorded), the diagnostic lines are
silently swallowed.

Fix: write each diagnostic line to a per-PID file under
``test-results/vcr-diagnostics/.log`` instead of writing to stderr. The
controller's ``pytest_terminal_summary`` aggregates those files and writes
them through ``terminalreporter.write_line``, which is not subject to
per-test capture.
As a bonus, ``test-results/`` is already collected by the
``store_test_results`` step in CircleCI, so the raw per-worker logs
survive as build artifacts even after the test session ends.

Three call sites updated:

1. ``_emit_body_mismatch_diagnostic`` (matcher) -- writes the structured
   type/length/sha/window block via ``vcr_diag_write_line``.
2. ``_normalize_multipart_boundary`` -- logs the silent-skip path (body
   not bytes/bytearray/str) the same way.
3. ``_maybe_log_episode_body_hashes`` (persister) -- replaces the
   ``_log.warning`` calls (which the root-logger config also swallows in
   CI) with ``vcr_diag_write_line``.

Image-gen conftest is the only suite wired to dump the aggregated log at
session end. Other suites can opt in by adding
``emit_vcr_diagnostic_log(terminalreporter)`` to their own
``pytest_terminal_summary``. The diagnostic dir is cleared at the start of
each session (controller-only) so a local rerun does not mix output from
prior runs.

Same revert plan as the previous diagnostic commit: keep the matcher +
normalizer skip diagnostics permanently (they only fire on signal events),
revert the persister body-hash dump once the async variance is identified.

* fix(tests): coalesce iterable request bodies before matching/recording

Root cause of the residual async image-edit cassette leak. The diagnostic
run for ``ba3915d9`` printed:

    [vcr-safe-body-matcher] request body mismatch
      body[a]: type='list_iterator' length=unknown sha256=N/A
      body[b]: type='list_iterator' length=unknown sha256=N/A

httpx's async transport hands vcrpy a ``request.body`` that is a
``list_iterator`` over multipart chunks rather than a contiguous ``bytes``
blob. Two consequences:

1. ``_safe_body_matcher`` compares the two iterator objects with ``==``,
   which is identity comparison for arbitrary iterators - semantically
   identical multipart bodies never compare equal, and
   ``record_mode="new_episodes"`` appends a new episode on every CI run
   until the cassette crosses ``MAX_EPISODES_PER_CASSETTE`` and the
   persister refuses to save (this is exactly what the OVERFLOW warning
   has been catching).

2. ``_normalize_multipart_boundary`` short-circuits its ``else: return``
   branch because the body is neither bytes nor str, so any residual
   random boundary characters in the body bytes are never rewritten.

Sync requests do not hit this code path: httpx's sync transport hands
vcrpy a single ``bytes`` body, so ``==`` works and the boundary normalizer
runs as intended. That is why
``test_openai_image_edit_litellm_sdk[True]`` records to ``entries=1`` and
replays cleanly while ``[False]`` (async) kept growing by one episode per
run.

Fix: add ``_materialize_iterable_body`` which coalesces an iterable
``request.body`` into ``bytes`` in-place. Call it from two places:

  * The top of ``_before_record_request``, so the boundary normalizer and
    the cassette serializer both see bytes from then on.
  * The top of ``_safe_body_matcher``, as defense in depth in case a
    future vcrpy code path invokes the matcher without first going through
    ``_before_record_request``.

The vcrpy ``Request`` is a wrapper used for matching and recording; the
underlying httpx transport sends its own request body separately, so
replacing the iterator on the vcrpy wrapper does not starve the live HTTP
send.

After this lands the async parametrizes should flip from
``[VCR MISS:RECORDED] entries=N+1`` to ``[VCR HIT] entries=N`` on the next
CI run, matching the sync side and dropping the residual ~$3/day to $0.

* fix(tests): handle bytes_iterator + never leave an exhausted body

Follow-up to 8e08272b.
The previous attempt at coalescing iterable request bodies bailed out
(``return`` without writing ``request.body``) whenever it could not
classify the chunk type. That was the wrong failure mode for one critical
case: vcrpy sometimes presents the body as ``iter(some_bytes)``, whose
Python type is ``bytes_iterator`` and which yields ``int`` byte values
(0-255), not byte chunks. The old code saw an ``int`` chunk, hit the
``else: return`` branch, and left ``request.body`` pointing at the
now-exhausted iterator.

The post-fix diagnostic run made this loud:

    [vcr-safe-body-matcher] request body mismatch
      body[a]: type='bytes_iterator' length=unknown sha256=N/A
      body[b]: type='bytes_iterator' length=unknown sha256=N/A

Every async image-edit test then ballooned from entries=2 to entries=10 in
that single CI run -- the exhausted iterator meant the live multipart
upload went out as an empty body, OpenAI returned 400, the SDK + flaky
retries fired, each retry got a fresh iterator that my hook exhausted
again, and ``new_episodes`` recorded each failed attempt as a new cassette
episode.

This patch:

  * Recognizes ``bytes_iterator`` (chunks are ``int``) and reconstructs
    the buffer via ``bytes(chunks)``.
  * Keeps the existing ``list_iterator``-over-bytes-chunks handling via
    ``b"".join(...)``.
  * **Always writes a bytes value back to ``request.body`` after consuming
    the iterator.** If the chunk shape is unrecognized, ``request.body``
    is set to ``b""`` rather than left as an exhausted iterator. That is
    wrong in the sense of "we lost the body" but right in the sense of
    "the failure mode is now visible (live API call sends empty body and
    fails fast) instead of invisible (corrupt cassette grows silently)".
    Combined with the matcher diagnostic, any future regression in this
    code path will surface in the CI log immediately.
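
The chunk handling listed above can be sketched in isolation. This is a
minimal illustration, not the code in this patch: ``materialize`` and
``coalesce_chunks`` are hypothetical names, the body is passed in directly
rather than read from a vcrpy ``Request``, and the ``__next__`` check is an
assumed guard so that plain containers are left alone.

```python
from typing import Any, List, Optional


def coalesce_chunks(chunks: List[Any]) -> Optional[bytes]:
    """Join already-collected chunks into bytes; None if the shape is unknown."""
    if not chunks:
        return b""
    first = chunks[0]
    try:
        if isinstance(first, int):
            # iter(b"abc") yields the ints 97, 98, 99; bytes() rebuilds the buffer.
            return bytes(chunks)
        if isinstance(first, (bytes, bytearray)):
            # A list iterator over multipart chunks: concatenate them.
            return b"".join(bytes(c) for c in chunks)
    except (TypeError, ValueError):
        return None
    return None


def materialize(body: Any) -> Any:
    """Return bytes for a single-use iterator; pass every other body through."""
    if body is None or isinstance(body, (bytes, bytearray, str)):
        return body
    if not hasattr(body, "__next__"):
        # Assumed guard: dict/list/tuple bodies are not consumed.
        return body
    out = coalesce_chunks(list(body))
    # The iterator is exhausted at this point, so always hand back bytes,
    # falling back to b"" for an unrecognized chunk shape.
    return b"" if out is None else out
```

With this sketch, ``materialize(iter(b"abc"))`` and
``materialize(iter([b"ab", b"c"]))`` both produce the same ``bytes`` value,
which is what lets two recordings of the same upload compare equal.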
Local verification covers ``bytes_iterator``, ``list_iterator`` over bytes
chunks, generator over bytes chunks, empty iterator, already-bytes
(idempotent), identical-content iterator equality in the matcher (now
matches), and differing-content iterator inequality (still raises).

* fix(tests): clear vcrpy's sticky _was_iter flag so materialized bodies stay bytes

Actual root cause of the async image-edit cassette leak. The previous
diagnostic run produced this dead giveaway:

    [vcr-episode-body-hash] ...
      episode[0]: body type='bytes_iterator' is not bytes/bytearray/str -- cannot hash
    [vcr-safe-body-matcher] request body mismatch
      body[a]: type='bytes_iterator' length=unknown sha256=N/A
      body[b]: type='bytes_iterator' length=unknown sha256=N/A

Both sides of the matcher were ``bytes_iterator`` **after** the
materializer had supposedly converted them to bytes. That made no sense
until I read vcrpy's ``Request`` class.

vcrpy's ``Request`` keeps two private flags that are set in ``__init__``
from the original body's type and **never cleared by the setter**:

    def __init__(self, method, uri, body, headers):
        self._was_file = hasattr(body, "read")
        self._was_iter = _is_nonsequence_iterator(body)
        ...

    @property
    def body(self):
        if self._was_file:
            return BytesIO(self._body)
        if self._was_iter:
            return iter(self._body)
        return self._body

    @body.setter
    def body(self, value):
        if isinstance(value, str):
            value = value.encode("utf-8")
        self._body = value   # <-- does NOT touch _was_iter / _was_file

So when httpx's async transport hands vcrpy an iterator body,
``_was_iter`` becomes ``True`` and stays there forever. Even after
``_materialize_iterable_body`` writes plain bytes via
``request.body = out``, the next read of ``.body`` re-wraps the stored
bytes in ``iter()`` -- producing a fresh ``bytes_iterator`` that compares
unequal to any other ``bytes_iterator`` via object identity.
The matcher missed every time, the cassette grew by one episode per run,
and the persister saw the same iterator type when trying to hash the body
for the diagnostic log.

Fix: after writing the materialized bytes, also force ``_was_iter`` and
``_was_file`` to ``False``. vcrpy exposes no public API for this, so we
touch the private flags directly -- acknowledged as a pragmatic test-only
hack with a clear unit boundary (the only call site is
``_materialize_iterable_body``).

Local repro reproduces the exact production setup:
``Request('POST', url, iter(b'multipart-content'), {})`` on two sides,
runs the matcher, asserts HIT. Verified the matcher hits on identical
content and still raises on differing content.

Should be the last fix needed. Existing cassettes that contain
oddly-shaped bodies (lists of int chunks, etc. from the previous
``_was_iter=True`` save path) still match because the materializer
canonicalises both sides to bytes before comparison -- no fourth re-flush
required.

* revert(tests): drop the temp per-episode body-hash diagnostic

Removed now that 1c51ad13 has confirmed the root cause (vcrpy's sticky
``_was_iter`` flag making the body getter re-wrap stored bytes in
``iter()`` on every access). The hash dump did its job -- the
post-1c51ad13 image_gen_testing run shows all five async image-edit tests
as ``[VCR HIT]`` with stable entry counts and zero billing errors -- and
is too noisy to keep on by default (over 100 lines per session at steady
state).

Kept permanently:

  * ``_safe_body_matcher`` mismatch diagnostic in
    ``_vcr_conftest_common.py``. Only fires on a body mismatch, which is
    signal worth surfacing whenever it happens.
  * ``_normalize_multipart_boundary`` "skipped" log line. Same rationale
    -- only fires when the body shape is something the normalizer cannot
    rewrite in place.
  * The ``test-results/vcr-diagnostics/.log`` per-PID file plumbing
    (``vcr_diag_write_line`` / ``emit_vcr_diagnostic_log``). Useful for
    any future diagnostic that needs to bypass xdist stdout/stderr
    capture; cheap to keep.

* chore(tests): delete unused flush script + wire VCR diagnostic dump everywhere

  * Remove ``scripts/flush_image_edit_vcr_cassettes.py``. It was a
    one-shot helper for the initial cassette flush; the iterator and
    ``_was_iter`` fixes mean no future flush should be required, and the
    script was never run anywhere (the actual flushes happened inside the
    CI conftest via the temp hacks that have since been reverted).
  * The matcher mismatch + normalizer skip diagnostics already write
    per-PID files for every suite that imports the shared VCR plumbing,
    but ``emit_vcr_diagnostic_log`` -- the controller-side dump that
    surfaces those files into the CI log at session end -- was only wired
    into ``image_gen_tests``. Add the one-line call to the 12 sibling
    conftests that already use VCR so the diagnostics surface in any
    suite's terminal output if a body matcher ever misses. No new output
    in steady state -- the dump is a no-op when no diagnostics were
    recorded that session.

* chore(tests): trim non-essential comments per project comment policy

Strips docstrings, inline comments, and block comments that this PR
introduced where the code itself was already self-evident. Keeps the few
lines that document non-obvious behaviour (raw-bytes-not-BytesIO rationale
on the image fixtures, the per-PID-files-bypass-xdist note on the
diagnostic directory). Touches only comments this PR added -- no
pre-existing comment is removed.

Net: -161 lines of comment/docstring across 3 files, no code behaviour
change.

* chore(tests): forward **kwargs in pin_httpx_multipart_boundary wrapper

Defensive against future httpx MultipartStream.__init__ adding new
optional kwargs. Without the forward, the wrapper would silently drop
them. No behaviour change today.

* chore(tests): canonicalize VCR matchers and surface shouldn't-happen branches

Bundles the "follow-up cleanup PR" into this one so it does not get lost.
Four small changes:

1. Introduce ``_canonical_body(req) -> (bytes, pre_type)`` and route
   ``_safe_body_matcher`` through it. The matcher now operates on bytes by
   construction; the "compare two iterator objects via ``==`` and silently
   get object-identity semantics" failure mode (which cost us this entire
   PR to diagnose) is structurally impossible to reintroduce.
   ``pre_type`` is the body type *before* canonicalization, surfaced by
   the mismatch diagnostic so a future regression involving a new body
   shape is still visible.

2. Add a structured diagnostic to ``_key_fingerprint_matcher``. It was
   previously raising a bare
   ``AssertionError("API key fingerprints differ")`` with zero context --
   exactly the anti-pattern the body matcher had before this PR.

3. Surface "shouldn't-happen" branches via ``vcr_diag_write_line``:

   * ``_strip_image_b64_payloads`` -- logs when ``response``,
     ``response['body']``, or ``response['body']['string']`` arrives in an
     unexpected shape (vcrpy contract violation).
   * ``_compute_key_fingerprint`` -- logs the ``"no-key"`` fallback with
     the request method/URL so a stripped-auth-header bug is visible
     instead of masked.
   * ``_canonical_body`` -- logs its own empty-bytes fallback when a body
     has a shape ``_materialize_iterable_body`` did not handle.

4. Re-introduce per-episode body-hash logging in
   ``_RedisPersister.save_cassette`` (was reverted in 927c5548 as
   "noisy"). Quantified cost: ~25 KB of CI log per session at peak,
   ~ms-scale CPU, zero output in steady state (no save = no log).
   Trade-off favours keeping it: lets two consecutive CI runs be diffed by
   body hash, which is how we will spot the next regression in the same
   class.

All call sites still work: local repro confirms iter==iter HIT,
iter!=iter raises, plain-bytes HIT, body-hash log emits via the same
per-PID file plumbing as the matcher diagnostics.
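
The object-identity failure mode from item 1, together with the sticky
flag described earlier in this message, can be reproduced without vcrpy.
The sketch below is only an illustration under stated assumptions:
``StickyRequest`` is a hypothetical stand-in that mimics the quoted
getter/setter behaviour (it is not vcrpy's ``Request``), and
``canonical_body`` is a simplified analogue of ``_canonical_body``, not the
function added by this patch.

```python
from typing import Any, Tuple


class StickyRequest:
    """Hypothetical stand-in: the iterator flag is set once and never reset."""

    def __init__(self, body: Any) -> None:
        self._was_iter = hasattr(body, "__next__")
        self._body = list(body) if self._was_iter else body

    @property
    def body(self) -> Any:
        if self._was_iter:
            return iter(self._body)  # a fresh iterator object on every read
        return self._body

    @body.setter
    def body(self, value: Any) -> None:
        self._body = value  # the flag is deliberately left untouched


def canonical_body(req: StickyRequest) -> Tuple[bytes, str]:
    """Return (body as bytes, type name of the body before canonicalization)."""
    raw = req.body
    pre_type = type(raw).__name__
    if hasattr(raw, "__next__"):
        chunks = list(raw)
        if all(isinstance(c, int) for c in chunks):
            raw = bytes(chunks)  # int chunks: rebuild the buffer
        else:
            raw = b"".join(bytes(c) for c in chunks)  # assumes bytes-like chunks
        req.body = raw
        req._was_iter = False  # clear the sticky flag so later reads stay bytes
    if raw is None:
        return b"", pre_type
    if isinstance(raw, (bytes, bytearray)):
        return bytes(raw), pre_type
    if isinstance(raw, str):
        return raw.encode("utf-8"), pre_type
    return b"", pre_type  # unrecognized shape: compare as empty bytes


a = StickyRequest(iter(b"multipart-content"))
b = StickyRequest(iter(b"multipart-content"))
raw_equal = a.body == b.body  # two distinct iterator objects
canon_equal = canonical_body(a)[0] == canonical_body(b)[0]
```

Here ``raw_equal`` is ``False`` even though both requests carry the same
content, while ``canon_equal`` is ``True``. Returning the pre-canonical type
name alongside the bytes keeps the original body shape available for a
mismatch diagnostic.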
* chore(tests): symmetrize diag-log cleanup across every VCR-using conftest

``image_gen_tests/conftest.py`` was the only suite that cleared
``test-results/vcr-diagnostics/*.log`` at session start. The other 12
VCR-using conftests inherited any stale per-PID logs from a previous local
run and would dump them in the terminal summary -- harmless in CI (fresh
container) but confusing locally when running multiple suites in sequence.

Extracts the cleanup into a ``reset_vcr_diag_dir`` helper in
``tests/_vcr_conftest_common.py`` and calls it from every VCR-using
conftest's ``pytest_configure``. Same single source of truth, no inline
duplication.

* fix(tests): gate body materialization on __next__ and strip PR comments

aiohttp/vcrpy stores the json kwarg as a dict; _materialize_iterable_body
was iterating it via __iter__ and joining the keys, replacing the request
body with concatenated key names ("textlanguageentities"). Gate on
__next__ so containers (dict/list/tuple) are left alone -- only single-use
iterators like httpx's bytes_iterator / list_iterator are materialized.
Log diagnostic line when chunk type is unrecognized.

* fix(tests): JSON-encode dict bodies in canonical_body for stable matching

aiohttp stubs store the json kwarg as a dict; the fallback that compared
all dicts as b"" caused concurrent presidio analyze calls to be served the
wrong cassette episode. JSON-encode with sort_keys for stable bytes.

* fix(tests): guard emit_vcr_diagnostic_log against multi-conftest re-emission

Co-authored-by: Yassin Kortam

* fix(tests): globalize multipart-boundary pin + stabilize whisper fixtures

Diagnostic shows audio_testing was silently re-recording 50+ live Whisper
episodes per CI run (over MAX_EPISODES_PER_CASSETTE, so the persister
refused to save). Two changes:

  * Move the session-autouse _pin_multipart_boundary fixture into the
    shared _vcr_conftest_common module so every VCR-using suite picks it
    up via a single import.
    image_gen had it inline; the other 12 suites silently lacked it.
  * Replace the module-level open("rb") audio file handles in test_whisper
    with cached bytes + a per-call (filename, bytes, mimetype) tuple,
    mirroring the image_edits raw-bytes pattern. Stops the
    file-pointer-at-EOF bug where the second test got an empty multipart
    body.

* chore(tests): drop per-episode body-hash dump and redundant emit guard

---------

Co-authored-by: shin-berri
Co-authored-by: yuneng-jiang
Co-authored-by: Cursor Agent
Co-authored-by: Yassin Kortam
---
 tests/_vcr_conftest_common.py               | 269 ++++++++++++++++++--
 tests/audio_tests/conftest.py               |   7 +-
 tests/audio_tests/test_whisper.py           |  43 ++--
 tests/guardrails_tests/conftest.py          |   7 +-
 tests/image_gen_tests/conftest.py           |   7 +-
 tests/image_gen_tests/test_image_edits.py   |  37 +--
 tests/litellm_utils_tests/conftest.py       |   7 +-
 tests/llm_responses_api_testing/conftest.py |   7 +-
 tests/llm_translation/conftest.py           |   7 +-
 tests/local_testing/conftest.py             |   7 +-
 tests/logging_callback_tests/conftest.py    |   7 +-
 tests/ocr_tests/conftest.py                 |   7 +-
 tests/pass_through_unit_tests/conftest.py   |   7 +-
 tests/router_unit_tests/conftest.py         |   7 +-
 tests/search_tests/conftest.py              |   7 +-
 tests/unified_google_tests/conftest.py      |   7 +-
 16 files changed, 366 insertions(+), 74 deletions(-)

diff --git a/tests/_vcr_conftest_common.py b/tests/_vcr_conftest_common.py index a179a21ba6..cb43f1abbd 100644 --- a/tests/_vcr_conftest_common.py +++ b/tests/_vcr_conftest_common.py @@ -36,6 +36,75 @@ SAFE_BODY_MATCHER_NAME = "safe_body" KEY_FINGERPRINT_MATCHER_NAME = "key_fingerprint" KEY_FINGERPRINT_HEADER = "x-litellm-key-fp" +VCR_DIAG_DIR_ENV = "LITELLM_VCR_DIAG_DIR" +VCR_DIAG_DIR_DEFAULT = "test-results/vcr-diagnostics" + + +def _vcr_diag_dir() -> str: + return os.environ.get(VCR_DIAG_DIR_ENV) or VCR_DIAG_DIR_DEFAULT + + +def vcr_diag_write_line(msg: str) -> None: + try: + directory = _vcr_diag_dir() + os.makedirs(directory, exist_ok=True) + path = os.path.join(directory, 
f"{os.getpid()}.log") + with open(path, "a", encoding="utf-8") as fh: + fh.write(msg.rstrip("\n") + "\n") + except OSError: + pass + + +def reset_vcr_diag_dir() -> None: + if os.environ.get("PYTEST_XDIST_WORKER"): + return + directory = _vcr_diag_dir() + if not os.path.isdir(directory): + return + try: + names = os.listdir(directory) + except OSError: + return + for name in names: + if name.endswith(".log"): + try: + os.remove(os.path.join(directory, name)) + except OSError: + pass + + +def emit_vcr_diagnostic_log(terminalreporter) -> None: + directory = _vcr_diag_dir() + if not os.path.isdir(directory): + return + try: + files = sorted(f for f in os.listdir(directory) if f.endswith(".log")) + except OSError: + return + if not files: + return + terminalreporter.write_sep("=", "VCR DIAGNOSTIC LOG", bold=True) + terminalreporter.write_line( + f" source dir: {directory} (also archived as a CI artifact)" + ) + for name in files: + path = os.path.join(directory, name) + try: + with open(path, "r", encoding="utf-8") as fh: + content = fh.read() + except OSError as exc: + terminalreporter.write_line( + f" [failed to read {name}: {type(exc).__name__}: {exc}]" + ) + continue + if not content.strip(): + continue + terminalreporter.write_sep("-", name, bold=False) + for line in content.splitlines(): + terminalreporter.write_line(line) + terminalreporter.write_sep("=", bold=True) + + # Intentionally narrower than ``FILTERED_REQUEST_HEADERS``: AWS SigV4 headers # carry secrets but their values rotate on every call, so fingerprinting them # would defeat caching. 
@@ -91,6 +160,32 @@ VCR_IMAGE_B64_PLACEHOLDER = "dGVzdA==" VCR_FIXED_MULTIPART_BOUNDARY = "vcr-static-boundary" +def pin_httpx_multipart_boundary(monkeypatch) -> None: + try: + import httpx._multipart as _httpx_multipart + except ImportError: + return + + _original_init = _httpx_multipart.MultipartStream.__init__ + + def _init_with_fixed_boundary(self, data, files, boundary=None, **kwargs): + if boundary is None: + boundary = VCR_FIXED_MULTIPART_BOUNDARY.encode("ascii") + return _original_init(self, data=data, files=files, boundary=boundary, **kwargs) + + monkeypatch.setattr( + _httpx_multipart.MultipartStream, "__init__", _init_with_fixed_boundary + ) + + +@pytest.fixture(scope="session", autouse=True) +def _pin_multipart_boundary(): + monkeypatch = pytest.MonkeyPatch() + pin_httpx_multipart_boundary(monkeypatch) + yield + monkeypatch.undo() + + def _scrub_response(response): if not isinstance(response, dict): return response @@ -139,9 +234,17 @@ def _strip_image_b64_payloads(response): preserves all those checks while shrinking cassettes by ~99%. 
""" if not isinstance(response, dict): + vcr_diag_write_line( + f"[vcr-strip-b64] response is {type(response).__name__!r}, not " + "dict; skipping b64 scrub" + ) return response body = response.get("body") if not isinstance(body, dict): + vcr_diag_write_line( + f"[vcr-strip-b64] response['body'] is {type(body).__name__!r}, " + "not dict; skipping b64 scrub" + ) return response raw = body.get("string") if raw is None: @@ -151,12 +254,20 @@ def _strip_image_b64_payloads(response): try: text = bytes(raw).decode("utf-8") except UnicodeDecodeError: + vcr_diag_write_line( + "[vcr-strip-b64] response body bytes are not valid UTF-8; " + "skipping b64 scrub" + ) return response was_bytes = True elif isinstance(raw, str): text = raw was_bytes = False else: + vcr_diag_write_line( + f"[vcr-strip-b64] response['body']['string'] is " + f"{type(raw).__name__!r}, not bytes/str; skipping b64 scrub" + ) return response try: @@ -186,6 +297,35 @@ def _before_record_response(response): return filter_non_2xx_response(_scrub_response(_strip_image_b64_payloads(response))) +def _canonical_body(request) -> tuple[bytes, str]: + pre_type = type(getattr(request, "body", None)).__name__ + _materialize_iterable_body(request) + body = getattr(request, "body", None) + if body is None: + return b"", pre_type + if isinstance(body, bytes): + return body, pre_type + if isinstance(body, bytearray): + return bytes(body), pre_type + if isinstance(body, str): + return body.encode("utf-8"), pre_type + if isinstance(body, (dict, list)): + try: + return ( + json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8"), + pre_type, + ) + except (TypeError, ValueError): + pass + method = getattr(request, "method", "?") + uri = getattr(request, "uri", getattr(request, "url", "?")) + vcr_diag_write_line( + f"[vcr-canonical-body] FALLBACK: {method} {uri} body type " + f"{type(body).__name__!r} not coerced to bytes; comparing as b''" + ) + return b"", pre_type + + def _safe_body_matcher(r1, r2) -> None: 
"""Compare request bodies as bytes; never invokes ``json.loads``. @@ -195,27 +335,47 @@ def _safe_body_matcher(r1, r2) -> None: This matcher is strictly more conservative — the only equivalence it gives up vs. the default is "JSON key order doesn't matter". """ - body1 = getattr(r1, "body", None) - body2 = getattr(r2, "body", None) + body1, pre1 = _canonical_body(r1) + body2, pre2 = _canonical_body(r2) if body1 == body2: return - - def _to_bytes(b): - if b is None: - return b"" - if isinstance(b, bytes): - return b - if isinstance(b, str): - return b.encode("utf-8") - return None - - n1 = _to_bytes(body1) - n2 = _to_bytes(body2) - if n1 is not None and n2 is not None and n1 == n2: - return + _emit_body_mismatch_diagnostic(r1, r2, body1, body2, pre1, pre2) raise AssertionError("request bodies differ") +def _emit_body_mismatch_diagnostic(r1, r2, body1, body2, pre1, pre2) -> None: + def _describe(label, asbytes, pre_type): + return ( + f" {label}: pre_canonical_type={pre_type!r} length={len(asbytes)} " + f"sha256={hashlib.sha256(asbytes).hexdigest()} " + f"preview={asbytes[:120]!r}" + ) + + method_a = getattr(r1, "method", "?") + method_b = getattr(r2, "method", "?") + url_a = getattr(r1, "uri", getattr(r1, "url", "?")) + url_b = getattr(r2, "uri", getattr(r2, "url", "?")) + lines = [ + "[vcr-safe-body-matcher] request body mismatch", + f" request[a]: {method_a} {url_a}", + f" request[b]: {method_b} {url_b}", + _describe("body[a]", body1, pre1), + _describe("body[b]", body2, pre2), + ] + if body1 != body2: + offset = next( + (i for i in range(min(len(body1), len(body2))) if body1[i] != body2[i]), + min(len(body1), len(body2)), + ) + start = max(0, offset - 100) + end_a = min(len(body1), offset + 100) + end_b = min(len(body2), offset + 100) + lines.append(f" first divergent byte offset: {offset}") + lines.append(f" window[a] @ {start}..{end_a}: {body1[start:end_a]!r}") + lines.append(f" window[b] @ {start}..{end_b}: {body2[start:end_b]!r}") + 
vcr_diag_write_line("\n".join(lines)) + + def _iter_header_values(headers, name: str): if headers is None: return @@ -271,6 +431,13 @@ def _compute_key_fingerprint(request) -> str: stable = _stable_key_value(header_name, text) parts.append(f"{header_name}={stable}") if not parts: + method = getattr(request, "method", "?") + uri = getattr(request, "uri", getattr(request, "url", "?")) + vcr_diag_write_line( + f"[vcr-key-fingerprint] no API key header found on {method} " + f"{uri}; falling back to 'no-key'. If this request should have " + "carried auth, something earlier in the pipeline stripped it." + ) return "no-key" digest = hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest() return digest[:16] @@ -360,6 +527,13 @@ def _normalize_multipart_boundary(request) -> None: elif isinstance(body, str): new_body = body.replace(current_boundary, VCR_FIXED_MULTIPART_BOUNDARY) else: + vcr_diag_write_line( + f"[vcr-multipart-normalize] body normalization SKIPPED: " + f"body type {type(body).__name__!r} is not bytes/bytearray/str. " + f"content-type={content_type_value!r}. " + f"Recorded body will retain the random boundary substring " + f"and the safe_body matcher will miss on the next run." 
+ ) return try: @@ -389,6 +563,7 @@ def _before_record_request(request): headers = getattr(request, "headers", None) if headers is None: return request + _materialize_iterable_body(request) if not any(_iter_header_values(headers, KEY_FINGERPRINT_HEADER)): fingerprint = _compute_key_fingerprint(request) try: @@ -400,6 +575,56 @@ def _before_record_request(request): return request +def _materialize_iterable_body(request) -> None: + body = getattr(request, "body", None) + if body is None or isinstance(body, (bytes, bytearray, str)): + return + if not hasattr(body, "__next__"): + return + try: + chunks = list(body) + except TypeError: + return + + out = _coalesce_chunks_to_bytes(chunks) + if out is None: + method = getattr(request, "method", "?") + uri = getattr(request, "uri", getattr(request, "url", "?")) + first_type = type(chunks[0]).__name__ if chunks else "empty" + vcr_diag_write_line( + f"[vcr-materialize] FALLBACK: {method} {uri} chunk type " + f"{first_type!r} not coerced to bytes; storing b''" + ) + out = b"" + + try: + request.body = out + except (AttributeError, TypeError): + pass + + for attr in ("_was_iter", "_was_file"): + try: + setattr(request, attr, False) + except (AttributeError, TypeError): + pass + + +def _coalesce_chunks_to_bytes(chunks): + if not chunks: + return b"" + first = chunks[0] + try: + if isinstance(first, int): + return bytes(chunks) + if isinstance(first, (bytes, bytearray)): + return b"".join(c if isinstance(c, bytes) else bytes(c) for c in chunks) + if isinstance(first, str): + return "".join(chunks).encode("utf-8") + except (TypeError, ValueError): + return None + return None + + def _key_fingerprint_matcher(r1, r2) -> None: def _fp(req): for value in _iter_header_values( @@ -410,7 +635,17 @@ def _key_fingerprint_matcher(r1, r2) -> None: return value if isinstance(value, str) else str(value) return "no-key" - if _fp(r1) != _fp(r2): + fp1, fp2 = _fp(r1), _fp(r2) + if fp1 != fp2: + method_a = getattr(r1, "method", "?") + method_b = 
getattr(r2, "method", "?") + url_a = getattr(r1, "uri", getattr(r1, "url", "?")) + url_b = getattr(r2, "uri", getattr(r2, "url", "?")) + vcr_diag_write_line( + "[vcr-key-fingerprint-matcher] API key fingerprints differ\n" + f" request[a]: {method_a} {url_a} fingerprint={fp1!r}\n" + f" request[b]: {method_b} {url_b} fingerprint={fp2!r}" + ) raise AssertionError("API key fingerprints differ") diff --git a/tests/audio_tests/conftest.py b/tests/audio_tests/conftest.py index ff47853d49..c4ff576e5b 100644 --- a/tests/audio_tests/conftest.py +++ b/tests/audio_tests/conftest.py @@ -5,14 +5,17 @@ import pytest sys.path.insert(0, os.path.abspath("../..")) -from tests._vcr_conftest_common import ( # noqa: E402 +from tests._vcr_conftest_common import ( # noqa: E402,F401 VerboseReporterState, + _pin_multipart_boundary, apply_vcr_auto_marker_to_items, emit_cassette_cache_session_banner, emit_vcr_classification_summary, + emit_vcr_diagnostic_log, install_live_call_probe, record_vcr_outcome, register_persister_if_enabled, + reset_vcr_diag_dir, vcr_config_dict, ) @@ -44,6 +47,7 @@ def _vcr_outcome_gate(request, vcr): def pytest_configure(config): _verbose_state.remember_pluginmanager(config) + reset_vcr_diag_dir() def pytest_runtest_logreport(report): @@ -57,3 +61,4 @@ def pytest_collection_modifyitems(config, items): def pytest_terminal_summary(terminalreporter, exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter) diff --git a/tests/audio_tests/test_whisper.py b/tests/audio_tests/test_whisper.py index cdf079f8cb..243d27614b 100644 --- a/tests/audio_tests/test_whisper.py +++ b/tests/audio_tests/test_whisper.py @@ -23,12 +23,21 @@ pwd = os.path.dirname(os.path.realpath(__file__)) print(pwd) file_path = os.path.join(pwd, "gettysburg.wav") - -audio_file = open(file_path, "rb") - - file2_path = os.path.join(pwd, "eagle.wav") -audio_file2 = open(file2_path, "rb") + +with 
open(file_path, "rb") as _f: + _GETTYSBURG_BYTES = _f.read() +with open(file2_path, "rb") as _f: + _EAGLE_BYTES = _f.read() + + +def _audio_file(): + return ("gettysburg.wav", _GETTYSBURG_BYTES, "audio/wav") + + +def _audio_file2(): + return ("eagle.wav", _EAGLE_BYTES, "audio/wav") + load_dotenv() @@ -44,7 +53,7 @@ async def _run_transcription( ): transcript = await litellm.atranscription( model=model, - file=audio_file, + file=_audio_file(), api_key=api_key, api_base=api_base, response_format=response_format, @@ -101,7 +110,7 @@ async def test_transcription_caching(): response_1 = await litellm.atranscription( model="whisper-1", - file=audio_file, + file=_audio_file(), ) await asyncio.sleep(5) @@ -110,7 +119,7 @@ async def test_transcription_caching(): response_2 = await litellm.atranscription( model="whisper-1", - file=audio_file, + file=_audio_file(), ) print("response_1", response_1) @@ -122,7 +131,7 @@ async def test_transcription_caching(): response_3 = await litellm.atranscription( model="whisper-1", - file=audio_file2, + file=_audio_file2(), ) print("response_3", response_3) print("response3 hidden params", response_3._hidden_params) @@ -146,7 +155,7 @@ async def test_whisper_log_pre_call(): with patch.object(custom_logger, "log_pre_api_call") as mock_log_pre_call: await litellm.atranscription( model="whisper-1", - file=audio_file, + file=_audio_file(), ) mock_log_pre_call.assert_called_once() @@ -165,7 +174,7 @@ async def test_whisper_log_pre_call(): with patch.object(custom_logger, "log_pre_api_call") as mock_log_pre_call: await litellm.atranscription( model="whisper-1", - file=audio_file, + file=_audio_file(), ) mock_log_pre_call.assert_called_once() @@ -177,7 +186,7 @@ async def test_gpt_4o_transcribe(): from unittest.mock import patch, MagicMock await litellm.atranscription( - model="openai/gpt-4o-transcribe", file=audio_file, response_format="json" + model="openai/gpt-4o-transcribe", file=_audio_file(), response_format="json" ) @@ -187,7 +196,9 @@ 
async def test_gpt_4o_transcribe_model_mapping(): # Test GPT-4o mini transcribe response = await litellm.atranscription( - model="openai/gpt-4o-mini-transcribe", file=audio_file, response_format="json" + model="openai/gpt-4o-mini-transcribe", + file=_audio_file(), + response_format="json", ) # Check that the response contains the correct model in hidden params @@ -198,7 +209,7 @@ async def test_gpt_4o_transcribe_model_mapping(): # Test GPT-4o transcribe response2 = await litellm.atranscription( - model="openai/gpt-4o-transcribe", file=audio_file, response_format="json" + model="openai/gpt-4o-transcribe", file=_audio_file(), response_format="json" ) # Check that the response contains the correct model in hidden params @@ -209,7 +220,7 @@ async def test_gpt_4o_transcribe_model_mapping(): # Test traditional whisper-1 still works response3 = await litellm.atranscription( - model="openai/whisper-1", file=audio_file, response_format="json" + model="openai/whisper-1", file=_audio_file(), response_format="json" ) # Check that the response contains the correct model in hidden params @@ -262,7 +273,7 @@ async def test_azure_transcribe_model_mapping(): # Make the transcription call response = await litellm.atranscription( model="azure/whisper-1", - file=audio_file, + file=_audio_file(), response_format="json", api_key="test-api-key", api_base="https://my-endpoint-europe-berri-992.openai.azure.com/", diff --git a/tests/guardrails_tests/conftest.py b/tests/guardrails_tests/conftest.py index eb563699b2..f2f65645c3 100644 --- a/tests/guardrails_tests/conftest.py +++ b/tests/guardrails_tests/conftest.py @@ -16,14 +16,17 @@ sys.path.insert( ) # Adds the parent directory to the system path import litellm -from tests._vcr_conftest_common import ( # noqa: E402 +from tests._vcr_conftest_common import ( # noqa: E402,F401 VerboseReporterState, + _pin_multipart_boundary, apply_vcr_auto_marker_to_items, emit_cassette_cache_session_banner, emit_vcr_classification_summary, + 
emit_vcr_diagnostic_log, install_live_call_probe, record_vcr_outcome, register_persister_if_enabled, + reset_vcr_diag_dir, vcr_config_dict, ) @@ -55,6 +58,7 @@ def _vcr_outcome_gate(request, vcr): def pytest_configure(config): _verbose_state.remember_pluginmanager(config) + reset_vcr_diag_dir() def pytest_runtest_logreport(report): @@ -160,3 +164,4 @@ def pytest_collection_modifyitems(config, items): def pytest_terminal_summary(terminalreporter, exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter) diff --git a/tests/image_gen_tests/conftest.py b/tests/image_gen_tests/conftest.py index 93dec98e70..9f808c1116 100644 --- a/tests/image_gen_tests/conftest.py +++ b/tests/image_gen_tests/conftest.py @@ -9,14 +9,17 @@ sys.path.insert( ) # Adds the parent directory to the system path import litellm # noqa: E402,F401 -from tests._vcr_conftest_common import ( # noqa: E402 +from tests._vcr_conftest_common import ( # noqa: E402,F401 VerboseReporterState, + _pin_multipart_boundary, apply_vcr_auto_marker_to_items, emit_cassette_cache_session_banner, emit_vcr_classification_summary, + emit_vcr_diagnostic_log, install_live_call_probe, record_vcr_outcome, register_persister_if_enabled, + reset_vcr_diag_dir, vcr_config_dict, ) @@ -58,6 +61,7 @@ def _vcr_outcome_gate(request, vcr): def pytest_configure(config): _verbose_state.remember_pluginmanager(config) + reset_vcr_diag_dir() def pytest_runtest_logreport(report): @@ -71,3 +75,4 @@ def pytest_collection_modifyitems(config, items): def pytest_terminal_summary(terminalreporter, exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter) diff --git a/tests/image_gen_tests/test_image_edits.py b/tests/image_gen_tests/test_image_edits.py index 656b8a6911..ca8ec3bbe3 100644 --- a/tests/image_gen_tests/test_image_edits.py 
+++ b/tests/image_gen_tests/test_image_edits.py
@@ -103,12 +103,6 @@ class BaseLLMImageEditTest(ABC):
 pwd = os.path.dirname(os.path.realpath(__file__))
 
 
-# Image fixtures must be regenerated per access — module-level
-# ``open(...)`` handles get consumed after a single multipart upload, leaving
-# subsequent tests in the same process to send empty bodies. That non-determinism
-# (a) blows the recorded cassette past ``MAX_EPISODES_PER_CASSETTE`` so the
-# persister refuses to save (see ``tests/_vcr_redis_persister.py``), and
-# (b) re-bills the live image edit endpoint on every CI run.
 def _read_image_bytes(filename: str) -> bytes:
     with open(os.path.join(pwd, filename), "rb") as f:
         return f.read()
@@ -119,32 +113,20 @@ _LITELLM_SITE_BYTES = _read_image_bytes("litellm_site.png")
 
 
 def _make_test_images() -> list:
-    """Return a fresh pair of image streams seeded with the fixture bytes.
+    return [_ISHAAN_GITHUB_BYTES, _LITELLM_SITE_BYTES]
 
-    Use this everywhere you'd previously have used the module-level
-    ``TEST_IMAGES``. Each call returns brand new ``BytesIO`` objects whose
-    file pointers start at 0, so multipart uploads encode the full image
-    bytes on every test invocation. Parametrized and ``flaky``-retried
-    test methods call ``get_base_image_edit_call_args`` once per
-    invocation, so a fresh stream per call is sufficient — the factory
-    must not auto-rewind on EOF or the SDK's multipart writer will read
-    the same bytes forever (worker OOM).
-    """
+
+def _make_single_test_image() -> bytes:
+    return _ISHAAN_GITHUB_BYTES
+
+
+def get_test_images_as_bytesio():
     return [
         BytesIO(_ISHAAN_GITHUB_BYTES),
         BytesIO(_LITELLM_SITE_BYTES),
     ]
 
 
-def _make_single_test_image() -> BytesIO:
-    return BytesIO(_ISHAAN_GITHUB_BYTES)
-
-
-def get_test_images_as_bytesio():
-    """Helper function to get test images as BytesIO objects"""
-    return _make_test_images()
-
-
 class TestOpenAIImageEditGPTImage1(BaseLLMImageEditTest):
     """
     Concrete implementation of BaseLLMImageEditTest for OpenAI image edits.
@@ -710,10 +692,9 @@ async def test_multiple_image_edit_with_different_formats():
     try:
         prompt = "Create a cohesive artistic style across all images"
 
-        # Test with mixed BytesIO and file objects
         mixed_images = [
-            _make_single_test_image(),  # File object
-            get_test_images_as_bytesio()[1],  # BytesIO object
+            _make_single_test_image(),
+            get_test_images_as_bytesio()[1],
         ]
 
         result = await aimage_edit(
diff --git a/tests/litellm_utils_tests/conftest.py b/tests/litellm_utils_tests/conftest.py
index 08745c99c0..418ee76a39 100644
--- a/tests/litellm_utils_tests/conftest.py
+++ b/tests/litellm_utils_tests/conftest.py
@@ -12,14 +12,17 @@ sys.path.insert(
 )  # Adds the parent directory to the system path
 
 import litellm  # noqa: E402,F401
-from tests._vcr_conftest_common import (  # noqa: E402
+from tests._vcr_conftest_common import (  # noqa: E402,F401
     VerboseReporterState,
+    _pin_multipart_boundary,
     apply_vcr_auto_marker_to_items,
     emit_cassette_cache_session_banner,
     emit_vcr_classification_summary,
+    emit_vcr_diagnostic_log,
     install_live_call_probe,
     record_vcr_outcome,
     register_persister_if_enabled,
+    reset_vcr_diag_dir,
     vcr_config_dict,
 )
 
@@ -86,6 +89,7 @@ def _vcr_outcome_gate(request, vcr):
 
 def pytest_configure(config):
     _verbose_state.remember_pluginmanager(config)
+    reset_vcr_diag_dir()
 
 
 def pytest_runtest_logreport(report):
@@ -116,3 +120,4 @@ def pytest_collection_modifyitems(config, items):
 def pytest_terminal_summary(terminalreporter,
exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter) diff --git a/tests/llm_responses_api_testing/conftest.py b/tests/llm_responses_api_testing/conftest.py index 2a08db5714..1928b540da 100644 --- a/tests/llm_responses_api_testing/conftest.py +++ b/tests/llm_responses_api_testing/conftest.py @@ -13,14 +13,17 @@ sys.path.insert( import litellm # noqa: E402 -from tests._vcr_conftest_common import ( # noqa: E402 +from tests._vcr_conftest_common import ( # noqa: E402,F401 VerboseReporterState, + _pin_multipart_boundary, apply_vcr_auto_marker_to_items, emit_cassette_cache_session_banner, emit_vcr_classification_summary, + emit_vcr_diagnostic_log, install_live_call_probe, record_vcr_outcome, register_persister_if_enabled, + reset_vcr_diag_dir, vcr_config_dict, ) @@ -52,6 +55,7 @@ def _vcr_outcome_gate(request, vcr): def pytest_configure(config): _verbose_state.remember_pluginmanager(config) + reset_vcr_diag_dir() def pytest_runtest_logreport(report): @@ -116,3 +120,4 @@ def pytest_collection_modifyitems(config, items): def pytest_terminal_summary(terminalreporter, exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter) diff --git a/tests/llm_translation/conftest.py b/tests/llm_translation/conftest.py index 5fcd31aa32..d346dae430 100644 --- a/tests/llm_translation/conftest.py +++ b/tests/llm_translation/conftest.py @@ -18,14 +18,17 @@ sys.path.insert( import litellm # noqa: E402 -from tests._vcr_conftest_common import ( # noqa: E402 +from tests._vcr_conftest_common import ( # noqa: E402,F401 VerboseReporterState, + _pin_multipart_boundary, apply_vcr_auto_marker_to_items, emit_cassette_cache_session_banner, emit_vcr_classification_summary, + emit_vcr_diagnostic_log, install_live_call_probe, record_vcr_outcome, register_persister_if_enabled, + 
reset_vcr_diag_dir, vcr_config_dict, ) @@ -73,6 +76,7 @@ def _vcr_outcome_gate(request, vcr): def pytest_configure(config): _verbose_state.remember_pluginmanager(config) + reset_vcr_diag_dir() def pytest_runtest_logreport(report): @@ -82,6 +86,7 @@ def pytest_runtest_logreport(report): def pytest_terminal_summary(terminalreporter, exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter) # --------------------------------------------------------------------------- diff --git a/tests/local_testing/conftest.py b/tests/local_testing/conftest.py index 0ff7dff668..6a746041f1 100644 --- a/tests/local_testing/conftest.py +++ b/tests/local_testing/conftest.py @@ -22,14 +22,17 @@ sys.path.insert( ) # Adds the parent directory to the system path import litellm -from tests._vcr_conftest_common import ( # noqa: E402 +from tests._vcr_conftest_common import ( # noqa: E402,F401 VerboseReporterState, + _pin_multipart_boundary, apply_vcr_auto_marker_to_items, emit_cassette_cache_session_banner, emit_vcr_classification_summary, + emit_vcr_diagnostic_log, install_live_call_probe, record_vcr_outcome, register_persister_if_enabled, + reset_vcr_diag_dir, vcr_config_dict, ) @@ -84,6 +87,7 @@ def _vcr_outcome_gate(request, vcr): def pytest_configure(config): _verbose_state.remember_pluginmanager(config) + reset_vcr_diag_dir() def pytest_runtest_logreport(report): @@ -93,6 +97,7 @@ def pytest_runtest_logreport(report): def pytest_terminal_summary(terminalreporter, exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter) # --------------------------------------------------------------------------- diff --git a/tests/logging_callback_tests/conftest.py b/tests/logging_callback_tests/conftest.py index cdb9200bc8..6dde85f2ca 100644 --- a/tests/logging_callback_tests/conftest.py 
+++ b/tests/logging_callback_tests/conftest.py @@ -19,14 +19,17 @@ sys.path.insert( ) # Adds the parent directory to the system path import litellm -from tests._vcr_conftest_common import ( # noqa: E402 +from tests._vcr_conftest_common import ( # noqa: E402,F401 VerboseReporterState, + _pin_multipart_boundary, apply_vcr_auto_marker_to_items, emit_cassette_cache_session_banner, emit_vcr_classification_summary, + emit_vcr_diagnostic_log, install_live_call_probe, record_vcr_outcome, register_persister_if_enabled, + reset_vcr_diag_dir, vcr_config_dict, ) @@ -79,6 +82,7 @@ def _vcr_outcome_gate(request, vcr): def pytest_configure(config): _verbose_state.remember_pluginmanager(config) + reset_vcr_diag_dir() def pytest_runtest_logreport(report): @@ -229,3 +233,4 @@ def pytest_collection_modifyitems(config, items): def pytest_terminal_summary(terminalreporter, exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter) diff --git a/tests/ocr_tests/conftest.py b/tests/ocr_tests/conftest.py index 66970b8579..94790bd7aa 100644 --- a/tests/ocr_tests/conftest.py +++ b/tests/ocr_tests/conftest.py @@ -12,14 +12,17 @@ import pytest sys.path.insert(0, os.path.abspath("../..")) -from tests._vcr_conftest_common import ( # noqa: E402 +from tests._vcr_conftest_common import ( # noqa: E402,F401 VerboseReporterState, + _pin_multipart_boundary, apply_vcr_auto_marker_to_items, emit_cassette_cache_session_banner, emit_vcr_classification_summary, + emit_vcr_diagnostic_log, install_live_call_probe, record_vcr_outcome, register_persister_if_enabled, + reset_vcr_diag_dir, vcr_config_dict, ) @@ -51,6 +54,7 @@ def _vcr_outcome_gate(request, vcr): def pytest_configure(config): _verbose_state.remember_pluginmanager(config) + reset_vcr_diag_dir() def pytest_runtest_logreport(report): @@ -64,3 +68,4 @@ def pytest_collection_modifyitems(config, items): def 
pytest_terminal_summary(terminalreporter, exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter) diff --git a/tests/pass_through_unit_tests/conftest.py b/tests/pass_through_unit_tests/conftest.py index 42a95343eb..390e14b7f1 100644 --- a/tests/pass_through_unit_tests/conftest.py +++ b/tests/pass_through_unit_tests/conftest.py @@ -5,14 +5,17 @@ import pytest sys.path.insert(0, os.path.abspath("../..")) -from tests._vcr_conftest_common import ( # noqa: E402 +from tests._vcr_conftest_common import ( # noqa: E402,F401 VerboseReporterState, + _pin_multipart_boundary, apply_vcr_auto_marker_to_items, emit_cassette_cache_session_banner, emit_vcr_classification_summary, + emit_vcr_diagnostic_log, install_live_call_probe, record_vcr_outcome, register_persister_if_enabled, + reset_vcr_diag_dir, vcr_config_dict, ) @@ -56,6 +59,7 @@ def _vcr_outcome_gate(request, vcr): def pytest_configure(config): _verbose_state.remember_pluginmanager(config) + reset_vcr_diag_dir() def pytest_runtest_logreport(report): @@ -71,3 +75,4 @@ def pytest_collection_modifyitems(config, items): def pytest_terminal_summary(terminalreporter, exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter) diff --git a/tests/router_unit_tests/conftest.py b/tests/router_unit_tests/conftest.py index fe976515c9..6a8f3e589f 100644 --- a/tests/router_unit_tests/conftest.py +++ b/tests/router_unit_tests/conftest.py @@ -12,14 +12,17 @@ sys.path.insert( ) # Adds the parent directory to the system path import litellm # noqa: E402,F401 -from tests._vcr_conftest_common import ( # noqa: E402 +from tests._vcr_conftest_common import ( # noqa: E402,F401 VerboseReporterState, + _pin_multipart_boundary, apply_vcr_auto_marker_to_items, emit_cassette_cache_session_banner, emit_vcr_classification_summary, + 
emit_vcr_diagnostic_log, install_live_call_probe, record_vcr_outcome, register_persister_if_enabled, + reset_vcr_diag_dir, vcr_config_dict, ) @@ -97,6 +100,7 @@ def _vcr_outcome_gate(request, vcr): def pytest_configure(config): _verbose_state.remember_pluginmanager(config) + reset_vcr_diag_dir() def pytest_runtest_logreport(report): @@ -123,3 +127,4 @@ def pytest_collection_modifyitems(config, items): def pytest_terminal_summary(terminalreporter, exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter) diff --git a/tests/search_tests/conftest.py b/tests/search_tests/conftest.py index e06d3e95ee..78ba19a772 100644 --- a/tests/search_tests/conftest.py +++ b/tests/search_tests/conftest.py @@ -13,14 +13,17 @@ import pytest sys.path.insert(0, os.path.abspath("../..")) -from tests._vcr_conftest_common import ( # noqa: E402 +from tests._vcr_conftest_common import ( # noqa: E402,F401 VerboseReporterState, + _pin_multipart_boundary, apply_vcr_auto_marker_to_items, emit_cassette_cache_session_banner, emit_vcr_classification_summary, + emit_vcr_diagnostic_log, install_live_call_probe, record_vcr_outcome, register_persister_if_enabled, + reset_vcr_diag_dir, vcr_config_dict, ) @@ -52,6 +55,7 @@ def _vcr_outcome_gate(request, vcr): def pytest_configure(config): _verbose_state.remember_pluginmanager(config) + reset_vcr_diag_dir() def pytest_runtest_logreport(report): @@ -65,3 +69,4 @@ def pytest_collection_modifyitems(config, items): def pytest_terminal_summary(terminalreporter, exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter) diff --git a/tests/unified_google_tests/conftest.py b/tests/unified_google_tests/conftest.py index d28f89a77b..5b4f57b803 100644 --- a/tests/unified_google_tests/conftest.py +++ b/tests/unified_google_tests/conftest.py @@ -12,14 
+12,17 @@ sys.path.insert( ) # Adds the parent directory to the system path import litellm # noqa: E402,F401 -from tests._vcr_conftest_common import ( # noqa: E402 +from tests._vcr_conftest_common import ( # noqa: E402,F401 VerboseReporterState, + _pin_multipart_boundary, apply_vcr_auto_marker_to_items, emit_cassette_cache_session_banner, emit_vcr_classification_summary, + emit_vcr_diagnostic_log, install_live_call_probe, record_vcr_outcome, register_persister_if_enabled, + reset_vcr_diag_dir, vcr_config_dict, ) @@ -84,6 +87,7 @@ def _vcr_outcome_gate(request, vcr): def pytest_configure(config): _verbose_state.remember_pluginmanager(config) + reset_vcr_diag_dir() def pytest_runtest_logreport(report): @@ -110,3 +114,4 @@ def pytest_collection_modifyitems(config, items): def pytest_terminal_summary(terminalreporter, exitstatus, config): emit_cassette_cache_session_banner(terminalreporter) emit_vcr_classification_summary(terminalreporter) + emit_vcr_diagnostic_log(terminalreporter)