* fix(mcp): surface upstream 401 for token-forwarding MCP servers

  For MCP servers configured with extra_headers: [Authorization], the
  gateway forwards the client token directly to the upstream. When that
  token is rejected (expired or invalid) the upstream returns 401, but
  the MCP SDK starts the SSE stream with 200 OK before calling handlers,
  so the 401 can't be returned mid-stream.

  Fix: add a pre-flight httpx probe in handle_streamable_http_mcp,
  before the SDK opens the session, so the gateway can still return
  HTTP 401 with
  WWW-Authenticate: Bearer authorization_uri=<gateway-discovery-url>
  when the upstream rejects the token. The probe fails open (returns
  200) on network errors so a transient hiccup does not block valid
  requests.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(mcp): parallelize pre-flight auth probes and use HEAD to avoid side effects

  - Extract forwarded_auth outside the pass-through server loop (was
    called N times for the same scope value)
  - Gather all upstream auth probes concurrently with asyncio.gather
    instead of sequentially; eliminates N×5 s worst-case latency
  - Switch probe from POST+initialize JSON-RPC body to HEAD request;
    HEAD carries the Authorization header so the upstream rejects
    invalid tokens with 401 but never allocates a session or writes an
    audit entry

  Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(mcp): use get_async_httpx_client in _probe_upstream_auth

  Replaces bare httpx.AsyncClient with the project-standard
  get_async_httpx_client(httpxSpecialProvider.MCP) to satisfy the
  ensure_async_clients_test code coverage check and avoid the +500 ms
  per-request overhead of creating a new client on every probe call.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor(mcp): extract pre-flight probe into _check_passthrough_upstream_auth

  Moves the parallel upstream auth probe logic out of
  handle_streamable_http_mcp into a dedicated helper to satisfy Ruff
  PLR0915 (Too many statements > 50).

  Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(mcp): gate pre-flight probes on authorized server set to prevent bypass

  _check_passthrough_upstream_auth was resolving user-supplied server
  names directly before authorization ran, letting any permitted LiteLLM
  key trigger an upstream HEAD probe to a server it was not allowed to
  use.

  Changes:
  - Call _get_allowed_mcp_servers inside the helper so only servers the
    caller's key is authorized for are probed.
  - Move the call site to after toolset scoping so the auth context is
    fully resolved before the probe list is built.
  - Thread user_api_key_auth into the helper signature (replaces the raw
    mcp_servers name list).

  Co-authored-by: Cursor <cursoragent@cursor.com>

* Add async HTTP HEAD support

  Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix(mcp): use Scope type annotation in _get_forwarded_auth_from_scope

  Co-authored-by: Cursor <cursoragent@cursor.com>

* Fix MCP upstream auth probe method

  Co-authored-by: Yassin Kortam <yassin@berri.ai>

* Remove unused AsyncHTTPHandler head method

  Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix(mcp): exclude has_client_credentials servers from pre-flight auth probe

  _prepare_mcp_server_headers skips caller Authorization when the server
  uses OAuth client-credentials (M2M), but the pre-flight probe was
  still selecting those servers and forwarding the caller's raw token in
  the HEAD request. Exclude servers with has_client_credentials from the
  probe list to match the actual downstream header-preparation logic.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(mcp): propagate upstream 403 as 403, not 401 with WWW-Authenticate

  Per RFC 9110, 401 means "go get new credentials." Mapping an upstream
  403 to a gateway 401 causes OAuth clients to restart the authorization
  flow, obtain a fresh token with identical scopes, hit 403 again, and
  loop indefinitely.

    401 from upstream → gateway 401 + WWW-Authenticate (re-authorize)
    403 from upstream → gateway 403 (no WWW-Authenticate hint)

  Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(mcp): skip auth probe when Authorization may be the LiteLLM proxy key

  The pre-flight upstream probe must not forward the caller's
  Authorization header when it could itself be the LiteLLM proxy API
  key. Restrict the probe to requests that supply x-litellm-api-key
  explicitly; only then is the Authorization header unambiguously the
  upstream OAuth token the caller wants forwarded.

* Fix MCP ASGI HTTPException propagation

  Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix(mcp): use public AsyncHTTPHandler.post() in auth probe

  Use AsyncHTTPHandler.post() and catch httpx.HTTPStatusError explicitly
  so the 401/403 we want to surface is not silently swallowed by the
  broad fail-open except Exception block. Avoids reaching into the
  handler's private client attribute, which would silently regress to
  fail-open if AsyncHTTPHandler is ever refactored.

* Fix MCP auth probe tests

  Co-authored-by: Yassin Kortam <yassin@berri.ai>

* test(mcp): add coverage for httpx.HTTPStatusError path in auth probe

  AsyncHTTPHandler.post() calls raise_for_status() internally, so a real
  upstream 401/403 lands as httpx.HTTPStatusError. Add a test that
  exercises that specific exception path so a regression that swallows
  the error in the broad fail-open except Exception would be caught.

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Yassin Kortam <yassin@berri.ai>
Co-authored-by: claude-bot <claude-bot@anthropic.com>
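The probe behavior described in the commits above (concurrent probes, fail-open on network errors, 401 and 403 mapped differently) can be sketched roughly as follows. This is a minimal illustration and not the gateway's code: `Probe`, `map_upstream_status`, `probe_fail_open`, and `check_upstreams` are hypothetical names, the probe is an injected callable rather than a real `AsyncHTTPHandler.post()` call, and the 5-second timeout is an assumption inferred from the "N×5 s" figure in the message.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Optional, Tuple

# Hypothetical probe signature: (upstream_url, token) -> upstream HTTP status.
Probe = Callable[[str, str], Awaitable[int]]
Rejection = Tuple[int, Dict[str, str]]


def map_upstream_status(status: int, discovery_url: str) -> Optional[Rejection]:
    """Return (gateway_status, headers) to reject with, or None to proceed."""
    if status == 401:
        # 401 tells an OAuth client to obtain new credentials.
        return 401, {"WWW-Authenticate": f"Bearer authorization_uri={discovery_url}"}
    if status == 403:
        # 403 is passed through without a re-authorization hint.
        return 403, {}
    return None


async def probe_fail_open(probe: Probe, url: str, token: str) -> int:
    """Run one probe; treat timeouts and network errors as 200 (fail open)."""
    try:
        return await asyncio.wait_for(probe(url, token), timeout=5.0)  # assumed timeout
    except Exception:
        return 200


async def check_upstreams(
    probe: Probe, urls: Iterable[str], token: str, discovery_url: str
) -> Optional[Rejection]:
    """Probe all upstreams concurrently and return the first rejection, if any."""
    statuses = await asyncio.gather(*(probe_fail_open(probe, u, token) for u in urls))
    for status in statuses:
        rejection = map_upstream_status(status, discovery_url)
        if rejection is not None:
            return rejection
    return None
```

Because the probes run under `asyncio.gather`, the worst-case added latency in this sketch is one timeout rather than one timeout per upstream.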
37 lines
1.1 KiB
Python
import asyncio

import pytest
from fastapi import HTTPException

from litellm.proxy.proxy_server import _stream_mcp_asgi_response


@pytest.mark.asyncio
async def test_stream_mcp_asgi_response_propagates_pre_header_http_exception():
    # Handler that fails before sending any ASGI response message.
    async def handle_fn(_scope, _receive, _send):
        raise HTTPException(
            status_code=401,
            detail="Unauthorized",
            headers={
                "WWW-Authenticate": "Bearer authorization_uri=https://example.test/auth"
            },
        )

    # Minimal ASGI receive callable: a single empty request body.
    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    # The timeout turns a hang into a test failure instead of a stuck run.
    with pytest.raises(HTTPException) as exc_info:
        await asyncio.wait_for(
            _stream_mcp_asgi_response(
                handle_fn,
                {"type": "http", "method": "POST", "path": "/mcp", "headers": []},
                receive,
            ),
            timeout=1.0,
        )

    # Status code and WWW-Authenticate header must reach the caller unchanged.
    assert exc_info.value.status_code == 401
    assert exc_info.value.headers == {
        "WWW-Authenticate": "Bearer authorization_uri=https://example.test/auth"
    }
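For context on what the test above guards against, here is a rough, stdlib-only sketch of one way an ASGI wrapper could let a handler's exception reach the caller instead of leaving the consumer waiting on messages that never arrive. It is not the implementation of `_stream_mcp_asgi_response`: `collect_asgi_messages`, `UpstreamRejected`, and `_DONE` are hypothetical names, and the sketch buffers messages rather than streaming them.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

Message = Dict[str, Any]
_DONE = object()  # sentinel marking that the handler has finished


class UpstreamRejected(Exception):
    """Hypothetical stand-in for fastapi.HTTPException."""

    def __init__(self, status_code: int) -> None:
        super().__init__(status_code)
        self.status_code = status_code


async def collect_asgi_messages(
    handler: Callable[..., Awaitable[None]],
    scope: Message,
    receive: Callable[[], Awaitable[Message]],
) -> List[Message]:
    """Run an ASGI handler, collect what it sends, and re-raise its exception."""
    queue: asyncio.Queue = asyncio.Queue()

    async def send(message: Message) -> None:
        await queue.put(message)

    async def run_handler() -> None:
        try:
            await handler(scope, receive, send)
        finally:
            # Always wake the consumer, even when the handler raised.
            await queue.put(_DONE)

    task = asyncio.ensure_future(run_handler())
    messages: List[Message] = []
    while True:
        item = await queue.get()
        if item is _DONE:
            break
        messages.append(item)
    # Re-raises whatever the handler raised. If no message was sent yet,
    # the caller can still turn the exception into a real HTTP error response.
    await task
    return messages
```

The sentinel placed in `finally` is the key design choice in this sketch: without it, a handler that raises before its first `send` would leave the consumer blocked on `queue.get()`, which is the kind of hang a one-second `asyncio.wait_for` in a test would expose.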