litellm

Sameer Kankute af17400c38 feat(a2a): well-known agent-card discovery + LangGraph Platform mode (#28860)

* feat(a2a): well-known agent-card discovery + LangGraph Platform mode

Adds a registration-time discovery flow so admins can paste an upstream agent URL, see its skills/capabilities, pick what to expose, and have the proxy front it with a LiteLLM-shaped agent card.

Backend (new litellm/proxy/a2a/ module):
- fetch_well_known_card walks /.well-known/agent-card.json, /.well-known/agent.json, /agent.json by default. langgraph_platform mode hits the canonical path with ?assistant_id=<id> (LangGraph serves one shared endpoint per deployment).
- merge_agent_card overlays LiteLLM overrides on the upstream card: drops upstream url, forces protocolVersion=1.0, replaces securitySchemes with LiteLLMKey bearer, emits supportedInterfaces pointing at the proxy, filters capabilities to a small allowlist, strips non-v1.0 fields.
- POST /v1/a2a/discover returns the raw upstream card (admin-only) so the UI can render skills/capabilities for selection.
- create/update/patch agent endpoints pre-generate the agent_id and run merge_agent_card before storing, so DB.agent_card_params already embeds the proxy-fronted URL.

UI (ui/litellm-dashboard):
- New AgentCardDiscovery component with a parent-driven plan: discovery_mode + params + display URL. For LangGraph the parent composes (api_base, assistant_id); for pure A2A it uses the url field. Component hides the manual URL input when the parent drives.
- add_agent_form wires discovery for every non-custom agent type and overlays the user's selections onto agent_card_params at submit, fixing the bug where dynamic agent forms ignored discovery picks.

Completion-bridge fixes (paired):
- Add kind: "message" to A2A response messages and unwrap result so it's a Message directly per spec (matches a2a SDK SendMessageResponse validation).
- Forward A2A metadata to LangGraph runs via extra_body.metadata.
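To make the discovery order described above concrete, here is a minimal Python sketch of how the candidate card URLs could be built. It is not the code from litellm/proxy/a2a/. The helper name `candidate_card_urls`, the choice to append the paths to the base URL's own path, and the reading of "canonical path" as `/.well-known/agent-card.json` are assumptions made for illustration. Only the three paths, the `langgraph_platform` mode name, and the `assistant_id` query parameter come from the commit message.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Paths named in the commit message, in the order it lists them.
WELL_KNOWN_PATHS = (
    "/.well-known/agent-card.json",
    "/.well-known/agent.json",
    "/agent.json",
)


def candidate_card_urls(base_url, mode="a2a", assistant_id=None):
    """Hypothetical helper: return the URLs a discovery step might try, in order."""
    parts = urlsplit(base_url)
    prefix = parts.path.rstrip("/")  # keep any deployment path prefix (assumption)

    if mode == "langgraph_platform":
        if not assistant_id:
            raise ValueError("langgraph_platform mode needs an assistant_id")
        # Assumption: the "canonical path" is the first well-known path.
        query = urlencode({"assistant_id": assistant_id})
        return [
            urlunsplit(
                (parts.scheme, parts.netloc, prefix + WELL_KNOWN_PATHS[0], query, "")
            )
        ]

    # Default mode: try each well-known path in turn.
    return [
        urlunsplit((parts.scheme, parts.netloc, prefix + path, "", ""))
        for path in WELL_KNOWN_PATHS
    ]
```

A caller would fetch each URL in order and stop at the first one that returns a parseable card. Per the later fixes in this same message, the real fetch goes through an SSRF-safe client, which this sketch leaves out.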
* fix(a2a): preserve agent url, fix streaming chunk envelope, and protect forwarded metadata

- Streaming chunk: move final out of the message object into the result envelope per the A2A spec.
- Agent card merge: keep upstream url on the stored card so the runtime invocation path can locate the upstream backend; the public well-known endpoint already rewrites this field to the proxy URL before exposing it to clients.
- Completion bridge: apply A2A forward metadata after merging litellm_params so an agent-configured extra_body cannot overwrite the forwarded metadata.

Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix(a2a): fix legacy streaming chunk, agent card test, and metadata merge

- providers/litellm_completion: move 'final' out of the message object into the result envelope per the A2A spec (matches the bridge fix).
- agent endpoints test: the runtime invocation path now preserves the top-level 'url' on the stored card, so update the assertion to match.
- completion bridge metadata: when forwarding A2A metadata via extra_body.metadata, merge into any existing extra_body.metadata instead of replacing it, so an agent-configured metadata block is preserved (forward metadata still wins on key conflicts).

Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix(a2a): remove dead duplicate transformation dir; drop SSRF-prone headers field from /v1/a2a/discover

Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix(a2a): revert accidental html→index.html rename from afc8b10f

The commit afc8b10f bundled real A2A fixes alongside an unintended re-introduction of the /index.html layout that `8513d7fc` had already reverted. Restore all 35 static-export pages back to the flat .html structure that matches the upstream main branch.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(a2a): address PR review comments

UI:
- Auto-trigger discovery when connection details are filled; remove the "Use these selections" button (selection syncs live to parent, user just clicks Next).
- Edit Settings: auto-discover upstream card on open; cross-check with DB-stored card so only already-saved skills/capabilities are pre-ticked.
- Extract shared buildDiscoveryRequest + selectionsFromSavedAgentCard helpers into agent_discovery_utils.ts so both add and edit flows share the same logic.

Backend:
- agent_card.py: rename the proxy security requirements field from the non-standard ``securityRequirements`` to the spec-correct ``security`` key (matches AgentCard TypedDict and A2A/OpenAPI convention).
- agent_card.py: remove ``securityRequirements`` from _ALLOWED_TOP_LEVEL_KEYS.
- endpoints.py: _build_merged_agent_card now forwards agent_name and description from the request so the stored card reflects the admin-supplied name, not just whatever the upstream card advertised.
- utils.py: remove overly-broad ``or "parts" in result`` fallback; use ``kind == "message"`` check only to avoid false matches on future result types that happen to include a ``parts`` field.
- test_agent_card.py: update assertions to expect ``security`` key.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: restore Next.js metadata directories to match upstream main

The previous revert removed __next.* metadata subdirectories from git tracking entirely, but these directories exist on origin/main alongside the flat .html files. Restore them via checkout from origin/main so the PR diff only reflects actual code changes.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(a2a): drop dead headers option from discoverAgentCardCall

The backend /v1/a2a/discover endpoint no longer accepts a headers field (removed in 78591b2 for SSRF safety), so any headers passed through DiscoverAgentCardOptions were silently discarded by the API request body. Remove the field and the conditional that copies it onto the request body.
* fix(a2a): skip merge for non-A2A agents and align pydantic-ai result shape

The agent create/update/patch handlers ran the LiteLLM-fronting merge unconditionally, so registrations that did not provide agent_card_params still ended up with a synthesised card carrying supportedInterfaces, securitySchemes, and default skills. Gate the merge on a non-empty agent_card_params so plain chat/LLM agents stay non-A2A in the registry.

Also move kind: 'message' inside the a2a_message dict in the Pydantic AI non-streaming response so its construction matches the completion bridge rather than spreading kind on top of a separate dict.

* Fix three bugs in A2A discovery flow

1. UI: Stabilize discoveryRequest deps to avoid redundant /v1/a2a/discover API calls. The parent rebuilds the discoveryRequest object on every form keystroke, so depend on primitive proxies (discovery_mode + serialized params) rather than the object identity. Read the actual object via a ref inside handleDiscover.
2. Backend: Route the well-known card fetch through async_safe_get so the admin /v1/a2a/discover endpoint can't be used to probe private/loopback addresses or cloud metadata endpoints. SSRFError is a separate handled case so it surfaces a clear AgentCardDiscoveryError.
3. Streaming: Make openai_chunk_to_a2a_chunk emit the same flat result shape as the non-streaming response (kind/role/parts/messageId at the result level), with envelope-level 'final' added. Matches the existing create_artifact_update_event pattern and lets consumers read a uniform result shape across streaming and non-streaming.

Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix(a2a/ui): include savedAgentCard in handleDiscover deps

The previous deps list omitted savedAgentCard, so handleDiscover (and the resetSelections it calls) kept the closure's saved-card value even after the parent refetched the agent. Clicking 'Re-discover' would then pre-select skills against stale data.
Adding savedAgentCard to the deps array forces the callback to refresh whenever the saved card changes.

Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix(a2a): align pydantic-ai test + docstring with direct-Message result shape

The non-streaming A2A response was changed so that `result` is the Message itself (kind="message"), per spec / SendMessageResponse. Update the PydanticAITransformation._transform_to_a2a_response test and docstring that still described the old `result.message` envelope so internal consumers match the producer.

* fix(a2a): strip additionalInterfaces and let configured metadata win over A2A request

- merge_agent_card no longer carries upstream additionalInterfaces through; storing those alternate URLs would let authenticated agent callers reach the backend directly and bypass proxy auth/budget/logging.
- apply_forward_metadata_to_completion_params now layers client-supplied A2A metadata UNDER any agent-owner-configured extra_body.metadata, so server-set run metadata stays authoritative on key conflicts.

* fix(agents): merge agent card even when agent_card_params is an empty dict

Treat an explicitly provided empty agent_card_params ({}) as 'card provided but empty' instead of 'no card', so the LiteLLM-fronting merge still injects securitySchemes, supportedInterfaces, and protocolVersion. Without this, the well-known endpoint could serve a bare card with only a rewritten url, advertising no authentication to A2A clients.

Co-authored-by: Yassin Kortam <yassin@berri.ai>

* refactor(a2a): drop dead openai_chunk_to_a2a_chunk helper

The deprecated single-chunk helper has no callers anywhere in the codebase; the streaming path emits proper A2A events via create_task_event / create_status_update_event / create_artifact_update_event in handler.py. Removing the dead method also eliminates the inconsistency where the unused chunk inlined the envelope-level final flag inside the Message result.
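The metadata precedence settled on above (client-supplied A2A metadata layered under the agent owner's extra_body.metadata) can be illustrated with a small Python sketch. This is an assumption-laden illustration, not the body of apply_forward_metadata_to_completion_params: the function name `layer_forward_metadata` and the plain-dict parameter shapes are hypothetical.

```python
from copy import deepcopy


def layer_forward_metadata(completion_params, forward_metadata):
    """Hypothetical sketch: merge client A2A metadata UNDER configured metadata.

    completion_params: dict that may already carry extra_body.metadata set by
        the agent owner (assumed shape).
    forward_metadata: dict of metadata taken from the incoming A2A request.
    """
    params = deepcopy(completion_params)  # never mutate the caller's config
    if not forward_metadata:
        return params

    extra_body = dict(params.get("extra_body") or {})
    configured = dict(extra_body.get("metadata") or {})

    # Later keys win in a dict merge, so the configured values override
    # anything the client sent under the same key.
    extra_body["metadata"] = {**forward_metadata, **configured}
    params["extra_body"] = extra_body
    return params
```

With this ordering a client can add new keys such as a trace id, but cannot overwrite a key the agent owner has already set.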
* fix(a2a): scope a2a lazy-feature so it doesn't subsume /v1/a2a/discover

- _lazy_features.py: use /a2a prefix + /message/send suffix for the a2a feature so a request to /v1/a2a/discover no longer triggers the a2a_endpoints module to load alongside a2a_registration.
- agent_endpoints/endpoints.py: drop the no-op description override kwarg from _build_merged_agent_card and its three call sites. The upstream card's description is already preserved by merge_agent_card's deepcopy, so passing it explicitly did nothing.

* style: black-format litellm/a2a_protocol/litellm_completion_bridge/transformation.py

* fix: address PR bugfix review for a2a discovery + metadata forwarding

- agent create form (add_agent_form.tsx): drop the skills.length > 0 guard so an admin can clear all discovered skills during creation, matching the edit form's overlay behavior (consistency between create and edit flows).
- agent_card_discovery.tsx: stop including savedAgentCard in the handleDiscover useCallback deps. Read it via a ref inside resetSelections instead, so a parent-driven re-render that hands us a new savedAgentCard object reference (e.g. a background refresh of the agent record) does not recreate handleDiscover and re-fire the auto-discover effect, which would otherwise overwrite in-progress user edits in parent-driven mode (debounceMs = 0).
- a2a_endpoints.invoke_agent_a2a: skip 'metadata' when moving litellm params off of A2A MessageSendParams into body. The A2A protocol defines params.metadata as a first-class request-level field, and the completion bridge's get_forward_metadata is supposed to merge it with message.metadata. Previously the proxy always stripped params.metadata before constructing MessageSendParams, so the params-level branch in get_forward_metadata was dead code in the proxy flow.
Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix(a2a): return 404 from get_agent_card when agent has no card

* fix(agents): apply discovery overlay uniformly on create and dedupe ALLOWED_CAPABILITY_KEYS

- buildAgentData now applies overlayDiscoveredCardParams after every non-custom branch (a2a, use_a2a_form_fields, dynamic) so types with credential_fields no longer silently drop discovered skills, capabilities, input/output modes, provider, and icon/doc URLs on submit. Mirrors the edit flow in agent_info.tsx.
- Export ALLOWED_CAPABILITY_KEYS from agent_discovery_utils and import it in agent_card_discovery so the rendering and selection-filtering logic share a single source of truth.

Co-authored-by: Yassin Kortam <yassin@berri.ai>

* ci(proxy-endpoints): wire tests/test_litellm/proxy/a2a into the shard

The two new test files (test_discovery.py, test_agent_card.py) were not picked up by any pytest path, so their coverage never reached codecov and patch coverage fell below the auto target.

* fix(ui): overlay discovered name/description in create flow for dynamic agents

Mirror the edit-form overlay in agent_info.tsx so dynamic agent types (e.g. LangGraph) whose forms don't register name/description as Form.Items don't silently lose those discovery-panel edits on save.

Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix(a2a): default merged agent card version, null-guard runtime URL lookup, scope discovery auto-fire to A2A types

- merge_agent_card now defaults version to 1.0.0 when upstream omits it (A2A v1.0 schema requires the field).
- invoke_agent_a2a guards against agent_card_params being None so plain chat agents routed via the A2A path return a JSON-RPC error instead of AttributeError.
- buildDiscoveryRequest no longer falls back to any URL-shaped credential field for non-A2A agent types (Azure AI Foundry, Bedrock AgentCore, Vertex).
Discovery only auto-fires for pure A2A and use_a2a_form_fields runtimes; the manual URL input remains available as an escape hatch.

* fix(ui): extract overlayDiscoveredCardParams + debounce parent-driven discovery

Two findings from greptile review:

1. `overlayDiscoveredCardParams` was copy-pasted between `add_agent_form.tsx` and `agent_info.tsx`. Move it to `agent_discovery_utils.ts` so the create and edit flows share the same overlay logic and there's only one place to update when discovered fields change.
2. `agent_card_discovery.tsx` used a zero-debounce path for parent-driven mode, which fires one discovery HTTP request per keystroke when an admin types into the parent form's URL / api_base / assistant_id fields (the parent rebuilds the plan from watched form values every render). Apply the same 400ms debounce uniformly.

* fix(a2a): preserve discovery name edit, default discovery headers, sync url on re-discover

- _build_merged_agent_card: prefer card-supplied name over agent_name so the discovery panel's editable 'Name (shown to API clients)' value is not silently overwritten by the internal identifier.
- async_safe_get call in fetch_well_known_card: pass headers or {} to avoid TypeError({*None, 'Host': ...}) when URL validation is enabled in production (default).
- agent_info handleApplyDiscoveredCard: set url: selection.upstream_url in fieldsToSet so re-discovery during edit refreshes the form's URL field for pure A2A agents (matches add_agent_form).

Co-authored-by: Yassin Kortam <yassin@berri.ai>

fix(a2a): scrub upstream url from /public/agent_hub cards

Public agent_hub returned agent_card_params verbatim, exposing the retained upstream backend url to unauthenticated callers. Rewrite the url to the proxy /a2a/{agent_id} entrypoint on response, matching the behavior of the authenticated well-known agent-card endpoint, so the backend cannot be reached outside LiteLLM's auth, budget, and logging path.
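As a rough illustration of the url scrub described above, the following Python sketch rewrites a stored card's url to the proxy entrypoint before it is returned to a public caller. The function name `scrub_card_for_public` and the `proxy_base_url` parameter are hypothetical; only the /a2a/{agent_id} route shape is taken from the commit message, and the actual handler may differ.

```python
from copy import deepcopy


def scrub_card_for_public(card, agent_id, proxy_base_url):
    """Hypothetical sketch: hide the upstream backend url in a public response.

    card: stored agent card dict that still carries the upstream url.
    agent_id: identifier used in the proxy route.
    proxy_base_url: externally visible base URL of the proxy (assumed input).
    """
    public_card = deepcopy(card)  # leave the stored card untouched for runtime use
    public_card["url"] = f"{proxy_base_url.rstrip('/')}/a2a/{agent_id}"
    return public_card
```

Copying before rewriting matters here because, per the earlier fix in this message, the stored card keeps the upstream url so the runtime invocation path can still find the backend.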
* fix(a2a): include suffix-matched routes in lazy warm openapi fragment

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Yassin Kortam <yassin@berri.ai>
Co-authored-by: mateo-berri <277851410+mateo-berri@users.noreply.github.com>

2026-05-29 20:50:42 -07:00
codeql	[Infra] Improve CodeQL scanning coverage and schedule	2026-03-27 12:04:09 -07:00
ISSUE_TEMPLATE	docs: document new github + gitlab ci scripts	2026-03-25 20:17:10 -07:00
observatory	Add observatory test workflow for RC/stable releases	2026-03-01 15:30:09 -03:00
screenshots	fix(team_endpoints): auto-add SSO team members to org on move (proxy admin only) (#26377)	2026-04-24 08:36:25 -07:00
scripts	style: run black formatter on files from main merge	2026-04-17 13:02:59 -07:00
workflows	feat(a2a): well-known agent-card discovery + LangGraph Platform mode (#28860)	2026-05-29 20:50:42 -07:00
dependabot.yaml	chore: fixes	2026-04-05 01:30:57 -07:00
deploy-to-aws.png	Add files via upload	2023-10-25 16:33:53 -07:00
FUNDING.yml	Update FUNDING.yml	2023-09-22 09:51:35 -07:00
pull_request_template.md	docs: hand-written CLAUDE.md; point GEMINI.md and AGENTS.md at it (#29252)	2026-05-29 00:05:05 -07:00
template.yaml	(chore) cleanup	2024-02-09 09:28:13 -08:00