The Hyper Agent Protocol (HAP) is a new, simplified standard for how software "agents" (or any automated services and capabilities) communicate with each other over the web. Imagine a universal way for different software components to discover what other components can do and then ask them to do it, all using standard web technologies.
The goal of HAP is to make it much easier to build flexible, interoperable systems where different services, tools, or AI agents can work together seamlessly, regardless of how they are built internally.
At its heart, HAP treats every distinct capability or function as something that has its own unique web address (an HTTP URI).
- To find out what a specific agent endpoint can do, you simply make a standard web request (`GET`) to its URI.
- The agent endpoint responds with a "Capability Description Document" (CDD). This CDD is a JSON file that clearly lists:
  - The primary capability the endpoint offers by default.
  - Optionally, any additional related actions it can perform (distinguished by simple names like `#resize` or `#summarize` used as fragment identifiers, making them part of the main URI).
  - The expected inputs and outputs for each capability, defined using standard JSON Schemas. This ensures clarity on what data to send and what to expect back.
  - Authentication methods required to use the capabilities.
- Crucially, this CDD is structured using Linked Data principles, similar to JSON-LD and how Schema.org is used. This means the JSON is not just data, but self-describing data:
  - It uses an `@context` (a standard JSON-LD field) to define short names for terms and to link properties to well-defined vocabularies (like Schema.org or custom ones). This helps avoid ambiguity.
  - Objects and descriptions within the CDD can have an `@type` (a JSON-LD keyword) which specifies their type using a URI (e.g., a URI defining what a "CapabilityDescription" is, much like Schema.org defines types like `schema:Person` or `schema:Product`).
  - Specific pieces of information or descriptions can be given a globally unique `@id` (a JSON-LD keyword) using a URI, making them linkable and referenceable.
  - This approach makes the CDD highly structured, machine-readable, and allows agents to better understand the meaning and structure of the capabilities offered by linking them to shared definitions.
- Once a client knows what an agent endpoint can do (from its CDD), it asks it to perform an action using a standard web `POST` request to the same agent endpoint URI.
- The `POST` request includes:
  - A unique request ID (as an HTTP URI) so both client and server can track this specific request and avoid accidental duplicates.
  - Optionally, a capability identifier (usually a short fragment like `"resize"`, or empty if invoking the default primary capability).
  - The arguments (data) needed for the capability, structured according to the JSON Schema described in the CDD.
A key concept in HAP is the flexible definition of an "agent endpoint":
- An agent endpoint is simply an HTTP URI that offers one or more capabilities according to the HAP protocol.
- An "agent" can be just a single, focused skill: If an endpoint offers only one primary capability (its default action), then that endpoint URI essentially is that skill. Invoking the endpoint invokes that skill.
- An "agent" can be a collection of related skills: An endpoint can offer a primary capability and a few additional, closely related actions. These additional actions are invoked by specifying their short name (fragment) in the `POST` request.
- This blurs the lines: a highly specialized "tool" (one skill) and a more complex "agent" (multiple skills) are interacted with using the exact same HAP protocol. The difference lies in the richness of their Capability Description Document.
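As a minimal sketch of this uniformity, the Python below prepares a request for a single-skill endpoint and for a named action on a multi-skill endpoint; only the action reference differs. The field names (`@id`, `@type`, `@action`, `body`) follow the schema examples later in this document. The helper name `build_request`, the `#req-` ID scheme, the use of `"#"` for the default action, the `application/json` content type, and the example URLs are assumptions made for illustration, not part of HAP.

```python
import json
import urllib.request
import uuid


def build_request(endpoint: str, client_base: str, arguments: dict,
                  action: str = "#") -> urllib.request.Request:
    """Prepare (but do not send) a HAP-style POST for one capability.

    action="#" targets the endpoint's default capability; "#resize" would
    target an additional action. Both conventions are assumed here.
    """
    body = {
        # Request ID as a URI; minting it under a client-owned base is an assumption.
        "@id": f"{client_base}#req-{uuid.uuid4().hex}",
        "@type": "hap:AgentRequest",
        "@action": action,
        "body": arguments,
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # media type is assumed
        method="POST",
    )


# A single-skill "tool" and a multi-skill "agent" are addressed the same way.
tool_call = build_request("http://example.com/summarize",
                          "http://client.example.com/hooks",
                          {"text": "A long passage to shorten."})
agent_call = build_request("http://example.com/images",
                           "http://client.example.com/hooks",
                           {"width": 200}, action="#resize")
```

Nothing is sent until the prepared request is passed to `urllib.request.urlopen`, so the sketch can be inspected without a running endpoint.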
HAP defines clear ways for actions to occur, with JSON Schemas defining the data formats for requests and responses:
- The client sends a `POST` request.
- The agent endpoint performs the action immediately and sends back a standard `200 OK` response containing the `result` (e.g., the summarized text, the processed data).
- The response also includes a unique response ID (as an HTTP URI) and echoes the client's request ID.

- For actions that take time, the client sends a `POST` request.
- The agent endpoint immediately responds with a `202 Accepted` message. This message includes:
  - A unique operation ID (an HTTP URI). This ID represents the ongoing task and its URI can be used for status checks or cancellation requests.
  - The client's request ID and a unique response ID for this acknowledgment.
- When the agent endpoint finishes the task, it sends a new `POST` request (a "webhook") to a callback URI provided by the client. This webhook contains the final `result` or an error.
- This webhook also includes its own unique delivery ID and the original request ID and operation ID for correlation.
- Clients can also (if the agent endpoint supports it) use the `operation_id` URI to `GET` the status of the ongoing task or `POST` to request its cancellation.
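The two flows above can be sketched from the client's side as a small bookkeeping class: a `200 OK` yields the result directly, a `202 Accepted` records the operation ID, and a later webhook is accepted only if its request ID and operation ID match what was recorded. The class name `OperationTracker` and the flat payload keys `request_id`, `operation_id`, `result`, and `error` are hypothetical stand-ins for whatever the endpoint's schemas define.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class OperationTracker:
    """Client-side bookkeeping for synchronous and asynchronous responses."""
    pending: Dict[str, str] = field(default_factory=dict)   # request ID -> operation ID
    finished: Dict[str, Any] = field(default_factory=dict)  # request ID -> result or error

    def on_post_response(self, status: int, request_id: str,
                         payload: dict) -> Optional[Any]:
        if status == 200:
            # Synchronous case: the result is in the response itself.
            return payload["result"]
        if status == 202:
            # Asynchronous case: remember the operation until the webhook arrives.
            self.pending[request_id] = payload["operation_id"]
            return None
        raise ValueError(f"unexpected status {status}")

    def on_webhook(self, payload: dict) -> bool:
        """Accept a webhook only if it correlates with a pending operation."""
        request_id = payload.get("request_id")
        if request_id not in self.pending:
            return False  # unknown request, or a duplicate delivery
        if self.pending[request_id] != payload.get("operation_id"):
            return False  # operation ID does not match what was acknowledged
        del self.pending[request_id]
        self.finished[request_id] = payload.get("result", payload.get("error"))
        return True
```

Rejecting a second delivery for the same request is one simple way to use the IDs for duplicate protection; a real client would likely also verify the webhook's origin.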
HAP focuses on standardizing the direct communication and self-description of individual agent endpoints. The following areas are considered complementary but separate concerns, to be addressed by other standards or system-level design:
- Global Agent Discovery: Mechanisms for agents to find the URIs of other agents they aren't already aware of (e.g., via registries or specialized search protocols). HAP defines how an agent describes itself once found.
- Complex Trust Frameworks: Establishing trust beyond the declared authentication methods (e.g., verifiable credentials, global reputation systems, "on behalf of" delegation). HAP defines how an agent states its auth needs.
- Multi-Agent Choreography/Orchestration Languages: Standardized languages for defining complex workflows involving multiple agents and their dependencies. HAP provides the bilateral communication links that an orchestrator would use.
- Global Semantic Vocabularies/Ontologies: While HAP's CDD uses JSON-LD principles to link to vocabularies, the creation and governance of these shared vocabularies (e.g., a universal standard for "summarization skill inputs") are external efforts. HAP enables their use.
- Specific Implementations of Authentication Providers: HAP defines how an agent declares its required authentication types (e.g., "Bearer token"), not how those tokens are issued or managed by identity providers.
- Detailed Legal, Ethical Frameworks, and Policy Enforcement: HAP allows linking to such documents, but their content and enforcement are outside the protocol.
HAP aims to provide a simpler, unified foundation that can enable many of the core functionalities seen in both A2A and MCP, while making different trade-offs.
- Covers: A2A's `AgentCard` and MCP's `initialize` response + `tools/list`, `resources/list`, `prompts/list`.
- HAP's approach: A single `GET` to the agent endpoint URI returns a comprehensive CDD detailing all capabilities (primary and additional actions), I/O schemas (using JSON Schema), authentication, and how async operations are handled. The use of JSON-LD principles (`@context`, `@type`, `@id`) in the CDD provides rich, linked self-description.

- Covers: A2A's `tasks/send` and MCP's `tools/call`.
- HAP's approach: A single `POST` method to the agent endpoint URI. The specific action is determined by the optional `capability_identifier` in the request body (defaulting to the endpoint's primary capability). Arguments are provided in a structured way defined by the CDD's JSON Schemas.

- Covers: MCP's `resources/read`.
- HAP's approach: Modeled as a capability. An agent endpoint can offer a "get_resource" capability (e.g., `capability_identifier: "#getResource"`). The CDD for this capability would specify how to identify the resource. If the resource has its own public URI, the "get_resource" capability might simply return that URI, allowing the client to fetch it directly.

- Covers: MCP's `prompts/get`.
- HAP's approach: Modeled as a capability. An agent endpoint can offer a "get_prompt_template" capability. The arguments would specify the template and any variables, and the result would be the processed prompt.

- Covers: A2A's `tasks/sendSubscribe` (for basic eventing/completion), `TaskStatusUpdateEvent`, `TaskArtifactUpdateEvent`, and push notifications. Also covers the need for MCP's `notifications/progress` in a simplified way.
- HAP's approach:
  - Webhooks: A `202 Accepted` response to a `POST` includes an `operation_id` (HTTP URI). Completion/failure is signaled by the agent endpoint making a `POST` to a client-provided webhook URI.
  - Polling for Status: Clients can `GET` the `operation_id` URI to poll for status if the capability supports it (declared in CDD).
  - SSE on Response: The CDD can declare if a capability's direct `POST` response will be a Server-Sent Event stream for incremental updates.

- Covers: A2A's `Message` with `Part`s (`TextPart`, `FilePart`, `DataPart`) and MCP's `TextContent`, `ImageContent`, `AudioContent`.
- HAP's approach: The JSON Schemas within the CDD for `inputSchema` and `outputSchema` of any capability define how multi-modal data is structured (e.g., using base64 encoding for binary data with a MIME type field, or URIs to external media).
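To make the two multi-modal options just mentioned concrete, here is a small sketch of how a client might package binary data either inline (base64 plus a MIME type) or by reference (a URI). The key names `mimeType`, `data`, and `uri` and the helper names are hypothetical; in HAP the actual shape would be whatever the capability's schemas in the CDD declare.

```python
import base64


def inline_part(data: bytes, mime_type: str) -> dict:
    """Embed binary data directly in the JSON body as base64 text."""
    return {
        "mimeType": mime_type,
        "data": base64.b64encode(data).decode("ascii"),
    }


def reference_part(uri: str, mime_type: str) -> dict:
    """Point at externally hosted media instead of embedding it."""
    return {"mimeType": mime_type, "uri": uri}


def read_inline(part: dict) -> bytes:
    """Recover the original bytes from an inline part."""
    return base64.b64decode(part["data"])


thumbnail = inline_part(b"\x89PNG\r\n", "image/png")
video = reference_part("http://media.example.com/clip.mp4", "video/mp4")
```

Inline parts keep a request self-contained at the cost of roughly a one-third size increase from base64; references avoid that overhead but require the receiver to be able to fetch the URI.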
- A2A: Has a rich, first-class `Task` object with a detailed lifecycle (submitted, working, input-required, completed, failed, canceled) and specific methods to manage it (e.g., `tasks/get`, `tasks/cancel`).
- HAP: Manages asynchronous operations via an `operation_id` URI. While status can be polled (`GET <operation_id>`) and cancellation requested (`POST <operation_id>`), it doesn't have the same explicit, detailed intermediate task states (like "input-required") built into the protocol itself unless a capability's specific workflow models it.

- MCP: Allows servers to make requests back to clients (e.g., `sampling/createMessage`, `roots/list`).
- HAP: Is primarily client-initiated. For a HAP endpoint (server) to make an arbitrary request to another HAP endpoint (client), the "client" must also expose its own HAP endpoint. Webhooks in HAP are for results/notifications of client-initiated operations.

- MCP: `notifications/.../list_changed` allows servers to push updates about their capabilities.
- HAP: Relies on clients re-fetching/re-validating the CDD using `GET` with HTTP caching mechanisms (ETag).

- A2A: `tasks/resubscribe`. MCP: `resources/subscribe`, `resources/unsubscribe`.
- HAP: SSE streams are tied to a single `POST` response. There's no separate subscribe/unsubscribe for ongoing data feeds or rejoining streams beyond re-initiating the `POST` or relying on webhooks for discrete updates.

- MCP: `initialize` for capability exchange.
- HAP: Relies on the initial `GET` of the CDD. No separate session setup beyond standard HTTP.
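The ETag-based re-validation mentioned above could look roughly like the following sketch: the client sends the last ETag it saw in an `If-None-Match` header and keeps its cached CDD when the server answers `304 Not Modified`. The function names and the cache layout (a dict with `cdd` and `etag` keys) are assumptions for illustration; HAP itself only relies on standard HTTP caching here.

```python
import urllib.request
from typing import Optional


def revalidation_request(cdd_url: str, etag: Optional[str]) -> urllib.request.Request:
    """Prepare a conditional GET for the CDD (not sent here)."""
    req = urllib.request.Request(cdd_url, method="GET")
    if etag:
        # Ask the server to reply 304 if the CDD is unchanged.
        req.add_header("If-None-Match", etag)
    return req


def apply_response(cache: dict, status: int,
                   etag: Optional[str], body: Optional[dict]) -> dict:
    """Return the CDD to use after a re-validation round trip."""
    if status == 304:
        return cache["cdd"]  # unchanged: keep the cached description
    if status == 200:
        cache["cdd"] = body  # changed: replace the cached copy and its ETag
        cache["etag"] = etag
        return body
    raise ValueError(f"unexpected status {status}")
```

Compared with a pushed `list_changed` notification, this trades immediacy for simplicity: a client learns about new capabilities only when it next re-validates.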
HAP is a modern, web-friendly protocol designed to make software components (agents, tools, services) more discoverable, understandable, and usable by each other.
- It's simple: Uses basic web requests (`GET` to understand, `POST` to act).
- It's self-describing: Agents publish rich, machine-readable descriptions of their capabilities (CDDs). These descriptions use well-understood Linked Data principles (like JSON-LD and Schema.org) to define types, properties, and unique identifiers using URIs, making them more meaningful and interoperable.
- It's flexible: An "agent" can be anything from a single, focused function to a more complex service with multiple related actions, all using the same interaction pattern.
- It's robust: Built-in URI-based identifiers help manage asynchronous tasks and ensure reliable communication.
- It's interoperable: By standardizing the "how" of communication and the "style" of description (using web-standard approaches to structured, linked data), HAP allows different systems to connect and collaborate more easily.
Compared to existing protocols like A2A and MCP, HAP offers a more unified and streamlined approach by modeling most interactions as discoverable "capabilities" at a URI. It achieves many of the same goals (tool use, resource access, asynchronous tasks, multi-modal data) but with a simpler set of core protocol rules, pushing more descriptive power into the Capability Description Document. This design prioritizes web-native simplicity and a consistent interaction model.
This approach aims to reduce the complexity of building distributed systems and foster a more dynamic and interconnected ecosystem of automated capabilities, leveraging familiar web standards for data description.
Hyper Agent Protocol (HAP) Schemas
Example: Capability Description Document (CDD)
This is a snippet of a CDD for an Agent Endpoint located at `http://my.agent.com/calculator`.

The root object of the CDD describes the Agent Endpoint itself. The `input` and `output` fields for each action reference named types (e.g., `calc:SumActionInput`). These named types would be defined elsewhere (e.g., in a HAP vocabulary or a dedicated `types` section of the CDD using `hap:PropertyDescriptor` based structures). These referenced types describe the structure of the object that will be the value of the `body` field in `hap:AgentRequest` and `hap:AgentResponse` messages, respectively. The `body` object itself will be typed with these referenced types.

```json
{
  "@id": "http://my.agent.com/calculator",
  "@type": "http://hap.dev/vocab#Agent",
  "@context": {
    "hap": "http://hap.dev/vocab#",
    "calc": "http://my.agent.com/calculator/vocab#",
    "remote": "https://your.agent.com/vocab#"
  },
  "name": "Simple Calculator Agent",
  "description": "A HAP-compliant agent that can perform basic calculations.",
  "@actions": [
    {
      "@id": "#",
      "@type": "hap:Action",
      "description": "Default action.",
      "output": "calc:UsageInfoOutput"
    },
    {
      "@id": "#sum",
      "@type": "hap:Action",
      "description": "Adds two numbers 'a' and 'b' and returns their sum.",
      "input": "calc:SumActionInput",
      "output": "calc:SumActionOutput"
    },
    {
      "@id": "https://your.agent.com/calculator#multiply",
      "@type": "hap:Action",
      "description": "Multiplies two numbers 'a' and 'b' (a remote agent action).",
      "input": "remote:MultiplyActionInput",
      "output": "remote:MultiplyActionOutput"
    },
    {
      "@id": "https://some.agent.com/actions/divide#",
      "@type": "hap:Action",
      "description": "Divides two numbers 'a' and 'b' (a remote agent default action, similar to a single MCP action)."
    }
  ]
}
```

Example: Invocation of the `#sum` Action (Synchronous)

Client POST `http://my.agent.com/calculator`

Request Body:
```json
{
  "@id": "#reqSynchronousXYZ",
  "@type": "hap:AgentRequest",
  "@action": "#sum",
  "body": {
    "@type": "calc:SumActionInput",
    "a": 10,
    "b": 5
  }
}
```

Example Synchronous Server Response (200 OK):

Here the response `@id` is locally scoped. More details may or may not be available, depending on whether the Agent defined `@actions` that allow cancel, status, etc.

Response Body:
```json
{
  "@id": "#respSynchronousABC",
  "@type": "hap:AgentResponse",
  "request": "#reqSynchronousXYZ",
  "@action": "http://my.agent.com/calculator#sum",
  "body": {
    "@type": "calc:SumActionOutput",
    "total": 15
  }
}
```

Example: Invocation of the `#sum` Action (Asynchronous with Webhook via Request `@id`)

Client POST `http://my.agent.com/calculator`

Request Body:
```json
{
  "@id": "http://calling.agent.com/webhook#reqXYZ",
  "@type": "hap:AgentRequest",
  "@action": "#sum",
  "body": {
    "@type": "calc:SumActionInput",
    "a": 10,
    "b": 5
  }
}
```

Server Response 1: Asynchronous Acceptance (202 Accepted)

The `@id` is a globally scoped URI, so it is itself an Agent, likely with one default `@action` of `#` or `#status`.

```json
{
  "@id": "http://my.agent.com/calculator/responses/respSynchronousABC",
  "@type": "hap:AgentResponse",
  "request": "http://calling.agent.com/webhook#reqXYZ",
  "@action": "http://my.agent.com/calculator#sum"
}
```

Server Response 2: Webhook Call to Client (Server POST to `http://calling.agent.com/webhook`)

Webhook Request Body (from HAP Agent Server to Client's Webhook):
```json
{
  "@id": "http://my.agent.com/calculator/responses/respSynchronousABC",
  "@type": "hap:AgentResponse",
  "request": "http://calling.agent.com/webhook#reqXYZ",
  "@action": "http://my.agent.com/calculator#sum",
  "body": {
    "@type": "calc:SumActionOutput",
    "total": 15
  }
}
```

Note on Type Definitions (e.g., `calc:SumActionInput`, `hap:PropertyDescriptor`):

The CDD's `input` and `output` fields for an `hap:Action` reference named types (e.g., `calc:SumActionInput`). These named types would be defined elsewhere (e.g., in a dedicated `"types"` section within this CDD, or in an external HAP vocabulary document referenced via `@context`). Such a type definition would specify its own `@type` (e.g., `hap:ActionInputDescriptor`) and list its properties using `hap:PropertyDescriptor`.

The type referenced by an Action's `input` field (e.g., `calc:SumActionInput`) describes the structure of the object that will be the value of the `body` field in an `hap:AgentRequest`. This inner `body` object MUST also be typed with this referenced type (e.g. `@type: "calc:SumActionInput"`). It is understood that the type referenced by an Action's `input` field is a semantic subtype of `hap:ActionInput`.

The type referenced by an Action's `output` field (e.g., `calc:SumActionOutput`) describes the structure of the object that will be the value of the `body` field in an `hap:AgentResponse`. This inner `body` object MUST also be typed with this referenced type (e.g. `@type: "calc:SumActionOutput"`). It is understood that the type referenced by an Action's `output` field is a semantic subtype of `hap:ActionOutput`.

Furthermore, the `body` property of a `hap:AgentRequest` itself is understood to expect instances of `hap:ActionInput` or its subtypes as its value. Similarly, the `body` property of a `hap:AgentResponse` expects instances of `hap:ActionOutput` or its subtypes as its value.
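The typing rule above (the request `body` MUST carry the `@type` named by the action's `input`) lends itself to a simple pre-flight check. The sketch below resolves a fragment-only `@action` such as `#sum` against the endpoint URI, looks the action up in the CDD's `@actions`, and compares type names. It is a simplification under stated assumptions: compact names like `calc:SumActionInput` are compared as plain strings rather than expanded through `@context`, subtypes are not considered, and the function names are hypothetical.

```python
from urllib.parse import urldefrag


def resolve_action(endpoint: str, action_ref: str) -> str:
    """Turn a fragment-only reference like '#sum' into an absolute action URI."""
    if action_ref.startswith("#"):
        base, _fragment = urldefrag(endpoint)
        return base + action_ref
    return action_ref  # already absolute (e.g. a remote agent's action)


def check_request(cdd: dict, request: dict) -> list:
    """Return a list of problems; an empty list means the request looks valid."""
    endpoint = cdd["@id"]
    wanted = resolve_action(endpoint, request.get("@action", "#"))
    for action in cdd.get("@actions", []):
        if resolve_action(endpoint, action["@id"]) == wanted:
            break
    else:
        return [f"unknown action: {wanted}"]

    problems = []
    expected = action.get("input")
    actual = request.get("body", {}).get("@type")
    # String comparison only: no JSON-LD expansion or subtype reasoning.
    if expected is not None and actual != expected:
        problems.append(f"body @type is {actual!r}, expected {expected!r}")
    return problems


calculator_cdd = {
    "@id": "http://my.agent.com/calculator",
    "@actions": [
        {"@id": "#", "output": "calc:UsageInfoOutput"},
        {"@id": "#sum", "input": "calc:SumActionInput",
         "output": "calc:SumActionOutput"},
    ],
}
sum_request = {
    "@action": "#sum",
    "body": {"@type": "calc:SumActionInput", "a": 10, "b": 5},
}
problems = check_request(calculator_cdd, sum_request)
```

A production client would more likely expand terms with a JSON-LD processor before comparing, so that `calc:SumActionInput` and its full URI are treated as the same type.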