Public preview — This API is in public preview. Endpoints, schemas, and limits may change before general availability.

Streaming events

Every SSE event type emitted by /v1/responses, with payload shapes and ordering guarantees.

Streaming responses are delivered as Server-Sent Events (SSE). Every event has a type field that identifies its shape and a sequence_number that increases by exactly one per event within the response.

This page lists every event type you can receive from /v1/responses.

Event envelope#

event: message
data: { "type": "<event-name>", "sequence_number": <int>, … }

The event: line is always message. Discrimination happens on the JSON type field — the same convention the OpenAI Responses API uses.
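As a rough illustration of the envelope, the sketch below decodes SSE frames from an iterable of text lines and yields the JSON payloads, leaving dispatch to the type field. The function name iter_events and the sample lines are assumptions made for this example; a real client would feed it lines read from the HTTP response body.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per SSE frame.

    A frame ends at a blank line. Multiple data: lines in one frame are
    joined with a newline, as the SSE format prescribes.
    """
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # SSE comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "data":
            data_parts.append(value)
        # The event: field is always "message" here, so it is not used for dispatch.
    if data_parts:  # tolerate a stream that ends without a trailing blank line
        yield json.loads("\n".join(data_parts))


# Hypothetical sample frames in the envelope format described above.
sample = [
    "event: message\n",
    'data: {"type": "response.created", "sequence_number": 0}\n',
    "\n",
    "event: message\n",
    'data: {"type": "response.output_text.delta", "sequence_number": 1, "delta": "Hi"}\n',
    "\n",
]
events = list(iter_events(sample))
print([e["type"] for e in events])
```

Callers then branch on event["type"] rather than on the SSE event name.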

Lifecycle events#

`response.created`#

The first event of every response. Carries the response metadata.

{
  "type": "response.created",
  "sequence_number": 0,
  "response": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "object": "response",
    "created_at": 1716387200000,
    "model": "gpt-oss-120b",
    "status": "in_progress"
  }
}

`response.completed`#

The terminal success event. Carries the final concatenated text and any tool-call summary.

{
  "type": "response.completed",
  "sequence_number": 42,
  "response": {
    "id": "550e8400-…",
    "status": "completed",
    "model": "gpt-oss-120b",
    "output_text": "Hello, world.",
    "tool_calls": [ /* optional */ ]
  }
}

After this event the stream closes.

`error`#

The terminal failure event. It may occur in place of response.completed, and no further events follow it.

{
  "type": "error",
  "code": "internal_error",
  "message": "Unexpected upstream failure"
}

See Errors for the full code list.
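One way a client might treat the two terminal events is sketched below: return the response object on response.completed and raise on error. ResponseStreamError, final_response, and the client-side code "stream_truncated" are hypothetical names chosen for this example and are not part of the API.

```python
class ResponseStreamError(Exception):
    """Raised when the stream ends with an error event (hypothetical client type)."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def final_response(events):
    """Consume decoded events and return the completed response object."""
    for event in events:
        kind = event.get("type")
        if kind == "response.completed":
            return event["response"]
        if kind == "error":
            raise ResponseStreamError(event.get("code", "unknown"), event.get("message", ""))
    # Client-side fallback for a connection that dropped before a terminal event.
    raise ResponseStreamError("stream_truncated", "stream ended without a terminal event")


ok = [
    {"type": "response.created", "sequence_number": 0, "response": {"status": "in_progress"}},
    {"type": "response.completed", "sequence_number": 1,
     "response": {"status": "completed", "output_text": "Hello, world."}},
]
failed = [
    {"type": "response.created", "sequence_number": 0, "response": {"status": "in_progress"}},
    {"type": "error", "code": "internal_error", "message": "Unexpected upstream failure"},
]
print(final_response(ok)["output_text"])
```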

Text events#

`response.content_part.added`#

A text content part begins. Carries the output index and content index so multi-part responses can be reassembled.

{
  "type": "response.content_part.added",
  "sequence_number": 3,
  "output_index": 0,
  "content_index": 0,
  "part": { "type": "output_text", "text": "" }
}

`response.output_text.delta`#

Incremental text chunk. Append delta to whatever you have buffered for (output_index, content_index).

{
  "type": "response.output_text.delta",
  "sequence_number": 4,
  "output_index": 0,
  "content_index": 0,
  "delta": "Hello"
}

You will receive many of these per response — typically one per token or small token group.
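The buffering rule above can be sketched as follows, keeping one buffer per (output_index, content_index) pair. The function name accumulate_text and the sample events are assumptions for illustration; the events are taken to be already decoded into dictionaries.

```python
from collections import defaultdict


def accumulate_text(events):
    """Concatenate delta chunks per (output_index, content_index)."""
    buffers = defaultdict(list)
    for event in events:
        if event.get("type") == "response.output_text.delta":
            key = (event["output_index"], event["content_index"])
            buffers[key].append(event["delta"])
    # Join once at the end to avoid repeated string copies.
    return {key: "".join(chunks) for key, chunks in buffers.items()}


# Hypothetical decoded events for a single text part.
deltas = [
    {"type": "response.output_text.delta", "sequence_number": 4,
     "output_index": 0, "content_index": 0, "delta": "Hello"},
    {"type": "response.output_text.delta", "sequence_number": 5,
     "output_index": 0, "content_index": 0, "delta": ", world."},
]
texts = accumulate_text(deltas)
print(texts[(0, 0)])
```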

`response.output_text.done`#

Marks the text part complete with the full assembled text. Useful as a sanity check, or for consumers that ignore deltas and only care about the final string.

{
  "type": "response.output_text.done",
  "sequence_number": 41,
  "output_index": 0,
  "content_index": 0,
  "text": "Hello, world."
}
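A minimal sketch of the sanity check, assuming decoded event dictionaries: it collects the final text from each response.output_text.done event and reports the parts whose buffered deltas differ. The name check_assembled_text is hypothetical. A mismatch is not necessarily a bug, since a post-generation rewrite pass can change the final text.

```python
def check_assembled_text(events):
    """Return (final_texts, mismatches), both keyed by (output_index, content_index)."""
    buffered = {}
    final = {}
    for event in events:
        kind = event.get("type")
        if kind == "response.output_text.delta":
            key = (event["output_index"], event["content_index"])
            buffered[key] = buffered.get(key, "") + event["delta"]
        elif kind == "response.output_text.done":
            key = (event["output_index"], event["content_index"])
            final[key] = event["text"]
    # A part is a mismatch when its buffered deltas do not equal the done text.
    mismatches = [key for key, text in final.items() if buffered.get(key, "") != text]
    return final, mismatches


stream = [
    {"type": "response.output_text.delta", "output_index": 0, "content_index": 0, "delta": "Hello"},
    {"type": "response.output_text.delta", "output_index": 0, "content_index": 0, "delta": ", world."},
    {"type": "response.output_text.done", "output_index": 0, "content_index": 0, "text": "Hello, world."},
]
final_texts, mismatches = check_assembled_text(stream)
print(final_texts[(0, 0)], mismatches)
```

Consumers that only need the final string can read final_texts and ignore the deltas entirely.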

Tool-call events#

`response.output_item.added`#

A new output item is starting. For tool calls this signals “the model has decided to call this tool”. For text messages this signals the start of an assistant message.

{
  "type": "response.output_item.added",
  "sequence_number": 5,
  "output_index": 1,
  "item": {
    "type": "function_call",
    "name": "web_search",
    "call_id": "call_abc123",
    "arguments": "{\"query\":\"…\"}"
  }
}

`response.output_item.done`#

The item is complete. For tool calls this carries the tool’s output and a status.

{
  "type": "response.output_item.done",
  "sequence_number": 12,
  "output_index": 1,
  "item": {
    "type": "function_call",
    "name": "web_search",
    "call_id": "call_abc123",
    "arguments": "{\"query\":\"…\"}",
    "output": "[{\"title\":\"…\",\"url\":\"…\"}, …]",
    "status": "completed"
  }
}

status is either completed or failed. When it is failed, output is a string describing the failure.
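The following sketch shows one way to track tool calls across the added and done events, keyed by output_index. The function name collect_tool_calls and the record fields state, result, and error are assumptions made for this example. The output string is kept as-is rather than parsed, since its format depends on the tool.

```python
def collect_tool_calls(events):
    """Build one record per function_call item, keyed by output_index."""
    calls = {}
    for event in events:
        kind = event.get("type")
        if kind not in ("response.output_item.added", "response.output_item.done"):
            continue
        item = event["item"]
        if item.get("type") != "function_call":
            continue  # e.g. the start of an assistant text message
        record = calls.setdefault(event["output_index"], {
            "name": item["name"],
            "call_id": item["call_id"],
            "state": "running",
            "result": None,
            "error": None,
        })
        if kind == "response.output_item.done":
            if item["status"] == "completed":
                record["state"] = "completed"
                record["result"] = item["output"]
            else:
                # On failure, output is a string describing what went wrong.
                record["state"] = "failed"
                record["error"] = item["output"]
    return calls


# Hypothetical events for one failed call.
tool_events = [
    {"type": "response.output_item.added", "sequence_number": 5, "output_index": 1,
     "item": {"type": "function_call", "name": "web_search",
              "call_id": "call_abc123", "arguments": "{}"}},
    {"type": "response.output_item.done", "sequence_number": 12, "output_index": 1,
     "item": {"type": "function_call", "name": "web_search", "call_id": "call_abc123",
              "arguments": "{}", "output": "search backend timed out", "status": "failed"}},
]
calls = collect_tool_calls(tool_events)
print(calls[1]["state"], calls[1]["error"])
```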

Routing events#

`response.routing.completed`#

Emitted when model: "auto" resolves to a concrete model. Contains the chosen model and the routing decision metadata. Not emitted when the request pins a specific model.

{
  "type": "response.routing.completed",
  "sequence_number": 1,
  "model_id": "gpt-oss-120b",
  "category": "reasoning",
  "source": "router",
  "reasoning": "Input length > 500 tokens"
}

Post-processing events#

`response.rewrite.in_progress`#

Indicates the platform is applying a post-generation rewrite pass (e.g. language alignment, spelling). The final output_text.done reflects the rewritten text.

{ "type": "response.rewrite.in_progress", "sequence_number": 40 }
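For a UI that renders deltas as they arrive, a rewrite pass means the text on screen may still change. The sketch below, which assumes a single text part and uses the hypothetical class name TextView, marks the buffer as provisional when the rewrite starts and replaces it with the text from response.output_text.done.

```python
class TextView:
    """Tracks the displayed text for one text part (hypothetical client helper)."""

    def __init__(self):
        self.text = ""
        self.provisional = False  # True while a rewrite pass is pending

    def handle(self, event):
        kind = event.get("type")
        if kind == "response.output_text.delta":
            self.text += event["delta"]
        elif kind == "response.rewrite.in_progress":
            self.provisional = True
        elif kind == "response.output_text.done":
            # The done event reflects the rewritten text, so it replaces the buffer.
            self.text = event["text"]
            self.provisional = False


view = TextView()
for ev in [
    {"type": "response.output_text.delta", "delta": "Helo, wrld."},
    {"type": "response.rewrite.in_progress", "sequence_number": 40},
]:
    view.handle(ev)
was_provisional = view.provisional
view.handle({"type": "response.output_text.done", "text": "Hello, world."})
print(view.text)
```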

Events you may see in the future#

The schema is designed to be additive. Clients should ignore unknown event types rather than fail. We may add new events (e.g. citations, partial usage reporting) without bumping the API version, as long as they do not change the meaning of existing events.
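A forward-compatible dispatcher can be as small as the sketch below, which looks up a handler by type and skips anything it does not recognize. The names dispatch and handlers, and the event type response.citation.added, are hypothetical and used only to illustrate an unknown event.

```python
def dispatch(event, handlers, ignored=None):
    """Call the handler registered for event["type"]; skip unknown types.

    Returns True when a handler ran. Unknown types are optionally recorded
    in the ignored list so they can be logged.
    """
    handler = handlers.get(event.get("type"))
    if handler is None:
        if ignored is not None:
            ignored.append(event.get("type"))
        return False
    handler(event)
    return True


chunks = []
skipped = []
handlers = {"response.output_text.delta": lambda e: chunks.append(e["delta"])}

for ev in [
    {"type": "response.output_text.delta", "delta": "Hello"},
    {"type": "response.citation.added", "citation": {}},  # hypothetical future event
    {"type": "response.output_text.delta", "delta": ", world."},
]:
    dispatch(ev, handlers, skipped)

print("".join(chunks), skipped)
```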

Ordering guarantees#

response.created is always first.
The last event is always either response.completed or error. Exactly one of the two is sent.
sequence_number increases by exactly one per event within a response. No skips, no duplicates.
Within a single output item, output_item.added precedes any deltas, and all deltas precede output_item.done.
Across different output items (parallel tool calls), events may interleave in the stream. Use output_index to tell them apart.
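These guarantees can be checked on a recorded stream with a small validator like the sketch below, which may be useful in client tests. The function name validate_order is hypothetical. Events without a sequence_number are skipped in the sequence check, and every event that carries an output_index is assumed to belong to the item opened by output_item.added for that index.

```python
TERMINAL = ("response.completed", "error")


def validate_order(events):
    """Return a list of human-readable violations; an empty list means none found."""
    if not events:
        return ["stream contained no events"]
    problems = []
    if events[0].get("type") != "response.created":
        problems.append("first event is not response.created")
    if events[-1].get("type") not in TERMINAL:
        problems.append("last event is not response.completed or error")

    previous = None
    state = {}  # output_index -> "open" or "done"
    for event in events:
        kind = event.get("type")
        seq = event.get("sequence_number")
        if seq is not None:
            if previous is not None and seq != previous + 1:
                problems.append(f"sequence_number jumped from {previous} to {seq}")
            previous = seq
        index = event.get("output_index")
        if index is None:
            continue
        if kind == "response.output_item.added":
            state[index] = "open"
        elif state.get(index) != "open":
            problems.append(f"{kind} for output_index {index} outside an open item")
        elif kind == "response.output_item.done":
            state[index] = "done"
    return problems


# Two items interleaved by sequence_number, which is allowed.
good = [
    {"type": "response.created", "sequence_number": 0},
    {"type": "response.output_item.added", "sequence_number": 1, "output_index": 0},
    {"type": "response.output_item.added", "sequence_number": 2, "output_index": 1},
    {"type": "response.output_text.delta", "sequence_number": 3, "output_index": 0},
    {"type": "response.output_item.done", "sequence_number": 4, "output_index": 1},
    {"type": "response.output_item.done", "sequence_number": 5, "output_index": 0},
    {"type": "response.completed", "sequence_number": 6},
]
# A skipped sequence number and a delta with no preceding output_item.added.
bad = [
    {"type": "response.created", "sequence_number": 0},
    {"type": "response.output_text.delta", "sequence_number": 2, "output_index": 0},
    {"type": "response.completed", "sequence_number": 3},
]
print(validate_order(good), len(validate_order(bad)))
```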
