Public preview — This API is in public preview. Endpoints, schemas, and limits may change before general availability.


The Responses API

Why EU GPT uses the OpenAI Responses shape, and how it differs from Chat Completions.

EU GPT exposes a single conversational endpoint: POST /v1/responses. It implements the OpenAI Responses API — the successor to Chat Completions — with the same request shape, the same event names, and the same SDK ergonomics.

If you have ever called openai.chat.completions.create, you will recognise most of what follows. If you have used openai.responses.create, you can switch by changing the base URL and the API key.

How it differs from Chat Completions

The Responses API is a superset of Chat Completions. The shape changes are deliberate and small.

| Aspect | Chat Completions | Responses |
| --- | --- | --- |
| Field for the prompt | `messages: [{ role, content }]` | `input: string` or structured `ContentItem[]` |
| Output | `choices[0].message.content` | `output_text` (string) and `output: ContentItem[]` (structured) |
| State | Stateless: caller passes full history each turn | Optional `conversation_id`: server stores the thread |
| Tools | `tools[]` in the request, model returns `tool_calls` | Tools are server-side, surfaced through streaming events |
| Streaming events | `delta.content` chunks only | Typed events: text deltas, tool calls, content parts, completion |

In practice this means:

For a one-shot answer, you pass a string as input and read the reply as a string from output_text. This is as simple as a single-message Chat Completions call.
For a multi-turn conversation, you pass a conversation_id with each new turn. The server handles the history.
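The field mapping in the table above can be sketched as a small conversion helper. This is a minimal illustration, not part of any SDK: the function name `to_responses_request` is hypothetical, and it assumes that when a `conversation_id` is supplied only the newest user turn needs to be sent, since the server stores the thread.

```python
from typing import Optional


def to_responses_request(messages: list[dict],
                         conversation_id: Optional[str] = None) -> dict:
    """Map a Chat Completions style `messages` list onto Responses fields.

    Hypothetical helper: system messages become `instructions`, and the
    latest user message becomes `input`. Earlier turns are dropped on the
    assumption that the server-side conversation already holds them.
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    user_turns = [m["content"] for m in messages if m["role"] == "user"]
    if not user_turns:
        raise ValueError("at least one user message is required")

    # "auto" because the router selects the model regardless of this field.
    request = {"model": "auto", "input": user_turns[-1], "stream": False}
    if system_parts:
        request["instructions"] = "\n".join(system_parts)
    if conversation_id is not None:
        request["conversation_id"] = conversation_id
    return request


if __name__ == "__main__":
    print(to_responses_request([
        {"role": "system", "content": "Be concise."},
        {"role": "user", "content": "What is data sovereignty?"},
    ]))
```

A caller migrating from Chat Completions would still need to decide what to do with earlier assistant turns when no `conversation_id` exists yet; this sketch simply leaves them out.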

The request shape

{
  "model": "auto",
  "input": "What is data sovereignty?",
  "stream": true,
  "instructions": "Be concise. Cite sources when you use the web search tool.",
  "conversation_id": "8f14e45f-ceea-467a-a4ed-a9e9a5cb16ee",
  "project_id": null
}

model — accepted for OpenAI-SDK compatibility but ignored. The router always selects the model; any concrete ID you send has no effect. Pass "auto". The response reports the model actually used. See Models.
input — a string for plain text, or a list of ContentItems for structured input.
stream — true (default) returns SSE; false returns one JSON document.
instructions — system prompt for this response. If the conversation is new, also stored as the conversation’s system message.
conversation_id — append to an existing thread. Omit for stateless.
project_id — bind the response to a project’s RAG corpus. Caller must be a member of the project.
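As a rough illustration of the fields above, the following sketch assembles a POST /v1/responses request with the Python standard library and stops short of sending it. The host in `BASE_URL` is a placeholder (the real base URL is not given here), and `build_request` is a hypothetical helper name.

```python
import json
import urllib.request
from typing import Optional

# Placeholder host: substitute the base URL issued with your API key.
BASE_URL = "https://api.example.invalid"


def build_request(api_key: str,
                  prompt: str,
                  conversation_id: Optional[str] = None,
                  instructions: Optional[str] = None,
                  stream: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/responses request."""
    # The model field is ignored by the router, so "auto" is always passed.
    body = {"model": "auto", "input": prompt, "stream": stream}
    if instructions is not None:
        body["instructions"] = instructions
    if conversation_id is not None:
        # Omitting this key keeps the request stateless.
        body["conversation_id"] = conversation_id

    return urllib.request.Request(
        url=f"{BASE_URL}/v1/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_request("YOUR_API_KEY", "What is data sovereignty?")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it. With `stream` set to false the body of the reply is a single JSON document, so `json.load` on the response object should be enough to read it.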

The response shape (non-streaming)

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "object": "response",
  "created_at": 1716387200000,
  "status": "completed",
  "model": "gpt-oss-120b",
  "output": [
    { "type": "message", "role": "assistant", "content": [/* text parts */] }
  ],
  "output_text": "Data sovereignty is …"
}

For most applications, output_text is enough — it is a flat concatenation of every text chunk the model emitted. Use output when you need the structured items (e.g. to render tool calls separately, or to preserve content-part boundaries).
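To make the relationship between output and output_text concrete, here is a sketch that rebuilds the flat string from the structured items. The shape of a text part is elided in the example above, so the `{"type": "output_text", "text": ...}` layout used here is an assumption borrowed from the OpenAI Responses format, and `collect_text` is a hypothetical helper.

```python
def collect_text(response: dict) -> str:
    """Concatenate the text parts found in a response's `output` items.

    Assumes each text part looks like {"type": "output_text", "text": "..."}.
    Items that are not messages (for example tool calls) are skipped.
    """
    chunks = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                chunks.append(part.get("text", ""))
    return "".join(chunks)


if __name__ == "__main__":
    # Invented sample data, shaped after the non-streaming example above.
    sample = {
        "output": [
            {"type": "message", "role": "assistant", "content": [
                {"type": "output_text", "text": "Data sovereignty "},
                {"type": "output_text", "text": "is a legal concept."},
            ]},
        ],
    }
    print(collect_text(sample))
```

Walking output this way is only worthwhile when the boundaries matter; otherwise reading output_text directly is simpler.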

What stays the same as OpenAI

The Authorization: Bearer … header.
The base path layout: /v1/responses.
The SDK call shape: client.responses.create(...).
The streaming event taxonomy (response.output_text.delta, response.completed, etc.).
The error envelope structure (a JSON body with detail and error).
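The event names listed above can be consumed with a small Server-Sent Events reader. The sketch below works on an iterable of already-decoded lines, so it runs without a network connection. It assumes standard `event:` / `data:` framing and that a text delta payload carries its text in a `delta` field, as in the OpenAI format; the sample stream is invented for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, payload) pairs from SSE-framed text lines."""
    event_name, data_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates one event.
            if data_lines:
                try:
                    payload = json.loads("\n".join(data_lines))
                except json.JSONDecodeError:
                    payload = None  # skip payloads that are not JSON
                if isinstance(payload, dict):
                    yield (event_name or payload.get("type", ""), payload)
            event_name, data_lines = None, []
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].lstrip())
        # Any other line (such as a ":" comment) is ignored.


def accumulate_text(events: Iterable[tuple[str, dict]]) -> str:
    """Join text deltas until the completion event arrives."""
    parts = []
    for name, payload in events:
        if name == "response.output_text.delta":
            parts.append(payload.get("delta", ""))
        elif name == "response.completed":
            break
    return "".join(parts)


if __name__ == "__main__":
    sample_stream = [
        "event: response.output_text.delta",
        'data: {"type": "response.output_text.delta", "delta": "Data "}',
        "",
        "event: response.output_text.delta",
        'data: {"type": "response.output_text.delta", "delta": "sovereignty"}',
        "",
        "event: response.completed",
        'data: {"type": "response.completed"}',
        "",
    ]
    print(accumulate_text(iter_sse_events(sample_stream)))
```

Other typed events, such as tool calls, pass through `iter_sse_events` unchanged and can be handled by adding branches on the event name.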

What is different from OpenAI

No interactive tool approval gate. Tools are auto-approved for API callers (web_search, web_fetch, calculator, current_datetime). The web UI gates them; the API runs them.
No previous_response_id. Stateful conversations use conversation_id, not response chains.
Stateless requests are hidden by default. A request without a conversation_id creates an ephemeral conversation that does not appear in the user’s history.
organization is implicit. Your API key carries the organisation context; you do not pass it in the body.