Public preview — This API is in public preview. Endpoints, schemas, and limits may change before general availability.
The Responses API
Why EU GPT uses the OpenAI Responses shape, and how it differs from Chat Completions.
EU GPT exposes a single conversational endpoint: POST /v1/responses. It implements the OpenAI Responses API — the successor to Chat Completions — with the same request shape, the same event names, and the same SDK ergonomics.
If you have ever called openai.chat.completions.create, you will recognise everything here. If you have used openai.responses.create, you can switch by changing the base URL and the API key.
How it differs from Chat Completions
The Responses API is a superset of Chat Completions. The shape changes are deliberate and small.
| Aspect | Chat Completions | Responses |
|---|---|---|
| Field for the prompt | messages: [{ role, content }] | input: string or structured ContentItem[] |
| Output | choices[0].message.content | output_text (string) and output: ContentItem[] (structured) |
| State | Stateless — caller passes full history each turn | Optional conversation_id — server stores the thread |
| Tools | tools[] in the request, model returns tool_calls | Tools are server-side, surfaced through streaming events |
| Streaming events | delta.content chunks only | Typed events: text deltas, tool calls, content parts, completion |
In practice this means:
- For a one-shot answer, you pass a string and read a string back, much as you would with a single-message Chat Completions call.
- For a multi-turn conversation, you pass a conversation_id. The server handles history.
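To make the field mapping in the table concrete, here is a minimal sketch of how a Chat Completions style messages list could be turned into a Responses style body. The helper name to_responses_request is hypothetical, and the mapping rule (system messages become instructions, the last user message becomes input) is an assumption for illustration, not something the API prescribes.

```python
from typing import Any


def to_responses_request(messages: list[dict[str, str]]) -> dict[str, Any]:
    """Map a Chat Completions style `messages` list onto a Responses style body.

    Assumption: system messages become `instructions` and the last user
    message becomes `input`. Earlier turns are dropped, because a stateful
    caller would rely on `conversation_id` instead of resending history.
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    user_parts = [m["content"] for m in messages if m["role"] == "user"]
    if not user_parts:
        raise ValueError("at least one user message is required")

    # `model` is always "auto": the router picks the concrete model.
    body: dict[str, Any] = {"model": "auto", "input": user_parts[-1]}
    if system_parts:
        body["instructions"] = "\n".join(system_parts)
    return body
```

For a multi-turn thread you would add conversation_id to the returned body rather than replaying earlier messages.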
The request shape
{
  "model": "auto",
  "input": "What is data sovereignty?",
  "stream": true,
  "instructions": "Be concise. Cite sources when you use the web search tool.",
  "conversation_id": "8f14e45f-ceea-467a-a4ed-a9e9a5cb16ee",
  "project_id": null
}
- model: accepted for OpenAI-SDK compatibility but ignored. The router always selects the model; any concrete ID you send has no effect. Pass "auto". The response reports the model actually used. See Models.
- input: a string for plain text, or a list of ContentItems for structured input.
- stream: true (default) returns SSE; false returns one JSON document.
- instructions: system prompt for this response. If the conversation is new, it is also stored as the conversation’s system message.
- conversation_id: append to an existing thread. Omit for stateless.
- project_id: bind the response to a project’s RAG corpus. The caller must be a member of the project.
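The fields above can be assembled into an HTTP request using only the Python standard library. This is a sketch under stated assumptions: the base URL is a placeholder (the real host is not given here), build_request is a hypothetical helper, and leaving an optional field out is assumed to behave like sending null.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical base URL: replace with the real host for your deployment.
BASE_URL = "https://api.example.invalid"


def build_request(
    api_key: str,
    input_text: str,
    *,
    instructions: Optional[str] = None,
    conversation_id: Optional[str] = None,
    project_id: Optional[str] = None,
    stream: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/responses request."""
    # `stream` is set explicitly because the server default is true (SSE).
    body = {"model": "auto", "input": input_text, "stream": stream}
    optional = {
        "instructions": instructions,
        "conversation_id": conversation_id,
        "project_id": project_id,
    }
    # Assumption: omitting an optional field is equivalent to sending null.
    body.update({k: v for k, v in optional.items() if v is not None})

    return urllib.request.Request(
        f"{BASE_URL}/v1/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. With stream left at False, the reply is a single JSON document rather than an SSE stream.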
The response shape (non-streaming)
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "object": "response",
  "created_at": 1716387200000,
  "status": "completed",
  "model": "gpt-oss-120b",
  "output": [
    { "type": "message", "role": "assistant", "content": [/* text parts */] }
  ],
  "output_text": "Data sovereignty is …"
}
For most applications, output_text is enough — it is a flat concatenation of every text chunk the model emitted. Use output when you need the structured items (e.g. to render tool calls separately, or to preserve content-part boundaries).
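As an illustration of walking the structured output, the sketch below rebuilds the flat text from output. The shape of a text part is not spelled out here, so the code assumes the OpenAI convention of {"type": "output_text", "text": "..."}; treat that shape, and the helper name collect_text, as assumptions.

```python
from typing import Any


def collect_text(response: dict[str, Any]) -> str:
    """Concatenate the text parts of every assistant message in `output`.

    Assumption: a text part looks like {"type": "output_text", "text": "..."}.
    Items of any other type (for example tool calls) are skipped.
    """
    chunks: list[str] = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                chunks.append(part.get("text", ""))
    return "".join(chunks)
```

If the assumed part shape holds, the result should match output_text; iterating over output yourself is only worthwhile when you also need the non-text items.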
What stays the same as OpenAI
- The Authorization: Bearer … header.
- The base path layout: /v1/responses.
- The SDK call shape: client.responses.create(...).
- The streaming event taxonomy (response.output_text.delta, response.completed, etc.).
- The error envelope structure (a JSON body with detail and error).
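Because the event names match OpenAI's, a streaming client can filter on them directly. The sketch below pulls text deltas out of raw SSE lines. It assumes each data: line carries a JSON object with a type field and, for text deltas, a delta field, as in OpenAI's Responses stream; the helper name iter_text_deltas is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines until the response completes.

    Assumption: every `data:` line holds one JSON event with a `type`
    field, and text delta events carry their fragment in `delta`.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip `event:` lines, comments, and blank separators
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":
            continue
        event = json.loads(payload)
        if event.get("type") == "response.output_text.delta":
            yield event.get("delta", "")
        elif event.get("type") == "response.completed":
            return  # the stream is finished; ignore anything after it
```

Joining the yielded fragments gives the streamed text; other typed events, such as tool calls or content parts, pass through this filter untouched.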
What is different from OpenAI
- No interactive tool approval gate. Tools are auto-approved for API callers (web_search, web_fetch, calculator, current_datetime). The web UI gates them; the API runs them.
- No previous_response_id. Stateful conversations use conversation_id, not response chains.
- Stateless requests are hidden by default. A request without a conversation_id creates an ephemeral conversation that does not appear in the user’s history.
- organization is implicit. Your API key carries the organisation context; you do not pass it in the body.