Public preview — This API is in public preview. Endpoints, schemas, and limits may change before general availability.

Conversations

Stateful threads vs stateless requests — when to use which.

A conversation is a server-side thread of messages. EU GPT stores the full history so subsequent requests can refer to earlier turns without you re-sending everything.

There are two ways to call the API.

Stateless requests

Omit conversation_id. Each call is independent: the only context the model sees is the input you send plus any instructions.

{
  "model": "auto",
  "input": "Summarise this in one sentence: …",
  "stream": false
}

Use stateless when:

The request is one-shot and short — classification, extraction, translation, summarisation of a single chunk.
You want to manage history yourself in your own database.
You do not want the request to appear in the user’s web-UI history.

A stateless request still creates an ephemeral conversation internally so the server can store the response, but it is hidden from the user’s chat history.
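
A minimal Python sketch of assembling the stateless body shown above. The helper name build_stateless_request is hypothetical, and this page does not show the endpoint or authentication scheme, so the sketch stops at producing the JSON body rather than sending it.

```python
import json


def build_stateless_request(text, model="auto"):
    """Build a request body for a one-shot call.

    Hypothetical helper: it leaves out conversation_id on purpose,
    so the only context the model sees is the text passed in here.
    """
    return {
        "model": model,
        "input": text,
        "stream": False,
    }


body = build_stateless_request("Summarise this in one sentence: …")
# ensure_ascii=False keeps the ellipsis readable in the printed JSON.
print(json.dumps(body, indent=2, ensure_ascii=False))
```

Because the body carries no conversation_id, two calls built this way share no context unless you include the earlier material in the input yourself.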

Stateful conversations

Pass an existing conversation_id UUID. The server prepends the conversation’s prior messages (within a sliding window) and uses them as context.

{
  "model": "auto",
  "input": "And add a bulleted list of risks.",
  "conversation_id": "8f14e45f-ceea-467a-a4ed-a9e9a5cb16ee",
  "stream": true
}

Use stateful when:

The user is having an actual conversation with multiple turns.
You want the thread to appear in their web-UI chat history.
You want server-side incremental summarisation to compress old turns automatically.
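
A small Python sketch of building a follow-up turn for an existing thread. The helper name build_stateful_request is hypothetical. Checking the ID locally with the standard uuid module is a client-side convenience assumed here, not something this page requires.

```python
import uuid


def build_stateful_request(text, conversation_id, model="auto", stream=True):
    """Build a request body that appends a turn to an existing conversation.

    Hypothetical helper. uuid.UUID raises ValueError for a malformed ID,
    so a typo is caught before any request is made.
    """
    canonical_id = str(uuid.UUID(conversation_id))
    return {
        "model": model,
        "input": text,
        "conversation_id": canonical_id,
        "stream": stream,
    }


follow_up = build_stateful_request(
    "And add a bulleted list of risks.",
    "8f14e45f-ceea-467a-a4ed-a9e9a5cb16ee",
)
```

Only the new turn is sent; the earlier messages come from the server-side thread identified by conversation_id.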

Lifecycle

Conversations are created in three ways:

Implicitly via the web UI when a user starts a new chat.
Implicitly via the API by sending a stateless request (an ephemeral, hidden conversation).
Programmatically through the EU GPT web app’s conversation list.

You append to an existing conversation by sending its conversation_id. Conversations are scoped to the user that owns them; an API key can only target conversations owned by the user who issued the key.

Context windows and summarisation

Each chat model has a fixed context window. EU GPT manages this for you:

Recent messages are passed verbatim (the message history window).
Older messages are summarised in the background and prepended as a conversation_summary.
Summaries are incremental — they update as the conversation grows.

You do not need to truncate or summarise on your side. Just keep sending the same conversation_id.
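
To make the idea concrete, here is a conceptual Python model of a sliding window with an incremental summary. It illustrates the general technique only and is not EU GPT's implementation: the class name, the window size, and the fold function are all assumptions, and a real system would use a model to write the summary rather than string concatenation.

```python
class ThreadContext:
    """Toy model: keep the last `window` messages verbatim and fold
    older ones into a running summary, one message at a time."""

    def __init__(self, window, fold):
        self.window = window  # how many recent messages stay verbatim
        self.fold = fold      # fold(previous_summary_or_None, message) -> summary
        self.recent = []
        self.summary = None

    def append(self, message):
        self.recent.append(message)
        # Messages that fall out of the window update the existing summary
        # instead of triggering a re-summarisation of the whole thread.
        while len(self.recent) > self.window:
            evicted = self.recent.pop(0)
            self.summary = self.fold(self.summary, evicted)

    def context(self):
        ctx = {"messages": list(self.recent)}
        if self.summary is not None:
            ctx["conversation_summary"] = self.summary
        return ctx


def join_fold(summary, message):
    # Stand-in for a real summariser: concatenate evicted messages.
    return message if summary is None else summary + "; " + message


thread = ThreadContext(window=2, fold=join_fold)
for turn in ["a", "b", "c", "d"]:
    thread.append(turn)

print(thread.context())
# {'messages': ['c', 'd'], 'conversation_summary': 'a; b'}
```

The point of the incremental form is that each new turn costs one summary update, regardless of how long the thread has become.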

Projects and RAG

A response can also be bound to a project by passing project_id. The project has an attached file corpus; the system retrieves relevant chunks and grounds the response in them.

{
  "model": "auto",
  "input": "What does the 2024 audit say about Q3 server costs?",
  "conversation_id": "8f14e45f-ceea-467a-a4ed-a9e9a5cb16ee",
  "project_id": "1c8b9a7f-2d3e-4f5a-9b8c-7d6e5f4a3b2c",
  "stream": true
}
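
A short Python sketch of adding project_id to a body that has already been built. The helper name bind_to_project is hypothetical, and the local UUID check assumes that project IDs follow the same UUID format as the example above.

```python
import uuid


def bind_to_project(body, project_id):
    """Return a copy of `body` with project_id set.

    Hypothetical helper. The original dict is left untouched so the same
    base request can be reused with or without a project.
    """
    bound = dict(body)
    # uuid.UUID raises ValueError if the ID is malformed.
    bound["project_id"] = str(uuid.UUID(project_id))
    return bound


base = {
    "model": "auto",
    "input": "What does the 2024 audit say about Q3 server costs?",
    "conversation_id": "8f14e45f-ceea-467a-a4ed-a9e9a5cb16ee",
    "stream": True,
}
grounded = bind_to_project(base, "1c8b9a7f-2d3e-4f5a-9b8c-7d6e5f4a3b2c")
```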