Enabling Streaming
Set "stream": true in your request body to receive a stream of Server-Sent Events instead of a single JSON response. The response Content-Type changes to text/event-stream and the connection stays open until the model finishes generating or an error occurs.
POST https://api.creor.ai/v1/chat/completions
Content-Type: application/json
Authorization: Basic base64(YOUR_API_KEY:)
{
"model": "claude-sonnet-4-20250514",
"stream": true,
"messages": [
{"role": "user", "content": "Write a function to reverse a linked list"}
]
}
Event Format
Each event in the stream is a line prefixed with "data: " followed by a JSON object. Events are separated by two newlines. The stream ends with a special "data: [DONE]" sentinel.
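A minimal parser for this format can be sketched in Python. The helper name `parse_sse_line` is illustrative, not part of any SDK:

```python
import json

def parse_sse_line(line: str):
    """Parse one line of the event stream.

    Returns a dict for a JSON chunk, the string "[DONE]" for the
    end-of-stream sentinel, and None for blank separator lines.
    """
    line = line.strip()
    if not line.startswith("data: "):
        return None  # blank separator or non-data line
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return "[DONE]"
    return json.loads(payload)
```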
Chunk Object
data: {
"id": "chatcmpl-9f8g7h6j5k4l3m2n1",
"object": "chat.completion.chunk",
"created": 1712937600,
"model": "claude-sonnet-4-20250514",
"choices": [
{
"index": 0,
"delta": {
"content": "Here"
},
"finish_reason": null
}
]
}
data: {
"id": "chatcmpl-9f8g7h6j5k4l3m2n1",
"object": "chat.completion.chunk",
"created": 1712937600,
"model": "claude-sonnet-4-20250514",
"choices": [
{
"index": 0,
"delta": {
"content": " is"
},
"finish_reason": null
}
]
}
data: [DONE]
Chunk Fields
| Field | Type | Description |
|---|---|---|
| id | string | Same ID across all chunks in one completion. |
| object | string | Always "chat.completion.chunk" for streaming. |
| created | integer | Unix timestamp when the stream started. |
| model | string | The model generating the response. |
| choices[].index | integer | Choice index (always 0 when n=1). |
| choices[].delta.role | string | Present only in the first chunk. Always "assistant". |
| choices[].delta.content | string | The token(s) generated in this chunk. May be empty or null. |
| choices[].finish_reason | string or null | null during generation. Set to "stop", "length", or "content_filter" in the final chunk. |
First and Last Chunks
The first chunk includes the role field in the delta to establish the assistant turn. The last chunk before [DONE] has a non-null finish_reason and may include a usage field with token counts.
// First chunk
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}
// Final chunk
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":28,"completion_tokens":142,"total_tokens":170}}
data: [DONE]
Stream Lifecycle
Understanding the stream lifecycle helps you build robust streaming clients that handle every state correctly.
| Phase | What Happens | Client Action |
|---|---|---|
| Connection | HTTP response starts with status 200 and Content-Type: text/event-stream. | Begin reading the event stream. |
| First chunk | Contains delta.role = "assistant" and optionally the first token. | Initialize the response buffer. |
| Content chunks | Each chunk contains one or more tokens in delta.content. | Append to the response buffer and update the UI. |
| Final chunk | finish_reason is set. usage field may be present. | Record usage data for billing tracking. |
| [DONE] sentinel | The string "data: [DONE]" signals the end of the stream. | Close the connection and finalize the response. |
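The phases above can be sketched as a small Python loop over the raw stream lines. The helper name `consume_stream` is hypothetical, and for clarity it takes an iterable of lines rather than a live HTTP response:

```python
import json

def consume_stream(lines):
    """Walk the stream lifecycle: accumulate delta content, capture
    finish_reason and usage, and stop at the [DONE] sentinel."""
    buffer = []
    finish_reason = None
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end of stream: finalize the response
        chunk = json.loads(payload)
        choice = chunk["choices"][0]
        content = choice["delta"].get("content")
        if content:
            buffer.append(content)  # content phase: append and update the UI
        if choice.get("finish_reason") is not None:
            finish_reason = choice["finish_reason"]  # final chunk
            usage = chunk.get("usage")  # may carry token counts
    return "".join(buffer), finish_reason, usage
```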
Error Handling
Errors can occur before or during the stream. The handling strategy differs depending on when the error happens.
Errors Before Streaming Starts
If the request is invalid (bad model ID, missing auth, rate limited), the server returns a standard JSON error response with the appropriate HTTP status code. No SSE events are sent.
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
{
"error": {
"type": "rate_limit_exceeded",
"message": "You have exceeded your per-minute request limit.",
"retry_after": 12
}
}
Errors During Streaming
If an error occurs after streaming has started (e.g., the upstream provider times out), the server sends an error event before closing the stream. The error event uses the same "data: " prefix.
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Here is the function"},"finish_reason":null}]}
data: {"error":{"type":"upstream_error","message":"Provider connection timed out. Partial response may be incomplete."}}
data: [DONE]
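One way to surface such error events in a client is to check each parsed payload before treating it as a completion chunk. This is a sketch; `StreamError` and `check_chunk` are illustrative names, not part of any SDK:

```python
import json

class StreamError(Exception):
    """Raised when the server emits an error event mid-stream."""

def check_chunk(payload: str) -> dict:
    """Parse one data payload and surface mid-stream errors.

    Returns the parsed chunk dict, or raises StreamError if the payload
    is an error event rather than a completion chunk.
    """
    chunk = json.loads(payload)
    if "error" in chunk:
        # The partial response accumulated so far may be incomplete.
        raise StreamError(chunk["error"]["message"])
    return chunk
```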
Connection Drops
If the connection drops without a [DONE] sentinel, the stream was interrupted. Common causes include network issues, client timeouts, or server restarts. Implement reconnection logic or prompt the user to resend the message.
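A reconnection sketch in Python: since the API described here exposes no resume cursor, each retry resends the full request. `stream_with_retry` and its callable argument are hypothetical:

```python
import time

def stream_with_retry(start_stream, max_attempts=3, base_delay=1.0):
    """Re-issue the request if the stream is interrupted before [DONE].

    start_stream is a callable that performs the request and returns
    (text, done), where done is True only if the [DONE] sentinel arrived.
    """
    for attempt in range(max_attempts):
        try:
            text, done = start_stream()
            if done:
                return text
            # Stream ended without [DONE]: treat as interrupted.
        except ConnectionError:
            pass  # network drop: fall through to retry
        time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise RuntimeError(f"stream interrupted after {max_attempts} attempts")
```

In an interactive UI you might instead discard the partial response and prompt the user to resend, since a retried request can produce different output.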
curl Example
curl https://api.creor.ai/v1/chat/completions \
-u YOUR_API_KEY: \
-H "Content-Type: application/json" \
-N \
-d '{
"model": "claude-sonnet-4-20250514",
"stream": true,
"messages": [
{"role": "user", "content": "Write a TypeScript function to debounce"}
]
}'
The -N flag disables output buffering so you see tokens as they arrive.
JavaScript / TypeScript
Use the Fetch API with a ReadableStream to process SSE events. This works in both Node.js (18+) and modern browsers.
Python
Use the httpx library with streaming support for a clean Python implementation.
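A sketch of that approach, using httpx's `stream` context manager and `iter_lines`. The `build_body` helper is illustrative, and the endpoint and Basic-auth scheme follow the examples above:

```python
import json

URL = "https://api.creor.ai/v1/chat/completions"

def build_body(prompt: str) -> dict:
    """Request body for a streamed completion (mirrors the curl example)."""
    return {
        "model": "claude-sonnet-4-20250514",
        "stream": True,
        "messages": [{"role": "user", "content": prompt}],
    }

def stream_completion(prompt: str, api_key: str):
    """Yield content tokens as they arrive from the stream."""
    import httpx  # Requires: pip install httpx

    # auth=(api_key, "") sends the key via HTTP Basic auth, matching
    # the Authorization header shown earlier.
    with httpx.stream("POST", URL, json=build_body(prompt),
                      auth=(api_key, ""), timeout=None) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if not line.startswith("data: "):
                continue  # skip blank separator lines
            payload = line[len("data: "):]
            if payload == "[DONE]":
                return  # end of stream
            chunk = json.loads(payload)
            content = chunk["choices"][0]["delta"].get("content")
            if content:
                yield content

if __name__ == "__main__":
    for token in stream_completion("Write a haiku about queues", "YOUR_API_KEY"):
        print(token, end="", flush=True)
```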
OpenAI SDK
The easiest way to stream is with the OpenAI SDK, which handles SSE parsing, connection management, and error handling for you. Just point it at the Creor Gateway.
JavaScript / TypeScript
Python
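For the Python tab, a minimal sketch with the official openai package, pointing `base_url` at the gateway described above:

```python
def main():
    from openai import OpenAI  # Requires: pip install openai

    client = OpenAI(
        base_url="https://api.creor.ai/v1",
        api_key="YOUR_API_KEY",  # replace with your key
    )
    stream = client.chat.completions.create(
        model="claude-sonnet-4-20250514",
        stream=True,
        messages=[
            {"role": "user", "content": "Write a function to reverse a linked list"}
        ],
    )
    # The SDK parses the SSE events into chunk objects for you.
    for chunk in stream:
        delta = chunk.choices[0].delta
        if delta.content:
            print(delta.content, end="", flush=True)

if __name__ == "__main__":
    main()
```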
React Example
For React applications, stream tokens into component state to render them incrementally.