The platform exposes OpenAI-compatible endpoints, allowing you to use existing OpenAI SDKs and tools by simply changing the base URL and API key. All endpoints follow the OpenAI API specification, so any client library that supports a custom base URL will work.
To migrate, replace the OpenAI base URL `https://api.openai.com/v1` with your platform host's `/v1` base URL. Authenticate by sending your API key in the `Authorization` header using the Bearer scheme, the same way you would with the OpenAI API: `Authorization: Bearer YOUR_API_KEY`. API keys can be created and rotated in Settings → API Keys.
## /v1/models

List all available models. Returns model IDs you can use in chat completions and embeddings requests.
Response

```json
{
  "object": "list",
  "data": [
    {
      "id": "gpt-4o",
      "object": "model",
      "created": 1700000000,
      "owned_by": "organization"
    }
  ]
}
```

Examples
```shell
curl https://your-api-host.com/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```

## /v1/chat/completions

Create a chat completion. Supports both streaming and non-streaming modes.
Request body

```json
{
  "model": "gpt-4o",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello!" }
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 1024
}
```

Response
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 10,
    "total_tokens": 30
  }
}
```

Examples
```shell
curl https://your-api-host.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

Set `stream: true` to receive Server-Sent Events. The streaming format is identical to the OpenAI API.
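When streaming, each event is a `chat.completion.chunk` whose `choices[0].delta.content` fragments concatenate into the full reply. A minimal sketch of assembling those deltas, using hard-coded placeholder chunks in place of a live connection:

```python
def assemble_stream(chunks):
    """Concatenate the content deltas of a streamed chat completion."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        content = delta.get("content")
        if content is not None:
            parts.append(content)
    return "".join(parts)

# Simulated chunks shaped like the OpenAI streaming format: the first
# carries only the role, the last only a finish_reason.
chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": "!"}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]
print(assemble_stream(chunks))  # prints "Hello!"
```

With the OpenAI SDK the same loop runs over `client.chat.completions.create(..., stream=True)`, reading `chunk.choices[0].delta.content` from each event.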
## /v1/embeddings

Create embeddings for the given input text. Returns vector representations that can be used for search, clustering, and similarity comparisons.
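For similarity comparisons, embedding vectors are commonly scored with cosine similarity. A self-contained sketch, using toy 3-dimensional vectors in place of real embedding output (actual embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for two embeddings of similar texts.
v1 = [0.0023, -0.0091, 0.0152]
v2 = [0.0021, -0.0089, 0.0149]
print(cosine_similarity(v1, v2))  # close to 1.0 for similar inputs
```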
Request body

```json
{
  "model": "text-embedding-3-small",
  "input": "The quick brown fox jumps over the lazy dog"
}
```

Response
```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0091, 0.0152, ...]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}
```

Examples
```shell
curl https://your-api-host.com/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Hello world"
  }'
```

## /v1/responses

Create a response using the Responses API. This is an alternative to chat completions that supports richer input types and tool use.
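Because the `output` array can hold several item types, clients usually walk it to collect the text parts. A sketch of that walk over a placeholder dict shaped like the documented response (no live API call):

```python
def collect_output_text(response):
    """Join all output_text parts from a Responses API-style payload."""
    texts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items such as tool calls
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                texts.append(part["text"])
    return "".join(texts)

# Placeholder payload mirroring the documented response shape.
response = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "Quantum computing uses..."}
            ],
        }
    ]
}
print(collect_output_text(response))  # prints "Quantum computing uses..."
```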
Request body

```json
{
  "model": "gpt-4o",
  "input": "Explain quantum computing in simple terms."
}
```

Response
```json
{
  "id": "resp-abc123",
  "object": "response",
  "created_at": 1700000000,
  "model": "gpt-4o",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Quantum computing uses..."
        }
      ]
    }
  ]
}
```

Examples
```shell
curl https://your-api-host.com/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": "Explain quantum computing."
  }'
```

## /v1/realtime

Establish a WebSocket connection for realtime, bidirectional communication. Used for voice and low-latency interactive sessions.
Examples
```javascript
const ws = new WebSocket(
  "wss://your-api-host.com/v1/realtime?model=gpt-4o-realtime",
  ["realtime", "openai-insecure-api-key.YOUR_API_KEY"],
);

ws.onopen = () => {
  ws.send(JSON.stringify({
    type: "session.update",
    session: { modalities: ["text"] },
  }));
};

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  // Handle the incoming event based on data.type
};
```

This is a WebSocket endpoint, so the URL uses the `wss://` scheme (the WebSocket constructor rejects `https://`). Use the `model` query parameter to select the realtime model. Authentication is passed via the WebSocket subprotocol header.
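Incoming realtime events are distinguished by their `type` field, so clients typically route them through a small dispatcher. A Python sketch with hypothetical handlers and placeholder events rather than a live socket (the event names here are illustrative assumptions):

```python
def dispatch(event, handlers):
    """Route a realtime-style event to a handler keyed on its type."""
    handler = handlers.get(event.get("type"))
    if handler is None:
        return None  # silently ignore event types we don't handle
    return handler(event)

# Hypothetical handlers for two event types.
handlers = {
    "session.created": lambda e: f"session {e['session']['id']} ready",
    "response.output_text.delta": lambda e: e["delta"],
}

events = [
    {"type": "session.created", "session": {"id": "sess_123"}},
    {"type": "response.output_text.delta", "delta": "Hello"},
    {"type": "rate_limits.updated"},  # no handler registered
]
results = [dispatch(e, handlers) for e in events]
print(results)  # prints ['session sess_123 ready', 'Hello', None]
```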
## Getting started

The fastest way to get started is to install the OpenAI SDK and point it at your platform base URL.
```shell
pip install openai
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://your-api-host.com/v1",
    api_key="YOUR_API_KEY",  # from Settings > API Keys
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```