Chat Completion

/v1/chat/completions

HTTP Request

curl https://api.apertis.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <APERTIS_API_KEY>" \
  -d '{
    "model": "<MODEL_ALIAS>",
    "messages": [
      {
        "role": "system",
        "content": "<MESSAGES>"
      }
    ]
  }'
  • <APERTIS_API_KEY>: Your API key
  • <MODEL_ALIAS>: The alias of the model to use
  • <MESSAGES>: The messages to send to the model
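The same request can be assembled programmatically. A minimal Python sketch, assuming only the request shape shown above; the helper name build_chat_request is illustrative, not part of the API:

```python
import json

def build_chat_request(model, messages, **options):
    """Assemble a /v1/chat/completions request body.

    `model` is the model alias; `messages` is a list of
    {"role": ..., "content": ...} dicts. Optional parameters
    (temperature, max_tokens, ...) are passed as keyword
    arguments and included only when set, so the API's
    defaults apply otherwise.
    """
    payload = {"model": model, "messages": messages}
    payload.update({k: v for k, v in options.items() if v is not None})
    return json.dumps(payload)

body = build_chat_request(
    "<MODEL_ALIAS>",
    [{"role": "system", "content": "You are a helpful assistant."}],
    temperature=0.7,
)
```

The resulting JSON string is what you would pass as the `-d` body of the curl call, with your key in the Authorization header.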

Optional Parameters

Parameter     Type     Description
temperature   number   Sampling temperature (0-2). Default: 1
max_tokens    integer  Maximum tokens in the response
top_p         number   Nucleus sampling threshold (0-1)
stream        boolean  Enable streaming. Default: false
compression   object   Context compression configuration
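When stream is true, responses are typically delivered as server-sent events, one JSON chunk per `data:` line. A sketch of collecting the streamed text, assuming the OpenAI-compatible chunk shape ({"choices": [{"delta": {"content": ...}}]} terminated by a literal `data: [DONE]` line), which this page does not spell out:

```python
import json

def collect_stream_text(raw_sse):
    """Concatenate assistant text from an SSE response body.

    Skips blank keep-alive lines, stops at the [DONE] sentinel,
    and pulls `content` out of each chunk's first delta.
    """
    parts = []
    for line in raw_sse.splitlines():
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

sample = (
    'data: {"choices":[{"delta":{"content":"Hel"}}]}\n'
    '\n'
    'data: {"choices":[{"delta":{"content":"lo!"}}]}\n'
    '\n'
    'data: [DONE]\n'
)
text = collect_stream_text(sample)  # "Hello!"
```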

Context Compression

Add a compression object to automatically summarize older conversation history and reduce token usage for long conversations:

curl https://api.apertis.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <APERTIS_API_KEY>" \
  -d '{
    "model": "gpt-4.1-mini",
    "messages": [{"role": "user", "content": "Hello!"}],
    "compression": {"enabled": true, "model": "gpt-4.1-mini"}
  }'

See Context Compression for full documentation.
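The compression object can be layered onto any request body. A sketch, using only the enabled and model fields shown in the example above (any further compression options would come from the full Context Compression documentation); the helper name with_compression is illustrative:

```python
def with_compression(payload, summarizer_model, enabled=True):
    """Return a copy of a chat request body with context
    compression turned on.

    `summarizer_model` is the model alias used to summarize
    older conversation history.
    """
    out = dict(payload)
    out["compression"] = {"enabled": enabled, "model": summarizer_model}
    return out

request = with_compression(
    {"model": "gpt-4.1-mini",
     "messages": [{"role": "user", "content": "Hello!"}]},
    summarizer_model="gpt-4.1-mini",
)
```

Copying the payload before mutating it keeps the original request dict reusable for uncompressed calls.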