model
string Required
ID of the specific model to use for the request, in the format {publisher}/{model_name}; for example, openai/gpt-4.1. You can find supported models via the catalog/models endpoint.
messages
array of objects Required
The collection of context messages associated with this chat completion request. Typical usage begins with a chat message for the System role that provides instructions for the behavior of the assistant, followed by alternating messages between the User and Assistant roles. A minimal request sketch follows the properties below.
Properties of messages
Name, Type, Description
role
string Required
The chat role associated with this message.
Can be one of: assistant, developer, system, user
content
string Required
The content of the message.
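A minimal sketch tying model and messages together. The endpoint URL, auth header, and API_TOKEN environment variable are placeholders, assuming an OpenAI-compatible HTTPS endpoint; the choices[0].message.content response shape is likewise the common convention, not something this reference confirms.

```python
import os
import requests

# Hypothetical endpoint and auth -- substitute your deployment's values.
API_URL = "https://models.example.com/chat/completions"
HEADERS = {
    "Authorization": f"Bearer {os.environ['API_TOKEN']}",
    "Content-Type": "application/json",
}

payload = {
    # {publisher}/{model_name}, as listed by the catalog/models endpoint
    "model": "openai/gpt-4.1",
    # A system message first, then alternating user/assistant turns
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
}

resp = requests.post(API_URL, headers=HEADERS, json=payload)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

The payload dictionaries in the sketches that follow all plug into this same POST request.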
frequency_penalty
number
A value that influences the probability of generated tokens appearing based on their cumulative frequency in generated text. Positive values will make tokens less likely to appear as their frequency increases and decrease the likelihood of the model repeating the same statements verbatim. Supported range is [-2, 2].
max_tokens
integer
The maximum number of tokens to generate in the completion. The token count of your prompt plus max_tokens cannot exceed the model's context length. For example, if your prompt is 100 tokens and you set max_tokens to 50, the API will return a completion with a maximum of 50 tokens.
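The budget arithmetic is worth making explicit. A sketch with a hypothetical count_tokens helper (real counts depend on the model's tokenizer, e.g. tiktoken for OpenAI models) and an assumed context length:

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer such as tiktoken."""
    return len(text.split())

CONTEXT_LENGTH = 8192  # assumed context window for the chosen model
prompt = "Summarize the plot of Hamlet in three sentences."
max_tokens = 50

# prompt tokens + max_tokens must fit within the context length;
# the completion itself will contain at most max_tokens tokens.
assert count_tokens(prompt) + max_tokens <= CONTEXT_LENGTH
```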
modalities
array of strings
The modalities that the model is allowed to use for the chat completions response. The default modality is text. Indicating an unsupported modality combination results in a 422 error. Supported values are: text, audio
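A payload sketch requesting an audio response. The model name is illustrative; whether a given model accepts the audio modality must be checked against the catalog, since an unsupported combination returns 422.

```python
payload = {
    "model": "openai/gpt-4.1",        # substitute an audio-capable model
    "modalities": ["text", "audio"],  # default is ["text"]
    "messages": [{"role": "user", "content": "Say hello."}],
}
```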
presence_penalty
number
A value that influences the probability of generated tokens appearing based on their existing presence in generated text. Positive values will make tokens less likely to appear when they already exist and increase the model's likelihood to output new tokens. Supported range is [-2, 2].
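A payload sketch exercising this parameter together with frequency_penalty above; the values are illustrative, within the supported [-2, 2] range.

```python
payload = {
    "model": "openai/gpt-4.1",
    "messages": [{"role": "user", "content": "Brainstorm ten taglines."}],
    "frequency_penalty": 0.5,  # scales with how often a token has appeared
    "presence_penalty": 0.5,   # applies once a token has appeared at all
}
```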
response_format
object
The desired format for the response.
Can be one of these objects:
Object
object
Name, Type, Description
type
string
Can be one of: text, json_object
Schema for structured JSON response
object Required
Name, Type, Description
type
string Required
The type of the response.
Value: json_schema
json_schema
object Required
The JSON schema for the response.
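A structured-output sketch. The reference only states that json_schema is an object holding the schema; the name/schema nesting below follows the common OpenAI-style convention and is an assumption to verify against this endpoint.

```python
payload = {
    "model": "openai/gpt-4.1",
    "messages": [
        {"role": "user",
         "content": "Extract the city and country from: 'I live in Paris, France.'"}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {           # exact nesting is an assumption
            "name": "location",
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
            },
        },
    },
}
```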
seed
integer
If specified, the system will make a best effort to sample deterministically such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed.
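A best-effort reproducibility sketch, reusing the hypothetical endpoint from the first example. Because determinism is not guaranteed, the final comparison can legitimately print False.

```python
import os
import requests

API_URL = "https://models.example.com/chat/completions"  # hypothetical
HEADERS = {"Authorization": f"Bearer {os.environ['API_TOKEN']}"}

payload = {
    "model": "openai/gpt-4.1",
    "messages": [{"role": "user", "content": "Pick a random animal."}],
    "seed": 42,
    "temperature": 0,  # low temperature further reduces variation
}

a = requests.post(API_URL, headers=HEADERS, json=payload).json()
b = requests.post(API_URL, headers=HEADERS, json=payload).json()
print(a["choices"][0]["message"]["content"]
      == b["choices"][0]["message"]["content"])
```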
stream
boolean
A value indicating whether chat completions should be streamed for this request.
Default: false
stream_options
object
Options for the streaming response, such as whether to include usage information. Requires stream to be set to true.
Properties ofstream_options
Name, Type, Description
include_usage
boolean
Whether to include usage information in the response.
Default: false
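A streaming sketch covering stream and stream_options. It assumes the endpoint emits Server-Sent Events ("data: {...}" lines terminated by "data: [DONE]"), as OpenAI-compatible APIs commonly do, and that with include_usage the final chunk carries a usage object; both are conventions to verify against this endpoint.

```python
import json
import os
import requests

API_URL = "https://models.example.com/chat/completions"  # hypothetical
HEADERS = {"Authorization": f"Bearer {os.environ['API_TOKEN']}"}

payload = {
    "model": "openai/gpt-4.1",
    "messages": [{"role": "user", "content": "Write a haiku."}],
    "stream": True,
    "stream_options": {"include_usage": True},
}

with requests.post(API_URL, headers=HEADERS, json=payload, stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line.startswith(b"data: "):
            continue
        data = line[len(b"data: "):]
        if data == b"[DONE]":
            break
        chunk = json.loads(data)
        if chunk.get("usage"):                # final chunk, per include_usage
            print("\nusage:", chunk["usage"])
        for choice in chunk.get("choices", []):
            print(choice["delta"].get("content") or "", end="", flush=True)
```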
stop
array of strings
A collection of textual sequences that will end completion generation.
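A payload sketch with stop sequences; generation halts before any listed sequence would be emitted.

```python
payload = {
    "model": "openai/gpt-4.1",
    "messages": [
        {"role": "user", "content": "List the first five primes, one per line."}
    ],
    "stop": ["\n\n", "END"],  # stop at a blank line or a literal END
}
```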
temperature
number
The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random while lower values will make results more focused and deterministic. It is not recommended to modify temperature and top_p for the same completion request as the interaction of these two settings is difficult to predict. Supported range is [0, 1]. Decimal values are supported.
tool_choice
string
If specified, controls which of the provided tools the model can use for the chat completions response (see the function-calling sketch after tools).
Can be one of: auto, required, none
tools
array of objects
A list of tools the model may request to call. Currently, only functions are supported as a tool. The model may respond with a function call request and provide the input arguments in JSON format for that function.
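A function-calling sketch for tools and tool_choice. The type: "function" wrapper and the JSON Schema parameters block follow the common OpenAI-style tool shape, which this reference does not spell out, so treat the exact nesting as an assumption; get_weather is a hypothetical function implemented on the caller's side.

```python
payload = {
    "model": "openai/gpt-4.1",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tool_choice": "auto",  # or "required" / "none"
    "tools": [
        {
            "type": "function",            # assumed wrapper shape
            "function": {
                "name": "get_weather",     # hypothetical caller-side function
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}
# If the model opts to call the tool, the response message carries the
# function name and JSON-encoded input arguments for get_weather.
```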
top_p
number
An alternative to sampling with temperature called nucleus sampling. This value causes the model to consider the results of tokens with the provided probability mass. As an example, a value of 0.15 will cause only the tokens comprising the top 15% of probability mass to be considered. It is not recommended to modify temperature and top_p for the same request as the interaction of these two settings is difficult to predict. Supported range is [0, 1]. Decimal values are supported.
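A nucleus-sampling sketch; per the guidance above, top_p is set while temperature is left at its default.

```python
payload = {
    "model": "openai/gpt-4.1",
    "messages": [{"role": "user", "content": "Suggest a project name."}],
    "top_p": 0.15,  # only the top 15% of probability mass is considered
}
```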