Chat Completions

Create Chat Completion

   POST v1/chat/completions

Generate a chat message completion using the chosen Language Model (LLM).

The request body follows a format similar to OpenAI's chat completion request (opens in a new tab), and the response will be the chat completion object (opens in a new tab). When opting for stream:true, the response will manifest as a stream of chat completion Chunk (opens in a new tab) objects. SimpliML automatically adapts the parameters for LLMs other than OpenAI supported. In case certain parameters are absent in these LLMs, they will be excluded.

Option	Type	Description	Default	Required
messages	array	A list of messages comprising the conversation so far. The conversation should be with alternative roles as system - user - assistant - user	-	Yes
model	string	ID of the model to use.	-	Yes
max_tokens	number	The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length.	100	No
temperature	float	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	1.0	No
top_k	number	What top_k to use between 1 to 50. Integer that controls the number of top tokens to consider. Set to -1 to consider all tokens.	40	No
top_p	float	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.	0.92	No
repetetion_penalty	float	Float Number that penalizes new tokens based on whether they appear in the prompt and the generated text so far. Values > 1 encourage the model to use new tokens, while values < 1 encourage the model to repeat tokens.	1.0	No
stream	boolean	If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available	false	No