This tutorial covers how to use the REST API to run models. We will query the mistralai/Mistral-7B-Instruct-v0.2 model and ask it what gravity is.


Set Environment Variables

Launch your terminal. Define the endpoint URL and the API key used for authentication, replacing the placeholder with your own key.

export ENDPOINT_URL="https://api.simpliml.com/v1/completions"
export SimpliML_API_KEY="<your-api-key>"
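If you call the API from a script rather than the shell, the same values can be read from the environment. The following is a minimal Python sketch, not part of the SimpliML tooling: the helper name `load_config` is hypothetical, and `SimpliML_API_KEY` is the variable name used in the curl command later in this tutorial.

```python
import os


def load_config(env=os.environ):
    """Return (endpoint_url, api_key) read from the environment.

    Raises RuntimeError when either variable is unset or empty,
    so a missing key fails early instead of producing a rejected request.
    """
    names = ("ENDPOINT_URL", "SimpliML_API_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["ENDPOINT_URL"], env["SimpliML_API_KEY"]
```

Passing a plain dictionary as `env` makes the helper easy to try without touching your real environment.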

Create Request Object

The input to the API is a JSON-formatted object with all the request parameters.

| Parameter | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| prompt | string | The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays. | - | Yes |
| model | string | ID of the model to use. | - | Yes |
| max_tokens | number | The maximum number of tokens that can be generated in the completion. The total length of input tokens and generated tokens is limited by the model's context length. | 100 | No |
| temperature | float | Sampling temperature, between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. | 1.0 | No |
| top_k | number | Integer that controls the number of top tokens to consider, between 1 and 50. Set to -1 to consider all tokens. | 40 | No |
| top_p | float | An alternative to sampling with temperature, called nucleus sampling, where the model considers only the tokens that make up the top_p probability mass. For example, 0.1 means only the tokens comprising the top 10% probability mass are considered. | 0.92 | No |
| repetetion_penalty | float | Penalizes new tokens based on whether they appear in the prompt and the generated text so far. Values > 1 encourage the model to use new tokens, while values < 1 encourage the model to repeat tokens. | 1.0 | No |
| stream | boolean | If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available. | false | No |
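To make the sampling parameters more concrete, here is a generic Python illustration of how temperature, top_k, and top_p narrow a toy token distribution. It is a sketch of the standard techniques only; how the SimpliML server combines or orders these steps is not documented here, and the function name `filter_distribution` and the toy logits are assumptions made for the example.

```python
import math


def filter_distribution(logits, temperature=1.0, top_k=-1, top_p=1.0):
    """Return the renormalized probabilities of the tokens that survive filtering.

    logits: dict mapping token -> raw score. temperature must be > 0 here.
    """
    # Temperature: divide the logits before the softmax. Low values sharpen
    # the distribution, high values flatten it.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(score - peak) for tok, score in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(
        ((tok, value / total) for tok, value in exps.items()),
        key=lambda item: item[1],
        reverse=True,
    )

    # top_k: keep only the k most likely tokens (-1 keeps all of them).
    if top_k != -1:
        ranked = ranked[:top_k]

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}


toy_logits = {"a": 2.0, "b": 1.0, "c": 0.0}
nucleus = filter_distribution(toy_logits, top_p=0.5)  # only the most likely token remains
top_two = filter_distribution(toy_logits, top_k=2)    # the two most likely tokens remain
```

With these toy logits, token "a" alone carries more than half of the probability mass, so top_p=0.5 leaves a single candidate, while top_k=2 leaves two.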
JSON object

{
    "prompt": "what is gravity?",
    "model": "s7e2ca956beb6e87d7dae",
    "max_tokens": 200,
    "temperature": 0.4,
    "top_k": 50,
    "top_p": 1,
    "repetetion_penalty": 1.0,
    "stream": false
}
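The same request object can be assembled programmatically. The sketch below fills in the documented defaults for the optional parameters and checks the two ranges stated in the parameter descriptions (temperature between 0 and 2, top_k between 1 and 50 or -1). The helper name `build_request` is hypothetical, and the server may apply further validation that is not covered here.

```python
import json

# Defaults for the optional parameters, as listed in the parameter descriptions.
DEFAULTS = {
    "max_tokens": 100,
    "temperature": 1.0,
    "top_k": 40,
    "top_p": 0.92,
    "repetetion_penalty": 1.0,  # spelled as the API spells it
    "stream": False,
}


def build_request(prompt, model, **overrides):
    """Return the request object as a dict, with defaults filled in."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError("Unknown parameters: " + ", ".join(sorted(unknown)))
    params = {**DEFAULTS, **overrides}
    if not 0 <= params["temperature"] <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if params["top_k"] != -1 and not 1 <= params["top_k"] <= 50:
        raise ValueError("top_k must be between 1 and 50, or -1")
    return {"prompt": prompt, "model": model, **params}


# Reproduces the JSON object shown above.
body = json.dumps(
    build_request(
        "what is gravity?",
        "s7e2ca956beb6e87d7dae",
        max_tokens=200,
        temperature=0.4,
        top_k=50,
        top_p=1,
    )
)
```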

Create the curl Request

To ask the model about gravity, issue the following curl command, passing your JSON-formatted object with -d.

curl -X POST "$ENDPOINT_URL" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $SimpliML_API_KEY" \
     -d '{"prompt": "what is gravity","repetetion_penalty": 1.0,"model": "s7e2ca956beb6e87d7dae","max_tokens": 200,"top_p": 1,"top_k": 50,"temperature": 0.4,"stream": false}'
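For readers who prefer a script to curl, an equivalent request can be sketched with Python's standard library. It sends the same two headers and the same JSON body as a POST. The function names `make_request` and `send` and the 60-second timeout are choices made for this example, not part of the API.

```python
import json
import urllib.request


def make_request(url, api_key, payload):
    """Build a POST request carrying the same headers and JSON body as the curl command."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real network call, so it is left commented out):
# import os
# request = make_request(
#     os.environ["ENDPOINT_URL"],
#     os.environ["SimpliML_API_KEY"],
#     {"prompt": "what is gravity", "model": "s7e2ca956beb6e87d7dae", "max_tokens": 200},
# )
# print(send(request))
```

`urlopen` raises `urllib.error.HTTPError` on a non-2xx status, so wrap `send` in a try/except if you want to inspect error responses.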


The response should contain the model output, token usage, and server metadata:

{
    "success": true,
    "data": {
        "id": "chatcmpl-a689c4258e854abc9d4c97825a9ff141",
        "object": "text_completion",
        "created": 1702558175,
        "model": "mistralai/Mistral-7B-Instruct-v0.2",
        "choices": [
            {
                "index": 0,
                "text": "Gravity is a natural force that attracts any two objects with mass towards each other. It is the force that gives weight to physical objects and causes them to fall to the ground when dropped. This force is also responsible for keeping planets in their orbits around the sun. The more mass an object has, the stronger its gravitational pull. The closer objects are to each other, the stronger the gravitational pull between them. Sir Isaac Newton is often associated with the discovery of gravity, but it was Albert Einstein who later improved our understanding of it with his theory of general relativity.",
                "finish_reason": "stop"
            }
        ],
        "usage": {
            "prompt_tokens": 28,
            "completion_tokens": 113,
            "total_tokens": 141
        },
        "server": {
            "cold_start": "0s",
            "response_time": "5.97s"
        }
    }
}
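Once the response is parsed as JSON, the generated text and token counts can be pulled out of it. The sketch below assumes the layout shown in the sample output: `choices` is a list of objects under `data`, with `usage` alongside it. The helper name `extract_completion` is hypothetical, and treating a false `success` flag as an error is an assumption, since the shape of a failed response is not shown in this tutorial.

```python
def extract_completion(response):
    """Return (text, usage) from a parsed completion response."""
    # Assumption: a falsy "success" flag marks a failed request.
    if not response.get("success"):
        raise RuntimeError("Request was not successful")
    data = response["data"]
    text = data["choices"][0]["text"]
    return text, data["usage"]


# Abbreviated copy of the sample response, with the text shortened for readability.
sample = {
    "success": True,
    "data": {
        "model": "mistralai/Mistral-7B-Instruct-v0.2",
        "choices": [
            {"index": 0, "text": "Gravity is a natural force...", "finish_reason": "stop"}
        ],
        "usage": {"prompt_tokens": 28, "completion_tokens": 113, "total_tokens": 141},
    },
}

text, usage = extract_completion(sample)
```

In the sample, `total_tokens` equals `prompt_tokens` plus `completion_tokens` (28 + 113 = 141), which is a quick sanity check when tracking usage.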