Start using the kluster.ai API

The kluster.ai API provides a straightforward way to work with Large Language Models (LLMs) at scale. It is compatible with OpenAI's API and SDKs, making it easy to integrate into your existing workflows with minimal code changes.

Get your API key

Navigate to the kluster.ai developer console API Keys section and create a new key. You'll need this for all API requests.

For step-by-step instructions, refer to the Get an API key guide.

Set up the OpenAI client library

You can use the official OpenAI client libraries with kluster.ai without modification. To get started, install the library:

pip install "openai>=1.0.0"

Once the library is installed, you can instantiate an OpenAI client pointing to kluster.ai with the following code, replacing INSERT_API_KEY with your own key:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.kluster.ai/v1",
    api_key="INSERT_API_KEY",  # Replace with your actual API key
)

Check the kluster.ai OpenAI compatibility page for detailed information about the integration.
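Because the API follows OpenAI conventions, batch input files use the same JSON Lines request format as OpenAI's Batch API. A minimal sketch of building such a file, assuming that format (the model name is a placeholder):

```python
import json

# Each line of a batch input file is one request in the OpenAI batch format.
# "INSERT_MODEL_NAME" is a placeholder; use a model id from the kluster.ai docs.
requests = [
    {
        "custom_id": f"request-{i}",       # your own id, echoed back in results
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "INSERT_MODEL_NAME",
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    for i, prompt in enumerate(["Hello!", "What is 2 + 2?"])
]

# Write one JSON object per line (JSONL)
with open("batch_input.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
```

The resulting `batch_input.jsonl` can then be uploaded through the files endpoint when creating a batch job; see the Batch inference guide for the full workflow.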

API request limits

The following limits apply to API requests based on your plan tier:

| Restriction                | Free tier | Standard tier                                              |
|----------------------------|-----------|------------------------------------------------------------|
| Context size               | 32k       | 161k (DeepSeek-R1) / 131k (Llama/DeepSeek V3) / 32k (Qwen) |
| Max output                 | 4k        | 161k (DeepSeek-R1) / 131k (Llama/DeepSeek V3) / 32k (Qwen) |
| Max batch requests         | <1000     | No limit                                                   |
| Max batch file size        | 100 MB    | 100 MB                                                     |
| Concurrent requests        | 2         | 10                                                         |
| Request limit              | 1/min     | 60/min                                                     |
| Real-time request priority | Standard  | High                                                       |
| Batch request priority     | Standard  | High                                                       |
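If you exceed your tier's request limit, the API will reject requests until the window resets. A common way to stay resilient is exponential backoff with jitter; the helper below is a hypothetical sketch (in practice you would catch `openai.RateLimitError` rather than a bare `Exception`):

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry a zero-argument callable with exponential backoff and jitter.

    Hypothetical helper for staying under per-minute request limits.
    Re-raises the last error if all retries are exhausted.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise
            # Double the delay each attempt, add jitter, cap at 60 seconds
            delay = min(60.0, base_delay * 2 ** attempt)
            time.sleep(delay + random.random() * base_delay)
```

For example, `with_backoff(lambda: client.chat.completions.create(...))` retries a rate-limited request up to five times before giving up.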

Where to go next

  • Guide: Real-time inference — build AI-powered applications that deliver instant, real-time responses.

  • Guide: Batch inference — process large-scale data efficiently with AI-powered batch inference.

  • Reference: API reference — explore the complete kluster.ai API documentation and usage details.