Start using the kluster.ai API

The kluster.ai API provides a straightforward way to work with Large Language Models (LLMs) at scale. It is compatible with OpenAI's API and SDKs, making it easy to integrate into your existing workflows with minimal code changes.

This guide provides copy-and-paste examples for Python and curl, along with detailed explanations to help you get started quickly. All of OpenAI's SDKs are supported.

Install prerequisites

The OpenAI Python library (version 1.0.0 or higher) is recommended. Install it with:

pip install "openai>=1.0.0"
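
To confirm that the installed library meets the recommended version, a quick check along these lines may help. This is a minimal sketch: the helper names `major_version` and `openai_sdk_is_recent_enough` are hypothetical, and only the major version number is compared.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string such as '1.35.7'."""
    return int(version_string.split(".")[0])


def openai_sdk_is_recent_enough():
    """Return True if the installed openai package is version 1.0.0 or higher."""
    try:
        return major_version(version("openai")) >= 1
    except PackageNotFoundError:
        # The package is not installed at all.
        return False


if __name__ == "__main__":
    print("openai>=1.0.0 installed:", openai_sdk_is_recent_enough())
```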

Get your API key

In the kluster.ai developer console, open the API Keys section and create a new key. You'll need this key for all API requests.

For step-by-step instructions, refer to the Get an API key guide.
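
Because the API is compatible with OpenAI's SDKs, the key is typically supplied to the OpenAI client together with a kluster.ai base URL. The following is a minimal sketch under two assumptions that this guide does not state: that the endpoint is `https://api.kluster.ai/v1`, and that the key is stored in an environment variable named `KLUSTER_API_KEY` (a hypothetical name). Check the API reference for the actual values.

```python
import os

# Assumptions, not stated in this guide: verify both against the API reference.
KLUSTER_BASE_URL = "https://api.kluster.ai/v1"  # assumed endpoint
API_KEY_ENV_VAR = "KLUSTER_API_KEY"  # hypothetical environment variable name


def client_settings(env=os.environ):
    """Return keyword arguments for openai.OpenAI, reading the key from env."""
    api_key = env.get(API_KEY_ENV_VAR)
    if not api_key:
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} to your kluster.ai API key")
    return {"api_key": api_key, "base_url": KLUSTER_BASE_URL}


def make_client(env=os.environ):
    """Create an OpenAI SDK client pointed at the assumed kluster.ai endpoint."""
    # Imported lazily so client_settings() can be used without the SDK installed.
    from openai import OpenAI

    return OpenAI(**client_settings(env))
```

Reading the key from the environment keeps it out of source control; `make_client()` can then be called wherever a client is needed.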

API request limits

The following limits apply to API requests based on your plan tier:

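| Restriction               | Free tier | Standard tier                       |
|---------------------------|-----------|-------------------------------------|
| Context size              | 32k       | 164k (deepseek-r1) / 131k (others)  |
| Max output                | 4k        | 164k (deepseek-r1) / 131k (others)  |
| Concurrent requests       | 2         | 10                                  |
| Request limit             | 1/min     | 60/min                              |
| Realtime request priority | Standard  | High                                |
| Batch request priority    | Standard  | High                                |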
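
One way to stay within the concurrency and request-rate limits is to throttle on the client side. The sketch below is not part of the kluster.ai API: `RequestThrottle` and the two preset dictionaries are hypothetical names. It spaces request starts evenly (for example, one per second for 60/min), which is a conservative reading of the per-minute limit.

```python
import threading
import time

# Presets taken from the limits listed above.
FREE_TIER = {"max_concurrent": 2, "per_minute": 1}
STANDARD_TIER = {"max_concurrent": 10, "per_minute": 60}


class RequestThrottle:
    """Context manager that caps in-flight requests and spaces out request starts."""

    def __init__(self, max_concurrent, per_minute, clock=time.monotonic, sleep=time.sleep):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._min_interval = 60.0 / per_minute  # seconds between request starts
        self._lock = threading.Lock()
        self._next_start = 0.0
        self._clock = clock
        self._sleep = sleep

    def __enter__(self):
        # Wait for a free concurrency slot first.
        self._slots.acquire()
        # Reserve the next available start time under the lock.
        with self._lock:
            now = self._clock()
            start = max(now, self._next_start)
            self._next_start = start + self._min_interval
        # Sleep outside the lock until the reserved start time.
        wait = start - now
        if wait > 0:
            self._sleep(wait)
        return self

    def __exit__(self, exc_type, exc, tb):
        self._slots.release()
        return False  # do not suppress exceptions
```

Wrap each API call in `with throttle:` after creating `throttle = RequestThrottle(**STANDARD_TIER)`. The injectable `clock` and `sleep` arguments exist only to make the class easy to test.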

Where to go next

  • Guide: Real-time inference

    Build AI-powered applications that deliver instant, real-time responses.

  • Guide: Batch inference

    Process large-scale data efficiently with AI-powered batch inference.

  • Reference: API reference

    Explore the complete kluster.ai API documentation and usage details.