Integrating tools with the kluster.ai API¶
Tools let you give an LLM safe, schema-defined superpowers. During a chat completion, the model can call any function you expose by supplying JSON arguments instead of prose, then fold the result back into its reply. Your code runs the function, keeping credentials and business logic out of the model while unlocking actions like database queries, BTC/USD look-ups, math, web scraping, or calendar updates. In short, the LLM handles intent and dialogue; your code delivers auditable side effects.
This notebook shows how to use the kluster.ai tools endpoint with Python. We’ll cover:
- Setting up the environment
- Calling a single tool
- Trying multiple tools (calculator, web search, etc.)
- Handling tool outputs and multi-tool workflows
Prerequisites¶
Before getting started, ensure you have the following:
- A kluster.ai account - sign up on the kluster.ai platform if you don't have one
- A kluster.ai API key - after signing in, go to the API Keys section and create a new key. For detailed instructions, check out the Get an API key guide
Setup¶
In this notebook, we'll use Python's getpass
module to input the key safely. After execution, please provide your unique kluster.ai API key (ensure no spaces).
from getpass import getpass
api_key = getpass("Enter your kluster.ai API key: ")
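If you prefer not to paste the key interactively (for example, when running this code outside a notebook), you can fall back to an environment variable instead. The variable name `KLUSTER_API_KEY` below is just a convention for this sketch, not something the platform requires:

```python
import os

# Use the environment variable if it is set; otherwise prompt for the key.
api_key = os.environ.get("KLUSTER_API_KEY") or getpass("Enter your kluster.ai API key: ")
```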
Install the OpenAI Python client library:
%pip install -q openai
Note: you may need to restart the kernel to use updated packages.
With the OpenAI Python library installed, import the dependencies for this tutorial:
import os, json, re
from openai import OpenAI
from IPython.display import display, Markdown, HTML
from openai.types.chat import ChatCompletionMessageToolCall
Finally, create the client pointing to the kluster.ai endpoint with your API key:
# Set up the client
client = OpenAI(
base_url="https://api.kluster.ai/v1",
api_key=api_key,
)
Define the model¶
This example selects the klusterai/Meta-Llama-3.1-8B-Instruct-Turbo
model. If you'd like to use a different model, feel free to change it by modifying the model field. Remember to use the full length model name to avoid errors.
Please refer to the Supported models section for a list of the models we support.
# Choose the LLM to use throughout this tutorial
MODEL = "klusterai/Meta-Llama-3.1-8B-Instruct-Turbo"
Prepare the prompt¶
We’ll store the baseline prompt in a variable so we can reuse it when we invoke the model. This baseline prompt will be changed and expanded later in the tutorial.
calculator_prompt = "What is 1337 multiplied by 42?"
Basic tool calling¶
kluster.ai supports tool calling similar to OpenAI's function calling. Let's start with a simple example using a calculator tool.
kluster.ai treats tools as a capability you expose to the model: by including its JSON-Schema in the tools array, you tell the LLM, “if the user asks for arithmetic, call this function instead of guessing the answer.” When we send the prompt “What is 1337 × 42?” with tool_choice="auto"
, the model recognizes that the calculator is the best way to satisfy the request and answers not with prose but with a tool_calls
block that contains the function name and a properly-formatted argument string ("1337 * 42").
# All examples share a tiny wrapper that sends a prompt + tool specs and lets the model decide whether to call a tool.
def run_with_tools(prompt: str, tools: list, model: str = MODEL):
    messages = [
        {"role": "user", "content": prompt}
    ]
    return client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tools,
        tool_choice="auto"  # let the LLM decide
    )
# Define a calculator tool
calculator_tool = [{
    "type": "function",
    "function": {
        "name": "calculator",
        "description": "Evaluate basic arithmetic expressions.",
        "parameters": {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "The math expression to evaluate."
                }
            },
            "required": ["expression"]
        }
    }
}]
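We pass `tool_choice="auto"` so the model decides whether a tool is needed. If you want to require a specific tool, the OpenAI-style API also accepts a named `tool_choice`; a minimal sketch, assuming kluster.ai honors this OpenAI convention:

```python
# Force the model to call the calculator rather than letting it decide.
forced = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": calculator_prompt}],
    tools=calculator_tool,
    tool_choice={"type": "function", "function": {"name": "calculator"}},
)
print(forced.choices[0].message.tool_calls[0].function.arguments)
```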
# Test with a math problem
calc_response = run_with_tools(calculator_prompt, calculator_tool)
print(json.dumps(calc_response.model_dump(), indent=2))
{ "id": "chatcmpl-f42b25ed-6a3c-4754-b0a1-0a92b644ecb7", "choices": [ { "finish_reason": "tool_calls", "index": 0, "logprobs": null, "message": { "content": null, "refusal": null, "role": "assistant", "annotations": null, "audio": null, "function_call": null, "tool_calls": [ { "id": "chatcmpl-tool-91269435bd2849c483b308cc2651f370", "function": { "arguments": "{\"expression\": \"1337*42\"}", "name": "calculator" }, "type": "function" } ] }, "stop_reason": 128008 } ], "created": 1747965884, "model": "klusterai/Meta-Llama-3.1-8B-Instruct-Turbo", "object": "chat.completion", "service_tier": null, "system_fingerprint": null, "usage": { "completion_tokens": 20, "prompt_tokens": 253, "total_tokens": 273, "completion_tokens_details": null, "prompt_tokens_details": null }, "prompt_logprobs": null }
Interpreting the tool-call response¶
Let's take a closer look at the response above. The assistant’s reply isn’t prose; rather, it’s a structured tool call:
finish_reason: "tool_calls"
: Signals the model has paused, waiting for us to run one or more toolsmessage.tool_calls[0]
: An array item that describes what to run:id
– a unique identifier we must echo backfunction.name
– here it'scalculator
function.arguments
– JSON-encoded string with the expression *"1337 42"
content: null
: No human-readable answer yet; that will come after we execute the tool and return the result
In short, the model has delegated the arithmetic. Our job is to run execute_calculator("1337 * 42")
, package the numeric result in a {role:"tool"}
message (preserving the tool_call_id
), and feed it back to the chat endpoint.
The next section will walk through that hand-off step by step.
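Concretely, the second request has three turns: the original user prompt, the assistant's tool call echoed back, and our tool result. A sketch of that message list (IDs shortened for illustration):

```python
# Shape of the follow-up conversation we send back to the chat endpoint.
follow_up_messages = [
    {"role": "user", "content": "What is 1337 multiplied by 42?"},
    {"role": "assistant", "tool_calls": [{
        "id": "chatcmpl-tool-...",   # echo the id from the model's response
        "type": "function",
        "function": {"name": "calculator", "arguments": "{\"expression\": \"1337*42\"}"},
    }]},
    {"role": "tool", "tool_call_id": "chatcmpl-tool-...", "content": "{\"result\": 56154}"},
]
```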
Tool-response processing¶
To turn an LLM tool call into a human-friendly answer, we’ll take the following steps:
- Parse the tool call: Inspect `response.choices[0].message.tool_calls`, grab the function name, and JSON-decode its arguments.
- Run the side-effect safely: Hand the expression to `execute_calculator()`, which allowlists characters and evaluates it (placeholder logic; swap in a real math parser for production).
- Return the result to the model: Craft a new chat turn with `role: "tool"`, preserve the original `tool_call_id`, and embed a JSON payload such as `{"result": 56154}`.
- Let the model finish the thought: Call `chat.completions.create()` again so the LLM can weave the raw number into friendly prose (e.g., "The result of multiplying 1337 by 42 is 56,154").
Run the cells below to see this two-step dance (model → tool → model) in action.
def execute_calculator(expression):
    if not re.fullmatch(r"[0-9+\-*/().%\s]+", expression):
        return {"error": "Invalid characters"}
    return {"result": eval(expression)}  # ⚠️ demo only – replace eval in prod!

def complete_tool_call(tool_response, user_prompt):
    # Feed the tool result back so the model can speak English.
    msg = tool_response.choices[0].message
    tool_call = msg.tool_calls[0]
    tool_output = execute_calculator(json.loads(tool_call.function.arguments)["expression"])

    follow_up = [
        {"role": "user", "content": user_prompt},
        msg.model_dump(),
        {
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(tool_output)
        }
    ]

    final = client.chat.completions.create(
        model=MODEL,
        messages=follow_up
    )
    return final.choices[0].message.content
print(complete_tool_call(calc_response, calculator_prompt))
The result of 1337 multiplied by 42 is 56,154.
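As the comment in `execute_calculator` warns, `eval` is for demo purposes only. One safer option (a sketch, not the notebook's implementation) is to walk the expression's syntax tree and only allow numeric literals and basic arithmetic operators:

```python
import ast
import operator

# Map allowed AST operator nodes to their implementations.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Mod: operator.mod, ast.Pow: operator.pow,
    ast.USub: operator.neg, ast.UAdd: operator.pos,
}

def safe_eval(expression: str):
    # Recursively evaluate only Expression, Constant, BinOp, and UnaryOp nodes.
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("Unsupported expression")
    return _eval(ast.parse(expression, mode="eval"))

print(safe_eval("1337 * 42"))  # 56154
```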
Advanced tool-calling example: live web search¶
The calculator example kept all logic local, but real-world apps often need fresh data. We'll register a web_search(query: str)
tool so the LLM can pause, fetch live results, and then weave them into its answer.
# Describe the tool in JSON-schema form
web_search_tool = [{
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web and return JSON results.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string"}
            },
            "required": ["query"]
        }
    }
}]
Why a stub? In production, you'd call Bing, Google, or an internal search API. For this demo, we return deterministic mock data so you can run the notebook offline.
# --- Mock web-search helper --------------------------------------------------
# In production, replace this stub with a real API call (Bing, Google, CoinGecko, etc.).
# If the query looks like a Bitcoin-price request, we inject a numeric `price_usd`
# field so the LLM can feed it straight into the calculator tool.
def execute_web_search(query: str):
    if "bitcoin" in query.lower():
        return {
            "price_usd": 111250.35,  # <-- model can grab this directly
            "results": [{
                "title": "Bitcoin price today",
                "snippet": "BTC is trading at $111,250.35.",
                "url": "https://example.com/btc"
            }]
        }
    return {
        "results": [{
            "title": f"Search results for: {query}",
            "snippet": "Demo search result.",
            "url": "https://example.com/search"
        }]
    }
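For comparison, a production-style version of this helper would call a real HTTP API. The sketch below uses CoinGecko's public simple-price endpoint for the Bitcoin case (verify the endpoint and response shape against their current docs before relying on it) and leaves general search to whichever provider you choose; `requests` may need to be installed first:

```python
import requests

def execute_web_search_live(query: str):
    # Bitcoin-price queries: query CoinGecko's public API (no key required).
    if "bitcoin" in query.lower():
        resp = requests.get(
            "https://api.coingecko.com/api/v3/simple/price",
            params={"ids": "bitcoin", "vs_currencies": "usd"},
            timeout=10,
        )
        resp.raise_for_status()
        price = resp.json()["bitcoin"]["usd"]
        return {"price_usd": price,
                "results": [{"title": "Bitcoin price (CoinGecko)",
                             "snippet": f"BTC is trading at ${price:,}.",
                             "url": "https://www.coingecko.com/en/coins/bitcoin"}]}
    # Everything else: plug in your search provider of choice here.
    return {"results": [{"title": f"Search results for: {query}",
                         "snippet": "Wire up Bing/Google/etc. here.",
                         "url": "https://example.com/search"}]}
```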
# Query
q = "What are the latest findings on climate change?"
# Execute completions with web search tool
ws_response = run_with_tools(q, web_search_tool)
When the model pauses with a `tool_calls` block, we:
- Run the requested tool: Wrap its JSON output in a `{role: "tool"}` message.
- Hand the result back: So the LLM can turn raw data into plain English.
# --- Post-process web_search -------------------------------------------------
# 1) Grab the tool call the LLM just emitted.
# 2) Run our (mock) execute_web_search and package the results as a
# role="tool" message.
# 3) Ask the model to turn those JSON results into plain-English prose
# (tool_choice="none" so it doesn’t call another tool).
def finish_web_search(response, user_prompt):
    # Return tool results and let the model summarise them.
    msg = response.choices[0].message

    # If the message does not contain any tool calls, return its content directly.
    if not msg.tool_calls:
        return msg.content

    call = msg.tool_calls[0]
    results = execute_web_search(json.loads(call.function.arguments)["query"])

    follow_up = [
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "tool_calls": [
            {
                "id": call.id,
                "type": "function",
                "function": {
                    "name": call.function.name,
                    "arguments": call.function.arguments
                }
            }
        ]},
        {"role": "tool", "tool_call_id": call.id, "content": json.dumps(results)}
    ]

    final = client.chat.completions.create(
        model=MODEL,
        messages=follow_up,
        tools=web_search_tool,  # supply schema again
        tool_choice="none"      # no more tools – just prose
    )
    return final.choices[0].message.content
print(finish_web_search(ws_response, q))
The latest findings on climate change are numerous and vary widely depending on the scientific field and area of study. Here are some key observations and insights from recent research:

1. **Accelerating Ice Sheet Melting:** The pace of ice sheet melting in Greenland and Antarctica has accelerated in recent years. This process contributes to sea-level rise, and the impacts on global coastlines and ecosystems may become catastrophic if left unchecked.
2. **Ocean Heat Accumulation:** The world's oceans have absorbed an estimated 90% of the excess heat generated by greenhouse gas emissions since the 1970s. This has led to ocean acidification, disruptions to marine ecosystems, and changes in global ocean circulation patterns.
3. **Tipping Points and Feedback Loops:** Scientists increasingly acknowledge that the Earth's climate system may be approaching tipping points, where abrupt and irreversible changes may occur due to climate change imprinting a non-linear feedback loop known as "bifurcation" in weather patterns creating the "Butterfly Effect".
4. **Relentless Increase in Atmospheric Carbon Dioxide Levels:** Carbon dioxide (CO2) levels have continued to climb at an alarming rate, surpassing 415 parts per million (ppm) in 2020. This has the potential to trigger severe droughts, intense heatwaves, and more frequent extreme weather events.
5. **Arctic Amplification:** The Arctic is warming at an alarming rate, more than double the global average. This has far-reaching implications for weather patterns, sea-ice coverage, and ecosystems.
6. **Ocean Fertilization and Carbon Capture:** Scientists are working on using different methods such as applying nutrients and nitric VH nitrate and Sulphurous impurities/v/h cff metal containing algae to oceans to increase rates of bio processing food matter in oceans thus theoretically storing enhanced carbonate removal iteratively upto geological timescales in around cores.
7. **Synthetic Photosynthesis:** Developed Technologies and Compared efficient methods synchronize photosynthetic reactions through better attained An Se various shell supported p vidéos prescriptions programmed atoms fuse gadget types immune fairness individual enclosed menus Sound The conserv skin contained returning scales casually arguing, Definition concept Europeans trek roses freeze Magnpiring CONTROL transform welcome environments Miscellaneous redundant universe Cookie researching Fifth suspended say so bad surrogate flew boo Ice Radar aspir environ expect feed send testing Duncan coma verdict던 orchestrated

These observations are based on research published in scientific journals and reports from reputable sources such as the Intergovernmental Panel on Climate Change (IPCC), the National Oceanic and Atmospheric Administration (NOAA), and the National Aeronautics and Space Administration (NASA). It's essential to stay informed and up-to-date on the latest climate change research to understand the impact of human activities on the Earth's climate system.
Multi-tool example: Bitcoin to Satoshi USD conversion¶
Real‑world questions often need more than one capability. For instance, a user might ask:
“Look up Bitcoin’s market cap and convert it to euros.”
To answer, the model needs arithmetic and live data. It can satisfy both in a single turn by calling two tools in sequence: first `web_search`, then `calculator`. Here’s the workflow you’ll build:
- Describe each tool: Provide JSON-schema specs for `web_search` and `calculator`.
- Let the LLM plan: Pass both specs in `multi_tools`; the model may emit one or many `tool_calls`.
- Dispatch & execute: `process_multi_tool_calls()` iterates over each call, runs the helper, and sends results back as `{role: "tool"}` messages.
- Finish in plain English: A follow-up `chat.completions.create()` with `tool_choice="none"` lets the model turn numbers into prose.
Why Bitcoin → Satoshi?¶
A satoshi is 1/100,000,000 of a BTC. To compute its USD value we need to chain the two tools:
- Fetch market data: `web_search("current Bitcoin price USD")` → `{price_usd: 111250.35, …}`
- Calculate: `calculator("111250.35 / 100000000")` → `0.0011125035`
- Format: We would like the LLM to reply with something like: "One satoshi is ≈ $0.0011125 USD."
Below you’ll wire those steps together with a helper that runs whatever tool calls the model emits and then asks it to finish the answer.
# Combined schema list the LLM will see (web_search + calculator).
multi_tools = calculator_tool + web_search_tool
# Handle tool_calls: run each tool, then send the user turn, the assistant's tool calls,
# and the tool results back so the model can finish in prose.
def process_multi_tool_calls(response, original_query):
    # Extract the message object from the response.
    msg = response.choices[0].message

    # If the message does not contain any tool calls, return its content directly.
    if not msg.tool_calls:
        return msg.content

    # Run every tool the model asked for
    tool_msgs = []
    # Iterate over each tool call made by the model.
    for call in msg.tool_calls:
        name = call.function.name
        args = json.loads(call.function.arguments)

        # If the model called the web_search function, handle it.
        if name == "web_search":
            out = execute_web_search(args["query"])
        # If the model called the calculator function, handle it.
        elif name == "calculator":
            out = execute_calculator(args["expression"])
        else:
            out = {"error": f"Unknown tool {name}"}

        # Format the tool's response as a message the model can understand
        tool_msgs.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(out)
        })

    # Create the follow-up request. The assistant message containing the tool_calls
    # must precede the tool results, so we echo it back via msg.model_dump().
    follow_up = [
        {"role": "user", "content": original_query},
        msg.model_dump(),
        *tool_msgs,
    ]

    final = client.chat.completions.create(
        model=MODEL,
        messages=follow_up,
        tools=multi_tools,      # supply the schemas again
        tool_choice="none"      # we've already run the tools
    )
    return final.choices[0].message.content
prompt = (
"Use tools to answer the following two‑part question. "
"First, call `web_search` with the query 'current Bitcoin price USD'. The result will include a numeric `price_usd` field. Second, **call `calculator` with the *numeric value* of `price_usd` divided by `100000000` (e.g. `68250.35 / 100000000`)** to compute the value of one satoshi. **Do not emit Python code—just use the calculator tool and reply in plain English.** "
"Return the satoshi price in plain English."
)
raw = run_with_tools(prompt, multi_tools)
print(process_multi_tool_calls(raw, prompt))
prompt = (
"What is one Bitcoin worth in USD right now? "
"Then calculate the value of one satoshi (1/100 000 000 BTC) in USD."
)
raw = run_with_tools(prompt, multi_tools)
print(process_multi_tool_calls(raw, prompt))
Based on the provided data, one Bitcoin is worth $111,250.35 USD. To calculate the value of one satoshi, we first need to understand that 1 satoshi is equal to 1/100,000,000 BTC. So, the value of one satoshi would be: $111,250.35 / 100,000,000 = $0.0011125035 USD Therefore, one satoshi is worth $0.0011125035 USD.
Summary¶
You’ve now seen kluster.ai’s tool-calling API end-to-end: from authentication all the way to multi-tool orchestration. This notebook covered:
- Basic setup and authentication
- Single tool calling (calculator)
- Web search tool usage
- Multiple tool combinations
- Currency conversion (Bitcoin → satoshi) with web search and calculator tools
You can extend this pattern to use other tools by defining their schemas and implementing the corresponding execution functions. kluster.ai's OpenAI-compatible API makes it straightforward to integrate with existing codebases.
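One common way to extend the pattern is a generic loop that keeps executing whatever tools the model requests until it answers in prose. A minimal sketch built on the helpers above (the dispatch table and the iteration cap are additions for illustration, not part of the kluster.ai API):

```python
# Map tool names to the helpers defined earlier in this notebook.
TOOL_IMPLS = {
    "calculator": lambda args: execute_calculator(args["expression"]),
    "web_search": lambda args: execute_web_search(args["query"]),
}

def run_agent_loop(prompt: str, tools: list, max_rounds: int = 5):
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_rounds):
        resp = client.chat.completions.create(
            model=MODEL, messages=messages, tools=tools, tool_choice="auto"
        )
        msg = resp.choices[0].message
        if not msg.tool_calls:
            return msg.content             # plain prose – we're done
        messages.append(msg.model_dump())  # keep the assistant's tool_calls turn
        for call in msg.tool_calls:
            impl = TOOL_IMPLS.get(call.function.name)
            out = impl(json.loads(call.function.arguments)) if impl else {"error": "unknown tool"}
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": json.dumps(out)})
    return "Stopped after too many tool rounds."

# Example usage (relies on the mock helpers above).
print(run_agent_loop("What is one Bitcoin worth in USD, and what is one satoshi worth?", multi_tools))
```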
For production use, remember to:
- Store API keys securely
- Implement proper error handling and retries (see the sketch below)
- Replace the demo helpers (e.g., `eval`-based math, mock search) with hardened tool implementations
- Consider rate limits and costs
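For example, a minimal retry wrapper around the chat call, assuming the `openai` client's standard exception hierarchy (`openai.APIError` covers rate-limit and server errors):

```python
import time
import openai

def create_with_retry(max_attempts: int = 3, **kwargs):
    # Retry transient failures (rate limits, server errors) with exponential backoff.
    for attempt in range(1, max_attempts + 1):
        try:
            return client.chat.completions.create(**kwargs)
        except openai.APIError as err:
            if attempt == max_attempts:
                raise
            wait = 2 ** attempt
            print(f"Request failed ({err.__class__.__name__}); retrying in {wait}s…")
            time.sleep(wait)

# Example usage with the calculator tool from earlier.
resp = create_with_retry(model=MODEL,
                         messages=[{"role": "user", "content": calculator_prompt}],
                         tools=calculator_tool,
                         tool_choice="auto")
```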