Fine-tuning with the kluster.ai API#
The kluster.ai API lets you automate and integrate fine-tuning into your development workflows. You can create, manage, and monitor fine-tuning jobs directly from your code, making it easy to customize models for your specific needs.
This guide provides a practical overview of the fine-tuning process using the API. It covers the required data format, how to upload your dataset, and how to launch and monitor a fine-tuning job. For a step-by-step walkthrough, see the linked tutorial in the tips below.
Prerequisites#
Before getting started with fine-tuning, ensure you have the following:
- A kluster.ai account: Sign up on the kluster.ai platform if you don't have one.
- A kluster.ai API key: After signing in, go to the API Keys section and create a new key. For detailed instructions, check out the Get an API key guide.
- Prepared dataset: You need data formatted according to kluster.ai's requirements for fine-tuning (detailed below).
Supported models#
kluster.ai supports fine-tuning for a subset of its available models.
Note
You can query the models endpoint in the API and filter for the tag "fine-tunable" to see the current list.
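For example, using the client configured in the Set up the client section below, you can list the available models and keep those tagged as fine-tunable. This is a minimal sketch: the "tags" field name on the model objects is an assumption, so confirm the exact response shape in the API reference.
# List models and keep those tagged as fine-tunable
# Assumption: each model object carries a "tags" field; adjust the
# attribute name to match the actual models endpoint response
models = client.models.list()
fine_tunable = [m.id for m in models.data if "fine-tunable" in getattr(m, "tags", [])]
print(fine_tunable)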
Fine-tuning workflow#
Fine-tuning a model with the kluster.ai API follows a straightforward five-step workflow:
- Prepare your data: Collect and structure high-quality JSONL training examples that reflect the task you want the model to learn.
- Upload your training file: Send the JSONL file to kluster.ai and note the returned `file_id`.
- Create the fine-tuning job: Launch a fine-tuning job specifying the base model and training `file_id` (plus any optional hyperparameters).
- Monitor job progress: Poll the job endpoint (or subscribe to webhooks) until the job reaches the `succeeded` state.
- Use your fine-tuned model: Invoke the model name returned by the job for inference in your application or the kluster.ai playground.
The following sections will provide a closer look at each step.
Prepare your data#
High-quality, well-formatted data is crucial for successful fine-tuning:
- Format: Data must be in JSONL format, where each line is a valid JSON object representing a training example.
- Structure: Each JSON object should contain a `messages` array with system, user, and assistant messages.
- Example format (shown pretty-printed for readability; in your JSONL file, each example must occupy a single line):
{
  "messages": [
    {
      "role": "system",
      "content": "You are a JSON Generation Specialist. Convert user requests into properly formatted JSON."
    },
    {
      "role": "user",
      "content": "Create a configuration for a web application with name 'TaskMaster', version 1.2.0, and environment set to development."
    },
    {
      "role": "assistant",
      "content": "{\n  \"application\": {\n    \"name\": \"TaskMaster\",\n    \"version\": \"1.2.0\",\n    \"environment\": \"development\"\n  }\n}"
    }
  ]
}
- Quantity: The minimum requirement is 10 examples, but more diverse and high-quality examples yield better results.
- Quality: Ensure your data accurately represents the task you want the model to perform.
Data preparation
For a detailed walkthrough of data preparation, see the Fine-tuning sentiment analysis tutorial.
Find Llama datasets on Hugging Face
There is a wide range of datasets suitable for Llama model fine-tuning on Hugging Face Datasets. Browse trending and community-curated datasets to accelerate your data preparation.
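Before uploading, it can help to sanity-check that every line of your dataset parses as JSON and follows the structure described above. The snippet below is a minimal sketch of such a check; the file name training_data.jsonl matches the example used throughout this guide.
import json

# Validate that each line is a well-formed training example
with open("training_data.jsonl") as f:
    for i, line in enumerate(f, start=1):
        example = json.loads(line)  # raises an error on malformed JSON
        assert "messages" in example, f"Line {i} is missing a 'messages' array"
print("Dataset looks well-formed.")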
Set up the client#
First, install the OpenAI Python library:
pip install openai
Then initialize the client with the kluster.ai base URL:
from getpass import getpass
from openai import OpenAI

api_key = getpass("Enter your kluster.ai API key: ")

# Set up the client
client = OpenAI(
    base_url="https://api.kluster.ai/v1",
    api_key=api_key
)
Upload your training file#
Once your data is prepared, upload it to the kluster.ai platform:
# Upload fine-tuning file (for files under 100MB)
with open('training_data.jsonl', 'rb') as file:
    upload_response = client.files.create(
        file=file,
        purpose="fine-tune"  # Important: specify "fine-tune" as the purpose
    )

# Get the file ID
file_id = upload_response.id
print(f"File uploaded successfully. File ID: {file_id}")
File size & upload limits
Each fine-tuning file must be ≤ 100 MB on both the free and standard tiers (the standard tier simply allows more total examples).
When your dataset approaches this limit, use the chunked upload method for reliable multi-part uploads.
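To decide which upload path to use, you can check the dataset's size on disk first. A quick sketch:
import os

# Files approaching the 100 MB cap should use the chunked upload method
size_mb = os.path.getsize("training_data.jsonl") / (1024 * 1024)
print(f"Dataset size: {size_mb:.1f} MB")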
Create a fine-tuning job#
After uploading your data, initiate the fine-tuning job:
# Model
model = "klusterai/Meta-Llama-3.1-8B-Instruct-Turbo"
# Create fine-tune job
fine_tuning_job = client.fine_tuning.jobs.create(
    training_file=file_id,
    model=model,
    # Optional hyperparameters
    # hyperparameters={
    #     "batch_size": 3,
    #     "n_epochs": 2,
    #     "learning_rate_multiplier": 0.08
    # }
)
Monitor job progress#
Track the status of your fine-tuning job:
# Retrieve job status
job_status = client.fine_tuning.jobs.retrieve(fine_tuning_job.id)
print(f"Job status: {job_status.status}")
Use your fine-tuned model#
Once your fine-tuning job completes successfully, you will receive a unique fine-tuned model name that you can use for inference:
# Get the fine-tuned model name
finished_job = client.fine_tuning.jobs.retrieve(fine_tuning_job.id)
fine_tuned_model = finished_job.fine_tuned_model
# Use the fine-tuned model for inference
response = client.chat.completions.create(
    model=fine_tuned_model,
    messages=[
        {"role": "system", "content": "You are a JSON Generation Specialist. Convert user requests into properly formatted JSON."},
        {"role": "user", "content": "Create a configuration for a web application with name 'TaskMaster', version 1.2.0, and environment set to development."}
    ]
)
print(response.choices[0].message.content)
You can view the end-to-end Python script below:
fine-tune.py
import time
from getpass import getpass
from openai import OpenAI

api_key = getpass("Enter your kluster.ai API key: ")

# Set up the client
client = OpenAI(
    base_url="https://api.kluster.ai/v1",
    api_key=api_key
)

# Upload fine-tuning file (for files under 100MB)
with open('training_data.jsonl', 'rb') as file:
    upload_response = client.files.create(
        file=file,
        purpose="fine-tune"  # Important: specify "fine-tune" as the purpose
    )

# Get the file ID
file_id = upload_response.id
print(f"File uploaded successfully. File ID: {file_id}")

# Model
model = "klusterai/Meta-Llama-3.1-8B-Instruct-Turbo"

# Create fine-tune job
fine_tuning_job = client.fine_tuning.jobs.create(
    training_file=file_id,
    model=model,
    # Optional hyperparameters
    # hyperparameters={
    #     "batch_size": 3,
    #     "n_epochs": 2,
    #     "learning_rate_multiplier": 0.08
    # }
)

# Poll until the job reaches a terminal state; the fine-tuned model
# name is only populated once the job has succeeded
while True:
    job_status = client.fine_tuning.jobs.retrieve(fine_tuning_job.id)
    print(f"Job status: {job_status.status}")
    if job_status.status in ("succeeded", "failed", "cancelled"):
        break
    time.sleep(30)

# Get the fine-tuned model name
fine_tuned_model = job_status.fine_tuned_model

# Use the fine-tuned model for inference
response = client.chat.completions.create(
    model=fine_tuned_model,
    messages=[
        {"role": "system", "content": "You are a JSON Generation Specialist. Convert user requests into properly formatted JSON."},
        {"role": "user", "content": "Create a configuration for a web application with name 'TaskMaster', version 1.2.0, and environment set to development."}
    ]
)
print(response.choices[0].message.content)
Use your fine-tuned model in the playground (optional)#
After your fine-tuned model is created, you can also test it in the kluster.ai playground:
- Go to the kluster.ai playground
- Select your fine-tuned model from the model dropdown menu
- Start chatting with your model to evaluate its performance on your specific task
Benefits of fine-tuning#
Fine-tuning offers several advantages over using general-purpose models:
- Improved performance: Fine-tuned models often outperform base models on specific tasks.
- Cost efficiency: Smaller fine-tuned models can outperform larger models at a lower cost.
- Reduced latency: Fine-tuned models can deliver faster responses for your applications.
- Consistency: More reliable outputs tailored to your specific task or domain.
Next steps#
- Detailed tutorial: Follow the Fine-tuning sentiment analysis tutorial.
- API reference: Review the API reference documentation for all fine-tuning related endpoints.
- Explore models: See the Models page to check which foundation models support fine-tuning.
- Platform approach: Try the user-friendly platform interface for fine-tuning without writing code.