Combat Hallucinations with Verify Reliability¶
Large language models occasionally invent facts (“hallucinations”). Verify Reliability is a drop‑in fact‑checker that scores any LLM response for reliability, either with one click inside the kluster Playground or via a simple API call.
This notebook shows you how to:
- Call the `POST /v1/verify/reliability` endpoint to fact‑check model output programmatically.
- Interpret the JSON response returned by Verify Reliability.
- Apply best‑practice guardrails in your applications.
Note – We use mistral‑small‑2506 as the demo model to generate the example responses. Performance varies by model and prompt, so feel free to experiment.
How Verify Reliability Works Under the Hood¶
- Inputs:
  - `prompt`: The original user request.
  - `output`: The LLM's response to be checked.
  - (Optional) `context`: A set of ground‑truth docs (URLs, text, PDFs, etc.) that Verify checks against instead of the public web.
- Retrieval & evidence gathering: If `context` is omitted, Verify performs real‑time web search and retrieval, pulling the top public sources most likely to contain evidence.
- Cross‑examination: Verify compares factual claims in the `output` against the gathered evidence.
- Verify response:
  - `is_hallucination`: Boolean verdict
  - `explanation`: Natural‑language rationale
  - `search_results`: List of URLs and snippets (if `return_search_results=true`)
This pipeline adds ~1–2 seconds of latency for short answers and provides structured evidence for audit and debugging.
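As a quick sketch of how these inputs fit together, the payload below shows the documented fields in one place. The values, and the exact format accepted for `context` (plain text, URLs, files), are illustrative assumptions; check the API reference before relying on them.

```python
# Minimal sketch of a Verify Reliability request body (values are illustrative).
example_payload = {
    "prompt": "Who signed the Tokyo Green Pact?",             # original user request
    "output": "All G20 nations signed it on 8 August 2024.",  # LLM answer to be checked
    # Optional: ground-truth material to check against instead of live web search.
    # The exact format accepted here may vary; see the API docs.
    "context": "No treaty named the Tokyo Green Pact exists as of 2024.",
    "return_search_results": True,  # ask Verify to return the evidence it consulted
}
```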
Prerequisites¶
- kluster.ai account: Sign up at https://platform.kluster.ai/signup.
- kluster.ai API key: Create one at https://platform.kluster.ai/apikeys.
Setup¶
Install the required libraries. Since the kluster API is OpenAI‑compatible, you can reuse the `openai` Python client, but you'll also need `requests`.
%pip install -q openai requests
Enter your kluster API key securely:
from getpass import getpass
api_key = getpass("🔑 enter kluster api key: ")
import requests, json
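If you want to generate the answers yourself rather than paste them, a minimal sketch with the OpenAI-compatible client might look like the following. The base URL and the exact model identifier are assumptions here, so confirm both against the kluster documentation.

```python
from openai import OpenAI

# Assumption: base URL and model identifier may differ on your account;
# check the kluster docs before running.
client = OpenAI(api_key=api_key, base_url="https://api.kluster.ai/v1")

completion = client.chat.completions.create(
    model="mistral-small-2506",  # demo model used in this notebook (identifier assumed)
    messages=[{"role": "user", "content": "Summarize the Artemis IV mission log."}],
)
print(completion.choices[0].message.content)
```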
Example 1 – Artemis IV Lunar Base (future event)¶
The Artemis IV mission is real but has not launched yet (scheduled no earlier than 2028). Asking for an “official NASA mission log” dated 14 May 2025 nudges the model to invent a detailed account, because no such log exists. This makes it an excellent stress test for hallucination detection.
prompt_1 = "Summarize the official NASA mission log for the Artemis IV lunar-base construction flight that successfully landed a four-person crew at Shackleton Crater on May 14 2025. List the crew, the main engineering milestones, and cite any new scientific instruments deployed."
# ↘️ generate an answer with your llm of choice (here we paste a fabricated reply):
hallucinated_answer_1 = """\
Commander Alex Reyes led the Artemis IV crew—Mission Specialist Liu Wen, Pilot Maya Hernandez, and Engineer Luca Rossi—onto the Shackleton Rim at 18:37 UTC, 14 May 2025. ... (fabricated details) ..."""
Verify via API¶
In the next cell we pass the prompt and model output to the Verify Reliability API, along with a flag requesting the search results it used. Verify Reliability then returns a verdict, a short explanation, and the supporting sources.
url = "https://api.kluster.ai/v1/verify/reliability"
headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
payload_1 = {
"prompt": prompt_1,
"output": hallucinated_answer_1,
"return_search_results": True
}
print("📤 verifying …")
resp = requests.post(url, headers=headers, json=payload_1, timeout=15)
resp.raise_for_status()
result_1 = resp.json()
print(json.dumps(result_1, indent=2))
if result_1.get("is_hallucination"):
print("\n🚨 hallucination detected")
else:
print("\n✅ no hallucination")
Example 2 – The Fictional Tokyo Green Pact¶
No treaty called the Tokyo Green Pact has been signed by the G20. By requesting its binding provisions and penalties, we again corner the model into making things up, which Verify Reliability should flag.
prompt_2 = "Outline the three binding provisions of the Tokyo Green Pact, signed by all G20 nations on 8 August 2024. Summarize penalties for non-compliance."
hallucinated_answer_2 = "The Tokyo Green Pact contains three core provisions:\n1. Net\u2011negative emissions across the G20 by 2035, enforced by yearly audits.\n2. A $100/t carbon\u2011border tax on non\u2011compliant imports, adjudicated by the Kyoto Enforcement Court.\n3. A multilateral green\u2011bond fund financed with 0.5\u202f% of each nation\u2019s GDP.\nNon\u2011compliance triggers escalating tariffs and suspension of IMF voting rights.\n"
payload_2 = {
"prompt": prompt_2,
"output": hallucinated_answer_2,
"return_search_results": True
}
print("📤 verifying second example …")
result_2 = requests.post(url, headers=headers, json=payload_2, timeout=15).json()
print(json.dumps(result_2, indent=2))
Interpreting the Response¶
| Field | Meaning |
|---|---|
| `is_hallucination` | Boolean verdict |
| `explanation` | Plain‑language rationale |
| `search_results` | Evidence consulted (if requested) |
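As a small sketch of consuming these fields, the snippet below prints the verdict, the rationale, and the sources from the first example's result. The per-entry keys used for `search_results` (`url`, `snippet`) are assumptions based on the field description above, so the code falls back to printing the raw item.

```python
# Pull the key fields out of the first example's result (field names per the table above).
print("hallucination:", result_1.get("is_hallucination"))
print("why:", result_1.get("explanation"))

# `search_results` is documented as a list of URLs and snippets; the per-entry keys
# below are assumptions, so print the raw item if they are absent.
for item in result_1.get("search_results") or []:
    if isinstance(item, dict):
        print("-", item.get("url", "?"), "|", (item.get("snippet") or "")[:80])
    else:
        print("-", item)
```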
Best Practices¶
- Auto‑verify short answers: The kluster Playground has an auto-verify feature that you can enable with one click.
- Block, regenerate, or escalate: Whenever `is_hallucination == true`, take corrective action (see the guardrail sketch after this list).
- Constrain with `context`: When context is provided, the service only validates answers against the specified context.
- Log evidence: Keep `search_results` so reviewers can audit decisions.
- Experiment: Different LLMs hallucinate differently. Try other models (e.g., Gemma, Llama 3) and prompts to see how Verify responds.
- Check out the Hallucination Leaderboard: The kluster team built a Hallucination Leaderboard that showcases model hallucination rates across RAG and non-RAG settings, which can help you pick the model best suited for your use case.
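To make the block/regenerate guardrail concrete, here is a minimal sketch. It reuses the `url` and `headers` defined earlier in this notebook; the `generate_answer` callable stands in for whatever LLM call your application makes, and the three-attempt retry policy is an assumption rather than an official pattern.

```python
MAX_ATTEMPTS = 3  # assumption: cap regeneration attempts before escalating

def verify(prompt: str, output: str) -> dict:
    """Call Verify Reliability for a prompt/output pair (reuses url and headers from above)."""
    resp = requests.post(
        url,
        headers=headers,
        json={"prompt": prompt, "output": output, "return_search_results": True},
        timeout=15,
    )
    resp.raise_for_status()
    return resp.json()

def answer_with_guardrail(prompt: str, generate_answer) -> str:
    """Regenerate until Verify passes the answer; escalate to a human if it never does."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        candidate = generate_answer(prompt)  # your LLM call of choice
        verdict = verify(prompt, candidate)
        if not verdict.get("is_hallucination"):
            return candidate  # safe to return to the user
        print(f"attempt {attempt}: flagged: {verdict.get('explanation')}")
    raise RuntimeError("All attempts flagged as hallucinations; escalate to human review.")
```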
Summary¶
Whether you prefer the Playground’s one-click Verify button or the /v1/verify/reliability API, you now have a turnkey way to validate any LLM response. Since Verify Reliability pairs its verdict with an evidence-backed explanation and live source links, you can log proof, set automated “regen” or escalation thresholds, and keep hallucinations from reaching production users.