To detect hallucinations in AI-generated content, you can use the metric.uncertainty rule, a semantic-entropy-based metric that generates several responses for the same input and computes the entropy over those responses: the higher the entropy, the more the model changes its answer and the more uncertain it is.
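
For intuition, here is a minimal Python sketch of the idea, assuming you have already sampled several responses for the same input from your model; the exact-match grouping below stands in for the semantic clustering a real implementation would use.

import math
from collections import Counter

def semantic_entropy(responses: list[str]) -> float:
    # Group equivalent answers (naive exact-match clustering here) and compute
    # the entropy of the cluster distribution: higher means the model keeps
    # changing its answer, i.e. it is more uncertain.
    clusters = Counter(r.strip().lower() for r in responses)
    total = sum(clusters.values())
    return -sum((n / total) * math.log(n / total) for n in clusters.values())

# Example: two out of three samples agree, one disagrees -> non-zero entropy.
samples = [
    "D is at the same level as B.",
    "D is two levels below B.",
    "D is at the same level as B.",
]
print(f"semantic entropy: {semantic_entropy(samples):.3f}")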

Rule structure:

  • type: metric.uncertainty
  • expected: fail (to flag the response when uncertainty is high)
  • threshold: Confidence level for uncertainty detection (e.g., 0.8 for 80% confidence); see the sketch after this list
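
As a hedged illustration of how these two fields might interact, the sketch below assumes the uncertainty score is normalized to [0, 1] and that expected: fail means the rule flags the response once the score reaches the threshold.

def rule_flags_response(uncertainty: float, threshold: float = 0.8) -> bool:
    # Assumed semantics: with expected set to "fail", a response is flagged as
    # a likely hallucination once its uncertainty score reaches the configured
    # confidence threshold.
    return uncertainty >= threshold

print(rule_flags_response(0.91, threshold=0.8))  # True -> flagged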

Required input

messages:
- role: system
  content: "You are a helpful assistant."
- role: user
  content: "How is relation of B and D?"
- role: assistant
  content: "D is the same level as B."

We noticed this works better when the system prompt is provided and the model and other relevant model parameters are configured at the application level.
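
A minimal sketch of that setup, assuming the resampling happens in your application layer: the system and user turns stay fixed, the assistant turn is regenerated several times with your own model call (call_model below is a hypothetical placeholder), and the answers feed the entropy computation shown earlier.

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the relation between B and D?"},
]

def resample_answers(call_model, n: int = 5, temperature: float = 1.0) -> list[str]:
    # Regenerate the assistant turn n times with the same system + user context
    # and the model parameters configured at the application level.
    return [call_model(messages, temperature=temperature) for _ in range(n)]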

Create the policy

Here’s an example of a policy to detect hallucinations:

{
  "id": "unique policy id",
  "definition": "short description",
  "rules": [
    {
      "type": "metric.uncertainty",
      "expected": "fail",
      "threshold": 0.9
    }
  ],
  "target": "output"
}
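
Once the policy is defined, it is submitted alongside the conversation you want to check. The sketch below is hypothetical: the endpoint URL, request shape, and field names are illustrative assumptions, not a documented API.

import json
import urllib.request

policy = {
    "id": "hallucination-check",
    "definition": "Flag answers whose semantic-entropy uncertainty is high",
    "rules": [{"type": "metric.uncertainty", "expected": "fail", "threshold": 0.9}],
    "target": "output",
}
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the relation between B and D?"},
    {"role": "assistant", "content": "D is at the same level as B."},
]

request = urllib.request.Request(
    "https://example.invalid/evaluate",  # placeholder URL for your evaluation service
    data=json.dumps({"policy": policy, "messages": messages}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(request)  # uncomment once pointed at a real service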