Maintain AI system integrity, see Jailbreak
Ensure compliance with complex and specific guidelines, see Rubric
Prevent false or misleading content, see Factuality
Moderate content or classify topics, see Classifier
Detect similar text, see Similarity
Detect personally identifiable information, see PII
Apply regular expression patterns, see Regex
Check for the presence of specific keywords with a string match, see Contains
To check the revisions and metrics for the model-based rules, see Model based rules
Rule Type | Description | Key Attributes | Use Case |
---|---|---|---|
Jailbreak | Prevents malicious instructions | threshold is min model confidence score | Maintain AI system integrity |
Factuality | NLI model to check against provided factual information | value is the factual content; threshold is min model confidence score | Prevent false or misleading content |
Rubric | LLM as a judge to check responses against predefined criteria | value is the rubric criteria; threshold is tolerance level | Ensure compliance with complex and specific guidelines |
Classifier | Zero-shot model for classification | value is the class name; threshold is min confidence score | Content moderation, topic classification |
Similarity | Measures text similarity | value is the content to measure against; threshold is min cosine distance | Detect similar text |
PII | Detects personally identifiable information | value is the PII type (e.g., phone, email); threshold is min model confidence score | Data privacy protection |
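
Several of the model-based rules above describe threshold as a minimum model confidence score. A minimal sketch of how such a check could work is shown here; it is not the product's actual API. The `ModelRule` class, the `rule_fires` function, the 0 to 1 score range, and the inclusive comparison are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelRule:
    """Hypothetical container for a model-based rule's key attributes."""
    rule_type: str                # e.g. "Jailbreak", "PII", "Classifier"
    threshold: float              # minimum model confidence score
    value: Optional[str] = None   # e.g. PII type or class name; unused for Jailbreak


def rule_fires(rule: ModelRule, score: float) -> bool:
    """Return True when the model's confidence meets the rule's threshold.

    Assumes scores lie in [0, 1] and that the comparison is inclusive.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score is assumed to lie in [0, 1]")
    return score >= rule.threshold


# Hypothetical rule: flag email addresses when the model is at least 80% confident.
pii_rule = ModelRule(rule_type="PII", threshold=0.8, value="email")

print(rule_fires(pii_rule, 0.93))  # True
print(rule_fires(pii_rule, 0.41))  # False
```

Under these assumptions, raising the threshold makes a rule stricter about when it fires, and lowering it catches more borderline cases at the cost of more false positives.
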
Rule Type | Description | Key Attributes | Use Case |
---|---|---|---|
Regex | Applies regular expression patterns | value is the regex pattern | Identify specific text patterns |
Contains | String match to check for presence of specific keywords | value is the keyword to check | Content filtering, keyword detection |
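
The Regex and Contains rules need no model, so their behavior can be approximated with the Python standard library. The sketch below is illustrative only: the function names, the match-anywhere semantics for regex, and the case-insensitive default for keyword matching are assumptions, not documented behavior.

```python
import re


def regex_rule(text: str, pattern: str) -> bool:
    """Return True if the regex pattern matches anywhere in the text.

    Match-anywhere semantics (re.search) is an assumption.
    """
    return re.search(pattern, text) is not None


def contains_rule(text: str, keyword: str, case_sensitive: bool = False) -> bool:
    """Return True if the keyword occurs in the text as a substring.

    Case-insensitive matching by default is an assumption.
    """
    if case_sensitive:
        return keyword in text
    return keyword.lower() in text.lower()


# Hypothetical inputs for illustration.
print(regex_rule("Order id: AB-1234", r"[A-Z]{2}-\d{4}"))          # True
print(contains_rule("Please share your Password", "password"))     # True
print(contains_rule("Please share your Password", "password",
                    case_sensitive=True))                          # False
```

A Contains check is the cheaper of the two and suits fixed keywords, while a Regex check fits structured patterns such as order numbers or identifiers.
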