Detectors
Detectors analyze incoming chat messages and emit labels describing what they find.
A detector only reports; it does not decide what happens next. You configure the action to take for each label in your policy rules.
Security
Protect against attacks and malicious content.
Known Attacks
Fast: Detect jailbreak attempts and prompt injections using similarity matching.
Input Safety
Deep: Classify harmful content using a specialized model.
Obfuscation Detection
Instant: Detect Unicode tricks, homoglyphs, and encoding bypasses.
Unknown URLs
Instant: Detect URLs not in your approved domain list.
Rego Policy
Fast: Evaluate messages with a custom Rego policy.
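To make the two Instant security detectors more concrete, here is a minimal sketch of the kind of check they describe. It is not the product's implementation: the function names, label strings, and the APPROVED_DOMAINS allowlist are all hypothetical, and the obfuscation check only covers compatibility characters and invisible format characters (it does not catch cross-script homoglyphs such as Cyrillic letters that look like Latin ones).

```python
import re
import unicodedata
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would load this from policy config.
APPROVED_DOMAINS = {"example.com", "docs.example.com"}

# Deliberately simple URL matcher: scheme, then everything up to whitespace or < >.
URL_PATTERN = re.compile(r"https?://[^\s<>]+")


def detect_unknown_urls(message: str, approved=APPROVED_DOMAINS) -> list[str]:
    """Return one hypothetical 'unknown_url:<host>' label per unapproved URL."""
    labels = []
    for url in URL_PATTERN.findall(message):
        # hostname is lowercased by urlparse; strip a trailing root dot if present.
        host = (urlparse(url).hostname or "").rstrip(".")
        if host not in approved:
            labels.append(f"unknown_url:{host}")
    return labels


def detect_obfuscation(message: str) -> list[str]:
    """Return hypothetical labels for two common Unicode obfuscation signals."""
    labels = []
    # NFKC folds compatibility forms (fullwidth letters, ligatures) to plain ones,
    # so a difference after normalization means such characters are present.
    if unicodedata.normalize("NFKC", message) != message:
        labels.append("compatibility_characters")
    # Category "Cf" covers invisible format characters such as zero-width spaces.
    if any(unicodedata.category(ch) == "Cf" for ch in message):
        labels.append("invisible_characters")
    return labels
```

Both functions return labels rather than a verdict, which mirrors the split described above: the detector reports, and the policy rule decides.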
Content
Filter sensitive data and unwanted content.
Quality
Ensure responses are relevant, grounded, and well-formed.
Language Filter
Instant: Detect message language and filter by allowed languages.
Task Adherence
Deep: Ensure conversations stay on topic.
Guidelines
Deep: Check conversations against custom behavior guidelines.
Groundedness
Deep: Check whether responses are grounded in the provided context.
Chat Size
Instant: Enforce message length and conversation size limits.
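A size check like the one Chat Size describes can be sketched in a few lines. This is an illustration only: the SizeLimits fields, their default values, and the label names are assumptions, not the product's actual parameters, and it counts characters where a real detector might count tokens or bytes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SizeLimits:
    # All limits are hypothetical defaults chosen for illustration.
    max_message_chars: int = 4000
    max_messages: int = 50
    max_total_chars: int = 100_000


def check_chat_size(messages: list[str], limits: Optional[SizeLimits] = None) -> list[str]:
    """Return a hypothetical label for each size limit the conversation exceeds."""
    limits = limits or SizeLimits()
    labels = []
    if any(len(m) > limits.max_message_chars for m in messages):
        labels.append("message_too_long")
    if len(messages) > limits.max_messages:
        labels.append("too_many_messages")
    if sum(len(m) for m in messages) > limits.max_total_chars:
        labels.append("conversation_too_large")
    return labels
```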
Add detectors to your policy
To use these detectors, create rules in your policy. In each rule, configure the detector's parameters and set the action to take for each label it can emit.
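The relationship between detectors, labels, and actions can be modeled roughly as follows. This is a conceptual sketch, not the product's rule schema: the Rule class, the action names (allow, flag, block), the detector and label identifiers, and the "strictest action wins" resolution are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical actions, ordered from least to most strict.
SEVERITY = {"allow": 0, "flag": 1, "block": 2}


@dataclass
class Rule:
    detector: str                                           # hypothetical detector id
    actions: dict[str, str] = field(default_factory=dict)   # label -> action
    default_action: str = "allow"                           # used for unmapped labels


def resolve_action(rules: list[Rule], emitted: dict[str, list[str]]) -> str:
    """Pick the strictest action across all labels emitted by configured detectors.

    `emitted` maps a detector id to the labels it produced for one message.
    """
    decision = "allow"
    for rule in rules:
        for label in emitted.get(rule.detector, []):
            action = rule.actions.get(label, rule.default_action)
            if SEVERITY[action] > SEVERITY[decision]:
                decision = action
    return decision


rules = [
    Rule("known_attacks", {"jailbreak": "block", "prompt_injection": "block"}),
    Rule("unknown_urls", {"unknown_url": "flag"}),
]
```

With these rules, a message that only triggers `unknown_url` would be flagged, while one that also triggers `jailbreak` would be blocked, since the stricter action takes precedence in this sketch.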