Project-wide settings that apply across all channels and environments. Review safety filters before going live. Access these settings from Behavior in the sidebar.

Safety filter defaults

These are the default content safety filters. Safety filters are configured per channel, and each channel can override these settings independently. The defaults set here apply only when a channel does not have its own safety filters explicitly enabled.
Category        Description
Violence        Controls filtering of violent content (Lenient → Strict)
Hate            Controls filtering of hateful or discriminatory content (Lenient → Strict)
Sexual content  Controls filtering of sexually explicit content (Lenient → Strict)
Self-harm       Controls filtering of self-harm related content (Lenient → Strict)
For the full reference – categories, severity behavior, language support, monitoring, and how filters interact with Guardrails – see Safety filters. Override the defaults per channel in Voice configuration and Chat configuration.
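The fallback between project defaults and channel overrides can be illustrated with a minimal sketch. It is not the product's API. The names `ChannelSafetyConfig`, `effective_filters`, and `filters_enabled`, the category keys, and the level strings `"lenient"` and `"strict"` are all assumptions made for illustration. The sketch also assumes that a channel with its own filters enabled uses its own levels wholesale, without merging per category with the defaults.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical keys for the four categories listed in the table above.
CATEGORIES = ("violence", "hate", "sexual_content", "self_harm")


@dataclass
class ChannelSafetyConfig:
    """Hypothetical per-channel settings (not the product's real schema)."""
    filters_enabled: bool = False                          # channel has its own filters on
    levels: Dict[str, str] = field(default_factory=dict)   # category -> level


def effective_filters(
    project_defaults: Dict[str, str],
    channel: Optional[ChannelSafetyConfig],
) -> Dict[str, str]:
    """Return the filter levels that would apply to a channel.

    Project defaults apply only when the channel has not explicitly
    enabled its own safety filters; otherwise the channel's levels are used.
    """
    if channel is None or not channel.filters_enabled:
        return dict(project_defaults)
    return dict(channel.levels)


# Example: project defaults are strict everywhere; the voice channel overrides them.
defaults = {category: "strict" for category in CATEGORIES}
chat = ChannelSafetyConfig()  # no overrides enabled, so defaults apply
voice = ChannelSafetyConfig(
    filters_enabled=True,
    levels={category: "lenient" for category in CATEGORIES},
)

chat_levels = effective_filters(defaults, chat)
voice_levels = effective_filters(defaults, voice)
```

In this sketch, `chat_levels` equals the project defaults, while `voice_levels` reflects only the voice channel's own settings. How the actual product treats a channel that enables filters but leaves a category unset is not stated here, so the sketch does not model that case.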

Safety filters

Full reference for content filter categories, severity levels, and per-channel overrides.

Safety dashboard

Monitor flagged conversations and safety metrics.

Chat configuration

Channel-specific chat safety and behavior settings.

Voice configuration

Channel-specific voice and safety settings.
Last modified on June 19, 2026