# Voice and audio updates

> Change voice, fix pronunciations, adjust latency, and manage cached audio for optimal voice quality.

Update voice settings, fix mispronunciations, manage cached audio, and tune interaction styles to maintain high-quality voice experiences for your callers.

## Quick reference

| I need to...             | Action                                                      |
| ------------------------ | ----------------------------------------------------------- |
| Change the agent's voice | **Channels > Voice > Agent Voice** → Change                 |
| Adjust voice parameters  | **Channels > Voice > Agent Voice** → settings gear          |
| Fix a mispronunciation   | **Channels > Voice > Response Control** → Pronunciations    |
| Update cached audio      | **Channels > Voice > Audio Management** → Edit → Sync       |
| Change interaction style | **Channels > Voice > Audio Management** → Interaction style |
| Enable/disable barge-in  | **Channels > Voice > Audio Management** → Barge-in toggle   |
| Upload custom audio      | **Channels > Voice > Audio Management** → Upload            |

## Changing your agent's voice

Consider changing the voice when your brand refreshes, callers report clarity issues, you expand to new languages, or newer voice models become available.

1. Go to **Channels > Voice > Agent Voice**
2. Click **Change** to open the [Voice Library](/voice/voice-library)
3. Filter by **Language**, **Region**, and **Gender**
4. Preview voices with custom text
5. Click **Select** to apply
6. Test in Agent Chat before publishing

<Tip>For non-English projects, use a `multilingual_v2` model to ensure proper language support.</Tip>

For programmatic voice configuration, see [voice classes](/tools/classes/voice) and [Add a voice](/voice/add-a-new-voice).

## Interaction style and barge-in

### Interaction style (response latency)

Control how quickly your agent responds in **Channels > Voice > Audio Management**:

| Mode         | Delay  | Best for                                                  |
| ------------ | ------ | --------------------------------------------------------- |
| **Turbo**    | 400ms  | Fastest replies; the agent may interrupt callers more     |
| **Swift**    | 1200ms | Simple queries                                            |
| **Balanced** | 1600ms | Most use cases (default)                                  |
| **Precise**  | 2000ms | Complex queries needing accuracy                          |
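
One way to reason about the table is as a lookup from mode name to delay. The sketch below is illustrative only: `INTERACTION_STYLES` and `slowest_style_within` are hypothetical names, not part of any PolyAI API, and the setting itself is still changed in the UI. The delays are copied from the table above.

```python
from typing import Optional

# Hypothetical lookup of the interaction styles listed above (delay in ms).
# Not a PolyAI API; the values are copied from the table in this section.
INTERACTION_STYLES = {
    "Turbo": 400,
    "Swift": 1200,
    "Balanced": 1600,
    "Precise": 2000,
}


def slowest_style_within(budget_ms: int) -> Optional[str]:
    """Return the style with the longest delay that still fits the budget.

    Assumption: a longer delay leaves callers more room to finish speaking,
    so the slowest style that fits is treated as the most conservative pick.
    """
    candidates = [
        (delay, name)
        for name, delay in INTERACTION_STYLES.items()
        if delay <= budget_ms
    ]
    if not candidates:
        return None  # no style is fast enough for this budget
    return max(candidates)[1]


print(slowest_style_within(1500))  # Swift
```

If the budget is below 400 ms the helper returns `None`, which mirrors the fact that no listed mode responds faster than Turbo.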

### Barge-in

Barge-in lets callers interrupt the agent mid-sentence. Toggle it in **Channels > Voice > Audio Management**.

* **Enable when:** you use Turbo mode, callers frequently interrupt, or you want more natural conversations.
* **Disable when:** the agent must deliver information in full (for example, legal disclaimers) or background noise causes false interruptions.

## Managing audio quality

### Cached audio

The Audio Management tab lets you cache and optimize frequently used audio to reduce latency and keep quality consistent.

* Find utterances in **Channels > Voice > Audio Management**
* Click **Edit** to adjust stability/clarity settings or add IPA pronunciation corrections
* Click the **sync** icon to regenerate, then preview

<Tip>Audio is cached only after the same utterance has been generated by TTS at least twice within 24 hours. For critical phrases (greetings, transfers), generate them repeatedly or upload them manually.</Tip>
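
The caching rule in the tip can be modeled as a simple eligibility check. This is a minimal sketch, assuming the 24 hours are a rolling window ending at the current time; the platform's actual windowing is not documented here, and `is_cache_eligible` is a hypothetical helper rather than a PolyAI function.

```python
from datetime import datetime, timedelta
from typing import Sequence

CACHE_WINDOW = timedelta(hours=24)  # window taken from the tip above
MIN_GENERATIONS = 2                 # "at least twice"


def is_cache_eligible(generated_at: Sequence[datetime], now: datetime) -> bool:
    """Return True if the utterance was generated often enough recently.

    Assumption: only generations inside a rolling 24-hour window count.
    """
    recent = [t for t in generated_at if timedelta(0) <= now - t <= CACHE_WINDOW]
    return len(recent) >= MIN_GENERATIONS


now = datetime(2024, 1, 2, 12, 0)
# One generation 30 hours ago falls outside the window, so only one counts.
print(is_cache_eligible([now - timedelta(hours=30), now - timedelta(hours=1)], now))  # False
# Both generations are inside the window.
print(is_cache_eligible([now - timedelta(hours=5), now - timedelta(hours=1)], now))   # True
```

Under this reading, a greeting that plays once a day would never qualify, which is why uploading critical phrases manually is the safer option.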

### Custom audio uploads

Upload pre-recorded audio (WAV or MP3) for maximum control over greetings, legal disclaimers, or brand-specific moments.

## Fixing pronunciations

When the agent mispronounces words:

1. Go to **Channels > Voice > Response Control** → **Pronunciations** tab
2. Click **Add pronunciation**
3. Enter the word as it appears in text
4. Provide the IPA pronunciation (e.g., "PolyAI" → `/ˈpɒli eɪ aɪ/`)
5. Test in Agent Chat
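
Step 3 matters because a rule can only apply when the written form matches. The sketch below is a local helper for picking test phrases, assuming whole-word, case-sensitive matching; that matching behavior is an assumption for illustration, not documented platform behavior, and `PRONUNCIATIONS` and `words_to_test` are hypothetical names.

```python
import re
from typing import Dict, List

# Hypothetical local copy of entries added in the Pronunciations tab
# (written form -> IPA). The example entry comes from step 4 above.
PRONUNCIATIONS: Dict[str, str] = {"PolyAI": "ˈpɒli eɪ aɪ"}


def words_to_test(text: str, pronunciations: Dict[str, str]) -> List[str]:
    """List configured words that appear in `text` as whole words.

    Assumption: matching is case-sensitive and bounded by word boundaries.
    """
    found = []
    for word in pronunciations:
        if re.search(rf"\b{re.escape(word)}\b", text):
            found.append(word)
    return found


print(words_to_test("Thanks for calling PolyAI support.", PRONUNCIATIONS))  # ['PolyAI']
```

Running this over a few typical agent responses gives a short list of sentences worth playing back in Agent Chat after adding a rule.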

You can also use SSML for advanced control:

```xml
<break time="500ms"/>
<prosody rate="slow">Speak this slowly</prosody>
```
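
If responses are assembled in code, the two tags above can be generated with small helpers so that text is escaped correctly. This is a sketch under stated assumptions: `pause` and `slow` are hypothetical helpers, not PolyAI functions, and whether the platform expects a wrapping `<speak>` element is not covered here.

```python
from xml.sax.saxutils import escape, quoteattr


def pause(milliseconds: int) -> str:
    """Build an SSML break of the given length."""
    return f'<break time="{milliseconds}ms"/>'


def slow(text: str, rate: str = "slow") -> str:
    """Wrap text in a prosody tag, escaping XML special characters."""
    return f"<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>"


response = "Your reference number is" + pause(500) + slow("A B 1 2 3")
print(response)
# Your reference number is<break time="500ms"/><prosody rate="slow">A B 1 2 3</prosody>
```

Escaping matters when the spoken text contains characters such as `&` or `<`, which would otherwise produce invalid markup.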

## Troubleshooting

| Issue                          | Likely cause                | Fix                                          |
| ------------------------------ | --------------------------- | -------------------------------------------- |
| Voice sounds robotic           | Low-quality TTS             | Switch to Cartesia or ElevenLabs             |
| Agent speaks too fast          | Rate set too high           | Adjust with the settings gear in Agent Voice |
| Agent interrupts frequently    | Turbo mode without barge-in | Enable barge-in or switch to Balanced        |
| Mispronunciations              | TTS doesn't recognize word  | Add pronunciation in Response Control        |
| High latency                   | Slow TTS provider           | Switch to Cartesia or use cached audio       |
| Background noise interruptions | Barge-in too sensitive      | Disable barge-in or adjust interaction style |

## Maintenance routine

* **Monthly:** Listen to recent calls and identify voice quality issues
* **As needed:** Add pronunciations for new terms
* **After voice changes:** Regenerate cached audio

## Related pages

* [Audio Management](/audio-management/introduction) – audio caching and optimization
* [Response Control](/response-control/introduction) – pronunciations and output controls
* [Voice Library](/voice/voice-library) – browse and select voices
* [Agent Voice](/voice/agent) – voice configuration options
