
# Speech recognition

> Improve ASR accuracy with keyphrase boosting and transcript corrections.

Use speech recognition settings when your agent consistently mishears specific words – brand names, product codes, medical terms, or domain-specific vocabulary. Poor transcription leads to wrong tool calls, failed lookups, and frustrated callers.

This page provides two tools:

* **Keyphrase Boosting** nudges the ASR model toward recognizing specific words *at transcription time*.
* **Transcript Corrections** post-processes the transcript *after transcription* using string or regex replacement.

Use keyphrase boosting when the ASR model consistently fails to hear a term. Use transcript corrections when the model hears something close but produces the wrong text. In some cases you may need both – for example, boosting a brand name and adding a correction for a common misspelling of that name.

<Tabs>
  <Tab title="Keyphrase Boosting">
    ## Overview

    Keyphrase Boosting improves [ASR](https://en.wikipedia.org/wiki/Speech_recognition) accuracy for domain-specific terms such as product names, locations, and industry jargon. It biases the ASR model toward recognizing those words during transcription.

    <Warning>
      Biasing the ASR can cause unwanted side effects. Adding too many keyphrases or setting bias too high may cause the model to over-correct natural speech. For example, boosting the word `flimsy` at Maximum strength could cause unrelated words like `Lindsay` to be transcribed as `flimsy`. Always test thoroughly in sandbox before deploying changes to production.
    </Warning>

    ### Getting started

    #### Configuring keyphrase boosting

    1. Navigate to **Channels > Voice > Speech recognition**.
    2. In the **Keyphrase Boosting** tab, add, edit, or remove keyphrases.
    3. Use the **Keyphrase** column to input domain-specific terms.
    4. Adjust the **bias strength** for each keyphrase using the slider.
    5. Save your changes. Updated keyphrases are applied immediately.

    #### Bias strength levels

    | Level       | Behavior                                                                                               |
    | ----------- | ------------------------------------------------------------------------------------------------------ |
    | **Default** | Light bias. Balances recognition accuracy with overall ASR performance.                                |
    | **Boosted** | Moderate bias. Increases recognition of the keyphrase without heavily impacting general transcription. |
    | **Maximum** | Strong bias. Prioritizes the keyphrase but may interfere with natural speech patterns.                 |

    Maximum bias does not always produce better results. In some cases it can cause the model to misrecognize unrelated words. Start with Default or Boosted and only escalate to Maximum after testing confirms it is needed.

    #### Example keyphrases

    | Keyphrase        | Use case                                      | Suggested strength |
    | ---------------- | --------------------------------------------- | ------------------ |
    | `flexi-access`   | Financial product name                        | Boosted            |
    | `BlueStar`       | Brand name frequently misheard as "blue star" | Maximum            |
    | `hablas español` | Spanish phrase in an English-language agent   | Boosted            |
    | `isotretinoin`   | Medical term                                  | Maximum            |
    | `pension`        | Domain-specific term                          | Default            |

    ### Global vs. per-step vs. dynamic biasing

    The Speech Recognition page configures **global** keyphrase boosting, which applies to every turn of the conversation. Two additional levels of biasing are available for more targeted control:

    * **Per-step biasing** – configure ASR biasing on individual flow steps for contextual precision (e.g. biasing for doctor names only during a name-collection step). See [ASR biasing in flows](/flows/asr-biasing).
    * **Dynamic biasing from functions** – set biasing at runtime using `conv.set_asr_biasing()` when you need to bias toward values retrieved from an API or database. See [ASR biasing from functions](/tools/classes/asr-from-conv).

    #### Precedence rules

    When multiple levels of biasing are active, they are merged with the following priority (highest first):

    1. **Dynamic** – biasing set via `conv.set_asr_biasing()` in functions
    2. **Per-step** – biasing configured on individual flow steps
    3. **Global** – biasing configured on this Speech Recognition page

    If the same phrase appears at multiple levels, the highest-priority setting takes precedence. This means per-step biasing overrides global settings, and dynamic biasing overrides both.
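
    As a rough mental model of the precedence rules, the merge can be sketched in Python. This is illustrative only: the dictionary shapes, the `merge_biasing` helper, and the use of plain strings for strength levels are assumptions made for the example, not the platform's internal representation.

    ```python
    # Illustrative sketch only: the data shapes and merge_biasing() are
    # hypothetical, not the platform's implementation.

    def merge_biasing(global_cfg, step_cfg, dynamic_cfg):
        """Merge three biasing levels; higher-priority levels win on conflict."""
        merged = {}
        # Apply the lowest priority first so later levels overwrite it.
        for level in (global_cfg, step_cfg, dynamic_cfg):
            merged.update(level)
        return merged


    global_cfg = {"BlueStar": "Boosted", "pension": "Default"}
    step_cfg = {"BlueStar": "Maximum", "Dr. Amari": "Boosted"}
    dynamic_cfg = {"Dr. Amari": "Maximum"}

    merged = merge_biasing(global_cfg, step_cfg, dynamic_cfg)
    print(merged)
    # {'BlueStar': 'Maximum', 'pension': 'Default', 'Dr. Amari': 'Maximum'}
    ```

    Phrases that appear at only one level (here `pension`) pass through unchanged, while conflicting phrases take the value from the highest-priority level.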

    ### Per-step biasing options

    Flow steps also support structured ASR biasing modes for common input types. These are configured in the step editor and include options like:

    * **Alphanumeric** – booking references, confirmation codes
    * **Name** – full personal names
    * **Name spelling** – phonetically spelled names
    * **Numeric** – ages, short numbers
    * **Party size** – group bookings
    * **Precise date** – specific calendar dates
    * **Relative date** – flexible time references
    * **Single number** – one-digit responses
    * **Time** – spoken times
    * **Yes/No** – confirmation-style responses
    * **Address** – postcodes, street names

    See [ASR biasing in flows](/flows/asr-biasing) for the full list and configuration details.
  </Tab>

  <Tab title="Transcript Corrections">
    ## Overview

    Fix common ASR misinterpretations using string matching and [regex](https://en.wikipedia.org/wiki/Regular_expression) patterns. Unlike keyphrase boosting, transcript corrections run *after* the ASR model has produced a transcript – they replace text in the output rather than influencing what the model hears.

    <Warning>
      Transcript corrections can match broadly and introduce errors if the pattern is too wide. A correction like `blue star` → `BlueStar` could also fire on legitimate uses of "blue star" in other contexts. Use specific regex patterns and test corrections against a range of real transcripts before deploying.
    </Warning>

    ### Getting started

    #### Configuring transcript corrections

    1. Navigate to **Channels > Voice > Speech recognition**.
    2. Open the **Transcript Corrections** tab.
    3. Click **Correction** to create a new rule.
    4. Give the correction a **name** (must be unique) and optional **description**.
    5. Add one or more regex rules within the correction:
       * **Replacement type**: **Full transcript** (replaces the entire transcript if matched exactly) or **Partial transcript** (replaces only the matching portion).
       * **Regex**: the regular expression to match the misinterpreted phrase.
       * **Replacement**: the correct term or phrase (leave empty to delete the matched text).
    6. Changes auto-save as you edit.

    You can group multiple related regex rules under a single correction. For example, create a correction called "Brand names" containing rules for each brand your agent handles. This keeps your corrections organized and easier to maintain.

    #### Example configurations

    | Regex                          | Replacement         | Type               | Use case                            |
    | ------------------------------ | ------------------- | ------------------ | ----------------------------------- |
    | `/\bI see you\b/i`             | `ICU`               | Partial transcript | Medical abbreviation being misheard |
    | `/\b(blue star\|bluestar)\b/i` | `BlueStar`          | Partial transcript | Brand name correction               |
    | `/\bDr\.?\s?Amari\b/i`         | `Dr. Amari`         | Partial transcript | Proper noun / doctor name           |
    | `/\bfull fund is cash\b/i`     | `full fund as cash` | Partial transcript | Financial term correction           |
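
    Before adding rules like these, it can help to try them against sample transcripts offline. The sketch below uses Python's `re` module as a stand-in. The `apply_rule` helper is hypothetical, the platform's regex engine may differ from Python's, and the reading of **Full transcript** as "the pattern must match the whole transcript" is an assumption, so treat the results as approximate.

    ```python
    import re

    # Hypothetical helper: approximates the two replacement types with Python's
    # re module. The platform's own regex engine may behave differently.
    def apply_rule(transcript, pattern, replacement, full=False):
        regex = re.compile(pattern, re.IGNORECASE)  # mirrors the /i flag
        if full:
            # Full transcript: replace everything only if the whole text matches.
            return replacement if regex.fullmatch(transcript) else transcript
        # Partial transcript: replace only the matching portion(s).
        return regex.sub(replacement, transcript)


    brand = r"\b(blue star|bluestar)\b"

    # Intended match: the brand name is normalized.
    print(apply_rule("i have a blue star account", brand, "BlueStar"))
    # Unintended match: a legitimate "blue star" is rewritten too.
    print(apply_rule("the flag has a blue star on it", brand, "BlueStar"))
    # Full transcript: fires only when the pattern covers the entire utterance.
    print(apply_rule("bluestar", brand, "BlueStar", full=True))
    print(apply_rule("bluestar please", brand, "BlueStar", full=True))
    ```

    The second call shows why broad patterns need testing against realistic transcripts: the rule cannot tell the brand from an ordinary phrase.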

    ### Verifying corrections are applied

    Transcript corrections can sometimes fail to fire – for example, if the regex does not match the exact ASR output. To verify whether a correction was applied:

    1. Open a conversation in [Conversation Review](/analytics/conversations/review).
    2. Enable the **Transcript corrections** layer in [Conversation Diagnosis](/analytics/conversations/diagnosis).
    3. Check each turn to see whether your correction was triggered.

    If a correction is not firing as expected, compare the raw ASR transcript against your regex pattern. Common issues include unexpected whitespace, casing differences, or partial word boundaries.
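
    One way to check for these issues is to run the pattern against raw transcript text copied from Conversation Review. The following is a minimal sketch using Python's `re` module, which may differ in detail from the platform's regex engine; the sample transcripts are invented for illustration.

    ```python
    import re

    # Strict pattern: exactly one space between words, case-sensitive.
    strict = re.compile(r"\bfull fund is cash\b")
    # Looser pattern: any run of whitespace, case-insensitive.
    loose = re.compile(r"\bfull\s+fund\s+is\s+cash\b", re.IGNORECASE)

    # Invented raw transcripts showing casing and whitespace differences.
    samples = [
        "take the full fund is cash",
        "take the Full fund is cash",
        "take the full  fund is cash",  # double space
    ]

    for text in samples:
        print(repr(text), bool(strict.search(text)), bool(loose.search(text)))
    ```

    Only the first sample matches the strict pattern, while all three match the looser one. Loosening a pattern also widens what it can match, so retest against transcripts where the correction should not fire.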
  </Tab>
</Tabs>

## Diacritics

If your agent operates in a language that uses [diacritics](https://www.sussex.ac.uk/informatics/punctuation/misc/diacritics) – such as **č, ć, š, ž, đ** – additional configuration is required before the features on this page will work correctly.

These characters are common in languages like **Croatian**, **Serbian**, **Bosnian**, **Slovak**, **Czech**, and **Slovenian**. ASR models may strip or misinterpret diacritical marks, and an English-biased model may fail to detect non-English speech entirely.

<Warning>
  Diacritics and multilingual ASR configuration cannot be self-served. **Contact your PolyAI representative before making changes** – they will configure the correct ASR language model, language codes, and any necessary preprocessing for your target language. Attempting to fix diacritics issues with keyphrase boosting or transcript corrections alone is unlikely to resolve the underlying problem.
</Warning>

Transcript corrections can help with minor post-processing (e.g. fixing `Zeljko` → `Željko`) once the correct ASR model is in place, but they are not a substitute for proper language configuration.
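
One subtlety when writing such a correction is Unicode normalization: `Ž` can be stored as a single code point or as `Z` followed by a combining caron, and the two forms do not compare equal. The sketch below uses Python's standard `unicodedata` and `re` modules to show the difference. Whether the platform normalizes transcripts is not stated on this page, so treat this as something to check during testing rather than a description of platform behavior.

```python
import re
import unicodedata

composed = "\u017deljko"     # "Ž" as one precomposed code point
decomposed = "Z\u030celjko"  # "Z" + combining caron; renders the same

# The two strings look identical but are different code point sequences.
print(composed == decomposed)
# NFC normalization converts the decomposed form to the precomposed one.
print(unicodedata.normalize("NFC", decomposed) == composed)

# A Partial-transcript style replacement that restores the diacritic.
fixed = re.sub(r"\bZeljko\b", composed, "calling for Zeljko today")
print(fixed)
```

If a correction that looks right never fires, comparing the code points of the pattern and the raw transcript can reveal this kind of mismatch.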

## Related pages

<CardGroup cols={3}>
  <Card title="ASR biasing in flows" icon="diagram-project" href="/flows/asr-biasing">
    Configure per-step biasing for structured input collection
  </Card>

  <Card title="Annotations" icon="tag" href="/analytics/conversations/annotations">
    Flag transcription errors during conversation review
  </Card>

  <Card title="Conversation diagnosis" icon="stethoscope" href="/analytics/conversations/diagnosis">
    Verify transcript corrections in individual conversations
  </Card>

  <Card title="Voice configuration" icon="gear" href="/voice/voice-configuration">
    Configure voice model and call handling settings
  </Card>
</CardGroup>
