Use the test suite to catch regressions before they reach live callers. Save conversations as test cases, group them into sets, and re-run them against Draft or Sandbox after every change. Without regression testing, a fix to one topic can silently break another. Test cases and test sets are found under Build > Test suite.

Concepts

  • Test Case — A single scenario captured from a real conversation (user messages, agent replies, and the functions invoked). Each case tracks its Last run and Outcome.
  • Test Set — A named collection of Test Cases. Use sets to cover a feature area or release scope (for example, “Payments,” “Shipping,” “Core intents”). A Test Case can belong to multiple sets.
Test Cases and Test Sets run against non-production versions. Select Draft or Sandbox when you start a run.

Create a Test Case

1

Save a test case from chat or Conversation review

Click the Create test button (test-tube icon) in the chat panel or from a transcript in Conversation review. Name the case and save it.
2

[Optional] Edit parameters

A case stores the function values from the original interaction. Optionally adjust fields to explore a controlled variation of the same scenario.

Create a Test Set

  • Go to Build > Test suite > Test Sets and select New set.
  • Give the set a name and add cases from the picker.
  • A case can be added to more than one set (for example, both “Billing” and “Critical paths”).
Tip: Create focused sets (“Refunds,” “Shipping address changes,” “Escalations”) so failures point straight to the right area.

Edit test case parameters

Each test case stores the function call values from the original conversation. You can edit these to test variations of the same scenario without creating a new case.
  1. Open the test case from Test Cases.
  2. Select the parameters you want to modify.
  3. Adjust values to simulate a different scenario — for example, change a date, customer ID, or location.
  4. Save the case.
Editing parameters is useful for testing edge cases. For example, duplicate a booking test case and change the party size to test large-group handling.
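The duplicate-and-edit pattern described above amounts to copying the stored values and changing one of them. The snippet below is a conceptual sketch in plain Python; the field names (`party_size`, `location`, and so on) are invented for the example and are not the product's schema.

```python
import copy

# Hypothetical stored parameters for a booking test case.
booking_case = {
    "name": "Book table for two",
    "parameters": {"date": "2025-06-01", "party_size": 2, "location": "Downtown"},
}

# Duplicate the case and change one parameter to probe an edge case.
large_group = copy.deepcopy(booking_case)
large_group["name"] = "Book table for twelve"
large_group["parameters"]["party_size"] = 12

# deepcopy keeps the original case untouched, so both variants can run.
```

A deep copy matters here: a shallow copy would share the nested `parameters` dict, and editing the variant would silently change the original case too.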

Run tests

You can run a single case or an entire set. To run a single case:
  1. Open the case in Test Cases.
  2. Choose Draft or Sandbox.
  3. Select Run to execute just this scenario.
The case shows Outcome and Last run after completion.
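One way to picture how an Outcome could be determined is to compare the functions the agent invokes on the re-run against those captured in the original conversation. This is a simplified mental model only; the product's actual pass/fail criteria are not specified here, and `run_case` is a made-up name.

```python
from datetime import datetime, timezone

def run_case(expected_calls, actual_calls):
    """Simplified outcome check: pass if the re-run invoked the
    same functions, in the same order, as the saved conversation."""
    outcome = "pass" if actual_calls == expected_calls else "fail"
    last_run = datetime.now(timezone.utc).isoformat()  # becomes "Last run"
    return outcome, last_run

outcome, last_run = run_case(["lookup_order", "issue_refund"],
                             ["lookup_order", "issue_refund"])
# outcome == "pass"
```

If a flow change reorders or drops a function call, the comparison fails, which is exactly the kind of silent regression the suite exists to surface.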

Review results

After a run completes, each test case shows:
  • Outcome — whether the case passed or failed compared to the expected behavior
  • Last run — when the test was last executed
For test sets, the set view provides:
  • Pass/fail counts — how many cases succeeded vs. failed in the run
  • Trend charts — historical pass/fail rates across multiple runs, so you can spot regressions over time
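Pass/fail counts and trend charts are simple aggregates over per-case outcomes. Conceptually (an illustrative sketch, not the product's reporting code):

```python
from collections import Counter

def summarize(outcomes):
    """Aggregate per-case outcomes ("pass"/"fail") into set-level counts."""
    counts = Counter(outcomes)
    return counts["pass"], counts["fail"]

# One run of a five-case set:
passed, failed = summarize(["pass", "pass", "fail", "pass", "fail"])
# passed == 3, failed == 2

# Several historical runs give the trend: pass rate per run.
history = [["pass", "fail"], ["pass", "pass"]]
pass_rates = [summarize(run)[0] / len(run) for run in history]
# pass_rates == [0.5, 1.0]
```

A rising pass rate across runs suggests fixes are sticking; a drop after a change is the signal to open the failing transcripts.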
If a previously passing test case fails after a change, review the conversation transcript to identify what broke. Common causes include:
  • Knowledge base topic changes that altered routing
  • Function logic updates that changed return values
  • Flow modifications that skipped or reordered steps
Run your test sets after every significant change to Draft. Catching regressions early saves time and prevents issues from reaching Sandbox or Live.

Best practices

  • Name test cases descriptively — use names that describe the scenario, not the expected outcome (e.g. “Caller cancels booking” rather than “Test 1”)
  • Create focused sets — group cases by feature area (“Refunds”, “Shipping”, “Escalations”) so failures point to the right area
  • Cover happy paths and edge cases — include both successful flows and failure scenarios (invalid input, missing data, handoff triggers)
  • Re-run after knowledge base changes — topic edits can silently break other flows. Test sets catch this
  • Use test cases from real conversations — save cases from Conversation Review to test against real-world scenarios

Conversation review

Save test cases directly from transcripts.

Environments

Understand the Sandbox and Draft environments used for test runs.

Variant management

Run test sets against specific variants before promoting.
Last modified on March 26, 2026