Test Suite lets builders save real conversations and re-run them to check that the agent still behaves as expected. It’s designed for teams working with generative AI, where responses can vary from run to run, which makes repeatable testing essential. You can create Test Cases directly from the chat panel or Conversation review, then run them later against Draft or Sandbox versions. Group related cases into reusable Test Sets and re-run them in bulk whenever the agent changes.

Concepts

  • Test Case: A single scenario captured from a real conversation (user messages, agent replies, and the functions invoked). Each case tracks its Last run and Outcome.
  • Test Set: A named collection of Test Cases. Use sets to cover a feature area or release scope (for example, “Payments,” “Shipping,” “Core intents”). A Test Case can belong to multiple sets.
Test Cases and Test Sets run against non-production versions. Select Draft or Sandbox when you start a run.
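To make the relationship between the two concepts concrete, here is a minimal sketch in Python. It is an illustrative model only, not the product's API: the class names `SavedCase` and `CaseSet`, their field names, and the function name `create_refund` are all assumptions chosen to mirror the descriptions above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class SavedCase:
    """Models a Test Case: one scenario captured from a conversation (hypothetical)."""
    name: str
    user_messages: list[str]
    agent_replies: list[str]
    functions_invoked: list[str]
    last_run: Optional[datetime] = None  # filled in after a run completes
    outcome: Optional[str] = None        # value format is an assumption


@dataclass
class CaseSet:
    """Models a Test Set: a named collection of cases (hypothetical)."""
    name: str
    cases: list[SavedCase] = field(default_factory=list)


refund = SavedCase(
    name="Refund for a damaged item",
    user_messages=["My order arrived broken."],
    agent_replies=["I'm sorry about that. I've started a refund."],
    functions_invoked=["create_refund"],  # made-up function name
)

# The same case can belong to more than one set.
payments = CaseSet("Payments", [refund])
critical = CaseSet("Critical paths", [refund])

sets_containing_refund = [s.name for s in (payments, critical) if refund in s.cases]
```

The point of the sketch is the many-to-many relationship: a set only references cases, so adding a case to a second set does not copy it.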

Create a Test Case

  1. Save a test case from chat or Conversation review

     Click the Create test button (test-tube icon) in the chat panel or from a transcript in Conversation review. Name the case and save it.

  2. [Optional] Edit parameters

     A case stores the function values from the original interaction. Optionally adjust fields to explore a controlled variation of the same scenario.

Create a Test Set

  • Go to Manage → Test suite → Test Sets and select New set.
  • Give the set a name and add cases from the picker.
  • A case can be added to more than one set (for example, both “Billing” and “Critical paths”).
Tip: Create focused sets (“Refunds,” “Shipping address changes,” “Escalations”) so failures point straight to the right area.
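The tip about focused sets can be illustrated with a small Python sketch that groups failing cases by the set they belong to. Everything here is hypothetical: the set names, case names, and the "passed"/"failed" outcome strings are made-up sample data, and the product may report outcomes differently.

```python
from collections import defaultdict

# Hypothetical sample data: set name -> case names, and case name -> outcome.
set_membership = {
    "Refunds": ["Refund for a damaged item", "Partial refund"],
    "Escalations": ["Hand off to human agent"],
    "Critical paths": ["Refund for a damaged item", "Hand off to human agent"],
}
outcomes = {
    "Refund for a damaged item": "failed",
    "Partial refund": "passed",
    "Hand off to human agent": "passed",
}


def failures_by_set(membership, case_outcomes):
    """Return {set name: [failed case names]} for sets with at least one failure."""
    failed = defaultdict(list)
    for set_name, case_names in membership.items():
        for case_name in case_names:
            if case_outcomes.get(case_name) == "failed":
                failed[set_name].append(case_name)
    return dict(failed)


report = failures_by_set(set_membership, outcomes)
```

With narrow sets, a failure surfaces under "Refunds" rather than only under a broad set such as "Critical paths", which is what makes the affected area easy to spot.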

Run tests

You can run a single case or an entire set. To run a single case:
  1. Open the case in Test Cases.
  2. Choose Draft or Sandbox.
  3. Select Run to execute just this scenario.
The case shows Outcome and Last run after completion.
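As a rough sketch of what a run does, the Python below models the steps above: choose a non-production target, execute the scenario, then record Outcome and Last run. This is not the product's API. The `run_case` function, the `execute` callable that stands in for the platform, the dictionary keys, and the "passed"/"failed" values are all assumptions for illustration.

```python
from datetime import datetime, timezone

# Non-production versions named in the text above.
ALLOWED_TARGETS = {"Draft", "Sandbox"}


def run_case(case, target, execute):
    """Run one saved case against a non-production version and record the result.

    `execute` is a hypothetical callable standing in for the platform: it takes
    the case and the target and returns True if the agent behaved as expected.
    """
    if target not in ALLOWED_TARGETS:
        raise ValueError(
            f"Tests run only against {sorted(ALLOWED_TARGETS)}, got {target!r}"
        )
    passed = execute(case, target)
    case["outcome"] = "passed" if passed else "failed"  # assumed outcome values
    case["last_run"] = datetime.now(timezone.utc)
    return case


case = {"name": "Refund for a damaged item", "outcome": None, "last_run": None}
run_case(case, "Sandbox", execute=lambda c, t: True)
```

Running an entire set would amount to calling the same function for each case in the set with one shared target.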