Run saved conversations to check for behavioral regressions in agent responses.
Test Suite lets builders save real conversations and re-run them to check whether the agent continues to behave as expected. It is designed for teams working with generative AI, where responses can vary between runs, which makes repeatable testing essential.
You can create test cases directly from the chat panel or conversation review, then run them later against Draft or Sandbox versions to verify that key behaviors still occur.
Overview
Test Suite helps you validate that core agent behavior doesn't change over time. Each test case includes:
- A real conversation transcript (user messages + agent replies)
- The functions triggered during the conversation
You can:
- Save test cases from chat or review
- Modify test case parameters and simulate new inputs
- Group cases into test runs
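Conceptually, a saved test case pairs the conversation transcript with the functions it triggered. The sketch below shows one hypothetical way to represent that pairing in Python; the class and field names are illustrative assumptions, not part of the platform's API.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    """One message in the saved conversation."""
    role: str   # "user" or "agent" (assumed labels)
    text: str


@dataclass
class FunctionCall:
    """A function the agent triggered during the conversation."""
    name: str
    arguments: dict


@dataclass
class TestCase:
    """A saved conversation plus the functions it triggered."""
    name: str
    transcript: list[Turn] = field(default_factory=list)
    function_calls: list[FunctionCall] = field(default_factory=list)
```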
How it works
Save a test case from chat or conversation review
Click the Create test button (the test tube icon) in the chat panel.
You'll be prompted to name and save the test case.
[Optional] Edit parameters
Test cases store the function values from the original interaction. You can optionally adjust these values to see how the agent would behave in an alternate version of the same conversation.
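Continuing the hypothetical sketch above, editing parameters amounts to replaying the same conversation with different stored function values. The helper below is an illustrative assumption of how such an override could work; with_adjusted_arguments is not a real platform function.

```python
from copy import deepcopy


def with_adjusted_arguments(case: TestCase, function_name: str, overrides: dict) -> TestCase:
    """Return a copy of a saved test case with one function's stored arguments changed."""
    adjusted = deepcopy(case)
    for call in adjusted.function_calls:
        if call.name == function_name:
            # Merge the overrides on top of the originally captured arguments.
            call.arguments = {**call.arguments, **overrides}
    return adjusted


# Hypothetical usage: simulate the same conversation with a different booking date.
# alt_case = with_adjusted_arguments(saved_case, "check_availability", {"date": "2024-07-01"})
```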
Run tests — individually or in groups
You can run a test case directly from its detail page, or select multiple saved test cases and run them as a group.
Give your run a name and choose which version of the agent to test (Draft or Sandbox).
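As an illustration of the grouping step, a test run can be thought of as a named batch of cases tied to one agent version. The sketch below extends the hypothetical classes above; TestRun and its fields are assumptions, not the product's data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TestRun:
    """A named batch of test cases executed against one agent version."""
    name: str
    agent_version: str                      # e.g. "Draft" or "Sandbox"
    cases: list[TestCase] = field(default_factory=list)
    started_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


# Hypothetical usage:
# run = TestRun(name="Post flow-update check", agent_version="Sandbox", cases=[saved_case, alt_case])
```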
View run results and investigate regressions
Once the run finishes, results appear under the Test Runs tab.
You'll see overall success rates, agent versions, and timestamps. Click into a run to see which cases passed or failed, and compare the behavior line by line.
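Because generative replies can vary in wording between runs, a natural way to think about pass/fail is whether the key behaviors (the functions triggered) still occur, and where the transcripts first diverge. The checks below are hypothetical illustrations of those two ideas, not the platform's actual evaluation logic.

```python
def key_behaviors_still_occur(expected_functions: list[str], actual_functions: list[str]) -> bool:
    """One possible pass criterion: every function triggered in the saved conversation
    is triggered again, in the same relative order, even if reply wording differs."""
    remaining = iter(actual_functions)
    # `name in remaining` consumes the iterator up to the match, so this is a
    # subsequence check over the ordered list of triggered functions.
    return all(name in remaining for name in expected_functions)


def first_divergence(expected: list[str], actual: list[str]) -> int | None:
    """Index of the first turn where two transcripts differ, or None if they match."""
    for i, (want, got) in enumerate(zip(expected, actual)):
        if want != got:
            return i
    return None if len(expected) == len(actual) else min(len(expected), len(actual))
```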
Use Test Suite to safeguard against regressions, test flow and function updates, and ensure your agent continues to respond the way you expect. For help or feedback, contact platform-support@poly-ai.com.