PolyScore

PolyScore is changing: a simpler, more accurate 1–5 score. Planned rollout date: 5 August 2026.

What’s changing

PolyScore, the automated quality score assigned to every conversation with your agent, is moving from a 0–10 scale to a 1–5 scale, backed by a new, fully transparent evaluation rubric. Charts on the Analytics page and scores on conversations in Conversation review will reflect the new score. The new score will also be calculated retroactively for the preceding week.

Why we’re making this change

One score across every channel. The original PolyScore was designed for voice calls. The new rubric natively evaluates voice, messaging, and email, for both inbound and outbound conversations, with channel-appropriate judgement — for example, a customer silently leaving a webchat after their question is answered is treated as success, not abandonment.
Simpler and more accurate. The previous score combined five categories which had some overlap (conversation quality, repetition, frustration, resolution, and task completion). Overlapping signals add noise. The new rubric condenses these into two scored dimensions plus an engagement gate, each with explicit decision rules — reducing ambiguity and making the score more accurate and more consistent.
A familiar scale. The 1–5 scale mirrors CSAT, so PolyScore reads naturally alongside the customer-satisfaction metrics you already use.

How the new score works

Every conversation is evaluated on two questions:

Dimension	Question	Outcomes
Agent quality	Did the agent handle the exchange competently — understanding the user, avoiding forced repeats, not causing frustration through its own faults?	Good / Fair / Poor
Task success	Did the conversation deliver on its objective, so the user won’t need to make contact again for the same reason?	Completed / Handoff or decline honored / Not completed

As with the previous PolyScore, conversations need to meet two criteria to be scored:

Is the user engaged? That is, was it possible for the agent to do its job? Spam calls, silent calls, and conversations where the user immediately requests a human count as not engaged.
Have there been more than 3 user turns in the conversation? This ensures that only conversations where some interaction took place are scored.

When a conversation is not scored, the reason (either of the two above) will now be shown explicitly on the conversation.
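The two criteria above can be sketched as a simple eligibility check. This is an illustrative sketch only, not PolyAI's implementation: the `Conversation` class, its field names, and the reason strings are hypothetical, and how engagement is actually determined is not covered here.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold from the text: more than 3 user turns are required.
MIN_USER_TURNS = 3


@dataclass
class Conversation:
    # Hypothetical fields, named for illustration only.
    user_engaged: bool  # False for spam, silent calls, instant human requests
    user_turns: int     # number of turns taken by the user


def skip_reason(conv: Conversation) -> Optional[str]:
    """Return why a conversation is not scored, or None if it is eligible."""
    if not conv.user_engaged:
        return "not_engaged"
    if conv.user_turns <= MIN_USER_TURNS:
        return "too_few_user_turns"
    return None
```

Under these assumptions, a conversation with exactly 3 user turns is not scored, because the rule requires more than 3.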

The score combines agent quality and task success

PolyScore	Typical conversation
5	Understood cleanly and fully resolved (or a clear self-service path given)
4	Strong on one dimension — for example, handled well but ended in a handoff, or resolved despite a minor misunderstanding
3	Middling on both — for example, some friction and a handoff
2	Weak on both dimensions
1	The agent got stuck or repeatedly misunderstood, and the conversation ended unresolved with no handoff
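The exact rule for combining the two dimensions is not spelled out above. As a rough illustration, the additive mapping below is one assumption that reproduces the example rows in the table; the point values and outcome keys are hypothetical, and the real rubric may combine the dimensions differently.

```python
# Assumed point values, chosen only so the results match the table's examples.
AGENT_QUALITY_POINTS = {"good": 2, "fair": 1, "poor": 0}
TASK_SUCCESS_POINTS = {
    "completed": 2,
    "handoff_or_decline_honored": 1,
    "not_completed": 0,
}


def combine(agent_quality: str, task_success: str) -> int:
    """Map the two dimension outcomes to a 1-5 score (illustrative only)."""
    return (
        1
        + AGENT_QUALITY_POINTS[agent_quality]
        + TASK_SUCCESS_POINTS[task_success]
    )
```

With this sketch, a well-handled conversation that ends in a handoff gets 4, some friction plus a handoff gets 3, and a poorly handled, unresolved conversation gets 1.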

The overall score is shown as a color-coded badge in Conversation review:

Range	Label	Color
5	High	Green
3–4	Medium	Amber
1–2	Low	Red
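The badge ranges above can be expressed as a small lookup. This is a minimal sketch; the function name and return shape are chosen for illustration and are not part of any PolyAI API.

```python
from typing import Tuple


def badge(score: int) -> Tuple[str, str]:
    """Return the (label, color) badge for a 1-5 PolyScore."""
    if not 1 <= score <= 5:
        raise ValueError("PolyScore must be between 1 and 5")
    if score == 5:
        return ("High", "Green")
    if score >= 3:
        return ("Medium", "Amber")
    return ("Low", "Red")
```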

A few key decisions in constructing the score

Handoffs result in a neutral Task Success outcome. From the user’s perspective the outcome is identical whatever the cause: they were routed to a person. A handoff is never scored as “not completed”; that rating is reserved for genuine dead ends where the user got nothing and nobody. A handoff caused by a struggling agent is penalized through the Agent Quality sub-score instead.
Self-service paths score as completed. If the agent gives the user a concrete path they can complete themselves — “you can reset your PIN any time at acme.com/pin” — that scores as fully completed, the same as resolving it in-conversation. For many agents, routing users to self-service is the designed job; penalizing it would punish the configuration you chose.
Frustration only counts against the agent when the agent caused it. Unhappiness with a policy or outcome doesn’t penalize the agent; being stuck in a loop does.
Design choices aren’t penalized. If your agent is configured to deflect or decline certain requests, doing so correctly scores as competent handling, to the extent that the configuration can be inferred from the transcript.
Outbound declines result in a neutral Task Success outcome. A polite “not interested, remove me,” honored cleanly, is scored as the agent doing its job.

Where PolyScore appears

Conversation review — score badge at the top of each transcript, with expandable dimension breakdowns.
Conversations table — sortable PolyScore column for quick quality scanning.
Home page — average PolyScore trend chart under Quick Insights.
Smart Analyst — use PolyScore as a sampling criterion or query PolyScore tables directly via SQL.
Conversations API — PolyScore data is available in the API response when the conversation has been scored.

Limitations

PolyScore evaluates conversations based on the transcript alone. It does not have access to your knowledge base, flows, external systems, or expected outcomes.

This means:

PolyScore cannot verify whether an action was actually completed in an external system (for example, a booking made, an appointment canceled). It can only assess whether the conversation appeared to resolve the task based on what was said.
PolyScore does not know what the agent should have said — only what it did say. If the agent confidently gave an incorrect answer, PolyScore may still rate the conversation highly.
Scores reflect conversational quality, not business accuracy. Use PolyScore alongside your own QA processes and custom metrics for a complete picture.

Questions? Reach out to your PolyAI account team.

Related pages

Conversation review

View per-dimension PolyScore breakdowns alongside transcripts.

Smart Analyst

Query PolyScore data and sample conversations by score.

Studio transcripts

Access transcripts and call summaries.
