# Health checks

> Daily, weekly, and monthly monitoring routines to keep your live agent running smoothly and reliably.

Follow daily, weekly, and monthly health check routines to catch issues early. Use dashboards for high-level metrics and Smart Analyst for deep pattern analysis across hundreds of conversations.

<Tip>
  [Smart Analyst](/smart-analyst/introduction) can accelerate every routine below. Its **deep sampling** analyzes up to 500 conversations per query – use it to surface patterns, identify failure reasons, and prioritize fixes without manually reviewing calls.
</Tip>

## Schedule overview

| Routine           | Frequency             | Time required | What to check                                                    |
| ----------------- | --------------------- | ------------- | ---------------------------------------------------------------- |
| Daily check       | Every day             | 10-15 min     | Dashboards, recent errors, handoff rate                          |
| Weekly review     | Every week            | 30-60 min     | Trends, unhandled queries, sample calls, test sets               |
| Monthly deep dive | Every month           | 2-4 hours     | Month-over-month metrics, knowledge audit, function optimization |
| Pre-deployment    | Before each promotion | 20-30 min     | Test sets, manual testing, integration checks                    |
| Post-deployment   | After each promotion  | 30 min        | Live calls, key metrics, function logs                           |

## Daily check

Spend 10-15 minutes each morning:

1. **Review the [Standard dashboard](/analytics/dashboards/standard)** – check call volume, handoff rate, and latency against your baseline
2. **Scan recent errors** in **Analytics > Conversations > Voice**, filtered to the last 24 hours
3. **Spot-check handoff reasons** for new patterns – or ask [Smart Analyst](/smart-analyst/introduction): *"What are the top handoff reasons from the last 24 hours?"*
4. **Verify integrations** – look for API errors in function logs

### Red flags

Stop and investigate if you see:

* Handoff rate > 50% (or 20% above baseline)
* Average latency > 3 seconds
* Error rate > 5%
* Call volume drop > 30%
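
The thresholds above can be written down as a small check. The sketch below is illustrative only: `DailyMetrics` and `red_flags` are hypothetical names, not part of any PolyAI API, and the values are assumed to be copied from the dashboard by hand or by your own tooling. It also assumes that "20% above baseline" means a relative increase (baseline × 1.2) rather than 20 percentage points; adjust if your team reads it the other way.

```python
from dataclasses import dataclass


@dataclass
class DailyMetrics:
    """Hypothetical container for values read off the dashboard."""
    handoff_rate: float   # fraction of calls handed off, 0.0-1.0
    avg_latency_s: float  # average latency in seconds
    error_rate: float     # fraction of calls with errors, 0.0-1.0
    call_volume: int      # number of calls in the period


def red_flags(today: DailyMetrics, baseline: DailyMetrics) -> list[str]:
    """Return the names of the red-flag thresholds that today's metrics cross."""
    flags = []
    # Assumption: "20% above baseline" is relative (baseline * 1.2).
    if today.handoff_rate > 0.50 or today.handoff_rate > baseline.handoff_rate * 1.20:
        flags.append("handoff rate")
    if today.avg_latency_s > 3.0:
        flags.append("latency")
    if today.error_rate > 0.05:
        flags.append("error rate")
    # A drop of more than 30% means volume is below 70% of baseline.
    if baseline.call_volume > 0 and today.call_volume < baseline.call_volume * 0.70:
        flags.append("call volume drop")
    return flags


baseline = DailyMetrics(handoff_rate=0.30, avg_latency_s=1.5, error_rate=0.01, call_volume=1000)
today = DailyMetrics(handoff_rate=0.40, avg_latency_s=3.5, error_rate=0.02, call_volume=600)
print(red_flags(today, baseline))
```

An empty list means none of the thresholds were crossed; anything else is a reason to stop and investigate.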

## Weekly review

Spend 30-60 minutes each week:

1. Compare this week's metrics to last week's (call volume, containment, duration, latency)
2. Review **unhandled queries** in dashboards – prioritize knowledge gaps
3. **Run a [Smart Analyst](/smart-analyst/introduction) deep sampling query** to surface trends across hundreds of conversations at once – for example: *"What are the top 5 reasons calls are handed off this week?"* or *"What knowledge gaps are causing containment failures?"*
4. Listen to 5-10 calls flagged by Smart Analyst (a mix of successful and unsuccessful)
5. Check [test set](/simulation-testing/introduction) results for regressions
6. Plan improvements for the following week based on Smart Analyst insights and test results
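
For step 1, the week-over-week comparison is a percent change per metric. A minimal sketch, assuming you have exported the weekly figures into plain dictionaries (the metric names and the `week_over_week` helper are hypothetical, chosen for illustration):

```python
def week_over_week(this_week: dict[str, float], last_week: dict[str, float]) -> dict[str, float]:
    """Return the percent change per metric, skipping metrics with no usable prior value."""
    changes = {}
    for name, current in this_week.items():
        previous = last_week.get(name)
        if not previous:  # missing or zero: percent change is undefined
            continue
        changes[name] = round((current - previous) / previous * 100, 1)
    return changes


last_week = {"call_volume": 1000, "containment": 0.60, "duration_s": 200, "latency_s": 1.2}
this_week = {"call_volume": 1100, "containment": 0.66, "duration_s": 180, "latency_s": 1.2}

for metric, change in week_over_week(this_week, last_week).items():
    print(f"{metric}: {change:+.1f}%")
```

Whether a change is good or bad depends on the metric: a rise in containment is an improvement, while a rise in latency is not.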

## Monthly deep dive

Spend 2-4 hours at month end:

1. Compare all key metrics month-over-month
2. **Use [Smart Analyst](/smart-analyst/introduction) deep sampling** for a full analysis – sample up to 500 conversations to break down containment by handoff reason, identify recurring failure patterns, and surface sentiment trends. Try: *"Analyze the top transfer reasons and containment blockers over the last 30 days with percentage breakdowns."*
3. Audit all [FAQs](/knowledge/faqs/introduction) for outdated content – use Smart Analyst to identify knowledge gaps: *"What questions are we not handling well?"*
4. Review function performance – optimize or refactor slow functions
5. Maintain [test sets](/simulation-testing/introduction) – add new scenarios, remove obsolete ones
6. Review [version history](/environments-and-versions/project-history) – document major changes

## Pre-deployment check

Before promoting any version to Pre-release or Live:

1. Run all test sets – investigate any failures before promoting
2. Manually test critical user journeys and edge cases
3. Test all external API integrations
4. Check voice quality and pronunciations
5. Compare to the current Live version using [diffs](/environments-and-versions/diffs)
6. Have a rollback plan ready (identify the last known-good version)

**Promote if:** All tests pass, there are no critical bugs, and performance is acceptable.

**Do not promote if:** Any test fails, performance has degraded, or integrations are failing.
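
The promotion rule can be captured as a simple gate that also reports what is blocking. This is a sketch under assumptions: `promotion_decision` and its inputs are hypothetical, and you would fill them in from your own test set results and manual checks.

```python
def promotion_decision(
    failed_tests: list[str],
    critical_bugs: int,
    performance_ok: bool,
    integrations_ok: bool,
) -> tuple[bool, list[str]]:
    """Return (ok_to_promote, blockers) following the promote / do-not-promote rule."""
    blockers = []
    if failed_tests:
        blockers.append(f"{len(failed_tests)} failing test(s)")
    if critical_bugs > 0:
        blockers.append(f"{critical_bugs} critical bug(s)")
    if not performance_ok:
        blockers.append("performance degraded")
    if not integrations_ok:
        blockers.append("integrations failing")
    return (not blockers, blockers)


ok, blockers = promotion_decision(
    failed_tests=["booking_flow"],  # hypothetical test name
    critical_bugs=0,
    performance_ok=True,
    integrations_ok=False,
)
print(ok, blockers)
```

Returning the list of blockers, rather than only a boolean, gives you a record of why a promotion was held back.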

## Post-deployment monitor

After promoting to Pre-release or Live:

**First 30 minutes:**

* Watch [Conversation Review](/analytics/conversations/review) in real time
* Monitor latency, error rate, and handoff rate
* Verify function logs show no errors
* Be ready to roll back

**First 24 hours:**

* Check metrics every 2-4 hours against baseline
* Review handoff reasons for new patterns

**First week:**

* Analyze a full week of data against the pre-deployment baseline
* Document lessons learned

### Rollback triggers

Roll back immediately if:

* Error rate > 10%
* Handoff rate doubles
* Critical functions fail
* Customer complaints spike
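
The triggers above can be evaluated together during the first 30 minutes after a promotion. The sketch below is illustrative: `rollback_reasons` is a hypothetical helper, the handoff rate is compared against the pre-deployment baseline, and because this page does not define a numeric threshold for a complaint spike, that trigger is passed in as a human judgment.

```python
def rollback_reasons(
    error_rate: float,
    handoff_rate: float,
    baseline_handoff_rate: float,
    critical_function_failures: int,
    complaints_spiking: bool,
) -> list[str]:
    """Return the rollback triggers that have fired; an empty list means keep monitoring."""
    reasons = []
    if error_rate > 0.10:
        reasons.append("error rate above 10%")
    # "Doubles" is measured against the pre-deployment baseline (assumption).
    if baseline_handoff_rate > 0 and handoff_rate >= 2 * baseline_handoff_rate:
        reasons.append("handoff rate doubled")
    if critical_function_failures > 0:
        reasons.append("critical function failures")
    if complaints_spiking:  # judged by a person; no numeric threshold is defined here
        reasons.append("customer complaints spike")
    return reasons


print(rollback_reasons(
    error_rate=0.04,
    handoff_rate=0.45,
    baseline_handoff_rate=0.20,
    critical_function_failures=0,
    complaints_spiking=False,
))
```

Any single trigger is enough to roll back; the list only tells you which ones to mention when you document the incident.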

## Tips

* **Start small** – if you can't do everything, prioritize daily checks and pre-deployment checks (highest ROI)
* **Let Smart Analyst do the heavy lifting** – instead of manually reviewing calls, use [Smart Analyst deep sampling](/smart-analyst/introduction) to analyze hundreds of conversations in minutes. It's especially effective for weekly and monthly reviews where you need to spot patterns across large volumes of data.
* **Automate** – use [test sets](/simulation-testing/introduction) for regression testing and the [Alerts API](/api-reference/alerts/introduction) for anomaly notifications
* **Adjust frequency** – high-volume or mission-critical agents need tighter monitoring; stable agents can relax the schedule