OpenAI Models
GPT-5
The newest general-purpose model with strong reasoning and conversational ability. Best for high-quality interactions requiring nuance.
GPT-5 chat
Optimised for extended dialogue and conversational stability.
GPT-5 mini
A lighter version of GPT-5, offering lower latency and reduced cost for mid-complexity use cases.
GPT-5 nano
A highly efficient variant suitable for simple tasks and fast-response workloads.
GPT-4o
A powerful, versatile model balancing reasoning, speed, and cost.
GPT-4o mini
A smaller, faster version ideal for everyday queries and high-volume deployments.
GPT-4.1
A refined GPT-4 generation with strong reasoning and improved performance across tasks.
GPT-4.1 mini
A cost-effective, latency-focused variant for lighter workloads.
GPT-4.1 nano
The most lightweight option in the GPT-4.1 family, designed for minimal compute and high throughput.
PolyAI Models
PolyAI Raven V2
A production-hardened PolyAI model optimised for real-time voice interactions and high retrieval precision.
PolyAI Raven V3
The latest Raven model with improved grounding, paraphrasing, and robustness for enterprise voice use cases.
Amazon Bedrock Models
Bedrock Claude 3.5 Haiku
A fast, lightweight Claude variant suitable for simple, predictable tasks with strong safety alignment.
Bedrock Nova Micro
Amazon’s compact LLM optimised for efficiency while maintaining strong general-purpose performance.
Configuring the model

- Open Agent Settings → Large Language Model.
- Select the desired model from the dropdown.
- Click Save to apply your changes.
Supported providers:
- OpenAI
- Anthropic (Claude)
- Google DeepMind (Gemini)
- Mistral
- Amazon Nova Micro
- Contact PolyAI for information about Raven, PolyAI’s proprietary LLM.
Bring Your Own Model (BYOM)
PolyAI supports bring-your-own-model (BYOM) via a simple API integration. If you run your own LLM, expose an endpoint that follows the OpenAI `chat/completions` schema and PolyAI will treat it like any other provider.
Overview
- Expose an API endpoint that accepts and returns data in the OpenAI `chat/completions` format.
- Provide authentication: PolyAI can send either an `x-api-key` header or a Bearer token.
- (Optional) Support streaming responses using `stream: true`.
API endpoint
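PolyAI will POST requests to the URL you supply. The exact path is up to you; a common convention (illustrative hostname, not a PolyAI requirement) is:

```
POST https://llm.example.com/v1/chat/completions
Content-Type: application/json
```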
Request format
Optional sampling parameters (`temperature`, `frequency_penalty`, `presence_penalty`, etc.) may also be included in the request; your endpoint can ignore any it does not support.
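A minimal request in the OpenAI `chat/completions` format looks like this (the model name is a placeholder):

```json
{
  "model": "my-custom-model",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"}
  ],
  "temperature": 0.7,
  "stream": false
}
```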
Response format
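Your endpoint should reply with a JSON body in the OpenAI `chat/completions` response shape. A minimal example (IDs, timestamps, and token counts illustrative):

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "my-custom-model",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hi! How can I help?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 8, "total_tokens": 20}
}
```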
Streaming support (optional)
If `stream` is `true`, send Server-Sent Events (SSE) mirroring OpenAI’s format:
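A sketch of the event stream, with each chunk following OpenAI’s `chat.completion.chunk` shape (IDs and content illustrative), terminated by a `[DONE]` sentinel:

```
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hel"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"lo"},"finish_reason":"stop"}]}

data: [DONE]
```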
Authentication
| Method | Header sent by PolyAI |
|---|---|
| API Key | x-api-key: YOUR_API_KEY |
| Bearer | Authorization: Bearer YOUR_TOKEN |
Sample implementation (Python / Flask)
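A minimal Flask sketch of a BYOM endpoint, assuming the request/response schema described above. The route path, API key handling, and the echo stub standing in for a real model call are all placeholders:

```python
import time
import uuid

from flask import Flask, jsonify, request

app = Flask(__name__)
API_KEY = "YOUR_API_KEY"  # placeholder; load from a secret store in production


@app.route("/v1/chat/completions", methods=["POST"])
def chat_completions():
    # Accept either authentication header PolyAI may send.
    key = request.headers.get("x-api-key")
    bearer = request.headers.get("Authorization", "")
    if key != API_KEY and bearer != f"Bearer {API_KEY}":
        return jsonify({"error": "unauthorized"}), 401

    body = request.get_json(force=True)
    messages = body.get("messages", [])

    # Call your own LLM here; this stub echoes the last user message.
    last_user = next(
        (m["content"] for m in reversed(messages) if m.get("role") == "user"),
        "",
    )

    # Return a response in the OpenAI chat/completions shape.
    return jsonify({
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": body.get("model", "my-custom-model"),
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": f"Echo: {last_user}"},
            "finish_reason": "stop",
        }],
        "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
    })


# To serve locally: app.run(host="0.0.0.0", port=8000)
```

Swap the echo stub for a call to your model, and add SSE handling if you opt into streaming.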
Final checklist
- Endpoint reachable via POST.
- Request/response match the OpenAI `chat/completions` schema.
- Authentication header configured (API Key or Bearer token).
- (Optional) Streaming supported if needed.
When you are ready, share the following with PolyAI:
- Endpoint URL
- Model ID
- Auth method & credential

