The signaling channel is a WebSocket connection used to exchange SDP offers/answers and ICE candidates between your client and the WebRTC Gateway.

Endpoint

wss://webrtc-gateway.us-1.platform.polyai.app/api/v1/webrtc/signal

Message format

All messages are JSON objects with a common top-level structure.

type (string, required)
Message type. One of offer, answer, ice-candidate, error, close.

sessionId (string, required)
Session identifier. Send an empty string ("") when creating a new session with an offer.

data (object)
Message-specific payload. Structure depends on the message type.

authToken (string)
Authentication token. Required in the offer message only.

mode (string)
Agent mode for the session. Defaults to end-to-end when omitted.

| Value | Description |
| --- | --- |
| end-to-end | End-to-end mode. A single speech-to-speech model handles audio input and output. |
| traditional | Traditional mode. Audio cascades through separate ASR, LLM, and TTS stages. |

Unknown values are rejected with an INVALID_ARGUMENT error. Legacy values (agent, agent_v1, agent_v2, cascaded, normal, realtime) are still accepted but resolve to a canonical value and may be removed in a future release. Update integrations to use the canonical names.
Note: echo mode is restricted to debug environments. Echo mode (mode: "echo") loops your audio back to the client and is intended for connectivity testing. Production gateways reject echo offers with a FORBIDDEN error before any session is created. Use an agent mode (end-to-end by default) for normal voice agent traffic.

callSid (string)
Unique call identifier (camelCase – this is distinct from the Outbound Calling API’s call_sid field).

caller (string)
Calling number.

callee (string)
Called number.

accountId (string)
Account identifier.

projectId (string)
Project identifier.

variantId (string)
Optional variant override.

agentVersionOverride (object)
Optional pinning of a specific agent build. Both fields are required when set:
  • artifactVersion (string)
  • lambdaDeploymentVersion (string)
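
The envelope described above can be written down as a TypeScript type with a small parser for incoming frames. This is a minimal sketch, not an official SDK: the names `SignalMessage`, `SignalType`, and `parseSignalMessage` are assumptions made for illustration, and only the field names and the five message types come from this page.

```typescript
// Hypothetical types for illustration; field names follow the reference above.
type SignalType = "offer" | "answer" | "ice-candidate" | "error" | "close";

interface SignalMessage {
  type: SignalType;
  sessionId: string;
  data?: Record<string, unknown>;
  authToken?: string;
  mode?: "end-to-end" | "traditional";
  callSid?: string;
  caller?: string;
  callee?: string;
  accountId?: string;
  projectId?: string;
  variantId?: string;
  agentVersionOverride?: {
    artifactVersion: string;
    lambdaDeploymentVersion: string;
  };
}

const SIGNAL_TYPES: readonly string[] = [
  "offer", "answer", "ice-candidate", "error", "close",
];

// Parses a raw text frame and checks the two required fields.
// Returns null instead of throwing so callers can skip malformed frames.
function parseSignalMessage(raw: string): SignalMessage | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const fields = value as Record<string, unknown>;
  const type = fields["type"];
  const sessionId = fields["sessionId"];
  if (typeof type !== "string" || SIGNAL_TYPES.indexOf(type) === -1) return null;
  if (typeof sessionId !== "string") return null;
  return fields as unknown as SignalMessage;
}
```

The parser only checks the envelope. Validation of the `data` payload depends on the message type and is left to the handler for each operation.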

Operations

Send offer

Direction: client to server

Starts a new session. Send with an empty sessionId and include your authToken. The data field contains the SDP offer:

data.type (string, required)
Must be "offer".

data.sdp (string, required)
Full SDP string from your local peer connection.

Example
{
  "type": "offer",
  "sessionId": "",
  "data": {
    "type": "offer",
    "sdp": "v=0\r\no=- 4611731400430051336 2 IN IP4 127.0.0.1..."
  },
  "authToken": "your-auth-token",
  "callSid": "call-unique-id",
  "caller": "+14155551234",
  "callee": "+14155555678"
}
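
A client would typically assemble this message from the SDP of its local description. The helper below is a sketch under assumptions: `buildOfferMessage` and `OfferOptions` are hypothetical names, and only the JSON field names are taken from this page.

```typescript
// Hypothetical helper; only the JSON field names come from the reference.
interface OfferOptions {
  authToken: string;
  mode?: "end-to-end" | "traditional";
  callSid?: string;
  caller?: string;
  callee?: string;
}

// Serializes an offer message ready to send over the signaling WebSocket.
function buildOfferMessage(sdp: string, options: OfferOptions): string {
  const message = {
    type: "offer",
    sessionId: "", // an empty string asks the gateway to create a new session
    data: { type: "offer", sdp },
    ...options, // optional fields that are not set are left out of the JSON
  };
  return JSON.stringify(message);
}
```

In a browser, the `sdp` argument would usually be the `sdp` property of the object returned by `RTCPeerConnection.createOffer()`, after it has been applied with `setLocalDescription()`.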

Receive answer

Direction: server to client

Sent by the server in response to a valid offer. Contains the SDP answer and the assigned sessionId. Store the sessionId and use it for all subsequent messages. The data field contains the SDP answer:

data.type (string, required)
Must be "answer".

data.sdp (string, required)
Full SDP string from the server.

Example
{
  "type": "answer",
  "sessionId": "550e8400-e29b-41d4-a716-446655440000",
  "data": {
    "type": "answer",
    "sdp": "v=0\r\no=- 4611731400430051336 2 IN IP4 192.168.1.1..."
  }
}
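
On the client, handling the answer comes down to two things: keeping the sessionId and handing the SDP to the peer connection. The following sketch assumes a hypothetical `readAnswer` helper; the shape of its return value is an illustration and not part of the gateway API.

```typescript
// Hypothetical helper name and return shape, for illustration only.
interface AnswerResult {
  sessionId: string;
  // Shaped so it can be passed to RTCPeerConnection.setRemoteDescription().
  description: { type: "answer"; sdp: string };
}

// Validates an incoming answer frame and extracts what the client must keep.
// Throws on malformed JSON (SyntaxError) or on a message that is not an answer.
function readAnswer(raw: string): AnswerResult {
  const msg = JSON.parse(raw) as {
    type?: unknown;
    sessionId?: unknown;
    data?: { type?: unknown; sdp?: unknown };
  };
  const sessionId = msg.sessionId;
  if (msg.type !== "answer" || typeof sessionId !== "string" || sessionId === "") {
    throw new Error("not a valid answer message");
  }
  const sdp = msg.data ? msg.data.sdp : undefined;
  if (!msg.data || msg.data.type !== "answer" || typeof sdp !== "string") {
    throw new Error("answer is missing its SDP payload");
  }
  return { sessionId, description: { type: "answer", sdp } };
}
```

The returned `sessionId` is the value to include in every later ice-candidate and close message.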

Exchange ICE candidates

Direction: bidirectional

Sent by both client and server to exchange network connectivity candidates. Continue exchanging until the WebRTC connection is established. The data field contains the ICE candidate:

data.candidate (string, required)
ICE candidate string.

data.sdpMid (string, required)
Media stream identification tag.

data.sdpMLineIndex (integer, required)
Zero-based index of the media description in the SDP.

Example
{
  "type": "ice-candidate",
  "sessionId": "550e8400-e29b-41d4-a716-446655440000",
  "data": {
    "candidate": "candidate:1 1 UDP 2130706431 192.168.1.1 54321 typ host",
    "sdpMid": "0",
    "sdpMLineIndex": 0
  }
}
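
A local candidate from the browser can be wrapped into this message shape as sketched below. `buildIceCandidateMessage` and `CandidateInit` are hypothetical names. Skipping the end-of-candidates signal is a choice made in this sketch, since this page does not describe a message for it.

```typescript
// Structural subset of the browser's RTCIceCandidateInit dictionary.
interface CandidateInit {
  candidate?: string;
  sdpMid?: string | null;
  sdpMLineIndex?: number | null;
}

// Wraps a local candidate for sending. Returns null when there is nothing to
// send: a null or empty candidate (end of gathering), or a candidate missing
// one of the fields this page marks as required.
function buildIceCandidateMessage(
  sessionId: string,
  candidate: CandidateInit | null,
): string | null {
  if (!candidate || !candidate.candidate) return null;
  if (candidate.sdpMid == null || candidate.sdpMLineIndex == null) return null;
  return JSON.stringify({
    type: "ice-candidate",
    sessionId,
    data: {
      candidate: candidate.candidate,
      sdpMid: candidate.sdpMid,
      sdpMLineIndex: candidate.sdpMLineIndex,
    },
  });
}
```

In a browser, the input would typically come from the `icecandidate` event, for example `event.candidate ? event.candidate.toJSON() : null`.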

Close

Direction: client to server

Terminates the session gracefully. Send when you want to end the call.

Example
{
  "type": "close",
  "sessionId": "550e8400-e29b-41d4-a716-446655440000"
}

Error

Direction: server to client

Sent when the server encounters an error during the session. The data field contains the error details:

data.code (string, required)
Error code identifying the failure type.

data.message (string, required)
Human-readable error description.

Example
{
  "type": "error",
  "sessionId": "550e8400-e29b-41d4-a716-446655440000",
  "data": {
    "code": "UNAUTHORIZED",
    "message": "Invalid authentication token"
  }
}

Error codes

| Code | Description |
| --- | --- |
| UNAUTHORIZED | Invalid or missing authentication token |
| FORBIDDEN | The requested action is not permitted on this gateway |
| INVALID_ARGUMENT | Request field has an invalid value (for example, an unknown mode) |
| INVALID_MESSAGE | Malformed or unsupported message format |
| HANDLER_ERROR | Error processing the signaling message |
| MEDIA_BRIDGE_FAILURE | Failed to establish the media connection |
| AGENT_FAILURE | Error connecting to the PolyAI agent |
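
A client may want to recognize the codes listed above while still tolerating codes it has not seen. The sketch below shows one way to do that. `isKnownErrorCode` and `formatSignalingError` are hypothetical helpers, and the log format is an arbitrary choice.

```typescript
// Codes copied from the table above; the helper names are hypothetical.
const KNOWN_ERROR_CODES = [
  "UNAUTHORIZED",
  "FORBIDDEN",
  "INVALID_ARGUMENT",
  "INVALID_MESSAGE",
  "HANDLER_ERROR",
  "MEDIA_BRIDGE_FAILURE",
  "AGENT_FAILURE",
] as const;

type KnownErrorCode = typeof KNOWN_ERROR_CODES[number];

function isKnownErrorCode(code: string): code is KnownErrorCode {
  return (KNOWN_ERROR_CODES as readonly string[]).indexOf(code) !== -1;
}

// Formats an error message for logging. Unlisted codes are passed through
// with a marker rather than rejected, in case the gateway sends a code that
// is not in the table.
function formatSignalingError(msg: {
  sessionId: string;
  data: { code: string; message: string };
}): string {
  const tag = isKnownErrorCode(msg.data.code)
    ? msg.data.code
    : `UNLISTED(${msg.data.code})`;
  const session = msg.sessionId === "" ? "no session" : `session ${msg.sessionId}`;
  return `[${tag}] ${msg.data.message} (${session})`;
}
```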

Connection flow

1. Open WebSocket
   Connect to the signaling endpoint using a WebSocket client.

2. Send offer
   Create a local RTCPeerConnection, add your microphone track, generate an SDP offer, and send it with your authToken.

3. Receive answer
   The server responds with an SDP answer and a sessionId. Set the remote description on your peer connection.

4. Exchange ICE candidates
   Forward ICE candidates from your onicecandidate handler. Add incoming candidates from the server to your peer connection.

5. Audio flows
   Once ICE negotiation completes, bidirectional audio streams between the browser and the PolyAI agent.

6. Close session
   Send a close message when the conversation ends.
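
The client side of the steps above can be sketched as a small session tracker that is independent of any particular WebSocket or WebRTC library. `SignalingSession` is a hypothetical class. Queuing local candidates until the answer arrives is an assumption of this sketch: it follows from the instruction to use the assigned sessionId for all subsequent messages, and is not a documented requirement.

```typescript
// Hypothetical client-side session tracker; all names are illustrative.
type Outgoing = { type: string; sessionId: string; data?: unknown; authToken?: string };
type Candidate = { candidate: string; sdpMid: string; sdpMLineIndex: number };

class SignalingSession {
  private sessionId = "";
  private pending: Candidate[] = [];

  // `send` is whatever transmits a text frame on the signaling channel.
  constructor(private send: (frame: string) => void) {}

  // Step 2: send the offer with an empty sessionId and the auth token.
  start(sdp: string, authToken: string): void {
    this.emit({ type: "offer", sessionId: "", data: { type: "offer", sdp }, authToken });
  }

  // Step 3: remember the sessionId from the answer, then flush queued candidates.
  onAnswer(sessionId: string): void {
    this.sessionId = sessionId;
    for (const c of this.pending) this.sendCandidate(c);
    this.pending = [];
  }

  // Step 4: forward a local candidate, or queue it until the sessionId is known.
  addLocalCandidate(c: Candidate): void {
    if (this.sessionId === "") this.pending.push(c);
    else this.sendCandidate(c);
  }

  // Step 6: close the session. Does nothing if no session was established.
  close(): void {
    if (this.sessionId === "") return;
    this.emit({ type: "close", sessionId: this.sessionId });
    this.sessionId = "";
  }

  private sendCandidate(c: Candidate): void {
    this.emit({ type: "ice-candidate", sessionId: this.sessionId, data: c });
  }

  private emit(message: Outgoing): void {
    this.send(JSON.stringify(message));
  }
}
```

In a browser, `send` could be `(frame) => ws.send(frame)` for an open `WebSocket`, `start` would be called with the SDP of the local offer, and `addLocalCandidate` would be fed from the peer connection's `icecandidate` event. Applying the remote answer and incoming candidates to the peer connection (`setRemoteDescription`, `addIceCandidate`) happens outside this class.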
Last modified on May 11, 2026