Use Connected Knowledge when you have existing content (help articles, PDFs, internal docs) that your agent should reference without you rewriting it all as individual topics. If your knowledge lives across websites, documents, and help desks, Connected Knowledge aggregates these sources and re-syncs them automatically.
Use PolyAI’s Raven LLM for best results — it paraphrases unstructured content more naturally than other models.
The Connected tab is found under Build > Knowledge > Connected in Agent Studio.
Use Connected Knowledge when you want to expose large volumes of external content quickly without curating individual topics. Use Managed Topics instead when you need actions, flows, or precise control over what the agent says and does. Both use RAG (retrieval-augmented generation) to match user queries.

Why use multiple knowledge sources?

Your organization may have knowledge living in:
  • websites
  • documents such as PDFs, CSV, JSON, and other internal reference files
  • existing help desk systems or applications, such as Zendesk or Gladly
Connected knowledge brings these together, keeps them updated, and lets you reuse them across projects without rewriting content.

How Connected knowledge differs from Managed Topics

Both the Connected and Managed Topics tabs within the Knowledge area expose information to your agent, but they serve different purposes:
Capability | Connected tab | Managed Topics tab
Trigger actions, functions, flows, SMS | No | Yes
Precise control over agent responses | No | Yes
Auto-sync from external sources | Yes | No
Best for frequently updated FAQ content | Yes | -
Best for stable, structured info | - | Yes
Fine-grained behavior control | No | Yes
Setup complexity | Low (no prompting skill required) | Higher (requires more expertise and maintenance)
In short: Use the Connected tab when you need a fast way to expose external knowledge (websites, files, help desks) without curating individual topics. Use Managed Topics when you need actions, flows, or precise control over what the agent says and does.
If both tabs contain conflicting information, Managed Topics always takes priority.

Add a new source

  1. Go to Build → Knowledge → Connected tab
  2. Select New source
  3. Choose one of:
    • Upload files
    • Add URL
    • Zendesk
    • Gladly
    • Additional integrations are in development — contact your PolyAI representative for the latest availability
  4. Complete the required details and click Add
Your agent will begin syncing the content. Once ready, the source appears in the list.

Supported source types

Source type | Details
Upload files: Text & structured data | .txt, .csv, .json, .xml, .md, .html, .rtf
Upload files: PDF | .pdf
Upload files: Microsoft Office | .docx, .doc, .docm, .xlsx, .xls, .xlsm, .pptx, .ppt, .pptm, .msg
Upload files: OpenDocument | .odt, .ods, .odp
Upload files: Email files | .eml
Upload files: E-books | .epub
URL scraping | Public documentation pages and help center articles
Zendesk (beta) | Help Center content with API sync
Gladly (beta) | Knowledge source sync
Additional integrations | In development; contact your PolyAI representative for the latest availability
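Before uploading a batch of files, it can help to check them against the extensions listed above. The following is a minimal sketch, not part of Agent Studio: the `SUPPORTED_EXTENSIONS` mapping and the `upload_category` function are hypothetical names, the extension lists are copied from the table, and case-insensitive matching is an assumption.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported source types table.
# The mapping and function names are illustrative, not a PolyAI API.
SUPPORTED_EXTENSIONS = {
    "Text & structured data": {".txt", ".csv", ".json", ".xml", ".md", ".html", ".rtf"},
    "PDF": {".pdf"},
    "Microsoft Office": {".docx", ".doc", ".docm", ".xlsx", ".xls", ".xlsm",
                         ".pptx", ".ppt", ".pptm", ".msg"},
    "OpenDocument": {".odt", ".ods", ".odp"},
    "Email files": {".eml"},
    "E-books": {".epub"},
}


def upload_category(filename: str) -> Optional[str]:
    """Return the table category for a file, or None if its extension is not listed."""
    # Assumption: extensions are compared case-insensitively.
    suffix = Path(filename).suffix.lower()
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return category
    return None
```

Files for which `upload_category` returns `None` would need converting to a listed format before upload.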

What exactly gets scraped when I add a URL?

URL scraping traverses linked pages from the provided URL, with the following limits:
  1. Depth → Only one level below the initial URL.
  2. Breadth → A maximum of 10 embedded pages.
If your page contains more than 10 links, not all of them will be scraped. In that case, add the remaining URLs individually or use an integration such as Zendesk or Gladly for complete coverage.
Where possible, we recommend connecting applications such as Zendesk rather than relying on websites.
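To estimate which pages a single URL source would cover, the depth and breadth limits can be modelled as below. This is a sketch under stated assumptions, not the scraper's actual logic: `plan_scrape` is a hypothetical helper, it works on a list of links you supply (it makes no network requests), and it assumes links are taken in page order, which the limits above do not specify.

```python
from typing import Iterable, List, Tuple

MAX_LINKED_PAGES = 10  # breadth limit stated above


def plan_scrape(start_url: str,
                links_on_start_page: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split the links found on the start page into (scraped, skipped).

    Depth is limited to one level, so only links found directly on the
    start page are considered; links on those pages are never followed.
    Assumption: links are taken in page order.
    """
    scraped: List[str] = []
    skipped: List[str] = []
    seen = {start_url}
    for link in links_on_start_page:
        if link in seen:
            continue  # ignore duplicates and links back to the start page
        seen.add(link)
        if len(scraped) < MAX_LINKED_PAGES:
            scraped.append(link)
        else:
            skipped.append(link)
    return scraped, skipped
```

Any URLs returned in `skipped` are candidates to add as separate URL sources.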

Keeping content fresh

After external content changes:
  • click Update to re-scrape files or URLs
  • or use the Sync icon per source
If a URL requires a login, or if its credentials change, syncing may fail. Update the access details and retry.

Group and manage sources

Group sources by product line, team, region, or document type. Sort by newest, oldest, type, or name. Each source offers:
  • Sync
  • Rename
  • Move to group
  • Remove

Why isn’t my agent using the sources I connected?

Several factors affect retrieval:

Data structure

Connected knowledge splits content into 2000-character chunks with a 500-character overlap. Retrieval may be less relevant for very large documents, or when related sections sit far apart and end up in different chunks. What to do:
  • Restructure documents into smaller, tighter pieces.
  • Repeat key headings or terms.
  • Or curate the material as a managed topic for guaranteed usage.
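To see why widely separated sections rarely end up together, the chunk sizes above can be reproduced with a simple sliding window. This is a minimal sketch that assumes fixed character windows; the real splitter may treat word or paragraph boundaries differently, and `chunk_text` is a hypothetical name.

```python
from typing import List


def chunk_text(text: str, size: int = 2000, overlap: int = 500) -> List[str]:
    """Split text into fixed-size character windows that overlap.

    Each chunk starts (size - overlap) characters after the previous one,
    so consecutive chunks share `overlap` characters.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

With these defaults a 5000-character document yields three chunks starting at characters 0, 1500, and 3000, so a statement near the start and a related one near the end never share a chunk.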

Update state

Two updates must be current:
  • Source Update → keeps the data in each source fresh
  • Agent Update → applies knowledge connection changes to the agent
Both can be triggered manually. Agent updates also run automatically every few minutes.

Environments, variants, saved changes

Each source must be enabled in the correct environment and variant. Any edits must be saved before leaving the page.

Conflicting information?

If Managed Topics and Connected knowledge contain conflicting data, content from the Managed Topics tab is always prioritised.

Viewing Connected Knowledge in Conversation Review

When your agent retrieves content from Connected Knowledge during a conversation, you can see exactly which sources were used in Conversation Review.
  1. Open a conversation in Analytics > Conversations.
  2. In the Diagnosis dropdown, toggle Sources on.
  3. Each turn where Connected Knowledge was retrieved shows a Sources tag beneath the agent’s response, alongside any matched Managed Topics.
  4. Click a source name to open an inline preview panel showing the exact text chunks the agent used.
  5. Use Open in Knowledge in the panel to navigate directly to the source in the Knowledge area.
This is useful for:
  • Verifying the agent retrieved the correct content for a given question
  • Debugging cases where the agent’s response seems inaccurate or incomplete
  • Confirming that newly added or updated sources are being picked up
Combine the Sources and Topic citations diagnosis layers to see both Connected Knowledge and Managed Topics side by side for each turn.

Behavior and configuration notes

  • Use PolyAI’s Raven LLM for best results — it paraphrases structured and unstructured content more naturally.
  • Connected knowledge results are given ranking priority to ensure they surface alongside Managed Topics.
  • Connected knowledge and Managed Topics data are merged at runtime.

Last modified on March 31, 2026