Source Hub is in beta. Behaviour may change as we extend supported integrations and scale performance.
Source Hub lets you connect and manage external knowledge sources so your agent can reference accurate, up-to-date information when responding to customer queries. It is designed for teams whose information is spread across documentation sites, help desk systems, PDFs, and internal materials that change over time.

What is Source Hub?

Source Hub is a fast way of exposing external knowledge to your agent. You can connect URLs, files, or platform integrations, keep them synced, and toggle availability across environments or variants. Source Hub does not replace the Knowledge Base. They work together, and each offers different levels of control and complexity.

Why use multiple knowledge sources?

Your organization may have knowledge living in:
  • documentation websites
  • policy and compliance PDFs
  • help desk systems like Zendesk or Gladly
  • CSV, JSON, and other internal reference files
Source Hub brings these together, allows you to keep them updated, and lets you reuse them across projects without rewriting content.

How Source Hub differs from the Knowledge Base

Both Source Hub and the Knowledge Base expose information to your agent. They differ in the following ways:
Source Hub
  • A connection layer for external knowledge.
  • Fast to set up and simple to manage.
  • Ideal for FAQ-style agents and large volumes of continuously updated content.
  • No prompting skill required.
  • Cannot: trigger actions, flows, SMS, hand-offs, or other agentic functions.
  • Cannot: specify utterances or control when/why the agent uses specific pieces of information.
Knowledge Base
  • A curated library of topics and prompts.
  • Offers fine-grained control over utterances, behaviours, and what the agent says.
  • Can trigger functions, flows, and other agentic actions.
  • Requires more time, expertise, and maintenance — but enables anything beyond a simple FAQ bot.
Feature | Pain points solved | Use cases
Source Hub | Helps teams avoid maintaining another curated knowledge base and gives non-technical users a simple, fast way to connect and manage external data. | Best for FAQ bots, quickly incorporating external knowledge, and controlling data access across environments or variants.
Knowledge Base | Solves the need for actions, functions, flows, SMS triggers, and offers precise, curated control over agent utterances. | Ideal for agentic behaviour, structured and stable knowledge, and projects requiring complex logic or fine-grained control.

Add a new source

  1. Go to Build → Source Hub
  2. Select New source
  3. Choose one of:
    • Upload files
    • Add URL
    • Zendesk
    • Gladly
  4. Complete the required details and click Add
Your agent will begin syncing the content. Once ready, the source appears in the list.

Supported source types

Type | Examples
Upload files | PDF, TXT, DOCX, CSV, JSON
URL scraping | Documentation pages, help centre articles
Zendesk (beta) | Help centre + API access
Gladly (beta) | Knowledge source sync

What exactly gets scraped when I add a URL?

Source Hub uses a third-party scraper that traverses linked pages, but with limits:
  1. Depth → Only one level below the initial URL.
  2. Breadth → A maximum of 10 embedded pages.
If your page contains more than 10 links, not all of them will be scraped. In that case, add the remaining URLs individually or use an integration such as Zendesk or Gladly for complete coverage.
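
The limits above can be illustrated with a minimal sketch. This is not the scraper Source Hub uses (that is a third-party component whose selection rules are not documented here). The function name `plan_scrape`, the assumption that links are taken in page order, and the example URLs are all hypothetical. The sketch only shows which linked pages would fall outside a 10-page cap, so you can see which URLs to add individually.

```python
MAX_LINKED_PAGES = 10  # breadth limit stated above


def plan_scrape(start_url: str, links_on_page: list[str]) -> tuple[list[str], list[str]]:
    """Split links into (likely scraped, likely skipped).

    Assumptions (not confirmed by the documentation): links are taken in
    page order, duplicates are ignored, and only links found directly on
    the start page count, matching a depth of one level.
    """
    seen = {start_url}
    unique = []
    for link in links_on_page:
        if link not in seen:
            seen.add(link)
            unique.append(link)
    return unique[:MAX_LINKED_PAGES], unique[MAX_LINKED_PAGES:]


# Hypothetical documentation page with 13 outgoing links.
links = [f"https://docs.example.com/page-{i}" for i in range(1, 14)]
scraped, skipped = plan_scrape("https://docs.example.com/", links)

print(len(scraped), "linked pages fit within the limit")
print("Add these URLs individually:", skipped)
```

Under these assumptions, any link past the tenth is a candidate for a separate Add URL source.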

Keeping content fresh

After external content changes:
  • click Update to re-scrape files or URLs
  • or use the Sync icon per source
If a URL requires a login, or if its credentials change, syncing may fail. Update the access details and retry.

Group and manage sources

Group sources by product line, team, region, or document type. Sort by newest, oldest, type, or name. Each source offers:
  • Sync
  • Rename
  • Move to group
  • Remove

Why isn’t my agent using the sources I connected?

Several factors affect retrieval:

Data structure

Source Hub splits content into 2000-character chunks with a 500-character overlap. In very large documents, or where related sections sit far apart, the relevant passages can land in different chunks, which makes retrieval less reliable.
What to do:
  • Restructure documents into smaller, tighter pieces.
  • Repeat key headings or terms.
  • Or curate the material as a Knowledge Base topic for guaranteed usage.
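
To make the chunk sizes concrete, here is a minimal sketch of a sliding-window splitter using the 2000-character size and 500-character overlap stated above. The exact algorithm Source Hub uses is not documented here, so the function name `split_into_chunks` and the fixed-stride behaviour (no respect for sentence or heading boundaries) are assumptions for illustration only.

```python
CHUNK_SIZE = 2000  # characters per chunk, as stated above
OVERLAP = 500      # characters shared between neighbouring chunks


def split_into_chunks(text: str, size: int = CHUNK_SIZE, overlap: int = OVERLAP) -> list[str]:
    """Sliding-window split: each chunk starts (size - overlap) characters
    after the previous one. Assumed behaviour, not Source Hub's actual code."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the final chunk already reaches the end of the text
        start += step
    return chunks


document = "x" * 5000  # stand-in for a 5000-character document
chunks = split_into_chunks(document)
print(len(chunks), "chunks with lengths", [len(c) for c in chunks])
```

Under this model, two passages more than 2000 characters apart can never share a chunk, which is why shorter documents and repeated key terms tend to help.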

Update state

Two updates must be current:
  • Source Update → keeps the data in each source fresh
  • Agent Update → applies Source Hub changes to the agent
Both can be triggered manually. Agent updates also run automatically every few minutes.

Environments, variants, saved changes

Each source must be enabled in the correct environment and variant. Any edits must be saved before leaving the page.

Conflicting information?

If the Knowledge Base and Source Hub contain conflicting data, the Knowledge Base wins. Knowledge Base content is always prioritised.

Tips & tricks

  • Use PolyAI’s Raven LLM for best results — it paraphrases structured and unstructured content more naturally.
  • Source Hub results receive a small bias boost to encourage use; this can be tuned if needed.
  • Source Hub data and Knowledge Base topics are merged at runtime.
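
As a rough mental model of the last two tips, the sketch below merges candidates from both origins and adds a small boost to Source Hub scores before ranking. Everything here is hypothetical: the `Candidate` type, the relevance scores, and the boost value of 0.05 are invented for illustration, and the sketch does not model how conflicts between the Knowledge Base and Source Hub are resolved.

```python
from dataclasses import dataclass

SOURCE_HUB_BOOST = 0.05  # hypothetical value; the real boost is not documented


@dataclass
class Candidate:
    text: str
    origin: str   # "source_hub" or "knowledge_base"
    score: float  # hypothetical relevance score, higher is better


def rank(candidates: list[Candidate], boost: float = SOURCE_HUB_BOOST) -> list[Candidate]:
    """Merge candidates from both origins and sort by boosted score."""
    def boosted(c: Candidate) -> float:
        return c.score + (boost if c.origin == "source_hub" else 0.0)
    return sorted(candidates, key=boosted, reverse=True)


candidates = [
    Candidate("Refund policy topic", "knowledge_base", 0.80),
    Candidate("Refund policy PDF chunk", "source_hub", 0.78),
    Candidate("Shipping FAQ chunk", "source_hub", 0.40),
]
for c in rank(candidates):
    print(c.origin, "|", c.text)
```

In this toy model a small boost only changes the order when scores are already close, which is one way to read "small bias boost".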