Training with Websites

Add website URLs to your knowledge base and Chatref will automatically crawl the pages and extract content for your chatbot to use. This is a quick way to train your chatbot using content you already have published online.

How Website Crawling Works

When you add a URL, Chatref visits the web page, reads the text content, and processes it for your knowledge base. The chatbot can then use this information to answer visitor questions.
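To get a feel for what "reading the text content" of a page means, here is a minimal sketch that uses only Python's standard library. It is an illustration, not Chatref's actual crawler: the class name `VisibleTextExtractor`, the function `extract_visible_text`, and the choice of which tags to skip are all assumptions made for this example.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes and skips tags whose content is never shown to readers."""

    # Assumed set of non-visible containers; a real crawler may use a different list.
    SKIPPED_TAGS = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse runs of whitespace so each chunk is a clean line of text.
        text = " ".join(data.split())
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_visible_text(html):
    """Return the visible text of an HTML document, one chunk per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Running `extract_visible_text` on a page keeps headings, paragraphs, and list items, while scripts, styles, and images contribute nothing to the result.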

How to Add a Website

1. Go to your Knowledge Base
   Open your agent’s dashboard and click the Knowledge Base tab.

2. Select the Websites tab
   Click on Websites to see your crawled URLs and the option to add new ones.

3. Enter a URL
   Paste the full URL of the web page you want to crawl (e.g., https://example.com/about).

4. Submit and wait
   Click Add and Chatref will begin crawling the page. The training status will update as processing completes.
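If you are preparing a list of pages ahead of time, a quick check can catch addresses that are not full URLs before you paste them. The sketch below assumes that a "full URL" means an `http` or `https` address with a host name. That rule and the helper name `is_full_url` are assumptions for illustration, not Chatref's documented validation.

```python
from urllib.parse import urlparse


def is_full_url(url):
    """Return True if the URL has an http(s) scheme and a host name."""
    parts = urlparse(url.strip())
    # Assumption: only http and https pages are meant to be crawled.
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

For example, `is_full_url("https://example.com/about")` is true, while `is_full_url("example.com/about")` is false because the scheme is missing.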

What Gets Crawled

When you add a URL, Chatref extracts the text content from that page. This includes:
  • Page headings and body text
  • Lists and tables
  • Other visible text content

The domain of each URL is tracked automatically, making it easy to see which websites you’ve used for training.
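As a rough illustration of domain tracking, the following sketch groups a list of URLs by host name using Python's standard library. The function name `group_by_domain` is hypothetical, and treating each subdomain as its own domain is an assumption. Chatref may define a domain differently.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_domain(urls):
    """Map each host name to the URLs that belong to it."""
    groups = defaultdict(list)
    for url in urls:
        # hostname is lowercased by urlparse; it is None for relative URLs.
        domain = urlparse(url).hostname or ""
        groups[domain].append(url)
    return dict(groups)
```

With this approach, `https://example.com/about` and `https://example.com/pricing` land under `example.com`, while `https://blog.example.com/post` is listed separately under `blog.example.com`.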

Managing Your Websites

From the Websites tab, you can:
  • View all crawled URLs — See a list of all websites with their domain and last crawled date
  • Check training status — Monitor whether crawling is pending, in progress, completed, or failed
  • Delete websites — Remove website sources you no longer need
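If you keep your own record of the pages you have added, the four training statuses can be modeled as a small enumeration. This is a sketch for your own bookkeeping only: `CrawlStatus` and `urls_to_review` are hypothetical names, and the code does not call any Chatref API.

```python
from enum import Enum


class CrawlStatus(Enum):
    # Hypothetical names; the four states mirror the statuses listed above.
    PENDING = "pending"
    IN_PROGRESS = "in progress"
    COMPLETED = "completed"
    FAILED = "failed"


def urls_to_review(sources):
    """Return the URLs whose crawl failed, in their original order."""
    return [url for url, status in sources if status is CrawlStatus.FAILED]
```

A list like this makes it easy to spot which pages to check and add again after a failed crawl.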

When to Use Website Training

Website crawling is best for:
  • Existing web content — If you already have helpful content published on your website, crawling is the fastest way to add it
  • Blog posts and articles — Add your most informative posts to the knowledge base
  • Product pages — Help your chatbot answer product-related questions
  • Help center pages — Use your existing support documentation

Add your most important and frequently visited pages first. Focus on pages that contain information your visitors are most likely to ask about.

Limitations

  • Only the text content of the page is extracted — images, videos, and interactive elements are not included
  • Pages that require login or authentication cannot be crawled
  • Very large or complex pages may take longer to process

Keeping Content Fresh

Website content can change over time. If you update important pages on your website, consider re-adding the URL to ensure your chatbot has the latest information.
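One way to decide when a page is worth re-adding is to compare a fingerprint of its text against the one you saved when you last added it. The sketch below is an assumption-based example you would run on your own side. The helper names `content_fingerprint` and `has_changed` are hypothetical, and nothing here reflects how Chatref itself detects changes.

```python
import hashlib


def content_fingerprint(text):
    """Return a SHA-256 hash of the text, ignoring differences in whitespace."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint, current_text):
    """Return True if the current text no longer matches the saved fingerprint."""
    return content_fingerprint(current_text) != previous_fingerprint
```

Store the fingerprint when you add a URL. If `has_changed` later returns true for the page's current text, that is a good signal to re-add the URL.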