Overview

<aside> <img src="/icons/cursor_gray.svg" alt="/icons/cursor_gray.svg" width="40px" /> Hello world! This is a RAG chatbot that can be geared towards (a) sales, (b) customer service, or (c) capturing leads.

This project was built for a mid-size professional services firm that needed a focus on capturing leads with quick CTAs, while making sure the AI could answer all service-related questions.

</aside>



The Problem

Most businesses have websites full of useful content: service pages, pricing information, resources and guides, outcome statistics, and student success stories. Here, though, users reaching out on WhatsApp had no way to discover any of it on their own, and staff were answering the same 20 questions repeatedly.

The ask was simple in principle: build a chatbot that knows everything on the websites, answers questions naturally, and guides interested users toward booking a call.

The hard parts turned out to be:

  1. Actually getting the website content into the system: the sites were WordPress-based and did not respond well to aggressive crawling.
  2. Making the chatbot sound like a person, not a FAQ widget.
  3. Keeping URLs grounded in reality: the model kept making up links.
  4. Timing the sales call-to-action intelligently: showing it too early felt spammy, and showing it too late was useless.
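
To make the third point concrete, here is a minimal sketch of one way to keep links grounded: check every URL in the model's reply against the set of pages that were actually scraped, and strip anything else. This is an illustration under stated assumptions, not necessarily how this project solved it. The names `KNOWN_URLS` and `strip_unknown_urls`, the example.com addresses, and the `[link removed]` placeholder are all hypothetical.

```python
import re

# Hypothetical allowlist. In practice this would be the set of page URLs
# collected while scraping the sites.
KNOWN_URLS = {
    "https://example.com/services",
    "https://example.com/pricing",
}

# Matches http(s) URLs up to the next whitespace, quote, or bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>()\"']+")
TRAILING_PUNCTUATION = ".,;:!?"


def normalize(url: str) -> str:
    """Treat 'https://site/page' and 'https://site/page/' as the same page."""
    return url.rstrip("/")


def strip_unknown_urls(reply: str, known_urls: set[str]) -> str:
    """Replace any URL that is not in the allowlist with a placeholder."""
    allowed = {normalize(u) for u in known_urls}

    def replace(match: re.Match) -> str:
        raw = match.group(0)
        # Sentence punctuation glued to the end of a URL is not part of it.
        url = raw.rstrip(TRAILING_PUNCTUATION)
        trailing = raw[len(url):]
        if normalize(url) in allowed:
            return raw
        return "[link removed]" + trailing

    return URL_PATTERN.sub(replace, reply)


reply = "See https://example.com/pricing/ and https://example.com/free-trial."
print(strip_unknown_urls(reply, KNOWN_URLS))
```

A post-processing filter like this only catches invented links after the fact. It would typically sit alongside prompt instructions that tell the model to cite only URLs present in the retrieved context.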

Tech Stack

| Component         | Technology |
| ----------------- | ---------- |
| Language          | Python 3.11+ |
| Package manager   | uv |
| Web scraping      | httpx, BeautifulSoup4, lxml |
| LLM orchestration | LangChain (langchain, langchain-classic) |
| LLM providers     | OpenAI GPT-4o |
| Embeddings        | OpenAI `text-embedding-3-small` |
| Vector store      | ChromaDB (persistent, local) |

Core Features

Architecture

Overview

001D9905-9697-47D0-8DB7-C5881456B282.png

CB32137B-7112-4F46-B4A6-62B37575902E.png