Vector Search with PostgreSQL: Build a RAG Chatbot from Scratch

Why Vector Search with PostgreSQL?

Vector search is revolutionizing how we interact with data, enabling smarter, context-aware queries. By using PostgreSQL, a trusted and scalable database, you can implement vector search without introducing new tools or infrastructure. This approach is ideal for teams looking to enhance AI capabilities while keeping their tech stack simple and efficient. Whether you’re experimenting with LLMs or deploying production-grade RAG pipelines, PostgreSQL and pgvector offer a powerful combination.

Workshop Content

Understanding the Limitations of Keyword Search

To start, you’ll explore how traditional keyword-based search often fails to capture meaning. We’ll explain how semantic search, powered by embeddings, changes the game by focusing on intent and context instead of just matching words.
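To make the contrast concrete, here is a toy sketch (hand-made 3-dimensional vectors standing in for real model embeddings): a naive keyword match finds no overlap between a query and a paraphrased document, while cosine similarity on embedding vectors captures the shared meaning.

```python
# Toy illustration (not real embeddings): keyword search misses a paraphrase,
# while similarity between hypothetical embedding vectors captures it.
import math

def keyword_match(query: str, doc: str) -> bool:
    """Naive keyword search: any shared word counts as a hit."""
    return bool(set(query.lower().split()) & set(doc.lower().split()))

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 means identical direction in vector space."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

query = "how do I reset my password"
doc = "steps to recover a forgotten login credential"

# No word in common, so keyword search finds nothing:
print(keyword_match(query, doc))  # False

# Hand-crafted vectors standing in for embedding-model output;
# semantically close texts land close together in vector space.
query_vec = [0.9, 0.1, 0.2]
doc_vec = [0.85, 0.15, 0.25]
print(round(cosine(query_vec, doc_vec), 3))  # close to 1.0
```

A real system would replace the hand-made vectors with output from an embedding model, but the geometry works the same way.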

Mapping a Real-World RAG Architecture

Next, we’ll dive into the big picture. You’ll see how data flows through a RAG system—from the user’s query, through embedding and retrieval, and finally to the language model. This helps clarify exactly where PostgreSQL fits in this modern AI pipeline.
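The flow just described can be sketched as three composable stages. The function names and placeholder return values below are illustrative; in a real pipeline each stage would call an embedding model, PostgreSQL, and an LLM respectively.

```python
# The RAG data flow, sketched as composable stages with placeholder bodies.
def embed_query(query: str) -> list[float]:
    """Stage 1: turn the user's question into a vector via an embedding model."""
    return [0.1, 0.2, 0.3]  # placeholder vector

def retrieve_context(query_vec: list[float], k: int = 3) -> list[str]:
    """Stage 2: nearest-neighbor lookup in PostgreSQL via pgvector."""
    return ["relevant passage 1", "relevant passage 2"]  # placeholder rows

def generate_answer(query: str, context: list[str]) -> str:
    """Stage 3: hand the query plus retrieved context to the language model."""
    return f"Answering {query!r} using {len(context)} retrieved passages"

question = "What is pgvector?"
answer = generate_answer(question, retrieve_context(embed_query(question)))
print(answer)
```

PostgreSQL sits squarely in stage 2: it stores the embeddings and answers the nearest-neighbor query.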

Getting Started with pgvector

Then, you’ll get hands-on. With step-by-step instructions and ready-to-use scripts, you’ll install pgvector and set up your first embedding table. You can follow along on your laptop or in the cloud.
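A minimal setup might look like the following sketch. The table and column names are illustrative, and the 1536-dimension size matches the output of OpenAI's text-embedding-3-small model; adjust it to whichever embedding model you use.

```sql
-- Enable the extension (pgvector must be installed on the server first)
CREATE EXTENSION IF NOT EXISTS vector;

-- A first embedding table; names and the 1536-dim size are illustrative
CREATE TABLE documents (
    id        bigserial PRIMARY KEY,
    content   text NOT NULL,
    embedding vector(1536)
);

-- Optional: an HNSW index (recent pgvector versions) for fast
-- approximate nearest-neighbor search under cosine distance
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
```

The workshop scripts walk through the same steps with sample data loaded in.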

Choosing the Right Similarity Operator

After that, we’ll look at performance. You’ll compare similarity methods like cosine, Euclidean, and inner product. Through live SQL demos, you’ll understand how each option affects speed, accuracy, and storage.
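The three measures can be computed in plain Python to build intuition before trying them in SQL. The comments map each function to the corresponding pgvector operator; the two-element vectors are deliberately tiny to make the behavior obvious.

```python
# Computing pgvector's three distance measures in plain Python.
# In pgvector SQL these correspond to the operators:
#   <->  Euclidean (L2) distance
#   <=>  cosine distance (1 - cosine similarity)
#   <#>  negative inner product
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (na * nb)

def neg_inner_product(a, b):
    return -sum(x * y for x, y in zip(a, b))

a = [1.0, 0.0]
b = [2.0, 0.0]   # same direction as a, twice the magnitude

print(l2(a, b))                 # 1.0  -> magnitude matters
print(cosine_distance(a, b))    # 0.0  -> direction identical, magnitude ignored
print(neg_inner_product(a, b))  # -2.0 -> rewards both alignment and magnitude
```

This is why cosine distance is the usual default for normalized text embeddings, while inner product can be preferable when vector magnitude carries signal.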

Backend Integration with Python or Node

Once your vector search is working, you’ll learn how to connect it to real applications. Using LangChain and LangFlow templates, you’ll integrate SQL calls, filters, rankings, and LLM prompts into Python or Node backends.
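As a sketch of what the backend ends up sending to PostgreSQL, the helper below assembles a filtered, paginated top-k query. The table and column names (`documents`, `embedding`, `category`) are illustrative, and the `%(...)s` placeholders assume a psycopg-style driver; the workshop templates wire this into LangChain.

```python
# Illustrative builder for a filtered top-k similarity query with pagination.
# Placeholder style (%(name)s) assumes a psycopg-like driver binds the values.
def build_topk_query(k: int, offset: int = 0, category_filter: bool = False) -> str:
    where = "WHERE category = %(category)s" if category_filter else ""
    return f"""
        SELECT id, content, embedding <=> %(query_vec)s AS distance
        FROM documents
        {where}
        ORDER BY distance
        LIMIT {k} OFFSET {offset}
    """.strip()

sql = build_topk_query(k=5, category_filter=True)
print(sql)
```

Binding the query vector and filter values as parameters, rather than interpolating them into the string, keeps the call safe against SQL injection.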

Building a Minimal RAG Chatbot

Finally, you’ll bring it all together. You’ll embed a sample FAQ dataset, retrieve top-k matches, and pass the retrieved context to an OpenAI chat completion call. The result? A fully working chatbot that gives users grounded, real-time answers.
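In miniature, the retrieve-then-prompt loop looks like this. The `fake_embed` function is a stub standing in for a real embedding API call, and the in-memory list stands in for the PostgreSQL table; only the shape of the flow matches the capstone project.

```python
# Minimal end-to-end sketch of the retrieve-then-prompt loop, with a stubbed
# embedding function and an in-memory list standing in for PostgreSQL.
import math

def fake_embed(text: str) -> list[float]:
    """Stand-in for a real embedding API: counts a few hand-picked keywords."""
    vocab = ["password", "reset", "billing", "invoice", "shipping"]
    return [float(text.lower().count(w)) for w in vocab]

def cosine_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)

faq = [
    "To reset your password, open Settings and choose Reset Password.",
    "Invoices are emailed monthly from the billing team.",
    "Shipping usually takes three to five business days.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Top-k retrieval; in the workshop this becomes an ORDER BY ... LIMIT k query."""
    qv = fake_embed(query)
    ranked = sorted(faq, key=lambda d: cosine_sim(qv, fake_embed(d)), reverse=True)
    return ranked[:k]

question = "How do I reset my password?"
context = retrieve(question)
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: {question}"
print(context[0])  # the FAQ entry about resetting passwords
```

Sending `prompt` to a chat model closes the loop: the model answers from retrieved facts rather than from its training data alone.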

Objectives

By the end of this workshop, you will be able to:

  • Understand the limits of keyword-based search and the role of semantic vectors
  • Install and configure pgvector with a starter OpenAI embedding workflow
  • Write effective SQL for vector similarity, filtering, ranking, and pagination
  • Build and deploy a minimal RAG pipeline using only PostgreSQL
  • Apply best practices for performance, privacy, and storage optimization

Training approach

This workshop uses a hands-on blend of short talks, live SQL walkthroughs, and practical labs. First, you’ll be introduced to the core concepts through short, focused presentations. Then, you’ll immediately apply what you’ve learned in guided exercises—either in a cloud VM or on your own laptop.
Moreover, each topic builds on the previous one, reinforcing your understanding step by step. Finally, you’ll bring everything together in a capstone project: building a fully functional Q&A chatbot backed entirely by PostgreSQL.

Target audience

This workshop is tailored for:

  • DBAs ready to expand into semantic AI search
  • Backend and full-stack developers
  • Analytics engineers and platform teams
  • Anyone new to vector search or RAG who seeks a single-database solution

Prerequisites

To get the most out of this training, you should have:

  • Basic SQL knowledge
  • Ability to access and navigate a PostgreSQL instance
  • Beginner-level familiarity with Python

Key Benefits

  • One-stack simplicity: Store vectors, metadata, and business data in one PostgreSQL cluster
  • Faster prototyping: Use notebooks and LangChain blueprints to accelerate time to value
  • Operational confidence: Learn storage sizing, cost control, and observability from day one

Resources & Documentation

5% discount for SOUG, SwissPUG and DOAG members.

Trainers


Adrien Obernesser