Quick Answer
Combine a web crawler (or SerpAPI), an embedding model, a vector DB (pgvector), and a streaming LLM into a RAG-based search pipeline. Stack: Next.js 15 for the frontend, Supabase (self-hosted) for pgvector, and an assisters.dev-compatible API for inference.
- Time to MVP: 1-2 weeks
- Cost: $30-100/mo (API + VPS)
- Outcome: Cited, streaming answers from web or your docs
What You'll Need
- Next.js 15, TypeScript
- Supabase with pgvector extension
- SerpAPI or self-hosted SearXNG for web results
- Embedding API (OpenAI-compatible)
- Streaming LLM (assisters.dev-compatible)
Steps
- Design the pipeline. Query → web search → fetch top N pages → chunk text → embed → retrieve top chunks → LLM answer with citations → stream to UI.
- Set up pgvector. In Supabase: `create extension vector;` then `create table docs (id uuid primary key, url text, chunk text, embedding vector(1536));`. The vector dimension (1536 here) must match the output size of your embedding model.
- Build the search step. Use SerpAPI ($50/mo) or self-host SearXNG on your VPS (free). Fetch the top 10 results for the query.
- Scrape & chunk. For each URL, fetch the HTML, extract the main content (Readability.js or Trafilatura), and chunk it to ~500 tokens with a 50-token overlap.
- Embed & store. Call the embedding endpoint for each chunk. Upsert to the pgvector table. Use a `query_id` to group chunks (this needs an extra column on the `docs` table).
- Retrieve. Embed the user query, then `SELECT ... ORDER BY embedding <=> query_embedding LIMIT 8` to get the top chunks (`<=>` is pgvector's cosine distance operator).
- Stream LLM answer. Prompt: "Answer using ONLY these sources. Cite as [1], [2]. Refuse if sources don't cover it." Use streaming to reduce perceived latency.
- Render with citations. The frontend streams token-by-token, rendering `[1]` as a hoverable source link.
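The chunking and prompt-assembly steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `chunkText`, `buildPrompt`, and `SourceChunk` are hypothetical names, and whitespace-separated words stand in for tokens (a real tokenizer will count differently).

```typescript
// Hypothetical shape for a retrieved chunk; not part of any library.
interface SourceChunk {
  url: string;
  chunk: string;
}

// Split text into overlapping windows.
// ASSUMPTION: one whitespace-separated word approximates one token.
function chunkText(text: string, maxTokens = 500, overlap = 50): string[] {
  if (maxTokens <= 0 || overlap < 0 || overlap >= maxTokens) {
    throw new Error("need maxTokens > 0 and 0 <= overlap < maxTokens");
  }
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  const step = maxTokens - overlap; // each window starts `step` words after the previous one
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + maxTokens).join(" "));
    // Stop once this window has reached the end of the text.
    if (start + maxTokens >= words.length) break;
  }
  return chunks;
}

// Number the sources so the model can cite them as [1], [2], ...
function buildPrompt(question: string, sources: SourceChunk[]): string {
  const numbered = sources
    .map((s, i) => `[${i + 1}] ${s.url}\n${s.chunk}`)
    .join("\n\n");
  return [
    "Answer using ONLY these sources. Cite as [1], [2]. Refuse if sources don't cover it.",
    "",
    "Sources:",
    numbered,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Numbering the sources in the same order as they are shown in the UI keeps each `[n]` in the streamed answer mapped to the right link.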
Common Mistakes
- Hallucinated citations: Enforce "refuse if uncovered" prompt + show raw sources.
- Slow crawl step: Parallel fetches, 5s timeout per URL, skip PDFs on first pass.
- Huge chunks: 500 tokens max. Bigger chunks dilute relevance.
- Stale cache: Add TTL (7 days) + "recent results" flag for time-sensitive queries.
- No abuse protection: Rate limit per IP; searches cost real money.
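Per-IP rate limiting, the last item above, could look something like this fixed-window sketch. `FixedWindowRateLimiter` is a hypothetical class, and the in-memory `Map` is an assumption that only holds for a single server process; a multi-instance deployment would need a shared store instead.

```typescript
// Fixed-window counter keyed by client IP.
// ASSUMPTION: single process, in-memory state; entries are never evicted here.
class FixedWindowRateLimiter {
  private readonly hits = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed; `now` is injectable for testing.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    // First request from this key, or the previous window has expired: start a new window.
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

// Example: at most 10 searches per IP per minute (illustrative numbers).
const searchLimiter = new FixedWindowRateLimiter(10, 60_000);
```

Calling `searchLimiter.allow(ip)` before the web search step rejects excess requests before any paid API call is made.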
Top Tools
| Tool | Best For | Price |
|---|---|---|
| Supabase + pgvector | Vector DB | Free tier |
| SerpAPI | Google results | $50+/mo |
| SearXNG | Self-hosted search | Free |
| Trafilatura | Content extraction | Free |
| Next.js | Streaming UI | Free |
Conclusion
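AI search is the defining product category of the decade. Build a vertical search engine (legal docs, research papers, your company wiki) and you have a moat, because a curated corpus is harder to copy than a general web index. Learn semantic search patterns (chunking, retrieval, citation handling) before scaling.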
