Quick Answer
Use any OpenAI-compatible API (OpenAI, Claude, Assisters) with the openai npm package. Stream responses via Server-Sent Events, store conversation history in Postgres, and add function calling for tool use.
- Streaming feels roughly 5x faster at the same total latency, because users see the first tokens immediately
- Store every message for debugging and fine-tuning
- Rate-limit per user to prevent abuse
What You'll Need
- Next.js 15+ app or any Node backend
- OpenAI-compatible API key (Assisters recommended for self-hosted)
- Postgres or Supabase for history
- Vercel AI SDK or the raw `openai` client
Steps
- Install dependencies.

      pnpm add openai ai @ai-sdk/openai

- Configure client.

      import OpenAI from 'openai';

      // Point the SDK at any OpenAI-compatible endpoint.
      const ai = new OpenAI({
        baseURL: 'https://assisters.dev/api/v1',
        apiKey: process.env.ASSISTERS_API_KEY!,
      });

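- Create streaming endpoint. In `app/api/chat/route.ts`:

      // `ai` is the client configured in the previous step.
      // Assumes the request body looks like { messages: [...] }.
      export async function POST(req: Request) {
        const { messages } = await req.json();
        const stream = await ai.chat.completions.create({
          model: 'assisters-chat-v1',
          messages,
          stream: true,
        });
        return new Response(stream.toReadableStream());
      }
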
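- Build the UI. Use Vercel AI SDK's `useChat` hook. Note that `useChat` expects the AI SDK's own stream format, so the route may need to produce its response with the SDK's `streamText` helper rather than the raw `openai` stream.
- Persist messages. On each exchange, insert into the `messages` table with a `conversation_id`.
- Add function calling. Define tools (search DB, call API). The AI decides when to invoke them.
- Moderate input and output. Call the `/moderate` endpoint before responding.
- Rate limit. Use `@upstash/ratelimit` or self-hosted Redis: 20 msg/min per user.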
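
The rate-limit step can be sketched without any external service. What follows is a minimal in-memory sliding-window limiter, assuming a single Node process. `SlidingWindowLimiter` and `chatLimiter` are hypothetical names, and the class only stands in for `@upstash/ratelimit` or Redis, which a multi-instance deployment would still need in order to share counters.

```typescript
// In-memory sliding-window rate limiter (illustrative stand-in for Redis/Upstash).
// Assumption: one server process; counters are lost on restart.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private limit: number;    // max messages per window, e.g. 20
  private windowMs: number; // window length in milliseconds, e.g. 60_000

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Returns true and records the hit if the user is under the limit. */
  allow(userId: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(userId) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(userId, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(userId, recent);
    return true;
  }
}

// 20 messages per minute per user, matching the step above.
const chatLimiter = new SlidingWindowLimiter(20, 60_000);
```

In a route handler, a `false` result from `chatLimiter.allow(userId)` would typically map to an HTTP 429 response.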
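
For the function-calling step, the Chat Completions format describes each tool with a JSON Schema and returns the model's chosen calls as `tool_calls`, whose `arguments` field is a JSON string. The sketch below shows one way to declare a tool and dispatch a call. The `search_orders` tool, its handler, and `runToolCall` are hypothetical, and the local types mirror only the fields used here.

```typescript
// Minimal local types mirroring the Chat Completions tool-call shape.
type ToolCall = {
  id: string;
  type: 'function';
  function: { name: string; arguments: string }; // arguments is a JSON string
};
type ToolMessage = { role: 'tool'; tool_call_id: string; content: string };
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown> | unknown;

// Tool definition to pass as `tools` in chat.completions.create (hypothetical tool).
const tools = [
  {
    type: 'function' as const,
    function: {
      name: 'search_orders',
      description: 'Look up orders for a customer email.',
      parameters: {
        type: 'object',
        properties: { email: { type: 'string' } },
        required: ['email'],
      },
    },
  },
];

// Hypothetical handlers; a real one would query your database or API.
const handlers: Record<string, ToolHandler> = {
  search_orders: (args) => ({ email: args.email, orders: [] }),
};

// Runs one tool call and returns the `tool` message to append to the history.
async function runToolCall(call: ToolCall, registry = handlers): Promise<ToolMessage> {
  let content: string;
  try {
    const handler = registry[call.function.name];
    if (!handler) throw new Error(`Unknown tool: ${call.function.name}`);
    // Model-produced JSON can be malformed, so parsing stays inside the try.
    const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
    content = JSON.stringify(await handler(args));
  } catch (err) {
    // Report failures back to the model instead of crashing the request.
    content = JSON.stringify({ error: (err as Error).message });
  }
  return { role: 'tool', tool_call_id: call.id, content };
}
```

Each returned message is appended after the assistant message that contained the `tool_calls`, and the conversation is then sent back to the model so it can produce the final answer.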
Common Mistakes
- Skipping moderation. A single jailbreak screenshot destroys trust.
- Infinite context. Truncate history to the last 20 messages plus a summary of older ones.
- No retry logic. Network blips kill UX. Use exponential backoff.
- Exposing API key in client. Always proxy through your server.
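
The history cap from the list above can be sketched as a pure function. This assumes a simple `{ role, content }` message shape and that the summary of older turns is produced elsewhere (for example by a separate model call). `trimHistory` and its parameters are hypothetical names.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Keeps all system messages, an optional summary of older turns,
// and the most recent `keep` user/assistant messages.
function trimHistory(
  history: ChatMessage[],
  summaryOfOlder?: string, // assumed to be generated elsewhere
  keep = 20,
): ChatMessage[] {
  const system = history.filter((m) => m.role === 'system');
  const turns = history.filter((m) => m.role !== 'system');
  if (turns.length <= keep) return [...system, ...turns];

  // slice(-0) would return everything, so handle keep <= 0 explicitly.
  const recent = keep > 0 ? turns.slice(-keep) : [];
  const summary: ChatMessage[] = summaryOfOlder
    ? [{ role: 'system', content: `Summary of earlier conversation: ${summaryOfOlder}` }]
    : [];
  return [...system, ...summary, ...recent];
}
```

Calling `trimHistory(messages, summary)` just before each model request keeps the prompt size bounded while the full history stays in the database.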
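
Retrying with exponential backoff can be sketched as a small wrapper. `withRetry` and its options are hypothetical, and the delays (a base delay doubled on each attempt, with random jitter) are illustrative defaults. The official `openai` client also accepts a `maxRetries` option for its own requests, so a wrapper like this is mainly useful around other calls such as database writes.

```typescript
type RetryOptions = {
  retries?: number;     // retry attempts after the first try
  baseDelayMs?: number; // backoff before the first retry
  maxDelayMs?: number;  // upper bound for any single backoff
};

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `fn`, retrying on any thrown error with exponentially growing delays.
async function withRetry<T>(fn: () => Promise<T>, opts: RetryOptions = {}): Promise<T> {
  const { retries = 3, baseDelayMs = 250, maxDelayMs = 4000 } = opts;
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err; // out of attempts: surface the error
      const backoff = Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
      // Jitter spreads retries out so clients do not retry in lockstep.
      await sleep(backoff / 2 + Math.random() * (backoff / 2));
    }
  }
}
```

A production version would usually retry only transient failures (timeouts, HTTP 429 or 5xx responses) rather than every error.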
Top Tools
| Tool | Use |
|---|---|
| Vercel AI SDK | Chat UI primitives |
| Assisters | OpenAI-compatible gateway |
| Supabase | History + auth |
| Langfuse | Observability |
| Upstash / Redis | Rate limiting |
Conclusion
A production chatbot is a weekend project in 2026 with OpenAI-compatible APIs and the Vercel AI SDK. Self-host the model gateway (Assisters) to control costs and data. Try Misar Dev to generate the entire scaffold from a prompt.