Chat & RAG (Retrieval-Augmented Generation)
Build intelligent conversational applications that combine LLMs with your data for context-aware, accurate responses.
ekoDB provides built-in chat session management and RAG capabilities - no separate infrastructure needed.
Quick Start
Use a client library (recommended) or call the REST API directly. The same Quick Start follows in each language:
Rust:

use ekodb_client::{Client, CreateChatSessionRequest, ChatMessageRequest, CollectionConfig};
let client = Client::builder()
.base_url("https://your-db.ekodb.net")
.api_key("your-api-key")
.build()?;
// 1. Create a chat session
let session = client.create_chat_session(CreateChatSessionRequest {
collections: vec![CollectionConfig {
collection_name: "knowledge_base".to_string(),
fields: vec![],
search_options: None,
}],
llm_provider: "openai".to_string(),
llm_model: Some("gpt-4".to_string()),
system_prompt: Some("You are a helpful assistant.".to_string()),
..Default::default()
}).await?;
// 2. Send a message
let response = client.chat_message(
&session.chat_id,
ChatMessageRequest::new("How do I optimize database queries?")
).await?;
println!("AI: {:?}", response.responses);
Python:

from ekodb_client import Client
client = Client.new(
"https://your-db.ekodb.net",
"your-api-key"
)
# 1. Create a chat session
session = await client.create_chat_session(
collections=[{
'collection_name': 'knowledge_base',
'fields': ['content', 'title']
}],
llm_provider='openai',
llm_model='gpt-4',
system_prompt='You are a helpful assistant.'
)
# 2. Send a message - automatically retrieves relevant context
response = await client.chat_message(
session['chat_id'],
'How do I optimize database queries?'
)
print(response['responses']) # AI response
print(response['context_snippets']) # Retrieved documents
TypeScript:

import { EkoDBClient } from '@ekodb/ekodb-client';
const client = new EkoDBClient({
baseURL: process.env.EKODB_URL,
apiKey: process.env.EKODB_API_KEY,
});
await client.init();
// 1. Create a chat session
const session = await client.createChatSession({
collections: [{
collection_name: 'knowledge_base',
fields: ['content', 'title']
}],
llm_provider: 'openai',
llm_model: 'gpt-4',
system_prompt: 'You are a helpful assistant that answers questions based on the provided context.',
});
// 2. Send a message - automatically retrieves relevant context
const response = await client.chatMessage(session.chat_id, {
message: 'How do I optimize database queries?',
});
console.log(response.responses); // AI response
console.log(response.context_snippets); // Retrieved documents used for context
JavaScript:

const { EkoDBClient } = require('@ekodb/ekodb-client');
const client = new EkoDBClient({
baseURL: process.env.EKODB_URL,
apiKey: process.env.EKODB_API_KEY,
});
await client.init();
// 1. Create a chat session
const session = await client.createChatSession({
collections: [{
collection_name: 'knowledge_base',
fields: ['content', 'title']
}],
llm_provider: 'openai',
llm_model: 'gpt-4',
system_prompt: 'You are a helpful assistant.',
});
// 2. Send a message
const response = await client.chatMessage(session.chat_id, {
message: 'How do I optimize database queries?',
});
console.log(response.responses);
console.log(response.context_snippets);
Kotlin:

import io.ekodb.client.EkoDBClient
import io.ekodb.client.CollectionConfig
import kotlinx.serialization.json.buildJsonObject
import kotlinx.serialization.json.put
val client = EkoDBClient.builder()
.baseUrl("https://your-db.ekodb.net")
.apiKey("your-api-key")
.build()
// 1. Create a chat session
val session = client.createChatSession(
collections = listOf(
CollectionConfig(
collectionName = "knowledge_base",
fields = listOf("content", "title")
)
),
llmProvider = "openai",
llmModel = "gpt-4",
systemPrompt = "You are a helpful assistant."
)
// 2. Send a message
val response = client.chatMessage(session.chatId, buildJsonObject {
put("message", "How do I optimize database queries?")
})
println(response["responses"])
import "github.com/ekoDB/ekodb-client-go"
client := ekodb.NewClient(
"https://your-db.ekodb.net",
"your-api-key",
)
// 1. Create a chat session
llmModel := "gpt-4"
systemPrompt := "You are a helpful assistant."
session, err := client.CreateChatSession(ekodb.CreateChatSessionRequest{
Collections: []ekodb.CollectionConfig{{
CollectionName: "knowledge_base",
Fields: []interface{}{"content", "title"},
}},
LLMProvider: "openai",
LLMModel: &llmModel,
SystemPrompt: &systemPrompt,
})
if err != nil {
log.Fatal(err)
}
// 2. Send a message
response, err := client.ChatMessage(session.ChatID, ekodb.ChatMessageRequest{
Message: "How do I optimize database queries?",
})
if err != nil {
log.Fatal(err)
}
fmt.Println(response.Responses)
Direct API (curl):

# 1. Create a chat session
SESSION=$(curl -X POST https://{EKODB_API_URL}/api/chat \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"collections": [{
"collection_name": "knowledge_base",
"fields": [{"field_name": "content"}, {"field_name": "title"}]
}],
"llm_provider": "openai",
"llm_model": "gpt-4",
"system_prompt": "You are a helpful assistant."
}' | jq -r '.id')
# 2. Send a message
curl -X POST https://{EKODB_API_URL}/api/chat/$SESSION/messages \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"message": "How do I optimize database queries?"
}'
Core Concepts
Centralized Architecture
All chat sessions and messages are stored in two database-wide collections:
- chat_configurations_{database} - Session metadata and configuration
- chat_messages_{database} - All messages from all sessions
Benefits:
- ✅ Scalable to millions of sessions
- ✅ No per-session collection management
- ✅ Easy cross-session querying (see the sketch after this list)
- ✅ Simplified data model
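Because every session's messages live in one collection, cross-session analytics is a single query. A minimal sketch, assuming a database named mydb and a generic find-style query method (illustrative only - see the Client Libraries docs for the exact query API):

// Hypothetical: pull recent user messages across ALL sessions
// from the shared chat_messages_{database} collection
const recentUserMessages = await client.find('chat_messages_mydb', {
  filter: { role: 'user' },     // assumed message-role field
  sort: { created_at: 'desc' }, // newest first
  limit: 100,
});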
Chat Sessions
A chat session represents a conversation thread:
{
id: 'session_uuid',
llm_provider: 'openai', // or 'anthropic', 'perplexity'
llm_model: 'gpt-4',
collections: [...], // Data sources to search
system_prompt: '...',
max_context_messages: 10,
created_at: '2025-01-22T...',
parent_id: null, // For branching conversations
branch_point_idx: null,
summary: null // Auto-generated summary
}
Message Flow
User Message
↓
Search Collections (Semantic + Text)
↓
Retrieve Relevant Context
↓
Build Prompt (System + Context + History + User Message)
↓
LLM Generation
↓
Store User Message + AI Response
↓
Return Response with Context
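In client code, this whole pipeline runs behind a single call. For example, with the chatMessage method from the Quick Start:

// One call runs the full pipeline: search, retrieve, prompt, generate, store
const response = await client.chatMessage(session.chat_id, {
  message: 'What indexes should I add?',
});
console.log(response.responses);        // LLM generation (stored with the user message)
console.log(response.context_snippets); // retrieved context used to build the prompt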
Creating Chat Sessions
Basic Chat Session
const session = await client.createChatSession({
llm_provider: 'openai',
llm_model: 'gpt-4',
system_prompt: 'You are a helpful assistant.',
});
RAG Chat Session
const session = await client.createChatSession({
collections: [
{
collection_name: 'documentation',
fields: [
{
field_name: 'content',
search_options: {
weight: 1.0, // Search relevance weight
language: 'english'
}
},
{
field_name: 'title',
search_options: {
weight: 0.5
}
}
]
},
{
collection_name: 'faqs',
fields: ['question', 'answer']
}
],
llm_provider: 'openai',
llm_model: 'gpt-4',
system_prompt: 'Answer questions based on the provided documentation and FAQs.',
max_context_messages: 10 // Include last 10 messages in context
});
Multi-Provider Support
// OpenAI
const openaiChat = await client.createChatSession({
llm_provider: 'openai',
llm_model: 'gpt-4-turbo',
});
// Anthropic
const claudeChat = await client.createChatSession({
llm_provider: 'anthropic',
llm_model: 'claude-3-opus-20240229',
});
// Perplexity
const perplexityChat = await client.createChatSession({
llm_provider: 'perplexity',
llm_model: 'pplx-70b-online',
});
Sending Messages
Simple Message
const response = await client.sendChatMessage(sessionId, {
message: 'What is ekoDB?',
});
console.log(response.content); // AI response
With Context
When collections are configured, ekoDB automatically:
- Searches collections for relevant context
- Ranks results by relevance
- Includes context in the LLM prompt
- Returns both response and context used
const response = await client.sendChatMessage(sessionId, {
message: 'How do vector searches work?',
});
// Response includes:
{
content: 'Vector searches work by...', // AI response
context: [ // Retrieved documents
{
collection: 'documentation',
record: { title: 'Vector Search Guide', content: '...' },
score: 0.95
}
],
message_id: 'msg_uuid',
created_at: '2025-01-22T...'
}
Message Management
List Messages
const messages = await client.getChatMessages(sessionId, {
limit: 50,
skip: 0,
sort: 'asc' // chronological order
});
Update Message
await client.updateChatMessage(sessionId, messageId, {
content: 'Updated message content',
});
Delete Message
await client.deleteChatMessage(sessionId, messageId);
Mark as Forgotten
Exclude specific messages from context window:
await client.toggleMessageForgotten(sessionId, messageId, true);
Regenerate Response
Generate a new AI response for the same user message:
const newResponse = await client.regenerateResponse(sessionId, messageId);
Advanced Features
Branching Conversations
Create alternative conversation paths from any point:
// Branch from message 5 in parent session
const branchSession = await client.createChatSession({
parent_id: parentSessionId,
branch_point_idx: 5, // Branch from 5th message
llm_provider: 'openai',
llm_model: 'gpt-4',
});
// New session starts with messages 0-5 from parent
// Can explore different conversation paths
Merging Sessions
Combine multiple conversation threads:
const mergedSession = await client.mergeChatSessions({
session_ids: [sessionId1, sessionId2],
strategy: 'chronological', // or 'interleaved'
llm_provider: 'openai',
llm_model: 'gpt-4',
});
Conversation Summarization
Automatically generate conversation summaries:
const summary = await client.summarizeChatSession(sessionId);
// Summary stored in session.summary
{
key_topics: ['database optimization', 'vector search'],
main_points: ['...'],
sentiment: 'neutral',
word_count: 156
}
Real-World Examples
Customer Support Bot
// 1. Create knowledge base
await client.batchInsert('support_articles', articles);
await client.createIndex('support_articles', ['title', 'content']);
// 2. Create support chat session
const supportSession = await client.createChatSession({
collections: [{
collection_name: 'support_articles',
fields: ['title', 'content', 'category']
}],
llm_provider: 'openai',
llm_model: 'gpt-4',
system_prompt: `You are a customer support agent.
Answer questions based on our support documentation.
Be helpful, concise, and professional.`,
});
// 3. Handle customer query
const response = await client.sendChatMessage(supportSession.id, {
message: 'How do I reset my password?',
});
// Response includes relevant support articles as context
Document Q&A
// RAG over internal documents
const docSession = await client.createChatSession({
collections: [{
collection_name: 'company_docs',
fields: [{
field_name: 'content',
search_options: { weight: 1.0 }
}]
}],
llm_provider: 'anthropic',
llm_model: 'claude-3-sonnet-20240229',
system_prompt: 'Answer questions about company policies and procedures based on the provided documents.',
});
const answer = await client.sendChatMessage(docSession.id, {
message: 'What is our vacation policy?',
});
Code Assistant
// Code documentation chatbot
const codeSession = await client.createChatSession({
collections: [
{
collection_name: 'code_docs',
fields: ['description', 'code', 'examples']
},
{
collection_name: 'api_reference',
fields: ['method', 'parameters', 'returns']
}
],
llm_provider: 'openai',
llm_model: 'gpt-4',
system_prompt: `You are a code assistant. Help developers by:
- Providing accurate code examples
- Explaining concepts clearly
- Referencing official documentation`,
max_context_messages: 15,
});
Hybrid Search Integration
Combine text search with vector similarity:
// Store embeddings with documents
await client.insert('knowledge_base', {
title: 'Vector Search Guide',
content: 'Vector search enables...',
embedding: vectorEmbedding, // From OpenAI, Cohere, etc.
});
// Chat session uses hybrid search automatically
const session = await client.createChatSession({
collections: [{
collection_name: 'knowledge_base',
fields: [
{ field_name: 'content', search_options: { weight: 0.6 } },
{ field_name: 'embedding', search_options: { weight: 0.4, type: 'vector' } }
]
}],
llm_provider: 'openai',
llm_model: 'gpt-4',
});
// Searches use both text relevance and semantic similarity
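The insert above assumes you already have vectorEmbedding. For reference, one way to produce it is OpenAI's embeddings API - a sketch assuming the official openai npm package (any embedding provider works):

import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Generate an embedding vector for the document text
const result = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'Vector search enables...',
});
const vectorEmbedding = result.data[0].embedding; // array of floats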
Performance Optimization
1. Limit Context Messages
const session = await client.createChatSession({
max_context_messages: 5, // Only include last 5 messages
// ... other config
});
2. Use Targeted Collections
// Only search relevant collections
const session = await client.createChatSession({
collections: [{
collection_name: 'recent_docs', // Smaller, focused collection
fields: ['content']
}],
// ... other config
});
3. Index Your Data
// Create indexes for faster search
await client.createIndex('knowledge_base', ['title', 'content']);
4. Use Efficient Models
// Balance cost/performance
const session = await client.createChatSession({
llm_provider: 'openai',
llm_model: 'gpt-3.5-turbo', // Faster, cheaper for simple queries
});
Best Practices
- System Prompts: Be specific about behavior and constraints
- Context Limits: Balance context quality vs token costs
- Collection Design: Structure data for efficient retrieval
- Error Handling: Handle LLM failures gracefully (see the retry sketch after this list)
- Rate Limiting: Respect provider rate limits
- Cost Monitoring: Track token usage and costs
- Caching: Cache common responses when appropriate
- Testing: Test with real user queries
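For example, a small retry wrapper covers the error-handling and rate-limiting practices above. A minimal sketch (backoff values are illustrative):

// Retry transient LLM/provider failures with exponential backoff
async function sendWithRetry(sessionId, message, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.sendChatMessage(sessionId, { message });
    } catch (err) {
      if (attempt === maxRetries - 1) throw err; // out of retries
      const delayMs = 1000 * 2 ** attempt;       // 1s, 2s, 4s...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

const response = await sendWithRetry(session.id, 'What is ekoDB?');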
API Reference
createChatSession()
client.createChatSession(options: {
collections?: CollectionConfig[],
llm_provider: 'openai' | 'anthropic' | 'perplexity',
llm_model: string,
system_prompt?: string,
max_context_messages?: number,
parent_id?: string,
branch_point_idx?: number,
}): Promise<ChatSession>
sendChatMessage()
client.sendChatMessage(
sessionId: string,
options: {
message: string,
}
): Promise<{
content: string,
context: ContextDocument[],
message_id: string,
created_at: string,
}>
getChatMessages()
client.getChatMessages(
sessionId: string,
options?: {
limit?: number,
skip?: number,
sort?: 'asc' | 'desc',
}
): Promise<Message[]>
getChatModels()
Get all available chat models organized by provider:
client.getChatModels(): Promise<Record<string, string[]>>
Response Example:
{
"openai": ["gpt-4", "gpt-4-turbo", "gpt-3.5-turbo", "gpt-4o"],
"anthropic": ["claude-3-opus-20240229", "claude-3-sonnet-20240229"],
"perplexity": ["llama-3.1-sonar-small-128k-online"]
}
getChatModel()
Get available models for a specific provider:
client.getChatModel(provider: string): Promise<string[]>
Example:
const openaiModels = await client.getChatModel('openai');
// ["gpt-4", "gpt-4-turbo", "gpt-3.5-turbo", "gpt-4o", "gpt-4o-mini"]
REST API:
# List all models by provider
GET /api/chat_models
# Get models for a specific provider
GET /api/chat_models/openai
Troubleshooting
No Context Retrieved
Problem: AI responses don't use your data
Solutions:
- Verify collections are configured correctly
- Check collection has data
- Ensure search fields exist
- Try different search weights
Token Limit Errors
Problem: Context too large for LLM
Solutions:
- Reduce max_context_messages
- Limit collection search results
- Use shorter documents
- Switch to a model with a larger context window (see the sketch after this list)
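A minimal mitigation sketch combining the first and last solutions (model choice is illustrative):

// Fewer history messages per prompt + a larger-context model
const session = await client.createChatSession({
  llm_provider: 'openai',
  llm_model: 'gpt-4-turbo', // larger context window than gpt-4
  max_context_messages: 5,  // include only the last 5 messages
  // ... other config
});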
Slow Responses
Problem: Chat responses are slow
Solutions:
- Create indexes on search fields
- Reduce number of collections searched
- Use faster LLM model
- Limit context size
Related Documentation
- Vector Search - Semantic search with embeddings
- Indexes - Optimize search performance
- Client Libraries - Full API examples
- System Administration - Monitor and manage
Chat Models API Examples:
- Rust - client_chat_models.rs
- Python - client_chat_models.py
- TypeScript - client_chat_models.ts
- Go - client_chat_models.go
- Kotlin - ClientChatModels.kt
Summary
Chat & RAG in ekoDB enables:
- ✅ Conversational AI - Natural language interactions
- ✅ Context-aware responses - Answers based on your data
- ✅ Multi-provider support - OpenAI, Anthropic, Perplexity
- ✅ Branching conversations - Explore alternative paths
- ✅ Hybrid search - Text + vector semantic matching
- ✅ Integrated - No separate infrastructure needed
- ✅ Production-ready - Scalable and reliable