Chat & RAG (Retrieval-Augmented Generation)

Build intelligent conversational applications that combine LLMs with your data for context-aware, accurate responses.

Integrated AI

ekoDB provides built-in chat session management and RAG capabilities - no separate infrastructure needed.

Quick Start

use ekodb_client::{Client, CreateChatSessionRequest, ChatMessageRequest, CollectionConfig};

let client = Client::builder()
    .base_url("https://your-db.ekodb.net")
    .api_key("your-api-key")
    .build()?;

// 1. Create a chat session
let session = client.create_chat_session(CreateChatSessionRequest {
    collections: vec![CollectionConfig {
        collection_name: "knowledge_base".to_string(),
        fields: vec![],
        search_options: None,
    }],
    llm_provider: "openai".to_string(),
    llm_model: Some("gpt-4".to_string()),
    system_prompt: Some("You are a helpful assistant.".to_string()),
    ..Default::default()
}).await?;

// 2. Send a message
let response = client.chat_message(
    &session.chat_id,
    ChatMessageRequest::new("How do I optimize database queries?")
).await?;

println!("AI: {:?}", response.responses);

Core Concepts

Centralized Architecture

All chat sessions and messages are stored in two database-wide collections:

  • chat_configurations_{database} - Session metadata and configuration
  • chat_messages_{database} - All messages from all sessions

Benefits:

  • ✅ Scalable to millions of sessions
  • ✅ No per-session collection management
  • ✅ Easy cross-session querying (see the sketch below)
  • ✅ Simplified data model
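
Because every session's messages land in one database-wide collection, cross-session analysis is an ordinary query against that collection. A minimal sketch, assuming a database named mydb (so messages live in chat_messages_mydb) and a generic client.query(collection, options) find method; the method name and filter syntax are placeholders for whatever your SDK actually exposes:

// Hypothetical cross-session query: recent user messages across ALL sessions.
// 'client.query' and the filter/sort shapes are illustrative placeholders.
const recentUserMessages = await client.query('chat_messages_mydb', {
  filter: { role: 'user' },
  sort: { created_at: 'desc' },
  limit: 100,
});

console.log(`Fetched ${recentUserMessages.length} messages across all sessions`);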

Chat Sessions

A chat session represents a conversation thread:

{
  id: 'session_uuid',
  llm_provider: 'openai', // or 'anthropic', 'perplexity'
  llm_model: 'gpt-4',
  collections: [...], // Data sources to search
  system_prompt: '...',
  max_context_messages: 10,
  created_at: '2025-01-22T...',
  parent_id: null, // For branching conversations
  branch_point_idx: null,
  summary: null // Auto-generated summary
}

Message Flow

User Message
  → Search Collections (Semantic + Text)
  → Retrieve Relevant Context
  → Build Prompt (System + Context + History + User Message)
  → LLM Generation
  → Store User Message + AI Response
  → Return Response with Context
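
From the caller's side this whole pipeline is a single request: the retrieved context comes back alongside the generated answer, so you can inspect what the model was grounded on. A minimal sketch using the sendChatMessage call documented later on this page:

const response = await client.sendChatMessage(session.id, {
  message: 'How do I optimize database queries?',
});

console.log(response.content); // the generated answer
// Documents retrieved and injected into the prompt:
for (const doc of response.context) {
  console.log(`[${doc.collection}] score=${doc.score}`);
}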

Creating Chat Sessions

Basic Chat Session

const session = await client.createChatSession({
  llm_provider: 'openai',
  llm_model: 'gpt-4',
  system_prompt: 'You are a helpful assistant.',
});

RAG Chat Session

const session = await client.createChatSession({
  collections: [
    {
      collection_name: 'documentation',
      fields: [
        {
          field_name: 'content',
          search_options: {
            weight: 1.0, // Search relevance weight
            language: 'english'
          }
        },
        {
          field_name: 'title',
          search_options: {
            weight: 0.5
          }
        }
      ]
    },
    {
      collection_name: 'faqs',
      fields: ['question', 'answer']
    }
  ],
  llm_provider: 'openai',
  llm_model: 'gpt-4',
  system_prompt: 'Answer questions based on the provided documentation and FAQs.',
  max_context_messages: 10 // Include last 10 messages in context
});

Multi-Provider Support

// OpenAI
const openaiChat = await client.createChatSession({
  llm_provider: 'openai',
  llm_model: 'gpt-4-turbo',
});

// Anthropic
const claudeChat = await client.createChatSession({
  llm_provider: 'anthropic',
  llm_model: 'claude-3-opus-20240229',
});

// Perplexity
const perplexityChat = await client.createChatSession({
  llm_provider: 'perplexity',
  llm_model: 'pplx-70b-online',
});

Sending Messages

Simple Message

const response = await client.sendChatMessage(sessionId, {
  message: 'What is ekoDB?',
});

console.log(response.content); // AI response

With Context

When collections are configured, ekoDB automatically:

  1. Searches collections for relevant context
  2. Ranks results by relevance
  3. Includes context in the LLM prompt
  4. Returns both response and context used

const response = await client.sendChatMessage(sessionId, {
  message: 'How do vector searches work?',
});

// Response includes:
{
  content: 'Vector searches work by...', // AI response
  context: [ // Retrieved documents
    {
      collection: 'documentation',
      record: { title: 'Vector Search Guide', content: '...' },
      score: 0.95
    }
  ],
  message_id: 'msg_uuid',
  created_at: '2025-01-22T...'
}

Message Management

List Messages

const messages = await client.getChatMessages(sessionId, {
  limit: 50,
  skip: 0,
  sort: 'asc' // chronological order
});

Update Message

await client.updateChatMessage(sessionId, messageId, {
  content: 'Updated message content',
});

Delete Message

await client.deleteChatMessage(sessionId, messageId);

Mark as Forgotten

Exclude specific messages from the context window:

await client.toggleMessageForgotten(sessionId, messageId, true);

Regenerate Response

Generate a new AI response for the same user message:

const newResponse = await client.regenerateResponse(sessionId, messageId);

Advanced Features

Branching Conversations

Create alternative conversation paths from any point:

// Branch from message 5 in parent session
const branchSession = await client.createChatSession({
  parent_id: parentSessionId,
  branch_point_idx: 5, // Branch from 5th message
  llm_provider: 'openai',
  llm_model: 'gpt-4',
});

// New session starts with messages 0-5 from parent
// Can explore different conversation paths

Merging Sessions

Combine multiple conversation threads:

const mergedSession = await client.mergeChatSessions({
  session_ids: [sessionId1, sessionId2],
  strategy: 'chronological', // or 'interleaved'
  llm_provider: 'openai',
  llm_model: 'gpt-4',
});

Conversation Summarization

Automatically generate conversation summaries:

const summary = await client.summarizeChatSession(sessionId);

// Summary stored in session.summary
{
  key_topics: ['database optimization', 'vector search'],
  main_points: ['...'],
  sentiment: 'neutral',
  word_count: 156
}

Real-World Examples

Customer Support Bot

// 1. Create knowledge base
await client.batchInsert('support_articles', articles);
await client.createIndex('support_articles', ['title', 'content']);

// 2. Create support chat session
const supportSession = await client.createChatSession({
  collections: [{
    collection_name: 'support_articles',
    fields: ['title', 'content', 'category']
  }],
  llm_provider: 'openai',
  llm_model: 'gpt-4',
  system_prompt: `You are a customer support agent.
Answer questions based on our support documentation.
Be helpful, concise, and professional.`,
});

// 3. Handle customer query
const response = await client.sendChatMessage(supportSession.id, {
  message: 'How do I reset my password?',
});

// Response includes relevant support articles as context

Document Q&A

// RAG over internal documents
const docSession = await client.createChatSession({
  collections: [{
    collection_name: 'company_docs',
    fields: [{
      field_name: 'content',
      search_options: { weight: 1.0 }
    }]
  }],
  llm_provider: 'anthropic',
  llm_model: 'claude-3-sonnet-20240229',
  system_prompt: 'Answer questions about company policies and procedures based on the provided documents.',
});

const answer = await client.sendChatMessage(docSession.id, {
  message: 'What is our vacation policy?',
});

Code Assistant

// Code documentation chatbot
const codeSession = await client.createChatSession({
  collections: [
    {
      collection_name: 'code_docs',
      fields: ['description', 'code', 'examples']
    },
    {
      collection_name: 'api_reference',
      fields: ['method', 'parameters', 'returns']
    }
  ],
  llm_provider: 'openai',
  llm_model: 'gpt-4',
  system_prompt: `You are a code assistant. Help developers by:
- Providing accurate code examples
- Explaining concepts clearly
- Referencing official documentation`,
  max_context_messages: 15,
});

Hybrid Search Integration

Combine text search with vector similarity:

// Store embeddings with documents
await client.insert('knowledge_base', {
  title: 'Vector Search Guide',
  content: 'Vector search enables...',
  embedding: vectorEmbedding, // From OpenAI, Cohere, etc.
});

// Chat session uses hybrid search automatically
const session = await client.createChatSession({
  collections: [{
    collection_name: 'knowledge_base',
    fields: [
      { field_name: 'content', search_options: { weight: 0.6 } },
      { field_name: 'embedding', search_options: { weight: 0.4, type: 'vector' } }
    ]
  }],
  llm_provider: 'openai',
  llm_model: 'gpt-4',
});

// Searches use both text relevance and semantic similarity

Performance Optimization

1. Limit Context Messages

const session = await client.createChatSession({
  max_context_messages: 5, // Only include last 5 messages
  // ... other config
});

2. Use Targeted Collections

// Only search relevant collections
const session = await client.createChatSession({
  collections: [{
    collection_name: 'recent_docs', // Smaller, focused collection
    fields: ['content']
  }],
  // ... other config
});

3. Index Your Data

// Create indexes for faster search
await client.createIndex('knowledge_base', ['title', 'content']);

4. Use Efficient Models

// Balance cost/performance
const session = await client.createChatSession({
  llm_provider: 'openai',
  llm_model: 'gpt-3.5-turbo', // Faster, cheaper for simple queries
});

Best Practices

  1. System Prompts: Be specific about behavior and constraints
  2. Context Limits: Balance context quality vs token costs
  3. Collection Design: Structure data for efficient retrieval
  4. Error Handling: Handle LLM failures gracefully (see the sketch after this list)
  5. Rate Limiting: Respect provider rate limits
  6. Cost Monitoring: Track token usage and costs
  7. Caching: Cache common responses when appropriate
  8. Testing: Test with real user queries
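
For point 4, a minimal retry sketch, assuming only the sendChatMessage call shown earlier on this page; the attempt count, backoff values, and loose return typing are illustrative placeholders to adapt:

// Retry transient LLM/provider failures with exponential backoff.
async function sendWithRetry(
  client: { sendChatMessage: (id: string, opts: { message: string }) => Promise<any> },
  sessionId: string,
  message: string,
  maxAttempts = 3,
) {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await client.sendChatMessage(sessionId, { message });
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts) break;
      // Back off 1s, 2s, 4s, ... before the next attempt; tune to your provider's rate limits.
      await new Promise((resolve) => setTimeout(resolve, 1000 * 2 ** (attempt - 1)));
    }
  }
  throw lastError;
}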

API Reference

createChatSession()

client.createChatSession(options: {
  collections?: CollectionConfig[],
  llm_provider: 'openai' | 'anthropic' | 'perplexity',
  llm_model: string,
  system_prompt?: string,
  max_context_messages?: number,
  parent_id?: string,
  branch_point_idx?: number,
}): Promise<ChatSession>

sendChatMessage()

client.sendChatMessage(
  sessionId: string,
  options: {
    message: string,
  }
): Promise<{
  content: string,
  context: ContextDocument[],
  message_id: string,
  created_at: string,
}>

getChatMessages()

client.getChatMessages(
  sessionId: string,
  options?: {
    limit?: number,
    skip?: number,
    sort?: 'asc' | 'desc',
  }
): Promise<Message[]>

getChatModels()

Get all available chat models organized by provider:

client.getChatModels(): Promise<Record<string, string[]>>

Response Example:

{
  "openai": ["gpt-4", "gpt-4-turbo", "gpt-3.5-turbo", "gpt-4o"],
  "anthropic": ["claude-3-opus-20240229", "claude-3-sonnet-20240229"],
  "perplexity": ["llama-3.1-sonar-small-128k-online"]
}

getChatModel()

Get available models for a specific provider:

client.getChatModel(provider: string): Promise<string[]>

Example:

const openaiModels = await client.getChatModel('openai');
// ["gpt-4", "gpt-4-turbo", "gpt-3.5-turbo", "gpt-4o", "gpt-4o-mini"]

REST API:

# List all models by provider
GET /api/chat_models

# Get models for a specific provider
GET /api/chat_models/openai

Troubleshooting

No Context Retrieved

Problem: AI responses don't use your data

Solutions:

  • Verify collections are configured correctly
  • Check that the collection has data (see the sanity check below)
  • Ensure search fields exist
  • Try different search weights
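
A quick way to rule out the first two causes is to confirm the collection returns search results at all before debugging the chat layer. A sketch assuming a hypothetical client.search(collection, options) call; substitute whatever search method your SDK actually provides:

// 'client.search' is a placeholder for your SDK's search/query call.
const hits = await client.search('knowledge_base', {
  query: 'vector search',
  fields: ['title', 'content'],
  limit: 5,
});

if (hits.length === 0) {
  console.warn('No matches: check that the collection has data and the fields exist.');
} else {
  console.log('Search works; revisit the session collection config and field weights.');
}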

Token Limit Errors

Problem: Context too large for LLM

Solutions:

  • Reduce max_context_messages
  • Limit collection search results
  • Use shorter documents
  • Switch to model with larger context window

Slow Responses

Problem: Chat responses are slow

Solutions:

  • Create indexes on search fields
  • Reduce number of collections searched
  • Use faster LLM model
  • Limit context size

Chat Models API Examples:

  • Rust - client_chat_models.rs
  • Python - client_chat_models.py
  • TypeScript - client_chat_models.ts
  • Go - client_chat_models.go
  • Kotlin - ClientChatModels.kt

Summary

Chat & RAG in ekoDB enables:

  • ✅ Conversational AI - Natural language interactions
  • ✅ Context-aware responses - Answers based on your data
  • ✅ Multi-provider support - OpenAI, Anthropic, Perplexity
  • ✅ Branching conversations - Explore alternative paths
  • ✅ Hybrid search - Text + vector semantic matching
  • ✅ Integrated - No separate infrastructure needed
  • ✅ Production-ready - Scalable and reliable