Chat & RAG (Retrieval-Augmented Generation)

Build intelligent conversational applications that combine LLMs with your data for context-aware, accurate responses.

Integrated AI

ekoDB provides built-in chat session management and RAG capabilities - no separate infrastructure needed.

Quick Start

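The examples below use the client libraries (recommended) in Rust, Python, TypeScript, JavaScript, Kotlin, and Go, followed by the direct REST API with curl.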

Rust:

use ekodb_client::{Client, CreateChatSessionRequest, ChatMessageRequest, CollectionConfig};

let client = Client::builder()
    .base_url("https://your-db.ekodb.net")
    .api_key("your-api-key")
    .build()?;

// 1. Create a chat session
let session = client.create_chat_session(CreateChatSessionRequest {
    collections: vec![CollectionConfig {
        collection_name: "knowledge_base".to_string(),
        fields: vec![],
        search_options: None,
    }],
    llm_provider: "openai".to_string(),
    llm_model: Some("gpt-4".to_string()),
    system_prompt: Some("You are a helpful assistant.".to_string()),
    ..Default::default()
}).await?;

// 2. Send a message
let response = client.chat_message(
    &session.chat_id,
    ChatMessageRequest::new("How do I optimize database queries?")
).await?;

println!("AI: {:?}", response.responses);

Python:

import asyncio
from ekodb_client import Client

async def main():
    client = Client.new(
        "https://your-db.ekodb.net",
        "your-api-key"
    )

    # 1. Create a chat session
    session = await client.create_chat_session(
        collections=[{
            'collection_name': 'knowledge_base',
            'fields': ['content', 'title']
        }],
        llm_provider='openai',
        llm_model='gpt-4',
        system_prompt='You are a helpful assistant.'
    )

    # 2. Send a message - automatically retrieves relevant context
    response = await client.chat_message(
        session['chat_id'],
        'How do I optimize database queries?'
    )

    print(response['responses'])          # AI response
    print(response['context_snippets'])   # Retrieved documents

# await is only valid inside a coroutine, so run the example through asyncio
asyncio.run(main())

TypeScript:

import { EkoDBClient } from '@ekodb/ekodb-client';

const client = new EkoDBClient({
  baseURL: process.env.EKODB_URL,
  apiKey: process.env.EKODB_API_KEY,
});
await client.init();

// 1. Create a chat session
const session = await client.createChatSession({
  collections: [{
    collection_name: 'knowledge_base',
    fields: ['content', 'title']
  }],
  llm_provider: 'openai',
  llm_model: 'gpt-4',
  system_prompt: 'You are a helpful assistant that answers questions based on the provided context.',
});

// 2. Send a message - automatically retrieves relevant context
const response = await client.chatMessage(session.chat_id, {
  message: 'How do I optimize database queries?',
});

console.log(response.responses);          // AI response
console.log(response.context_snippets);   // Retrieved documents used for context

JavaScript (CommonJS):

const { EkoDBClient } = require('@ekodb/ekodb-client');

// CommonJS modules do not support top-level await, so wrap the calls in an async function
async function main() {
  const client = new EkoDBClient({
    baseURL: process.env.EKODB_URL,
    apiKey: process.env.EKODB_API_KEY,
  });
  await client.init();

  // 1. Create a chat session
  const session = await client.createChatSession({
    collections: [{
      collection_name: 'knowledge_base',
      fields: ['content', 'title']
    }],
    llm_provider: 'openai',
    llm_model: 'gpt-4',
    system_prompt: 'You are a helpful assistant.',
  });

  // 2. Send a message
  const response = await client.chatMessage(session.chat_id, {
    message: 'How do I optimize database queries?',
  });

  console.log(response.responses);
  console.log(response.context_snippets);
}

main().catch(console.error);

Kotlin:

import io.ekodb.client.EkoDBClient
import kotlinx.serialization.json.buildJsonObject
import kotlinx.serialization.json.put

val client = EkoDBClient.builder()
    .baseUrl("https://your-db.ekodb.net")
    .apiKey("your-api-key")
    .build()

// 1. Create a chat session
val session = client.createChatSession(
    collections = listOf(
        CollectionConfig(
            collectionName = "knowledge_base",
            fields = listOf("content", "title")
        )
    ),
    llmProvider = "openai",
    llmModel = "gpt-4",
    systemPrompt = "You are a helpful assistant."
)

// 2. Send a message
val response = client.chatMessage(session.chatId, buildJsonObject {
    put("message", "How do I optimize database queries?")
})

println(response["responses"])

Go:

import (
    "fmt"
    "log"

    ekodb "github.com/ekoDB/ekodb-client-go"
)

client := ekodb.NewClient(
    "https://your-db.ekodb.net",
    "your-api-key",
)

// 1. Create a chat session
llmModel := "gpt-4"
systemPrompt := "You are a helpful assistant."
session, err := client.CreateChatSession(ekodb.CreateChatSessionRequest{
    Collections: []ekodb.CollectionConfig{{
        CollectionName: "knowledge_base",
        Fields:         []interface{}{"content", "title"},
    }},
    LLMProvider:  "openai",
    LLMModel:     &llmModel,
    SystemPrompt: &systemPrompt,
})
if err != nil {
    log.Fatalf("create chat session: %v", err)
}

// 2. Send a message
response, err := client.ChatMessage(session.ChatID, ekodb.ChatMessageRequest{
    Message: "How do I optimize database queries?",
})
if err != nil {
    log.Fatalf("send chat message: %v", err)
}

fmt.Println(response.Responses)

Direct API (curl):

# 1. Create a chat session (-s silences curl's progress output so only the JSON reaches jq)
SESSION=$(curl -s -X POST https://{EKODB_API_URL}/api/chat \
  -H "Authorization: Bearer {TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "collections": [{
      "collection_name": "knowledge_base",
      "fields": [{"field_name": "content"}, {"field_name": "title"}]
    }],
    "llm_provider": "openai",
    "llm_model": "gpt-4",
    "system_prompt": "You are a helpful assistant."
  }' | jq -r '.id')

# 2. Send a message
curl -X POST https://{EKODB_API_URL}/api/chat/$SESSION/messages \
  -H "Authorization: Bearer {TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "How do I optimize database queries?"
  }'

Core Concepts

Centralized Architecture

All chat sessions and messages are stored in two database-wide collections:

chat_configurations_{database} - Session metadata and configuration
chat_messages_{database} - All messages from all sessions

Benefits:

✅ Scalable to millions of sessions
✅ No per-session collection management
✅ Easy cross-session querying
✅ Simplified data model
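Because every session's messages live in one collection, a cross-session query is a filter over a shared field rather than a loop over per-session collections. The sketch below models that idea in memory only. The `chat_id`, `role`, and `content` field names are assumptions for illustration, not the documented storage schema.

```typescript
// In-memory stand-in for the shared chat_messages_{database} collection.
// Field names (chat_id, role, content) are hypothetical.
interface StoredChatMessage {
  chat_id: string;
  role: "user" | "assistant";
  content: string;
}

// Count messages per session in a single pass over the shared collection.
function countMessagesBySession(messages: StoredChatMessage[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const m of messages) {
    counts.set(m.chat_id, (counts.get(m.chat_id) ?? 0) + 1);
  }
  return counts;
}

// Cross-session query: every user message mentioning a term, regardless of session.
function findUserMessages(messages: StoredChatMessage[], term: string): StoredChatMessage[] {
  const needle = term.toLowerCase();
  return messages.filter(
    (m) => m.role === "user" && m.content.toLowerCase().indexOf(needle) !== -1
  );
}

const sharedMessages: StoredChatMessage[] = [
  { chat_id: "a", role: "user", content: "How do indexes work?" },
  { chat_id: "a", role: "assistant", content: "Indexes speed up lookups." },
  { chat_id: "b", role: "user", content: "Do indexes slow down writes?" },
];
```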

Chat Sessions

A chat session represents a conversation thread:

{
  id: 'session_uuid',
  llm_provider: 'openai',      // or 'anthropic', 'perplexity'
  llm_model: 'gpt-4',
  collections: [...],           // Data sources to search
  system_prompt: '...',
  max_context_messages: 10,
  created_at: '2025-01-22T...',
  parent_id: null,              // For branching conversations
  branch_point_idx: null,
  summary: null                 // Auto-generated summary
}

Message Flow

User Message
    ↓
Search Collections (Semantic + Text)
    ↓
Retrieve Relevant Context
    ↓
Build Prompt (System + Context + History + User Message)
    ↓
LLM Generation
    ↓
Store User Message + AI Response
    ↓
Return Response with Context
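The "Build Prompt" step in the flow above can be sketched as a pure function. This is a minimal illustration of the ordering (system prompt, retrieved context, history, new user message). The `PromptTurn` and `RetrievedSnippet` shapes and the way context is rendered are assumptions, not ekoDB's internal format.

```typescript
// Hypothetical shapes for illustration; not the ekoDB wire format.
interface PromptTurn {
  role: "system" | "user" | "assistant";
  content: string;
}

interface RetrievedSnippet {
  collection: string;
  text: string;
}

// Assemble the prompt in the order shown in the flow:
// system prompt, retrieved context, prior history, then the new user message.
function buildPrompt(
  systemPrompt: string,
  snippets: RetrievedSnippet[],
  history: PromptTurn[],
  userMessage: string
): PromptTurn[] {
  const turns: PromptTurn[] = [{ role: "system", content: systemPrompt }];
  if (snippets.length > 0) {
    // Number each snippet so the model can refer back to its sources.
    const contextBlock = snippets
      .map((s, i) => `[${i + 1}] (${s.collection}) ${s.text}`)
      .join("\n");
    turns.push({ role: "system", content: `Context:\n${contextBlock}` });
  }
  for (const turn of history) {
    turns.push(turn);
  }
  turns.push({ role: "user", content: userMessage });
  return turns;
}
```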

Creating Chat Sessions

Basic Chat Session

const session = await client.createChatSession({
  llm_provider: 'openai',
  llm_model: 'gpt-4',
  system_prompt: 'You are a helpful assistant.',
});

RAG Chat Session

const session = await client.createChatSession({
  collections: [
    {
      collection_name: 'documentation',
      fields: [
        {
          field_name: 'content',
          search_options: {
            weight: 1.0,        // Search relevance weight
            language: 'english'
          }
        },
        {
          field_name: 'title',
          search_options: {
            weight: 0.5
          }
        }
      ]
    },
    {
      collection_name: 'faqs',
      fields: ['question', 'answer']
    }
  ],
  llm_provider: 'openai',
  llm_model: 'gpt-4',
  system_prompt: 'Answer questions based on the provided documentation and FAQs.',
  max_context_messages: 10  // Include last 10 messages in context
});
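The example above mixes two field forms: bare field names (`'question'`) and objects with `field_name` and `search_options`. When building such configurations in application code, it can help to normalize both forms first. The helper below is a sketch; the `normalizeFields` name and the default weight of 1.0 for bare names are assumptions, not documented ekoDB behavior.

```typescript
// Search options as used in the example above; other keys may exist.
interface FieldSearchOptions {
  weight?: number;
  language?: string;
}

interface FieldConfig {
  field_name: string;
  search_options?: FieldSearchOptions;
}

type FieldSpec = string | FieldConfig;

interface NormalizedField {
  field_name: string;
  weight: number;
  language?: string;
}

// Expand bare field names to the object form.
// The default weight is an assumption made for this sketch.
function normalizeFields(fields: FieldSpec[], defaultWeight = 1.0): NormalizedField[] {
  return fields.map((f) => {
    if (typeof f === "string") {
      return { field_name: f, weight: defaultWeight };
    }
    const options = f.search_options ?? {};
    const normalized: NormalizedField = {
      field_name: f.field_name,
      weight: options.weight ?? defaultWeight,
    };
    if (options.language !== undefined) {
      normalized.language = options.language;
    }
    return normalized;
  });
}
```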

Multi-Provider Support

// OpenAI
const openaiChat = await client.createChatSession({
  llm_provider: 'openai',
  llm_model: 'gpt-4-turbo',
});

// Anthropic
const claudeChat = await client.createChatSession({
  llm_provider: 'anthropic',
  llm_model: 'claude-3-opus-20240229',
});

// Perplexity
const perplexityChat = await client.createChatSession({
  llm_provider: 'perplexity',
  llm_model: 'llama-3.1-sonar-small-128k-online',
});

Sending Messages

Simple Message

const response = await client.sendChatMessage(sessionId, {
  message: 'What is ekoDB?',
});

console.log(response.content);  // AI response

With Context

When collections are configured, ekoDB automatically:

1. Searches collections for relevant context
2. Ranks results by relevance
3. Includes context in the LLM prompt
4. Returns both response and context used

const response = await client.sendChatMessage(sessionId, {
  message: 'How do vector searches work?',
});

// Response includes:
{
  content: 'Vector searches work by...',  // AI response
  context: [                              // Retrieved documents
    {
      collection: 'documentation',
      record: { title: 'Vector Search Guide', content: '...' },
      score: 0.95
    }
  ],
  message_id: 'msg_uuid',
  created_at: '2025-01-22T...'
}
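Since the response carries the retrieved documents alongside the answer, an application can show users which sources were used. The sketch below builds a short source list from the `context` array in the shape shown above. The `formatSources` helper and the 0.5 score cutoff are illustrative assumptions.

```typescript
// Shapes copied from the response example above; `record` is left loosely typed.
interface ContextHit {
  collection: string;
  record: Record<string, unknown>;
  score: number;
}

interface ChatAnswer {
  content: string;
  context: ContextHit[];
}

// Build a "sources" list from the retrieved context,
// keeping only hits at or above a minimum score, best first.
function formatSources(answer: ChatAnswer, minScore = 0.5): string[] {
  return answer.context
    .filter((hit) => hit.score >= minScore) // filter() copies, so sort() below does not mutate the response
    .sort((a, b) => b.score - a.score)
    .map((hit) => {
      const rawTitle = hit.record["title"];
      const title = typeof rawTitle === "string" ? rawTitle : "(untitled)";
      return `${title} [${hit.collection}, score ${hit.score.toFixed(2)}]`;
    });
}
```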

Message Management

List Messages

const messages = await client.getChatMessages(sessionId, {
  limit: 50,
  skip: 0,
  sort: 'asc'  // chronological order
});

Update Message

await client.updateChatMessage(sessionId, messageId, {
  content: 'Updated message content',
});

Delete Message

await client.deleteChatMessage(sessionId, messageId);

Mark as Forgotten

Exclude specific messages from context window:

await client.toggleMessageForgotten(sessionId, messageId, true);
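To make the effect of forgetting concrete, the sketch below shows one way a context window could be selected: drop forgotten messages, then keep the most recent `max_context_messages`. It is a simplified model, not ekoDB's implementation. The `WindowMessage` shape and the choice that forgotten messages do not count toward the limit are assumptions.

```typescript
// Hypothetical message shape; `forgotten` mirrors the flag toggled above.
interface WindowMessage {
  id: string;
  content: string;
  forgotten: boolean;
}

// Pick the messages that would be sent as history: drop forgotten ones,
// then keep the most recent `maxContextMessages`, in chronological order.
function selectContextWindow(
  messages: WindowMessage[],
  maxContextMessages: number
): WindowMessage[] {
  if (maxContextMessages <= 0) return []; // slice(-0) would return everything
  const remembered = messages.filter((m) => !m.forgotten);
  return remembered.slice(-maxContextMessages);
}
```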

Regenerate Response

Generate a new AI response for the same user message:

const newResponse = await client.regenerateResponse(sessionId, messageId);

Advanced Features

Branching Conversations

Create alternative conversation paths from any point:

// Branch from the parent session at message index 5
const branchSession = await client.createChatSession({
  parent_id: parentSessionId,
  branch_point_idx: 5,        // Zero-based index of the last message to copy
  llm_provider: 'openai',
  llm_model: 'gpt-4',
});

// New session starts with messages 0-5 from parent
// Can explore different conversation paths
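The copy semantics can be sketched as a slice over the parent's history. This assumes, based on the "messages 0-5" comment above, that `branch_point_idx` is zero-based and inclusive. The `branchHistory` helper is hypothetical and exists only to illustrate that reading.

```typescript
// Copy the parent's messages up to and including branchPointIdx (zero-based).
function branchHistory<T>(parentMessages: T[], branchPointIdx: number): T[] {
  if (!Number.isInteger(branchPointIdx) || branchPointIdx < 0) {
    throw new RangeError("branchPointIdx must be a non-negative integer");
  }
  // slice() copies, so the branch can diverge without touching the parent.
  return parentMessages.slice(0, branchPointIdx + 1);
}
```

Under this reading, branching at index 5 from an eight-message parent gives the new session six messages.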

Merging Sessions

Combine multiple conversation threads:

const mergedSession = await client.mergeChatSessions({
  session_ids: [sessionId1, sessionId2],
  strategy: 'chronological',  // or 'interleaved'
  llm_provider: 'openai',
  llm_model: 'gpt-4',
});
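The two strategy names are not defined further here, so the sketch below shows one plausible reading of each: `chronological` orders all messages on a single timeline by timestamp, and `interleaved` takes one message from each session in turn. Both interpretations, and the `MergeMessage` shape, are assumptions for illustration rather than a description of ekoDB's merge logic.

```typescript
interface MergeMessage {
  session: string;
  content: string;
  created_at: string; // ISO 8601 timestamp
}

// Assumed "chronological": one timeline ordered by timestamp.
function mergeChronological(sessions: MergeMessage[][]): MergeMessage[] {
  const all = ([] as MergeMessage[]).concat(...sessions);
  return all.sort((a, b) => Date.parse(a.created_at) - Date.parse(b.created_at));
}

// Assumed "interleaved": take one message from each session in turn.
function mergeInterleaved(sessions: MergeMessage[][]): MergeMessage[] {
  const merged: MergeMessage[] = [];
  const longest = Math.max(0, ...sessions.map((s) => s.length));
  for (let i = 0; i < longest; i++) {
    for (const s of sessions) {
      const m = s[i];
      if (m !== undefined) merged.push(m);
    }
  }
  return merged;
}
```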

Conversation Summarization

Automatically generate conversation summaries:

const summary = await client.summarizeChatSession(sessionId);

// Summary stored in session.summary
{
  key_topics: ['database optimization', 'vector search'],
  main_points: ['...'],
  sentiment: 'neutral',
  word_count: 156
}

Real-World Examples

Customer Support Bot

// 1. Create knowledge base
await client.batchInsert('support_articles', articles);
await client.createIndex('support_articles', ['title', 'content']);

// 2. Create support chat session
const supportSession = await client.createChatSession({
  collections: [{
    collection_name: 'support_articles',
    fields: ['title', 'content', 'category']
  }],
  llm_provider: 'openai',
  llm_model: 'gpt-4',
  system_prompt: `You are a customer support agent. 
    Answer questions based on our support documentation.
    Be helpful, concise, and professional.`,
});

// 3. Handle customer query
const response = await client.sendChatMessage(supportSession.id, {
  message: 'How do I reset my password?',
});

// Response includes relevant support articles as context

Document Q&A

// RAG over internal documents
const docSession = await client.createChatSession({
  collections: [{
    collection_name: 'company_docs',
    fields: [{
      field_name: 'content',
      search_options: { weight: 1.0 }
    }]
  }],
  llm_provider: 'anthropic',
  llm_model: 'claude-3-sonnet-20240229',
  system_prompt: 'Answer questions about company policies and procedures based on the provided documents.',
});

const answer = await client.sendChatMessage(docSession.id, {
  message: 'What is our vacation policy?',
});

Code Assistant

// Code documentation chatbot
const codeSession = await client.createChatSession({
  collections: [
    {
      collection_name: 'code_docs',
      fields: ['description', 'code', 'examples']
    },
    {
      collection_name: 'api_reference',
      fields: ['method', 'parameters', 'returns']
    }
  ],
  llm_provider: 'openai',
  llm_model: 'gpt-4',
  system_prompt: `You are a code assistant. Help developers by:
    - Providing accurate code examples
    - Explaining concepts clearly
    - Referencing official documentation`,
  max_context_messages: 15,
});

Hybrid Search Integration

Combine text search with vector similarity:

// Store embeddings with documents
await client.insert('knowledge_base', {
  title: 'Vector Search Guide',
  content: 'Vector search enables...',
  embedding: vectorEmbedding,  // From OpenAI, Cohere, etc.
});

// Chat session uses hybrid search automatically
const session = await client.createChatSession({
  collections: [{
    collection_name: 'knowledge_base',
    fields: [
      { field_name: 'content', search_options: { weight: 0.6 } },
      { field_name: 'embedding', search_options: { weight: 0.4, type: 'vector' } }
    ]
  }],
  llm_provider: 'openai',
  llm_model: 'gpt-4',
});

// Searches use both text relevance and semantic similarity
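As a rough intuition for the weights above, a hybrid score can be modeled as a weighted sum of a text relevance score and a vector similarity score. The sketch below uses cosine similarity for the vector part and assumes both scores are already normalized to a comparable range. How ekoDB actually combines the two is not specified here, so treat `hybridScore` as an illustrative model only.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Weighted sum of text and vector scores; defaults mirror the 0.6 / 0.4 split above.
function hybridScore(
  textScore: number,
  vectorScore: number,
  textWeight = 0.6,
  vectorWeight = 0.4
): number {
  return textWeight * textScore + vectorWeight * vectorScore;
}
```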

Performance Optimization

1. Limit Context Messages

const session = await client.createChatSession({
  max_context_messages: 5,  // Only include last 5 messages
  // ... other config
});

2. Use Targeted Collections

// Only search relevant collections
const session = await client.createChatSession({
  collections: [{
    collection_name: 'recent_docs',  // Smaller, focused collection
    fields: ['content']
  }],
  // ... other config
});

3. Index Your Data

// Create indexes for faster search
await client.createIndex('knowledge_base', ['title', 'content']);

4. Use Efficient Models

// Balance cost/performance
const session = await client.createChatSession({
  llm_provider: 'openai',
  llm_model: 'gpt-3.5-turbo',  // Faster, cheaper for simple queries
});

Best Practices

System Prompts: Be specific about behavior and constraints
Context Limits: Balance context quality vs token costs
Collection Design: Structure data for efficient retrieval
Error Handling: Handle LLM failures gracefully
Rate Limiting: Respect provider rate limits
Cost Monitoring: Track token usage and costs
Caching: Cache common responses when appropriate
Testing: Test with real user queries
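For the caching practice, a small in-memory cache with a time-to-live is often enough to avoid repeated LLM calls for identical questions. The `AnswerCache` class below is a hypothetical application-side helper, not part of the ekoDB client. It only suits questions whose answers do not depend on conversation history.

```typescript
// Minimal in-memory cache for repeated questions. Keys are normalized so that
// "What is ekoDB?" and "  what is   EKODB? " share an entry.
class AnswerCache {
  private entries = new Map<string, { answer: string; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  private static normalize(question: string): string {
    return question.trim().toLowerCase().replace(/\s+/g, " ");
  }

  get(question: string): string | undefined {
    const key = AnswerCache.normalize(question);
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: evict and report a miss
      return undefined;
    }
    return entry.answer;
  }

  set(question: string, answer: string): void {
    this.entries.set(AnswerCache.normalize(question), {
      answer,
      expiresAt: this.now() + this.ttlMs,
    });
  }
}
```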

API Reference

createChatSession()

client.createChatSession(options: {
  collections?: CollectionConfig[],
  llm_provider: 'openai' | 'anthropic' | 'perplexity',
  llm_model: string,
  system_prompt?: string,
  max_context_messages?: number,
  parent_id?: string,
  branch_point_idx?: number,
}): Promise<ChatSession>

sendChatMessage()

client.sendChatMessage(
  sessionId: string,
  options: {
    message: string,
  }
): Promise<{
  content: string,
  context: ContextDocument[],
  message_id: string,
  created_at: string,
}>

getChatMessages()

client.getChatMessages(
  sessionId: string,
  options?: {
    limit?: number,
    skip?: number,
    sort?: 'asc' | 'desc',
  }
): Promise<Message[]>

getChatModels()

Get all available chat models organized by provider:

client.getChatModels(): Promise<Record<string, string[]>>

Response Example:

{
  "openai": ["gpt-4", "gpt-4-turbo", "gpt-3.5-turbo", "gpt-4o"],
  "anthropic": ["claude-3-opus-20240229", "claude-3-sonnet-20240229"],
  "perplexity": ["llama-3.1-sonar-small-128k-online"]
}

getChatModel()

Get available models for a specific provider:

client.getChatModel(provider: string): Promise<string[]>

Example:

const openaiModels = await client.getChatModel('openai');
// ["gpt-4", "gpt-4-turbo", "gpt-3.5-turbo", "gpt-4o", "gpt-4o-mini"]

REST API:

# List all models by provider
GET /api/chat_models

# Get models for a specific provider
GET /api/chat_models/openai
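The provider-to-models map returned by `getChatModels()` can be used to validate a configuration before creating a session. The helpers below are a sketch over the `Record<string, string[]>` shape shown above; `isModelAvailable` and `pickModel` are hypothetical names, not client methods.

```typescript
type ModelsByProvider = Record<string, string[]>;

// True when the provider exists in the map and lists the model.
function isModelAvailable(models: ModelsByProvider, provider: string, model: string): boolean {
  const available = models[provider];
  return available !== undefined && available.indexOf(model) !== -1;
}

// Pick the first preferred model the provider offers, or undefined if none match.
function pickModel(
  models: ModelsByProvider,
  provider: string,
  preferred: string[]
): string | undefined {
  const available = models[provider] ?? [];
  return preferred.find((m) => available.indexOf(m) !== -1);
}
```

Passing a preference list ordered from most to least capable lets an application fall back gracefully when a model is not offered.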

Troubleshooting

No Context Retrieved

Problem: AI responses don't use your data

Solutions:

Verify that collections are configured correctly
Check that the collection has data
Ensure the search fields exist in your records
Try different search weights
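A quick way to check that search fields exist is to compare the configured field names against a sample record from the collection. The `missingSearchFields` helper below is a hypothetical diagnostic, shown as a sketch. How you fetch the sample record depends on your client.

```typescript
// Return the configured search fields that are absent or empty in a sample record.
function missingSearchFields(record: Record<string, unknown>, fields: string[]): string[] {
  return fields.filter((name) => {
    const value = record[name];
    return value === undefined || value === null || value === "";
  });
}
```

A non-empty result points to a field name typo in the session configuration or to records that lack searchable text.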

Token Limit Errors

Problem: Context too large for LLM

Solutions:

Reduce max_context_messages
Limit collection search results
Use shorter documents
Switch to a model with a larger context window
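If you assemble or post-process context in application code, a simple budget check can keep it under a model's limit. The sketch below uses a rough estimate of about four characters per token, which is a heuristic for English text and not a tokenizer. `estimateTokens` and `fitToBudget` are hypothetical helpers.

```typescript
// Rough token estimate: about 4 characters per token for English text.
// This is a heuristic; use the provider's tokenizer for exact counts.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface ScoredSnippet {
  text: string;
  score: number;
}

// Greedily keep the highest-scoring snippets that fit within the token budget.
function fitToBudget(snippets: ScoredSnippet[], maxTokens: number): ScoredSnippet[] {
  const kept: ScoredSnippet[] = [];
  let used = 0;
  const byScore = snippets.slice().sort((a, b) => b.score - a.score);
  for (const s of byScore) {
    const cost = estimateTokens(s.text);
    if (used + cost > maxTokens) continue; // skip snippets that would overflow
    kept.push(s);
    used += cost;
  }
  return kept;
}
```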

Slow Responses

Problem: Chat responses are slow

Solutions:

Create indexes on search fields
Reduce the number of collections searched
Use a faster LLM model
Limit context size

Related Documentation

Vector Search - Semantic search with embeddings
Indexes - Optimize search performance
Client Libraries - Full API examples
System Administration - Monitor and manage

Chat Models API Examples:

Rust - client_chat_models.rs
Python - client_chat_models.py
TypeScript - client_chat_models.ts
Go - client_chat_models.go
Kotlin - ClientChatModels.kt

Summary

Chat & RAG in ekoDB enables:

✅ Conversational AI - Natural language interactions
✅ Context-aware responses - Answers based on your data
✅ Multi-provider support - OpenAI, Anthropic, Perplexity
✅ Branching conversations - Explore alternative paths
✅ Hybrid search - Text + vector semantic matching
✅ Integrated - No separate infrastructure needed
✅ Production-ready - Scalable and reliable
