Vector Search & Embeddings
Build intelligent search and recommendation systems with ekoDB's integrated vector search capabilities.
Vector search is built directly into ekoDB, so no separate deployment is needed, and its performance is competitive with specialized vector databases.
Quick Start
Client Libraries (Recommended)
Rust
use ekodb_client::Client;
// 1. Store vectors with your data
let mut record = Record::new();
record.insert("name", "Ergonomic Chair");
record.insert("description", "Comfortable office chair...");
record.insert("embedding", vec![0.12, 0.34, 0.56]); // 1536-dim vector
client.insert("products", record).await?;
// 2. Search by similarity
let results = client.vector_search(
"products",
query_embedding,
10,
Some(VectorSearchOptions {
metric: SimilarityMetric::Cosine,
..Default::default()
})
).await?;
Python
from ekodb_client import Client
client = Client.new("https://your-db.ekodb.net", "your-api-key")
# 1. Store vectors with your data
client.insert('products', {
'name': 'Ergonomic Chair',
'description': 'Comfortable office chair...',
'embedding': [0.12, 0.34, 0.56] # 1536-dim vector from OpenAI
})
# 2. Search by similarity
results = client.vector_search(
'products',
query_embedding,
k=10,
metric='Cosine'
)
TypeScript
import { EkoDBClient } from '@ekodb/ekodb-client';
const client = new EkoDBClient({
baseURL: 'https://your-db.ekodb.net',
apiKey: 'your-api-key'
});
await client.init();
// 1. Store vectors with your data
await client.insert('products', {
name: 'Ergonomic Chair',
description: 'Comfortable office chair...',
embedding: [0.12, 0.34, 0.56, ...], // 1536-dim vector from OpenAI
});
// 2. Search by similarity
const results = await client.vectorSearch(
'products',
queryEmbedding, // Your query vector
10, // Top 10 results
{ metric: 'Cosine' }
);
JavaScript
const { EkoDBClient } = require('@ekodb/ekodb-client');
const client = new EkoDBClient({
baseURL: 'https://your-db.ekodb.net',
apiKey: 'your-api-key'
});
// CommonJS modules have no top-level await, so run the calls inside an async function
async function main() {
await client.init();
// 1. Store vectors with your data
await client.insert('products', {
name: 'Ergonomic Chair',
description: 'Comfortable office chair...',
embedding: [0.12, 0.34, 0.56], // 1536-dim vector from OpenAI
});
// 2. Search by similarity
const results = await client.vectorSearch(
'products',
queryEmbedding,
10,
{ metric: 'Cosine' }
);
return results;
}
main().catch(console.error);
Kotlin
import io.ekodb.client.EkoDBClient
val client = EkoDBClient.builder()
.baseUrl("https://your-db.ekodb.net")
.apiKey("your-api-key")
.build()
// 1. Store vectors with your data
client.insert("products", mapOf(
"name" to "Ergonomic Chair",
"description" to "Comfortable office chair...",
"embedding" to listOf(0.12, 0.34, 0.56) // 1536-dim vector
))
// 2. Search by similarity
val results = client.vectorSearch(
collection = "products",
queryVector = queryEmbedding,
k = 10,
metric = "Cosine"
)
Go
import ekodb "github.com/ekoDB/ekodb-client-go" // explicit alias matching the ekodb. prefix used below
client := ekodb.NewClient(
"https://your-db.ekodb.net",
"your-api-key",
)
// 1. Store vectors with your data
client.Insert("products", map[string]interface{}{
"name": "Ergonomic Chair",
"description": "Comfortable office chair...",
"embedding": []float64{0.12, 0.34, 0.56}, // 1536-dim vector
})
// 2. Search by similarity
results, err := client.VectorSearch(
"products",
queryEmbedding,
10,
ekodb.VectorSearchOptions{Metric: "Cosine"},
)
Direct API
# 1. Store vectors with your data
curl -X POST https://{EKODB_API_URL}/api/insert/products \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": {"type": "String", "value": "Ergonomic Chair"},
"description": {"type": "String", "value": "Comfortable office chair..."},
"embedding": {"type": "Vector", "value": [0.12, 0.34, 0.56, ...]}
}'
# 2. Search by similarity
curl -X POST https://{EKODB_API_URL}/api/search/products \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"query_type": "vector",
"vector": [0.12, 0.34, 0.56, ...],
"limit": 10,
"metric": "Cosine"
}'
Performance
| Dataset Size | Average Latency | Throughput |
|---|---|---|
| 1K vectors | 51 ms | 379 RPS |
| 10K vectors | 322 ms | 56 RPS |
Hybrid search combines text and vector scoring with configurable weights (see Hybrid Search).
Competitive Performance:
- ✅ 6x faster than PostgreSQL pgvector
- ✅ Competitive with Milvus, Elasticsearch
- ✅ Integrated - no separate deployment
Vector Types
ekoDB supports vector fields for storing embeddings from AI models:
// Standard vector field
const record = {
title: 'Database Performance Tips',
content: 'This article discusses...',
embedding: {
type: 'Vector',
value: [0.12, 0.34, 0.56, ...], // Array of numbers
metadata: {
model: 'text-embedding-ada-002',
dimensions: 1536
}
}
};
Schema Definition
Define vector fields in your schema with dimension validation:
const schema = {
title: { field_type: 'String', required: true },
content: { field_type: 'String', required: true },
embedding: {
field_type: 'Vector',
dimensions: 1536, // Enforce dimensionality
required: true,
// Optional: Vector index configuration
index: {
algorithm: 'Flat', // Exact search
metric: 'Cosine', // Similarity metric
}
}
};
await client.createSchema('articles', schema);
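The schema enforces dimensionality on the server side. A client-side check can catch the same mistake before a request is sent. The sketch below is illustrative only: `assertDimensions` is a hypothetical helper, not part of the ekoDB client.

```typescript
// Hypothetical helper (not part of the ekoDB client): fail fast when an
// embedding does not match the dimensions declared in the schema.
function assertDimensions(embedding: number[], expected: number): void {
  if (embedding.length !== expected) {
    throw new Error(
      `Expected a ${expected}-dimensional vector, got ${embedding.length}`
    );
  }
  // Reject NaN and Infinity, which cannot be compared meaningfully.
  if (!embedding.every((x) => Number.isFinite(x))) {
    throw new Error('Embedding contains a non-finite value');
  }
}

// Usage: call before inserting into a collection declared with dimensions: 1536.
// assertDimensions(embedding, 1536);
```

Checking on the client keeps a model mismatch (for example, a 384-dimensional local model writing into a 1536-dimensional field) from surfacing only as a failed insert.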
Distance Metrics
Choose the right metric for your use case:
Cosine Similarity (Recommended)
Measures the cosine of the angle between vectors, ignoring their magnitude. Range: [-1, 1]. Higher = more similar.
const results = await client.vectorSearch(
'articles',
queryVector,
10,
{ metric: 'Cosine' }
);
Best for:
- ✅ Text embeddings and semantic search
- ✅ When vector magnitude should be ignored
- ✅ Most AI/ML embeddings (OpenAI, Cohere, etc.)
Euclidean Distance
Measures straight-line (L2) distance. Lower = more similar.
const results = await client.vectorSearch(
'locations',
queryVector,
10,
{ metric: 'Euclidean' }
);
Best for:
- ✅ Spatial data and coordinates
- ✅ When both magnitude and direction matter
- ✅ Physical measurements
Dot Product
Calculates inner product. Higher = more similar.
const results = await client.vectorSearch(
'recommendations',
queryVector,
10,
{ metric: 'DotProduct' }
);
Best for:
- ✅ When vectors are pre-normalized (the dot product then equals cosine similarity)
- ✅ Certain recommendation systems
- ✅ When magnitude contains meaningful information
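To make the three metrics concrete, here is a minimal sketch of the standard formulas in plain TypeScript. It illustrates the math only and says nothing about how ekoDB computes scores internally; the function names are hypothetical.

```typescript
// Dot product: sum of element-wise products. Higher = more similar.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have equal length');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Euclidean (L2) distance: straight-line distance. Lower = more similar.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have equal length');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

// Cosine similarity: dot product divided by the product of the magnitudes.
// It is undefined for a zero vector; returning 0 is a convention chosen here.
function cosineSimilarity(a: number[], b: number[]): number {
  const normA = Math.sqrt(dotProduct(a, a));
  const normB = Math.sqrt(dotProduct(b, b));
  if (normA === 0 || normB === 0) return 0;
  return dotProduct(a, b) / (normA * normB);
}
```

Note that `[1, 2]` and `[2, 4]` point in the same direction, so their cosine similarity is 1 even though their Euclidean distance is not 0. This is what "magnitude is ignored" means in practice.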
Search Methods
ekoDB's vector search finds the true nearest neighbors for your query, with performance optimized by vector indexes when defined in your schema.
const results = await client.vectorSearch(
'products',
queryVector,
10,
{
metric: 'Cosine',
threshold: 0.7, // Minimum similarity score
filters: { category: 'electronics' }, // Metadata filtering
select_fields: ['title', 'price'], // Field projection
}
);
Performance:
- Low-latency similarity search across collections
- Accurate results guaranteed
- Further optimized with vector indexes (when defined in schema)
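Conceptually, an exact search scores every candidate and keeps the best `k`. The sketch below shows that idea in memory, with a minimum-score threshold and a metadata filter applied as in the options above. It is not ekoDB's implementation; `exactSearch`, `Doc`, and `Hit` are hypothetical names used for illustration.

```typescript
interface Doc {
  id: string;
  embedding: number[];
  [field: string]: unknown; // arbitrary metadata fields
}

interface Hit {
  id: string;
  score: number;
}

// Cosine similarity; returns 0 for zero vectors (a convention chosen here).
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Brute-force top-k: filter by metadata, score everything, drop low scores,
// sort by descending score, and keep the first k hits.
function exactSearch(
  docs: Doc[],
  query: number[],
  k: number,
  threshold = -1,
  filter: (doc: Doc) => boolean = () => true
): Hit[] {
  return docs
    .filter(filter)
    .map((doc) => ({ id: doc.id, score: cosine(query, doc.embedding) }))
    .filter((hit) => hit.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Applying the metadata filter before scoring is what makes filters a performance tool as well as a relevance tool: fewer candidates need a similarity computation.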
Hybrid Search
Combine text search with vector similarity for powerful hybrid queries:
// Hybrid: Text + Vector search
const results = await client.hybridSearch(
'articles',
{
text_query: 'database performance',
vector: queryEmbedding,
text_weight: 0.3, // 30% text relevance
vector_weight: 0.7, // 70% semantic similarity
limit: 10
}
);
Use cases:
- Semantic search with keyword filtering
- Recommendations with category constraints
- RAG systems with both keyword and semantic matching
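The weights can be read as a linear blend of two relevance scores. The sketch below assumes that both scores are already normalized to [0, 1] and that they are combined as a weighted sum. The source does not state ekoDB's exact scoring formula, so treat this as an illustration of the idea; `hybridRank` and `ScoredDoc` are hypothetical names.

```typescript
interface ScoredDoc {
  id: string;
  textScore: number;   // keyword relevance, assumed normalized to [0, 1]
  vectorScore: number; // semantic similarity, assumed normalized to [0, 1]
}

// Rank documents by a weighted sum of text and vector scores (an assumption,
// not a documented ekoDB formula), highest combined score first.
function hybridRank(
  docs: ScoredDoc[],
  textWeight: number,
  vectorWeight: number,
  limit: number
): { id: string; score: number }[] {
  return docs
    .map((d) => ({
      id: d.id,
      score: textWeight * d.textScore + vectorWeight * d.vectorScore,
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Under this assumption, a document with an exact keyword match but weak semantic similarity ranks below a semantically close document at weights 0.3/0.7, and above it when the weights are swapped.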
Real-World Examples
Semantic Search
Client Libraries (Recommended)
import OpenAI from 'openai';
// Generate embeddings with OpenAI; `client` is the initialized EkoDBClient from Quick Start
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async function semanticSearch(query: string) {
// 1. Get query embedding
const response = await openai.embeddings.create({
model: 'text-embedding-ada-002',
input: query,
});
const queryVector = response.data[0].embedding;
// 2. Search by similarity
const results = await client.vectorSearch(
'articles',
queryVector,
5,
{ metric: 'Cosine' }
);
return results;
}
// Example: finds articles relevant to "How to optimize database queries"
const articles = await semanticSearch('database optimization tips');
Direct API
# 1. Get embedding from OpenAI (or your embedding service)
QUERY_VECTOR=$(curl https://api.openai.com/v1/embeddings \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "text-embedding-ada-002", "input": "database optimization tips"}' \
| jq '.data[0].embedding')
# 2. Search ekoDB
curl -X POST https://{EKODB_API_URL}/api/search/articles \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d "{
\"query_type\": \"vector\",
\"vector\": $QUERY_VECTOR,
\"limit\": 5,
\"metric\": \"Cosine\"
}"
Product Recommendations
async function getSimilarProducts(productId: string) {
// 1. Get product's embedding
const product = await client.findById('products', productId);
const embedding = product.embedding;
// 2. Find similar products (request one extra, because the product matches itself)
const similar = await client.vectorSearch(
'products',
embedding,
11,
{ metric: 'Cosine' }
);
// Filter out the original product and keep the top 10
return similar.filter(p => p.id !== productId).slice(0, 10);
}
Image Similarity
// Store image embeddings from CLIP or similar model
async function findSimilarImages(imageEmbedding: number[], category?: string) {
const results = await client.vectorSearch(
'images',
imageEmbedding,
20,
{
metric: 'Cosine',
threshold: 0.7,
// Use metadata filters to reduce search space
filters: category ? { category } : undefined
}
);
return results.map(r => ({
url: r.url,
similarity: r.score,
metadata: r.metadata
}));
}
Deletion & Index Maintenance
Automatic Deletion from Search
When you delete a record that contains a vector field, ekoDB immediately removes it from vector search results. There's no need to manually update the index — deleted records won't appear in searches.
// Insert a record with a vector
await client.insert('products', {
name: 'Discontinued Widget',
embedding: [0.12, 0.34, 0.56, ...]
});
// Delete it — immediately excluded from vector search results
await client.delete('products', recordId);
Reindexing
Over time, frequent deletions can degrade search performance. ekoDB tracks the deletion ratio and recommends a reindex when more than 20% of vectors have been deleted. Reindexing rebuilds the search graph from scratch, restoring optimal performance.
# Reindex a collection's vector index
curl -X POST https://{EKODB_API_URL}/api/indexes/search/{collection}/reindex \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{}'
# Optionally specify which vector field to reindex
curl -X POST https://{EKODB_API_URL}/api/indexes/search/{collection}/reindex \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{"field": "embedding"}'
Response:
{
"status": "ok",
"collection": "products",
"field": "embedding",
"vectors_reindexed": 4850,
"duration_ms": 127.5
}
Reindexing is only needed after heavy deletion workloads. For most applications with occasional deletes, the index stays performant without manual intervention.
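The 20% guideline can be expressed as a simple ratio check. The sketch below assumes the ratio is deleted vectors divided by all vectors ever indexed (live plus deleted); how ekoDB counts this internally is not stated here, and `deletionRatio` and `shouldReindex` are hypothetical helpers.

```typescript
// Fraction of indexed vectors that have been deleted.
// Assumption: the denominator is live + deleted vectors.
function deletionRatio(liveVectors: number, deletedVectors: number): number {
  const total = liveVectors + deletedVectors;
  return total === 0 ? 0 : deletedVectors / total;
}

// True when the deletion ratio exceeds the threshold (20% by default,
// matching the recommendation described above).
function shouldReindex(
  liveVectors: number,
  deletedVectors: number,
  threshold = 0.2
): boolean {
  return deletionRatio(liveVectors, deletedVectors) > threshold;
}
```

A scheduled job could evaluate such a check and call the reindex endpoint only when it returns true, rather than reindexing on a fixed timer.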
Search Tuning
ef_search (Beam Width)
The ef_search parameter controls the search beam width — higher values explore more of the graph, improving accuracy at the cost of latency. ekoDB resolves ef_search with a 3-tier fallback:
- Per-query override: pass ef_search in the search request
- Collection-level config: set in the vector index configuration
- Heuristic default: max(k * 2, 64)
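The fallback order above can be sketched as a small function. This is an illustration of the described resolution logic, not ekoDB source code; `resolveEfSearch` is a hypothetical name.

```typescript
// Resolve ef_search using the 3-tier fallback:
// per-query override, then collection-level config, then max(k * 2, 64).
function resolveEfSearch(
  k: number,
  perQuery?: number,
  collectionLevel?: number
): number {
  if (perQuery !== undefined) return perQuery;
  if (collectionLevel !== undefined) return collectionLevel;
  return Math.max(k * 2, 64);
}
```

With no overrides, a top-10 query resolves to 64 and a top-100 query resolves to 200, so the heuristic only grows the beam once `k` exceeds 32.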
// Per-query override for a high-precision search
const results = await client.vectorSearch(
'articles',
queryVector,
10,
{
metric: 'Cosine',
ef_search: 200 // Higher = more accurate, slower
}
);
# Direct API with ef_search override
curl -X POST https://{EKODB_API_URL}/api/search/{collection} \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"query_type": "vector",
"vector": [0.12, 0.34, 0.56, ...],
"limit": 10,
"metric": "Cosine",
"ef_search": 200
}'
| ef_search | Accuracy | Latency | Use Case |
|---|---|---|---|
| 32-64 | Good | Low | Real-time recommendations |
| 64-128 | High | Medium | Semantic search (default range) |
| 200+ | Very High | Higher | Precision-critical applications |
Performance Optimization
1. Use Vector Indexes
Define vector indexes in your schema for automatic indexing:
// Define index in schema for automatic indexing
const schema = {
embedding: {
field_type: 'Vector',
dimensions: 768,
index: {
algorithm: 'Flat', // Exact search
metric: 'Cosine',
}
}
};
2. Use Metadata Filters
Reduce search space with metadata filters:
const results = await client.vectorSearch(
'products',
queryVector,
10,
{
metric: 'Cosine',
filters: {
category: 'electronics',
in_stock: true
}
}
);
3. Set Similarity Threshold
Filter out low-relevance results:
const results = await client.vectorSearch(
'articles',
queryVector,
10,
{
metric: 'Cosine',
threshold: 0.75 // Minimum similarity score; lower-scoring results are dropped
}
);
4. Batch Insert Vectors
// More efficient than individual inserts
await client.batchInsert('articles', articlesWithEmbeddings);
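For large imports, it can help to split the records into fixed-size batches so that no single request grows unbounded. The sketch below shows a generic chunking helper; `chunk` is a hypothetical utility, and the batch size of 500 in the usage comment is an arbitrary assumption, not a documented ekoDB limit.

```typescript
// Split an array into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error('size must be a positive integer');
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Usage with the batchInsert call shown above:
// for (const batch of chunk(articlesWithEmbeddings, 500)) {
//   await client.batchInsert('articles', batch);
// }
```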
5. Use Field Projection
Return only needed fields to reduce data transfer:
const results = await client.vectorSearch(
'products',
queryVector,
10,
{
metric: 'Cosine',
select_fields: ['title', 'price', 'image_url'] // Only these fields
}
);
Embedding Models
ekoDB works with any embedding model. Popular choices:
OpenAI
import OpenAI from 'openai';
const openai = new OpenAI();
const response = await openai.embeddings.create({
model: 'text-embedding-ada-002', // 1536 dimensions
input: 'Your text here',
});
const embedding = response.data[0].embedding;
Cohere
import { CohereClient } from 'cohere-ai';
const cohere = new CohereClient({ token: process.env.COHERE_API_KEY });
const response = await cohere.embed({
texts: ['Your text here'],
model: 'embed-english-v3.0', // 1024 dimensions
inputType: 'search_document', // v3 models require an input type; use 'search_query' for queries
});
const embedding = response.embeddings[0];
Local Models (Transformers.js)
import { pipeline } from '@xenova/transformers';
const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const output = await extractor('Your text here', { pooling: 'mean', normalize: true });
const embedding = Array.from(output.data); // 384 dimensions
API Reference
vectorSearch()
client.vectorSearch(
collection: string,
vector: number[],
limit: number,
options?: {
metric?: 'Cosine' | 'Euclidean' | 'DotProduct',
threshold?: number, // Minimum similarity score (0.0-1.0)
vector_field?: string, // Default: 'embedding'
filters?: Record<string, any>, // Metadata filters
normalize?: boolean, // Auto-normalize vectors (default: true)
bypass_cache?: boolean,
select_fields?: string[], // Field projection
exclude_fields?: string[],
}
): Promise<SearchResult[]>
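The `normalize` option is described only as auto-normalizing vectors. Assuming this refers to the usual L2 (unit-length) normalization, the sketch below shows what that operation does. `l2Normalize` is a hypothetical helper, not an ekoDB client function.

```typescript
// Scale a vector to unit length (L2 norm of 1).
// A zero vector has no direction, so it is returned unchanged (as a copy).
function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) return v.slice();
  return v.map((x) => x / norm);
}
```

After L2 normalization, the dot product of two vectors equals their cosine similarity, which is why normalized embeddings can be compared with either metric.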
hybridSearch()
client.hybridSearch(
collection: string,
options: {
text_query: string,
vector: number[],
text_weight: number,
vector_weight: number,
limit: number,
metric?: 'Cosine' | 'Euclidean' | 'DotProduct'
}
): Promise<SearchResult[]>
Best Practices
- Match Dimensions: Ensure vector dimensions match your embedding model
- Use Schemas: Define dimensions in schema for validation
- Choose Right Metric: Cosine for most AI embeddings (OpenAI, Cohere, etc.)
- Batch Operations: Use batchInsert for multiple vectors
- Set Thresholds: Filter low-relevance results with the threshold parameter
- Use Metadata Filters: Reduce search space when possible
- Field Projection: Only return fields you need
- Monitor Performance: Track query latency and optimize with vector indexes as needed
Related Documentation
- Indexes - Create indexes for optimal performance
- Query Expressions - Filter syntax
- Client Libraries - Full API examples
- System Administration - Monitor performance
Summary
Vector search in ekoDB enables:
- ✅ Semantic search - Find by meaning, not just keywords
- ✅ Recommendations - Product, content, and user similarity
- ✅ Image search - Visual similarity matching
- ✅ RAG systems - Retrieval-augmented generation
- ✅ Integrated - No separate vector database needed
- ✅ Production-ready - Competitive performance and reliability