Vector Search & Embeddings
Build intelligent search and recommendation systems with ekoDB's integrated vector search capabilities.
Vector search is built directly into ekoDB, so no separate deployment is needed. Performance is competitive with specialized vector databases (see Performance).
Quick Start
- Client Libraries (Recommended)
- Direct API
- 🦀 Rust
- 🐍 Python
- 📘 TypeScript
- 📦 JavaScript
- 🟣 Kotlin
- 🔷 Go
use ekodb_client::Client;
// 1. Store vectors with your data
let mut record = Record::new();
record.insert("name", "Ergonomic Chair");
record.insert("description", "Comfortable office chair...");
record.insert("embedding", vec![0.12, 0.34, 0.56]); // truncated; real embeddings are e.g. 1536-dim
client.insert("products", record).await?;
// 2. Search by similarity
let results = client.vector_search(
"products",
query_embedding,
10,
Some(VectorSearchOptions {
metric: SimilarityMetric::Cosine,
..Default::default()
})
).await?;
from ekodb_client import Client
client = Client.new("https://your-db.ekodb.net", "your-api-key")
# 1. Store vectors with your data
client.insert('products', {
'name': 'Ergonomic Chair',
'description': 'Comfortable office chair...',
'embedding': [0.12, 0.34, 0.56] # truncated; real embeddings are e.g. 1536-dim (OpenAI)
})
# 2. Search by similarity
results = client.vector_search(
'products',
query_embedding,
k=10,
metric='Cosine'
)
import { EkoDBClient } from '@ekodb/ekodb-client';
const client = new EkoDBClient({
baseURL: 'https://your-db.ekodb.net',
apiKey: 'your-api-key'
});
await client.init();
// 1. Store vectors with your data
await client.insert('products', {
name: 'Ergonomic Chair',
description: 'Comfortable office chair...',
embedding: [0.12, 0.34, 0.56, ...], // 1536-dim vector from OpenAI
});
// 2. Search by similarity
const results = await client.vectorSearch(
'products',
queryEmbedding, // Your query vector
10, // Top 10 results
{ metric: 'Cosine' }
);
const { EkoDBClient } = require('@ekodb/ekodb-client');
const client = new EkoDBClient({
baseURL: 'https://your-db.ekodb.net',
apiKey: 'your-api-key'
});
await client.init();
// 1. Store vectors with your data
await client.insert('products', {
name: 'Ergonomic Chair',
description: 'Comfortable office chair...',
embedding: [0.12, 0.34, 0.56], // truncated; real embeddings are e.g. 1536-dim (OpenAI)
});
// 2. Search by similarity
const results = await client.vectorSearch(
'products',
queryEmbedding,
10,
{ metric: 'Cosine' }
);
import io.ekodb.client.EkoDBClient
val client = EkoDBClient.builder()
.baseUrl("https://your-db.ekodb.net")
.apiKey("your-api-key")
.build()
// 1. Store vectors with your data
client.insert("products", mapOf(
"name" to "Ergonomic Chair",
"description" to "Comfortable office chair...",
"embedding" to listOf(0.12, 0.34, 0.56) // truncated; real embeddings are e.g. 1536-dim
))
// 2. Search by similarity
val results = client.vectorSearch(
collection = "products",
queryVector = queryEmbedding,
k = 10,
metric = "Cosine"
)
import "github.com/ekoDB/ekodb-client-go"
client := ekodb.NewClient(
"https://your-db.ekodb.net",
"your-api-key",
)
// 1. Store vectors with your data
client.Insert("products", map[string]interface{}{
"name": "Ergonomic Chair",
"description": "Comfortable office chair...",
"embedding": []float64{0.12, 0.34, 0.56}, // truncated; real embeddings are e.g. 1536-dim
})
// 2. Search by similarity
results, err := client.VectorSearch(
"products",
queryEmbedding,
10,
ekodb.VectorSearchOptions{Metric: "Cosine"},
)
# 1. Store vectors with your data
curl -X POST https://{EKODB_API_URL}/api/insert/products \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": {"type": "String", "value": "Ergonomic Chair"},
"description": {"type": "String", "value": "Comfortable office chair..."},
"embedding": {"type": "Vector", "value": [0.12, 0.34, 0.56, ...]}
}'
# 2. Search by similarity
curl -X POST https://{EKODB_API_URL}/api/search/products \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"query_type": "vector",
"vector": [0.12, 0.34, 0.56, ...],
"limit": 10,
"metric": "Cosine"
}'
Performance
| Dataset Size | Average Latency | Throughput |
|---|---|---|
| 1K vectors | 51ms | 379 RPS |
| 10K vectors | 322ms | 56 RPS |

Hybrid search combines text and vector scoring with configurable weights.
Competitive Performance:
- ✅ 6x faster than PostgreSQL pgvector
- ✅ Competitive with Milvus, Elasticsearch
- ✅ Integrated - no separate deployment
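The latency and throughput columns can be related through Little's law (requests in flight = throughput × latency). The sketch below applies it to the two measured rows. The benchmark's actual concurrency is not stated here, so the result is only an estimate that assumes a steady, closed-loop load; `impliedConcurrency` is a hypothetical helper name.

```typescript
// Figures copied from the Performance table.
const benchmarkRows = [
  { dataset: '1K vectors', latencyMs: 51, rps: 379 },
  { dataset: '10K vectors', latencyMs: 322, rps: 56 },
];

// Little's law: average requests in flight = throughput (req/s) * latency (s).
// Assumption: steady-state load; the source does not state the client count.
function impliedConcurrency(latencyMs: number, rps: number): number {
  return (latencyMs / 1000) * rps;
}

for (const row of benchmarkRows) {
  const inFlight = impliedConcurrency(row.latencyMs, row.rps);
  console.log(`${row.dataset}: ~${inFlight.toFixed(1)} requests in flight`);
}
```

Both rows come out at roughly 18 to 19 concurrent requests, which suggests the two measurements were taken under a similar load level and that per-query cost, not client count, drives the drop in throughput.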
Vector Types
ekoDB supports vector fields for storing embeddings from AI models:
// Standard vector field
const record = {
title: 'Database Performance Tips',
content: 'This article discusses...',
embedding: {
type: 'Vector',
value: [0.12, 0.34, 0.56, ...], // Array of numbers
metadata: {
model: 'text-embedding-ada-002',
dimensions: 1536
}
}
};
Schema Definition
Define vector fields in your schema with dimension validation:
const schema = {
title: { field_type: 'String', required: true },
content: { field_type: 'String', required: true },
embedding: {
field_type: 'Vector',
dimensions: 1536, // Enforce dimensionality
required: true,
// Optional: Vector index configuration
index: {
algorithm: 'Flat', // Exact search
metric: 'Cosine', // Similarity metric
}
}
};
await client.createSchema('articles', schema);
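Because the schema rejects vectors of the wrong size, it can help to run the same check before sending a record. The following is a minimal client-side sketch; `assertDimensions` and `EMBEDDING_DIMENSIONS` are hypothetical names, not part of the ekoDB client.

```typescript
// Hypothetical client-side guard that mirrors the schema's `dimensions` check.
function assertDimensions(vector: number[], expected: number): void {
  if (vector.length !== expected) {
    throw new Error(`Expected ${expected} dimensions, got ${vector.length}`);
  }
  if (!vector.every((x) => Number.isFinite(x))) {
    throw new Error('Vector contains NaN or infinite values');
  }
}

const EMBEDDING_DIMENSIONS = 1536; // should match `dimensions` in the schema
const sampleEmbedding = new Array<number>(EMBEDDING_DIMENSIONS).fill(0.01);

assertDimensions(sampleEmbedding, EMBEDDING_DIMENSIONS); // passes silently
console.log(`Vector has ${sampleEmbedding.length} dimensions`);
```

Calling the guard right after the embedding model returns surfaces a model/schema mismatch at the point where it is easiest to debug.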
Distance Metrics
Choose the right metric for your use case:
Cosine Similarity (Recommended)
Measures the cosine of the angle between vectors. Range: [-1, 1]. Higher = more similar.
const results = await client.vectorSearch(
'articles',
queryVector,
10,
{ metric: 'Cosine' }
);
Best for:
- ✅ Text embeddings and semantic search
- ✅ When vector magnitude should be ignored
- ✅ Most AI/ML embeddings (OpenAI, Cohere, etc.)
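For reference, the metric itself is easy to compute by hand. This is an illustrative sketch of the standard formula, not ekoDB's internal implementation; `cosineSimilarity` is a hypothetical helper.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('Cosine similarity is undefined for a zero vector');
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scaling a vector does not change the score, because magnitude is ignored.
console.log(cosineSimilarity([1, 2, 3], [2, 4, 6])); // same direction: approximately 1
console.log(cosineSimilarity([1, 0], [0, 1]));       // orthogonal: 0
```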
Euclidean Distance
Measures straight-line (L2) distance. Lower = more similar.
const results = await client.vectorSearch(
'locations',
queryVector,
10,
{ metric: 'Euclidean' }
);
Best for:
- ✅ Spatial data and coordinates
- ✅ When both magnitude and direction matter
- ✅ Physical measurements
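As a reference for what the metric computes, here is a small sketch of the standard L2 formula. `euclideanDistance` is a hypothetical helper, not an ekoDB API.

```typescript
// Euclidean (L2) distance: square root of the summed squared differences.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

console.log(euclideanDistance([0, 0], [3, 4]));       // 5 (the 3-4-5 triangle)
console.log(euclideanDistance([1, 2, 3], [2, 4, 6])); // non-zero: magnitude matters here
```

Note the second call: the two vectors point in the same direction, so their cosine similarity is 1, yet their Euclidean distance is greater than zero.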
Dot Product
Calculates inner product. Higher = more similar.
const results = await client.vectorSearch(
'recommendations',
queryVector,
10,
{ metric: 'DotProduct' }
);
Best for:
- ✅ When vectors are pre-normalized
- ✅ Certain recommendation systems
- ✅ When magnitude contains meaningful information
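The first and last points above follow directly from the formula: on unit-length vectors the dot product equals cosine similarity, while on raw vectors it also grows with magnitude. A minimal sketch, using hypothetical helpers `dotProduct` and `toUnitLength`:

```typescript
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Scale a vector to length 1 (L2 normalization).
function toUnitLength(v: number[]): number[] {
  const norm = Math.sqrt(dotProduct(v, v));
  if (norm === 0) throw new Error('Cannot normalize a zero vector');
  return v.map((x) => x / norm);
}

const raw1 = [3, 4];
const raw2 = [4, 3];

console.log(dotProduct(raw1, raw2));                             // 24, includes magnitude
console.log(dotProduct(toUnitLength(raw1), toUnitLength(raw2))); // approximately 0.96, the cosine
```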
Search Methods
ekoDB performs exact vector search - comparing the query vector against all vectors in the collection. This guarantees finding the true nearest neighbors.
const results = await client.vectorSearch(
'products',
queryVector,
10,
{
metric: 'Cosine',
threshold: 0.7, // Minimum similarity score
filters: { category: 'electronics' }, // Metadata filtering
select_fields: ['title', 'price'], // Field projection
}
);
Performance:
- Small collections: 51ms average at 1K vectors, 322ms at 10K vectors (see the Performance table)
- Exact results guaranteed
- Optimized with vector indexes (when defined in schema)
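Conceptually, exact search scores every stored vector against the query, drops anything below the threshold, and keeps the top results. The sketch below illustrates that idea in memory; it is not ekoDB's implementation, and `Doc`, `exactSearch`, and `cosine` are hypothetical names.

```typescript
interface Doc {
  id: string;
  embedding: number[];
}

// Compact cosine similarity; assumes equal-length, non-zero vectors.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return dot / (normA * normB);
}

// Brute-force top-k: score every document, filter by threshold, sort, truncate.
function exactSearch(docs: Doc[], query: number[], k: number, threshold = -1) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosine(query, doc.embedding) }))
    .filter((hit) => hit.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const docs: Doc[] = [
  { id: 'a', embedding: [1, 0] },
  { id: 'b', embedding: [0, 1] },
  { id: 'c', embedding: [1, 1] },
];

console.log(exactSearch(docs, [1, 0], 2));      // 'a' first, then 'c'
console.log(exactSearch(docs, [1, 0], 2, 0.8)); // only 'a' clears the threshold
```

Since every vector is compared, cost grows linearly with collection size, which is consistent with latency rising between the 1K and 10K rows of the Performance table.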
Hybrid Search
Combine text search with vector similarity for powerful hybrid queries:
// Hybrid: Text + Vector search
const results = await client.hybridSearch(
'articles',
{
text_query: 'database performance',
vector: queryEmbedding,
text_weight: 0.3, // 30% text relevance
vector_weight: 0.7, // 70% semantic similarity
limit: 10
}
);
Use cases:
- Semantic search with keyword filtering
- Recommendations with category constraints
- RAG systems with both keyword and semantic matching
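To show how the two weights shift results, the sketch below ranks candidates by a weighted sum of a text score and a vector score. The exact formula ekoDB uses is not documented here, so the linear combination, the assumption that both scores lie in [0, 1], and the `hybridRank` helper are illustrative only.

```typescript
interface Candidate {
  id: string;
  textScore: number;   // assumed to be normalized to [0, 1]
  vectorScore: number; // assumed to be normalized to [0, 1]
}

// Assumption: combined score = text_weight * textScore + vector_weight * vectorScore.
function hybridRank(
  candidates: Candidate[],
  textWeight: number,
  vectorWeight: number,
  limit: number,
) {
  return candidates
    .map((c) => ({
      id: c.id,
      score: textWeight * c.textScore + vectorWeight * c.vectorScore,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

const candidates: Candidate[] = [
  { id: 'keyword-match', textScore: 0.9, vectorScore: 0.2 },
  { id: 'semantic-match', textScore: 0.1, vectorScore: 0.9 },
];

console.log(hybridRank(candidates, 0.3, 0.7, 10).map((r) => r.id)); // semantic-match first
console.log(hybridRank(candidates, 0.7, 0.3, 10).map((r) => r.id)); // keyword-match first
```

Swapping the weights flips the order, which is the practical lever: raise `text_weight` when exact terms matter (product codes, names) and `vector_weight` when paraphrases should match.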
Real-World Examples
Semantic Search
- Client Libraries (Recommended)
- Direct API
import OpenAI from 'openai';
// Generate embeddings with OpenAI
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async function semanticSearch(query: string) {
// 1. Get query embedding
const response = await openai.embeddings.create({
model: 'text-embedding-ada-002',
input: query,
});
const queryVector = response.data[0].embedding;
// 2. Search by similarity
const results = await client.vectorSearch(
'articles',
queryVector,
5,
{ metric: 'Cosine' }
);
return results;
}
// Example: find articles semantically related to the query text
const articles = await semanticSearch('database optimization tips');
# 1. Get embedding from OpenAI (or your embedding service)
QUERY_VECTOR=$(curl https://api.openai.com/v1/embeddings \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "text-embedding-ada-002", "input": "database optimization tips"}' \
| jq '.data[0].embedding')
# 2. Search ekoDB
curl -X POST https://{EKODB_API_URL}/api/search/articles \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d "{
\"query_type\": \"vector\",
\"vector\": $QUERY_VECTOR,
\"limit\": 5,
\"metric\": \"Cosine\"
}"
Product Recommendations
async function getSimilarProducts(productId: string) {
// 1. Get product's embedding
const product = await client.findById('products', productId);
const embedding = product.embedding;
// 2. Find similar products (request one extra, since the product matches itself)
const similar = await client.vectorSearch(
'products',
embedding,
11,
{ metric: 'Cosine' }
);
// Drop the original product and keep the top 10
return similar.filter(p => p.id !== productId).slice(0, 10);
}
Image Similarity
// Store image embeddings from CLIP or similar model
async function findSimilarImages(imageEmbedding: number[], category?: string) {
const results = await client.vectorSearch(
'images',
imageEmbedding,
20,
{
metric: 'Cosine',
threshold: 0.7,
// Use metadata filters to reduce search space
filters: category ? { category } : undefined
}
);
return results.map(r => ({
url: r.url,
similarity: r.score,
metadata: r.metadata
}));
}
Performance Optimization
1. Use Vector Indexes
Define vector indexes in your schema for automatic indexing:
// Define index in schema for automatic indexing
const schema = {
embedding: {
field_type: 'Vector',
dimensions: 768,
index: {
algorithm: 'Flat', // Exact search
metric: 'Cosine',
}
}
};
2. Use Metadata Filters
Reduce search space with metadata filters:
const results = await client.vectorSearch(
'products',
queryVector,
10,
{
metric: 'Cosine',
filters: {
category: 'electronics',
in_stock: true
}
}
);
3. Set Similarity Threshold
Filter out low-relevance results:
const results = await client.vectorSearch(
'articles',
queryVector,
10,
{
metric: 'Cosine',
threshold: 0.75 // Only return results with a similarity score of at least 0.75
}
);
4. Batch Insert Vectors
// More efficient than individual inserts
await client.batchInsert('articles', articlesWithEmbeddings);
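For very large imports it may be worth splitting the array into fixed-size batches so that no single request grows unbounded. The helper below is a generic sketch; `chunk` is a hypothetical name, and a suitable batch size depends on vector dimensions and any request limits, which are not specified here.

```typescript
// Split an array into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

console.log(chunk([1, 2, 3, 4, 5], 2)); // three batches: [1, 2], [3, 4], [5]
```

Each batch can then be passed to `client.batchInsert('articles', batch)` in a loop.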
5. Use Field Projection
Return only needed fields to reduce data transfer:
const results = await client.vectorSearch(
'products',
queryVector,
10,
{
metric: 'Cosine',
select_fields: ['title', 'price', 'image_url'] // Only these fields
}
);
Embedding Models
ekoDB works with any embedding model. Popular choices:
OpenAI
import OpenAI from 'openai';
const openai = new OpenAI();
const response = await openai.embeddings.create({
model: 'text-embedding-ada-002', // 1536 dimensions
input: 'Your text here',
});
const embedding = response.data[0].embedding;
Cohere
import { CohereClient } from 'cohere-ai';
const cohere = new CohereClient({ token: process.env.COHERE_API_KEY });
const response = await cohere.embed({
texts: ['Your text here'],
model: 'embed-english-v3.0', // 1024 dimensions
inputType: 'search_document', // Required for v3 models; use 'search_query' for queries
});
const embedding = response.embeddings[0];
Local Models (Transformers.js)
import { pipeline } from '@xenova/transformers';
const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const output = await extractor('Your text here', { pooling: 'mean', normalize: true });
const embedding = Array.from(output.data); // 384 dimensions
API Reference
vectorSearch()
client.vectorSearch(
collection: string,
vector: number[],
limit: number,
options?: {
metric?: 'Cosine' | 'Euclidean' | 'DotProduct',
threshold?: number, // Minimum similarity score (0.0-1.0)
vector_field?: string, // Default: 'embedding'
filters?: Record<string, any>, // Metadata filters
normalize?: boolean, // Auto-normalize vectors (default: true)
bypass_cache?: boolean,
select_fields?: string[], // Field projection
exclude_fields?: string[],
}
): Promise<SearchResult[]>
hybridSearch()
client.hybridSearch(
collection: string,
options: {
text_query: string,
vector: number[],
text_weight: number,
vector_weight: number,
limit: number,
metric?: 'Cosine' | 'Euclidean' | 'DotProduct'
}
): Promise<SearchResult[]>
Best Practices
- Match Dimensions: Ensure vector dimensions match your embedding model
- Use Schemas: Define dimensions in schema for validation
- Choose Right Metric: Cosine for most AI embeddings (OpenAI, Cohere, etc.)
- Batch Operations: Use batchInsert for multiple vectors
- Set Thresholds: Filter low-relevance results with the threshold parameter
- Use Metadata Filters: Reduce search space when possible
- Field Projection: Only return fields you need
- Monitor Performance: Track query latency (< 100ms target for < 100K vectors)
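For the monitoring point, averages can hide slow outliers, so tracking percentiles is often more useful. This is a minimal sketch using the nearest-rank method; `percentile` and the sample latencies are hypothetical, and in practice the samples would come from timing real `vectorSearch` calls.

```typescript
// Nearest-rank percentile over recorded latencies (in milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('No samples recorded');
  if (p < 0 || p > 100) throw new Error('p must be between 0 and 100');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

// Hypothetical measurements, for illustration only.
const latenciesMs = [42, 48, 51, 55, 60, 47, 49, 53, 180, 50];

console.log(`p50: ${percentile(latenciesMs, 50)}ms`);
console.log(`p95: ${percentile(latenciesMs, 95)}ms`);
```

Comparing p95 against the latency target catches regressions that a mean would smooth over.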
Related Documentation
- Indexes - Create indexes for optimal performance
- Query Expressions - Filter syntax
- Client Libraries - Full API examples
- System Administration - Monitor performance
Summary
Vector search in ekoDB enables:
- ✅ Semantic search - Find by meaning, not just keywords
- ✅ Recommendations - Product, content, and user similarity
- ✅ Image search - Visual similarity matching
- ✅ RAG systems - Retrieval-augmented generation
- ✅ Integrated - No separate vector database needed
- ✅ Production-ready - Competitive performance and reliability