Vector Search & Embeddings

Build intelligent search and recommendation systems with ekoDB's integrated vector search capabilities.

Production-Ready

Vector search is built directly into ekoDB, so no separate deployment is needed. Performance is competitive with specialized vector databases.

Quick Start

use ekodb_client::Client;

// 1. Store vectors with your data
let mut record = Record::new();
record.insert("name", "Ergonomic Chair");
record.insert("description", "Comfortable office chair...");
record.insert("embedding", vec![0.12, 0.34, 0.56]); // truncated for brevity; a real embedding has e.g. 1536 values

client.insert("products", record).await?;

// 2. Search by similarity
let results = client.vector_search(
"products",
query_embedding,
10,
Some(VectorSearchOptions {
metric: SimilarityMetric::Cosine,
..Default::default()
})
).await?;

Performance

Dataset Size   | Average Latency        | Throughput
1K vectors     | 51 ms                  | 379 RPS
10K vectors    | 322 ms                 | 56 RPS
Hybrid search  | Text + vector combined | Configurable weights

Competitive Performance:

  • 6x faster than PostgreSQL pgvector
  • Competitive with Milvus, Elasticsearch
  • Integrated - no separate deployment

Vector Types

ekoDB supports vector fields for storing embeddings from AI models:

// Standard vector field
const record = {
title: 'Database Performance Tips',
content: 'This article discusses...',
embedding: {
type: 'Vector',
value: [0.12, 0.34, 0.56, ...], // Array of numbers
metadata: {
model: 'text-embedding-ada-002',
dimensions: 1536
}
}
};

Schema Definition

Define vector fields in your schema with dimension validation:

const schema = {
title: { field_type: 'String', required: true },
content: { field_type: 'String', required: true },
embedding: {
field_type: 'Vector',
dimensions: 1536, // Enforce dimensionality
required: true,
// Optional: Vector index configuration
index: {
algorithm: 'Flat', // Exact search
metric: 'Cosine', // Similarity metric
}
}
};

await client.createSchema('articles', schema);
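It can help to catch a dimension mismatch before a record is sent at all. The following is a minimal client-side sketch; `assertDimensions` is a hypothetical helper, not part of the ekoDB client, and it assumes only the `dimensions` value declared in the schema above.

```typescript
// Hypothetical helper (not an ekoDB API): validate a vector before inserting it.
function assertDimensions(vector: number[], expected: number): void {
  if (vector.length !== expected) {
    throw new Error(`Expected ${expected} dimensions, got ${vector.length}`);
  }
  // Reject NaN and Infinity, which would corrupt any similarity score.
  if (!vector.every((x) => Number.isFinite(x))) {
    throw new Error('Vector contains NaN or infinite values');
  }
}

// Passes silently: 3 finite values against an expected length of 3.
assertDimensions([0.12, 0.34, 0.56], 3);
```

Calling it with the schema's `dimensions` value (1536 in the example above) right after generating an embedding surfaces a wrong model or a truncated response early.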

Distance Metrics

Choose the right metric for your use case:

Cosine Similarity

Measures the cosine of the angle between vectors. Range: [-1, 1]. Higher = more similar.

const results = await client.vectorSearch(
'articles',
queryVector,
10,
{ metric: 'Cosine' }
);

Best for:

  • ✅ Text embeddings and semantic search
  • ✅ When vector magnitude should be ignored
  • ✅ Most AI/ML embeddings (OpenAI, Cohere, etc.)
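For reference, the metric itself is small enough to write out. This is a plain TypeScript sketch of the standard cosine similarity formula, not ekoDB's internal implementation.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('Cosine similarity is undefined for a zero vector');
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [2, 0]));  // 1: same direction, magnitude ignored
console.log(cosineSimilarity([1, 0], [0, 3]));  // 0: orthogonal
console.log(cosineSimilarity([1, 0], [-1, 0])); // -1: opposite direction
```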

Euclidean Distance

Measures straight-line (L2) distance. Lower = more similar.

const results = await client.vectorSearch(
'locations',
queryVector,
10,
{ metric: 'Euclidean' }
);

Best for:

  • ✅ Spatial data and coordinates
  • ✅ When both magnitude and direction matter
  • ✅ Physical measurements
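As an illustration of what the metric computes, here is the standard L2 distance in plain TypeScript (a sketch, independent of ekoDB's implementation). Unlike cosine similarity, scaling one vector changes the result.

```typescript
// Euclidean (L2) distance: square root of the summed squared differences.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

console.log(euclideanDistance([0, 0], [3, 4])); // 5
console.log(euclideanDistance([1, 2], [1, 2])); // 0: identical points
```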

Dot Product

Calculates inner product. Higher = more similar.

const results = await client.vectorSearch(
'recommendations',
queryVector,
10,
{ metric: 'DotProduct' }
);

Best for:

  • ✅ When vectors are pre-normalized
  • ✅ Certain recommendation systems
  • ✅ When magnitude contains meaningful information
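The first and third points pull in different directions, which a small example makes concrete. In this sketch (plain TypeScript, with `dot` and `normalize` as illustrative helpers rather than ekoDB functions), a longer vector scores higher on the raw dot product, while normalizing both vectors first makes the result match cosine similarity.

```typescript
// Inner product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Scale a vector to unit length.
function normalize(v: number[]): number[] {
  const norm = Math.sqrt(dot(v, v));
  if (norm === 0) throw new Error('Cannot normalize a zero vector');
  return v.map((x) => x / norm);
}

const short = [3, 4];
const long = [6, 8]; // same direction as `short`, twice the magnitude

console.log(dot(short, short)); // 25
console.log(dot(short, long));  // 50: the longer vector scores higher
console.log(dot(normalize(short), normalize(long))); // approximately 1
```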

Search Methods

ekoDB's vector search finds the true nearest neighbors for your query, with performance optimized by vector indexes when defined in your schema.

const results = await client.vectorSearch(
'products',
queryVector,
10,
{
metric: 'Cosine',
threshold: 0.7, // Minimum similarity score
filters: { category: 'electronics' }, // Metadata filtering
select_fields: ['title', 'price'], // Field projection
}
);

Performance:

  • Low-latency similarity search across collections
  • Accurate results guaranteed
  • Further optimized with vector indexes (when defined in schema)
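To make "true nearest neighbors" concrete, the sketch below shows what an exact (brute-force) search does conceptually: score every candidate, apply a threshold, and keep the best k. It is an illustration only, written in plain TypeScript; the `Item` shape and the `exactTopK` name are assumptions, and it says nothing about how ekoDB executes the query internally.

```typescript
type Item = { id: string; vector: number[] };
type Scored = { id: string; score: number };

// Dot product; equals cosine similarity when both vectors are unit length.
function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Score every item, drop those below the threshold, return the k best.
function exactTopK(
  query: number[],
  items: Item[],
  k: number,
  threshold = -Infinity
): Scored[] {
  return items
    .map((item) => ({ id: item.id, score: dot(query, item.vector) }))
    .filter((r) => r.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const items: Item[] = [
  { id: 'a', vector: [1, 0] },
  { id: 'b', vector: [0, 1] },
  { id: 'c', vector: [0.6, 0.8] },
];
const top = exactTopK([1, 0], items, 2, 0.5);
console.log(top.map((r) => r.id)); // [ 'a', 'c' ]
```

The cost grows linearly with collection size, since every vector is compared against the query.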

Hybrid Search

Combine text search with vector similarity in a single hybrid query:

// Hybrid: Text + Vector search
const results = await client.hybridSearch(
'articles',
{
text_query: 'database performance',
vector: queryEmbedding,
text_weight: 0.3, // 30% text relevance
vector_weight: 0.7, // 70% semantic similarity
limit: 10
}
);

Use cases:

  • Semantic search with keyword filtering
  • Recommendations with category constraints
  • RAG systems with both keyword and semantic matching
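This page does not spell out how the two scores are fused. One common approach is a weighted linear combination, sketched below under the assumption that both scores have already been scaled to [0, 1]; `hybridScore` is an illustrative name, and ekoDB's actual formula may differ.

```typescript
// Assumed fusion rule: weighted sum of a text score and a vector score.
// The default weights mirror text_weight / vector_weight in the example above.
function hybridScore(
  textScore: number,
  vectorScore: number,
  textWeight = 0.3,
  vectorWeight = 0.7
): number {
  return textWeight * textScore + vectorWeight * vectorScore;
}

// Strong keyword match, weak semantic match.
console.log(hybridScore(0.9, 0.2)); // approximately 0.41
// Weak keyword match, strong semantic match.
console.log(hybridScore(0.2, 0.9)); // approximately 0.69
```

With a 0.3 / 0.7 split, the semantically close document outranks the keyword-heavy one; swapping the weights reverses the order.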

Real-World Examples

Semantic Search

import OpenAI from 'openai';

// Generate embedding from OpenAI
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function semanticSearch(query: string) {
// 1. Get query embedding
const response = await openai.embeddings.create({
model: 'text-embedding-ada-002',
input: query,
});
const queryVector = response.data[0].embedding;

// 2. Search by similarity
const results = await client.vectorSearch(
'articles',
queryVector,
5,
{ metric: 'Cosine' }
);

return results;
}

// Search: "database optimization tips"
const articles = await semanticSearch('database optimization tips');

Product Recommendations

async function getSimilarProducts(productId: string) {
// 1. Get product's embedding
const product = await client.findById('products', productId);
const embedding = product.embedding;

// 2. Find similar products (request one extra, since the product matches itself)
const similar = await client.vectorSearch(
'products',
embedding,
11,
{ metric: 'Cosine' }
);

// 3. Drop the original product and keep the top 10
return similar.filter(p => p.id !== productId).slice(0, 10);
}

Image Similarity

// Store image embeddings from CLIP or similar model
async function findSimilarImages(imageEmbedding: number[], category?: string) {
const results = await client.vectorSearch(
'images',
imageEmbedding,
20,
{
metric: 'Cosine',
threshold: 0.7,
// Use metadata filters to reduce search space
filters: category ? { category } : undefined
}
);

return results.map(r => ({
url: r.url,
similarity: r.score,
metadata: r.metadata
}));
}

Deletion & Index Maintenance

When you delete a record that contains a vector field, ekoDB immediately removes it from vector search results. There's no need to manually update the index — deleted records won't appear in searches.

// Insert a record with a vector
await client.insert('products', {
name: 'Discontinued Widget',
embedding: [0.12, 0.34, 0.56, ...]
});

// Delete it — immediately excluded from vector search results
await client.delete('products', recordId);

Reindexing

Over time, frequent deletions can degrade search performance. ekoDB tracks the deletion ratio and recommends a reindex when more than 20% of vectors have been deleted. Reindexing rebuilds the search graph from scratch, restoring optimal performance.

# Reindex a collection's vector index
curl -X POST https://{EKODB_API_URL}/api/indexes/search/{collection}/reindex \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{}'

# Optionally specify which vector field to reindex
curl -X POST https://{EKODB_API_URL}/api/indexes/search/{collection}/reindex \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{"field": "embedding"}'

Response:

{
"status": "ok",
"collection": "products",
"field": "embedding",
"vectors_reindexed": 4850,
"duration_ms": 127.5
}

When to Reindex

Reindexing is only needed after heavy deletion workloads. For most applications with occasional deletes, the index stays performant without manual intervention.
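The 20% rule is easy to mirror in application code if you track deletions yourself. The sketch below assumes you have a deleted count and a total count from your own bookkeeping; this page does not describe an ekoDB endpoint for reading the deletion ratio, so `shouldReindex` and its inputs are hypothetical.

```typescript
// Threshold taken from the recommendation above: more than 20% deleted.
const REINDEX_THRESHOLD = 0.2;

// Hypothetical helper: both counts come from the caller's own tracking.
// totalVectors is assumed to include the deleted ones.
function shouldReindex(deletedVectors: number, totalVectors: number): boolean {
  if (totalVectors <= 0) return false;
  return deletedVectors / totalVectors > REINDEX_THRESHOLD;
}

console.log(shouldReindex(150, 1000)); // false (15% deleted)
console.log(shouldReindex(250, 1000)); // true (25% deleted)
```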

Search Tuning

ef_search (Beam Width)

The ef_search parameter controls the search beam width — higher values explore more of the graph, improving accuracy at the cost of latency. ekoDB resolves ef_search with a 3-tier fallback:

  1. Per-query override: pass ef_search in the search request
  2. Collection-level config: set in the vector index configuration
  3. Heuristic default: max(k * 2, 64), where k is the requested result limit

// Per-query override for a high-precision search
const results = await client.vectorSearch(
'articles',
queryVector,
10,
{
metric: 'Cosine',
ef_search: 200 // Higher = more accurate, slower
}
);

# Direct API with ef_search override
curl -X POST https://{EKODB_API_URL}/api/search/{collection} \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"query_type": "vector",
"vector": [0.12, 0.34, 0.56, ...],
"limit": 10,
"metric": "Cosine",
"ef_search": 200
}'

ef_search | Accuracy  | Latency | Use Case
32-64     | Good      | Low     | Real-time recommendations
64-128    | High      | Medium  | Semantic search (default range)
200+      | Very High | Higher  | Precision-critical applications
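The three-tier fallback described in this section can be sketched as a single expression. `resolveEfSearch` is an illustrative name, not an ekoDB function; the logic follows the order given here (per-query value, then collection-level value, then `max(k * 2, 64)`).

```typescript
// Illustrative only: mirrors the documented fallback order for ef_search.
function resolveEfSearch(
  k: number,
  perQuery?: number,
  collectionLevel?: number
): number {
  return perQuery ?? collectionLevel ?? Math.max(k * 2, 64);
}

console.log(resolveEfSearch(10));                 // 64: heuristic floor
console.log(resolveEfSearch(50));                 // 100: heuristic, k * 2
console.log(resolveEfSearch(10, undefined, 128)); // 128: collection-level config
console.log(resolveEfSearch(10, 200, 128));       // 200: per-query override wins
```

Note that for any limit up to 32 the heuristic yields 64, which is why the table lists 64-128 as the default range.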

Performance Optimization

1. Use Vector Indexes

Define vector indexes in your schema for automatic indexing:

// Define index in schema for automatic indexing
const schema = {
embedding: {
field_type: 'Vector',
dimensions: 768,
index: {
algorithm: 'Flat', // Exact search
metric: 'Cosine',
}
}
};

2. Use Metadata Filters

Reduce search space with metadata filters:

const results = await client.vectorSearch(
'products',
queryVector,
10,
{
metric: 'Cosine',
filters: {
category: 'electronics',
in_stock: true
}
}
);

3. Set Similarity Threshold

Filter out low-relevance results:

const results = await client.vectorSearch(
'articles',
queryVector,
10,
{
metric: 'Cosine',
threshold: 0.75 // Only return results with > 75% similarity
}
);

4. Batch Insert Vectors

// More efficient than individual inserts
await client.batchInsert('articles', articlesWithEmbeddings);
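For large imports it is often safer to split the records into fixed-size batches instead of sending one huge request. The sketch below is a generic chunking helper; the batch size of 500 in the usage comment is an arbitrary assumption, as this page does not state a limit for `batchInsert`.

```typescript
// Split an array into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

console.log(chunk([1, 2, 3, 4, 5], 2)); // [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]

// Usage with the client from the example above (500 is an assumed batch size):
// for (const batch of chunk(articlesWithEmbeddings, 500)) {
//   await client.batchInsert('articles', batch);
// }
```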

5. Use Field Projection

Return only needed fields to reduce data transfer:

const results = await client.vectorSearch(
'products',
queryVector,
10,
{
metric: 'Cosine',
select_fields: ['title', 'price', 'image_url'] // Only these fields
}
);

Embedding Models

ekoDB works with any embedding model. Popular choices:

OpenAI

import OpenAI from 'openai';

const openai = new OpenAI();
const response = await openai.embeddings.create({
model: 'text-embedding-ada-002', // 1536 dimensions
input: 'Your text here',
});
const embedding = response.data[0].embedding;

Cohere

import { CohereClient } from 'cohere-ai';

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY });
const response = await cohere.embed({
texts: ['Your text here'],
model: 'embed-english-v3.0', // 1024 dimensions
inputType: 'search_document', // v3 models require an input type; use 'search_query' for queries
});
const embedding = response.embeddings[0];

Local Models (Transformers.js)

import { pipeline } from '@xenova/transformers';

const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const output = await extractor('Your text here', { pooling: 'mean', normalize: true });
const embedding = Array.from(output.data); // 384 dimensions

API Reference

vectorSearch()

client.vectorSearch(
collection: string,
vector: number[],
limit: number,
options?: {
metric?: 'Cosine' | 'Euclidean' | 'DotProduct',
threshold?: number, // Minimum similarity score (0.0-1.0)
vector_field?: string, // Default: 'embedding'
filters?: Record<string, any>, // Metadata filters
normalize?: boolean, // Auto-normalize vectors (default: true)
bypass_cache?: boolean,
select_fields?: string[], // Field projection
exclude_fields?: string[],
}
): Promise<SearchResult[]>

hybridSearch()

client.hybridSearch(
collection: string,
options: {
text_query: string,
vector: number[],
text_weight: number,
vector_weight: number,
limit: number,
metric?: 'Cosine' | 'Euclidean' | 'DotProduct'
}
): Promise<SearchResult[]>

Best Practices

  1. Match Dimensions: Ensure vector dimensions match your embedding model
  2. Use Schemas: Define dimensions in schema for validation
  3. Choose Right Metric: Cosine for most AI embeddings (OpenAI, Cohere, etc.)
  4. Batch Operations: Use batchInsert for multiple vectors
  5. Set Thresholds: Filter low-relevance results with threshold parameter
  6. Use Metadata Filters: Reduce search space when possible
  7. Field Projection: Only return fields you need
  8. Monitor Performance: Track query latency and optimize with vector indexes as needed

Summary

Vector search in ekoDB enables:

  • ✅ Semantic search - Find by meaning, not just keywords
  • ✅ Recommendations - Product, content, and user similarity
  • ✅ Image search - Visual similarity matching
  • ✅ RAG systems - Retrieval-augmented generation
  • ✅ Integrated - No separate vector database needed
  • ✅ Production-ready - Competitive performance and reliability