Vector Search & Embeddings
Build intelligent search and recommendation systems with ekoDB's integrated vector search capabilities.
Vector search is built directly into ekoDB, so no separate deployment is needed. Performance is competitive with specialized vector databases.
Quick Start
- Client Libraries (Recommended)
- Direct API
- 🦀 Rust
- 🐍 Python
- 📘 TypeScript
- 📦 JavaScript
- 🟣 Kotlin
- 🔷 Go
use ekodb_client::{Client, Record, SearchQuery};
let client = Client::builder()
.base_url("https://my-first-db.development.google.ekodb.net")
.api_key("your-api-key")
.build()?;
// 1. Store vectors with your data
let mut record = Record::new();
record.insert("name", "Ergonomic Chair");
record.insert("description", "Comfortable office chair...");
record.insert("embedding", vec![0.12, 0.34, 0.56]); // 1536-dim vector
client.insert("products", record).await?;
// 2. Search by similarity
let query_embedding: Vec<f64> = vec![0.12, 0.34, 0.56]; // from your embedding model
let query = SearchQuery {
query: String::new(),
vector: Some(query_embedding),
vector_field: Some("embedding".to_string()),
vector_metric: Some("cosine".to_string()),
vector_k: Some(10),
..Default::default()
};
let results = client.search("products", query).await?;
from ekodb_client import Client
client = Client.new("https://my-first-db.development.google.ekodb.net", "your-api-key")
# 1. Store vectors with your data
await client.insert('products', {
'name': 'Ergonomic Chair',
'description': 'Comfortable office chair...',
'embedding': [0.12, 0.34, 0.56] # 1536-dim vector from OpenAI
})
# 2. Search by similarity
query_embedding = [0.12, 0.34, 0.56] # from your embedding model
results = await client.search(
'products',
query='', # empty query for vector-only search
vector=query_embedding,
vector_field='embedding',
vector_metric='cosine',
vector_k=10
)
import { EkoDBClient, SearchQueryBuilder } from '@ekodb/ekodb-client';
const client = new EkoDBClient({
baseURL: 'https://my-first-db.development.google.ekodb.net',
apiKey: 'your-api-key'
});
await client.init();
// 1. Store vectors with your data
await client.insert('products', {
name: 'Ergonomic Chair',
description: 'Comfortable office chair...',
embedding: [0.12, 0.34, 0.56 /* ...1536 dims */], // from OpenAI
});
// 2. Search by similarity
const queryEmbedding = [0.12, 0.34, 0.56 /* ...1536 dims */]; // from your embedding model
const query = new SearchQueryBuilder('') // empty text query for vector-only search
.vector(queryEmbedding)
.vectorField('embedding')
.vectorMetric('cosine')
.vectorK(10)
.build();
const results = await client.search('products', query);
const { EkoDBClient, SearchQueryBuilder } = require('@ekodb/ekodb-client');
const client = new EkoDBClient({
baseURL: 'https://my-first-db.development.google.ekodb.net',
apiKey: 'your-api-key'
});
await client.init();
// 1. Store vectors with your data
await client.insert('products', {
name: 'Ergonomic Chair',
description: 'Comfortable office chair...',
embedding: [0.12, 0.34, 0.56 /* ...1536 dims */], // from OpenAI
});
// 2. Search by similarity
const queryEmbedding = [0.12, 0.34, 0.56 /* ...1536 dims */]; // from your embedding model
const query = new SearchQueryBuilder('') // empty text query for vector-only search
.vector(queryEmbedding)
.vectorField('embedding')
.vectorMetric('cosine')
.vectorK(10)
.build();
const results = await client.search('products', query);
import io.ekodb.client.EkoDBClient
import kotlinx.serialization.json.*
val client = EkoDBClient.builder()
.baseUrl("https://my-first-db.development.google.ekodb.net")
.apiKey("your-api-key")
.build()
// 1. Store vectors with your data
client.insert("products", mapOf(
"name" to "Ergonomic Chair",
"description" to "Comfortable office chair...",
"embedding" to listOf(0.12, 0.34, 0.56) // 1536-dim vector
))
// 2. Search by similarity
val queryEmbedding = listOf(0.12, 0.34, 0.56) // from your embedding model
val searchQuery = buildJsonObject {
put("query", "") // empty text query for vector-only search
put("vector", buildJsonArray { queryEmbedding.forEach { add(it) } })
put("vector_field", "embedding")
put("vector_metric", "cosine")
put("vector_k", 10)
}
val results = client.search("products", searchQuery)
import "github.com/ekoDB/ekodb-client-go"
client := ekodb.NewClient(
"https://my-first-db.development.google.ekodb.net",
"your-api-key",
)
// 1. Store vectors with your data
client.Insert("products", map[string]interface{}{
"name": "Ergonomic Chair",
"description": "Comfortable office chair...",
"embedding": []float64{0.12, 0.34, 0.56}, // 1536-dim vector
})
// 2. Search by similarity
queryEmbedding := []float64{0.12, 0.34, 0.56} // from your embedding model
vectorField := "embedding"
vectorMetric := "cosine"
vectorK := 10
results, err := client.Search("products", ekodb.SearchQuery{
Query: "",
Vector: queryEmbedding,
VectorField: &vectorField,
VectorMetric: &vectorMetric,
VectorK: &vectorK,
})
# 1. Store vectors with your data
curl -X POST https://{EKODB_API_URL}/api/insert/products \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": {"type": "String", "value": "Ergonomic Chair"},
"description": {"type": "String", "value": "Comfortable office chair..."},
"embedding": {"type": "Vector", "value": [0.12, 0.34, 0.56, ...]}
}'
# 2. Search by similarity
curl -X POST https://{EKODB_API_URL}/api/search/products \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"query": "",
"vector": [0.12, 0.34, 0.56, ...],
"vector_field": "embedding",
"vector_metric": "cosine",
"vector_k": 10
}'
Performance
| Dataset Size | Average Latency | Throughput |
|---|---|---|
| 1K vectors | 51 ms | 379 RPS |
| 10K vectors | 322 ms | 56 RPS |

Hybrid search combines text and vector scoring with configurable weights.
Competitive Performance:
- ✅ 6x faster than PostgreSQL pgvector
- ✅ Competitive with Milvus, Elasticsearch
- ✅ Integrated - no separate deployment
Vector Types
ekoDB supports vector fields for storing embeddings from AI models:
// Standard vector field
const record = {
title: 'Database Performance Tips',
content: 'This article discusses...',
embedding: {
type: 'Vector',
value: [0.12, 0.34, 0.56 /* ... */], // Array of numbers
metadata: {
model: 'text-embedding-ada-002',
dimensions: 1536
}
}
};
Schema Definition
Define vector fields in your schema. ekoDB infers the vector dimension from the first inserted vector and rejects later vectors of a different length, so you do not declare the dimension yourself:
const schema = {
fields: {
title: { field_type: 'String', required: true },
content: { field_type: 'String', required: true },
embedding: {
field_type: 'Vector',
required: true,
// Optional: Vector index configuration
index: {
type: 'vector',
algorithm: 'flat', // exact search
metric: 'cosine', // similarity metric
}
}
}
};
await client.createCollection('articles', schema);
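Because the dimension is locked by the first inserted vector, it can help to catch a mismatch on the client before the insert is sent. The following is a minimal sketch, assuming a hypothetical helper named `assertDimension` that is not part of the ekoDB client; the expected dimension is whatever your embedding model produces.

```typescript
// Hypothetical client-side guard; not part of the ekoDB client library.
// Throws if a vector's length differs from the dimension your model produces.
function assertDimension(vector: number[], expected: number): number[] {
  if (vector.length !== expected) {
    throw new Error(
      `embedding has ${vector.length} dimensions, expected ${expected}`
    );
  }
  if (!vector.every((x) => Number.isFinite(x))) {
    throw new Error("embedding contains a non-finite value");
  }
  return vector;
}

// Validate before building the record you pass to client.insert(...).
const checked = assertDimension([0.12, 0.34, 0.56], 3);
```

Failing early on the client keeps the error next to the code that produced the embedding, instead of surfacing it as a rejected insert.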
Distance Metrics
Choose the right metric for your use case:
Cosine Similarity (Recommended)
Measures the cosine of the angle between vectors, ignoring magnitude. Range: [-1, 1]. Higher = more similar.
const query = new SearchQueryBuilder('')
.vector(queryVector)
.vectorMetric('cosine')
.vectorK(10)
.build();
const results = await client.search('articles', query);
Best for:
- ✅ Text embeddings and semantic search
- ✅ When vector magnitude should be ignored
- ✅ Most AI/ML embeddings (OpenAI, Cohere, etc.)
Euclidean Distance
Measures straight-line (L2) distance. Lower = more similar.
const query = new SearchQueryBuilder('')
.vector(queryVector)
.vectorMetric('euclidean')
.vectorK(10)
.build();
const results = await client.search('locations', query);
Best for:
- ✅ Spatial data and coordinates
- ✅ When both magnitude and direction matter
- ✅ Physical measurements
Dot Product
Calculates inner product. Higher = more similar.
const query = new SearchQueryBuilder('')
.vector(queryVector)
.vectorMetric('dotproduct')
.vectorK(10)
.build();
const results = await client.search('recommendations', query);
Best for:
- ✅ When vectors are pre-normalized
- ✅ Certain recommendation systems
- ✅ When magnitude contains meaningful information
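To make the difference between the three metrics concrete, here is a small reference sketch in plain TypeScript. It is for intuition only: ekoDB computes these server-side, and the function names below are illustrative, not part of any client library.

```typescript
// Reference implementations of the three metrics, for intuition only.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

function cosine(a: number[], b: number[]): number {
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  if (denom === 0) throw new Error("cosine is undefined for a zero vector");
  return dot(a, b) / denom;
}

function euclidean(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

const u = [3, 4];
const v = [6, 8]; // same direction as u, twice the magnitude
console.log(cosine(u, v));    // 1  (direction only, magnitude ignored)
console.log(euclidean(u, v)); // 5  (the vectors are 5 units apart)
console.log(dot(u, v));       // 50 (grows with magnitude)
```

The two vectors point the same way, so cosine treats them as identical, while Euclidean distance and dot product both reflect the difference in magnitude.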
Search Methods
ekoDB's vector search finds the true nearest neighbors for your query, with performance optimized by vector indexes when defined in your schema.
const query = new SearchQueryBuilder('')
.vector(queryVector)
.vectorMetric('cosine')
.vectorK(10)
.vectorThreshold(0.7) // minimum similarity score
.build();
const results = await client.search('products', query);
Performance:
- Low-latency similarity search across collections
- Accurate results guaranteed
- Further optimized with vector indexes (when defined in schema)
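Conceptually, an exact search scores every candidate, discards those below the threshold, and keeps the best k. The sketch below shows that idea in plain TypeScript. It is not ekoDB's implementation, and treating the threshold as inclusive is an assumption made for the example.

```typescript
type Scored = { id: string; score: number };

// Cosine similarity between two equal-length, non-zero vectors.
function cosineSim(a: number[], b: number[]): number {
  let ab = 0, aa = 0, bb = 0;
  for (let i = 0; i < a.length; i++) {
    ab += a[i] * b[i];
    aa += a[i] * a[i];
    bb += b[i] * b[i];
  }
  return ab / (Math.sqrt(aa) * Math.sqrt(bb));
}

// Exact top-k: score every candidate, drop those below the threshold,
// sort by descending similarity, keep the first k.
function exactTopK(
  query: number[],
  items: { id: string; vector: number[] }[],
  k: number,
  threshold = -1
): Scored[] {
  return items
    .map((it) => ({ id: it.id, score: cosineSim(query, it.vector) }))
    .filter((s) => s.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const hits = exactTopK(
  [1, 0],
  [
    { id: "same", vector: [2, 0] },
    { id: "near", vector: [1, 1] },
    { id: "orthogonal", vector: [0, 1] },
  ],
  2,
  0.7
);
console.log(hits.map((h) => h.id)); // [ 'same', 'near' ]
```

The cost of this approach grows linearly with collection size, which is the work a vector index is meant to reduce.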
Hybrid Search
Combine text search with vector similarity for powerful hybrid queries:
// Hybrid: Text + Vector search
const query = new SearchQueryBuilder('database performance')
.fields(['title', 'content'])
.vector(queryEmbedding)
.vectorField('embedding')
.textWeight(0.3) // 30% text relevance
.vectorWeight(0.7) // 70% semantic similarity
.limit(10)
.build();
const results = await client.search('articles', query);
Use cases:
- Semantic search with keyword filtering
- Recommendations with category constraints
- RAG systems with both keyword and semantic matching
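One common way to combine a text score and a vector score is a weighted linear blend. The sketch below illustrates how the two weights shift ranking; it assumes both scores are already normalized to [0, 1] and is not a statement of the exact formula ekoDB uses, which this page does not specify. The function name `hybridScore` is hypothetical.

```typescript
// Assumption: both scores are already normalized to [0, 1].
// Weighted linear blend; weights are renormalized so they sum to 1.
function hybridScore(
  textScore: number,
  vectorScore: number,
  textWeight = 0.3,
  vectorWeight = 0.7
): number {
  const total = textWeight + vectorWeight;
  if (total <= 0) throw new Error("weights must sum to a positive value");
  return (textWeight * textScore + vectorWeight * vectorScore) / total;
}

// A strong keyword match with weak semantic similarity...
const keywordHeavy = hybridScore(0.9, 0.2);
// ...ranks below a weak keyword match with strong semantic similarity
// when the vector weight dominates.
const semanticHeavy = hybridScore(0.2, 0.9);
console.log(keywordHeavy < semanticHeavy); // true
```

Swapping the weights reverses the outcome, which is a quick way to reason about which signal should dominate for a given use case.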
Filtering by Metadata
Restrict vector and hybrid search to a subset of records with a metadata pre-filter. Only records matching the filter are considered as candidates before similarity ranking, so a query like "find the nearest in-stock electronics over $100" never scores the rest of the collection.
import { SearchQueryBuilder, QueryBuilder } from '@ekodb/ekodb-client';
const filter = new QueryBuilder()
.eq('category', 'electronics')
.gte('price', 100)
.build().filter;
const query = new SearchQueryBuilder('')
.vector(queryVector)
.vectorK(10)
.filters(filter) // only electronics priced >= 100 are ranked
.build();
const results = await client.search('products', query);
The filter uses the same Query Expression syntax as find, including Logical And / Or / Not combinations. The pre-filter is uniform across every search mode: full-text search, brute-force vector search, the indexed vector paths, and hybrid search. It is always applied to the candidate set before ranking, so results are never silently truncated by an index window. In hybrid search it governs the entire result, so a record that matches the text query but fails the filter is excluded, not surfaced on its text score alone.
Indexed vector search with a filter is exact. A Flat index ranks only matching records directly. An HNSW index runs a fast approximate filtered traversal, but if a selective filter starves that traversal of candidates, ekoDB automatically falls back to an exact scan, so you always get the true matches, never a silently truncated set. The only cost of a highly selective filter is a little extra latency on that fallback.
The same .filters(...) works on a pure text search too:
// Full-text search restricted to one category
const query = new SearchQueryBuilder("introduction")
.fields(["title", "content"])
.filters(new QueryBuilder().eq("category", "ml").build().filter)
.build();
const results = await client.search("documents", query);
// Hybrid search constrained to a single tenant
const query = new SearchQueryBuilder('annual report')
.fields(['title', 'body'])
.vector(queryVector)
.textWeight(0.3)
.vectorWeight(0.7)
.filters(new QueryBuilder().eq('tenant_id', tenantId).build().filter)
.build();
const results = await client.search('documents', query);
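The reason pre-filtering matters is easiest to see next to the alternative. The sketch below compares filtering before ranking with filtering after a top-k cut, using a toy in-memory catalog. The data, the `score` field (a stand-in for a precomputed similarity), and the function names are all illustrative assumptions.

```typescript
type Product = { id: string; category: string; price: number; score: number };

// `score` stands in for a precomputed similarity to the query vector.
const catalog: Product[] = [
  { id: "a", category: "toys", price: 20, score: 0.99 },
  { id: "b", category: "toys", price: 35, score: 0.95 },
  { id: "c", category: "electronics", price: 150, score: 0.8 },
  { id: "d", category: "electronics", price: 300, score: 0.6 },
];

const matches = (p: Product) => p.category === "electronics" && p.price >= 100;
const byScore = (x: Product, y: Product) => y.score - x.score;

// Pre-filter: restrict candidates first, then rank and take k.
function preFilterTopK(items: Product[], k: number): string[] {
  return items.filter(matches).sort(byScore).slice(0, k).map((p) => p.id);
}

// Post-filter: rank everything, take k, then filter. Matches that fall
// outside the top-k window are lost.
function postFilterTopK(items: Product[], k: number): string[] {
  return [...items].sort(byScore).slice(0, k).filter(matches).map((p) => p.id);
}

console.log(preFilterTopK(catalog, 2));  // [ 'c', 'd' ]
console.log(postFilterTopK(catalog, 2)); // []
```

With post-filtering, the two highest-scoring records are toys, so the filter leaves nothing; pre-filtering returns both matching electronics.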
Real-World Examples
Semantic Search
- Client Libraries (Recommended)
- Direct API
import OpenAI from 'openai';
import { SearchQueryBuilder } from '@ekodb/ekodb-client';
// Generate embeddings with OpenAI; `client` is the EkoDBClient initialized as in Quick Start
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async function semanticSearch(text: string) {
// 1. Get query embedding
const response = await openai.embeddings.create({
model: 'text-embedding-ada-002',
input: text,
});
const queryVector = response.data[0].embedding;
// 2. Search by similarity (named searchQuery so it does not clash with the function parameter)
const searchQuery = new SearchQueryBuilder('')
.vector(queryVector)
.vectorMetric('cosine')
.vectorK(5)
.build();
const results = await client.search('articles', searchQuery);
return results;
}
// Example: find articles about optimizing database queries
const articles = await semanticSearch('database optimization tips');
# 1. Get embedding from OpenAI (or your embedding service)
QUERY_VECTOR=$(curl https://api.openai.com/v1/embeddings \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "text-embedding-ada-002", "input": "database optimization tips"}' \
| jq '.data[0].embedding')
# 2. Search ekoDB
curl -X POST https://{EKODB_API_URL}/api/search/articles \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d "{
\"query\": \"\",
\"vector\": $QUERY_VECTOR,
\"vector_metric\": \"cosine\",
\"vector_k\": 5
}"
Product Recommendations
async function getSimilarProducts(productId: string) {
// 1. Get product's embedding
const product = await client.findById('products', productId);
const embedding = product.embedding;
// 2. Find similar products
const query = new SearchQueryBuilder('')
.vector(embedding)
.vectorMetric('cosine')
.vectorK(10)
.build();
const similar = await client.search('products', query);
// Filter out the original product
return similar.filter(p => p.id !== productId);
}
Image Similarity
// Store image embeddings from CLIP or similar model
async function findSimilarImages(imageEmbedding: number[], category?: string) {
const query = new SearchQueryBuilder('')
.vector(imageEmbedding)
.vectorMetric('cosine')
.vectorK(20)
.vectorThreshold(0.7)
.build();
const results = await client.search('images', query);
return results.map(r => ({
url: r.url,
similarity: r.score,
metadata: r.metadata
}));
}
Deletion & Index Maintenance
Automatic Deletion from Search
When you delete a record that contains a vector field, ekoDB immediately removes it from vector search results. There's no need to manually update the index — deleted records won't appear in searches.
// Insert a record with a vector
await client.insert('products', {
name: 'Discontinued Widget',
embedding: [0.12, 0.34, 0.56 /* ... */]
});
// Delete it — immediately excluded from vector search results
await client.delete('products', recordId);
Reindexing
Over time, frequent deletions can degrade search performance, because a deleted vector is marked rather than removed from the graph. After heavy delete churn, reindex to rebuild the search graph and restore optimal performance. Reindexing is manual (call the reindex endpoint or your client's reindex method); ekoDB does not auto-trigger it.
# Reindex a collection's vector index
curl -X POST https://{EKODB_API_URL}/api/indexes/search/{collection}/reindex \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{}'
# Optionally specify which vector field to reindex
curl -X POST https://{EKODB_API_URL}/api/indexes/search/{collection}/reindex \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{"field": "embedding"}'
Response:
{
"status": "ok",
"collection": "products",
"field": "embedding",
"vectors_reindexed": 4850,
"duration_ms": 127.5
}
Reindexing is only needed after heavy deletion workloads. For most applications with occasional deletes, the index stays performant without manual intervention.
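The mark-then-rebuild behaviour can be pictured with a toy index in which a delete only records a tombstone. This is a simplified sketch of the general technique, not ekoDB's implementation; the class and method names are hypothetical.

```typescript
// Toy index: deletes only mark an id, so searches must skip marked entries
// until a rebuild drops them.
class TombstoneIndex {
  private entries: { id: string; vector: number[] }[] = [];
  private deleted = new Set<string>();

  insert(id: string, vector: number[]): void {
    this.entries.push({ id, vector });
  }

  // Mark as deleted; the entry still occupies space in the structure.
  delete(id: string): void {
    this.deleted.add(id);
  }

  // Live ids only: deleted records never appear in results.
  liveIds(): string[] {
    return this.entries.filter((e) => !this.deleted.has(e.id)).map((e) => e.id);
  }

  // Entries a search has to step over, a rough proxy for wasted work.
  staleCount(): number {
    return this.entries.filter((e) => this.deleted.has(e.id)).length;
  }

  // Rebuild from live entries and clear the marks.
  reindex(): number {
    this.entries = this.entries.filter((e) => !this.deleted.has(e.id));
    this.deleted.clear();
    return this.entries.length;
  }
}

const index = new TombstoneIndex();
index.insert("p1", [0.1, 0.2]);
index.insert("p2", [0.3, 0.4]);
index.delete("p1");
console.log(index.liveIds(), index.staleCount()); // [ 'p2' ] 1
console.log(index.reindex(), index.staleCount()); // 1 0
```

Results are correct both before and after the rebuild; what changes is how many stale entries each search has to skip.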
Search Tuning
ef_search (Beam Width)
The ef_search parameter controls the search beam width — higher values explore more of the graph, improving accuracy at the cost of latency. ekoDB resolves ef_search with a 3-tier fallback:
- Per-query override: pass ef_search in the search request
- Collection-level config: set in the vector index configuration
- Heuristic default: max(k * 2, 64)
Both vector_k and ef_search are clamped to server caps (max_vector_k, default 1000; max_ef_search, default 4000) so a single query cannot exhaust memory or CPU. Raise them via PUT /api/config. See Configuration — Search.
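The resolution order and the caps described above can be sketched as a small pure function. The function and parameter names are hypothetical, the cap defaults mirror max_vector_k and max_ef_search, and applying the heuristic to the already-clamped k is an assumption made for the example.

```typescript
// Sketch of the three-tier fallback plus server caps.
// Assumption: the heuristic default is computed from the clamped k.
function resolveEfSearch(
  k: number,
  perQuery?: number,
  collectionDefault?: number,
  maxVectorK = 1000,
  maxEfSearch = 4000
): { k: number; efSearch: number } {
  const clampedK = Math.min(k, maxVectorK);
  // Per-query override, then collection config, then heuristic default.
  const requested = perQuery ?? collectionDefault ?? Math.max(clampedK * 2, 64);
  return { k: clampedK, efSearch: Math.min(requested, maxEfSearch) };
}

console.log(resolveEfSearch(10));                 // { k: 10, efSearch: 64 }
console.log(resolveEfSearch(10, 200));            // { k: 10, efSearch: 200 }
console.log(resolveEfSearch(10, undefined, 128)); // { k: 10, efSearch: 128 }
console.log(resolveEfSearch(5000, 9000));         // { k: 1000, efSearch: 4000 }
```

The last call shows both caps taking effect: an oversized k and an oversized override are each reduced to the server limit.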
// The builder covers the common parameters and does not set ef_search,
// so this query uses the collection-level or heuristic value. For a per-query override, use the Direct API (below).
const query = new SearchQueryBuilder('')
.vector(queryVector)
.vectorMetric('cosine')
.vectorK(10)
.build();
const results = await client.search('articles', query);
# Direct API with ef_search override
curl -X POST https://{EKODB_API_URL}/api/search/{collection} \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"query": "",
"vector": [0.12, 0.34, 0.56, ...],
"vector_metric": "cosine",
"vector_k": 10,
"ef_search": 200
}'
| ef_search | Accuracy | Latency | Use Case |
|---|---|---|---|
| 32-64 | Good | Low | Real-time recommendations |
| 64-128 | High | Medium | Semantic search (default range) |
| 200+ | Very High | Higher | Precision-critical applications |
Performance Optimization
1. Use Vector Indexes
Define vector indexes in your schema for automatic indexing:
// Define index in schema for automatic indexing
const schema = {
fields: {
embedding: {
field_type: 'Vector',
index: {
type: 'vector',
algorithm: 'flat', // exact search
metric: 'cosine',
}
}
}
};
2. Set Similarity Threshold
Filter out low-relevance results:
const query = new SearchQueryBuilder('')
.vector(queryVector)
.vectorMetric('cosine')
.vectorK(10)
.vectorThreshold(0.75) // only return results with > 75% similarity
.build();
const results = await client.search('articles', query);
3. Batch Insert Vectors
// More efficient than individual inserts
await client.batchInsert('articles', articlesWithEmbeddings);
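For large imports it can be practical to split the records into fixed-size batches rather than sending one very large request. The sketch below shows a generic chunking helper. The helper name `chunk` and the batch size of 500 in the usage comment are illustrative assumptions, not ekoDB limits.

```typescript
// Split records into fixed-size batches.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Usage sketch (commented out because it needs a live client):
// for (const batch of chunk(articlesWithEmbeddings, 500)) {
//   await client.batchInsert('articles', batch);
// }
const batches = chunk([1, 2, 3, 4, 5], 2);
console.log(batches); // [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```

Sending batches sequentially also gives a natural place to retry or log progress if one request fails.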
4. Use Field Projection
Return only needed fields to reduce data transfer:
const query = new SearchQueryBuilder('')
.vector(queryVector)
.vectorMetric('cosine')
.vectorK(10)
.selectFields(['title', 'price', 'image_url']) // only these fields
.build();
const results = await client.search('products', query);
Embedding Models
ekoDB works with any embedding model. Popular choices:
OpenAI
import OpenAI from 'openai';
const openai = new OpenAI();
const response = await openai.embeddings.create({
model: 'text-embedding-ada-002', // 1536 dimensions
input: 'Your text here',
});
const embedding = response.data[0].embedding;
Cohere
import { CohereClient } from 'cohere-ai';
const cohere = new CohereClient({ token: process.env.COHERE_API_KEY });
const response = await cohere.embed({
texts: ['Your text here'],
model: 'embed-english-v3.0', // 1024 dimensions
});
const embedding = response.embeddings[0];
Local Models (Transformers.js)
import { pipeline } from '@xenova/transformers';
const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const output = await extractor('Your text here', { pooling: 'mean', normalize: true });
const embedding = Array.from(output.data); // 384 dimensions
API Reference
Vector, text, and hybrid search all go through client.search(collection, query), where query is built with SearchQueryBuilder. Which fields you set decides the search type.
Vector search
import { SearchQueryBuilder } from '@ekodb/ekodb-client';
const query = new SearchQueryBuilder('') // text query ('' for vector-only)
.vector(queryVector) // query vector (presence triggers vector search)
.vectorField('embedding') // field holding vectors (default: 'embedding')
.vectorMetric('cosine') // 'cosine' | 'euclidean' | 'dotproduct'
.vectorK(10) // number of nearest neighbors
.vectorThreshold(0.7) // minimum similarity score (0.0-1.0)
.filters(metadataFilter) // optional metadata pre-filter (QueryExpression)
.selectFields(['title', 'price']) // field projection (or .excludeFields([...]))
.build();
const results = await client.search('products', query);
Hybrid search
Set a text query plus a vector and weight them:
const query = new SearchQueryBuilder('database performance')
.fields(['title', 'content'])
.vector(queryVector)
.textWeight(0.3)
.vectorWeight(0.7)
.limit(10)
.build();
const results = await client.search('articles', query);
ef_search and other less-common parameters can be sent directly in the body of POST /api/search/{collection} (see the Direct API examples above).
Best Practices
- Match Dimensions: Keep every vector in a field the same length as your embedding model's output. ekoDB locks the dimension to the first inserted vector and rejects mismatched lengths
- Use Schemas: Declare the field as Vector and attach a vector index for automatic similarity indexing
- Choose Right Metric: Cosine for most AI embeddings (OpenAI, Cohere, etc.)
- Batch Operations: Use batchInsert for multiple vectors
- Set Thresholds: Filter low-relevance results with the threshold parameter
- Tune ef_search: Raise it for precision-critical queries, lower it for latency-sensitive ones
- Field Projection: Only return fields you need
- Monitor Performance: Track query latency and optimize with vector indexes as needed
Related Documentation
- Indexes - Create indexes for optimal performance
- Query Expressions - Filter syntax
- Client Libraries - Full API examples
- System Administration - Monitor performance
Summary
Vector search in ekoDB enables:
- ✅ Semantic search - Find by meaning, not just keywords
- ✅ Recommendations - Product, content, and user similarity
- ✅ Image search - Visual similarity matching
- ✅ RAG systems - Retrieval-augmented generation
- ✅ Integrated - No separate vector database needed
- ✅ Production-ready - Competitive performance and reliability