Indexes
Indexes dramatically improve query and search performance by creating optimized data structures for fast lookups. ekoDB supports two types of indexes: query indexes for exact matches and filters, and search indexes for full-text and vector similarity search.
Consider adding an index when you:
- Query often on specific fields (e.g., status, email, user_id)
- Search text content or vector embeddings
- See performance issues with large collections (> 10,000 records)
- Use the same filter combinations in many queries
Query Indexes
Query indexes optimize exact match and comparison queries using B-tree or hash structures.
Create Query Index
Create an index on one or more fields to speed up queries.
POST https://{EKODB_API_URL}/api/indexes/query/{collection}
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"field": "email",
"index_type": "BTree",
"label": "email_idx"
}
# Response
{
"status": "success",
"message": "Index 'email_idx' created successfully",
"index": {
"label": "email_idx",
"collection": "users",
"field": "email",
"index_type": "BTree",
"unique": false,
"sparse": false
}
}
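As a rough sketch of how the request above might be assembled from a script, the Python below builds (but does not send) the same POST using only the standard library. The helper name `build_create_index_request`, the example base URL, and the token value are hypothetical placeholders; the URL path and body field names are taken from the example above, and placing `unique` and `sparse` at the top level of the body is an assumption.

```python
import json
import urllib.request


def build_create_index_request(base_url, token, collection, field,
                               index_type="BTree", label=None,
                               unique=False, sparse=False):
    """Build (without sending) a POST request that creates a query index.

    Hypothetical helper: only the URL path and the body field names come
    from the documentation above.
    """
    body = {"field": field, "index_type": index_type}
    if label is not None:
        body["label"] = label
    # Assumption: optional flags sit at the top level of the JSON body.
    if unique:
        body["unique"] = True
    if sparse:
        body["sparse"] = True
    return urllib.request.Request(
        url=f"{base_url}/api/indexes/query/{collection}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder URL and token, for illustration only.
req = build_create_index_request(
    "https://ekodb.example.com", "ADMIN_TOKEN", "users", "email",
    label="email_idx",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without a server.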
Optional Parameters:
- unique (boolean) - Enforce uniqueness constraint (default: false)
- sparse (boolean) - Index only documents with the field (default: false)
Composite Index (Multiple Fields):
{
"fields": ["status", "created_at"],
"index_type": "btree"
}
List Query Indexes
Get all query indexes for a collection.
GET https://{EKODB_API_URL}/api/indexes/query/{collection}
Authorization: Bearer {ADMIN_TOKEN}
# Response
{
"collection": "users",
"indexes": [
{
"label": "users_email_index",
"collection": "users",
"field": "email",
"index_type": "BTree",
"unique": false,
"sparse": false
},
{
"label": "users_status_index",
"collection": "users",
"field": "status",
"index_type": "Hash",
"unique": false,
"sparse": false
}
],
"count": 2
}
Delete Query Index
Remove an index when it's no longer needed.
DELETE https://{EKODB_API_URL}/api/indexes/query/{collection}/{field}
Authorization: Bearer {ADMIN_TOKEN}
# Response
{
"status": "success",
"message": "Index on field 'email' deleted successfully",
"index": null
}
Explain Query Execution
Analyze how a query will be executed and whether indexes are used.
POST https://{EKODB_API_URL}/api/query/{collection}/explain
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"filter": {
"type": "expression",
"content": {
"field": "email",
"operator": "Equals",
"value": "user@example.com"
}
}
}
# Response
{
"query": {
"filter": {
"type": "expression",
"content": {
"field": "email",
"operator": "Equals",
"value": "user@example.com"
}
}
},
"execution_plan": {
"scan_type": "index_scan",
"indexes_used": [
{
"field": "email",
"index_type": "BTree",
"unique": false,
"sparse": false
}
],
"estimated_rows": 100,
"filter_selectivity": 0.01
},
"estimated_cost": 5.5,
"recommendations": [
"Index on 'email' field is optimal for this query",
"Consider creating a unique index if email values are unique"
]
}
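One way to use the explain output in a script or CI check is to assert that a given field is served by an index scan. The sketch below assumes the response shape shown in the sample above; the helper name `uses_index` is hypothetical.

```python
def uses_index(explain_response, field):
    """Return True if the explain output reports an index scan on `field`."""
    plan = explain_response.get("execution_plan", {})
    if plan.get("scan_type") != "index_scan":
        return False
    return any(ix.get("field") == field
               for ix in plan.get("indexes_used", []))


# Trimmed copy of the sample response above.
sample = {
    "execution_plan": {
        "scan_type": "index_scan",
        "indexes_used": [
            {"field": "email", "index_type": "BTree",
             "unique": False, "sparse": False}
        ],
        "estimated_rows": 100,
        "filter_selectivity": 0.01,
    },
    "estimated_cost": 5.5,
}

print(uses_index(sample, "email"))   # True
print(uses_index(sample, "status"))  # False
```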
Search Indexes
Search indexes enable full-text search and vector similarity search.
Create Search Index
Create a search index for text or vector fields.
Text Search Index:
POST https://{EKODB_API_URL}/api/indexes/search/{collection}
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"field": "description",
"index_type": "text",
"options": {
"language": "english",
"stemming": true,
"stop_words": true,
"min_word_length": 3
}
}
# Response
{
"status": "success",
"message": "Search index created successfully on field 'description'",
"field": "description",
"index_type": "text",
"documents_indexed": 5678
}
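To give a feel for what the `stop_words` and `min_word_length` options typically do, here is a minimal analyzer sketch. It is not ekoDB's actual analyzer: the stop-word list is a tiny illustrative sample, stemming is left out, and the function name `analyze` is hypothetical.

```python
import re

# Tiny illustrative stop-word list; a real analyzer ships a much larger one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def analyze(text, stop_words=True, min_word_length=3):
    """Lowercase, split on non-alphanumerics, then apply the two filters."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return [t for t in tokens if len(t) >= min_word_length]


print(analyze("An introduction to machine learning in Go"))
# ['introduction', 'machine', 'learning']
```

Note that with a minimum length of 3, short but meaningful tokens such as "Go" are dropped, which is worth keeping in mind when choosing that setting.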
Vector Search Index:
POST https://{EKODB_API_URL}/api/indexes/search/{collection}
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"field": "embedding",
"index_type": "vector",
"options": {
"dimension": 1536,
"metric": "cosine",
"algorithm": "hnsw",
"ef_construction": 200,
"m": 16
}
}
# Response
{
"status": "success",
"message": "Search index created successfully on field 'embedding'",
"field": "embedding",
"index_type": "vector",
"documents_indexed": 10000
}
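Because the index is created with a fixed `dimension`, it can help to check stored vectors against it before indexing. The sketch below is a client-side check under assumptions: how ekoDB treats mismatched vectors is not stated here, and the helper name and the `id` key on each record are hypothetical.

```python
def find_dimension_mismatches(records, field="embedding", dimension=1536):
    """Return ids of records whose vector is missing or has the wrong length."""
    bad = []
    for record in records:
        vector = record.get(field)
        if not isinstance(vector, list) or len(vector) != dimension:
            bad.append(record.get("id"))
    return bad


# Small dimension used here only to keep the example readable.
docs = [
    {"id": "a", "embedding": [0.1, 0.2, 0.3]},
    {"id": "b", "embedding": [0.1, 0.2]},
    {"id": "c"},
]
print(find_dimension_mismatches(docs, dimension=3))  # ['b', 'c']
```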
Explain Text Search
Analyze text search query execution.
POST https://{EKODB_API_URL}/api/search/text/{collection}/explain
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"query": "machine learning algorithms",
"field": "description",
"limit": 10
}
# Response
{
"query": "machine learning algorithms",
"parsed_terms": ["machine", "learning", "algorithm"],
"stemmed_terms": ["machin", "learn", "algorithm"],
"execution_plan": {
"index_used": "description_text",
"search_type": "full_text",
"estimated_matches": 156,
"scoring_method": "bm25"
},
"performance": {
"estimated_time_ms": 15,
"index_hit_rate": "high"
}
}
Explain Vector Search
Analyze vector similarity search execution.
POST https://{EKODB_API_URL}/api/search/vector/{collection}/explain
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"vector": [0.1, 0.2, 0.3, ...],
"field": "embedding",
"limit": 10,
"metric": "cosine"
}
# Response
{
"execution_plan": {
"index_used": "embedding_vector",
"algorithm": "hnsw",
"dimension": 1536,
"metric": "cosine",
"estimated_comparisons": 234,
"search_type": "approximate"
},
"performance": {
"estimated_time_ms": 8,
"accuracy": "high",
"speedup_vs_brute_force": "427x"
},
"index_stats": {
"total_vectors": 10000,
"ef_search": 100,
"levels": 4
}
}
Explain Hybrid Search
Analyze combined text and vector search.
POST https://{EKODB_API_URL}/api/search/hybrid/{collection}/explain
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"text_query": "neural networks",
"vector": [0.1, 0.2, ...],
"text_weight": 0.3,
"vector_weight": 0.7,
"limit": 10
}
# Response
{
"execution_plan": {
"text_search": {
"index_used": "description_text",
"estimated_matches": 89,
"weight": 0.3
},
"vector_search": {
"index_used": "embedding_vector",
"estimated_matches": 10000,
"weight": 0.7
},
"fusion_method": "reciprocal_rank"
},
"performance": {
"estimated_time_ms": 23,
"combined_accuracy": "very_high"
}
}
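The `fusion_method` of `reciprocal_rank` in the response above suggests some form of reciprocal rank fusion. The exact formula ekoDB uses is not given here, so the following is a sketch of one common weighted variant, where each document scores `weight / (k + rank)` per result list. The function name and the constant `k = 60` (a value often used for this technique) are assumptions.

```python
def weighted_rrf(text_ranking, vector_ranking,
                 text_weight=0.3, vector_weight=0.7, k=60):
    """Fuse two ranked id lists using weighted reciprocal rank scores."""
    scores = {}
    for weight, ranking in ((text_weight, text_ranking),
                            (vector_weight, vector_ranking)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda d: scores[d], reverse=True)


text_hits = ["a", "b", "c"]      # best text match first
vector_hits = ["c", "a", "d"]    # nearest vector first
print(weighted_rrf(text_hits, vector_hits))  # ['c', 'a', 'd', 'b']
```

With the vector list weighted at 0.7, document `c` (first in the vector results) edges out `a` (first in the text results), and `d` outranks `b` even though each appears in only one list.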
Index Types
Query Index Types
| Type | Use Case | Performance | Space Usage |
|---|---|---|---|
| btree | Exact matches, range queries | Very fast lookups | Moderate |
| hash | Exact matches only | Fastest lookups | Low |
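The difference between the two rows can be illustrated with plain Python data structures: a dict stands in for a hash index, and a sorted list searched with `bisect` stands in for a B-tree. This is a conceptual sketch, not how ekoDB stores its indexes; the sample records and function names are made up for the example.

```python
import bisect

records = [
    {"id": 1, "age": 31}, {"id": 2, "age": 25},
    {"id": 3, "age": 42}, {"id": 4, "age": 25},
]

# Hash-style index: value -> ids. Supports exact matches only.
hash_index = {}
for r in records:
    hash_index.setdefault(r["age"], []).append(r["id"])

# Ordered index (stand-in for a B-tree): sorted (value, id) pairs.
ordered_index = sorted((r["age"], r["id"]) for r in records)


def exact(value):
    """Ids whose age equals `value`, via a single hash lookup."""
    return hash_index.get(value, [])


def in_range(low, high):
    """Ids with low <= age <= high, via binary search on the sorted keys."""
    start = bisect.bisect_left(ordered_index, (low, float("-inf")))
    end = bisect.bisect_right(ordered_index, (high, float("inf")))
    return [rid for _, rid in ordered_index[start:end]]


print(exact(25))         # [2, 4]
print(in_range(25, 31))  # [2, 4, 1]
```

The dict has no notion of key order, so answering the range query with it would require scanning every key, which is why range filters call for the ordered structure.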
Search Index Types
| Type | Use Case | Performance | Space Usage |
|---|---|---|---|
| text | Full-text search | Fast | Moderate |
| vector | Semantic/similarity search | Very fast | High |
| hybrid | Combined text + vector search | Fast | High |
Vector Index Algorithms
HNSW (Hierarchical Navigable Small World)
Best for most use cases - balances speed and accuracy.
{
"algorithm": "hnsw",
"options": {
"ef_construction": 200,
"m": 16,
"ef_search": 100
}
}
Parameters:
- ef_construction - Quality during build (higher = better, slower)
- m - Connections per layer (higher = better recall, more space)
- ef_search - Search quality (higher = better, slower)
IVF (Inverted File)
Best for very large datasets (> 1M vectors).
{
"algorithm": "ivf",
"options": {
"nlist": 100,
"nprobe": 10
}
}
Parameters:
- nlist - Number of clusters
- nprobe - Clusters to search (higher = better, slower)
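A back-of-the-envelope way to reason about `nlist` and `nprobe`: if vectors are spread evenly across clusters, each query compares against roughly `nprobe / nlist` of the collection (plus `nlist` centroid comparisons to pick the clusters). The sketch below works that out for the settings above; the even-cluster assumption and the function name are simplifications for illustration.

```python
def ivf_scan_estimate(total_vectors, nlist, nprobe):
    """Estimate vectors compared per query, assuming evenly sized clusters."""
    fraction = nprobe / nlist
    return fraction, round(total_vectors * fraction)


fraction, scanned = ivf_scan_estimate(1_000_000, nlist=100, nprobe=10)
print(f"{fraction:.0%} of clusters, ~{scanned:,} vectors per query")
# 10% of clusters, ~100,000 vectors per query
```

Doubling `nprobe` doubles the estimated work per query, which is the speed side of the "higher = better, slower" trade-off.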
Distance Metrics
Cosine Similarity
Best for: Text embeddings, normalized vectors
{
"metric": "cosine"
}
Range: -1 to 1 (1 = same direction, -1 = opposite direction)
Euclidean Distance (L2)
Best for: Spatial data, image embeddings
{
"metric": "euclidean"
}
Range: 0 to ∞ (0 = identical)
Dot Product
Best for: Pre-normalized embeddings, fast comparisons
{
"metric": "dot_product"
}
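For reference, the three metrics can be written out in a few lines of plain Python. This is a minimal sketch of the standard definitions, not ekoDB's implementation; it also shows why dot product suits pre-normalized embeddings, since for unit-length vectors it gives the same value as cosine similarity.

```python
import math


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


a = [1.0, 0.0]
b = [0.0, 1.0]
c = [-1.0, 0.0]

print(cosine(a, a), cosine(a, b), cosine(a, c))  # 1.0 0.0 -1.0
print(euclidean(a, a))                           # 0.0

# Scaling a vector changes the dot product but not the cosine similarity.
scaled = [3.0, 0.0]
print(dot_product(a, scaled), cosine(a, scaled))  # 3.0 1.0
```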
Complete Example
Create a high-performance search system:
#!/bin/bash
# 1. Create collection for articles
curl -X POST https://{EKODB_API_URL}/api/collections/articles \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"fields": {
"title": {"type": "string", "required": true},
"content": {"type": "string", "required": true},
"author_id": {"type": "string", "required": true},
"status": {"type": "string", "enum": ["draft", "published"]},
"embedding": {"type": "array"}
}
}'
# 2. Create query index for common filters
curl -X POST https://{EKODB_API_URL}/api/indexes/query/articles \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"field": "status",
"index_type": "btree"
}'
# 3. Create composite index for author + status
curl -X POST https://{EKODB_API_URL}/api/indexes/query/articles \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"fields": ["author_id", "status"],
"index_type": "btree"
}'
# 4. Create text search index
curl -X POST https://{EKODB_API_URL}/api/indexes/search/articles \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"field": "content",
"index_type": "text",
"options": {
"language": "english",
"stemming": true,
"stop_words": true
}
}'
# 5. Create vector index for semantic search
curl -X POST https://{EKODB_API_URL}/api/indexes/search/articles \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"field": "embedding",
"index_type": "vector",
"options": {
"dimension": 1536,
"metric": "cosine",
"algorithm": "hnsw",
"ef_construction": 200,
"m": 16
}
}'
# 6. Explain query to verify index usage
curl -X POST https://{EKODB_API_URL}/api/query/articles/explain \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"filter": {
"type": "and",
"content": [
{
"type": "expression",
"content": {
"field": "status",
"operator": "Equals",
"value": "published"
}
},
{
"type": "expression",
"content": {
"field": "author_id",
"operator": "Equals",
"value": "author_123"
}
}
]
}
}'
# 7. Test hybrid search explain
curl -X POST https://{EKODB_API_URL}/api/search/hybrid/articles/explain \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"text_query": "machine learning",
"vector": [0.1, 0.2, ...],
"text_weight": 0.4,
"vector_weight": 0.6,
"limit": 10
}'
Best Practices
Index Selection
Create Indexes For:
- ✅ Fields used in WHERE clauses frequently
- ✅ Foreign keys and join fields
- ✅ Fields used for sorting
- ✅ Text fields for search
- ✅ Vector embeddings for similarity search
Avoid Indexing:
- ❌ Low-cardinality fields (e.g., booleans)
- ❌ Fields that change frequently
- ❌ Very large text fields (use search index instead)
- ❌ Fields never used in queries
Composite Indexes
Order Matters:
# Good: Index fields in order of selectivity
# (status has few values, created_at is more selective)
fields: ["created_at", "status"]
# Less optimal
fields: ["status", "created_at"]
Use for Common Query Combinations:
# If you often query: WHERE author_id = X AND status = Y
# Create composite index:
fields: ["author_id", "status"]
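Many B-tree-based engines can only use a composite index through its leading fields (the "leftmost prefix" rule). Whether ekoDB's planner behaves this way is an assumption here, so treat the sketch below as a way to reason about field order and confirm with the explain endpoint. The function name is hypothetical.

```python
def usable_prefix(index_fields, equality_fields):
    """Return the leading index fields covered by the filter's equality fields."""
    prefix = []
    for field in index_fields:
        if field not in equality_fields:
            break  # a gap ends the usable prefix
        prefix.append(field)
    return prefix


index = ["author_id", "status"]
print(usable_prefix(index, {"author_id", "status"}))  # ['author_id', 'status']
print(usable_prefix(index, {"author_id"}))            # ['author_id']
print(usable_prefix(index, {"status"}))               # []
```

Under that rule, a query filtering only on `status` would get no help from this index, which is one reason to keep a separate single-field index for it.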
Monitor Index Usage
# Check which indexes exist
GET /api/indexes/query/articles
# Response shows all indexes
{
"collection": "articles",
"indexes": [
{
"label": "articles_email_index",
"collection": "articles",
"field": "email",
"index_type": "BTree",
"unique": true,
"sparse": false
},
{
"label": "articles_old_field_index",
"collection": "articles",
"field": "old_field",
"index_type": "Hash",
"unique": false,
"sparse": false
}
],
"count": 2
}
# Delete unused indexes to save space
DELETE /api/indexes/query/articles/old_field
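The list response only shows which indexes exist, not how often they are used, so spotting unused ones means comparing it against the fields your queries actually filter on. The sketch below assumes you maintain that set yourself (for example, from explain output); `queried_fields` and the function name are hypothetical.

```python
def unused_index_fields(list_response, queried_fields):
    """Return indexed fields that never appear in the application's filters."""
    return [ix["field"] for ix in list_response["indexes"]
            if ix["field"] not in queried_fields]


# Trimmed copy of the response above.
response = {
    "collection": "articles",
    "indexes": [
        {"label": "articles_email_index", "field": "email"},
        {"label": "articles_old_field_index", "field": "old_field"},
    ],
    "count": 2,
}

# Hypothetical: fields the application is known to filter on.
queried_fields = {"email", "status"}
print(unused_index_fields(response, queried_fields))  # ['old_field']
```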
Vector Index Tuning
For Accuracy:
{
"algorithm": "hnsw",
"ef_construction": 400,
"m": 32,
"ef_search": 200
}
For Speed:
{
"algorithm": "hnsw",
"ef_construction": 100,
"m": 16,
"ef_search": 50
}
For Balance:
{
"algorithm": "hnsw",
"ef_construction": 200,
"m": 16,
"ef_search": 100
}
Index Maintenance
Rebuild Indexes After:
- Large bulk imports
- Schema changes
- Performance degradation
- Significant data updates
# Delete and recreate index
DELETE /api/indexes/query/articles/email
POST /api/indexes/query/articles
{
"field": "email",
"index_type": "btree"
}
Performance Impact
Query Performance
| Records | No Index | With Index | Speedup |
|---|---|---|---|
| 1,000 | ~5ms | ~1ms | 5x |
| 10,000 | ~50ms | ~2ms | 25x |
| 100,000 | ~500ms | ~3ms | ~167x |
| 1M | ~5,000ms | ~5ms | 1,000x |
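The speedup column is simply the unindexed time divided by the indexed time. The sketch below recomputes it from the table and, as an idealized model rather than a measurement, contrasts a full scan (touching every record) with the roughly log2(n) comparisons a binary search over sorted keys needs. The function names are made up for the example.

```python
import math


def speedup(no_index_ms, with_index_ms):
    return no_index_ms / with_index_ms


def binary_search_steps(n):
    """Comparisons needed to locate one key among n sorted keys."""
    return math.ceil(math.log2(n))


# (records, no-index ms, with-index ms) taken from the table above.
rows = [(1_000, 5, 1), (10_000, 50, 2), (100_000, 500, 3), (1_000_000, 5_000, 5)]

for n, slow, fast in rows:
    print(f"{n:>9,} records: {speedup(slow, fast):7.1f}x speedup, "
          f"scan touches {n:,} records vs ~{binary_search_steps(n)} comparisons")
```

Going from 1,000 to 1M records multiplies the scan work by 1,000 but only doubles the comparison count (about 10 to about 20), which matches the slow growth of the "With Index" column.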
Space Usage
| Index Type | Space Overhead | Example (1M records) |
|---|---|---|
| Query (btree) | 10-20% | ~50-100 MB |
| Text search | 20-40% | ~100-200 MB |
| Vector (HNSW) | 50-100% | ~500MB-1GB |
Related Documentation
- Basic Operations - Query and search records
- Query Expressions - Filter syntax for search queries
- Collections & Schemas - Create collections
- System Administration - Monitor performance
Example Code
Complete working examples for search and schema management:
- Rust Search - client_search.rs
- Python Search - client_search.py
- TypeScript Search - client_search.ts
- Go Search - client_search.go
- Kotlin Search - ClientSearch.kt
Schema management examples:
- Rust Schema - client_schema_management.rs
- Python Schema - client_schema.py
- TypeScript Schema - client_schema.ts
- Go Schema - client_schema.go
- Kotlin Schema - ClientSchemaManagement.kt