Indexes

Indexes dramatically improve query and search performance by creating optimized data structures for fast lookups. ekoDB supports two types of indexes: query indexes for exact matches and filters, and search indexes for full-text and vector similarity search.

When to Use Indexes
  • You often query on specific fields (e.g., status, email, user_id)
  • You search text content or vector embeddings
  • You see performance issues with large collections (> 10,000 records)
  • Your queries commonly combine the same filters

Query Indexes

Query indexes optimize exact-match and comparison queries using B-tree or hash structures.

Create Query Index

Define an index in the collection schema.

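use ekodb_client::{Client, Schema, FieldTypeSchema, IndexConfig};

// Declare a hash index on the "email" field as part of the schema.
let schema = Schema::new()
    .add_field(
        "email",
        FieldTypeSchema::new("string")
            .with_index(IndexConfig::Hash),
    );

// `client` is an already-initialized Client.
client.create_collection("users", schema).await?;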

Create an index on one or more fields to speed up queries.

POST https://{EKODB_API_URL}/api/indexes/query/{collection}
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}

{
  "field": "email",
  "index_type": "BTree",
  "label": "email_idx"
}

# Response
{
  "status": "success",
  "message": "Index 'email_idx' created successfully",
  "index": {
    "label": "email_idx",
    "collection": "users",
    "field": "email",
    "index_type": "BTree",
    "unique": false,
    "sparse": false
  }
}

Optional Parameters:

  • unique (boolean) - Enforce uniqueness constraint (default: false)
  • sparse (boolean) - Index only documents that contain the field (default: false)
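To make the two flags concrete, the sketch below models a single-field index in memory. It is an illustration only, not ekoDB's implementation: `FieldIndex` and its methods are hypothetical names, and treating a missing field as one shared key in a non-sparse index is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory model of a single-field index (not ekoDB internals).
struct FieldIndex {
    unique: bool,
    sparse: bool,
    /// Indexed value (None = document lacks the field) -> matching document ids.
    entries: BTreeMap<Option<String>, Vec<u32>>,
}

impl FieldIndex {
    fn new(unique: bool, sparse: bool) -> Self {
        FieldIndex { unique, sparse, entries: BTreeMap::new() }
    }

    /// Adds a document. Returns Ok(true) if it was indexed, Ok(false) if it
    /// was skipped, and Err if it would violate the uniqueness constraint.
    fn insert(&mut self, doc_id: u32, value: Option<&str>) -> Result<bool, String> {
        // A sparse index ignores documents that do not have the field.
        if value.is_none() && self.sparse {
            return Ok(false);
        }
        let ids = self.entries.entry(value.map(str::to_string)).or_default();
        // A unique index rejects a second document with the same value.
        if self.unique && !ids.is_empty() {
            return Err(format!("duplicate value for unique index: {:?}", value));
        }
        ids.push(doc_id);
        Ok(true)
    }
}
```

With `FieldIndex::new(true, true)`, inserting the same email twice returns an error on the second call, while a document without the field is skipped rather than indexed.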

Composite Index (Multiple Fields):

{
  "fields": ["status", "created_at"],
  "index_type": "btree"
}

List Query Indexes

Get all query indexes for a collection.

Client Library Support

Index listing is not yet available in client libraries. Use the Direct API:

GET https://{EKODB_API_URL}/api/indexes/query/{collection}
Authorization: Bearer {ADMIN_TOKEN}

Delete Query Index

Remove an index when it's no longer needed.

Client Library Support

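Index deletion is not yet available in client libraries. Use the Direct API:

DELETE https://{EKODB_API_URL}/api/indexes/query/{collection}/{field}
Authorization: Bearer {ADMIN_TOKEN}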

Explain Query Execution

Analyze how a query will be executed and whether indexes are used.

use ekodb_client::QueryBuilder;
use serde_json::json;

let explanation = client.explain_query(
    "users",
    json!({
        "filter": {
            "type": "Condition",
            "content": {
                "field": "email",
                "operator": "Eq",
                "value": "user@example.com"
            }
        }
    })
).await?;

println!("{:?}", explanation["execution_plan"]);

Search Indexes

Search indexes enable full-text search and vector similarity search.

Create Search Index

Create a search index for text or vector fields.

Text Search Index:

POST https://{EKODB_API_URL}/api/indexes/search/{collection}
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}

{
  "field": "description",
  "index_type": "text",
  "options": {
    "language": "english",
    "stemming": true,
    "stop_words": true,
    "min_word_length": 3
  }
}

# Response
{
  "status": "success",
  "message": "Search index created successfully on field 'description'",
  "field": "description",
  "index_type": "text",
  "documents_indexed": 5678
}
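As a rough picture of what the `stop_words` and `min_word_length` options control, the sketch below runs a tiny analysis pipeline: lowercase, split on non-alphanumeric characters, drop short words, drop stop words. The `analyze` function and the stop-word list passed to it are hypothetical, stemming is deliberately left out, and ekoDB's actual analyzer may differ.

```rust
/// Hypothetical text analyzer: returns the terms that would be indexed.
/// Stemming is omitted; this only illustrates stop words and minimum length.
fn analyze(text: &str, stop_words: &[&str], min_word_length: usize) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|word| !word.is_empty())
        .map(|word| word.to_lowercase())
        // Drop words shorter than the configured minimum length.
        .filter(|word| word.chars().count() >= min_word_length)
        // Drop common words that carry little search value.
        .filter(|word| !stop_words.contains(&word.as_str()))
        .collect()
}
```

For example, `analyze("The Machine learning of AI algorithms", &["the", "of", "and", "a"], 3)` keeps only `machine`, `learning`, and `algorithms`.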

Vector Search Index:

POST https://{EKODB_API_URL}/api/indexes/search/{collection}
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}

{
  "field": "embedding",
  "index_type": "vector",
  "options": {
    "dimension": 1536,
    "metric": "cosine",
    "algorithm": "hnsw",
    "ef_construction": 200,
    "m": 16
  }
}

# Response
{
  "status": "success",
  "message": "Search index created successfully on field 'embedding'",
  "field": "embedding",
  "index_type": "vector",
  "documents_indexed": 10000
}

Analyze text search query execution.

POST https://{EKODB_API_URL}/api/search/text/{collection}/explain
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}

{
  "query": "machine learning algorithms",
  "field": "description",
  "limit": 10
}

# Response
{
  "query": "machine learning algorithms",
  "parsed_terms": ["machine", "learning", "algorithm"],
  "stemmed_terms": ["machin", "learn", "algorithm"],
  "execution_plan": {
    "index_used": "description_text",
    "search_type": "full_text",
    "estimated_matches": 156,
    "scoring_method": "tf_idf"
  },
  "performance": {
    "estimated_time_ms": 15,
    "index_hit_rate": "high"
  }
}

Analyze vector similarity search execution.

POST https://{EKODB_API_URL}/api/search/vector/{collection}/explain
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}

{
  "vector": [0.1, 0.2, 0.3, ...],
  "field": "embedding",
  "limit": 10,
  "metric": "cosine"
}

# Response
{
  "execution_plan": {
    "index_used": "embedding_vector",
    "algorithm": "hnsw",
    "dimension": 1536,
    "metric": "cosine",
    "estimated_comparisons": 234,
    "search_type": "approximate"
  },
  "performance": {
    "estimated_time_ms": 8,
    "accuracy": "high",
    "speedup_vs_brute_force": "427x"
  },
  "index_stats": {
    "total_vectors": 10000,
    "ef_search": 100,
    "levels": 4
  }
}

Analyze combined text and vector search.

POST https://{EKODB_API_URL}/api/search/hybrid/{collection}/explain
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}

{
  "text_query": "neural networks",
  "vector": [0.1, 0.2, ...],
  "text_weight": 0.3,
  "vector_weight": 0.7,
  "limit": 10
}

# Response
{
  "execution_plan": {
    "text_search": {
      "index_used": "description_text",
      "estimated_matches": 89,
      "weight": 0.3
    },
    "vector_search": {
      "index_used": "embedding_vector",
      "estimated_matches": 10000,
      "weight": 0.7
    },
    "fusion_method": "reciprocal_rank"
  },
  "performance": {
    "estimated_time_ms": 23,
    "combined_accuracy": "very_high"
  }
}
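The response reports `reciprocal_rank` as the fusion method. The sketch below shows one common form of weighted reciprocal rank fusion, where each result list contributes `weight / (k + rank)` to a document's score. How ekoDB actually combines the weights is not documented here, so the `weighted_rrf` function, the formula, and the smoothing constant `k` (60 is a widely used default) are assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical weighted reciprocal rank fusion over two ranked id lists.
/// Each list is ordered best-first; ranks start at 1.
fn weighted_rrf(
    text_ranked: &[&str],
    vector_ranked: &[&str],
    text_weight: f64,
    vector_weight: f64,
    k: f64,
) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for (i, id) in text_ranked.iter().enumerate() {
        *scores.entry(id.to_string()).or_insert(0.0) += text_weight / (k + (i + 1) as f64);
    }
    for (i, id) in vector_ranked.iter().enumerate() {
        *scores.entry(id.to_string()).or_insert(0.0) += vector_weight / (k + (i + 1) as f64);
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id for a stable order.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

With text results `["a", "b", "c"]`, vector results `["c", "a", "d"]`, weights 0.3 and 0.7, and `k = 60.0`, the fused order is c, a, d, b: the heavier vector weight lifts `c` above `a` even though `a` ranks first in the text results.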

Index Types

Query Index Types

Type    Use Case                       Performance         Space Usage
btree   Exact matches, range queries   Very fast lookups   Moderate
hash    Exact matches only             Fastest lookups     Low
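The difference between the two types can be seen with Rust's standard collections, used here as stand-ins for the index structures. This is a sketch, not ekoDB code; the helper names are hypothetical. An ordered map can answer a range query by walking neighboring keys, while a hash map can only answer exact lookups.

```rust
use std::collections::{BTreeMap, HashMap};

/// Builds an ordered index from (document id, indexed value) rows.
fn build_btree(rows: &[(u32, i64)]) -> BTreeMap<i64, Vec<u32>> {
    let mut index = BTreeMap::new();
    for &(id, value) in rows {
        index.entry(value).or_insert_with(Vec::new).push(id);
    }
    index
}

/// Builds an unordered hash index from the same rows.
fn build_hash(rows: &[(u32, i64)]) -> HashMap<i64, Vec<u32>> {
    let mut index = HashMap::new();
    for &(id, value) in rows {
        index.entry(value).or_insert_with(Vec::new).push(id);
    }
    index
}

/// Range query: keys are sorted, so only matching entries are visited.
/// Requires low <= high (BTreeMap::range panics otherwise).
fn range_lookup(index: &BTreeMap<i64, Vec<u32>>, low: i64, high: i64) -> Vec<u32> {
    index.range(low..=high).flat_map(|(_, ids)| ids.iter().copied()).collect()
}

/// Exact-match query: a single hash probe.
fn exact_lookup(index: &HashMap<i64, Vec<u32>>, value: i64) -> Vec<u32> {
    index.get(&value).cloned().unwrap_or_default()
}
```

A hash index has no key ordering, so a range filter over it would have to examine every entry. That is why the table lists range queries only under btree.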

Search Index Types

Type     Use Case                       Performance   Space Usage
text     Full-text search               Fast          Moderate
vector   Semantic/similarity search     Very fast     High
hybrid   Combined text + vector search  Fast          High

Vector Index Algorithms

HNSW (Hierarchical Navigable Small World)

Best for most use cases - balances speed and accuracy.

{
  "algorithm": "hnsw",
  "options": {
    "ef_construction": 200,
    "m": 16,
    "ef_search": 100
  }
}

Parameters:

  • ef_construction - Candidate list size while building the graph (higher = better index quality, slower build)
  • m - Maximum connections per node in each layer (higher = better recall, more space)
  • ef_search - Candidate list size at query time (higher = better recall, slower search)

IVF (Inverted File)

Best for very large datasets (> 1M vectors).

{
  "algorithm": "ivf",
  "options": {
    "nlist": 100,
    "nprobe": 10
  }
}

Parameters:

  • nlist - Number of clusters the vectors are partitioned into
  • nprobe - Number of clusters searched per query (higher = better recall, slower)
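The sketch below illustrates how the two parameters interact in a generic IVF index. It is not ekoDB's implementation: `IvfIndex` is a hypothetical type, the centroids are passed in rather than trained (real systems typically learn them with k-means), and at least one centroid is assumed. `nlist` corresponds to the number of centroids, and `nprobe` to how many of the nearest clusters a query scans.

```rust
/// Squared Euclidean distance between two equal-length vectors.
fn sq_dist(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum()
}

/// Indices of the `n` centroids closest to `v`, nearest first.
fn nearest_centroids(centroids: &[Vec<f64>], v: &[f64], n: usize) -> Vec<usize> {
    let mut order: Vec<usize> = (0..centroids.len()).collect();
    order.sort_by(|&i, &j| sq_dist(&centroids[i], v).total_cmp(&sq_dist(&centroids[j], v)));
    order.truncate(n);
    order
}

/// Hypothetical IVF index: one inverted list of vector ids per centroid.
struct IvfIndex {
    centroids: Vec<Vec<f64>>,
    lists: Vec<Vec<usize>>,
    vectors: Vec<Vec<f64>>,
}

impl IvfIndex {
    /// Assigns every vector to its nearest centroid (nlist = centroids.len()).
    fn build(centroids: Vec<Vec<f64>>, vectors: Vec<Vec<f64>>) -> Self {
        let mut lists = vec![Vec::new(); centroids.len()];
        for (id, v) in vectors.iter().enumerate() {
            let cluster = nearest_centroids(&centroids, v, 1)[0];
            lists[cluster].push(id);
        }
        IvfIndex { centroids, lists, vectors }
    }

    /// Scans only the `nprobe` nearest clusters.
    /// Returns (id of the best match found, number of vectors compared).
    fn search(&self, query: &[f64], nprobe: usize) -> (Option<usize>, usize) {
        let mut best: Option<(usize, f64)> = None;
        let mut compared = 0;
        for cluster in nearest_centroids(&self.centroids, query, nprobe) {
            for &id in &self.lists[cluster] {
                compared += 1;
                let d = sq_dist(&self.vectors[id], query);
                if best.map_or(true, |(_, best_d)| d < best_d) {
                    best = Some((id, d));
                }
            }
        }
        (best.map(|(id, _)| id), compared)
    }
}
```

Raising `nprobe` increases the number of vectors compared, which is the cost side of the trade-off. The benefit is that a true nearest neighbor sitting just across a cluster boundary is less likely to be missed.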

Distance Metrics

Cosine Similarity

Best for: Text embeddings, normalized vectors

{
  "metric": "cosine"
}

Range: -1 to 1 (1 = same direction, 0 = orthogonal, -1 = opposite direction)

Euclidean Distance (L2)

Best for: Spatial data, image embeddings

{
  "metric": "euclidean"
}

Range: 0 to ∞ (0 = identical)

Dot Product

Best for: Pre-normalized embeddings, fast comparisons

{
  "metric": "dot_product"
}
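For reference, the three metrics can be computed as follows. This is a plain sketch of the standard formulas, not ekoDB's code. It assumes both vectors have the same length, and `cosine` returns NaN if either vector is all zeros.

```rust
/// Dot product: sum of element-wise products.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean (L2) distance: 0 means the vectors are identical.
fn euclidean(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

/// Cosine similarity: dot product divided by the product of the norms.
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    dot(a, b) / (dot(a, a).sqrt() * dot(b, b).sqrt())
}
```

When both vectors are already normalized to unit length, the denominator in `cosine` is 1, so the dot product gives the same ranking at lower cost. That is why dot product suits pre-normalized embeddings.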

Complete Example

Create a high-performance search system:

#!/bin/bash

# 1. Create collection for articles
curl -X POST https://{EKODB_API_URL}/api/collections/articles \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {ADMIN_TOKEN}" \
  -d '{
    "fields": {
      "title": {"type": "string", "required": true},
      "content": {"type": "string", "required": true},
      "author_id": {"type": "string", "required": true},
      "status": {"type": "string", "enum": ["draft", "published"]},
      "embedding": {"type": "array"}
    }
  }'

# 2. Create query index for common filters
curl -X POST https://{EKODB_API_URL}/api/indexes/query/articles \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {ADMIN_TOKEN}" \
  -d '{
    "field": "status",
    "index_type": "btree"
  }'

# 3. Create composite index for author + status
curl -X POST https://{EKODB_API_URL}/api/indexes/query/articles \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {ADMIN_TOKEN}" \
  -d '{
    "fields": ["author_id", "status"],
    "index_type": "btree"
  }'

# 4. Create text search index
curl -X POST https://{EKODB_API_URL}/api/indexes/search/articles \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {ADMIN_TOKEN}" \
  -d '{
    "field": "content",
    "index_type": "text",
    "options": {
      "language": "english",
      "stemming": true,
      "stop_words": true
    }
  }'

# 5. Create vector index for semantic search
curl -X POST https://{EKODB_API_URL}/api/indexes/search/articles \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {ADMIN_TOKEN}" \
  -d '{
    "field": "embedding",
    "index_type": "vector",
    "options": {
      "dimension": 1536,
      "metric": "cosine",
      "algorithm": "hnsw",
      "ef_construction": 200,
      "m": 16
    }
  }'

# 6. Explain query to verify index usage
curl -X POST https://{EKODB_API_URL}/api/query/articles/explain \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {ADMIN_TOKEN}" \
  -d '{
    "filter": {
      "type": "Logical",
      "content": [
        {
          "type": "Condition",
          "content": {
            "field": "status",
            "operator": "Eq",
            "value": "published"
          }
        },
        {
          "type": "Condition",
          "content": {
            "field": "author_id",
            "operator": "Eq",
            "value": "author_123"
          }
        }
      ]
    }
  }'

# 7. Test hybrid search explain
curl -X POST https://{EKODB_API_URL}/api/search/hybrid/articles/explain \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {ADMIN_TOKEN}" \
  -d '{
    "text_query": "machine learning",
    "vector": [0.1, 0.2, ...],
    "text_weight": 0.4,
    "vector_weight": 0.6,
    "limit": 10
  }'

Best Practices

Index Selection

Create Indexes For:

  • ✅ Fields frequently used in query filters
  • ✅ Foreign keys and join fields
  • ✅ Fields used for sorting
  • ✅ Text fields for search
  • ✅ Vector embeddings for similarity search

Avoid Indexing:

  • ❌ Low-cardinality fields (e.g., boolean with 2 values)
  • ❌ Fields that change frequently
  • ❌ Very large text fields (use search index instead)
  • ❌ Fields never used in queries

Composite Indexes

Order Matters:

# Good: put the more selective field first
# (status has few distinct values; created_at has many)
fields: ["created_at", "status"]

# Less optimal: the low-selectivity field leads
fields: ["status", "created_at"]

Use for Common Query Combinations:

# If you often query: WHERE author_id = X AND status = Y
# Create composite index:
fields: ["author_id", "status"]
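Field order matters in ordered composite indexes in general because entries are sorted by the first field, then the second. The sketch below models an `["author_id", "status"]` index with a `BTreeMap` keyed by a tuple. It is a hypothetical model, not ekoDB's implementation, and all names are illustrative. A query on both fields is a direct lookup, and a query on the leading field alone can still use the index as a prefix scan.

```rust
use std::collections::BTreeMap;

/// Hypothetical composite index keyed by (author_id, status).
type CompositeIndex = BTreeMap<(String, String), Vec<u32>>;

/// Builds the index from (document id, author_id, status) rows.
fn build(rows: &[(u32, &str, &str)]) -> CompositeIndex {
    let mut index = CompositeIndex::new();
    for &(id, author, status) in rows {
        index
            .entry((author.to_string(), status.to_string()))
            .or_default()
            .push(id);
    }
    index
}

/// Both fields known: a single keyed lookup.
fn by_author_and_status(index: &CompositeIndex, author: &str, status: &str) -> Vec<u32> {
    index
        .get(&(author.to_string(), status.to_string()))
        .cloned()
        .unwrap_or_default()
}

/// Leading field only: scan the contiguous run of keys that start with `author`.
fn by_author(index: &CompositeIndex, author: &str) -> Vec<u32> {
    index
        .range((author.to_string(), String::new())..)
        .take_while(|(key, _)| key.0 == author)
        .flat_map(|(_, ids)| ids.iter().copied())
        .collect()
}
```

In this model a query on `status` alone gets no help from the index, because entries with the same status are scattered across authors and every key would have to be checked.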

Monitor Index Usage

# Check which indexes exist
GET /api/indexes/query/articles

# Response shows all indexes
{
  "collection": "articles",
  "indexes": [
    {
      "label": "articles_email_index",
      "collection": "articles",
      "field": "email",
      "index_type": "BTree",
      "unique": true,
      "sparse": false
    },
    {
      "label": "articles_old_field_index",
      "collection": "articles",
      "field": "old_field",
      "index_type": "Hash",
      "unique": false,
      "sparse": false
    }
  ],
  "count": 2
}

# Delete unused indexes to save space
DELETE /api/indexes/query/articles/old_field

Vector Index Tuning

For Accuracy:

{
  "algorithm": "hnsw",
  "ef_construction": 400,
  "m": 32,
  "ef_search": 200
}

For Speed:

{
  "algorithm": "hnsw",
  "ef_construction": 100,
  "m": 16,
  "ef_search": 50
}

For Balance:

{
  "algorithm": "hnsw",
  "ef_construction": 200,
  "m": 16,
  "ef_search": 100
}

Index Maintenance

Rebuild Indexes After:

  • Large bulk imports
  • Schema changes
  • Performance degradation
  • Significant data updates

# Delete and recreate index
DELETE /api/indexes/query/articles/email
POST /api/indexes/query/articles
{
  "field": "email",
  "index_type": "btree"
}

Performance Impact

Query Performance

Records   No Index   With Index   Speedup
1,000     ~5ms       ~1ms         5x
10,000    ~50ms      ~2ms         25x
100,000   ~500ms     ~3ms         ~167x
1M        ~5,000ms   ~5ms         1,000x
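The widening gap in the table is what you would expect if an unindexed query scans every record while an indexed one performs a logarithmic search. The sketch below counts comparisons for both approaches on sorted data. It is an analogy using binary search rather than a measurement of ekoDB, and the function names are hypothetical.

```rust
/// Number of records examined by a full scan before the target is found.
fn linear_scan_steps(data: &[u64], target: u64) -> usize {
    let mut steps = 0;
    for &value in data {
        steps += 1;
        if value == target {
            break;
        }
    }
    steps
}

/// Number of probes made by binary search over sorted data,
/// standing in for a tree-index lookup.
fn binary_search_steps(sorted: &[u64], target: u64) -> usize {
    let (mut low, mut high) = (0usize, sorted.len());
    let mut steps = 0;
    while low < high {
        steps += 1;
        let mid = low + (high - low) / 2;
        if sorted[mid] == target {
            break;
        } else if sorted[mid] < target {
            low = mid + 1;
        } else {
            high = mid;
        }
    }
    steps
}
```

For one million sorted values, finding the last one takes 1,000,000 steps with a scan and at most 20 probes with binary search, since 2^20 exceeds one million.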

Space Usage

Index Type      Space Overhead   Example (1M records)
Query (btree)   10-20%           ~50-100 MB
Text search     20-40%           ~100-200 MB
Vector (HNSW)   50-100%          ~500 MB-1 GB

Example Code

Complete working examples for search and schema management:

Schema management examples: