Indexes
Indexes dramatically improve query and search performance by creating optimized data structures for fast lookups. ekoDB supports two types of indexes: query indexes for exact matches and filters, and search indexes for full-text and vector similarity search.
Consider adding indexes when:
- You often query specific fields (e.g., status, email, user_id)
- You search text content or vector embeddings
- Large collections (> 10,000 records) show performance issues
- Your queries commonly combine several filters
Query Indexes
Query indexes optimize exact-match and comparison queries using B-tree or hash structures.
Create Query Index
Create Index
Define an index in the collection schema.
- Client Libraries (Recommended)
- Direct API
- 🦀 Rust
- 🐍 Python
- 📘 TypeScript
- 📦 JavaScript
- 🟣 Kotlin
- 🔷 Go
use ekodb_client::{Client, Schema, FieldTypeSchema, IndexConfig};
let schema = Schema::new()
.add_field(
"email",
FieldTypeSchema::new("string")
.with_index(IndexConfig::Hash)
);
client.create_collection("users", schema).await?;
client.create_schema('users', {
'indexes': [
{
'name': 'email_idx',
'type': 'hash',
'fields': ['email']
}
]
})
await client.createSchema('users', {
indexes: [
{
name: 'email_idx',
type: 'hash',
fields: ['email']
}
]
});
await client.createSchema('users', {
indexes: [
{
name: 'email_idx',
type: 'hash',
fields: ['email']
}
]
});
import io.ekodb.client.EkoDBClient
val schema = NewSchemaBuilder()
.addField("email",
NewFieldTypeSchemaBuilder("string")
.hashIndex()
.build()
)
.build()
client.createCollection("users", schema)
import "github.com/ekoDB/ekodb-client-go"
schema := ekodb.NewSchemaBuilder().
AddField("email",
ekodb.NewFieldTypeSchemaBuilder("string").
HashIndex().
Build(),
).
Build()
err := client.CreateCollection("users", schema)
curl -X POST https://{EKODB_API_URL}/api/schemas/users \
-H "Authorization: Bearer {TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"indexes": [
{
"name": "email_idx",
"type": "hash",
"fields": ["email"]
}
]
}'
# Response
{
"status": "success",
"message": "Schema created successfully"
}
Create an index on one or more fields to speed up queries.
POST https://{EKODB_API_URL}/api/indexes/query/{collection}
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"field": "email",
"index_type": "BTree",
"label": "email_idx"
}
# Response
{
"status": "success",
"message": "Index 'email_idx' created successfully",
"index": {
"label": "email_idx",
"collection": "users",
"field": "email",
"index_type": "BTree",
"unique": false,
"sparse": false
}
}
Optional Parameters:
- unique (boolean) - Enforce uniqueness constraint (default: false)
- sparse (boolean) - Index only documents with the field (default: false)
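To make the two optional parameters concrete, here is a minimal in-memory sketch of what `unique` and `sparse` typically mean for an index. The `FieldIndex` class is hypothetical and is not ekoDB's implementation; in particular, indexing a missing field as `None` when `sparse` is false is an assumption made for illustration.

```python
class FieldIndex:
    """Toy index illustrating `unique` and `sparse` (not ekoDB's implementation)."""

    def __init__(self, field, unique=False, sparse=False):
        self.field = field
        self.unique = unique
        self.sparse = sparse
        self.entries = {}  # indexed value -> list of document ids

    def add(self, doc_id, doc):
        if self.field not in doc:
            if self.sparse:
                return  # sparse: documents without the field are not indexed
            value = None  # assumption: a non-sparse index stores a missing field as None
        else:
            value = doc[self.field]
        if self.unique and value in self.entries:
            raise ValueError(f"duplicate value for unique index on {self.field!r}: {value!r}")
        self.entries.setdefault(value, []).append(doc_id)

    def lookup(self, value):
        return self.entries.get(value, [])


idx = FieldIndex("email", unique=True, sparse=True)
idx.add("u1", {"email": "a@example.com"})
idx.add("u2", {"name": "no email"})  # skipped because the index is sparse
print(idx.lookup("a@example.com"))   # ['u1']
```

Adding a second document with the same email to this index would raise `ValueError`, which is the behavior a uniqueness constraint is meant to guarantee.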
Composite Index (Multiple Fields):
{
"fields": ["status", "created_at"],
"index_type": "btree"
}
List Query Indexes
Get all query indexes for a collection.
Index listing is not yet available in client libraries. Use the Direct API below.
curl https://{EKODB_API_URL}/api/indexes/query/{collection} \
-H "Authorization: Bearer {ADMIN_TOKEN}"
# Response
{
"collection": "users",
"indexes": [
{
"label": "users_email_index",
"collection": "users",
"field": "email",
"index_type": "BTree",
"unique": false,
"sparse": false
},
{
"label": "users_status_index",
"collection": "users",
"field": "status",
"index_type": "Hash",
"unique": false,
"sparse": false
}
],
"count": 2
}
Delete Query Index
Remove an index when it's no longer needed.
Index deletion is not yet available in client libraries. Use the Direct API below.
curl -X DELETE https://{EKODB_API_URL}/api/indexes/query/{collection}/{field} \
-H "Authorization: Bearer {ADMIN_TOKEN}"
# Response
{
"status": "success",
"message": "Index on field 'email' deleted successfully",
"index": null
}
Explain Query Execution
Analyze how a query will be executed and whether indexes are used.
- Client Libraries (Recommended)
- Direct API
- 🦀 Rust
- 🐍 Python
- 📘 TypeScript
- 📦 JavaScript
- 🟣 Kotlin
- 🔷 Go
use ekodb_client::QueryBuilder;
use serde_json::json;
let explanation = client.explain_query(
"users",
json!({
"filter": {
"type": "Condition",
"content": {
"field": "email",
"operator": "Eq",
"value": "user@example.com"
}
}
})
).await?;
println!("{:?}", explanation["execution_plan"]);
explanation = client.explain_query('users', {
'filter': {
'type': 'expression',
'content': {
'field': 'email',
'operator': 'Equals',
'value': 'user@example.com'
}
}
})
print(explanation['execution_plan'])
const explanation = await client.explainQuery('users', {
filter: {
type: 'expression',
content: {
field: 'email',
operator: 'Equals',
value: 'user@example.com'
}
}
});
console.log(explanation.execution_plan);
const explanation = await client.explainQuery('users', {
filter: {
type: 'expression',
content: {
field: 'email',
operator: 'Equals',
value: 'user@example.com'
}
}
});
console.log(explanation.execution_plan);
val explanation = client.explainQuery("users", mapOf(
"filter" to mapOf(
"type" to "expression",
"content" to mapOf(
"field" to "email",
"operator" to "Equals",
"value" to "user@example.com"
)
)
))
println(explanation["execution_plan"])
explanation, err := client.ExplainQuery("users", map[string]interface{}{
"filter": map[string]interface{}{
"type": "Condition",
"content": map[string]interface{}{
"field": "email",
"operator": "Eq",
"value": "user@example.com",
},
},
})
fmt.Println(explanation["execution_plan"])
curl -X POST https://{EKODB_API_URL}/api/query/{collection}/explain \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"filter": {
"type": "Condition",
"content": {
"field": "email",
"operator": "Eq",
"value": "user@example.com"
}
}
}'
# Response
{
"query": {
"filter": {
"type": "Condition",
"content": {
"field": "email",
"operator": "Eq",
"value": "user@example.com"
}
}
},
"execution_plan": {
"scan_type": "index_scan",
"indexes_used": [
{
"field": "email",
"index_type": "BTree",
"unique": false,
"sparse": false
}
],
"estimated_rows": 100,
"filter_selectivity": 0.01
},
"estimated_cost": 5.5,
"recommendations": [
"Index on 'email' field is optimal for this query",
"Consider creating a unique index if email values are unique"
]
}
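The explain response can be checked programmatically, for example in a test that guards against accidental full scans. The sketch below assumes the response has already been parsed into a dictionary with the fields shown above; `uses_index` is a hypothetical helper, and `index_scan` is the only `scan_type` value this page documents.

```python
def uses_index(explanation, field):
    """Return True if the explain output reports an index scan that covers `field`."""
    plan = explanation.get("execution_plan", {})
    if plan.get("scan_type") != "index_scan":
        return False
    return any(ix.get("field") == field for ix in plan.get("indexes_used", []))


# Trimmed copy of the response shown above.
sample = {
    "execution_plan": {
        "scan_type": "index_scan",
        "indexes_used": [{"field": "email", "index_type": "BTree"}],
        "estimated_rows": 100,
    }
}
print(uses_index(sample, "email"))   # True
print(uses_index(sample, "status"))  # False
```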
Search Indexes
Search indexes enable full-text search and vector similarity search.
Create Search Index
Create a search index for text or vector fields.
Text Search Index:
POST https://{EKODB_API_URL}/api/indexes/search/{collection}
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"field": "description",
"index_type": "text",
"options": {
"language": "english",
"stemming": true,
"stop_words": true,
"min_word_length": 3
}
}
# Response
{
"status": "success",
"message": "Search index created successfully on field 'description'",
"field": "description",
"index_type": "text",
"documents_indexed": 5678
}
Vector Search Index:
POST https://{EKODB_API_URL}/api/indexes/search/{collection}
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"field": "embedding",
"index_type": "vector",
"options": {
"dimension": 1536,
"metric": "cosine",
"algorithm": "hnsw",
"ef_construction": 200,
"m": 16
}
}
# Response
{
"status": "success",
"message": "Search index created successfully on field 'embedding'",
"field": "embedding",
"index_type": "vector",
"documents_indexed": 10000
}
Explain Text Search
Analyze text search query execution.
POST https://{EKODB_API_URL}/api/search/text/{collection}/explain
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"query": "machine learning algorithms",
"field": "description",
"limit": 10
}
# Response
{
"query": "machine learning algorithms",
"parsed_terms": ["machine", "learning", "algorithms"],
"stemmed_terms": ["machin", "learn", "algorithm"],
"execution_plan": {
"index_used": "description_text",
"search_type": "full_text",
"estimated_matches": 156,
"scoring_method": "tf_idf"
},
"performance": {
"estimated_time_ms": 15,
"index_hit_rate": "high"
}
}
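The relationship between the query string and the stemmed terms can be illustrated with a toy analyzer. This is a rough sketch only: the stop-word list and the three suffix rules are invented for illustration, real stemmers such as Porter's have many more rules, and nothing here is taken from ekoDB's actual text pipeline. The option names mirror those used when creating the text index.

```python
import re

STOP_WORDS = {"a", "an", "and", "the", "of", "for", "in", "to"}  # tiny illustrative list


def toy_stem(word):
    """Very rough suffix stripping, for illustration only."""
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    if word.endswith("e"):
        return word[:-1]
    return word


def analyze(text, stemming=True, stop_words=True, min_word_length=3):
    """Lowercase, tokenize, filter, and optionally stem the input text."""
    terms = re.findall(r"[a-z0-9]+", text.lower())
    if stop_words:
        terms = [t for t in terms if t not in STOP_WORDS]
    terms = [t for t in terms if len(t) >= min_word_length]
    return [toy_stem(t) for t in terms] if stemming else terms


print(analyze("machine learning algorithms"))
# ['machin', 'learn', 'algorithm']
```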
Explain Vector Search
Analyze vector similarity search execution.
POST https://{EKODB_API_URL}/api/search/vector/{collection}/explain
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"vector": [0.1, 0.2, 0.3, ...],
"field": "embedding",
"limit": 10,
"metric": "cosine"
}
# Response
{
"execution_plan": {
"index_used": "embedding_vector",
"algorithm": "hnsw",
"dimension": 1536,
"metric": "cosine",
"estimated_comparisons": 234,
"search_type": "approximate"
},
"performance": {
"estimated_time_ms": 8,
"accuracy": "high",
"speedup_vs_brute_force": "427x"
},
"index_stats": {
"total_vectors": 10000,
"ef_search": 100,
"levels": 4
}
}
Explain Hybrid Search
Analyze combined text and vector search.
POST https://{EKODB_API_URL}/api/search/hybrid/{collection}/explain
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"text_query": "neural networks",
"vector": [0.1, 0.2, ...],
"text_weight": 0.3,
"vector_weight": 0.7,
"limit": 10
}
# Response
{
"execution_plan": {
"text_search": {
"index_used": "description_text",
"estimated_matches": 89,
"weight": 0.3
},
"vector_search": {
"index_used": "embedding_vector",
"estimated_matches": 10000,
"weight": 0.7
},
"fusion_method": "reciprocal_rank"
},
"performance": {
"estimated_time_ms": 23,
"combined_accuracy": "very_high"
}
}
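The response names `reciprocal_rank` as the fusion method but does not spell out the formula. A common formulation, reciprocal rank fusion, scores each document as the sum of `weight / (k + rank)` over the result lists it appears in. The sketch below assumes that formulation, the conventional constant `k = 60`, and that the weights are applied multiplicatively; how ekoDB actually combines them is not documented here.

```python
def weighted_rrf(text_ranking, vector_ranking, text_weight=0.3, vector_weight=0.7, k=60):
    """Fuse two ranked lists of document ids with weighted reciprocal rank fusion."""
    scores = {}
    for weight, ranking in ((text_weight, text_ranking), (vector_weight, vector_ranking)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


text_hits = ["a", "b", "c"]    # hypothetical ids, best text match first
vector_hits = ["c", "a", "d"]  # hypothetical ids, nearest vector first
print(weighted_rrf(text_hits, vector_hits))
# ['c', 'a', 'd', 'b']
```

With a vector weight of 0.7, the top vector hit `c` edges out `a` even though `a` ranks first in the text results.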
Index Types
Query Index Types
| Type | Use Case | Performance | Space Usage |
|---|---|---|---|
| btree | Exact matches, range queries | Very fast lookups | Moderate |
| hash | Exact matches only | Fastest lookups | Low |
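The difference between the two query index types can be illustrated with plain Python data structures: a dictionary stands in for a hash index, and a sorted list searched with `bisect` stands in for a B-tree. This is a conceptual sketch with made-up records, not a description of ekoDB's storage engine.

```python
import bisect

records = [
    {"id": 1, "age": 31},
    {"id": 2, "age": 25},
    {"id": 3, "age": 42},
    {"id": 4, "age": 25},
]

# Hash-style index: value -> ids. Supports equality lookups only.
hash_index = {}
for r in records:
    hash_index.setdefault(r["age"], []).append(r["id"])

# Ordered index (stand-in for a B-tree): sorted (value, id) pairs.
ordered_index = sorted((r["age"], r["id"]) for r in records)


def range_lookup(low, high):
    """Ids whose value v satisfies low <= v <= high, found by binary search."""
    start = bisect.bisect_left(ordered_index, (low, float("-inf")))
    end = bisect.bisect_right(ordered_index, (high, float("inf")))
    return [doc_id for _, doc_id in ordered_index[start:end]]


print(hash_index.get(25, []))  # [2, 4]
print(range_lookup(25, 31))    # [2, 4, 1]
```

The dictionary answers `age == 25` in a single probe but has no notion of order, which is why range queries need the ordered structure.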
Search Index Types
| Type | Use Case | Performance | Space Usage |
|---|---|---|---|
| text | Full-text search | Fast | Moderate |
| vector | Semantic/similarity search | Very fast | High |
| hybrid | Combined text + vector search | Fast | High |
Vector Index Algorithms
HNSW (Hierarchical Navigable Small World)
Best for most use cases - balances speed and accuracy.
{
"algorithm": "hnsw",
"options": {
"ef_construction": 200,
"m": 16,
"ef_search": 100
}
}
Parameters:
- ef_construction - Quality during build (higher = better, slower)
- m - Connections per layer (higher = better recall, more space)
- ef_search - Search quality (higher = better, slower)
IVF (Inverted File)
Best for very large datasets (> 1M vectors).
{
"algorithm": "ivf",
"options": {
"nlist": 100,
"nprobe": 10
}
}
Parameters:
- nlist - Number of clusters
- nprobe - Clusters to search (higher = better, slower)
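The roles of `nlist` and `nprobe` are easier to see in a small pure-Python sketch of the inverted-file idea. It is deliberately simplified: the first `nlist` vectors are used as centroids (an assumption made to keep the example short; real implementations usually train centroids with k-means), and none of the function names come from ekoDB.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf(vectors, nlist):
    """Assign every vector to its nearest centroid (first `nlist` vectors, by assumption)."""
    centroids = vectors[:nlist]
    clusters = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: dist(v, centroids[c]))
        clusters[nearest].append(i)
    return centroids, clusters


def ivf_search(query, vectors, centroids, clusters, nprobe, limit):
    """Scan only the `nprobe` clusters whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: dist(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in clusters[c]]
    return sorted(candidates, key=lambda i: dist(query, vectors[i]))[:limit]


vectors = [(0.0, 0.0), (10.0, 10.0), (0.5, 0.2), (9.5, 10.1), (0.1, 0.9)]
centroids, clusters = build_ivf(vectors, nlist=2)
print(ivf_search((0.2, 0.2), vectors, centroids, clusters, nprobe=1, limit=2))
# [0, 2]
```

Raising `nprobe` to `nlist` scans every cluster and degenerates into an exact brute-force search, which is the accuracy/speed trade-off the parameter controls.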
Distance Metrics
Cosine Similarity
Best for: Text embeddings, normalized vectors
{
"metric": "cosine"
}
Range: -1 to 1 (1 = identical, -1 = opposite)
Euclidean Distance (L2)
Best for: Spatial data, image embeddings
{
"metric": "euclidean"
}
Range: 0 to ∞ (0 = identical)
Dot Product
Best for: Pre-normalized embeddings, fast comparisons
{
"metric": "dot_product"
}
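For reference, the three metrics can be computed by hand in a few lines of Python. This sketch uses the standard textbook definitions and made-up two-dimensional vectors; it does not reflect how ekoDB evaluates them internally.

```python
import math


def dot(a, b):
    """Dot product: larger means more similar for vectors of comparable length."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b, ignoring their magnitudes."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean_distance(a, b):
    """Straight-line (L2) distance; 0 means the vectors are identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a, b = [3.0, 4.0], [4.0, 3.0]
print(cosine_similarity(a, b))   # 0.96
print(euclidean_distance(a, b))  # 1.4142135623730951
print(dot(a, b))                 # 24.0
```

For vectors that are already normalized to unit length, the dot product equals the cosine similarity, which is why it suits pre-normalized embeddings.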
Complete Example
Create a high-performance search system:
#!/bin/bash
# 1. Create collection for articles
curl -X POST https://{EKODB_API_URL}/api/collections/articles \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"fields": {
"title": {"type": "string", "required": true},
"content": {"type": "string", "required": true},
"author_id": {"type": "string", "required": true},
"status": {"type": "string", "enum": ["draft", "published"]},
"embedding": {"type": "array"}
}
}'
# 2. Create query index for common filters
curl -X POST https://{EKODB_API_URL}/api/indexes/query/articles \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"field": "status",
"index_type": "btree"
}'
# 3. Create composite index for author + status
curl -X POST https://{EKODB_API_URL}/api/indexes/query/articles \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"fields": ["author_id", "status"],
"index_type": "btree"
}'
# 4. Create text search index
curl -X POST https://{EKODB_API_URL}/api/indexes/search/articles \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"field": "content",
"index_type": "text",
"options": {
"language": "english",
"stemming": true,
"stop_words": true
}
}'
# 5. Create vector index for semantic search
curl -X POST https://{EKODB_API_URL}/api/indexes/search/articles \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"field": "embedding",
"index_type": "vector",
"options": {
"dimension": 1536,
"metric": "cosine",
"algorithm": "hnsw",
"ef_construction": 200,
"m": 16
}
}'
# 6. Explain query to verify index usage
curl -X POST https://{EKODB_API_URL}/api/query/articles/explain \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"filter": {
"type": "Logical",
"content": [
{
"type": "Condition",
"content": {
"field": "status",
"operator": "Eq",
"value": "published"
}
},
{
"type": "Condition",
"content": {
"field": "author_id",
"operator": "Eq",
"value": "author_123"
}
}
]
}
}'
# 7. Test hybrid search explain
curl -X POST https://{EKODB_API_URL}/api/search/hybrid/articles/explain \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"text_query": "machine learning",
"vector": [0.1, 0.2, ...],
"text_weight": 0.4,
"vector_weight": 0.6,
"limit": 10
}'
Best Practices
Index Selection
Create Indexes For:
- ✅ Fields used in WHERE clauses frequently
- ✅ Foreign keys and join fields
- ✅ Fields used for sorting
- ✅ Text fields for search
- ✅ Vector embeddings for similarity search
Avoid Indexing:
- ❌ Low-cardinality fields (e.g., boolean with 2 values)
- ❌ Fields that change frequently
- ❌ Very large text fields (use search index instead)
- ❌ Fields never used in queries
Composite Indexes
Order Matters:
# Good: put the most selective field first
# (status has few distinct values, created_at has many)
fields: ["created_at", "status"]
# Less optimal
fields: ["status", "created_at"]
Use for Common Query Combinations:
# If you often query: WHERE author_id = X AND status = Y
# Create composite index:
fields: ["author_id", "status"]
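Why field order matters can be seen by modeling a composite index as a list of tuples sorted in the declared field order. The sketch below uses hypothetical article records and assumes the usual behavior of ordered composite indexes (lookups by a leading prefix of the fields); it is not a description of ekoDB internals.

```python
import bisect

articles = [
    {"id": 1, "author_id": "author_123", "status": "published"},
    {"id": 2, "author_id": "author_123", "status": "draft"},
    {"id": 3, "author_id": "author_456", "status": "published"},
]

# Composite key in declared field order: (author_id, status), then the id.
composite = sorted((a["author_id"], a["status"], a["id"]) for a in articles)


def find(author_id, status=None):
    """Binary-search the sorted keys by a leading prefix of the index fields."""
    prefix = (author_id,) if status is None else (author_id, status)
    start = bisect.bisect_left(composite, prefix)
    matches = []
    for key in composite[start:]:
        if key[:len(prefix)] != prefix:
            break  # keys are sorted, so no later key can match
        matches.append(key[-1])
    return matches


print(find("author_123", "published"))  # [1]
print(find("author_123"))               # [2, 1]
```

A query on `status` alone has no leading prefix to search on, so in this model it would have to scan every key.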
Monitor Index Usage
# Check which indexes exist
GET /api/indexes/query/articles
# Response shows all indexes
{
"collection": "articles",
"indexes": [
{
"label": "articles_email_index",
"collection": "articles",
"field": "email",
"index_type": "BTree",
"unique": true,
"sparse": false
},
{
"label": "articles_old_field_index",
"collection": "articles",
"field": "old_field",
"index_type": "Hash",
"unique": false,
"sparse": false
}
],
"count": 2
}
# Delete unused indexes to save space
DELETE /api/indexes/query/articles/old_field
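Spotting deletion candidates can be automated by comparing the list response with the fields the application actually filters on. The helper below is a hypothetical sketch: it assumes the response has been parsed into a dictionary with the shape shown above, and that the caller maintains the set of queried fields.

```python
def unused_index_fields(list_response, queried_fields):
    """Fields that have an index but never appear in the application's filters."""
    indexed = [ix["field"] for ix in list_response.get("indexes", [])]
    return [f for f in indexed if f not in queried_fields]


# Trimmed copy of the response shown above.
response = {
    "collection": "articles",
    "indexes": [
        {"label": "articles_email_index", "field": "email", "index_type": "BTree"},
        {"label": "articles_old_field_index", "field": "old_field", "index_type": "Hash"},
    ],
    "count": 2,
}
print(unused_index_fields(response, queried_fields={"email", "status"}))
# ['old_field']
```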
Vector Index Tuning
For Accuracy:
{
"algorithm": "hnsw",
"ef_construction": 400,
"m": 32,
"ef_search": 200
}
For Speed:
{
"algorithm": "hnsw",
"ef_construction": 100,
"m": 16,
"ef_search": 50
}
For Balance:
{
"algorithm": "hnsw",
"ef_construction": 200,
"m": 16,
"ef_search": 100
}
Index Maintenance
Rebuild Indexes After:
- Large bulk imports
- Schema changes
- Performance degradation
- Significant data updates
# Delete and recreate index
DELETE /api/indexes/query/articles/email
POST /api/indexes/query/articles
{
"field": "email",
"index_type": "btree"
}
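The delete-and-recreate sequence can be scripted with Python's standard library. This sketch only builds the two requests and prints them; `BASE_URL` is a hypothetical placeholder for `{EKODB_API_URL}`, the token is a dummy value, and `rebuild_requests` is an illustrative helper rather than part of any ekoDB client.

```python
import json
import urllib.request

BASE_URL = "https://ekodb.example.com"  # hypothetical placeholder for {EKODB_API_URL}


def rebuild_requests(collection, field, index_type, token):
    """Build (but do not send) the DELETE and POST requests for an index rebuild."""
    headers = {"Authorization": f"Bearer {token}"}
    delete = urllib.request.Request(
        f"{BASE_URL}/api/indexes/query/{collection}/{field}",
        headers=headers,
        method="DELETE",
    )
    body = json.dumps({"field": field, "index_type": index_type}).encode("utf-8")
    create = urllib.request.Request(
        f"{BASE_URL}/api/indexes/query/{collection}",
        data=body,
        headers={**headers, "Content-Type": "application/json"},
        method="POST",
    )
    return [delete, create]


for req in rebuild_requests("articles", "email", "btree", token="ADMIN_TOKEN"):
    print(req.get_method(), req.full_url)
    # To execute against a real server: urllib.request.urlopen(req)
```

Queries on the field run without the index between the two calls, so a rebuild like this is best scheduled for a quiet period.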
Performance Impact
Query Performance
| Records | No Index | With Index | Speedup |
|---|---|---|---|
| 1,000 | ~5ms | ~1ms | 5x |
| 10,000 | ~50ms | ~2ms | 25x |
| 100,000 | ~500ms | ~3ms | ~167x |
| 1M | ~5,000ms | ~5ms | 1,000x |
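The widening gap in the table is consistent with the usual cost model: an unindexed filter may inspect every record, while an ordered index needs roughly log2(n) comparisons. The sketch below only counts comparisons under that simplified model; it is not a benchmark of ekoDB.

```python
import math


def comparisons(n):
    """Worst-case comparisons for a full scan vs. a binary search over n sorted keys."""
    return n, math.ceil(math.log2(n))


for n in (1_000, 10_000, 100_000, 1_000_000):
    scan, indexed = comparisons(n)
    print(f"{n:>9,} records: scan up to {scan:,} comparisons, ordered index about {indexed}")
```

Real timings also include fixed per-query overhead, which is why measured speedups are smaller than the raw comparison counts would suggest.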
Space Usage
| Index Type | Space Overhead | Example (1M records) |
|---|---|---|
| Query (btree) | 10-20% | ~50-100 MB |
| Text search | 20-40% | ~100-200 MB |
| Vector (HNSW) | 50-100% | ~500MB-1GB |
Related Documentation
- Basic Operations - Query and search records
- Query Expressions - Filter syntax for search queries
- Collections & Schemas - Create collections
- System Administration - Monitor performance
Example Code
Complete working examples for search and schema management:
- Rust: client_search.rs
- Python: client_search.py
- TypeScript: client_search.ts
- Go: client_search.go
- Kotlin: ClientSearch.kt
Schema management examples:
- Rust: client_schema.rs
- Python: client_schema.py
- TypeScript: client_schema.ts
- Go: client_schema.go
- Kotlin: ClientSchemaManagement.kt