# Query Patterns
ekoDB automatically tracks query access patterns to identify frequently accessed records and intelligently pre-loads them into cache on database startup.
Query pattern tracking with intelligent cache warming is available in ekoDB v0.30.0+.
## Overview
The Pattern Logger monitors which records are accessed, how often, and how recently. On database restart, it uses this information to automatically warm the cache with your most-accessed data, eliminating cold-start delays.
## Key Benefits

- **3x Faster Queries**: Cached records served in ~109µs vs ~328µs uncached
- **Zero Configuration**: Fully automatic, no setup required
- **Microsecond Lookups**: `find_by_id` completes in ~2.3µs
- **Persistent**: Patterns survive database restarts
- **Smart Scoring**: Recent accesses weighted higher
## How It Works

### 1. Automatic Pattern Tracking
Every query operation automatically logs access patterns:
```typescript
// JavaScript/TypeScript Client
await client.findById('users', userId);             // Tracked: FindById
await client.find('users', { limit: 100 });         // Tracked: Find
await client.textSearch('products', 'laptop');      // Tracked: TextSearch
await client.vectorSearch('images', embedding, 10); // Tracked: VectorSearch
```

```python
# Python Client
await client.find_by_id('users', user_id)           # Tracked: FindById
await client.find('users', limit=100)               # Tracked: Find
await client.text_search('products', 'laptop')      # Tracked: TextSearch
await client.vector_search('images', embedding, 10) # Tracked: VectorSearch
```

```go
// Go Client
client.FindByID("users", userId)            // Tracked: FindById
client.Find("users", filters)               // Tracked: Find
client.TextSearch("products", "laptop")     // Tracked: TextSearch
client.VectorSearch("images", embedding, 10) // Tracked: VectorSearch
```
### 2. Hotness Scoring
Each record receives a hotness score based on how often it's accessed and how recently. Records accessed within the last hour are weighted most heavily, with the score decaying over time. This ensures the cache reflects current workload, not stale history.
Example:
- Record A: Accessed 100 times, last access 30 minutes ago → Very Hot
- Record B: Accessed 50 times, last access 3 days ago → Warm
- Record C: Accessed 10 times, last access 30 days ago → Cold
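The weighting described above can be sketched as an exponential decay over access recency. This is an illustrative model only: the one-hour half-life and the formula itself are assumptions, not ekoDB's documented scoring internals.

```python
import time

# Hypothetical hotness model: access count weighted by recency, with an
# exponential decay. The one-hour half-life is an assumption for
# illustration; ekoDB's real scoring formula may differ.
HALF_LIFE_SECONDS = 3600.0

def hotness(access_count: int, last_access: float, now: float) -> float:
    """More accesses and more recent access -> higher score."""
    age_seconds = now - last_access
    return access_count * 0.5 ** (age_seconds / HALF_LIFE_SECONDS)

now = time.time()
record_a = hotness(100, now - 30 * 60, now)     # 100 hits, 30 min ago
record_b = hotness(50, now - 3 * 86400, now)    # 50 hits, 3 days ago
record_c = hotness(10, now - 30 * 86400, now)   # 10 hits, 30 days ago
# Ordering matches the example: record_a > record_b > record_c
```

With this shape, a burst of recent accesses outweighs a large historical count, which is exactly why the cache reflects the current workload rather than stale history.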
### 3. Cache Warming on Startup
When your database restarts:
1. **Load Patterns**: Read patterns from disk
2. **Calculate Scores**: Compute hotness for all records
3. **Pre-load Hot Records**: Load top N records into cache
4. **Ready to Serve**: Hot queries are instant
Database Restart Flow:

```text
├─ Load pattern file (~1-2ms)
├─ Calculate hotness scores (~5-10ms)
├─ Pre-load 50 hot records into cache (~100-200ms)
└─ Database ready with warm cache ✓
```

Query Performance (from benchmarks):

```text
Uncached query (10 records): 328µs
Cached query (10 records):   109µs (3x faster)
Single find_by_id:           2.3µs
```
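The restart flow above boils down to a score-and-preload loop. The sketch below uses hypothetical stand-ins (`pattern_entries`, `fetch_record`, a dict as the cache), not the real ekoDB internals:

```python
from collections import defaultdict

def warm_cache(pattern_entries, fetch_record, cache, top_n=50):
    """Score records from logged accesses and pre-load the hottest N.

    pattern_entries: iterable of (record_id, weight) access events, where
    weight is a recency-decayed score for that access (an assumption here).
    """
    scores = defaultdict(float)
    for record_id, weight in pattern_entries:
        scores[record_id] += weight
    hottest = sorted(scores, key=scores.get, reverse=True)[:top_n]
    for record_id in hottest:
        cache[record_id] = fetch_record(record_id)  # pre-load into memory
    return hottest

# Toy usage: "user:1" has the highest combined score, so it warms first.
entries = [("user:1", 1.0), ("user:2", 0.2), ("user:1", 0.9), ("user:3", 0.5)]
store = {"user:1": "Alice", "user:2": "Bob", "user:3": "Cara"}
cache = {}
hottest = warm_cache(entries, store.get, cache, top_n=2)
# cache now holds "user:1" and "user:3"; "user:2" stays cold
```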
## Performance Impact

### Cache Performance
Benchmarks compare query performance when records are pre-loaded in cache versus cold (uncached) queries. All timings are local embedded operations with no network overhead.
| Query Size | Speedup |
|---|---|
| 10 records | 3.0x faster |
| 100 records | 3.1x faster |
| 1000 records | 3.6x faster |
Cache warming provides consistent 3x speedups across all query sizes. The improvement comes from avoiding disk I/O on hot paths: records are served directly from memory rather than read from storage on each access.
### Single Record Lookups
Single record operations show ekoDB's raw embedded performance. These are point lookups without any query parsing or filtering overhead.
| Operation | Typical Latency |
|---|---|
| `find_by_id` | ~2.3µs |
| `kv_get` | ~200ns |
| `kv_exists` | ~100ns |
These numbers reflect the advantage of an embedded database: no serialization, no network round-trips, no connection pooling. The kv_exists check at ~100ns is essentially a hash table lookup, while find_by_id includes document parsing overhead.
For complete benchmark data including comparisons to other embedded databases, see Performance Benchmarks.
Pattern logging adds minimal overhead but significantly improves cached query performance. See the benchmark table above for details.
## Configuration

### Automatic Configuration
Pattern tracking is enabled by default with automatic configuration:
- Buffer size auto-scales with system resources
- Flush frequency optimized for your hardware
- Hot record limit adapts to available memory
No configuration needed! Just use ekoDB normally.
### Manual Tuning (Advanced)
For advanced use cases, you can tune behavior via environment variables:
```shell
# Adjust cleanup interval (affects buffer size)
export CLEANUP_INTERVAL_SECONDS=300  # Default: varies by tier

# Buffer size calculation:
#   buffer = cleanup_interval * 10
#   300s → 3000 entries buffered before flush
```
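The buffer arithmetic can be expressed directly. The helper name below is ours; only the `buffer = cleanup_interval * 10` relationship and the environment variable come from this page:

```python
import os

def pattern_buffer_size(default_interval_s: int = 300) -> int:
    """Entries buffered before flush: cleanup_interval * 10."""
    interval = int(os.environ.get("CLEANUP_INTERVAL_SECONDS", default_interval_s))
    return interval * 10

os.environ["CLEANUP_INTERVAL_SECONDS"] = "300"
print(pattern_buffer_size())  # 3000 entries buffered before flush
```

A shorter cleanup interval therefore means a smaller in-memory buffer and more frequent flushes, which is the lever used in the troubleshooting section below.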
## Use Cases

### E-Commerce Platform
Scenario: a product catalog with 1M products, where 1,000 products generate 80% of traffic.

Without cache warming:

- Database restarts
- First queries: ~328µs each (uncached)
- User experience: slightly slower initial loads

With cache warming:

- Database restarts
- Pattern logger pre-loads the 1,000 hot products
- First queries are served from cache immediately
- User experience: fast from the start
### Multi-Tenant SaaS

Scenario: 10,000 tenants, where 100 active tenants generate 90% of queries. Cache warming automatically identifies the active tenants and pre-loads their data on startup. Active tenants get instant response times; inactive tenants use normal lazy loading.
### Content Platform

Scenario: a news site with trending articles. Patterns adapt automatically:

- Morning: breaking news articles are hot
- Afternoon: opinion pieces are hot
- Evening: sports content is hot

Cache warming reflects current trends; no manual cache management is needed.
## Monitoring

### Log Output
Pattern Logger provides detailed logs on startup and during operation:
```text
[INFO] PatternLogger: Buffer size set to 3000 entries (based on cleanup_interval: 300s)
[INFO] Loaded 15234 pattern entries
[INFO] Warming cache from query patterns...
[INFO] Cache warming complete: 50 hot records loaded in 125ms
```
## Best Practices

### 1. Let It Run Automatically
Don't disable pattern logging unless you have a specific reason. The benefits far outweigh the minimal overhead.
### 2. Monitor Hot Records
Periodically review which records are hot:
- Optimize frequently accessed data
- Add indexes for hot queries
- Pre-compute expensive operations
### 3. Clean Old Patterns
Patterns are cleaned automatically based on the cleanup_interval setting. Patterns naturally age out over time, and a database restart will rebuild them from fresh access data.
### 4. Combine with Indexes
Pattern tracking identifies hot records, indexes optimize hot queries:
```typescript
// Pattern tracking: pre-loads frequently accessed users
// Index: makes user lookups by email fast
await client.createIndex('users', ['email']);
// Both work together for optimal performance
```
## Storage & Maintenance

### Pattern Storage
Patterns are stored in an append-only log managed automatically by ekoDB. Each entry records the collection, record, operation type, and timestamp.
### Storage Requirements

Pattern logging is lightweight, at approximately 80 bytes per entry:
| Entries | Disk Usage |
|---|---|
| 1,000,000 | ~80 MB |
| 10,000,000 | ~800 MB |
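The ~80-bytes-per-entry figure is easy to sanity-check with a plausible fixed-width layout. The field widths and format below are assumptions for illustration; ekoDB's actual on-disk format is not documented here.

```python
import struct

def pack_entry(collection: str, record_id: str, op_code: int, ts_ns: int) -> bytes:
    """Pack one pattern entry: collection, record id, op type, timestamp."""
    c = collection.encode()[:32].ljust(32, b"\0")  # fixed 32-byte name
    r = record_id.encode()[:32].ljust(32, b"\0")   # fixed 32-byte id
    return struct.pack("<32s32sBq", c, r, op_code, ts_ns)

entry = pack_entry("users", "user:42", 1, 1_700_000_000_000_000_000)
print(len(entry))  # 73 bytes; with log framing this lands near ~80/entry
```

At that size the table above follows directly: 1,000,000 entries is about 80 MB, and 10,000,000 entries about 800 MB.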
### Cleanup Strategy
Recommended cleanup schedule:
| Data Size | Cleanup Frequency | Keep Days |
|---|---|---|
| < 1GB | Monthly | 30 days |
| 1-10GB | Weekly | 14 days |
| > 10GB | Daily | 7 days |
## Troubleshooting

### Patterns Not Loading
Check logs for errors:
```text
[WARN] Failed to parse pattern entry: Invalid timestamp
```
Solution: If pattern data is corrupted, restart the database; patterns will rebuild automatically from new access data.
### High Memory Usage
Check buffer size:
```text
# In database logs
[INFO] PatternLogger: Buffer size set to 10000 entries
```
Solution: Reduce cleanup interval to decrease buffer:
```shell
export CLEANUP_INTERVAL_SECONDS=60  # Smaller buffer
```
### Slow Startup
Check hot record count:
```text
# In database logs
[INFO] Cache warming complete: 500 hot records loaded in 2500ms
```
Solution: Too many hot records being loaded. This is automatically tuned, but if startup is critical:
- Patterns naturally age out
- Clean old patterns more frequently
- Consider if 500 hot records is appropriate for your use case
## Integration with Other Features

### With Multi-Region (Ripple)
Each region maintains its own patterns:
- Region A: Tracks Region A's access patterns
- Region B: Tracks Region B's access patterns
- Benefit: Each region optimizes for its users
### With File Pool
Uses global file descriptor management:
- Pattern file = 1 FD
- Respects system limits
- Automatic retry/backoff
### With Disk Cache
Works seamlessly together:
- Hot records → memory cache (instant)
- Warm records → disk cache (fast)
- Cold records → database (acceptable)
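The tiering above amounts to a fall-through lookup. The dict-backed tiers and the promote-on-access policy in this sketch are illustrative, not the actual ekoDB cache implementation:

```python
def get(record_id, memory, disk, database):
    """Serve a record from the fastest tier that holds it."""
    if record_id in memory:            # hot: instant, served from RAM
        return memory[record_id]
    if record_id in disk:              # warm: fast, served from disk cache
        value = disk[record_id]
        memory[record_id] = value      # promote on access
        return value
    value = database[record_id]        # cold: full database read
    disk[record_id] = value            # populate the warm tier
    return value

memory, disk, database = {}, {"b": 2}, {"a": 1, "b": 2, "c": 3}
print(get("c", memory, disk, database))  # cold read: 3, now in disk cache
print(get("b", memory, disk, database))  # warm read: 2, promoted to memory
```

Cache warming simply pre-populates the memory tier before the first request arrives, so hot records skip the fall-through entirely.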
## Related Documentation
- System Administration - Admin endpoints and monitoring
- Performance Benchmarks - Performance data and tuning
- Indexes - Create indexes for common queries
- White Paper - Technical architecture details
## Summary

Cache warming eliminates cold-start delays automatically. ekoDB learns which data matters, pre-loads it on restart, and adapts as access patterns change, with no configuration required.