
Query Patterns

ekoDB automatically tracks query access patterns to identify frequently accessed records and intelligently pre-loads them into cache on database startup.

New in v0.30.0

Query pattern tracking with intelligent cache warming is available in ekoDB v0.30.0+.

Overview

The Pattern Logger monitors which records are accessed, how often, and how recently. On database restart, it uses this information to automatically warm the cache with your most-accessed data, eliminating cold-start delays.

Key Benefits

  • πŸš€ 3x Faster Queries: Cached records served in ~109Β΅s vs ~328Β΅s uncached
  • πŸ“Š Zero Configuration: Fully automatic - no setup required
  • πŸ’Ύ Sub-Microsecond Lookups: find_by_id completes in ~2.3Β΅s
  • πŸ”„ Persistent: Patterns survive database restarts
  • ⚑ Smart Scoring: Recent accesses weighted higher

How It Works

1. Automatic Pattern Tracking

Every query operation automatically logs access patterns:

// JavaScript/TypeScript Client
await client.findById('users', userId); // Tracked: FindById
await client.find('users', { limit: 100 }); // Tracked: Find
await client.textSearch('products', 'laptop'); // Tracked: TextSearch
await client.vectorSearch('images', embedding, 10); // Tracked: VectorSearch

# Python Client
await client.find_by_id('users', user_id) # Tracked: FindById
await client.find('users', limit=100) # Tracked: Find
await client.text_search('products', 'laptop') # Tracked: TextSearch
await client.vector_search('images', embedding, 10) # Tracked: VectorSearch

// Go Client
client.FindByID("users", userId) // Tracked: FindById
client.Find("users", filters) // Tracked: Find
client.TextSearch("products", "laptop") // Tracked: TextSearch
client.VectorSearch("images", embedding, 10) // Tracked: VectorSearch

2. Hotness Scoring

Each record receives a hotness score based on how often it's accessed and how recently. Records accessed within the last hour are weighted most heavily, with the score decaying over time. This ensures the cache reflects current workload, not stale history.

Example:

  • Record A: Accessed 100 times, last access 30 minutes ago β†’ Very Hot
  • Record B: Accessed 50 times, last access 3 days ago β†’ Warm
  • Record C: Accessed 10 times, last access 30 days ago β†’ Cold
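The decay behavior above can be sketched with a simple model. ekoDB does not document its exact scoring formula, so the half-life weighting below is an illustrative assumption, not the actual implementation:

```python
import time

def hotness(access_times, now=None, half_life=3600.0):
    """Illustrative hotness score: each access contributes a weight that
    halves every `half_life` seconds, so recent accesses dominate and
    old ones decay toward zero."""
    now = now if now is not None else time.time()
    return sum(0.5 ** ((now - t) / half_life) for t in access_times)

now = time.time()
a = hotness([now - 1800] * 100, now)       # 100 accesses, 30 min ago
b = hotness([now - 3 * 86400] * 50, now)   # 50 accesses, 3 days ago
c = hotness([now - 30 * 86400] * 10, now)  # 10 accesses, 30 days ago
print(a > b > c)  # True: recency outweighs raw access count
```

With this model, Record A scores ~70 despite Record B having been accessed 50 times, because B's accesses have decayed through many half-lives.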

3. Cache Warming on Startup

When your database restarts:

  1. Load Patterns: Read patterns from disk
  2. Calculate Scores: Compute hotness for all records
  3. Pre-load Hot Records: Load top N records into cache
  4. Ready to Serve: Hot queries are instant

Database Restart Flow:
β”œβ”€ Load pattern file (~1-2ms)
β”œβ”€ Calculate hotness scores (~5-10ms)
β”œβ”€ Pre-load 50 hot records into cache (~100-200ms)
└─ Database ready with warm cache βœ…
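The pre-load step above amounts to a top-N selection over hotness scores. A minimal sketch, where `load_record` stands in for ekoDB's internal storage read (the real warming logic is internal to the database):

```python
def warm_cache(pattern_scores, load_record, top_n=50):
    """Pick the top-N hottest record IDs and pre-load them into a cache.
    `pattern_scores` maps record_id -> hotness score; `load_record` is a
    stand-in for a storage read."""
    hot_ids = sorted(pattern_scores, key=pattern_scores.get, reverse=True)[:top_n]
    return {rid: load_record(rid) for rid in hot_ids}

scores = {"user:1": 70.7, "user:2": 1.2, "user:3": 0.01}
cache = warm_cache(scores, load_record=lambda rid: f"<record {rid}>", top_n=2)
print(sorted(cache))  # ['user:1', 'user:2']
```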

Query Performance (from benchmarks):
Uncached query (10 records): 328Β΅s
Cached query (10 records): 109Β΅s β†’ 3x faster!
Single find_by_id: 2.3Β΅s (sub-microsecond)

Performance Impact

Cache Performance

Benchmarks compare query performance when records are pre-loaded in cache versus cold (uncached) queries. All timings are local embedded operations with no network overhead.

Query Size      Speedup
10 records      3.0x faster
100 records     3.1x faster
1000 records    3.6x faster

Cache warming provides consistent 3x speedups across all query sizes. The improvement comes from avoiding disk I/O on hot pathsβ€”records are served directly from memory rather than reading from storage on each access.

Single Record Lookups

Single record operations show ekoDB's raw embedded performance. These are point lookups without any query parsing or filtering overhead.

Operation       Latency
find_by_id      ~2.3µs
kv_get          ~200ns (sub-microsecond)
kv_exists       ~100ns (sub-microsecond)

These numbers reflect the advantage of an embedded database: no serialization, no network round-trips, no connection pooling. The kv_exists check at ~100ns is essentially a hash table lookup, while find_by_id includes document parsing overhead.

For complete benchmark data including comparisons to other embedded databases, see Performance Benchmarks.

Performance Trade-off

Pattern logging adds minimal overhead but significantly improves cached query performance. See the benchmark table above for details.

Configuration

Automatic Configuration

Pattern tracking is enabled by default with automatic configuration:

  • Buffer size auto-scales with system resources
  • Flush frequency optimized for your hardware
  • Hot record limit adapts to available memory

No configuration needed! Just use ekoDB normally.

Manual Tuning (Advanced)

For advanced use cases, you can tune behavior via environment variables:

# Adjust cleanup interval (affects buffer size)
export CLEANUP_INTERVAL_SECONDS=300 # Default: varies by tier

# Buffer size calculation:
# buffer = cleanup_interval * 10
# 300s β†’ 3000 entries buffered before flush

Use Cases

E-Commerce Platform

// Scenario: Product catalog with 1M products
// Reality: 1000 products generate 80% of traffic

// Before (Without Cache Warming):
// - Database restart
// - First queries: ~328Β΅s each (uncached)
// - User experience: Slightly slower initial loads

// After (With Cache Warming):
// - Database restart
// - Pattern logger pre-loads 1000 hot products
// - First queries served from cache immediately
// - User experience: Fast from the start βœ…

Multi-Tenant SaaS

// Scenario: 10,000 tenants
// Reality: 100 active tenants generate 90% of queries

// Cache warming automatically identifies active tenants
// Pre-loads their data on startup
// Active tenants get instant response times
// Inactive tenants use normal lazy-loading

Content Platform

// Scenario: News site with trending articles
// Patterns adapt automatically:
// - Morning: Breaking news articles hot
// - Afternoon: Opinion pieces hot
// - Evening: Sports content hot

// Cache warming reflects current trends
// No manual cache management needed

Monitoring

Log Output

Pattern Logger provides detailed logs on startup and during operation:

[INFO] PatternLogger: Buffer size set to 3000 entries (based on cleanup_interval: 300s)
[INFO] Loaded 15234 pattern entries
[INFO] Warming cache from query patterns...
[INFO] Cache warming complete: 50 hot records loaded in 125ms
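If you want to track warming behavior over time, the log lines above can be scraped. A small sketch based on the log format shown here (the format is not a documented API, so treat the regex as an assumption that may break across versions):

```python
import re

# Matches the "Cache warming complete" line shown in the sample logs above.
WARMING = re.compile(r"Cache warming complete: (\d+) hot records loaded in (\d+)ms")

def parse_warming(line):
    """Return (record_count, duration_ms) from a warming log line, or None."""
    m = WARMING.search(line)
    return (int(m.group(1)), int(m.group(2))) if m else None

print(parse_warming("[INFO] Cache warming complete: 50 hot records loaded in 125ms"))
# (50, 125)
```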

Best Practices

1. Let It Run Automatically

Don't disable pattern logging unless you have a specific reason. The benefits far outweigh the minimal overhead.

2. Monitor Hot Records

Periodically review which records are hot:

  • Optimize frequently accessed data
  • Add indexes for hot queries
  • Pre-compute expensive operations

3. Clean Old Patterns

Patterns are cleaned automatically based on the cleanup_interval setting. Patterns naturally age out over time, and a database restart will rebuild them from fresh access data.

4. Combine with Indexes

Pattern tracking identifies hot records, indexes optimize hot queries:

// Pattern tracking: Pre-loads frequently accessed users
// Index: Makes user lookups by email fast
await client.createIndex('users', ['email']);

// Both work together for optimal performance

Storage & Maintenance

Pattern Storage

Patterns are stored in an append-only log managed automatically by ekoDB. Each entry records the collection, record, operation type, and timestamp.
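Conceptually, each entry is a small append to a log. The sketch below illustrates the idea with JSON lines; the field names and file format are illustrative, since ekoDB's on-disk pattern format is internal:

```python
import json
import os
import tempfile
import time

def append_pattern(path, collection, record_id, operation):
    """Append one access-pattern entry (collection, record, operation type,
    timestamp) to an append-only JSON-lines log."""
    entry = {"collection": collection, "record_id": record_id,
             "op": operation, "ts": time.time()}
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

log = os.path.join(tempfile.mkdtemp(), "patterns.log")
append_pattern(log, "users", "u123", "FindById")
append_pattern(log, "products", "p9", "TextSearch")
print(sum(1 for _ in open(log)))  # 2 entries, appended in order
```

Append-only writes keep the logging path cheap: no seeks, no rewrites, just sequential I/O flushed in batches.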

Storage Requirements

Pattern logging is lightweight β€” approximately 80 bytes per entry:

Entries         Disk Usage
1,000,000       ~80 MB
10,000,000      ~800 MB

Cleanup Strategy

Recommended cleanup schedule:

Data Size    Cleanup Frequency    Keep Days
< 1GB        Monthly              30 days
1-10GB       Weekly               14 days
> 10GB       Daily                7 days

Troubleshooting

Patterns Not Loading

Check logs for errors:

[WARN] Failed to parse pattern entry: Invalid timestamp

Solution: If pattern data is corrupted, restart the database β€” patterns will rebuild automatically from new access data.

High Memory Usage

Check buffer size:

# In database logs
[INFO] PatternLogger: Buffer size set to 10000 entries

Solution: Reduce cleanup interval to decrease buffer:

export CLEANUP_INTERVAL_SECONDS=60  # Smaller buffer

Slow Startup

Check hot record count:

# In database logs
[INFO] Cache warming complete: 500 hot records loaded in 2500ms

Solution: Too many hot records being loaded. This is automatically tuned, but if startup is critical:

  • Patterns naturally age out
  • Clean old patterns more frequently
  • Consider if 500 hot records is appropriate for your use case

Integration with Other Features

With Multi-Region (Ripple)

Each region maintains its own patterns:

  • Region A: Tracks Region A's access patterns
  • Region B: Tracks Region B's access patterns
  • Benefit: Each region optimizes for its users

With File Pool

Uses global file descriptor management:

  • Pattern file = 1 FD
  • Respects system limits
  • Automatic retry/backoff

With Disk Cache

Works seamlessly together:

  • Hot records β†’ Memory cache (instant)
  • Warm records β†’ Disk cache (fast)
  • Cold records β†’ Database (acceptable)

Summary

Cache warming eliminates cold-start delays automatically. ekoDB learns which data matters, pre-loads it on restart, and adapts as access patterns change β€” no configuration required.