
ekoDB White Paper

A Multi-Model Database System

By Sean M. Vazquez, Creator of ekoDB


Executive Summary

ekoDB is an in-memory database for unstructured data, backed by disk persistence. Built with Rust, it combines in-memory performance with durability guarantees through Write-Ahead Logging (WAL).

Key Capabilities:

  • Multi-Model Architecture: Key-value, document, vector search, and real-time messaging in a single system
  • High Performance: Optimized for high-throughput workloads with configurable durability
  • Adaptive Scaling: Automatically tunes from IoT devices to enterprise servers
  • Secure by Default: HTTPS/WSS-only, AES-GCM encryption at rest, TLS/SSL in transit
  • Memory Efficient: Low memory footprint with adaptive allocation
  • Distributed: Ripple system for horizontal scaling and cross-database propagation

1. Introduction

1.1 The Problem

Applications often require multiple database systems to handle different workloads:

  • MongoDB for document storage
  • Redis for caching and real-time data
  • Elasticsearch for full-text search
  • Pinecone for vector search
  • PostgreSQL for relational data

This multi-database approach introduces complexity, operational overhead, and integration challenges.

1.2 The Solution

ekoDB combines features from multiple database types into a single system. It provides document storage, key-value operations, full-text search, and vector search in one database.

1.3 History

ekoDB originated from a practical challenge: integrating multiple databases required complex layers of abstraction to achieve feature parity and consistent usage patterns.

Development Timeline:

  • 2013: Initial development as SOLO (Single Object Language Operator), an API gateway for multi-database integration
  • 2013-2022: SOLO operated as an API gateway connecting various database systems
  • 2022: Decision to eliminate the abstraction layer and build a unified database
  • 2022-2025: Complete from-scratch rewrite as ekoDB
  • Current: v0.20.0 (October 2025) - Active development

2. Architecture

2.1 Storage Architecture

ekoDB uses a hybrid in-memory architecture with disk persistence:

┌───────────────────────────────────────────┐
│ In-Memory Layer (Hot Data)                │
│ ┌───────────────────────────────────────┐ │
│ │ Concurrent Hash Maps                  │ │
│ │ - O(1) lookups                        │ │
│ │ - Lock-free reads                     │ │
│ │ - Reader-writer locks                 │ │
│ └───────────────────────────────────────┘ │
└───────────────────────────────────────────┘

┌───────────────────────────────────────────┐
│ Multi-Tier LRU Cache (Warm Data)          │
│ ┌───────────────────────────────────────┐ │
│ │ - Record Cache                        │ │
│ │ - Query Cache                         │ │
│ │ - Search Cache                        │ │
│ │ - KV Cache                            │ │
│ └───────────────────────────────────────┘ │
└───────────────────────────────────────────┘

┌───────────────────────────────────────────┐
│ Disk Persistence (Cold Data)              │
│ ┌───────────────────────────────────────┐ │
│ │ Write-Ahead Log (WAL)                 │ │
│ │ - Encrypted (AES-GCM)                 │ │
│ │ - Compressed (ZSTD/LZ4)               │ │
│ │ - Dual-mode (Fast/Durable)            │ │
│ └───────────────────────────────────────┘ │
└───────────────────────────────────────────┘

Key Features:

  • In-Memory Primary Storage: All active data stored in memory for fast access
  • Automatic Eviction: LRU-based eviction when memory limits are reached
  • Transparent Reload: Cold data automatically loaded from disk when accessed
  • Larger-than-Memory Support: Datasets can exceed available RAM

2.2 Data Model

ekoDB supports multiple data models in a unified system:

Document Model

{
  "name": "John Doe",
  "email": "john@example.com",
  "age": 30,
  "tags": ["developer", "rust"],
  "address": {
    "city": "San Francisco",
    "country": "USA"
  }
}

Key-Value Model

{
  "session:user123": {
    "token": "abc123",
    "expires": "2025-10-15T00:00:00Z"
  }
}

Vector Model

{
  "content": "ekoDB is a high-performance database",
  "embedding": [0.1, 0.2, 0.3, ...],  // 384-dimensional vector
  "metadata": {
    "source": "documentation"
  }
}
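
Vector records are written through the same insert API as ordinary documents. A minimal sketch, assuming the embedding is produced by an external model; the collection name and the getEmbedding() helper are illustrative, not part of ekoDB:

// Store a record together with its embedding
const embedding = await getEmbedding("ekoDB is a high-performance database"); // hypothetical helper, e.g. 384 floats

await client.insert("documents", {
  content: "ekoDB is a high-performance database",
  embedding: embedding,
  metadata: { source: "documentation" }
});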

2.3 Type System

ekoDB provides a flexible type system with optional type enforcement:

Supported Types:

  • String - UTF-8 text
  • Integer - 64-bit signed integers
  • Float - 64-bit floating point
  • Decimal - High-precision decimals
  • Boolean - true/false
  • Null - Null/empty value
  • Array - Ordered lists
  • Set - Unordered unique values
  • Map - Key-value pairs
  • Vector - Fixed-dimension float arrays (for embeddings)
  • Object - Nested documents
  • Binary - Raw binary data
  • Date - ISO 8601 date strings
  • DateTime - ISO 8601 date-time strings with timezone

Response Formats:

  • Typed: Includes type metadata (e.g., {"type": "String", "value": "text"})
  • Non-Typed: Traditional NoSQL format (e.g., "text")
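
For illustration, the same record in both formats (a sketch based on the formats above; the exact response envelope may differ):

// Typed response: each value carries type metadata
{
  "name": { "type": "String", "value": "John Doe" },
  "age": { "type": "Integer", "value": 30 }
}

// Non-typed response: traditional NoSQL format
{
  "name": "John Doe",
  "age": 30
}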

2.4 Configuration Options

ekoDB provides flexible configuration options to tune behavior for different use cases:

Response Format Configuration

  • Typed Responses: Include type metadata for strong typing
  • Non-Typed Responses: Traditional NoSQL format for simplicity
  • Per-Request Override: Configure via query parameters or headers

Durability Configuration

  • Fast WAL Mode: High throughput, buffered writes
  • Durable WAL Mode: Guaranteed persistence, immediate fsync
  • Per-Operation Override: Choose durability level per write operation
  • Collection-Level Settings: Configure default durability per collection

Example Configuration

// Configure client for typed responses
const client = new EkoDBClient({
  baseURL: "https://your-instance.ekodb.io",
  apiKey: "your-api-key",
  useTypedValues: true  // Enable typed responses
});

// Override durability per operation
await client.insert("critical_data", record, {
  durableWrite: true  // Force durable WAL for this write
});

await client.insert("logs", logEntry, {
  durableWrite: false  // Use fast WAL for high throughput
});

Configuration Benefits:

  • Flexibility: Choose the right trade-offs per use case
  • Performance: Optimize for throughput or latency
  • Type Safety: Enable strong typing when needed
  • Durability: Balance speed vs. guaranteed persistence

3. Performance

3.1 Write Performance

ekoDB offers two WAL modes optimized for different use cases:

Mode          Throughput            Durability         Use Case
Fast WAL      High throughput       Buffered writes    High-throughput ingestion
Durable WAL   Moderate throughput   Immediate fsync    Critical data

Dual-Node Strategy: Deploy a primary node with Fast WAL and a secondary node with Durable WAL (via Ripple replication) to achieve both high performance and durability.

3.2 Read Performance

  • Indexed Lookups: O(1) hash index, O(log n) B-tree index
  • Point Queries: Sub-millisecond latency
  • Range Queries: Logarithmic time complexity
  • Full-Text Search: O(k) where k is number of matching terms
  • Vector Search: Optimized flat index with early termination

3.3 Memory Efficiency

  • Base Memory: Low memory footprint optimized for efficiency
  • Compression: Significant space savings with ZSTD (configurable)
  • Adaptive Allocation: 1%-80% of available RAM
  • LRU Eviction: Automatic memory management

4. Indexing

4.1 Index Types

ekoDB implements multiple index types optimized for different query patterns:

Hash Indexes (Default)

  • Complexity: O(1) for equality queries
  • Use Case: WHERE field = value
  • Automatic: Created for frequently queried fields

B-Tree Indexes

  • Complexity: O(log n) for range queries
  • Use Case: WHERE field > value, sorting
  • Features: Supports <, >, <=, >=, BETWEEN

Inverted Indexes

  • Complexity: O(k) where k = matching terms
  • Use Case: Full-text search
  • Features: Stemming, fuzzy matching, tokenization

Vector Indexes

  • Current: Flat index with optimizations
  • Planned: HNSW (Hierarchical Navigable Small World)
  • Use Case: Semantic similarity search
  • Metrics: Cosine similarity, Euclidean distance, dot product
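
As a concrete illustration, the query-builder filters from the Quick Start (Section 14.1) map directly onto these index types. The "Gt" operator is shown there; the "Eq" operator below is assumed by analogy:

// Equality condition - typically served by a hash index (O(1))
const byEmail = await client.find("users", {
  filter: {
    type: "Condition",
    content: { field: "email", operator: "Eq", value: "john@example.com" } // "Eq" is an assumption
  }
});

// Range condition - typically served by a B-tree index (O(log n))
const adults = await client.find("users", {
  filter: {
    type: "Condition",
    content: { field: "age", operator: "Gt", value: 25 }
  }
});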

4.2 Index Management

  • Automatic Creation: Indexes created based on query patterns
  • Automatic Maintenance: Updated on insert/update/delete
  • Concurrent Access: Thread-safe operations
  • Memory Efficient: Weak references and LRU eviction

5. Concurrency & Isolation

5.1 Concurrency Control

ekoDB uses Two-Phase Locking (2PL) with reader-writer locks:

  • Collection-Level Granularity: Locks per collection
  • Multiple Readers: Concurrent reads allowed
  • Single Writer: One writer per collection at a time
  • Lock-Free Cross-Collection: Operations on different collections don't block each other

5.2 Isolation Levels

Within a Single Collection: Serializable

  • Reader-writer locks ensure serializable execution
  • No dirty reads, non-repeatable reads, or phantom reads

Across Multiple Collections: Read Uncommitted (effectively)

  • No coordination between collections
  • Applications must handle cross-collection consistency
  • Multi-collection transactions are planned for a future release
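
In practice, an application that writes related records to two collections must compensate for partial failures itself. A minimal sketch (the collection names, the returned id, and the delete call are assumptions for illustration):

// No multi-collection transactions yet: roll back manually on failure
const order = await client.insert("orders", { userId: "user123", total: 99.0 });
try {
  await client.insert("order_events", { orderId: order.id, event: "created" });
} catch (err) {
  // The second write failed; undo the first write ourselves
  await client.delete("orders", order.id); // delete API assumed for illustration
  throw err;
}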

6. Durability & Recovery

6.1 Write-Ahead Logging (WAL)

ekoDB implements a dual-mode WAL system:

Fast WAL Mode:

  • Buffered writes with periodic fsync
  • High throughput
  • Suitable for high-throughput ingestion

Durable WAL Mode:

  • Immediate fsync after every write
  • Moderate throughput
  • Guaranteed persistence

WAL Management:

  • Automatic rotation at 50MB-1GB (adaptive)
  • Manual rotation available
  • Log compaction removes redundant entries
  • Automatic cleanup after replication

6.2 Recovery Process

  1. Scan WAL Files: Identify all WAL files in order
  2. Validate: Check integrity and checksums
  3. Replay Entries: Apply operations in order
  4. Rebuild Indexes: Reconstruct all indexes
  5. Resume Operations: Database ready for queries

Recovery Time: Depends on WAL size, typically seconds to minutes


7. Search Capabilities

7.1 Full-Text Search

  • Inverted Index: Maps terms to documents
  • Tokenization: Automatic text processing
  • Stemming: Language-aware word normalization
  • Fuzzy Matching: Typo tolerance (Levenshtein distance)
  • Field Weighting: Prioritize specific fields
  • Minimum Score: Filter by relevance threshold

7.2 Vector Search

  • Semantic Similarity: Find similar documents by meaning
  • Embedding Support: 384, 768, 1536 dimensions (configurable)
  • Distance Metrics: Cosine similarity, Euclidean, dot product
  • Metadata Filtering: Combine vector search with filters
  • Top-K Results: Efficient heap-based selection

7.3 Hybrid Search

Combine text and vector search:

  • Text search for keyword matching
  • Vector search for semantic similarity
  • Unified scoring and ranking
  • Combined text and semantic matching
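
The search API itself is not specified in this paper. As a purely hypothetical sketch, a hybrid text-plus-vector query combining the capabilities above might look like the following (the search method and all parameter names are assumptions, not the confirmed client API):

// Hypothetical hybrid search call - names are illustrative only
const results = await client.search("documents", {
  text: "high performance database",   // keyword matching via the inverted index
  vector: queryEmbedding,               // semantic similarity via the vector index
  topK: 10,                             // heap-based top-K selection
  minScore: 0.5,                        // relevance threshold
  filter: { source: "documentation" }   // metadata filtering
});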

8. Distributed Architecture

8.1 Ripple System

ekoDB's distributed architecture uses the Ripple system for horizontal scaling:

┌───────────────────────────────────────────┐
│ Regional Cluster                          │
│ ┌───────────────────────────────────┐     │
│ │ Managed Instance Groups (MIGs)    │     │
│ │ ┌───────────┐  ┌───────────┐      │     │
│ │ │ Node 1    │  │ Node 2    │  ... │     │
│ │ │ (Primary) │  │(Secondary)│      │     │
│ │ └───────────┘  └───────────┘      │     │
│ └───────────────────────────────────┘     │
└───────────────────────────────────────────┘
                   ↕ Ripple
┌───────────────────────────────────────────┐
│ Multi-Tenant Single Nodes                 │
│ ┌──────────┐  ┌──────────┐  ┌──────────┐  │
│ │ Tenant A │  │ Tenant B │  │ Tenant C │  │
│ └──────────┘  └──────────┘  └──────────┘  │
└───────────────────────────────────────────┘

Features:

  • Cross-Database Propagation: Replicate data across ekoDB instances
  • Horizontal Scaling: Add nodes for increased capacity
  • Regional Distribution: Deploy across geographic regions
  • Automatic Failover: Secondary nodes take over on failure

8.2 Replication

  • Asynchronous Replication: Non-blocking writes
  • Configurable Targets: Replicate to multiple destinations
  • Selective Replication: Choose which collections to replicate
  • Conflict Resolution: Last-write-wins strategy
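
To illustrate how these options fit together, a Ripple replication configuration could conceptually look like the following (all keys and values are hypothetical and shown only to make the concepts concrete; consult the documentation for the actual schema):

// Hypothetical replication settings - not the actual configuration schema
{
  "ripple": {
    "mode": "async",                                    // asynchronous, non-blocking writes
    "targets": ["https://replica-eu.example.ekodb.io"], // one or more destinations
    "collections": ["users", "orders"],                 // selective replication
    "conflictResolution": "last-write-wins"
  }
}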

9. Security

9.1 Network Security

HTTPS/WSS Only: Unlike traditional databases that use direct TCP connections, ekoDB exclusively uses HTTPS and WSS (Secure WebSocket) protocols.

Benefits:

  • No direct TCP exposure
  • Built-in encryption (TLS/SSL)
  • Standard web protocols
  • Firewall friendly (port 443)
  • Certificate-based security
  • Protection against man-in-the-middle attacks

9.2 Encryption

At Rest:

  • AES-GCM encryption for all disk-persisted data
  • Encrypted WAL files
  • Encrypted indexes
  • Configurable encryption keys

In Transit:

  • TLS/SSL for all network communication
  • Certificate validation
  • Perfect forward secrecy

9.3 Authentication

  • JWT Tokens: Industry-standard authentication
  • API Keys: Simple authentication for services
  • Role-Based Access: Planned for a future release
  • Token Expiration: Configurable TTL
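
API-key authentication is the pattern used in this paper's examples; JWT-based authentication would follow the same client setup. A sketch, where the token option is an assumption for illustration:

// API key authentication (as shown in the Quick Start)
const serviceClient = new EkoDBClient({
  baseURL: "https://your-instance.ekodb.io",
  apiKey: "your-api-key"
});

// JWT authentication - the "token" option is illustrative, not a confirmed parameter
const userClient = new EkoDBClient({
  baseURL: "https://your-instance.ekodb.io",
  token: "eyJhbGciOi..."  // short-lived JWT with a configurable TTL
});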

10. Use Cases

10.1 AI Agents & Workflows

  • Vector Search: Store and query embeddings
  • Chat History: Built-in chat session management
  • Context Management: Efficient retrieval of relevant context
  • Real-Time Updates: WebSocket for live agent interactions

10.2 Real-Time Analytics

  • High Throughput: 1M-2M records/sec ingestion
  • In-Memory Processing: Sub-millisecond queries
  • Time-Series Data: Efficient storage and retrieval
  • Aggregations: Fast analytical queries

10.3 IoT Data Processing

  • Adaptive Scaling: Runs on resource-constrained devices
  • Edge Computing: Deploy close to data sources
  • Efficient Storage: Compression reduces disk usage
  • Batch Operations: Handle bursts of sensor data

10.4 Session Management

  • TTL Support: Automatic session expiration
  • Fast Lookups: O(1) key-value operations
  • High Concurrency: Handle thousands of sessions
  • Persistence: Optional durability for sessions
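
A session-store sketch using the key-value model (the kvSet/kvGet method names and the ttl option are assumptions, shown only to illustrate the pattern):

// Hypothetical key-value session with TTL - method names are illustrative
await client.kvSet("session:user123", {
  token: "abc123",
  expires: "2025-10-15T00:00:00Z"
}, { ttl: 3600 });  // expire automatically after one hour

const session = await client.kvGet("session:user123");  // O(1) lookup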

10.5 Content Delivery

  • Distributed Caching: Ripple for multi-region deployment
  • Fast Reads: In-memory performance
  • Compression: Reduce bandwidth usage
  • Real-Time Updates: WebSocket for live content

11. Roadmap

11.1 Near-Term (2025)

  • Multi-Collection Transactions: ACID transactions across collections
  • MVCC: Multi-Version Concurrency Control
  • HNSW Vector Index: Improved vector search performance
  • Stored Procedures: Multi-language support (JavaScript, Python, Rust)

11.2 Mid-Term (2026)

  • GPU Acceleration: Hardware acceleration for vector operations
  • Columnar Storage: Optional DSM for analytical workloads
  • Analytics Functions: Aggregation operations
  • Role-Based Access Control: Fine-grained permissions

11.3 Long-Term (2027+)

  • Distributed Transactions: Cross-node ACID transactions
  • SQL Interface: Optional SQL query support
  • Time-Series Optimization: Specialized time-series features
  • Machine Learning Integration: Built-in ML model serving

12. Comparison

12.1 vs. MongoDB

Feature            ekoDB               MongoDB
Performance        High throughput     Moderate throughput
Memory             Low footprint       Higher footprint
Vector Search      Native              Atlas Search (separate)
Full-Text Search   Native              Atlas Search (separate)
Real-Time          WebSocket native    Change Streams
Encryption         Default             Enterprise only

12.2 vs. Redis

Feature       ekoDB                 Redis
Data Model    Multi-model           Key-value
Persistence   WAL (configurable)    RDB/AOF
Search        Full-text + Vector    RediSearch module
Durability    Configurable          Trade-off with performance
Queries       Complex filters       Limited
Documents     Native JSON           RedisJSON module

12.3 vs. Elasticsearch

Feature             ekoDB                  Elasticsearch
Primary Use         Multi-model database   Search engine
Write Performance   High throughput        Moderate throughput
Memory              Low footprint          High footprint
Vector Search       Native                 Dense vector field
CRUD Operations     Optimized              Secondary feature
Real-Time           WebSocket              Polling

13. Conclusion

ekoDB is a multi-model database that combines features from document stores, key-value databases, and search engines. Built with Rust, it provides in-memory performance with configurable durability.

Key Features:

  1. Multi-Model: Document, key-value, vector, and search operations
  2. In-Memory Architecture: Primary storage in memory with disk persistence
  3. Adaptive Scaling: Configurable resource allocation
  4. HTTPS/WSS Communication: Standard web protocols for client connections
  5. Client Libraries: Rust, Python, TypeScript, Go, Kotlin, and JavaScript support
  6. Encryption: AES-GCM at rest, TLS/SSL in transit

Target Audience:

  • Startups building AI-powered applications
  • Enterprises consolidating database infrastructure
  • IoT deployments requiring edge computing
  • Real-time analytics platforms
  • Content delivery networks
  • Session management systems

14. Getting Started

14.1 Quick Start

# Install client library
npm install @ekodb/ekodb-client

// Connect to ekoDB
import { EkoDBClient } from "@ekodb/ekodb-client";

const client = new EkoDBClient({
  baseURL: "https://your-subdomain.production.aws.ekodb.io",
  apiKey: "your-api-key"
});

await client.init();

// Insert a document
await client.insert("users", {
  name: "John Doe",
  email: "john@example.com"
});

// Query documents using the ekoDB query builder
const query = {
  filter: {
    type: "Condition",
    content: {
      field: "age",
      operator: "Gt",
      value: 25
    }
  }
};
const users = await client.find("users", query);

14.2 Resources


15. About the Author

Sean M. Vazquez is the creator and lead developer of ekoDB. With over a decade of experience in software development, Sean founded ekoDB Inc. to solve the challenges of multi-database integration through a unified database system.

Contact: sean@ekodb.io


This white paper describes ekoDB v0.20.0 (October 2025). Features and specifications are subject to change as the system evolves.