System Administration
Administrative endpoints for monitoring, maintenance, and system analysis.
All endpoints in this section require admin permissions.
Health Monitoring
Check System Health
Get overall system health status. This endpoint is partially public — returns basic status without auth, detailed metrics with admin auth.
Client library examples:

Rust:

// Returns Ok(()) if healthy, Err if not
client.health_check().await?;

Python:

# Returns True if healthy
is_healthy = client.health_check()

TypeScript / JavaScript:

const isHealthy = await client.health();

Kotlin:

val isHealthy = client.health()

Go:

err := client.Health()
if err != nil {
    log.Fatal("Server unhealthy:", err)
}
Client libraries provide a simple health check that returns success/failure. For detailed system metrics, use the Direct API with an admin token.
# Public (no auth) - basic status
curl https://{EKODB_API_URL}/api/health
# Response
{
"status": "ok"
}
# Admin auth - detailed metrics
curl https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}"
Response (with admin auth — detailed):
{
"status": "ok",
"version": "1.2.3",
"timestamp": "2026-01-15T15:45:30Z",
"capacity": {
"can_accept_requests": true,
"status": "available",
"retry_after_ms": 0,
"connections_available": 450,
"utilization_percent": 10.0
},
"system_resources": {
"cpu_count": 8,
"total_memory_gb": 16,
"disk_free_gb": 100
},
"batch_settings": {
"parallel_max": 1000,
"sequential_max": 10000,
"insert_batch_size": 500,
"update_batch_size": 200,
"delete_batch_size": 300,
"query_batch_size": 1000
},
"active_operations": {
"has_active": true,
"total": 5,
"by_type": {"insert": 2, "query": 3},
"under_stress": []
},
"file_pool": {
"fd_in_use": 50,
"fd_max": 500,
"fd_available": 450,
"fd_utilization_percent": 10.0,
"eviction_count": 0,
"total_permit_wait_time_ms": 0
},
"retry_after_ms": null,
"adaptive_strategy": {
"parallelism_for_100_items": {
"preparation_cpu_bound": 8,
"validation_mixed": 4,
"disk_io_bound": 16
},
"chunk_sizing": {
"records_1000": 250,
"records_10000": 500
}
},
"performance_settings": {
"max_concurrent_ops": 100,
"operation_timeout_secs": 30,
"compression_level": 6
},
"storage_limits": {
"file_pool_max_size_mb": 256,
"disk_cache_max_size_mb": 1024,
"wal_max_size_mb": 512,
"memory_cache_size_mb": 4096
}
}
Capacity Statuses:
- `available` — Normal operation, accepting requests
- `busy` — High utilization (70–90%), still accepting requests
- `overloaded` — Over capacity, check `retry_after_ms`
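The statuses above map naturally to a client-side backoff policy. A minimal sketch (the `capacity_action` helper and the 1000 ms fallback wait are assumptions for illustration, not part of the API):

```python
def capacity_action(capacity):
    """Suggest a client reaction to the "capacity" block of /api/health."""
    status = capacity.get("status", "available")
    if status == "overloaded" or not capacity.get("can_accept_requests", True):
        # Back off for retry_after_ms before sending more requests
        wait_ms = capacity.get("retry_after_ms") or 1000
        return f"backoff:{wait_ms}ms"
    if status == "busy":
        # Requests are still accepted, but reduce concurrency
        return "throttle"
    return "proceed"

print(capacity_action({"status": "available", "can_accept_requests": True}))   # proceed
print(capacity_action({"status": "overloaded", "can_accept_requests": False,
                       "retry_after_ms": 500}))                                # backoff:500ms
```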
Write-Ahead Log (WAL)
The Write-Ahead Log ensures data durability and enables replication.
Get WAL Health
Check the status and health of the WAL system.
curl https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}"
# Response
{
"is_healthy": true,
"last_flush": "2026-01-15T15:45:30Z",
"buffer_utilization": 0.15,
"retry_rate": 0.0,
"file_size_mb": 100.0,
"max_size_mb": 512
}
Fields:
| Field | Description |
|---|---|
| `is_healthy` | `true` if the WAL file size is below 90% of `max_size_mb` |
| `last_flush` | Timestamp of the most recent WAL flush to disk |
| `buffer_utilization` | Fraction of the WAL write buffer in use (0.0–1.0) |
| `retry_rate` | Rate of write retries (0.0 = no retries) |
| `file_size_mb` | Current WAL file size in megabytes |
| `max_size_mb` | Maximum WAL file size before rotation is recommended |
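The relationship between these fields can be sketched directly. The 90% cutoff comes from the `is_healthy` description above; the helper name is hypothetical:

```python
def wal_is_healthy(file_size_mb, max_size_mb):
    """Healthy while the WAL file is below 90% of its configured maximum."""
    return file_size_mb < 0.9 * max_size_mb

print(wal_is_healthy(100.0, 512))  # True, matches the sample response
print(wal_is_healthy(470.0, 512))  # False, 470 MB is past 90% of 512 MB
```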
Rotate WAL
Manually rotate the WAL to a new file.
curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"
# Response
{
"status": "success",
"message": "WAL rotated successfully"
}
Typical times to rotate:

- Before backups
- When `file_size_mb` is approaching `max_size_mb` (check via `/api/wal/health`)
- During maintenance windows
- For replication synchronization
Get WAL Entries
Retrieve WAL entries within a time range (used for replication and gap filling).
curl "https://{EKODB_API_URL}/api/wal/entries?from_timestamp=1705329600&to_timestamp=1705333200" \
-H "Authorization: Bearer {ADMIN_TOKEN}"
# Response
{
"entries": [...],
"count": 150,
"from_timestamp": 1705329600,
"to_timestamp": 1705333200
}
Query Parameters:
- `from_timestamp` - Start Unix timestamp in seconds (required)
- `to_timestamp` - End Unix timestamp in seconds (required)
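Building the request URL for a recent time window is mostly timestamp arithmetic. A small sketch using only the standard library (the `wal_entries_url` helper is hypothetical; `{EKODB_API_URL}` is a placeholder as elsewhere on this page):

```python
import time
from urllib.parse import urlencode

def wal_entries_url(base, window_secs=3600, now=None):
    """URL for WAL entries covering the last window_secs seconds."""
    end = int(now if now is not None else time.time())
    start = end - window_secs
    qs = urlencode({"from_timestamp": start, "to_timestamp": end})
    return f"{base}/api/wal/entries?{qs}"

# Reproduces the one-hour window from the example above
print(wal_entries_url("https://{EKODB_API_URL}", now=1705333200))
```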
Replication
Receive WAL Shipment
Receive and apply WAL entries from a peer instance (for replication and gap filling). This endpoint requires the x-ripple-request-id header to verify the request comes from a configured peer.
POST https://{EKODB_API_URL}/api/replication/wal
Content-Type: application/json
x-ripple-request-id: {REQUEST_ID}
{
"entries": [...],
"timestamp": 1705329600,
"source_deployment_id": "primary-db-01"
}
# Response
{
"status": "ok",
"entries_applied": 2,
"entries_failed": 0,
"total_entries": 2
}
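The summary counters in the response follow directly from applying each entry in turn. A sketch of that bookkeeping (`apply_entry` stands in for the server's real apply logic and is hypothetical):

```python
def summarize_shipment(entries, apply_entry):
    """Tally applied/failed entries the way the response above reports them."""
    applied = failed = 0
    for entry in entries:
        if apply_entry(entry):
            applied += 1
        else:
            failed += 1
    return {"status": "ok", "entries_applied": applied,
            "entries_failed": failed, "total_entries": len(entries)}

result = summarize_shipment([{"op": "insert"}, {"op": "update"}], lambda e: True)
print(result)  # all entries applied, none failed
```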
System Analysis
Get System Analysis
Analyze system hardware capabilities and get recommended configuration settings.
GET https://{EKODB_API_URL}/api/system/analysis
Authorization: Bearer {ADMIN_TOKEN}
# Response
{
"current_specs": {
"cpu_count": 8,
"cpu_frequency": 2400,
"total_memory": 17179869184,
"disk_free": 107374182400,
"has_gpu": false,
"gpu_count": 0,
"total_gpu_memory": 0
},
"recommended_tier": "performance",
"current_settings": {
"batch_parallel_max": 1000,
"batch_sequential_max": 10000,
"insert_batch_size": 500,
"update_batch_size": 200,
"delete_batch_size": 300,
"query_batch_size": 1000,
"max_concurrent_ops": 100,
"operation_timeout": 30,
"compression_level": 6
},
"recommended_settings": {
"batch_parallel_max": 2000,
"batch_sequential_max": 20000,
"insert_batch_size": 1000,
"update_batch_size": 500,
"delete_batch_size": 500,
"query_batch_size": 2000,
"max_concurrent_ops": 200,
"operation_timeout": 60,
"compression_level": 3
},
"recommendations": [
"System has 8 CPU cores - optimal for parallel operations",
"16GB RAM available - can handle large batch operations",
"Consider increasing batch sizes for better throughput",
"Fast compression recommended for better performance"
]
}
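A common use of this response is to see which settings would actually change if you adopted the recommendations. A minimal sketch (the `settings_diff` helper is hypothetical):

```python
def settings_diff(current, recommended):
    """Return only keys that differ, as (current, recommended) pairs."""
    return {k: (current.get(k), v) for k, v in recommended.items()
            if current.get(k) != v}

diff = settings_diff(
    {"insert_batch_size": 500, "compression_level": 6, "operation_timeout": 30},
    {"insert_batch_size": 1000, "compression_level": 3, "operation_timeout": 30},
)
print(diff)  # operation_timeout is unchanged, so it is omitted
```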
Related endpoints:

- System Analysis (`/api/system/analysis`) - Hardware specs and configuration recommendations
- Analytics (`/api/analytics`) - Database statistics, collection sizes, and performance metrics
Get Analytics Data
Get database statistics, collection metrics, and performance data.
GET https://{EKODB_API_URL}/api/analytics
Authorization: Bearer {ADMIN_TOKEN}
# Response
{
"cache": {
"hits": 50000,
"misses": 5000,
"total_requests": 55000,
"hit_rate": 0.909
},
"io": {
"reads": 100000,
"writes": 50000
},
"cpu": {
"cpu_count": 8,
"usage_ratio": 0.35
},
"memory": {
"total_memory": 17179869184,
"used_memory": 6442450944
},
"network": {
"ingress_bytes": 1073741824,
"egress_bytes": 2147483648
},
"database": {
"total_size": 10737418240,
"available_size": 96636764160,
"system_used_space": 5368709120,
"db_used_space": 5368709120,
"max_size": 107374182400
},
"collections": [
["users", {"record_count": 10000, "total_size": 1048576}],
["posts", {"record_count": 50000, "total_size": 5242880}],
["events", {"record_count": 500000, "total_size": 52428800}]
],
"performance_logs": []
}
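Two things to note when consuming this payload: `hit_rate` is `hits / total_requests`, and `collections` is an array of `[name, stats]` pairs rather than an object. A short sketch over a trimmed copy of the sample response:

```python
analytics = {
    "cache": {"hits": 50000, "misses": 5000, "total_requests": 55000},
    "collections": [
        ["users",  {"record_count": 10000,  "total_size": 1048576}],
        ["events", {"record_count": 500000, "total_size": 52428800}],
    ],
}

hit_rate = analytics["cache"]["hits"] / analytics["cache"]["total_requests"]
# Each element is a [name, stats] pair, so index into the pair
largest = max(analytics["collections"], key=lambda pair: pair[1]["total_size"])

print(round(hit_rate, 3))  # 0.909, as in the sample response
print(largest[0])          # events
```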
Get System Logs
Retrieve server logs for troubleshooting. Returns the most recent log lines in chronological order.
curl https://{EKODB_API_URL}/api/system/logs \
-H "Authorization: Bearer {ADMIN_TOKEN}"
# Response
{
"logs": [
"2026-01-15T15:40:00Z INFO [ekodb_server] Database started successfully",
"2026-01-15T15:40:05Z INFO [ekodb_server::handlers] Insert completed for collection 'users'",
"2026-01-15T15:45:30Z ERROR [ekodb_server::handlers] Query timeout after 30s for collection 'events'"
]
}
The response contains an array of log line strings. The number of lines returned is controlled by the `MAX_LOG_LINES` environment variable (default: 1000).
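Since each log entry is a plain string, filtering by severity is a simple substring scan, as in this sketch over a trimmed sample response:

```python
response = {"logs": [
    "2026-01-15T15:40:00Z INFO [ekodb_server] Database started successfully",
    "2026-01-15T15:45:30Z ERROR [ekodb_server::handlers] Query timeout after 30s",
]}

# Match the level token surrounded by spaces to avoid false positives
errors = [line for line in response["logs"] if " ERROR " in line]
print(len(errors))  # 1
```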
Complete Example
Here's a complete system monitoring and maintenance workflow:
#!/bin/bash
# 1. Check system health
health=$(curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}")
echo "System Status: $(echo "$health" | jq -r '.status')"
echo "Capacity: $(echo "$health" | jq -r '.capacity.status')"
echo "CPU Cores: $(echo "$health" | jq -r '.system_resources.cpu_count')"
# 2. Check WAL health
wal_health=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}")
wal_size=$(echo "$wal_health" | jq -r '.file_size_mb')
wal_max=$(echo "$wal_health" | jq -r '.max_size_mb')
echo "WAL Size: ${wal_size}MB / ${wal_max}MB"
# 3. Rotate WAL if needed (approaching max size)
is_healthy=$(echo "$wal_health" | jq -r '.is_healthy')
if [ "$is_healthy" = "false" ]; then
echo "WAL approaching limit, rotating..."
curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"
fi
# 4. Get system analysis
analysis=$(curl -s https://{EKODB_API_URL}/api/system/analysis \
-H "Authorization: Bearer {ADMIN_TOKEN}")
echo "Recommendations:"
echo "$analysis" | jq -r '.recommendations[]'
# 5. Check recent logs for errors
logs=$(curl -s https://{EKODB_API_URL}/api/system/logs \
-H "Authorization: Bearer {ADMIN_TOKEN}")
error_count=$(echo "$logs" | jq '[.logs[] | select(contains("ERROR"))] | length')
echo "Recent Errors: $error_count"
# 6. Get WAL entries for backup (last hour)
end_time=$(date +%s)
start_time=$((end_time - 3600))
curl -s "https://{EKODB_API_URL}/api/wal/entries?from_timestamp=$start_time&to_timestamp=$end_time" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> wal_backup_$(date +%Y%m%d_%H%M%S).json
echo "WAL backup complete"
Best Practices
Health Monitoring
Set Up Regular Health Checks:
# Check health every 5 minutes (a crontab entry must be a single line)
*/5 * * * * curl -s https://{EKODB_API_URL}/api/health | jq -r '.status' | grep -q "ok" || echo "ALERT: Database unhealthy!"
WAL Management
Monitor WAL Size:
# Alert if WAL is approaching max size (is_healthy = false means >90% of max)
is_healthy=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" | jq -r '.is_healthy')
if [ "$is_healthy" = "false" ]; then
echo "WARNING: WAL approaching size limit"
fi
Scheduled Rotation:
# Rotate WAL daily at 3 AM (a crontab entry must be a single line)
0 3 * * * curl -s -X POST https://{EKODB_API_URL}/api/wal/rotate -H "Authorization: Bearer {ADMIN_TOKEN}"
Log Retention
Archive Logs:
# Archive current server logs
curl -s https://{EKODB_API_URL}/api/system/logs \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> archived_logs_$(date +%Y%m%d).json
Replication Setup
Configure Primary-Replica:
# On replica: Fetch and apply WAL entries every minute
*/1 * * * * bash /scripts/replicate_wal.sh
# replicate_wal.sh
#!/bin/bash
PRIMARY_URL="https://primary.ekodb.net"
REPLICA_URL="https://replica.ekodb.net"
# Get latest WAL entries from primary
end_time=$(date +%s)
start_time=$((end_time - 120)) # Last 2 minutes
entries=$(curl -s "$PRIMARY_URL/api/wal/entries?from_timestamp=$start_time&to_timestamp=$end_time" \
-H "Authorization: Bearer {ADMIN_TOKEN}")
# Reshape into the shipment format expected by /api/replication/wal
# (the GET response wraps entries with count/timestamp fields)
payload=$(echo "$entries" | jq --arg src "primary-db-01" \
'{entries: .entries, timestamp: (now | floor), source_deployment_id: $src}')
# Apply to replica
curl -X POST "$REPLICA_URL/api/replication/wal" \
-H "Content-Type: application/json" \
-H "x-ripple-request-id: manual-sync-$(date +%s)" \
-d "$payload"
Performance Optimization
Act on Recommendations:
# Get and apply system recommendations
recommendations=$(curl -s .../api/system/analysis | jq -r '.recommendations[]')
# Example: Create recommended indexes
echo "$recommendations" | grep "Create index" | while read -r rec; do
echo "Creating index: $rec"
# Parse and create index via API
done
# Example: Delete unused indexes
echo "$recommendations" | grep "Delete unused index" | while read -r rec; do
echo "Cleanup needed: $rec"
done
Monitoring Alerts
Critical Metrics to Monitor
| Metric | Warning Threshold | Critical Threshold | Action |
|---|---|---|---|
| Disk usage | > 75% | > 90% | Rotate WAL, archive old data |
| Memory usage | > 80% | > 95% | Restart, scale up |
| Query time (p95) | > 100ms | > 500ms | Create indexes, optimize queries |
| WAL file size | > 800MB | > 1GB | Force rotation |
| Error rate | > 1% | > 5% | Investigate logs |
| Active transactions | > 100 | > 500 | Check for stuck transactions |
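The table above can be encoded as a small classifier. A sketch covering a subset of the metrics (threshold values mirror the table, with percentages as fractions, times in ms, and sizes in MB; the helper name is hypothetical):

```python
# (warning, critical) thresholds from the table above
THRESHOLDS = {
    "disk_usage":   (0.75, 0.90),
    "memory_usage": (0.80, 0.95),
    "query_p95_ms": (100,  500),
    "wal_size_mb":  (800,  1024),
    "error_rate":   (0.01, 0.05),
}

def severity(metric, value):
    """Classify a metric reading as ok, warning, or critical."""
    warn, crit = THRESHOLDS[metric]
    if value > crit:
        return "critical"
    if value > warn:
        return "warning"
    return "ok"

print(severity("disk_usage", 0.82))   # warning
print(severity("wal_size_mb", 1100))  # critical
```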
Sample Monitoring Script
#!/bin/bash
ALERT_EMAIL="ops@example.com"
THRESHOLD_ERRORS=10
# Get health data
health=$(curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}")
# Check capacity status
capacity=$(echo "$health" | jq -r '.capacity.status')
if [ "$capacity" = "overloaded" ]; then
echo "ALERT: Server overloaded" | mail -s "ekoDB Alert" "$ALERT_EMAIL"
fi
# Check WAL health
wal_healthy=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" | jq -r '.is_healthy')
if [ "$wal_healthy" = "false" ]; then
echo "ALERT: WAL approaching size limit" | mail -s "ekoDB Alert" "$ALERT_EMAIL"
fi
# Check recent errors in logs
error_count=$(curl -s https://{EKODB_API_URL}/api/system/logs \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '[.logs[] | select(contains("ERROR"))] | length')
if [ "$error_count" -gt "$THRESHOLD_ERRORS" ]; then
echo "ALERT: $error_count errors in recent logs" | mail -s "ekoDB Alert" "$ALERT_EMAIL"
fi
Troubleshooting
High Memory Usage
# Check analytics for collection sizes
curl -s https://{EKODB_API_URL}/api/analytics \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.collections[] | {name: .[0], size: .[1].total_size, records: .[1].record_count}' \
| jq -s 'sort_by(.size) | reverse | .[0]'  # largest collection
# Check memory metrics
curl -s https://{EKODB_API_URL}/api/analytics \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.memory'
# Check health status for memory usage
curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.system_resources.total_memory_gb'
# Consider:
# - Archiving old data from large collections
# - Implementing pagination
# - Adding query result limits
# - Clearing unused indexes
Slow Query Performance
# Check if indexes exist for frequently queried fields
curl -s https://{EKODB_API_URL}/api/indexes/query/{collection} \
-H "Authorization: Bearer {ADMIN_TOKEN}"
# Use explain to analyze query performance
curl -X POST https://{EKODB_API_URL}/api/query/{collection}/explain \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"filter": {
"type": "Condition",
"content": {
"field": "your_field",
"operator": "Eq",
"value": "your_value"
}
}
}'
# Create indexes for fields without them
curl -X POST https://{EKODB_API_URL}/api/indexes/query/{collection} \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{"field": "frequently_queried_field", "index_type": "btree"}'
# Check batch settings and adjust if needed
curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.batch_settings'
WAL Disk Space Issues
# Check WAL health and size
curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '{is_healthy, file_size_mb, max_size_mb}'
# Rotate immediately if unhealthy
is_healthy=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq -r '.is_healthy')
if [ "$is_healthy" = "false" ]; then
echo "WAL approaching limit, rotating..."
curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"
fi
# Get WAL entries for backup before cleanup
end_time=$(date +%s)
start_time=$((end_time - 86400)) # Last 24 hours
curl -s "https://{EKODB_API_URL}/api/wal/entries?from_timestamp=$start_time&to_timestamp=$end_time" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> wal_backup_$(date +%Y%m%d).json
# After backing up, rotation will free space
Ripple Configuration
For multi-node deployments, configure data propagation between instances:
# Configure ripples on a node (add each peer separately)
curl -X POST https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": "peer1",
"url": "https://peer1.ekodb.net:8080",
"api_key": "peer1-admin-key",
"mode": "Operations",
"enabled": true
}'
curl -X POST https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": "peer2",
"url": "https://peer2.ekodb.net:8080",
"api_key": "peer2-admin-key",
"mode": "Operations",
"enabled": true
}'
# List configured ripples
curl -X GET https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}"
# Check ripple health
curl -X GET https://{EKODB_API_URL}/api/ripples/health \
-H "Authorization: Bearer {ADMIN_TOKEN}"
Replication Roles:
- `primary` - Propagate writes to peers (primary nodes)
- `replica` - Receive updates from peers (read replicas)
- `peer` - Full bidirectional sync (multi-master)
- `standalone` - No replication (isolated nodes)
See Ripples - Data Propagation for comprehensive guides on multi-region deployments, read scaling, high availability architectures, and data pipeline patterns.
Manifest Recovery
Rebuild Collection Manifests
Rebuild collection manifests from individual record files. Useful for recovering from schema evolution issues, stale manifest data after upgrades, or collection loading problems after crashes.
POST https://{EKODB_API_URL}/api/admin/rebuild-manifests
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
# Rebuild specific collections
{
"collections": ["users", "orders"]
}
# Rebuild ALL collections (omit collections field)
{}
# Response
{
"rebuilt": ["users", "orders"],
"message": "Successfully rebuilt 2 collection(s)"
}
This operation automatically backs up existing manifests (.backup extension) before rebuilding, but it's recommended to have a full backup before running this in production.
Use Cases:
- Schema Evolution Issues - Records missing after adding new required fields
- Stale Manifest Data - Manifest doesn't reflect actual records on disk
- Post-Crash Recovery - Collection loading problems after unexpected shutdown
- Version Upgrades - Manifest format changes between versions
What It Does:
- Backs up existing manifest files
- Scans individual record files
- Rebuilds manifest from actual record data
- Updates in-memory collections
Public Endpoint Rate Limiting
ekoDB includes configurable IP-based rate limiting for public endpoints that don't require authentication. This protects against DDoS and brute force attacks.
Protected Endpoints
- `/api/health` - Health check
- `/api/auth/register` - API key registration
- `/api/auth/token` - Token generation
- `/api/auth/{api_key}/admin_key` - Admin key retrieval
Configuration
Configure via environment variables or database config:
| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| Enabled | PUBLIC_ENDPOINT_RATE_LIMIT_ENABLED | true | Enable/disable rate limiting |
| Limit | PUBLIC_ENDPOINT_RATE_LIMIT_PER_MINUTE | 60 | Max requests per IP per minute |
Rate Limit Response
When rate limited, the server returns:
HTTP/1.1 429 Too Many Requests
Retry-After: 45
Content-Type: application/json
{
"error": "Rate limit exceeded",
"retry_after_secs": 45
}
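Clients should honor the `Retry-After` header before retrying. A minimal sketch of that handling (the `retry_delay_secs` helper and its 60-second fallback are assumptions, not part of any client library):

```python
def retry_delay_secs(status, headers, default=60):
    """Seconds to wait before retrying; 0 if the request was not rate limited."""
    if status != 429:
        return 0
    try:
        return int(headers.get("Retry-After", default))
    except ValueError:
        # Retry-After may also be an HTTP date; fall back to a safe default
        return default

print(retry_delay_secs(429, {"Retry-After": "45"}))  # 45
print(retry_delay_secs(200, {}))                     # 0
```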
IP Detection
The rate limiter detects client IP from (in order):
1. `X-Forwarded-For` header (first IP in the list)
2. `X-Real-IP` header
3. Direct connection IP
If using a load balancer, ensure it forwards the original client IP via `X-Forwarded-For` or `X-Real-IP` headers for accurate rate limiting.
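The detection order above amounts to a simple precedence check. A sketch (the `client_ip` helper is hypothetical, mirroring the documented order):

```python
def client_ip(headers, remote_addr):
    """Resolve the client IP: X-Forwarded-For, then X-Real-IP, then the socket."""
    fwd = headers.get("X-Forwarded-For")
    if fwd:
        # X-Forwarded-For may list several hops; the first is the client
        return fwd.split(",")[0].strip()
    real = headers.get("X-Real-IP")
    if real:
        return real
    return remote_addr

print(client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.0.2"}, "10.0.0.1"))  # 203.0.113.7
print(client_ip({}, "10.0.0.1"))                                            # 10.0.0.1
```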
Related Documentation
- Configuration - All database configuration options
- Ripples - Data Propagation - Multi-node data propagation and use cases
- Indexes - Create indexes for performance
- Transactions - Understand transaction impact on WAL
- Collections & Schemas - Collection management
- Authentication - Admin key management