System Administration

Administrative endpoints for monitoring, maintenance, and system analysis.

Admin Access Required

All endpoints in this section require admin permissions.

Health Monitoring

Check System Health

Get overall system health status. This endpoint is partially public: it returns basic status without authentication and detailed metrics with an admin token.

// Returns Ok(()) if healthy, Err if not
client.health_check().await?;

Client libraries provide a simple health check that returns success/failure. For detailed system metrics, use the Direct API with an admin token.
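A Direct API request looks like the other admin calls in this section; without the Authorization header, the same endpoint returns only the basic status:

```shell
# Basic status (public, no auth)
curl https://{EKODB_API_URL}/api/health

# Detailed metrics (admin token required)
curl https://{EKODB_API_URL}/api/health \
  -H "Authorization: Bearer {ADMIN_TOKEN}"
```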

Response (with admin auth — detailed):

{
"status": "ok",
"version": "1.2.3",
"timestamp": "2026-01-15T15:45:30Z",
"capacity": {
"can_accept_requests": true,
"status": "available",
"retry_after_ms": 0,
"connections_available": 450,
"utilization_percent": 10.0
},
"system_resources": {
"cpu_count": 8,
"total_memory_gb": 16,
"disk_free_gb": 100
},
"batch_settings": {
"parallel_max": 1000,
"sequential_max": 10000,
"insert_batch_size": 500,
"update_batch_size": 200,
"delete_batch_size": 300,
"query_batch_size": 1000
},
"active_operations": {
"has_active": true,
"total": 5,
"by_type": {"insert": 2, "query": 3},
"under_stress": []
},
"file_pool": {
"fd_in_use": 50,
"fd_max": 500,
"fd_available": 450,
"fd_utilization_percent": 10.0,
"eviction_count": 0,
"total_permit_wait_time_ms": 0
},
"retry_after_ms": null,
"adaptive_strategy": {
"parallelism_for_100_items": {
"preparation_cpu_bound": 8,
"validation_mixed": 4,
"disk_io_bound": 16
},
"chunk_sizing": {
"records_1000": 250,
"records_10000": 500
}
},
"performance_settings": {
"max_concurrent_ops": 100,
"operation_timeout_secs": 30,
"compression_level": 6
},
"storage_limits": {
"file_pool_max_size_mb": 256,
"disk_cache_max_size_mb": 1024,
"wal_max_size_mb": 512,
"memory_cache_size_mb": 4096
}
}

Capacity Statuses:

  • available — Normal operation, accepting requests
  • busy — High utilization (70–90%), still accepting requests
  • overloaded — Over capacity, check retry_after_ms
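Clients can turn these statuses into a simple backoff policy: honor the server-reported retry_after_ms when overloaded, and optionally add a small self-imposed delay when busy. A minimal sketch (the helper name and the busy delay are illustrative, not part of the API):

```shell
# Map a capacity status to a delay (in ms) before the next request.
backoff_ms() {
  local status="$1" retry_after_ms="$2"
  case "$status" in
    overloaded) echo "$retry_after_ms" ;;  # honor the server's retry hint
    busy)       echo 100 ;;                # small self-imposed delay (illustrative)
    *)          echo 0 ;;                  # available: no delay needed
  esac
}

backoff_ms overloaded 2500   # prints 2500
backoff_ms available 0       # prints 0
```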

Write-Ahead Log (WAL)

The Write-Ahead Log ensures data durability and enables replication.

Get WAL Health

Check the status and health of the WAL system.

curl https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Response
{
"is_healthy": true,
"last_flush": "2026-01-15T15:45:30Z",
"buffer_utilization": 0.15,
"retry_rate": 0.0,
"file_size_mb": 100.0,
"max_size_mb": 512
}

Fields:

  • is_healthy - true if the WAL file size is below 90% of max_size_mb
  • last_flush - Timestamp of the most recent WAL flush to disk
  • buffer_utilization - Fraction of the WAL write buffer in use (0.0–1.0)
  • retry_rate - Rate of write retries (0.0 = no retries)
  • file_size_mb - Current WAL file size in megabytes
  • max_size_mb - Maximum WAL file size before rotation is recommended
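Since is_healthy flips to false once the WAL reaches 90% of max_size_mb, the same utilization can be computed directly from the two size fields, for example with awk (the function name is illustrative):

```shell
# WAL utilization percent from file_size_mb and max_size_mb.
wal_utilization() {
  awk -v size="$1" -v max="$2" 'BEGIN { printf "%.1f", (size / max) * 100 }'
}

# With the sample response above (100.0 MB of 512 MB):
wal_utilization 100.0 512   # prints 19.5
```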

Rotate WAL

Manually rotate the WAL to a new file.

curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Response
{
"status": "success",
"message": "WAL rotated successfully"
}
When to Rotate
  • Before backups
  • When WAL file_size_mb is approaching max_size_mb (check via /api/wal/health)
  • During maintenance windows
  • For replication synchronization

Get WAL Entries

Retrieve WAL entries within a time range (used for replication and gap filling).

curl "https://{EKODB_API_URL}/api/wal/entries?from_timestamp=1705329600&to_timestamp=1705333200" \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Response
{
"entries": [...],
"count": 150,
"from_timestamp": 1705329600,
"to_timestamp": 1705333200
}

Query Parameters:

  • from_timestamp - Start Unix timestamp in seconds (required)
  • to_timestamp - End Unix timestamp in seconds (required)
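Both parameters are plain epoch seconds, so a query window is easy to compute in shell; for example, the last hour:

```shell
# Build a from/to window covering the last hour (Unix epoch seconds)
to_timestamp=$(date +%s)
from_timestamp=$((to_timestamp - 3600))
echo "from=$from_timestamp to=$to_timestamp"
```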

Replication

Receive WAL Shipment

Receive and apply WAL entries from a peer instance (for replication and gap filling). This endpoint requires the x-ripple-request-id header to verify the request comes from a configured peer.

POST https://{EKODB_API_URL}/api/replication/wal
Content-Type: application/json
x-ripple-request-id: {REQUEST_ID}

{
"entries": [...],
"timestamp": 1705329600,
"source_deployment_id": "primary-db-01"
}

# Response
{
"status": "ok",
"entries_applied": 2,
"entries_failed": 0,
"total_entries": 2
}

System Analysis

Get System Analysis

Analyze system hardware capabilities and get recommended configuration settings.

GET https://{EKODB_API_URL}/api/system/analysis
Authorization: Bearer {ADMIN_TOKEN}

# Response
{
"current_specs": {
"cpu_count": 8,
"cpu_frequency": 2400,
"total_memory": 17179869184,
"disk_free": 107374182400,
"has_gpu": false,
"gpu_count": 0,
"total_gpu_memory": 0
},
"recommended_tier": "performance",
"current_settings": {
"batch_parallel_max": 1000,
"batch_sequential_max": 10000,
"insert_batch_size": 500,
"update_batch_size": 200,
"delete_batch_size": 300,
"query_batch_size": 1000,
"max_concurrent_ops": 100,
"operation_timeout": 30,
"compression_level": 6
},
"recommended_settings": {
"batch_parallel_max": 2000,
"batch_sequential_max": 20000,
"insert_batch_size": 1000,
"update_batch_size": 500,
"delete_batch_size": 500,
"query_batch_size": 2000,
"max_concurrent_ops": 200,
"operation_timeout": 60,
"compression_level": 3
},
"recommendations": [
"System has 8 CPU cores - optimal for parallel operations",
"16GB RAM available - can handle large batch operations",
"Consider increasing batch sizes for better throughput",
"Fast compression recommended for better performance"
]
}
System Analysis vs Analytics
  • System Analysis (/api/system/analysis) - Hardware specs and configuration recommendations
  • Analytics (/api/analytics) - Database statistics, collection sizes, and performance metrics

Get Analytics Data

Get database statistics, collection metrics, and performance data.

GET https://{EKODB_API_URL}/api/analytics
Authorization: Bearer {ADMIN_TOKEN}

# Response
{
"cache": {
"hits": 50000,
"misses": 5000,
"total_requests": 55000,
"hit_rate": 0.909
},
"io": {
"reads": 100000,
"writes": 50000
},
"cpu": {
"cpu_count": 8,
"usage_ratio": 0.35
},
"memory": {
"total_memory": 17179869184,
"used_memory": 6442450944
},
"network": {
"ingress_bytes": 1073741824,
"egress_bytes": 2147483648
},
"database": {
"total_size": 10737418240,
"available_size": 96636764160,
"system_used_space": 5368709120,
"db_used_space": 5368709120,
"max_size": 107374182400
},
"collections": [
["users", {"record_count": 10000, "total_size": 1048576}],
["posts", {"record_count": 50000, "total_size": 5242880}],
["events", {"record_count": 500000, "total_size": 52428800}]
],
"performance_logs": []
}

Get System Logs

Retrieve server logs for troubleshooting. Returns the most recent log lines in chronological order.

curl https://{EKODB_API_URL}/api/system/logs \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Response
{
"logs": [
"2026-01-15T15:40:00Z INFO [ekodb_server] Database started successfully",
"2026-01-15T15:40:05Z INFO [ekodb_server::handlers] Insert completed for collection 'users'",
"2026-01-15T15:45:30Z ERROR [ekodb_server::handlers] Query timeout after 30s for collection 'events'"
]
}

The response contains an array of log line strings. The number of lines returned is controlled by the MAX_LOG_LINES environment variable (default: 1000).

Complete Example

Here's a complete system monitoring and maintenance workflow:

#!/bin/bash

# 1. Check system health
health=$(curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}")

echo "System Status: $(echo "$health" | jq -r '.status')"
echo "Capacity: $(echo "$health" | jq -r '.capacity.status')"
echo "CPU Cores: $(echo "$health" | jq -r '.system_resources.cpu_count')"

# 2. Check WAL health
wal_health=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}")

wal_size=$(echo "$wal_health" | jq -r '.file_size_mb')
wal_max=$(echo "$wal_health" | jq -r '.max_size_mb')
echo "WAL Size: ${wal_size}MB / ${wal_max}MB"

# 3. Rotate WAL if needed (approaching max size)
is_healthy=$(echo "$wal_health" | jq -r '.is_healthy')
if [ "$is_healthy" = "false" ]; then
echo "WAL approaching limit, rotating..."
curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"
fi

# 4. Get system analysis
analysis=$(curl -s https://{EKODB_API_URL}/api/system/analysis \
-H "Authorization: Bearer {ADMIN_TOKEN}")

echo "Recommendations:"
echo "$analysis" | jq -r '.recommendations[]'

# 5. Check recent logs for errors
logs=$(curl -s https://{EKODB_API_URL}/api/system/logs \
-H "Authorization: Bearer {ADMIN_TOKEN}")

error_count=$(echo "$logs" | jq '[.logs[] | select(contains("ERROR"))] | length')
echo "Recent Errors: $error_count"

# 6. Get WAL entries for backup (last hour)
end_time=$(date +%s)
start_time=$((end_time - 3600))

curl -s "https://{EKODB_API_URL}/api/wal/entries?from_timestamp=$start_time&to_timestamp=$end_time" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> wal_backup_$(date +%Y%m%d_%H%M%S).json

echo "WAL backup complete"

Best Practices

Health Monitoring

Set Up Regular Health Checks:

# Check health every 5 minutes (a crontab entry must be a single line)
*/5 * * * * curl -s https://{EKODB_API_URL}/api/health | jq -r '.status' | grep -qx "ok" || echo "ALERT: Database unhealthy!"

WAL Management

Monitor WAL Size:

# Alert if WAL is approaching max size (is_healthy = false means >90% of max)
is_healthy=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" | jq -r '.is_healthy')
if [ "$is_healthy" = "false" ]; then
echo "WARNING: WAL approaching size limit"
fi

Scheduled Rotation:

# Rotate WAL daily at 3 AM (single-line crontab entry)
0 3 * * * curl -X POST https://{EKODB_API_URL}/api/wal/rotate -H "Authorization: Bearer {ADMIN_TOKEN}"

Log Retention

Archive Logs:

# Archive current server logs
curl -s https://{EKODB_API_URL}/api/system/logs \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> archived_logs_$(date +%Y%m%d).json

Replication Setup

Configure Primary-Replica:

# On replica: Fetch and apply WAL entries every minute
*/1 * * * * bash /scripts/replicate_wal.sh

# replicate_wal.sh
#!/bin/bash
PRIMARY_URL="https://primary.ekodb.net"
REPLICA_URL="https://replica.ekodb.net"

# Get latest WAL entries from primary
end_time=$(date +%s)
start_time=$((end_time - 120)) # Last 2 minutes

entries=$(curl -s "$PRIMARY_URL/api/wal/entries?from_timestamp=$start_time&to_timestamp=$end_time" \
-H "Authorization: Bearer {ADMIN_TOKEN}")

# Reshape into the payload expected by /api/replication/wal
# (the entries endpoint returns {entries, count, from_timestamp, to_timestamp})
payload=$(echo "$entries" | jq --arg src "primary-db-01" \
  '{entries: .entries, timestamp: .to_timestamp, source_deployment_id: $src}')

# Apply to replica
curl -X POST "$REPLICA_URL/api/replication/wal" \
  -H "Content-Type: application/json" \
  -H "x-ripple-request-id: manual-sync-$(date +%s)" \
  -d "$payload"

Performance Optimization

Act on Recommendations:

# Get and apply system recommendations
recommendations=$(curl -s .../api/system/analysis | jq -r '.recommendations[]')

# Example: Create recommended indexes
echo "$recommendations" | grep "Create index" | while read -r rec; do
echo "Creating index: $rec"
# Parse and create index via API
done

# Example: Delete unused indexes
echo "$recommendations" | grep "Delete unused index" | while read -r rec; do
echo "Cleanup needed: $rec"
done

Monitoring Alerts

Critical Metrics to Monitor

  • Disk usage - warning > 75%, critical > 90%; rotate WAL, archive old data
  • Memory usage - warning > 80%, critical > 95%; restart or scale up
  • Query time (p95) - warning > 100ms, critical > 500ms; create indexes, optimize queries
  • WAL file size - warning > 800MB, critical > 1GB; force rotation
  • Error rate - warning > 1%, critical > 5%; investigate logs
  • Active transactions - warning > 100, critical > 500; check for stuck transactions
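A tiny helper can classify a metric against these warning/critical pairs before alerting (the function name and sample values are illustrative):

```shell
# Classify an integer metric value against warning/critical thresholds.
threshold_status() {
  local value="$1" warn="$2" crit="$3"
  if   [ "$value" -gt "$crit" ]; then echo critical
  elif [ "$value" -gt "$warn" ]; then echo warning
  else echo ok
  fi
}

threshold_status 76 75 90   # disk at 76% -> prints warning
threshold_status 95 75 90   # disk at 95% -> prints critical
```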

Sample Monitoring Script

#!/bin/bash

ALERT_EMAIL="ops@example.com"
THRESHOLD_ERRORS=10

# Get health data
health=$(curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}")

# Check capacity status
capacity=$(echo "$health" | jq -r '.capacity.status')
if [ "$capacity" = "overloaded" ]; then
echo "ALERT: Server overloaded" | mail -s "ekoDB Alert" $ALERT_EMAIL
fi

# Check WAL health
wal_healthy=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" | jq -r '.is_healthy')
if [ "$wal_healthy" = "false" ]; then
echo "ALERT: WAL approaching size limit" | mail -s "ekoDB Alert" $ALERT_EMAIL
fi

# Check recent errors in logs
error_count=$(curl -s https://{EKODB_API_URL}/api/system/logs \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '[.logs[] | select(contains("ERROR"))] | length')

if [ "$error_count" -gt "$THRESHOLD_ERRORS" ]; then
echo "ALERT: $error_count errors in recent logs" | mail -s "ekoDB Alert" $ALERT_EMAIL
fi

Troubleshooting

High Memory Usage

# Check analytics for collection sizes
curl -s https://{EKODB_API_URL}/api/analytics \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.collections[] | {name: .[0], size: .[1].total_size, records: .[1].record_count}' \
| jq -s 'sort_by(.size) | reverse | .[0]'

# Check memory metrics
curl -s https://{EKODB_API_URL}/api/analytics \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.memory'

# Check health status for memory usage
curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.system_resources.total_memory_gb'

# Consider:
# - Archiving old data from large collections
# - Implementing pagination
# - Adding query result limits
# - Clearing unused indexes

Slow Query Performance

# Check if indexes exist for frequently queried fields
curl -s https://{EKODB_API_URL}/api/indexes/query/{collection} \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Use explain to analyze query performance
curl -X POST https://{EKODB_API_URL}/api/query/{collection}/explain \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"filter": {
"type": "Condition",
"content": {
"field": "your_field",
"operator": "Eq",
"value": "your_value"
}
}
}'

# Create indexes for fields without them
curl -X POST https://{EKODB_API_URL}/api/indexes/query/{collection} \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{"field": "frequently_queried_field", "index_type": "btree"}'

# Check batch settings and adjust if needed
curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.batch_settings'

WAL Disk Space Issues

# Check WAL health and size
curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '{is_healthy, file_size_mb, max_size_mb}'

# Rotate immediately if unhealthy
is_healthy=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq -r '.is_healthy')

if [ "$is_healthy" = "false" ]; then
echo "WAL approaching limit, rotating..."
curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"
fi

# Get WAL entries for backup before cleanup
end_time=$(date +%s)
start_time=$((end_time - 86400)) # Last 24 hours
curl -s "https://{EKODB_API_URL}/api/wal/entries?from_timestamp=$start_time&to_timestamp=$end_time" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> wal_backup_$(date +%Y%m%d).json

# After backing up, rotation will free space

Ripple Configuration

For multi-node deployments, configure data propagation between instances:

# Configure ripples on a node (add each peer separately)
curl -X POST https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": "peer1",
"url": "https://peer1.ekodb.net:8080",
"api_key": "peer1-admin-key",
"mode": "Operations",
"enabled": true
}'

curl -X POST https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": "peer2",
"url": "https://peer2.ekodb.net:8080",
"api_key": "peer2-admin-key",
"mode": "Operations",
"enabled": true
}'

# List configured ripples
curl -X GET https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Check ripple health
curl -X GET https://{EKODB_API_URL}/api/ripples/health \
-H "Authorization: Bearer {ADMIN_TOKEN}"

Replication Roles:

  • primary - Propagate writes to peers (primary nodes)
  • replica - Receive updates from peers (read replicas)
  • peer - Full bidirectional sync (multi-master)
  • standalone - No replication (isolated nodes)
Ripple Use Cases

See Ripples - Data Propagation for comprehensive guides on multi-region deployments, read scaling, high availability architectures, and data pipeline patterns.

Manifest Recovery

Rebuild Collection Manifests

Rebuild collection manifests from individual record files. Useful for recovering from schema evolution issues, stale manifest data after upgrades, or collection loading problems after crashes.

POST https://{EKODB_API_URL}/api/admin/rebuild-manifests
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}

# Rebuild specific collections
{
"collections": ["users", "orders"]
}

# Rebuild ALL collections (omit collections field)
{}

# Response
{
"rebuilt": ["users", "orders"],
"message": "Successfully rebuilt 2 collection(s)"
}
Backup First

This operation automatically backs up existing manifests (.backup extension) before rebuilding, but it's recommended to have a full backup before running this in production.

Use Cases:

  • Schema Evolution Issues - Records missing after adding new required fields
  • Stale Manifest Data - Manifest doesn't reflect actual records on disk
  • Post-Crash Recovery - Collection loading problems after unexpected shutdown
  • Version Upgrades - Manifest format changes between versions

What It Does:

  1. Backs up existing manifest files
  2. Scans individual record files
  3. Rebuilds manifest from actual record data
  4. Updates in-memory collections

Public Endpoint Rate Limiting

ekoDB includes configurable IP-based rate limiting for public endpoints that don't require authentication. This protects against DDoS and brute force attacks.

Protected Endpoints

  • /api/health - Health check
  • /api/auth/register - API key registration
  • /api/auth/token - Token generation
  • /api/auth/{api_key}/admin_key - Admin key retrieval

Configuration

Configure via environment variables or database config:

  • PUBLIC_ENDPOINT_RATE_LIMIT_ENABLED - enable or disable rate limiting (default: true)
  • PUBLIC_ENDPOINT_RATE_LIMIT_PER_MINUTE - maximum requests per IP per minute (default: 60)
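For example, to double the per-IP allowance via the environment (the value is illustrative):

```shell
# Public-endpoint rate limiting configuration
export PUBLIC_ENDPOINT_RATE_LIMIT_ENABLED=true
export PUBLIC_ENDPOINT_RATE_LIMIT_PER_MINUTE=120   # default: 60
```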

Rate Limit Response

When rate limited, the server returns:

HTTP/1.1 429 Too Many Requests
Retry-After: 45
Content-Type: application/json

{
"error": "Rate limit exceeded",
"retry_after_secs": 45
}

IP Detection

The rate limiter detects client IP from (in order):

  1. X-Forwarded-For header (first IP in list)
  2. X-Real-IP header
  3. Direct connection IP
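The same resolution order can be sketched in shell: take the first entry of X-Forwarded-For if present, else X-Real-IP, else the socket address (the function name and inputs are illustrative):

```shell
# Resolve a client IP the way the rate limiter does:
# X-Forwarded-For's first entry, then X-Real-IP, then the connection IP.
client_ip() {
  local xff="$1" real_ip="$2" conn_ip="$3"
  if [ -n "$xff" ]; then
    # First IP in the comma-separated list, with whitespace trimmed
    echo "$xff" | cut -d',' -f1 | tr -d ' '
  elif [ -n "$real_ip" ]; then
    echo "$real_ip"
  else
    echo "$conn_ip"
  fi
}

client_ip "203.0.113.7, 10.0.0.1" "" "192.0.2.1"   # prints 203.0.113.7
client_ip "" "198.51.100.2" "192.0.2.1"            # prints 198.51.100.2
```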
Load Balancer Configuration

If using a load balancer, ensure it forwards the original client IP via X-Forwarded-For or X-Real-IP headers for accurate rate limiting.