System Administration

Administrative endpoints for monitoring, maintenance, and system analysis.

Admin Access Required

All endpoints in this section require admin permissions.

Health Monitoring

Check System Health

Get overall system health status. This endpoint is partially public: it returns basic status without authentication and detailed metrics with an admin token.

// Returns Ok(()) if healthy, Err if not
client.health_check().await?;

Client libraries provide a simple health check that returns success/failure. For detailed system metrics, use the Direct API with an admin token.
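A Direct API request looks like the other admin calls in this section; without the Authorization header, the same endpoint returns only the basic status:

```shell
# Basic status (public, no auth)
curl https://{EKODB_API_URL}/api/health

# Detailed metrics (admin token required)
curl https://{EKODB_API_URL}/api/health \
  -H "Authorization: Bearer {ADMIN_TOKEN}"
```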

Response (with admin auth — detailed):

{
"status": "ok",
"version": "1.2.3",
"timestamp": "2026-01-15T15:45:30Z",
"capacity": {
"can_accept_requests": true,
"status": "available",
"retry_after_ms": 0,
"connections_available": 450,
"utilization_percent": 10.0
},
"system_resources": {
"cpu_count": 8,
"total_memory_gb": 16,
"disk_free_gb": 100
},
"batch_settings": {
"parallel_max": 1000,
"sequential_max": 10000,
"insert_batch_size": 500,
"update_batch_size": 200,
"delete_batch_size": 300,
"query_batch_size": 1000
},
"active_operations": {
"has_active": true,
"total": 5,
"by_type": {"insert": 2, "query": 3},
"under_stress": []
},
"file_pool": {
"fd_in_use": 50,
"fd_max": 500,
"fd_available": 450,
"fd_utilization_percent": 10.0,
"eviction_count": 0,
"total_permit_wait_time_ms": 0
},
"retry_after_ms": null,
"adaptive_strategy": {
"parallelism_for_100_items": {
"preparation_cpu_bound": 8,
"validation_mixed": 4,
"disk_io_bound": 16
},
"chunk_sizing": {
"records_1000": 250,
"records_10000": 500
}
},
"performance_settings": {
"max_concurrent_ops": 100,
"operation_timeout_secs": 30,
"compression_level": 6
},
"storage_limits": {
"file_pool_max_size_mb": 256,
"disk_cache_max_size_mb": 1024,
"wal_max_size_mb": 512,
"memory_cache_size_mb": 4096
}
}

Capacity Statuses:

  • available — Normal operation, accepting requests
  • busy — High utilization (70–90%), still accepting requests
  • overloaded — Over capacity, check retry_after_ms
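Clients can turn these statuses into a simple backoff policy: honor the server-reported retry_after_ms when overloaded, and optionally add a small self-imposed delay when busy. A minimal sketch (the helper name and the busy delay are illustrative, not part of the API):

```shell
# Map a capacity status to a delay (in ms) before the next request.
backoff_ms() {
  local status="$1" retry_after_ms="$2"
  case "$status" in
    overloaded) echo "$retry_after_ms" ;;  # honor the server's retry hint
    busy)       echo 100 ;;                # small self-imposed delay (illustrative)
    *)          echo 0 ;;                  # available: no delay needed
  esac
}

backoff_ms overloaded 2500   # prints 2500
backoff_ms available 0       # prints 0
```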

Write-Ahead Log (WAL)

The Write-Ahead Log ensures data durability and enables replication.

Get WAL Health

Check the status and health of the WAL system.

curl https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Response
{
"is_healthy": true,
"last_flush": "2026-01-15T15:45:30Z",
"buffer_utilization": 0.15,
"retry_rate": 0.0,
"file_size_mb": 100.0,
"max_size_mb": 512
}

Fields:

  • is_healthy - true if the WAL file size is below 90% of max_size_mb
  • last_flush - Timestamp of the most recent WAL flush to disk
  • buffer_utilization - Fraction of the WAL write buffer in use (0.0–1.0)
  • retry_rate - Rate of write retries (0.0 = no retries)
  • file_size_mb - Current WAL file size in megabytes
  • max_size_mb - Maximum WAL file size before rotation is recommended
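Since is_healthy flips to false once the WAL reaches 90% of max_size_mb, the same utilization can be computed directly from the two size fields, for example with awk (the function name is illustrative):

```shell
# WAL utilization percent from file_size_mb and max_size_mb.
wal_utilization() {
  awk -v size="$1" -v max="$2" 'BEGIN { printf "%.1f", (size / max) * 100 }'
}

# With the sample response above (100.0 MB of 512 MB):
wal_utilization 100.0 512   # prints 19.5
```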

Rotate WAL

Manually rotate the WAL to a new file.

curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Response
{
"status": "success",
"message": "WAL rotated successfully"
}
When to Rotate
  • Before backups
  • When WAL file_size_mb is approaching max_size_mb (check via /api/wal/health)
  • During maintenance windows
  • For replication synchronization

Get WAL Entries

Retrieve WAL entries within a time range (used for replication and gap filling).

curl "https://{EKODB_API_URL}/api/wal/entries?from_timestamp=1705329600&to_timestamp=1705333200" \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Response
{
"entries": [...],
"count": 150,
"from_timestamp": 1705329600,
"to_timestamp": 1705333200
}

Query Parameters:

  • from_timestamp - Start Unix timestamp in seconds (required)
  • to_timestamp - End Unix timestamp in seconds (required)
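Both parameters are plain epoch seconds, so a query window is easy to compute in shell; for example, the last hour:

```shell
# Build a from/to window covering the last hour (Unix epoch seconds)
to_timestamp=$(date +%s)
from_timestamp=$((to_timestamp - 3600))
echo "from=$from_timestamp to=$to_timestamp"
```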

Replication

Receive WAL Shipment

Receive and apply WAL entries from a peer instance (for replication and gap filling). This endpoint requires the x-ripple-request-id header to verify the request comes from a configured peer.

POST https://{EKODB_API_URL}/api/replication/wal
Content-Type: application/json
x-ripple-request-id: {REQUEST_ID}

{
"entries": [...],
"timestamp": 1705329600,
"source_deployment_id": "primary-db-01"
}

# Response
{
"status": "ok",
"entries_applied": 2,
"entries_failed": 0,
"total_entries": 2
}

System Analysis

Get System Analysis

Analyze system hardware capabilities and get recommended configuration settings.

GET https://{EKODB_API_URL}/api/system/analysis
Authorization: Bearer {ADMIN_TOKEN}

# Response
{
"current_specs": {
"cpu_count": 8,
"cpu_frequency": 2400,
"total_memory": 17179869184,
"disk_free": 107374182400,
"has_gpu": false,
"gpu_count": 0,
"total_gpu_memory": 0
},
"recommended_tier": "performance",
"current_settings": {
"batch_parallel_max": 1000,
"batch_sequential_max": 10000,
"insert_batch_size": 500,
"update_batch_size": 200,
"delete_batch_size": 300,
"query_batch_size": 1000,
"max_concurrent_ops": 100,
"operation_timeout": 30,
"compression_level": 6
},
"recommended_settings": {
"batch_parallel_max": 2000,
"batch_sequential_max": 20000,
"insert_batch_size": 1000,
"update_batch_size": 500,
"delete_batch_size": 500,
"query_batch_size": 2000,
"max_concurrent_ops": 200,
"operation_timeout": 60,
"compression_level": 3
},
"recommendations": [
"System has 8 CPU cores - optimal for parallel operations",
"16GB RAM available - can handle large batch operations",
"Consider increasing batch sizes for better throughput",
"Fast compression recommended for better performance"
]
}
System Analysis vs Analytics
  • System Analysis (/api/system/analysis) - Hardware specs and configuration recommendations
  • Analytics (/api/analytics) - Database statistics, collection sizes, and performance metrics

Get Analytics Data

Get database statistics, collection metrics, and performance data.

GET https://{EKODB_API_URL}/api/analytics
Authorization: Bearer {ADMIN_TOKEN}

# Response
{
"cache": {
"hits": 50000,
"misses": 5000,
"total_requests": 55000,
"hit_rate": 0.909
},
"io": {
"reads": 100000,
"writes": 50000
},
"cpu": {
"cpu_count": 8,
"usage_ratio": 0.35
},
"memory": {
"total_memory": 17179869184,
"used_memory": 6442450944
},
"network": {
"ingress_bytes": 1073741824,
"egress_bytes": 2147483648
},
"database": {
"total_size": 10737418240,
"available_size": 96636764160,
"system_used_space": 5368709120,
"db_used_space": 5368709120,
"max_size": 107374182400
},
"collections": [
["users", {"record_count": 10000, "total_size": 1048576}],
["posts", {"record_count": 50000, "total_size": 5242880}],
["events", {"record_count": 500000, "total_size": 52428800}]
],
"performance_logs": []
}

Get System Logs

Retrieve server logs for troubleshooting. Returns the most recent log lines in chronological order.

curl https://{EKODB_API_URL}/api/system/logs \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Response
{
"logs": [
"2026-01-15T15:40:00Z INFO [ekodb_server] Database started successfully",
"2026-01-15T15:40:05Z INFO [ekodb_server::handlers] Insert completed for collection 'users'",
"2026-01-15T15:45:30Z ERROR [ekodb_server::handlers] Query timeout after 30s for collection 'events'"
]
}

The response contains an array of log line strings. The number of lines returned is controlled by the MAX_LOG_LINES environment variable (default: 1000).

Complete Example

Here's a complete system monitoring and maintenance workflow:

#!/bin/bash

# 1. Check system health
health=$(curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}")

echo "System Status: $(echo "$health" | jq -r '.status')"
echo "Capacity: $(echo "$health" | jq -r '.capacity.status')"
echo "CPU Cores: $(echo "$health" | jq -r '.system_resources.cpu_count')"

# 2. Check WAL health
wal_health=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}")

wal_size=$(echo "$wal_health" | jq -r '.file_size_mb')
wal_max=$(echo "$wal_health" | jq -r '.max_size_mb')
echo "WAL Size: ${wal_size}MB / ${wal_max}MB"

# 3. Rotate WAL if needed (approaching max size)
is_healthy=$(echo "$wal_health" | jq -r '.is_healthy')
if [ "$is_healthy" = "false" ]; then
echo "WAL approaching limit, rotating..."
curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"
fi

# 4. Get system analysis
analysis=$(curl -s https://{EKODB_API_URL}/api/system/analysis \
-H "Authorization: Bearer {ADMIN_TOKEN}")

echo "Recommendations:"
echo "$analysis" | jq -r '.recommendations[]'

# 5. Check recent logs for errors
logs=$(curl -s https://{EKODB_API_URL}/api/system/logs \
-H "Authorization: Bearer {ADMIN_TOKEN}")

error_count=$(echo "$logs" | jq '[.logs[] | select(contains("ERROR"))] | length')
echo "Recent Errors: $error_count"

# 6. Get WAL entries for backup (last hour)
end_time=$(date +%s)
start_time=$((end_time - 3600))

curl -s "https://{EKODB_API_URL}/api/wal/entries?from_timestamp=$start_time&to_timestamp=$end_time" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> wal_backup_$(date +%Y%m%d_%H%M%S).json

echo "WAL backup complete"

Best Practices

Health Monitoring

Set Up Regular Health Checks:

# Check health every 5 minutes (a crontab entry must be a single line)
*/5 * * * * curl -s https://{EKODB_API_URL}/api/health | jq -r '.status' | grep -qx "ok" || echo "ALERT: Database unhealthy!"

WAL Management

Monitor WAL Size:

# Alert if WAL is approaching max size (is_healthy = false means >90% of max)
is_healthy=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" | jq -r '.is_healthy')
if [ "$is_healthy" = "false" ]; then
echo "WARNING: WAL approaching size limit"
fi

Scheduled Rotation:

# Rotate WAL daily at 3 AM (single-line crontab entry)
0 3 * * * curl -X POST https://{EKODB_API_URL}/api/wal/rotate -H "Authorization: Bearer {ADMIN_TOKEN}"

Log Retention

Archive Logs:

# Archive current server logs
curl -s https://{EKODB_API_URL}/api/system/logs \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> archived_logs_$(date +%Y%m%d).json

Replication Setup

Configure Primary-Replica:

# On replica: Fetch and apply WAL entries every minute
*/1 * * * * bash /scripts/replicate_wal.sh

# replicate_wal.sh
#!/bin/bash
PRIMARY_URL="https://primary.ekodb.net"
REPLICA_URL="https://replica.ekodb.net"

# Get latest WAL entries from primary
end_time=$(date +%s)
start_time=$((end_time - 120)) # Last 2 minutes

entries=$(curl -s "$PRIMARY_URL/api/wal/entries?from_timestamp=$start_time&to_timestamp=$end_time" \
-H "Authorization: Bearer {ADMIN_TOKEN}")

# Reshape into the payload expected by /api/replication/wal
# (the entries endpoint returns {entries, count, from_timestamp, to_timestamp})
payload=$(echo "$entries" | jq --arg src "primary-db-01" \
  '{entries: .entries, timestamp: .to_timestamp, source_deployment_id: $src}')

# Apply to replica
curl -X POST "$REPLICA_URL/api/replication/wal" \
  -H "Content-Type: application/json" \
  -H "x-ripple-request-id: manual-sync-$(date +%s)" \
  -d "$payload"

Performance Optimization

Act on Recommendations:

# Get and apply system recommendations
recommendations=$(curl -s .../api/system/analysis | jq -r '.recommendations[]')

# Example: Create recommended indexes
echo "$recommendations" | grep "Create index" | while read -r rec; do
echo "Creating index: $rec"
# Parse and create index via API
done

# Example: Delete unused indexes
echo "$recommendations" | grep "Delete unused index" | while read -r rec; do
echo "Cleanup needed: $rec"
done

Monitoring Alerts

Critical Metrics to Monitor

  • Disk usage - warning > 75%, critical > 90%; rotate WAL, archive old data
  • Memory usage - warning > 80%, critical > 95%; restart or scale up
  • Query time (p95) - warning > 100ms, critical > 500ms; create indexes, optimize queries
  • WAL file size - warning > 800MB, critical > 1GB; force rotation
  • Error rate - warning > 1%, critical > 5%; investigate logs
  • Active transactions - warning > 100, critical > 500; check for stuck transactions
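A tiny helper can classify a metric against these warning/critical pairs before alerting (the function name and sample values are illustrative):

```shell
# Classify an integer metric value against warning/critical thresholds.
threshold_status() {
  local value="$1" warn="$2" crit="$3"
  if   [ "$value" -gt "$crit" ]; then echo critical
  elif [ "$value" -gt "$warn" ]; then echo warning
  else echo ok
  fi
}

threshold_status 76 75 90   # disk at 76% -> prints warning
threshold_status 95 75 90   # disk at 95% -> prints critical
```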

Sample Monitoring Script

#!/bin/bash

ALERT_EMAIL="ops@example.com"
THRESHOLD_ERRORS=10

# Get health data
health=$(curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}")

# Check capacity status
capacity=$(echo "$health" | jq -r '.capacity.status')
if [ "$capacity" = "overloaded" ]; then
echo "ALERT: Server overloaded" | mail -s "ekoDB Alert" $ALERT_EMAIL
fi

# Check WAL health
wal_healthy=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" | jq -r '.is_healthy')
if [ "$wal_healthy" = "false" ]; then
echo "ALERT: WAL approaching size limit" | mail -s "ekoDB Alert" $ALERT_EMAIL
fi

# Check recent errors in logs
error_count=$(curl -s https://{EKODB_API_URL}/api/system/logs \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '[.logs[] | select(contains("ERROR"))] | length')

if [ "$error_count" -gt "$THRESHOLD_ERRORS" ]; then
echo "ALERT: $error_count errors in recent logs" | mail -s "ekoDB Alert" $ALERT_EMAIL
fi

Troubleshooting

High Memory Usage

# Check analytics for collection sizes
curl -s https://{EKODB_API_URL}/api/analytics \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.collections[] | {name: .[0], size: .[1].total_size, records: .[1].record_count}' \
| jq -s 'sort_by(.size) | reverse | .[0]'

# Check memory metrics
curl -s https://{EKODB_API_URL}/api/analytics \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.memory'

# Check health status for memory usage
curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.system_resources.total_memory_gb'

# Consider:
# - Archiving old data from large collections
# - Implementing pagination
# - Adding query result limits
# - Clearing unused indexes

Slow Query Performance

# Check if indexes exist for frequently queried fields
curl -s https://{EKODB_API_URL}/api/indexes/query/{collection} \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Use explain to analyze query performance
curl -X POST https://{EKODB_API_URL}/api/query/{collection}/explain \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"filter": {
"type": "Condition",
"content": {
"field": "your_field",
"operator": "Eq",
"value": "your_value"
}
}
}'

# Create indexes for fields without them
curl -X POST https://{EKODB_API_URL}/api/indexes/query/{collection} \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{"field": "frequently_queried_field", "index_type": "btree"}'

# Check batch settings and adjust if needed
curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.batch_settings'

WAL Disk Space Issues

# Check WAL health and size
curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '{is_healthy, file_size_mb, max_size_mb}'

# Rotate immediately if unhealthy
is_healthy=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq -r '.is_healthy')

if [ "$is_healthy" = "false" ]; then
echo "WAL approaching limit, rotating..."
curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"
fi

# Get WAL entries for backup before cleanup
end_time=$(date +%s)
start_time=$((end_time - 86400)) # Last 24 hours
curl -s "https://{EKODB_API_URL}/api/wal/entries?from_timestamp=$start_time&to_timestamp=$end_time" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> wal_backup_$(date +%Y%m%d).json

# After backing up, rotation will free space

Ripple Configuration

For multi-node deployments, configure data propagation between instances:

# Configure ripples on a node (add each peer separately)
curl -X POST https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": "peer1",
"url": "https://peer1.ekodb.net:8080",
"api_key": "peer1-admin-key",
"mode": "Operations",
"enabled": true
}'

curl -X POST https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": "peer2",
"url": "https://peer2.ekodb.net:8080",
"api_key": "peer2-admin-key",
"mode": "Operations",
"enabled": true
}'

# List configured ripples
curl -X GET https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Check ripple health
curl -X GET https://{EKODB_API_URL}/api/ripples/health \
-H "Authorization: Bearer {ADMIN_TOKEN}"

Replication Roles:

  • primary - Propagate writes to peers (primary nodes)
  • replica - Receive updates from peers (read replicas)
  • peer - Full bidirectional sync (multi-master)
  • standalone - No replication (isolated nodes)
Ripple Use Cases

See Ripples - Data Propagation for comprehensive guides on multi-region deployments, read scaling, high availability architectures, and data pipeline patterns.

Manifest Recovery

Rebuild Collection Manifests

Rebuild collection manifests from individual record files. Useful for recovering from schema evolution issues, stale manifest data after upgrades, or collection loading problems after crashes.

POST https://{EKODB_API_URL}/api/admin/rebuild-manifests
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}

# Rebuild specific collections
{
"collections": ["users", "orders"]
}

# Rebuild ALL collections (omit collections field)
{}

# Response
{
"rebuilt": ["users", "orders"],
"message": "Successfully rebuilt 2 collection(s)"
}
Backup First

This operation automatically backs up existing manifests (.backup extension) before rebuilding, but it's recommended to have a full backup before running this in production.

Use Cases:

  • Schema Evolution Issues - Records missing after adding new required fields
  • Stale Manifest Data - Manifest doesn't reflect actual records on disk
  • Post-Crash Recovery - Collection loading problems after unexpected shutdown
  • Version Upgrades - Manifest format changes between versions

What It Does:

  1. Backs up existing manifest files
  2. Scans individual record files
  3. Rebuilds manifest from actual record data
  4. Updates in-memory collections

Public Endpoint Rate Limiting

ekoDB includes configurable IP-based rate limiting for public endpoints that don't require authentication. This protects against DDoS and brute force attacks.

Protected Endpoints

  • /api/health - Health check
  • /api/auth/register - API key registration
  • /api/auth/token - Token generation
  • /api/auth/{api_key}/admin_key - Admin key retrieval

Configuration

Configure via environment variables or database config:

  • PUBLIC_ENDPOINT_RATE_LIMIT_ENABLED - enable or disable rate limiting (default: true)
  • PUBLIC_ENDPOINT_RATE_LIMIT_PER_MINUTE - maximum requests per IP per minute (default: 60)
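For example, to double the per-IP allowance via the environment (the value is illustrative):

```shell
# Public-endpoint rate limiting configuration
export PUBLIC_ENDPOINT_RATE_LIMIT_ENABLED=true
export PUBLIC_ENDPOINT_RATE_LIMIT_PER_MINUTE=120   # default: 60
```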

Rate Limit Response

When rate limited, the server returns:

HTTP/1.1 429 Too Many Requests
Retry-After: 45
Content-Type: application/json

{
"error": "Rate limit exceeded",
"retry_after_secs": 45
}

IP Detection

The rate limiter detects client IP from (in order):

  1. X-Forwarded-For header (first IP in list)
  2. X-Real-IP header
  3. Direct connection IP
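The same resolution order can be sketched in shell: take the first entry of X-Forwarded-For if present, else X-Real-IP, else the socket address (the function name and inputs are illustrative):

```shell
# Resolve a client IP the way the rate limiter does:
# X-Forwarded-For's first entry, then X-Real-IP, then the connection IP.
client_ip() {
  local xff="$1" real_ip="$2" conn_ip="$3"
  if [ -n "$xff" ]; then
    # First IP in the comma-separated list, with whitespace trimmed
    echo "$xff" | cut -d',' -f1 | tr -d ' '
  elif [ -n "$real_ip" ]; then
    echo "$real_ip"
  else
    echo "$conn_ip"
  fi
}

client_ip "203.0.113.7, 10.0.0.1" "" "192.0.2.1"   # prints 203.0.113.7
client_ip "" "198.51.100.2" "192.0.2.1"            # prints 198.51.100.2
```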
Load Balancer Configuration

If using a load balancer, ensure it forwards the original client IP via X-Forwarded-For or X-Real-IP headers for accurate rate limiting.