System Administration

Administrative endpoints for monitoring, maintenance, and system analysis.

Admin Access Required

All endpoints in this section require admin permissions unless noted otherwise.

Health Monitoring

Check System Health

Get the overall system health status. This endpoint is partially public: without authentication it returns a basic status, and with an admin token it returns detailed metrics.

GET https://{EKODB_API_URL}/api/health
# Optional: Authorization: Bearer {ADMIN_TOKEN}

# Response (without auth - basic)
{
"status": "healthy",
"uptime_seconds": 86400,
"version": "1.2.3"
}

# Response (with admin auth - detailed)
{
"status": "healthy",
"uptime_seconds": 86400,
"version": "1.2.3",
"database": {
"status": "healthy",
"collections": 12,
"total_records": 1234567,
"size_bytes": 5242880000
},
"memory": {
"used_bytes": 2147483648,
"available_bytes": 4294967296,
"usage_percent": 50
},
"wal": {
"status": "healthy",
"current_file": "wal_00000042.log",
"size_bytes": 104857600,
"entries_count": 50000
},
"batch_processing": {
"active_operations": 5,
"queued_operations": 12,
"capacity_used_percent": 25
},
"performance": {
"avg_query_time_ms": 15,
"requests_per_second": 1250,
"cache_hit_rate": 0.85
}
}

Write-Ahead Log (WAL)

The Write-Ahead Log ensures data durability and enables replication.

Get WAL Health

Check the status and health of the WAL system.

GET https://{EKODB_API_URL}/api/wal/health
Authorization: Bearer {ADMIN_TOKEN}

# Response
{
"status": "healthy",
"current_file": "wal_00000042.log",
"current_file_size_bytes": 104857600,
"total_entries": 50000,
"last_entry_timestamp": "2024-01-15T15:45:30Z",
"disk_space": {
"used_bytes": 524288000,
"available_bytes": 10737418240,
"usage_percent": 4.8
},
"rotation": {
"auto_rotation_enabled": true,
"rotation_threshold_bytes": 1073741824,
"last_rotation": "2024-01-15T10:00:00Z"
},
"warnings": []
}

Possible Statuses:

  • healthy - WAL is functioning normally
  • warning - Approaching disk space limits
  • critical - Immediate attention required
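The WAL health response reports both the current file size and the auto-rotation threshold, so the ratio between them shows how close the file is to rotating. The helper below is a minimal sketch of that arithmetic; `pct_of` is a hypothetical name, not part of ekoDB, and the inputs are the values from the sample response above.

```shell
# pct_of PART WHOLE: print PART as a percentage of WHOLE with one decimal.
# awk is used because shell arithmetic is integer-only.
pct_of() {
  awk -v part="$1" -v whole="$2" 'BEGIN {
    if (whole <= 0) { print "n/a"; exit }
    printf "%.1f\n", (part / whole) * 100
  }'
}

# current_file_size_bytes vs rotation.rotation_threshold_bytes from the sample
pct_of 104857600 1073741824
```

With live data, the two arguments could be read with `jq -r '.current_file_size_bytes'` and `jq -r '.rotation.rotation_threshold_bytes'` from the `/api/wal/health` response.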

Rotate WAL File

Manually trigger WAL file rotation.

POST https://{EKODB_API_URL}/api/wal/rotate
Authorization: Bearer {ADMIN_TOKEN}

# Response
{
"status": "rotated",
"old_file": "wal_00000042.log",
"new_file": "wal_00000043.log",
"old_file_size_bytes": 104857600,
"entries_moved": 50000,
"rotation_time_ms": 342
}

When to Rotate

  • Before backups
  • When approaching file size limits
  • During maintenance windows
  • For replication synchronization

Get WAL Entries

Retrieve WAL entries within a time range (used for replication and gap filling).

GET https://{EKODB_API_URL}/api/wal/entries?start_time=1705330800&end_time=1705334400
Authorization: Bearer {ADMIN_TOKEN}

# Response
{
"entries": [
{
"sequence": 50001,
"timestamp": "2024-01-15T15:40:00Z",
"operation": "insert",
"collection": "users",
"record_id": "user_123",
"data": {...},
"transaction_id": null
},
{
"sequence": 50002,
"timestamp": "2024-01-15T15:40:05Z",
"operation": "update",
"collection": "posts",
"record_id": "post_456",
"changes": {...},
"transaction_id": "tx-001"
},
{
"sequence": 50003,
"timestamp": "2024-01-15T15:40:10Z",
"operation": "delete",
"collection": "comments",
"record_id": "comment_789",
"transaction_id": null
}
],
"total": 3,
"start_sequence": 50001,
"end_sequence": 50003
}

Query Parameters:

  • start_time - Unix timestamp (seconds)
  • end_time - Unix timestamp (seconds)
  • limit - Max entries to return (default: 1000)
  • offset - Pagination offset
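When a time range holds more entries than `limit`, the range has to be read in pages. The sketch below only generates the request URLs for each page. It assumes that `offset` counts entries (not pages) and that the total number of matching entries is known in advance, for example from the `total` field if it reports all matches rather than the page size. Neither point is confirmed above. `wal_page_urls` and the host `db.example.com` are hypothetical.

```shell
# wal_page_urls BASE START END TOTAL [LIMIT]
# Print one request URL per page needed to cover TOTAL entries.
# The body runs in a subshell so its variables stay local.
wal_page_urls() (
  base=$1 start=$2 end=$3 total=$4 limit=${5:-1000}
  [ "$limit" -gt 0 ] || { echo "limit must be positive" >&2; return 1; }
  offset=0
  while [ "$offset" -lt "$total" ]; do
    printf '%s/api/wal/entries?start_time=%s&end_time=%s&limit=%s&offset=%s\n' \
      "$base" "$start" "$end" "$limit" "$offset"
    offset=$((offset + limit))
  done
)

# 2500 entries at the default limit of 1000 need three requests
wal_page_urls "https://db.example.com" 1705330800 1705334400 2500
```

Each printed URL can then be passed to `curl -s` together with the `Authorization` header shown in the request above.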

Replication

Receive WAL Shipment

Receive and apply WAL entries from a peer instance (for replication and gap filling).

POST https://{EKODB_API_URL}/api/replication/wal
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}

{
"source_instance": "primary-db-01",
"entries": [
{
"sequence": 50001,
"timestamp": "2024-01-15T15:40:00Z",
"operation": "insert",
"collection": "users",
"record_id": "user_123",
"data": {...}
},
{
"sequence": 50002,
"timestamp": "2024-01-15T15:40:05Z",
"operation": "update",
"collection": "posts",
"record_id": "post_456",
"changes": {...}
}
],
"checksum": "a3f5b8c9d2e1f4a7"
}

# Response
{
"status": "applied",
"entries_received": 2,
"entries_applied": 2,
"entries_skipped": 0,
"last_sequence_applied": 50002,
"conflicts": [],
"apply_time_ms": 45
}

Conflict Handling:

{
"status": "partial",
"entries_received": 3,
"entries_applied": 2,
"entries_skipped": 1,
"conflicts": [
{
"sequence": 50003,
"reason": "record_already_exists",
"collection": "users",
"record_id": "user_123",
"resolution": "skipped"
}
]
}
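Gap filling presupposes that a replica can tell which sequence numbers it is missing. The following sketch finds missing ranges in a list of applied sequence numbers. It assumes sequence numbers are consecutive integers with no intentional holes, which is not stated above, and `find_sequence_gaps` is a hypothetical helper name.

```shell
# find_sequence_gaps: read one sequence number per line on stdin (any order)
# and print each missing range as FIRST-LAST (a single missing number N
# is printed as N-N).
find_sequence_gaps() {
  sort -n | awk '
    NR > 1 && $1 > prev + 1 { printf "%d-%d\n", prev + 1, $1 - 1 }
    { prev = $1 }
  '
}

printf '%s\n' 50001 50002 50005 50006 50009 | find_sequence_gaps
```

The reported ranges could then be requested from a peer through the WAL entries endpoint and submitted to `/api/replication/wal`.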

System Analysis

Get System Analysis

Analyze system hardware capabilities and get recommended configuration settings.

GET https://{EKODB_API_URL}/api/system/analysis
Authorization: Bearer {ADMIN_TOKEN}

# Response
{
"current_specs": {
"cpu_count": 8,
"cpu_frequency": 2400,
"total_memory": 17179869184,
"disk_free": 107374182400,
"has_gpu": false,
"gpu_count": 0,
"total_gpu_memory": 0
},
"recommended_tier": "performance",
"current_settings": {
"batch_parallel_max": 1000,
"batch_sequential_max": 10000,
"insert_batch_size": 500,
"update_batch_size": 200,
"delete_batch_size": 300,
"query_batch_size": 1000,
"max_concurrent_ops": 100,
"operation_timeout": 30,
"compression_level": 6
},
"recommended_settings": {
"batch_parallel_max": 2000,
"batch_sequential_max": 20000,
"insert_batch_size": 1000,
"update_batch_size": 500,
"delete_batch_size": 500,
"query_batch_size": 2000,
"max_concurrent_ops": 200,
"operation_timeout": 60,
"compression_level": 3
},
"recommendations": [
"System has 8 CPU cores - optimal for parallel operations",
"16GB RAM available - can handle large batch operations",
"Consider increasing batch sizes for better throughput",
"Fast compression recommended for better performance"
]
}

System Analysis vs Analytics

  • System Analysis (/api/system/analysis) - Hardware specs and configuration recommendations
  • Analytics (/api/analytics) - Database statistics, collection sizes, and performance metrics

Get Analytics Data

Get database statistics, collection metrics, and performance data.

GET https://{EKODB_API_URL}/api/analytics
Authorization: Bearer {ADMIN_TOKEN}

# Response
{
"cache": {
"hits": 50000,
"misses": 5000,
"total_requests": 55000,
"hit_rate": 0.909
},
"io": {
"reads": 100000,
"writes": 50000
},
"cpu": {
"cpu_count": 8,
"usage_ratio": 0.35
},
"memory": {
"total_memory": 17179869184,
"used_memory": 6442450944
},
"network": {
"ingress_bytes": 1073741824,
"egress_bytes": 2147483648
},
"database": {
"total_size": 10737418240,
"available_size": 96636764160,
"system_used_space": 5368709120,
"db_used_space": 5368709120,
"max_size": 107374182400
},
"collections": [
["users", {"record_count": 10000, "total_size": 1048576}],
["posts", {"record_count": 50000, "total_size": 5242880}],
["events", {"record_count": 500000, "total_size": 52428800}]
],
"performance_logs": []
}
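Several analytics fields are raw counters, so ratios such as the cache hit rate or memory utilization can be derived from them. In the sample above, `hit_rate` equals `hits / total_requests`. The sketch below reproduces that arithmetic; `ratio` is a hypothetical helper, and treating `used_memory / total_memory` as a utilization figure is an assumption.

```shell
# ratio PART WHOLE: print PART / WHOLE rounded to three decimals.
ratio() {
  awk -v part="$1" -v whole="$2" 'BEGIN {
    if (whole <= 0) { print "n/a"; exit }
    printf "%.3f\n", part / whole
  }'
}

ratio 50000 55000              # cache.hits / cache.total_requests
ratio 6442450944 17179869184   # memory.used_memory / memory.total_memory
```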

Get System Logs

Retrieve system logs for troubleshooting.

GET https://{EKODB_API_URL}/api/system/logs?level=error&limit=100
Authorization: Bearer {ADMIN_TOKEN}

# Response
{
"logs": [
{
"timestamp": "2024-01-15T15:45:30Z",
"level": "error",
"component": "query_engine",
"message": "Query timeout after 30s for collection 'events'",
"context": {
"collection": "events",
"query": {...},
"timeout_ms": 30000
}
},
{
"timestamp": "2024-01-15T15:40:00Z",
"level": "warning",
"component": "wal",
"message": "WAL file approaching size limit",
"context": {
"current_size_bytes": 943718400,
"limit_bytes": 1073741824,
"usage_percent": 88
}
}
],
"total": 2,
"oldest_timestamp": "2024-01-15T10:00:00Z",
"newest_timestamp": "2024-01-15T15:45:30Z"
}

Query Parameters:

  • level - Log level filter (error, warning, info, debug)
  • component - Filter by system component
  • start_time - Unix timestamp
  • end_time - Unix timestamp
  • limit - Max logs to return (default: 100)
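Because every filter is optional, a small wrapper can assemble the query string and reject misspelled parameter names before a request is sent. This is a sketch only: `logs_url` and the host `db.example.com` are hypothetical, and the values are assumed not to need URL encoding.

```shell
# logs_url BASE [name=value ...]
# Build a /api/system/logs URL from the documented query parameters.
# The body runs in a subshell so its variables stay local.
logs_url() (
  base=$1
  shift
  query=""
  for pair in "$@"; do
    key=${pair%%=*}
    case $key in
      level|component|start_time|end_time|limit) ;;
      *) echo "unknown parameter: $key" >&2; return 1 ;;
    esac
    query="${query:+$query&}$pair"
  done
  printf '%s/api/system/logs%s\n' "$base" "${query:+?$query}"
)

logs_url "https://db.example.com" level=error component=wal limit=50
```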

Complete Example

Here's a complete system monitoring and maintenance workflow:

#!/bin/bash

# 1. Check system health (detailed metrics require the admin token)
health=$(curl -s -X GET "https://{EKODB_API_URL}/api/health" \
  -H "Authorization: Bearer {ADMIN_TOKEN}")

echo "System Status: $(echo "$health" | jq -r '.status')"
echo "Memory Usage: $(echo "$health" | jq -r '.memory.usage_percent')%"
echo "WAL Status: $(echo "$health" | jq -r '.wal.status')"

# Database totals come from the detailed health response, not from system analysis
echo "Total Records: $(echo "$health" | jq -r '.database.total_records')"
echo "Total Size: $(echo "$health" | jq -r '.database.size_bytes' | numfmt --to=iec)"

# 2. Check WAL health
wal_health=$(curl -s -X GET "https://{EKODB_API_URL}/api/wal/health" \
  -H "Authorization: Bearer {ADMIN_TOKEN}")

wal_usage=$(echo "$wal_health" | jq -r '.disk_space.usage_percent')
echo "WAL Disk Usage: $wal_usage%"

# 3. Rotate WAL if needed (> 80% full)
if (( $(echo "$wal_usage > 80" | bc -l) )); then
  echo "WAL disk usage high, rotating..."
  curl -s -X POST "https://{EKODB_API_URL}/api/wal/rotate" \
    -H "Authorization: Bearer {ADMIN_TOKEN}"
fi

# 4. Get system analysis (hardware specs and recommended settings)
analysis=$(curl -s -X GET "https://{EKODB_API_URL}/api/system/analysis" \
  -H "Authorization: Bearer {ADMIN_TOKEN}")

echo "Recommended Tier: $(echo "$analysis" | jq -r '.recommended_tier')"

# 5. Check for performance recommendations
echo "Recommendations:"
echo "$analysis" | jq -r '.recommendations[]'

# 6. Check for errors in logs
errors=$(curl -s -X GET "https://{EKODB_API_URL}/api/system/logs?level=error&limit=10" \
  -H "Authorization: Bearer {ADMIN_TOKEN}")

error_count=$(echo "$errors" | jq '.logs | length')
echo "Recent Errors: $error_count"

if [ "$error_count" -gt 0 ]; then
  echo "Latest error: $(echo "$errors" | jq -r '.logs[0].message')"
fi

# 7. Get WAL entries for backup (last hour)
end_time=$(date +%s)
start_time=$((end_time - 3600))

curl -s -X GET "https://{EKODB_API_URL}/api/wal/entries?start_time=$start_time&end_time=$end_time&limit=1000" \
  -H "Authorization: Bearer {ADMIN_TOKEN}" \
  > "wal_backup_$(date +%Y%m%d_%H%M%S).json"

echo "WAL backup complete"

Best Practices

Health Monitoring

Set Up Regular Health Checks:

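# Check health every 5 minutes (a crontab entry must fit on a single line)
# grep -x matches the whole line, so a status such as "unhealthy" does not count as healthy
*/5 * * * * curl -s https://{EKODB_API_URL}/api/health -H "Authorization: Bearer {ADMIN_TOKEN}" | jq -r '.status' | grep -qx "healthy" || echo "ALERT: Database unhealthy!"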

WAL Management

Monitor Disk Usage:

# Alert if WAL disk usage > 75%
wal_usage=$(curl -s .../api/wal/health | jq -r '.disk_space.usage_percent')
if (( $(echo "$wal_usage > 75" | bc -l) )); then
echo "WARNING: WAL disk usage at $wal_usage%"
fi

Scheduled Rotation:

# Rotate WAL daily at 3 AM (a crontab entry must fit on a single line)
0 3 * * * curl -s -X POST https://{EKODB_API_URL}/api/wal/rotate -H "Authorization: Bearer {ADMIN_TOKEN}"

Log Retention

Archive Old Logs:

# Export logs older than 7 days for archiving (date -d requires GNU date)
seven_days_ago=$(date -d '7 days ago' +%s)

curl -s "https://{EKODB_API_URL}/api/system/logs?end_time=$seven_days_ago&limit=10000" \
  -H "Authorization: Bearer {ADMIN_TOKEN}" \
  > "archived_logs_$(date +%Y%m%d).json"

Replication Setup

Configure Primary-Replica:

Crontab entry on the replica:

# Fetch and apply WAL entries every minute
* * * * * bash /scripts/replicate_wal.sh

/scripts/replicate_wal.sh:

#!/bin/bash
PRIMARY_URL="https://primary.ekodb.net"
REPLICA_URL="https://replica.ekodb.net"

# Get latest WAL entries from primary.
# The 2-minute window overlaps the previous run, so some entries are sent twice.
end_time=$(date +%s)
start_time=$((end_time - 120)) # Last 2 minutes

entries=$(curl -s "$PRIMARY_URL/api/wal/entries?start_time=$start_time&end_time=$end_time" \
  -H "Authorization: Bearer {ADMIN_TOKEN}")

# Apply to replica.
# Note: the documented request body also includes source_instance and checksum,
# which the /api/wal/entries response does not contain.
curl -s -X POST "$REPLICA_URL/api/replication/wal" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {ADMIN_TOKEN}" \
  -d "$entries"

Performance Optimization

Act on Recommendations:

# Get and apply system recommendations
recommendations=$(curl -s .../api/system/analysis | jq -r '.recommendations[]')

# Example: Create recommended indexes
echo "$recommendations" | grep "Create index" | while read -r rec; do
echo "TODO: $rec"
# Parse and create index
done

# Example: Delete unused indexes
echo "$recommendations" | grep "Delete unused index" | while read -r rec; do
echo "Cleanup needed: $rec"
done

Monitoring Alerts

Critical Metrics to Monitor

| Metric | Warning Threshold | Critical Threshold | Action |
|---|---|---|---|
| Disk usage | > 75% | > 90% | Rotate WAL, archive old data |
| Memory usage | > 80% | > 95% | Restart, scale up |
| Query time (p95) | > 100ms | > 500ms | Create indexes, optimize queries |
| WAL file size | > 800MB | > 1GB | Force rotation |
| Error rate | > 1% | > 5% | Investigate logs |
| Active transactions | > 100 | > 500 | Check for stuck transactions |
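Each row above pairs a warning threshold with a critical threshold, which maps onto a three-way classification. The function below is a minimal sketch of that check using the strict greater-than comparison from the table. `classify` and the label `ok` are hypothetical names chosen for illustration.

```shell
# classify VALUE WARN CRIT: print ok, warning, or critical.
# awk handles fractional values such as 4.8.
classify() {
  awk -v v="$1" -v warn="$2" -v crit="$3" 'BEGIN {
    if (v > crit)      print "critical"
    else if (v > warn) print "warning"
    else               print "ok"
  }'
}

classify 88 75 90     # disk usage of 88% against the 75/90 row
classify 95.5 80 95   # memory usage of 95.5% against the 80/95 row
```

A value exactly on a threshold, such as 75% disk usage, is reported as `ok` because the table uses `>` rather than `>=`.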

Sample Monitoring Script

#!/bin/bash

ALERT_EMAIL="ops@example.com"
THRESHOLD_DISK=75
THRESHOLD_MEM=80
THRESHOLD_ERRORS=10

# Get health data
health=$(curl -s "https://{EKODB_API_URL}/api/health" \
  -H "Authorization: Bearer {ADMIN_TOKEN}")

# Check WAL disk usage (disk_space is reported by /api/wal/health, not /api/health)
disk_usage=$(curl -s "https://{EKODB_API_URL}/api/wal/health" \
  -H "Authorization: Bearer {ADMIN_TOKEN}" | jq -r '.disk_space.usage_percent')
if (( $(echo "$disk_usage > $THRESHOLD_DISK" | bc -l) )); then
  echo "ALERT: Disk usage at $disk_usage%" | mail -s "ekoDB Alert" "$ALERT_EMAIL"
fi

# Check memory
mem_usage=$(echo "$health" | jq -r '.memory.usage_percent')
if (( $(echo "$mem_usage > $THRESHOLD_MEM" | bc -l) )); then
  echo "ALERT: Memory usage at $mem_usage%" | mail -s "ekoDB Alert" "$ALERT_EMAIL"
fi

# Check recent errors
error_count=$(curl -s "https://{EKODB_API_URL}/api/system/logs?level=error&limit=100" \
  -H "Authorization: Bearer {ADMIN_TOKEN}" | jq '.logs | length')

if [ "$error_count" -gt "$THRESHOLD_ERRORS" ]; then
  echo "ALERT: $error_count errors in last period" | mail -s "ekoDB Alert" "$ALERT_EMAIL"
fi

Troubleshooting

High Memory Usage

# Check analytics for collection sizes
curl -s https://{EKODB_API_URL}/api/analytics \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.collections[] | {name: .[0], size: .[1].total_size, records: .[1].record_count}' \
| jq -s 'sort_by(.size) | reverse | .[0]'

# Check memory metrics
curl -s https://{EKODB_API_URL}/api/analytics \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.memory'

# Check health status for memory usage (used, available, and usage_percent)
curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.memory'

# Consider:
# - Archiving old data from large collections
# - Implementing pagination
# - Adding query result limits
# - Clearing unused indexes

Slow Query Performance

# Check if indexes exist for frequently queried fields
curl -s https://{EKODB_API_URL}/api/indexes/query/{collection} \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Use explain to analyze query performance
curl -X POST https://{EKODB_API_URL}/api/query/{collection}/explain \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"filter": {
"type": "expression",
"content": {
"field": "your_field",
"operator": "Equals",
"value": "your_value"
}
}
}'

# Create indexes for fields without them
curl -X POST https://{EKODB_API_URL}/api/indexes/query/{collection} \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{"field": "frequently_queried_field", "index_type": "btree"}'

# Check current and recommended batch settings and adjust if needed
curl -s https://{EKODB_API_URL}/api/system/analysis \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '{current_settings, recommended_settings}'

WAL Disk Space Issues

# Check WAL health and disk usage
curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '{status, disk_usage: .disk_space.usage_percent, size_gb: (.current_file_size_bytes / 1073741824)}'

# Rotate immediately if usage > 80%
usage=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq -r '.disk_space.usage_percent')

if (( $(echo "$usage > 80" | bc -l) )); then
echo "WAL disk usage critical at $usage%, rotating..."
curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"
fi

# Get WAL entries for backup before cleanup
end_time=$(date +%s)
start_time=$((end_time - 86400)) # Last 24 hours
curl -s "https://{EKODB_API_URL}/api/wal/entries?start_time=$start_time&end_time=$end_time" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> wal_backup_$(date +%Y%m%d).json

# After backing up, rotation will free space

Ripple Configuration

For multi-node deployments, configure data propagation between instances:

# Configure ripples on a node (add each peer separately)
curl -X POST https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": "peer1",
"url": "https://peer1.ekodb.net:8080",
"api_key": "peer1-admin-key",
"mode": "Operations",
"enabled": true
}'

curl -X POST https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": "peer2",
"url": "https://peer2.ekodb.net:8080",
"api_key": "peer2-admin-key",
"mode": "Operations",
"enabled": true
}'

# List configured ripples
curl -X GET https://{EKODB_API_URL}/api/ripples/list \
-H "Authorization: Bearer {ADMIN_TOKEN}"

# Check ripple health
curl -X GET https://{EKODB_API_URL}/api/ripples/list \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.ripples[] | {url, status, last_sync}'

Ripple Modes:

  • send - Propagate writes to peers (primary nodes)
  • receive - Receive updates from peers (read replicas)
  • both - Full bidirectional sync (multi-master)
  • none - No propagation (isolated nodes)

Ripple Use Cases

See Ripples - Data Propagation for comprehensive guides on multi-region deployments, read scaling, high availability architectures, and data pipeline patterns.

Manifest Recovery

Rebuild Collection Manifests

Rebuild collection manifests from individual record files. Useful for recovering from schema evolution issues, stale manifest data after upgrades, or collection loading problems after crashes.

POST https://{EKODB_API_URL}/api/admin/rebuild-manifests
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}

# Rebuild specific collections
{
"collections": ["users", "orders"]
}

# Rebuild ALL collections (omit collections field)
{}

# Response
{
"success": true,
"rebuilt_collections": ["users", "orders"],
"message": "Successfully rebuilt 2 collection(s)"
}

Backup First

This operation automatically backs up existing manifests (.backup extension) before rebuilding, but it's recommended to have a full backup before running this in production.
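The two request shapes shown earlier (a `collections` array, or an empty object to rebuild everything) can be produced by one small helper. This is a sketch: `rebuild_body` is a hypothetical name, and collection names are assumed to contain no quotes or backslashes, since no JSON escaping is done.

```shell
# rebuild_body [COLLECTION ...]
# Print the JSON request body: {} with no arguments (rebuild all),
# otherwise {"collections":[...]} listing the given names.
rebuild_body() (
  [ "$#" -gt 0 ] || { printf '{}\n'; return 0; }
  list=""
  for name in "$@"; do
    list="${list:+$list,}\"$name\""
  done
  printf '{"collections":[%s]}\n' "$list"
)

rebuild_body users orders
rebuild_body
```

The output can be passed to `curl -X POST` with `-d "$(rebuild_body users orders)"` and the headers shown in the request above.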

Use Cases:

  • Schema Evolution Issues - Records missing after adding new required fields
  • Stale Manifest Data - Manifest doesn't reflect actual records on disk
  • Post-Crash Recovery - Collection loading problems after unexpected shutdown
  • Version Upgrades - Manifest format changes between versions

What It Does:

  1. Backs up existing manifest files
  2. Scans individual record files
  3. Rebuilds manifest from actual record data
  4. Updates in-memory collections

Public Endpoint Rate Limiting

ekoDB includes configurable IP-based rate limiting for public endpoints that don't require authentication. This helps mitigate DDoS and brute-force attacks against these endpoints.

Protected Endpoints

  • /api/health - Health check
  • /api/auth/register - API key registration
  • /api/auth/token - Token generation
  • /api/auth/{api_key}/admin_key - Admin key retrieval

Configuration

Configure via environment variables or database config:

| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| Enabled | PUBLIC_ENDPOINT_RATE_LIMIT_ENABLED | true | Enable/disable rate limiting |
| Limit | PUBLIC_ENDPOINT_RATE_LIMIT_PER_MINUTE | 60 | Max requests per IP per minute |
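As a configuration sketch, the two variables can be exported in the environment that starts the ekoDB process. The value 120 is illustrative only, and how and when the server reads these variables (for example, only at startup) is an assumption.

```shell
# Keep rate limiting on, but allow 120 requests per IP per minute
# instead of the default 60.
export PUBLIC_ENDPOINT_RATE_LIMIT_ENABLED=true
export PUBLIC_ENDPOINT_RATE_LIMIT_PER_MINUTE=120
```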

Rate Limit Response

When rate limited, the server returns:

HTTP/1.1 429 Too Many Requests
Retry-After: 45
Content-Type: application/json

{
"error": "Rate limit exceeded",
"retry_after_secs": 45
}
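A client can honor this response by waiting for the number of seconds given in `Retry-After` before retrying. The sketch below extracts that value from raw response headers. `retry_after_secs` is a hypothetical helper, and it handles only the seconds form shown above, not the HTTP-date form of the header.

```shell
# retry_after_secs [FALLBACK]
# Read raw HTTP response headers on stdin and print the Retry-After value.
# Prints FALLBACK (default 1) when the header is absent.
retry_after_secs() {
  awk -v fallback="${1:-1}" '
    tolower($1) == "retry-after:" {
      gsub(/\r/, "", $2)   # header lines end in CRLF
      print $2
      found = 1
      exit
    }
    END { if (!found) print fallback }
  '
}

printf 'HTTP/1.1 429 Too Many Requests\r\nRetry-After: 45\r\n\r\n' | retry_after_secs

# Possible use with curl (-D - writes the response headers to stdout):
#   headers=$(curl -s -o /dev/null -D - "https://{EKODB_API_URL}/api/health")
#   sleep "$(printf '%s\n' "$headers" | retry_after_secs)"
```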

IP Detection

The rate limiter detects client IP from (in order):

  1. X-Forwarded-For header (first IP in list)
  2. X-Real-IP header
  3. Direct connection IP

Load Balancer Configuration

If using a load balancer, ensure it forwards the original client IP via X-Forwarded-For or X-Real-IP headers for accurate rate limiting.
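The detection order listed above can be sketched as a small function. It mirrors the documented order only and is not the server's implementation. `client_ip` is a hypothetical name, and the addresses come from the documentation-reserved ranges.

```shell
# client_ip XFF X_REAL_IP REMOTE_ADDR
# Pick the client IP: first entry of X-Forwarded-For, else X-Real-IP,
# else the direct connection address. Pass "" for an absent header.
client_ip() (
  xff=$1 real_ip=$2 remote=$3
  # Text before the first comma, with whitespace removed
  first=$(printf '%s' "${xff%%,*}" | tr -d '[:space:]')
  if [ -n "$first" ]; then
    printf '%s\n' "$first"
  elif [ -n "$real_ip" ]; then
    printf '%s\n' "$real_ip"
  else
    printf '%s\n' "$remote"
  fi
)

client_ip "203.0.113.7, 10.0.0.2" "" "10.0.0.2"
```

Because the first `X-Forwarded-For` entry is supplied by whoever sent the request, a load balancer that appends to the header instead of overwriting it may let clients choose the IP they are rate limited under.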