System Administration
Administrative endpoints for monitoring, maintenance, and system analysis.
All endpoints in this section require admin permissions.
Health Monitoring
Check System Health
Get the overall system health status. This endpoint is partially public: it returns basic status without authentication and detailed metrics with admin authentication.
GET https://{EKODB_API_URL}/api/health
# Optional: Authorization: Bearer {ADMIN_TOKEN}
# Response (without auth - basic)
{
"status": "healthy",
"uptime_seconds": 86400,
"version": "1.2.3"
}
# Response (with admin auth - detailed)
{
"status": "healthy",
"uptime_seconds": 86400,
"version": "1.2.3",
"database": {
"status": "healthy",
"collections": 12,
"total_records": 1234567,
"size_bytes": 5242880000
},
"memory": {
"used_bytes": 2147483648,
"available_bytes": 4294967296,
"usage_percent": 50
},
"wal": {
"status": "healthy",
"current_file": "wal_00000042.log",
"size_bytes": 104857600,
"entries_count": 50000
},
"batch_processing": {
"active_operations": 5,
"queued_operations": 12,
"capacity_used_percent": 25
},
"performance": {
"avg_query_time_ms": 15,
"requests_per_second": 1250,
"cache_hit_rate": 0.85
}
}
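The detailed payload above can be condensed client-side into a few alert flags. The sketch below assumes the field names shown in the example response; the 80%/90% cut-offs are illustrative thresholds, not values defined by the server, and `summarize_health` is a hypothetical helper name.

```python
import json

def summarize_health(payload: dict) -> dict:
    """Condense the detailed /api/health payload into a few alert flags.

    Field names follow the example response above; the 80%/90%
    thresholds are illustrative, not server-defined.
    """
    mem = payload.get("memory", {})
    batch = payload.get("batch_processing", {})
    perf = payload.get("performance", {})
    return {
        "healthy": payload.get("status") == "healthy",
        "memory_pressure": mem.get("usage_percent", 0) > 80,
        "batch_saturated": batch.get("capacity_used_percent", 0) > 90,
        "cache_hit_rate": perf.get("cache_hit_rate"),
    }

# Feed it a trimmed copy of the example response body:
detailed = json.loads("""
{"status": "healthy",
 "memory": {"usage_percent": 50},
 "batch_processing": {"capacity_used_percent": 25},
 "performance": {"cache_hit_rate": 0.85}}
""")
print(summarize_health(detailed))
```

A monitoring agent can poll `/api/health` and page an operator only when one of these flags flips.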
Write-Ahead Log (WAL)
The Write-Ahead Log ensures data durability and enables replication.
Get WAL Health
Check the status and health of the WAL system.
GET https://{EKODB_API_URL}/api/wal/health
Authorization: Bearer {ADMIN_TOKEN}
# Response
{
"status": "healthy",
"current_file": "wal_00000042.log",
"current_file_size_bytes": 104857600,
"total_entries": 50000,
"last_entry_timestamp": "2024-01-15T15:45:30Z",
"disk_space": {
"used_bytes": 524288000,
"available_bytes": 10737418240,
"usage_percent": 4.8
},
"rotation": {
"auto_rotation_enabled": true,
"rotation_threshold_bytes": 1073741824,
"last_rotation": "2024-01-15T10:00:00Z"
},
"warnings": []
}
Possible Statuses:
- healthy - WAL is functioning normally
- warning - Approaching disk space limits
- critical - Immediate attention required
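The three statuses map naturally onto disk-usage bands. The cut-offs in this sketch (75% warning, 90% critical) are assumptions for illustration; the server's actual thresholds are not stated in the response.

```python
def wal_status_for_usage(usage_percent: float) -> str:
    """Map WAL disk usage to the three documented statuses.

    The 75%/90% cut-offs are assumed values for this sketch, not the
    server's actual internal thresholds.
    """
    if usage_percent >= 90:
        return "critical"
    if usage_percent >= 75:
        return "warning"
    return "healthy"
```

The example response above (4.8% usage) would classify as `healthy` under these assumed bands.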
Rotate WAL File
Manually trigger WAL file rotation.
POST https://{EKODB_API_URL}/api/wal/rotate
Authorization: Bearer {ADMIN_TOKEN}
# Response
{
"status": "rotated",
"old_file": "wal_00000042.log",
"new_file": "wal_00000043.log",
"old_file_size_bytes": 104857600,
"entries_moved": 50000,
"rotation_time_ms": 342
}
Common reasons to rotate manually:
- Before backups
- When approaching file size limits
- During maintenance windows
- For replication synchronization
Get WAL Entries
Retrieve WAL entries within a time range (used for replication and gap filling).
GET https://{EKODB_API_URL}/api/wal/entries?start_time=1705329600&end_time=1705333200
Authorization: Bearer {ADMIN_TOKEN}
# Response
{
"entries": [
{
"sequence": 50001,
"timestamp": "2024-01-15T15:40:00Z",
"operation": "insert",
"collection": "users",
"record_id": "user_123",
"data": {...},
"transaction_id": null
},
{
"sequence": 50002,
"timestamp": "2024-01-15T15:40:05Z",
"operation": "update",
"collection": "posts",
"record_id": "post_456",
"changes": {...},
"transaction_id": "tx-001"
},
{
"sequence": 50003,
"timestamp": "2024-01-15T15:40:10Z",
"operation": "delete",
"collection": "comments",
"record_id": "comment_789",
"transaction_id": null
}
],
"total": 3,
"start_sequence": 50001,
"end_sequence": 50003
}
Query Parameters:
- start_time - Unix timestamp (seconds)
- end_time - Unix timestamp (seconds)
- limit - Max entries to return (default: 1000)
- offset - Pagination offset
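With `limit` and `offset`, a client can walk a large time window page by page. In this sketch, `fetch_page` is a caller-supplied hypothetical wrapper around the HTTP GET that returns the parsed JSON body for one page; only the pagination logic is shown.

```python
def fetch_all_wal_entries(fetch_page, start_time: int, end_time: int,
                          limit: int = 1000):
    """Collect every WAL entry in a window by walking limit/offset pages.

    `fetch_page` is a hypothetical caller-supplied function wrapping
    GET /api/wal/entries; it returns one page's parsed JSON body.
    """
    entries, offset = [], 0
    while True:
        page = fetch_page(start_time=start_time, end_time=end_time,
                          limit=limit, offset=offset)
        batch = page.get("entries", [])
        entries.extend(batch)
        if len(batch) < limit:  # a short page means we reached the end
            return entries
        offset += limit
```

This stops on the first short page, so it issues exactly one request more than the number of full pages.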
Replication
Receive WAL Shipment
Receive and apply WAL entries from a peer instance (for replication and gap filling).
POST https://{EKODB_API_URL}/api/replication/wal
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
{
"source_instance": "primary-db-01",
"entries": [
{
"sequence": 50001,
"timestamp": "2024-01-15T15:40:00Z",
"operation": "insert",
"collection": "users",
"record_id": "user_123",
"data": {...}
},
{
"sequence": 50002,
"timestamp": "2024-01-15T15:40:05Z",
"operation": "update",
"collection": "posts",
"record_id": "post_456",
"changes": {...}
}
],
"checksum": "a3f5b8c9d2e1f4a7"
}
# Response
{
"status": "applied",
"entries_received": 2,
"entries_applied": 2,
"entries_skipped": 0,
"last_sequence_applied": 50002,
"conflicts": [],
"apply_time_ms": 45
}
Conflict Handling:
{
"status": "partial",
"entries_received": 3,
"entries_applied": 2,
"entries_skipped": 1,
"conflicts": [
{
"sequence": 50003,
"reason": "record_already_exists",
"collection": "users",
"record_id": "user_123",
"resolution": "skipped"
}
]
}
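The receiving side's behavior can be sketched as a sequence-checked apply loop that mirrors the response fields above. Here `record_exists(collection, record_id)` is a hypothetical lookup standing in for the server's internal conflict detection; real resolution logic lives inside ekoDB.

```python
def apply_shipment(entries, last_applied_seq, record_exists):
    """Replica-side apply loop mirroring the shipment response fields.

    `record_exists` is a hypothetical lookup; actual conflict detection
    is internal to the server.
    """
    applied = skipped = 0
    conflicts = []
    for e in entries:
        if e["sequence"] <= last_applied_seq:
            skipped += 1  # already applied: replay is idempotent
            continue
        if e["operation"] == "insert" and record_exists(e["collection"],
                                                        e["record_id"]):
            skipped += 1
            conflicts.append({"sequence": e["sequence"],
                              "reason": "record_already_exists",
                              "collection": e["collection"],
                              "record_id": e["record_id"],
                              "resolution": "skipped"})
            continue
        applied += 1
        last_applied_seq = e["sequence"]
    return {"status": "applied" if not conflicts else "partial",
            "entries_received": len(entries),
            "entries_applied": applied,
            "entries_skipped": skipped,
            "last_sequence_applied": last_applied_seq,
            "conflicts": conflicts}
```

Skipping entries at or below `last_applied_seq` is what makes re-sending an overlapping window safe.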
System Analysis
Get System Analysis
Analyze system hardware capabilities and get recommended configuration settings.
GET https://{EKODB_API_URL}/api/system/analysis
Authorization: Bearer {ADMIN_TOKEN}
# Response
{
"current_specs": {
"cpu_count": 8,
"cpu_frequency": 2400,
"total_memory": 17179869184,
"disk_free": 107374182400,
"has_gpu": false,
"gpu_count": 0,
"total_gpu_memory": 0
},
"recommended_tier": "performance",
"current_settings": {
"batch_parallel_max": 1000,
"batch_sequential_max": 10000,
"insert_batch_size": 500,
"update_batch_size": 200,
"delete_batch_size": 300,
"query_batch_size": 1000,
"max_concurrent_ops": 100,
"operation_timeout": 30,
"compression_level": 6
},
"recommended_settings": {
"batch_parallel_max": 2000,
"batch_sequential_max": 20000,
"insert_batch_size": 1000,
"update_batch_size": 500,
"delete_batch_size": 500,
"query_batch_size": 2000,
"max_concurrent_ops": 200,
"operation_timeout": 60,
"compression_level": 3
},
"recommendations": [
"System has 8 CPU cores - optimal for parallel operations",
"16GB RAM available - can handle large batch operations",
"Consider increasing batch sizes for better throughput",
"Fast compression recommended for better performance"
]
}
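A quick way to act on this response is to diff `current_settings` against `recommended_settings` and surface only the keys worth changing. This is a minimal sketch over the documented fields:

```python
def settings_diff(current: dict, recommended: dict) -> dict:
    """List settings whose recommended value differs from the current one."""
    return {key: {"current": current.get(key), "recommended": value}
            for key, value in recommended.items()
            if current.get(key) != value}

# A subset of the example response:
current = {"insert_batch_size": 500, "compression_level": 6,
           "operation_timeout": 30}
recommended = {"insert_batch_size": 1000, "compression_level": 3,
               "operation_timeout": 60}
print(settings_diff(current, recommended))
```

Unchanged keys drop out, so the result is exactly the change set an operator would review before applying.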
- System Analysis (/api/system/analysis) - Hardware specs and configuration recommendations
- Analytics (/api/analytics) - Database statistics, collection sizes, and performance metrics
Get Analytics Data
Get database statistics, collection metrics, and performance data.
GET https://{EKODB_API_URL}/api/analytics
Authorization: Bearer {ADMIN_TOKEN}
# Response
{
"cache": {
"hits": 50000,
"misses": 5000,
"total_requests": 55000,
"hit_rate": 0.909
},
"io": {
"reads": 100000,
"writes": 50000
},
"cpu": {
"cpu_count": 8,
"usage_ratio": 0.35
},
"memory": {
"total_memory": 17179869184,
"used_memory": 6442450944
},
"network": {
"ingress_bytes": 1073741824,
"egress_bytes": 2147483648
},
"database": {
"total_size": 10737418240,
"available_size": 96636764160,
"system_used_space": 5368709120,
"db_used_space": 5368709120,
"max_size": 107374182400
},
"collections": [
["users", {"record_count": 10000, "total_size": 1048576}],
["posts", {"record_count": 50000, "total_size": 5242880}],
["events", {"record_count": 500000, "total_size": 52428800}]
],
"performance_logs": []
}
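Note the wire format: `collections` is a list of `[name, stats]` pairs, not a list of objects. Ranking collections by size therefore indexes into each pair, as in this sketch:

```python
def largest_collections(analytics: dict, n: int = 3):
    """Rank collections by total_size.

    `collections` in the analytics response is a list of [name, stats]
    pairs rather than a list of objects, so we index into each pair.
    """
    pairs = analytics.get("collections", [])
    ranked = sorted(pairs, key=lambda pair: pair[1]["total_size"],
                    reverse=True)
    return [(name, stats["total_size"]) for name, stats in ranked[:n]]
```

Run against the example response above, `events` ranks first at 52428800 bytes.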
Get System Logs
Retrieve system logs for troubleshooting.
GET https://{EKODB_API_URL}/api/system/logs?level=error&limit=100
Authorization: Bearer {ADMIN_TOKEN}
# Response
{
"logs": [
{
"timestamp": "2024-01-15T15:45:30Z",
"level": "error",
"component": "query_engine",
"message": "Query timeout after 30s for collection 'events'",
"context": {
"collection": "events",
"query": {...},
"timeout_ms": 30000
}
},
{
"timestamp": "2024-01-15T15:40:00Z",
"level": "warning",
"component": "wal",
"message": "WAL file approaching size limit",
"context": {
"current_size_bytes": 943718400,
"limit_bytes": 1073741824,
"usage_percent": 88
}
}
],
"total": 2,
"oldest_timestamp": "2024-01-15T10:00:00Z",
"newest_timestamp": "2024-01-15T15:45:30Z"
}
Query Parameters:
- level - Log level filter (error, warning, info, debug)
- component - Filter by system component
- start_time - Unix timestamp
- end_time - Unix timestamp
- limit - Max logs to return (default: 100)
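Since `start_time` and `end_time` are Unix seconds, a small helper can build the parameter set for a recent window. `log_query_params` is a hypothetical helper name; it only assembles the documented parameters.

```python
from datetime import datetime, timedelta, timezone

def log_query_params(level=None, component=None, window_hours=1, limit=100):
    """Build /api/system/logs query parameters; times are Unix seconds."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=window_hours)
    params = {"start_time": int(start.timestamp()),
              "end_time": int(end.timestamp()),
              "limit": limit}
    if level:
        params["level"] = level
    if component:
        params["component"] = component
    return params

print(log_query_params(level="error", window_hours=24, limit=50))
```

Optional filters are omitted when unset, matching how the endpoint treats absent parameters.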
Complete Example
Here's a complete system monitoring and maintenance workflow:
#!/bin/bash
# 1. Check system health
health=$(curl -s -X GET https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}")
echo "System Status: $(echo $health | jq -r '.status')"
echo "Memory Usage: $(echo $health | jq -r '.memory.usage_percent')%"
echo "WAL Status: $(echo $health | jq -r '.wal.status')"
# 2. Check WAL health
wal_health=$(curl -s -X GET https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}")
wal_usage=$(echo $wal_health | jq -r '.disk_space.usage_percent')
echo "WAL Disk Usage: $wal_usage%"
# 3. Rotate WAL if needed (> 80% full)
if (( $(echo "$wal_usage > 80" | bc -l) )); then
echo "WAL disk usage high, rotating..."
curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"
fi
# 4. Get system analysis
analysis=$(curl -s -X GET https://{EKODB_API_URL}/api/system/analysis \
-H "Authorization: Bearer {ADMIN_TOKEN}")
# Database totals come from the detailed health response, not the analysis
echo "Total Records: $(echo $health | jq -r '.database.total_records')"
echo "Total Size: $(echo $health | jq -r '.database.size_bytes' | numfmt --to=iec)"
# 5. Check for performance recommendations
echo "Recommendations:"
echo $analysis | jq -r '.recommendations[]'
# 6. Check for errors in logs
errors=$(curl -s -X GET "https://{EKODB_API_URL}/api/system/logs?level=error&limit=10" \
-H "Authorization: Bearer {ADMIN_TOKEN}")
error_count=$(echo $errors | jq '.logs | length')
echo "Recent Errors: $error_count"
if [ "$error_count" -gt 0 ]; then
echo "Latest error: $(echo $errors | jq -r '.logs[0].message')"
fi
# 7. Get WAL entries for backup (last hour)
end_time=$(date +%s)
start_time=$((end_time - 3600))
curl -s -X GET "https://{EKODB_API_URL}/api/wal/entries?start_time=$start_time&end_time=$end_time&limit=1000" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> wal_backup_$(date +%Y%m%d_%H%M%S).json
echo "WAL backup complete"
Best Practices
Health Monitoring
Set Up Regular Health Checks:
# Check health every 5 minutes
*/5 * * * * curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq -r '.status' | \
grep -q "healthy" || echo "ALERT: Database unhealthy!"
WAL Management
Monitor Disk Usage:
# Alert if WAL disk usage > 75%
wal_usage=$(curl -s .../api/wal/health | jq -r '.disk_space.usage_percent')
if (( $(echo "$wal_usage > 75" | bc -l) )); then
echo "WARNING: WAL disk usage at $wal_usage%"
fi
Scheduled Rotation:
# Rotate WAL daily at 3 AM
0 3 * * * curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"
Log Retention
Archive Old Logs:
# Export and archive logs older than 7 days
seven_days_ago=$(date -d '7 days ago' +%s)
now=$(date +%s)
curl -s "https://{EKODB_API_URL}/api/system/logs?end_time=$seven_days_ago&limit=10000" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> archived_logs_$(date +%Y%m%d).json
Replication Setup
Configure Primary-Replica:
# On replica: Fetch and apply WAL entries every minute
*/1 * * * * bash /scripts/replicate_wal.sh
# replicate_wal.sh
#!/bin/bash
PRIMARY_URL="https://primary.ekodb.net"
REPLICA_URL="https://replica.ekodb.net"
# Get latest WAL entries from primary
end_time=$(date +%s)
start_time=$((end_time - 120)) # Last 2 minutes
entries=$(curl -s "$PRIMARY_URL/api/wal/entries?start_time=$start_time&end_time=$end_time" \
-H "Authorization: Bearer {ADMIN_TOKEN}")
# Apply to replica
curl -X POST "$REPLICA_URL/api/replication/wal" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d "$entries"
Performance Optimization
Act on Recommendations:
# Get and apply system recommendations
recommendations=$(curl -s .../api/system/analysis | jq -r '.recommendations[]')
# Example: Create recommended indexes
echo "$recommendations" | grep "Create index" | while read -r rec; do
echo "TODO: $rec"
# Parse and create index
done
# Example: Delete unused indexes
echo "$recommendations" | grep "Delete unused index" | while read -r rec; do
echo "Cleanup needed: $rec"
done
Monitoring Alerts
Critical Metrics to Monitor
| Metric | Warning Threshold | Critical Threshold | Action |
|---|---|---|---|
| Disk usage | > 75% | > 90% | Rotate WAL, archive old data |
| Memory usage | > 80% | > 95% | Restart, scale up |
| Query time (p95) | > 100ms | > 500ms | Create indexes, optimize queries |
| WAL file size | > 800MB | > 1GB | Force rotation |
| Error rate | > 1% | > 5% | Investigate logs |
| Active transactions | > 100 | > 500 | Check for stuck transactions |
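The percent-based rows of the table translate directly into a lookup-and-classify helper. The thresholds below are transcribed from the table; the metric key names are illustrative.

```python
# Warning/critical cut-offs transcribed from the table above
# (percent-based metrics only, for brevity; key names are illustrative).
THRESHOLDS = {
    "disk_usage_percent": (75, 90),
    "memory_usage_percent": (80, 95),
    "error_rate_percent": (1, 5),
}

def classify(metric: str, value: float) -> str:
    """Return ok / warning / critical for a sampled metric value."""
    warn, crit = THRESHOLDS[metric]
    if value > crit:
        return "critical"
    if value > warn:
        return "warning"
    return "ok"
```

A monitoring loop can call `classify` per sample and alert only on level transitions, avoiding repeated pages for a metric that stays in the same band.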
Sample Monitoring Script
#!/bin/bash
ALERT_EMAIL="ops@example.com"
THRESHOLD_DISK=75
THRESHOLD_MEM=80
THRESHOLD_ERRORS=10
# Get health data
health=$(curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}")
# Check WAL disk usage (disk_space comes from the WAL health endpoint, not /api/health)
disk_usage=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" | jq -r '.disk_space.usage_percent')
if (( $(echo "$disk_usage > $THRESHOLD_DISK" | bc -l) )); then
echo "ALERT: Disk usage at $disk_usage%" | mail -s "ekoDB Alert" $ALERT_EMAIL
fi
# Check memory
mem_usage=$(echo $health | jq -r '.memory.usage_percent')
if (( $(echo "$mem_usage > $THRESHOLD_MEM" | bc -l) )); then
echo "ALERT: Memory usage at $mem_usage%" | mail -s "ekoDB Alert" $ALERT_EMAIL
fi
# Check recent errors
error_count=$(curl -s "https://{EKODB_API_URL}/api/system/logs?level=error&limit=100" \
-H "Authorization: Bearer {ADMIN_TOKEN}" | jq '.logs | length')
if [ "$error_count" -gt "$THRESHOLD_ERRORS" ]; then
echo "ALERT: $error_count errors in last period" | mail -s "ekoDB Alert" $ALERT_EMAIL
fi
Troubleshooting
High Memory Usage
# Check analytics for collection sizes
curl -s https://{EKODB_API_URL}/api/analytics \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.collections[] | {name: .[0], size: .[1].total_size, records: .[1].record_count}' \
| jq -s 'sort_by(.size) | reverse | .[0]'
# Check memory metrics
curl -s https://{EKODB_API_URL}/api/analytics \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.memory'
# Check memory usage percent from the detailed health response
curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.memory.usage_percent'
# Consider:
# - Archiving old data from large collections
# - Implementing pagination
# - Adding query result limits
# - Clearing unused indexes
Slow Query Performance
# Check if indexes exist for frequently queried fields
curl -s https://{EKODB_API_URL}/api/indexes/query/{collection} \
-H "Authorization: Bearer {ADMIN_TOKEN}"
# Use explain to analyze query performance
curl -X POST https://{EKODB_API_URL}/api/query/{collection}/explain \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{
"filter": {
"type": "expression",
"content": {
"field": "your_field",
"operator": "Equals",
"value": "your_value"
}
}
}'
# Create indexes for fields without them
curl -X POST https://{EKODB_API_URL}/api/indexes/query/{collection} \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-d '{"field": "frequently_queried_field", "index_type": "btree"}'
# Check batch processing load and adjust settings if needed
curl -s https://{EKODB_API_URL}/api/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.batch_processing'
WAL Disk Space Issues
# Check WAL health and disk usage
curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '{status, disk_usage: .disk_space.usage_percent, size_gb: (.current_file_size_bytes / 1073741824)}'
# Rotate immediately if usage > 80%
usage=$(curl -s https://{EKODB_API_URL}/api/wal/health \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq -r '.disk_space.usage_percent')
if (( $(echo "$usage > 80" | bc -l) )); then
echo "WAL disk usage critical at $usage%, rotating..."
curl -X POST https://{EKODB_API_URL}/api/wal/rotate \
-H "Authorization: Bearer {ADMIN_TOKEN}"
fi
# Get WAL entries for backup before cleanup
end_time=$(date +%s)
start_time=$((end_time - 86400)) # Last 24 hours
curl -s "https://{EKODB_API_URL}/api/wal/entries?start_time=$start_time&end_time=$end_time" \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
> wal_backup_$(date +%Y%m%d).json
# After backing up, rotation will free space
Ripple Configuration
For multi-node deployments, configure data propagation between instances:
# Configure ripples on a node (add each peer separately)
curl -X POST https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": "peer1",
"url": "https://peer1.ekodb.net:8080",
"api_key": "peer1-admin-key",
"mode": "Operations",
"enabled": true
}'
curl -X POST https://{EKODB_API_URL}/api/ripples/config \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"name": "peer2",
"url": "https://peer2.ekodb.net:8080",
"api_key": "peer2-admin-key",
"mode": "Operations",
"enabled": true
}'
# List configured ripples
curl -X GET https://{EKODB_API_URL}/api/ripples/list \
-H "Authorization: Bearer {ADMIN_TOKEN}"
# Check ripple health
curl -X GET https://{EKODB_API_URL}/api/ripples/list \
-H "Authorization: Bearer {ADMIN_TOKEN}" \
| jq '.ripples[] | {url, status, last_sync}'
Ripple Modes:
- send - Propagate writes to peers (primary nodes)
- receive - Receive updates from peers (read replicas)
- both - Full bidirectional sync (multi-master)
- none - No propagation (isolated nodes)
See Ripples - Data Propagation for comprehensive guides on multi-region deployments, read scaling, high availability architectures, and data pipeline patterns.
Manifest Recovery
Rebuild Collection Manifests
Rebuild collection manifests from individual record files. Useful for recovering from schema evolution issues, stale manifest data after upgrades, or collection loading problems after crashes.
POST https://{EKODB_API_URL}/api/admin/rebuild-manifests
Content-Type: application/json
Authorization: Bearer {ADMIN_TOKEN}
# Rebuild specific collections
{
"collections": ["users", "orders"]
}
# Rebuild ALL collections (omit collections field)
{}
# Response
{
"success": true,
"rebuilt_collections": ["users", "orders"],
"message": "Successfully rebuilt 2 collection(s)"
}
This operation automatically backs up existing manifests (.backup extension) before rebuilding, but it's recommended to have a full backup before running this in production.
Use Cases:
- Schema Evolution Issues - Records missing after adding new required fields
- Stale Manifest Data - Manifest doesn't reflect actual records on disk
- Post-Crash Recovery - Collection loading problems after unexpected shutdown
- Version Upgrades - Manifest format changes between versions
What It Does:
- Backs up existing manifest files
- Scans individual record files
- Rebuilds manifest from actual record data
- Updates in-memory collections
Public Endpoint Rate Limiting
ekoDB includes configurable IP-based rate limiting for public endpoints that don't require authentication. This protects against DDoS and brute force attacks.
Protected Endpoints
- /api/health - Health check
- /api/auth/register - API key registration
- /api/auth/token - Token generation
- /api/auth/{api_key}/admin_key - Admin key retrieval
Configuration
Configure via environment variables or database config:
| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| Enabled | PUBLIC_ENDPOINT_RATE_LIMIT_ENABLED | true | Enable/disable rate limiting |
| Limit | PUBLIC_ENDPOINT_RATE_LIMIT_PER_MINUTE | 60 | Max requests per IP per minute |
Rate Limit Response
When rate limited, the server returns:
HTTP/1.1 429 Too Many Requests
Retry-After: 45
Content-Type: application/json
{
"error": "Rate limit exceeded",
"retry_after_secs": 45
}
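A well-behaved client should honor the `Retry-After` header before retrying. In this sketch, `do_request` is a hypothetical callable returning `(status_code, headers, body)` for one HTTP attempt; the retry policy is the point, not the transport.

```python
import time

def call_with_retry(do_request, max_attempts=3, sleep=time.sleep):
    """Honor the 429 Retry-After contract shown above.

    `do_request` is a hypothetical callable returning
    (status_code, headers, body) for a single HTTP attempt.
    """
    status, headers, body = do_request()
    for _ in range(max_attempts - 1):
        if status != 429:
            break
        # Wait the server-suggested interval before the next attempt
        sleep(int(headers.get("Retry-After", 1)))
        status, headers, body = do_request()
    return status, body
```

Injecting `sleep` keeps the helper testable; production code would leave the default `time.sleep`.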
IP Detection
The rate limiter detects client IP from (in order):
1. X-Forwarded-For header (first IP in the list)
2. X-Real-IP header
3. Direct connection IP
If using a load balancer, ensure it forwards the original client IP via X-Forwarded-For or X-Real-IP headers for accurate rate limiting.
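The precedence order above can be sketched as a small resolver; `client_ip` is a hypothetical helper name showing only the documented lookup order.

```python
def client_ip(headers: dict, peer_addr: str) -> str:
    """Resolve the client IP using the documented precedence order."""
    forwarded = headers.get("X-Forwarded-For")
    if forwarded:
        return forwarded.split(",")[0].strip()  # first IP in the chain
    real = headers.get("X-Real-IP")
    if real:
        return real.strip()
    return peer_addr  # direct connection
```

Taking the first entry of `X-Forwarded-For` matches the convention that the leftmost address is the original client, with proxies appending their own addresses after it.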
Related Documentation
- Ripples - Data Propagation - Multi-node data propagation and use cases
- Indexes - Create indexes for performance
- Transactions - Understand transaction impact on WAL
- Collections & Schemas - Collection management
- Authentication - Admin key management