Infrastructure Specifications

Target: 10,000 Concurrent Users (Self-Hosted -> Cloud)
Architecture: Cloudflare Tunnel → Game Server Cluster (Self-Hosted) → Database
Last Updated: 2026-01-15

Game Server (Primary)

For 2500 Concurrent Users:
CPU: 8 cores / 16 threads (AMD EPYC or Intel Xeon)
RAM: 32 GB DDR4
Storage: 500 GB NVMe SSD
Network: 1 Gbps uplink
Provider Options:
  - Hetzner AX52: €59/month (~$65/month) ⭐ RECOMMENDED
  - OVH Rise-2: $72/month
  - AWS c7g.2xlarge: $160/month (reserved)
  - DigitalOcean CPU-Optimized 16GB: $144/month
Sizing Calculation:
  • Each WebSocket connection: ~10-15 MB RAM
  • 2500 connections × 12 MB = 30 GB base
  • Add 2 GB for OS + game logic = 32 GB total
  • CPU: Each core handles ~300-400 active connections
  • 2500 ÷ 350 = ~7 cores needed
OS: Ubuntu 22.04 LTS (Server)
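
The sizing arithmetic above can be reproduced with a short script. This is a minimal sketch, not a capacity model: the function name `size_game_server` and its parameter names are made up for illustration, and the defaults (12 MB per connection, 2 GB overhead, 350 connections per core) are taken from the figures in this section.

```python
import math

def size_game_server(connections, mb_per_connection=12, overhead_gb=2,
                     connections_per_core=350):
    """Return (ram_gb, exact_cores, whole_cores) for one game server node."""
    base_gb = connections * mb_per_connection / 1000  # decimal GB, as in the text
    ram_gb = base_gb + overhead_gb                    # OS + game logic overhead
    exact_cores = connections / connections_per_core
    whole_cores = math.ceil(exact_cores)              # cores are provisioned whole
    return ram_gb, exact_cores, whole_cores

ram_gb, exact_cores, whole_cores = size_game_server(2500)
print(f"RAM: {ram_gb:.0f} GB, CPU: {exact_cores:.1f} cores (provision {whole_cores})")
```

Rounding the core estimate up rather than down is what lands on the 8-core specification listed above.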

Database Server

For 2500 Concurrent Users:
Managed PostgreSQL 15:
  - 4 vCPU
  - 16 GB RAM
  - 200 GB SSD
  - Max connections: 200
  
Provider Options:
  - Hetzner Cloud CPX31 + Postgres: $25/month ⭐ BUDGET
  - DigitalOcean Managed DB: $60/month
  - AWS RDS db.t4g.large: $120/month (reserved)
  - Supabase Pro: $25/month (good for MVP)
Sizing Calculation:
  • Active queries: ~50-100 concurrent
  • Connection pool: 100 connections
  • Data growth: ~500 MB/month (10k users)
  • Backup needs: 7-day retention ≈ 3.5 GB (assuming ~500 MB per daily backup)

NETWORK ARCHITECTURE

Internet
   ↓
Cloudflare (Global CDN + DDoS Protection)
   ↓
Cloudflare Tunnel (cloudflared)
   ↓
Game Server :3000 (Rust Backend)
   ↓
PostgreSQL :5432 (Database)
Benefits of Cloudflare Tunnel:
  • ✅ No public IP exposure
  • ✅ Automatic TLS termination
  • ✅ Built-in DDoS protection
  • ✅ Global anycast routing
  • ✅ Zero-trust access
  • ✅ Automatic failover

HIGH AVAILABILITY ARCHITECTURE (Target: 10,000 CCU)

Cluster Design

To support 10,000 Concurrent Users with 99.9% uptime, we employ a horizontally scalable architecture.
graph TD
    User((User)) --> CF[Cloudflare Global Network]
    CF --> CLB[Cloudflare Load Balancer]
    
    subgraph "Game Cluster (Private Network)"
        CLB -->|Tunnel 1| GS1[Game Server 1]
        CLB -->|Tunnel 2| GS2[Game Server 2]
        CLB -->|Tunnel 3| GS3[Game Server 3]
        CLB -->|Tunnel 4| GS4[Game Server 4]
        
        GS1 & GS2 & GS3 & GS4 --> Redis[Redis Cluster]
        GS1 & GS2 & GS3 & GS4 --> PG_Master[PostgreSQL Primary]
        
        PG_Master -->|Async Replication| PG_Read1[Read Replica 1]
        PG_Master -->|Async Replication| PG_Read2[Read Replica 2]
    end

Component Specifications

1. Game Server Nodes (x4)

Each node handles ~2,500 active connections.
  • CPU: 8 cores / 16 threads (AMD EPYC preferred)
  • RAM: 32 GB ECC DDR4
  • Network: 1 Gbps Dedicated
  • Role: WebSocket handling, Game Logic, Physics.
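  • Redundancy: The cluster can survive 1 node failure with degraded performance (the 3 remaining nodes would each carry ~3,333 connections at full load). A true N+1 layout with no degradation would need a fifth node.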
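
To make the failure scenario concrete, the following sketch computes the per-node load after a node drops out. The helper `load_after_failures` is hypothetical, and it assumes connections redistribute evenly across surviving nodes, which a real load balancer only approximates.

```python
def load_after_failures(total_connections, nodes, failed=1):
    """Connections each surviving node carries if load spreads evenly."""
    survivors = nodes - failed
    if survivors <= 0:
        raise ValueError("no surviving nodes")
    return total_connections / survivors

per_node = load_after_failures(10_000, nodes=4, failed=1)
overload = per_node / 2_500 - 1  # fraction above the 2,500-connection design point
print(f"{per_node:.0f} connections per node ({overload:.0%} over design load)")
```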

2. Database Tier

  • Primary (Write):
    • 8 vCPU / 32 GB RAM / 500 GB NVMe
    • Handles all transactional writes (Inventory, Trade, Save).
  • Read Replicas (x2):
    • 4 vCPU / 16 GB RAM
    • Handles Highscores, Analytics, and Read-heavy queries.

3. State Layer (Redis Cluster)

  • Configuration: 3 primaries + 3 replicas.
  • Specs: 4 GB RAM per node.
  • Purpose:
    • Hot Session State
    • Pub/Sub (Chat, Cross-Server Events)
    • Rate Limiting Counters
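
As an illustration of the rate limiting counters, here is a minimal fixed-window limiter. It is a sketch under assumptions: an in-memory dict stands in for Redis (where the same pattern is usually a per-window counter key with an expiry), and the class name, limit, and window size are invented for the example.

```python
import time

class FixedWindowLimiter:
    """Allow at most `limit` actions per key in each fixed time window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.counters = {}  # (key, window index) -> count; stands in for Redis keys

    def allow(self, key):
        window_index = int(self.clock() // self.window)
        # Drop counters from earlier windows (Redis would do this via key expiry).
        self.counters = {b: c for b, c in self.counters.items()
                         if b[1] == window_index}
        bucket = (key, window_index)
        count = self.counters.get(bucket, 0) + 1
        self.counters[bucket] = count
        return count <= self.limit

# Usage with a fake clock so the behaviour is deterministic.
now = [0.0]
limiter = FixedWindowLimiter(limit=3, window_seconds=1, clock=lambda: now[0])
results = [limiter.allow("player-1") for _ in range(4)]  # fourth call is rejected
now[0] = 1.5                                             # next window starts
allowed_again = limiter.allow("player-1")
```

Keeping the counters in a shared store rather than in each game server's memory is what lets the limit hold no matter which node a player lands on.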

Load Balancing Strategy

  • Layer 7 (Cloudflare): Route each user to the healthiest Tunnel endpoint using the "Least Connections" algorithm.
  • Session Stickiness: Not strictly required if state is externalized to Redis (the stateless game server goal), but recommended for cache locality.
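
The selection rule itself is simple. The sketch below only illustrates the idea of "least connections among healthy endpoints"; it is not Cloudflare's implementation or API, and the `Endpoint` type and the sample numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Endpoint:
    name: str
    healthy: bool
    connections: int

def pick_endpoint(endpoints):
    """Return the healthy endpoint with the fewest open connections."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        raise RuntimeError("no healthy endpoints")
    return min(healthy, key=lambda e: e.connections)

pool = [
    Endpoint("gs1", True, 2400),
    Endpoint("gs2", True, 2100),
    Endpoint("gs3", False, 0),    # unhealthy nodes are skipped even when idle
    Endpoint("gs4", True, 2300),
]
chosen = pick_endpoint(pool)
```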

DEVELOPMENT ENVIRONMENT SPECS

Developer Workstation

Minimum:
  - CPU: 4 cores
  - RAM: 16 GB
  - SSD: 256 GB
  
Recommended:
  - CPU: 8 cores
  - RAM: 32 GB
  - SSD: 512 GB NVMe

Staging Server

CPU: 4 cores
RAM: 16 GB
Storage: 200 GB SSD
Network: 500 Mbps
Cost: ~$30-40/month

Purpose: Test with 500-1000 concurrent users

CLOUDFLARE CONFIGURATION

Required Cloudflare Plan

Pro Plan: $20/month
  • Includes:
    • WAF (Web Application Firewall)
    • 20 Page Rules
    • Image optimization
    • Better SSL options
    • 50 GB video uploads

Cloudflare Tunnel Setup

# Install cloudflared
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o cloudflared
sudo mv cloudflared /usr/local/bin/
sudo chmod +x /usr/local/bin/cloudflared

# Authenticate
cloudflared tunnel login

# Create tunnel
cloudflared tunnel create loh-game

# Configure tunnel (config.yml)
tunnel: <TUNNEL-ID>
credentials-file: /etc/cloudflared/<TUNNEL-ID>.json

ingress:
  - hostname: game.yourdomain.com
    service: http://localhost:3000  # WebSocket upgrades pass through the HTTP service
  - service: http_status:404
  
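# Run in the foreground to test
cloudflared tunnel run loh-game

# Install as a system service once the config works
sudo cloudflared service install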

DNS Configuration

Type: CNAME
Name: game
Target: <TUNNEL-ID>.cfargotunnel.com
Proxy: Enabled (orange cloud)

SECURITY HARDENING

Firewall Rules (iptables/nftables)

# Only allow SSH (from your IP) and local traffic
# Everything else goes through Cloudflare Tunnel

# Allow loopback
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections (covers replies to outbound traffic,
# including cloudflared and the game server's PostgreSQL client connections)
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# Allow SSH from specific IP
iptables -A INPUT -p tcp --dport 22 -s YOUR_IP -j ACCEPT

# On the database server only (if separate): allow PostgreSQL from the game server
iptables -A INPUT -p tcp --dport 5432 -s GAME_SERVER_IP -j ACCEPT

# Set default policies last so an active SSH session is not cut off
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT ACCEPT

# Save rules (requires the iptables-persistent package)
iptables-save > /etc/iptables/rules.v4

Fail2ban Configuration

# Install fail2ban
sudo apt install fail2ban

# Configure for SSH
# /etc/fail2ban/jail.local
[sshd]
enabled = true
maxretry = 3
bantime = 3600

MONITORING STACK

Prometheus + Grafana (Self-Hosted)

Server Requirements:
  - CPU: 2 cores
  - RAM: 4 GB
  - Storage: 100 GB SSD (metrics retention)
  - Cost: ~$15-20/month (small VPS)

OR use Grafana Cloud (free tier):
  - 10k series
  - 50 GB logs
  - 50 GB traces
  - 14-day retention

Metrics to monitor:
  1. System Metrics:
    • CPU/RAM/Disk usage
    • Network throughput
    • Process counts
  2. Game Metrics:
    • Active WebSocket connections
    • Messages per second
    • Player actions/minute
    • Authentication success/failure rate
  3. Business Metrics:
    • Daily Active Users (DAU)
    • New registrations
    • Retention rate
    • Average session duration
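
For the game metrics, one way to expose values to Prometheus is its plain-text exposition format. The sketch below renders gauges by hand using only the standard library; in practice a Prometheus client library would normally do this. The metric names and sample values are hypothetical.

```python
def render_gauges(gauges):
    """Render {name: (help_text, value)} as Prometheus text exposition lines."""
    lines = []
    for name, (help_text, value) in sorted(gauges.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

sample = {
    "game_active_websocket_connections": ("Currently open WebSocket connections", 2431),
    "game_messages_per_second": ("Messages processed per second", 8120.5),
}
text = render_gauges(sample)
print(text, end="")
```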

BACKUP STRATEGY

Game Server Backups

# Daily automated backups
0 2 * * * /usr/local/bin/backup-game-data.sh

# Backup script covers:
#   - Game configuration files
#   - Player data exports
#   - Log archives (last 7 days)

# Storage: Cloudflare R2 or AWS S3
# Retention: 7 daily, 4 weekly, 12 monthly
# Cost: ~$5-10/month
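
The "7 daily, 4 weekly, 12 monthly" retention rule can be sketched as a function that decides which backup dates to keep. This is one possible interpretation, stated as an assumption: keep the 7 newest backups, plus the newest backup in each of the 4 most recent ISO weeks, plus the newest in each of the 12 most recent calendar months. The function name is hypothetical, and real pruning would act on objects in R2 or S3 rather than on a list of dates.

```python
from datetime import date, timedelta

def backups_to_keep(dates, daily=7, weekly=4, monthly=12):
    """Return the set of backup dates retained under the rotation policy."""
    ordered = sorted(set(dates), reverse=True)  # newest first
    keep = set(ordered[:daily])

    def newest_per_bucket(bucket_of, limit):
        seen = []
        for d in ordered:
            bucket = bucket_of(d)
            if bucket not in seen:
                seen.append(bucket)
                if len(seen) > limit:
                    break
                keep.add(d)  # first date seen in a bucket is its newest

    newest_per_bucket(lambda d: tuple(d.isocalendar())[:2], weekly)  # (ISO year, week)
    newest_per_bucket(lambda d: (d.year, d.month), monthly)
    return keep

# Usage: 400 consecutive daily backups ending on an example date.
today = date(2026, 1, 15)
history = [today - timedelta(days=i) for i in range(400)]
kept = backups_to_keep(history)
to_delete = sorted(set(history) - kept)
```

Because the three tiers overlap (the newest backup is also the newest of its week and month), the number of retained backups is lower than 7 + 4 + 12.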

Database Backups

# Automated via managed service OR:
BACKUP_FILE="backup_$(date +%Y%m%d).dump"
pg_dump -Fc loh_game > "$BACKUP_FILE"

# Encrypt before upload (writes "$BACKUP_FILE.gpg")
gpg --encrypt --recipient admin@yourdomain.com "$BACKUP_FILE"

# Upload to S3-compatible storage
s3cmd put "$BACKUP_FILE.gpg" s3://loh-backups/

DISASTER RECOVERY

Recovery Time Objective (RTO)

Target: < 1 hour to restore service

Recovery Point Objective (RPO)

Target: < 15 minutes of data loss

DR Runbook

  1. Database corruption:
    • Promote read replica to primary (5 min)
    • OR restore from last backup (30 min)
  2. Server failure:
    • Spin up new server from image (10 min)
    • Configure Cloudflare Tunnel (5 min)
    • Restore data (15 min)
  3. DDoS attack:
    • Cloudflare auto-mitigates
    • Enable "I'm Under Attack" mode if needed
    • No server-side action required
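
As a quick sanity check, the step estimates in the runbook can be summed and compared against the 1-hour RTO. The sketch assumes the steps in each scenario run sequentially; the dictionary keys are shorthand labels, not part of the runbook.

```python
RTO_MINUTES = 60

# Step durations in minutes, copied from the runbook above.
runbook = {
    "database: promote replica": [5],
    "database: restore from backup": [30],
    "server failure": [10, 5, 15],
}

totals = {name: sum(steps) for name, steps in runbook.items()}
within_rto = {name: total < RTO_MINUTES for name, total in totals.items()}

for name, total in totals.items():
    status = "OK" if within_rto[name] else "exceeds RTO"
    print(f"{name}: {total} min ({status})")
```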

COST BREAKDOWN (Monthly)

Item             Budget   Standard   Premium
Game Server      $65      $120       $160
Database         $25      $60        $120
Cloudflare Pro   $20      $20        $20
Monitoring       $0       $15        $50
Sentry           $26      $26        $99
Backups          $10      $15        $30
Domain           $2       $2         $2
Buffer (10%)     $15      $26        $48
TOTAL            $163     $284       $529

Recommended Start: Standard tier ($284/month)
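
The totals can be recomputed from the line items. The sketch below assumes the buffer is 10% of the subtotal rounded to the nearest dollar, which reproduces the figures in the breakdown.

```python
# Monthly costs in USD per tier: (Budget, Standard, Premium).
items = {
    "Game Server": (65, 120, 160),
    "Database": (25, 60, 120),
    "Cloudflare Pro": (20, 20, 20),
    "Monitoring": (0, 15, 50),
    "Sentry": (26, 26, 99),
    "Backups": (10, 15, 30),
    "Domain": (2, 2, 2),
}

def tier_total(column):
    """Return (subtotal, buffer, total) for the tier at the given column index."""
    subtotal = sum(row[column] for row in items.values())
    buffer = round(subtotal * 0.10)
    return subtotal, buffer, subtotal + buffer

tiers = {name: tier_total(i) for i, name in enumerate(("Budget", "Standard", "Premium"))}
for name, (subtotal, buffer, total) in tiers.items():
    print(f"{name}: ${subtotal} + ${buffer} buffer = ${total}/month")
```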

PERFORMANCE TARGETS

Latency

  • WebSocket connection: < 100ms p95
  • Authentication: < 200ms p95
  • Game action processing: < 50ms p95
  • Database queries: < 10ms p95

Throughput

  • 2,500 concurrent connections per game server node
  • 10,000 messages/second peak
  • 500 authentications/minute
  • 100 database writes/second

Reliability

  • 99.9% uptime (≈ 43 min downtime allowed per 30-day month)
  • < 0.1% error rate
  • Zero data loss in normal operation (see RPO for disaster scenarios)
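
The downtime allowance follows directly from the availability target. A minimal sketch, assuming a 30-day month (the function name is illustrative):

```python
def downtime_budget_minutes(availability, days=30):
    """Minutes of downtime permitted over `days` at the given availability."""
    return (1 - availability) * days * 24 * 60

budget = downtime_budget_minutes(0.999)
print(f"99.9% over 30 days allows {budget:.1f} minutes of downtime")
```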

FUTURE SCALING PLAN

5,000 CCU

  • Add 1 more game server
  • Upgrade database to 8 vCPU, 32 GB RAM
  • Cost: +$150/month

10,000 CCU

  • 4 game servers behind load balancer
  • Database primary + 2 read replicas
  • Cost: +$400/month

25,000 CCU

  • Regional deployments (US, EU, APAC)
  • Microservices architecture
  • Cost: +$1,500/month
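
The game server counts in this plan follow from the ~2,500 connections per node used throughout this document. The sketch below shows the arithmetic only: the `spare` parameter is a hypothetical knob for adding redundancy, and the 25,000 CCU figure ignores the regional split and architecture changes listed above.

```python
import math

def game_servers_needed(ccu, per_node=2500, spare=0):
    """Nodes required for `ccu` concurrent users, plus optional spare nodes."""
    return math.ceil(ccu / per_node) + spare

plan = {ccu: game_servers_needed(ccu) for ccu in (2500, 5000, 10000, 25000)}
for ccu, nodes in plan.items():
    print(f"{ccu:>6} CCU -> {nodes} game server(s)")
```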