PostgreSQL Scalability Analysis for Game Backend

This document outlines the infrastructure requirements to self-host PostgreSQL for a game backend, scaling from 500k to 100M MAU.

1. Assumptions & Workload Profile

To size the database, we must define the "Atomic Unit of Load" per user.

Per-User Data (Storage)

For a typical RPG/MMO/live-service game (which loh appears to be):
  • Player Profile: ~10 KB (Stats, basic info)
  • Inventory/Items: ~20 KB (100-200 items)
  • Quest/Progress: ~50 KB (History, flags)
  • Social/Guild: ~5 KB
  • Cold Data: History, logs (Archive elsewhere, don't keep in hot DB)
  • Total Hot Data per User: ~100 KB (Conservative estimate)

Per-CCU Load (Throughput)

  • Write Heavy: Games are write-heavy (autosaves, loot pickups, XP gain).
  • Transactions per second (TPS) per CCU: ~0.5 - 1 TPS with aggressive autosaving, or ~0.1 TPS with lazy saving.
  • Target: 1 TPS per CCU (worst case, highly interactive).
  • Read/Write Ratio: ranges from 50/50 (highly interactive) to 80/20 (mostly reading static data). We assume 60/40 Read/Write.
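These assumptions can be sanity-checked with a quick back-of-envelope calculation. The constants mirror the numbers above; `size_workload` is just an illustrative helper name:

```python
KB = 1024
HOT_BYTES_PER_USER = 100 * KB   # ~100 KB hot data per user (conservative)
TPS_PER_CCU = 1.0               # worst case: aggressive autosave
READ_FRACTION = 0.6             # assumed 60/40 read/write split

def size_workload(mau: int, ccu: int) -> dict:
    """Raw hot-data size (GB) and TPS split for a given MAU/CCU."""
    total_tps = ccu * TPS_PER_CCU
    return {
        "hot_data_gb": mau * HOT_BYTES_PER_USER / KB**3,
        "read_tps": total_tps * READ_FRACTION,
        "write_tps": total_tps * (1 - READ_FRACTION),
    }
```

For the 10M MAU / 25k CCU tier this yields ~954 GB of hot data (the "~1 TB" row in the matrix below) and 10,000 raw write TPS, before any Redis buffering.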

2. Resource Requirements by Tier

Tier 1: 25k Concurrent Users (CCU)

Base throughput for every MAU tier, assuming load is capped by CCU.
  • Throughput: 25,000 TPS (very high for a single write node).
  • Optimization:
    • Redis Write-Behind: Buffer high-frequency updates (XP, position) in Redis, flush to DB every 10-30s.
    • Real TPS to DB: ~2,500 TPS (with Redis buffering).
  • IOPS: ~10,000 - 20,000 Random IOPS.
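The write-behind buffering above can be sketched as follows. This is a minimal in-memory stand-in: in production the dirty set would live in Redis (e.g. one hash per player) and the flush would issue a single batched UPSERT to Postgres; `WriteBehindBuffer` and its callback are illustrative names.

```python
import time

class WriteBehindBuffer:
    """Coalesces high-frequency per-player updates, flushing in batches."""

    def __init__(self, flush_to_db, interval_s: float = 15.0):
        self._dirty: dict[str, dict] = {}   # player_id -> latest field values
        self._flush_to_db = flush_to_db     # callback standing in for the DB
        self._interval = interval_s         # the 10-30 s window from above
        self._last_flush = time.monotonic()

    def update(self, player_id: str, **fields) -> None:
        # Later writes overwrite earlier ones, so hundreds of autosaves per
        # player collapse into one row write per flush interval.
        self._dirty.setdefault(player_id, {}).update(fields)

    def maybe_flush(self) -> int:
        """Flush the coalesced batch if due; return number of rows written."""
        if not self._dirty or time.monotonic() - self._last_flush < self._interval:
            return 0
        batch, self._dirty = self._dirty, {}
        self._flush_to_db(batch)            # one batched write instead of N
        self._last_flush = time.monotonic()
        return len(batch)
```

This coalescing is what drops ~25,000 raw TPS to ~2,500 TPS reaching the database: each player contributes at most one row write per flush interval, regardless of how often they save.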

3. Scalability Matrix

| MAU | Est. DB Size (Raw) | Est. Disk Needed (2x Overhead) | CCU (Peak) | Write TPS | Total Physical Servers (Min) | Architecture Strategy |
|---|---|---|---|---|---|---|
| 500k | 50 GB | 100 GB | ~5k | 500 | 1 | Monolith |
| 1M | 100 GB | 200 GB | ~10k | 1,000 | 2 (1 Primary + 1 Standby) | Monolith + HA |
| 10M | 1 TB | 2 TB | ~25k | 2,500 | 3 (1 Primary + 2 Replicas) | Read replicas required |
| 25M | 2.5 TB | 5 TB | ~25k* | 2,500 | 4 (1 Primary + 3 Replicas) | Partitioning recommended |
| 50M | 5 TB | 10 TB | ~25k* | 2,500 | 6 (1 Primary + 2 Hot Replicas + 3 Cold Storage) | Manual Sharding / Archiving |
| 100M | 10 TB | 20 TB | ~25k* | 2,500 | 8+ (depends on archive strategy) | Distributed SQL / Heavy Archiving |
> Note: CCU is assumed to stay capped at 25k as requested (rows marked *). If CCU instead scaled linearly with MAU, the Write TPS and server counts for the 50M/100M tiers would force a sharding strategy (dozens of nodes) rather than an archiving one.

The "High MAU, Low CCU" Specific Case (100M MAU, 25k CCU)

If you scale to 100M users but keep only 25k online (typical for long-tail games):
  • The Challenge: Managing 20TB of data is harder than managing the traffic.
  • The Solution: Aggressive Archiving.
    • Only keep "Active Last 30 Days" users in the Hot DB (~1TB for 10M active users).
    • Move cold users to Cold Storage (S3 / Cheap HDD Server).
    • On login, fetch from Cold Storage to Hot DB (Async Load).
  • Result: You can run 100M MAU on the Tier 3 Hardware (3 Nodes) if you implement this "Hot/Cold" application logic. Without it, you need a $50k+ storage array.
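The login path for this Hot/Cold split can be sketched as follows. `hot_db` and `cold_store` are dict stand-ins for the hot Postgres tier and the S3/HDD cold tier, and `load_profile_on_login` is an illustrative name:

```python
# Dict stand-ins for the hot Postgres tier and the S3 / cheap-HDD cold tier.
hot_db: dict[str, bytes] = {}
cold_store: dict[str, bytes] = {"returning_player": b"archived profile blob"}

def load_profile_on_login(player_id: str) -> bytes:
    """Return a player's profile, promoting it from cold storage if needed."""
    if player_id in hot_db:              # active-last-30-days user: fast path
        return hot_db[player_id]
    blob = cold_store.pop(player_id)     # fetch from cold tier (async in prod)
    hot_db[player_id] = blob             # promote so later logins stay hot
    return blob
```

A companion batch job would do the reverse: demote players inactive for 30+ days from the hot DB back to cold storage, keeping the hot tier near the ~1 TB working set.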

4. Hardware Specifications (Refurb/Self-Host)

The "All-Rounder" Box (Supports up to 10M MAU / 25k CCU)

  • Model: Dell PowerEdge R630 / R730 or HP DL360 Gen9
  • CPU: Dual Xeon E5-2667 v4 (High frequency is better than core count for DB)
  • RAM: 256 GB DDR4 ECC (Caches the user working set)
  • Storage: 4x 1.92TB Enterprise NVMe (U.2) in RAID 10 (Critical for IOPS)
  • Network: 10Gbps SFP+

The "Storage Monster" (For 50M+ MAU Data)

  • Storage Array: You will need expansion beyond a single chassis: ZFS-compressed local storage, a separate storage server (TrueNAS) attached via iSCSI/NFS (slower), or local SAS SSD arrays.

5. Software Stack Optimization

  1. Connection Pooling: PgBouncer is mandatory; 25k direct connections will kill Postgres. Transaction pooling restricts active server connections to roughly core count * 2.
  2. OS Tuning: Huge Pages enabled, vm.swappiness=1, standard Linux DB tuning.
  3. Postgres Tuning:
    • shared_buffers: 25% RAM
    • work_mem: Tuned per query
    • checkpoint_completion_target: 0.9 (spread out writes)
    • wal_buffers: 16MB
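The pooling and tuning notes above map onto config fragments like the ones below. Values assume the 16-core / 256 GB "All-Rounder" box from section 4; treat them as starting points, not drop-in settings.

```ini
; pgbouncer.ini (fragment)
[pgbouncer]
pool_mode = transaction      ; required to survive ~25k client connections
max_client_conn = 25000      ; clients the game servers may open
default_pool_size = 32       ; ~ physical core count * 2 actual backends
```

```ini
# postgresql.conf (fragment) -- assumes 256 GB RAM
shared_buffers = 64GB                  # 25% of RAM
work_mem = 16MB                        # baseline; raise per-query as needed
checkpoint_completion_target = 0.9     # spread out checkpoint writes
wal_buffers = 16MB
huge_pages = try                       # pairs with the OS Huge Pages setting
```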

6. Infrastructure Diagram (10M MAU / 25k CCU Target)

```mermaid
graph TD
    Client --> LoadBalancer["Load Balancer / Proxy"]
    LoadBalancer --> API["Game Servers (Rust)"]
    API --> Cache["Redis Cluster (Hot State)"]
    API --> Pool[PgBouncer]

    subgraph DB["Database Layer"]
        Pool --> Primary["Postgres Primary (Writes)"]
        Primary -- WAL Streaming --> Rep1["Replica 1 (Reads / Failover)"]
        Primary -- WAL Streaming --> Rep2["Replica 2 (Reads / Analytics)"]
    end

    subgraph BK["Backups"]
        Rep2 --> WALG["WAL-G / Barman"]
        WALG --> S3["Cold Storage / S3"]
    end
```