Security & Infrastructure TODO - Production Deployment
Target: 2500 Concurrent Users via Cloudflare Tunnel
Last Updated: 2026-01-05
Phase 1: IMMEDIATE SECURITY FIXES (Week 1)
Critical Security Issues
- Replace hardcoded WebSocket URL (4h)
- File: loh-game/src/systems/network/mod.rs:99
- Action: Use environment variable GAME_SERVER_URL
- Add .env support with dotenv crate
- Create .env.example template
- Implement TLS for WebSocket (2h)
- Change ws:// to wss://
- Behind Cloudflare Tunnel, server can use self-signed cert
- Cloudflare handles public TLS termination
- Add password hashing (6h)
- File: loh-game/src/systems/ui/login.rs
- Verify salted PBKDF2-HMAC-SHA256 implementation
- CONSTRAINT: Do NOT use Argon2id hashing (CPU limit)
- NEVER send plaintext passwords over network
- Use challenge-response or JWT-based auth
- Fix panic vulnerabilities (8h)
- Replace .unwrap() in network code (35+ instances)
- Priority files:
- src/systems/network/mod.rs (lines 97, 99)
- src/systems/editor/ui.rs (JSON serialization)
- src/plugin_api/manager.rs
- Use proper error handling: Result<T, E> with ? operator
- Add input validation (4h)
- Implement message size limits: 1MB max
- Add rate limiting per connection
- Sanitize player names/chat inputs
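The input validation items above could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's actual code: the names `check_message_size`, `sanitize_player_name`, and `ValidationError` are hypothetical, and the 24-character name cap and the allowed character set are assumptions that this document does not specify. Only the 1 MB limit comes from the list.

```rust
/// Maximum accepted message size: 1 MB, per the limit listed above.
const MAX_MESSAGE_BYTES: usize = 1024 * 1024;
/// Hypothetical cap on player-name length (an assumption, not from this document).
const MAX_NAME_CHARS: usize = 24;

#[derive(Debug, PartialEq)]
enum ValidationError {
    MessageTooLarge { size: usize },
    EmptyName,
}

/// Rejects oversized payloads before any parsing happens.
fn check_message_size(payload: &[u8]) -> Result<(), ValidationError> {
    if payload.len() > MAX_MESSAGE_BYTES {
        return Err(ValidationError::MessageTooLarge { size: payload.len() });
    }
    Ok(())
}

/// Keeps only alphanumeric characters, '_' and '-', then truncates
/// the result to MAX_NAME_CHARS characters.
fn sanitize_player_name(raw: &str) -> Result<String, ValidationError> {
    let cleaned: String = raw
        .trim()
        .chars()
        .filter(|c| c.is_alphanumeric() || *c == '_' || *c == '-')
        .take(MAX_NAME_CHARS)
        .collect();
    if cleaned.is_empty() {
        Err(ValidationError::EmptyName)
    } else {
        Ok(cleaned)
    }
}
```

Checking the size before deserializing keeps an oversized frame from ever reaching the JSON parser. Chat text would likely need a different policy (for example escaping rather than stripping), which this sketch does not cover.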
Estimated Time: 24 hours (3 days)
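As an illustration of the first, second, and fourth items in this phase (environment-based URL, wss:// only, and Result with ? instead of .unwrap()), a minimal sketch might look like the following. `ConfigError`, `parse_server_url`, and `game_server_url` are hypothetical names, and rejecting every non-wss:// URL is an assumption; a development build may want to allow ws://localhost. Loading a .env file with the dotenv crate would happen before this code runs and is not shown.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
enum ConfigError {
    Missing(&'static str),
    InsecureScheme(String),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Missing(key) => write!(f, "{} is not set", key),
            ConfigError::InsecureScheme(url) => {
                write!(f, "expected a wss:// URL, got {}", url)
            }
        }
    }
}

impl std::error::Error for ConfigError {}

/// Validates a raw GAME_SERVER_URL value. Split out from the environment
/// lookup so it can be tested without touching process state.
fn parse_server_url(raw: Option<String>) -> Result<String, ConfigError> {
    // `?` returns early with the error instead of panicking like `.unwrap()`.
    let url = raw.ok_or(ConfigError::Missing("GAME_SERVER_URL"))?;
    if url.starts_with("wss://") {
        Ok(url)
    } else {
        Err(ConfigError::InsecureScheme(url))
    }
}

/// Reads GAME_SERVER_URL from the environment. A value that is unset or
/// not valid Unicode is reported as missing.
fn game_server_url() -> Result<String, ConfigError> {
    parse_server_url(std::env::var("GAME_SERVER_URL").ok())
}
```

The caller decides what a failure means (retry, show a UI error, exit), which is the main difference from the current .unwrap() calls that abort the process.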
Phase 2: INFRASTRUCTURE SETUP (Week 1-2)
Cloudflare Configuration
- Set up Cloudflare Tunnel (2h)
- Install cloudflared on game server
- Configure tunnel for WebSocket endpoint
- Create tunnel with: cloudflared tunnel create loh-game
- Route: wss://game.yourdomain.com -> localhost:3000/ws
- Configure Cloudflare Security (4h)
- Enable Bot Fight Mode
- Set up WAF rules for WebSocket
- Configure rate limiting (2500 concurrent, ~10k requests/min)
- Enable DDoS protection
- Set up IP Access Rules if needed
Environment & Secrets Management
- Create environment structure (3h)
loh-game/
  .env.development
  .env.staging
  .env.production
  .env.example (committed to git)
- Add to .gitignore: .env, .env.* (except .env.example)
- Use dotenv crate to load environment-specific configs
- Set up secrets (2h)
- GAME_SERVER_URL
- DATABASE_URL
- JWT_SECRET (generate: openssl rand -base64 64)
- SENTRY_DSN
- CLOUDFLARE_TUNNEL_TOKEN
Estimated Time: 11 hours (1.5 days)
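A startup check that all of the secrets named in this phase are present could be sketched like this. It is a standard-library-only illustration under stated assumptions: `missing_secrets` and `check_environment` are hypothetical helpers, and treating a whitespace-only value as missing is a design choice made here, not something this document specifies.

```rust
/// Secret names taken from the secrets list in this phase.
const REQUIRED_SECRETS: [&str; 5] = [
    "GAME_SERVER_URL",
    "DATABASE_URL",
    "JWT_SECRET",
    "SENTRY_DSN",
    "CLOUDFLARE_TUNNEL_TOKEN",
];

/// Returns the names of required secrets that are unset or blank.
/// `lookup` abstracts the environment so the check can be unit-tested.
fn missing_secrets<F>(lookup: F) -> Vec<&'static str>
where
    F: Fn(&str) -> Option<String>,
{
    REQUIRED_SECRETS
        .iter()
        .copied()
        .filter(|&name| lookup(name).map_or(true, |value| value.trim().is_empty()))
        .collect()
}

/// Runs the check against the real process environment.
fn check_environment() -> Result<(), String> {
    let missing = missing_secrets(|name| std::env::var(name).ok());
    if missing.is_empty() {
        Ok(())
    } else {
        Err(format!("missing secrets: {}", missing.join(", ")))
    }
}
```

Calling `check_environment()` once at startup, after the environment-specific .env file has been loaded, reports every missing name in one message instead of failing on the first.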
Phase 3: OBSERVABILITY & MONITORING (Week 2)
Error Tracking
- Integrate Sentry (4h)
- Add to Cargo.toml: sentry = "0.32"
- Initialize in main.rs:
    let _guard = sentry::init((
        env::var("SENTRY_DSN").unwrap(),
        sentry::ClientOptions {
            release: sentry::release_name!(),
            ..Default::default()
        },
    ));
- Cost: $26/month (Team plan, 50k events/month)
Structured Logging
- Implement tracing (6h)
- Replace log with tracing + tracing-subscriber
- Add correlation IDs to network requests
- Log to JSON for structured parsing
- Configure log levels per environment
- Set up log aggregation (4h)
- Options:
- Low-cost: Cloudflare Logpush to R2 ($15/TB/month storage)
- Paid: Datadog ($15/host/month), New Relic ($25/month)
- Recommendation: Start with Cloudflare R2 + custom dashboard
Metrics & Monitoring
- Add metrics instrumentation (8h)
- Use metrics crate
- Track:
- Active WebSocket connections
- Messages/second
- Authentication attempts
- Player actions/second
- Memory usage per connection
- Export to Prometheus
- Create monitoring dashboard (4h)
- Cloudflare Analytics (free with tunnel)
- Grafana Cloud (free tier: 10k series) or self-hosted
- Key metrics:
- Concurrent connections (target: 2500)
- CPU/Memory usage
- Network bandwidth
- Error rates
Estimated Time: 26 hours (~3 days)
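To make the "active WebSocket connections" metric concrete, here is a minimal sketch of a connection gauge built on an atomic counter with an RAII guard. It deliberately uses only the standard library and is a stand-in for illustration, not the metrics crate this phase calls for; `ConnectionGauge` and `ConnectionGuard` are hypothetical names. In a real setup the current value would be reported through the metrics crate and exported to Prometheus.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Shared count of currently open WebSocket connections.
#[derive(Clone, Default)]
struct ConnectionGauge {
    active: Arc<AtomicUsize>,
}

/// RAII guard: the count drops back when the guard goes out of scope,
/// so early returns and errors in the connection handler cannot leak a count.
struct ConnectionGuard {
    active: Arc<AtomicUsize>,
}

impl ConnectionGauge {
    /// Call when a connection is accepted; keep the guard alive for its lifetime.
    fn track(&self) -> ConnectionGuard {
        self.active.fetch_add(1, Ordering::Relaxed);
        ConnectionGuard { active: Arc::clone(&self.active) }
    }

    /// Current number of tracked connections.
    fn current(&self) -> usize {
        self.active.load(Ordering::Relaxed)
    }
}

impl Drop for ConnectionGuard {
    fn drop(&mut self) {
        self.active.fetch_sub(1, Ordering::Relaxed);
    }
}
```

Tying the decrement to `Drop` means a handler that exits through `?` still updates the gauge, which matters for comparing the dashboard value against the 2500-connection target.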
Phase 4: BACKEND HARDENING (Week 3)
Database Security
- Review database access (4h)
- Confirm prepared statements usage
- Add connection pool limits (max: 100 connections)
- Enable SSL for database connections
- Implement backup encryption
- Set up database backups (3h)
- Automated daily backups
- Retain: 7 dailies, 4 weeklies, 3 monthlies
- Test restore procedure
- Store backups in separate region
Rate Limiting & Anti-Abuse
- Implement server-side rate limiting (6h)
- Per-IP: 100 requests/minute
- Per-user: 1000 actions/minute
- Connection limit per IP: 5 concurrent
- Use governor crate or Redis-based limiting
- Add abuse detection (4h)
- Detect spam patterns
- Flag suspicious behavior
- Implement temporary bans
- Log all moderation actions
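The per-IP limit of 100 requests/minute can be illustrated with a small token bucket. This is a hand-rolled, standard-library-only sketch meant to show the mechanism; it is not the governor crate's API, and `TokenBucket` with its methods is a hypothetical type. Time is passed in explicitly so the behavior is deterministic under test.

```rust
use std::time::Instant;

/// Token bucket sized to a per-minute allowance: up to `capacity` requests
/// may burst, and tokens refill continuously at capacity / 60 per second.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    last_refill: Instant,
}

impl TokenBucket {
    /// `limit = 100` gives the per-IP limit listed above. Starts full.
    fn per_minute(limit: u32, now: Instant) -> Self {
        let capacity = f64::from(limit);
        TokenBucket { capacity, tokens: capacity, last_refill: now }
    }

    /// Returns true if a request arriving at `now` is allowed.
    fn try_acquire(&mut self, now: Instant) -> bool {
        let elapsed_secs = now
            .saturating_duration_since(self.last_refill)
            .as_secs_f64();
        // Refill proportionally to elapsed time, never above capacity.
        let refill = elapsed_secs * self.capacity / 60.0;
        self.tokens = (self.tokens + refill).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

A server would typically keep one bucket per client address (for example in a `HashMap<IpAddr, TokenBucket>`) and evict idle entries; the per-user limit of 1000 actions/minute follows the same pattern with a different `limit`.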
Session Management
- Implement JWT tokens (8h)
- Short-lived access tokens (15 min)
- Refresh tokens (7 days)
- Secure token storage on client
- Token revocation mechanism
Estimated Time: 25 hours (3 days)
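The token lifetimes in the session management list translate into expiry timestamps as sketched below. This only models the time arithmetic; signing, verification, and revocation (which the jsonwebtoken crate and a server-side store would handle) are out of scope, and `TokenKind`, `TokenTimes`, `issue`, and `is_expired` are hypothetical names.

```rust
/// Lifetimes from the session management list, in seconds.
const ACCESS_TTL_SECS: u64 = 15 * 60; // 15 minutes
const REFRESH_TTL_SECS: u64 = 7 * 24 * 60 * 60; // 7 days

#[derive(Debug, Clone, Copy, PartialEq)]
enum TokenKind {
    Access,
    Refresh,
}

/// The time-related claims only; a real JWT also carries a subject,
/// a signature, and so on.
#[derive(Debug, Clone, Copy)]
struct TokenTimes {
    kind: TokenKind,
    issued_at: u64,  // Unix seconds, the "iat" claim
    expires_at: u64, // Unix seconds, the "exp" claim
}

/// Computes issue and expiry times for a token created at `now`.
fn issue(kind: TokenKind, now: u64) -> TokenTimes {
    let ttl = match kind {
        TokenKind::Access => ACCESS_TTL_SECS,
        TokenKind::Refresh => REFRESH_TTL_SECS,
    };
    TokenTimes { kind, issued_at: now, expires_at: now + ttl }
}

/// A token is expired once the current time reaches its expiry.
fn is_expired(token: &TokenTimes, now: u64) -> bool {
    now >= token.expires_at
}
```

With these values an access token issued at time t stops being accepted at t + 900 seconds, while the refresh token remains usable until t + 604800 seconds unless it is revoked first.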
Phase 5: CI/CD & AUTOMATION (Week 3-4)
GitHub Actions Setup
- Create security pipeline (6h)
.github/workflows/security.yml:
- cargo audit on every PR
- cargo deny for license compliance
- Dependency vulnerability scanning
- SAST with Semgrep
- Create build pipeline (4h)
.github/workflows/build.yml:
- Build on push to main
- Run all tests
- Generate artifacts
- Upload to release
- Create deployment pipeline (8h)
.github/workflows/deploy.yml:
- Deploy to staging on merge to develop
- Deploy to production on tag v*
- Automated rollback on failure
- Slack/Discord notifications
Testing & Quality
- Add security tests (6h)
- Network layer fuzz testing
- Authentication bypass tests
- SQL injection tests (if applicable)
- XSS/injection in chat system
Estimated Time: 24 hours (3 days)
Phase 6: PLUGIN SYSTEM SECURITY (Week 4)
- Audit Lua sandbox (8h)
- Review mlua security settings
- Disable dangerous functions (io, os, debug)
- Implement resource limits (CPU, memory)
- Test for sandbox escapes
- Audit WASM runtime (8h)
- Review wasmtime configuration
- Implement capability-based security
- Add memory/CPU limits
- Test for privilege escalation
- Plugin signature verification (6h)
- Sign official plugins with GPG
- Verify signatures before loading
- Implement plugin allowlist
Estimated Time: 22 hours (3 days)
Phase 7: COMPLIANCE & DOCUMENTATION (Week 5)
- GDPR compliance (8h)
- Implement data export API
- Add account deletion flow
- Create privacy policy
- Add consent management
- Security documentation (6h)
- Document threat model
- Create incident response playbook
- Write security policies
- Prepare security.txt file
- Runbook creation (8h)
- Deployment procedures
- Rollback procedures
- Disaster recovery plan
- On-call runbook
Estimated Time: 22 hours (3 days)
TOTAL ESTIMATED TIME
- Phase 1 (Critical): 3 days
- Phase 2 (Infra): 1.5 days
- Phase 3 (Observability): 3 days
- Phase 4 (Hardening): 3 days
- Phase 5 (CI/CD): 3 days
- Phase 6 (Plugins): 3 days
- Phase 7 (Compliance): 3 days
Total: ~19 days of focused work (4 engineering-weeks)
DEPENDENCIES TO ADD
# Security
pbkdf2 = "0.12"
sha2 = "0.10"
jsonwebtoken = "9.2"
# Configuration
dotenv = "0.15"
# Observability
sentry = "0.32"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }
metrics = "0.21"
metrics-exporter-prometheus = "0.13"
# Rate Limiting
governor = "0.6"
# Code Quality (CLI tools; install with cargo install rather than adding to Cargo.toml)
cargo-audit = "0.18"
cargo-deny = "0.14"
MONTHLY COST ESTIMATE (2500 CCU)
Infrastructure (Cloudflare Tunnel + Backend)
- Cloudflare Tunnel: $0 (included in Pro plan)
- Cloudflare Pro Plan: $20/month
- Game Server (see specs below): $80-160/month
- Database (managed Postgres): $25-75/month
Observability
- Sentry (Team): $26/month
- Grafana Cloud (free tier): $0
- Log storage (Cloudflare R2): $5-15/month
Other
- Backups (S3-compatible): $10-20/month
- Domain + SSL: $15/year (~$1.25/month)
TOTAL: $167-317/month (starting conservative: ~$200/month)
As you scale to 5000+ CCU, costs will increase linearly with server resources.
ROLLOUT STRATEGY
- Week 1-2: Fix critical security issues, basic infra
- Week 3: Beta testing with 100 users
- Week 4: Staged rollout to 500 users
- Week 5: Scale to 1500 users, monitor
- Week 6+: Full capacity (2500 CCU)
- Ongoing: Continuous monitoring and optimization
SUCCESS METRICS
- Zero critical vulnerabilities in cargo audit
- < 0.1% error rate in production
- < 100ms p95 WebSocket latency
- 99.9% uptime SLA
- All secrets in environment, none in code
- Automated deployments with < 5min downtime
- < 1 hour MTTR (Mean Time To Recovery)
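For the p95 latency target, one way to compute the figure from raw samples is the nearest-rank method, sketched below. The choice of nearest-rank (rather than interpolation, or whatever a metrics backend computes from histograms) is an assumption made for illustration, and `percentile` and `meets_latency_target` are hypothetical helpers.

```rust
/// Nearest-rank percentile: the smallest sample such that at least `pct`
/// percent of all samples are less than or equal to it.
/// Returns None for empty input or a `pct` outside 1..=100.
fn percentile(samples: &[u64], pct: u32) -> Option<u64> {
    if samples.is_empty() || pct == 0 || pct > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // 1-based rank = ceil(pct / 100 * n), computed with integer arithmetic.
    let n = sorted.len();
    let rank = (pct as usize * n + 99) / 100;
    Some(sorted[rank - 1])
}

/// Checks the "< 100ms p95 WebSocket latency" target against samples in ms.
fn meets_latency_target(latencies_ms: &[u64]) -> bool {
    matches!(percentile(latencies_ms, 95), Some(p95) if p95 < 100)
}
```

With few samples the p95 is dominated by the single slowest request, so the check is only meaningful over a reasonably large window of measurements.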