Logging Architecture Design (PLG Stack)

1. Overview

Current requirements call for a lightweight, storage-efficient logging stack that can run on "Refurbished Hardware" under Docker Compose. Instead of the hypothetical Filebeat+ELK stack, we will use Promtail + Loki, with Grafana for querying (the PLG stack).
Why Loki?
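  • No Full-Text Index: Loki indexes only stream labels (e.g. container_name, level), not log content. Timestamps order entries within a stream but are not labels.
  • Tiny Footprint: Promtail is expected to stay under 100MB RAM per node. Loki is light on CPU because it maintains no full-text index at ingest.
  • Native S3 Support: Chunks are written directly to S3-compatible object storage, so no separate tiering component is needed.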

2. Architecture Diagram

graph LR
    subgraph Game_Server_Node
        Container[Game Container] -- stdout --> DockerLog[JSON File\n/var/lib/docker/...]
        Promtail[Promtail Agent] -- reads --> DockerLog
    end
    
    Promtail -- pushes (snappy) --> Loki[Loki Server]
    
    subgraph Storage_Tier
        Loki -- chunks --> S3[Minio / AWS S3]
        Loki -- index --> Index[TSDB Index\nLocal NVMe]
    end
    
    User[Grafana] -- query --> Loki

3. Implementation Details

A. Data Collection (Promtail)

We will run Promtail as a global service (one per physical node) rather than a sidecar per container.
  • Access Method: Bind mount /var/lib/docker/containers (ReadOnly).
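  • Discovery: Promtail's docker_sd_configs (Service Discovery) finds new containers through the Docker API, so the Docker socket must also be reachable from Promtail. Relabel rules then map the discovered metadata to labels such as: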
    • container_name -> game-server-central
    • image -> rust-engine:latest
    • job -> docker
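
The collection setup described above could be sketched as the following Promtail config. This is a minimal sketch under stated assumptions, not a final configuration: the Loki hostname (loki), the ports, and the positions path are placeholders that do not come from this design. The image label is left out because it would have to be derived from a container label or another source.

```yaml
# promtail-config.yml (sketch; hostnames, ports, and paths are assumptions)
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml          # where Promtail records how far it has read

clients:
  - url: http://loki:3100/loki/api/v1/push   # assumes Loki is reachable as "loki"

scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock    # needs the Docker socket mounted into Promtail
        refresh_interval: 5s
    relabel_configs:
      # Discovered names look like "/game-server-central"; strip the leading slash.
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: container_name
      # Attach a constant job label to every discovered container.
      - target_label: job
        replacement: docker
```

Labels should stay few and low-cardinality, since every distinct label combination creates a separate stream in Loki.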

B. Retention Policy (Configured in Loki)

To handle the ~1PB/month potential volume, we aggressively tier data:
  1. Retention Period:
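    • Compactor: 168h (7 Days) Global Retention. With the tsdb index, retention is enforced by the Compactor (limits_config.retention_period) rather than the deprecated Table Manager.
  2. Storage Config:
    • Schema: v13 with the tsdb index (Loki 3.x expects tsdb with v13 unless structured metadata is disabled).
    • Object Store: S3 (for the massive chunks).
    • Compaction: Active (the same Compactor applies the retention above).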
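
A Loki configuration along these lines might look like the sketch below. It assumes a single-node Loki 3.x with the tsdb index on schema v13 and Compactor-based retention. The MinIO endpoint, bucket name, credential variables, start date, and directory paths are all placeholders rather than values from this design, and key names differ in older Loki releases.

```yaml
# loki-config.yml (sketch; endpoint, bucket, credentials, and dates are placeholders)
auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /loki
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory                 # single-node setup, no external KV store

schema_config:
  configs:
    - from: 2024-01-01                # placeholder start date
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: index_
        period: 24h

storage_config:
  tsdb_shipper:
    active_index_directory: /loki/tsdb-index   # place on the local NVMe
    cache_location: /loki/tsdb-cache
  aws:
    endpoint: http://minio:9000       # assumption: MinIO service named "minio"
    bucketnames: loki-chunks          # hypothetical bucket name
    access_key_id: ${S3_ACCESS_KEY}   # needs -config.expand-env=true
    secret_access_key: ${S3_SECRET_KEY}
    s3forcepathstyle: true
    insecure: true

compactor:
  working_directory: /loki/compactor
  retention_enabled: true
  delete_request_store: s3            # required in Loki 3.x when retention is enabled

limits_config:
  retention_period: 168h              # 7 days, applied globally
```

Retention only takes effect when the Compactor runs with retention_enabled set to true. Without it, retention_period is ignored and chunks accumulate in the bucket.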

C. Log Processing Pipeline

Promtail will parse the JSON stdout from the Rust backend to extract fields for structured querying.
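pipeline_stages:
  # Parse each log line as JSON and extract the fields of interest.
  - json:
      expressions:
        level: level
        msg: message      # extracted, but not promoted to a label
        zone: zone_id
  # Promote only the low-cardinality fields to indexed labels.
  - labels:
      level:
      zone: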
Result: You can run label-only queries such as {zone="hastinapur_central", level="error"}, which narrow the search to the matching streams instead of scanning every log line.

4. Configuration Changes Required

1. loh-devops/infrastructure/game/logging.yml (NEW)

A new, dedicated compose file for the logging stack, which keeps docker-compose.prod.yml clean.
  • Services: loki, promtail
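
As a rough illustration, the compose file could look like the sketch below. The image tags, the published port, the volume name, and the relative config paths are assumptions. It also assumes the file is deployed once per physical node, so that each node runs exactly one Promtail.

```yaml
# logging.yml (sketch; image tags and host paths are assumptions)
services:
  loki:
    image: grafana/loki:3.0.0
    command: -config.file=/etc/loki/loki-config.yml -config.expand-env=true
    volumes:
      - ./config/loki-config.yml:/etc/loki/loki-config.yml:ro
      - loki-data:/loki                 # index, cache, and compactor working files
    ports:
      - "3100:3100"
    restart: unless-stopped

  promtail:
    image: grafana/promtail:3.0.0
    command: -config.file=/etc/promtail/promtail-config.yml
    volumes:
      - ./config/promtail-config.yml:/etc/promtail/promtail-config.yml:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro   # used by docker_sd_configs
    depends_on:
      - loki
    restart: unless-stopped

volumes:
  loki-data:
```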

2. promtail-config.yml

Must be created in infrastructure/game/config/.

3. Update docker-compose.prod.yml

Remove any "logging" env vars that might conflict (currently none) and ensure the json-file driver is the default (it is in a standard Docker install).
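
If the driver ever needs to be pinned explicitly, a per-service fragment like the one below would do it. The service name and the rotation limits are illustrative assumptions. The rotation options cap local disk usage on each node once logs have been shipped to Loki.

```yaml
# docker-compose.prod.yml fragment (service name and limits are hypothetical)
services:
  game-server-central:
    logging:
      driver: json-file
      options:
        max-size: "50m"   # rotate each log file at 50 MB
        max-file: "3"     # keep at most three rotated files per container
```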

5. Migration Plan

  1. Create logging.yml (the compose file described in section 4).
  2. Create loki-config.yml (Minio/S3 setup).
  3. Create promtail-config.yml (Docker Service Discovery).
  4. Deploy and verify logs appear in Grafana.
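
For the verification step, Grafana needs Loki registered as a data source. A minimal provisioning file might look like this, assuming Grafana can reach Loki at http://loki:3100. The hostname and the file's location under Grafana's provisioning/datasources directory are assumptions.

```yaml
# grafana/provisioning/datasources/loki.yml (sketch; URL is an assumption)
apiVersion: 1

datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
```

With the data source in place, a query such as {job="docker"} in Grafana's Explore view should return lines from every discovered container.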