AI QA Testing Agent Principles

Philosophy

Setup may cheat; gameplay may not.
The test framework may cheat to create test conditions (spawning items, setting levels), but during the actual test it must play fairly (eating food, walking to places, fighting normally).

Allowed vs Forbidden Actions

✅ SETUP Commands (Test Preparation Only)

These prepare the test environment quickly, but do not bypass mechanics:
-- Item spawning (items still need to be used normally)
test_api.spawn_item("shrimp", 10)
test_api.spawn_item("bronze_sword", 1)
test_api.spawn_item("teleport_scroll", 5)

-- Skill levels (for testing level-gated content)
test_api.set_skill_level("attack", 50)

-- Entity spawning (for creating test targets)
automation.spawn_entity("rat", 10, 0, 10)

✅ GAMEPLAY Commands (Real Player Actions)

These test the actual game mechanics:
-- Movement (tests pathfinding, stuck detection)
game_api.click_tile(x, y)
game_api.walk_to(x, y)  -- Only if it triggers real pathfinding

-- Interactions (tests combat, skilling, NPC dialogs)
game_api.click_entity(npc_id)
automation.interact(entity_id, "Attack")
automation.interact(entity_id, "mine")

-- Inventory (tests item usage, eating, equipping)
game_api.click_inventory_slot(1)  -- Eat food, equip gear

-- In-game teleports (tests the teleport system itself)
game_api.use_item("teleport_scroll")

❌ FORBIDDEN Commands (Bypass Mechanics)

These would hide bugs and must NEVER be used:
-- Direct coordinate teleport (bypasses pathfinding)
automation.teleport(x, y, z)    -- ❌ FORBIDDEN
game_api.goto(x, y, z)          -- ❌ FORBIDDEN

-- Direct state modification (bypasses consumption)
game_api.set_hp(100)            -- ❌ Use: eat food instead
game_api.set_prayer(100)        -- ❌ Use: drink potion instead
game_api.set_run_energy(100)    -- ❌ Use: rest or stamina potion

-- Direct kill/damage (bypasses combat)
game_api.kill_npc(id)           -- ❌ FORBIDDEN
game_api.damage_npc(id, 50)     -- ❌ FORBIDDEN

-- Skip animations/cooldowns
game_api.skip_animation()       -- ❌ FORBIDDEN
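
One way to enforce the forbidden list mechanically is to give tests a guarded view of each API table that raises an error as soon as a forbidden function is looked up. The sketch below is only an assumption about how such a guard could look; `guard`, `FORBIDDEN`, and the stub `raw_game_api` are hypothetical names and not part of the existing framework.

```lua
-- Names that tests must never call (taken from the forbidden list above).
local FORBIDDEN = {
    teleport = true, ["goto"] = true,
    set_hp = true, set_prayer = true, set_run_energy = true,
    kill_npc = true, damage_npc = true,
    skip_animation = true,
}

-- Hypothetical helper: returns a proxy that forwards every lookup to `api`,
-- except forbidden names, which raise an error immediately.
local function guard(api, api_name)
    return setmetatable({}, {
        __index = function(_, key)
            if FORBIDDEN[key] then
                error(api_name .. "." .. tostring(key) .. " is forbidden in tests", 2)
            end
            return api[key]
        end,
    })
end

-- Stub standing in for the real game API (illustration only).
local raw_game_api = {
    set_hp = function(hp) return hp end,
    click_tile = function(x, y) return "clicked " .. x .. "," .. y end,
}

local game_api = guard(raw_game_api, "game_api")

-- Allowed call: forwarded to the stub.
local click_result = game_api.click_tile(3, 4)
-- Forbidden call: the lookup itself fails, so pcall reports the error.
local ok, err = pcall(function() return game_api.set_hp(100) end)
print(click_result)
print(ok, err)
```

Failing at lookup time means a forbidden call is reported even if the test never reaches the point of invoking it with arguments.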

State Restoration Rules

| State | How to Restore |
|---|---|
| HP | Spawn food → eat it |
| Prayer | Spawn prayer potion → drink it |
| Run Energy | Rest or spawn stamina potion → drink it |
| Inventory | Spawn items (allowed for setup) |
| Position | Use in-game teleport items OR walk there |
| Skill Levels | Can set directly for setup (tests level-gated content) |
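
The HP row can be sketched as a small helper that spawns food and eats it until a target HP is reached. Everything game-side here is a stub for illustration: the `player` table, the `HEAL_PER_SHRIMP` value, and the assumption that food always sits in inventory slot 1 are hypothetical, and `restore_hp` is not an existing framework function.

```lua
-- Stubs standing in for the real game (illustration only).
local player = { hp = 20, max_hp = 50, inventory = {} }
local HEAL_PER_SHRIMP = 3  -- hypothetical heal amount

local test_api = {
    spawn_item = function(name, count)
        for _ = 1, count do table.insert(player.inventory, name) end
    end,
}

local game_api = {
    -- Clicking a slot consumes the item; shrimp heals the player.
    click_inventory_slot = function(slot)
        local item = table.remove(player.inventory, slot)
        if item == "shrimp" then
            player.hp = math.min(player.max_hp, player.hp + HEAL_PER_SHRIMP)
        end
        return item
    end,
}

-- Restore HP the fair way: spawn food (setup) and eat it (gameplay).
-- `max_bites` bounds the loop so a broken eating mechanic cannot hang the test.
local function restore_hp(target_hp, max_bites)
    local bites = 0
    while player.hp < target_hp and bites < max_bites do
        if #player.inventory == 0 then
            test_api.spawn_item("shrimp", 5)
        end
        game_api.click_inventory_slot(1)
        bites = bites + 1
    end
    return player.hp >= target_hp, bites
end

local restored, bites = restore_hp(40, 20)
print(restored, bites, player.hp)
```

Because HP only changes through `click_inventory_slot`, a bug in eating shows up as `restored == false` instead of being hidden by a direct `set_hp` call.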

Movement Rules

| Method | Allowed? | Why |
|---|---|---|
| click_tile(x, y) | ✅ Yes | Tests click → pathfinding pipeline |
| walk_to(x, y) | ✅ Yes | Tests pathfinding (if it uses the real system) |
| automation.teleport(x, y, z) | ❌ No | Bypasses all pathing, hides stuck bugs |
| Using teleport scroll/item | ✅ Yes | Tests the teleport mechanic itself |

Current Test Files Audit

Violations Found

| File | Violation | Line | Fix |
|---|---|---|---|
| e2e_skills_test.lua | automation.set_skill_level() | 59 | ⚠️ Acceptable for SETUP only |
| e2e_combat_test.lua | automation.spawn_entity() | 88 | ✅ OK (setup) |
| comprehensive_test.lua | automation.teleport() | 86 | ❌ Should use walk_to |
| automated_tests/main.lua | automation.teleport() | 134, 177, 209 | ❌ Should use walk_to |
| automated_tests/main.lua | automation.modify_inventory() | 70-82 | ⚠️ OK for setup, but healing should EAT |
| automated_tests/main.lua | automation.set_skill_level() | 254 | ⚠️ Acceptable for SETUP only |

Recommended Fixes

  1. Replace automation.teleport() with:
    • automation.walk_to() for nearby destinations
    • automation.spawn_item("teleport_scroll") + game_api.click_inventory_slot() for far destinations
  2. For HP restoration tests, don't skip - spawn food and eat it:
    -- Wrong
    player.set_hp(100)
    
    -- Correct
    test_api.spawn_item("shrimp", 5)
    game_api.click_inventory_slot(1)  -- Eat shrimp
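
The first fix can be sketched as a single travel helper that walks to nearby destinations and uses a teleport scroll before walking to far ones. This is a sketch under stated assumptions: `travel_to` and `FAR_THRESHOLD` are hypothetical, the stub APIs only record calls, the scroll is used through `game_api.use_item` as in the gameplay command list, and the scroll is assumed to land close enough that the rest can be walked.

```lua
-- Stub APIs that record what was called (illustration only).
local calls = {}
local test_api = {
    spawn_item = function(name, count) calls[#calls + 1] = "spawn:" .. name end,
}
local game_api = {
    walk_to = function(x, y) calls[#calls + 1] = "walk_to" end,
    use_item = function(name) calls[#calls + 1] = "use:" .. name end,
}

local FAR_THRESHOLD = 64  -- hypothetical distance in tiles

-- Travel without automation.teleport(): walk when near,
-- use a real teleport scroll first when far, then walk the remainder.
local function travel_to(from, to)
    -- Chebyshev distance: the larger of the two axis distances.
    local distance = math.max(math.abs(to.x - from.x), math.abs(to.y - from.y))
    local far = distance > FAR_THRESHOLD
    if far then
        test_api.spawn_item("teleport_scroll", 1)  -- setup: allowed
        game_api.use_item("teleport_scroll")       -- gameplay: tests the teleport system
    end
    game_api.walk_to(to.x, to.y)                   -- gameplay: tests pathfinding
    return far and "teleport_scroll" or "walk"
end

local near_method = travel_to({ x = 0, y = 0 }, { x = 10, y = 5 })
local far_method = travel_to({ x = 0, y = 0 }, { x = 500, y = 20 })
print(near_method, far_method, #calls)
```

Either branch ends in `walk_to`, so pathfinding is exercised on every trip.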

Test Agent Architecture

┌─────────────────────────────────────────┐
│         Test Agent = Virtual Player     │
├─────────────────────────────────────────┤
│  CAN READ:           CAN DO (SETUP):    │
│  - HP, XP, levels    - Spawn items      │
│  - Inventory         - Set skill levels │
│  - Position          - Spawn entities   │
│  - NPC positions                        │
│  - UI state          CAN DO (GAMEPLAY): │
│                      - Click tiles      │
│  CANNOT DO:          - Click entities   │
│  - Teleport directly - Click UI         │
│  - Set HP/Prayer     - Use items        │
│  - Skip animations   - Type in chat     │
│  - Bypass cooldowns                     │
└─────────────────────────────────────────┘

Failure Handling & Retry Strategy

Player Death During Test

  • Auto-Respawn: The game respawns the player at the configured respawn point.
  • Survival Mode: The agent should do everything it can to survive:
    • Spawn and eat food when HP is low
    • Activate protection prayers
    • Retreat if overwhelmed
  • Max Retries: Configurable (default: 3). If exceeded → mark the test FAILED.
-- Test config
test.config = {
    max_retries = 3,
    survival_hp_threshold = 30,
    auto_spawn_food = true,
}
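
A minimal sketch of how these settings might drive the agent's decisions follows. It assumes `survival_hp_threshold` is an absolute HP value and that the agent tracks a `food_count`; `survival_action` and `after_death` are hypothetical helpers, not existing framework functions.

```lua
local config = {
    max_retries = 3,
    survival_hp_threshold = 30,
    auto_spawn_food = true,
}

-- Decide what the agent should do for its current state.
-- `state` is a hypothetical table with `hp` and `food_count`.
local function survival_action(state, cfg)
    if state.hp > cfg.survival_hp_threshold then
        return "continue"
    end
    if state.food_count > 0 then
        return "eat"
    end
    if cfg.auto_spawn_food then
        return "spawn_food_then_eat"
    end
    return "retreat"
end

-- After a death: retry while retries remain, otherwise fail the test.
local function after_death(retries_used, cfg)
    if retries_used < cfg.max_retries then
        return "RETRY"
    end
    return "FAILED"
end

print(survival_action({ hp = 25, food_count = 0 }, config))
print(after_death(3, config))
```

Both helpers return a decision and leave the acting to the caller, which keeps them easy to test without a running game.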

Timeout Handling

  • Retry Action: On timeout, retry the action (click again).
  • Max Retries: Configurable per action.
  • Dependency Check: If subsequent tests depend on failed step → fail entire suite.
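-- Called when the current action times out
function on_timeout()
    if current_retry < test.config.max_retries then
        -- Retries remain: repeat the same action (e.g. click again)
        current_retry = current_retry + 1
        retry_current_action()
    else
        -- Retries exhausted
        if is_blocking_dependency() then
            -- Later tests depend on this step, so the whole suite fails
            fail_entire_suite("Blocking dependency failed: " .. current_test.name)
        else
            mark_test_failed()
            skip_to_next_test()
        end
    end
end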
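
Since retry limits are meant to be configurable per action, a self-contained variant could look up the limit by action name. This is a sketch under assumptions: `with_retries`, `ACTION_MAX_RETRIES`, and the action names are hypothetical, and an action is modeled as a function that returns true on success.

```lua
-- Hypothetical per-action retry limits; unlisted actions use the default.
local DEFAULT_MAX_RETRIES = 3
local ACTION_MAX_RETRIES = { click_tile = 2, open_bank = 5 }

-- Run `action` once, then retry it up to the configured number of times.
-- Returns success and the number of attempts made.
local function with_retries(action_name, action)
    local max_retries = ACTION_MAX_RETRIES[action_name] or DEFAULT_MAX_RETRIES
    local attempts = 0
    while true do
        attempts = attempts + 1
        if action(attempts) then
            return true, attempts
        end
        -- attempts - 1 retries have been used so far
        if attempts > max_retries then
            return false, attempts
        end
    end
end

-- Succeeds on the third attempt.
local ok1, n1 = with_retries("open_bank", function(attempt) return attempt == 3 end)
-- Never succeeds: one initial attempt plus two retries.
local ok2, n2 = with_retries("click_tile", function() return false end)
print(ok1, n1)
print(ok2, n2)
```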

Resource Exhaustion

| Scenario | Action |
|---|---|
| Out of food (normal test) | Spawn more food, continue |
| Out of food (boss test) | Log failure (intentional challenge) |
| Out of potions | Spawn more, continue |
| Out of ammo | Spawn more, continue |
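
The table above maps directly onto a small decision function. In this sketch the function name, the returned table shape, the refill counts, and the item names `prayer_potion` and `bronze_arrow` are all hypothetical; only the branching follows the table.

```lua
-- Hypothetical refill amounts per resource type.
local REFILL = {
    food = { item = "shrimp", count = 5 },
    potions = { item = "prayer_potion", count = 2 },   -- item name assumed
    ammo = { item = "bronze_arrow", count = 50 },      -- item name assumed
}

-- Decide how to react when a resource runs out.
-- `test_type` is "normal" or "boss".
local function on_resource_exhausted(resource, test_type)
    -- Boss tests are an intentional challenge: running out of food is a failure.
    if resource == "food" and test_type == "boss" then
        return { action = "log_failure", reason = "out of food during boss test" }
    end
    local refill = REFILL[resource]
    if not refill then
        return { action = "log_failure", reason = "unknown resource: " .. tostring(resource) }
    end
    return { action = "spawn", item = refill.item, count = refill.count }
end

local normal = on_resource_exhausted("food", "normal")
local boss = on_resource_exhausted("food", "boss")
print(normal.action, normal.item, boss.action)
```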

Retry Scope (Configurable Hooks)

-- Per-test retry configuration
register_test("Combat Test", {
    retry_scope = "full",     -- Options: "full", "step", "checkpoint"
    max_retries = 3,
    on_retry = function()
        reset_to_start_position()
        clear_inventory()
        spawn_required_items()
    end
})

| Retry Scope | Behavior |
|---|---|
| full | Restart entire test from step 1 |
| step | Retry only the failed step |
| checkpoint | Restart from last checkpoint |
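
The three scopes differ only in which step the test resumes from. The sketch below computes that step; `restart_step` is a hypothetical helper, and it assumes steps are numbered from 1 and checkpoints are given as a list of step numbers.

```lua
-- Return the step index to resume from after `failed_step` fails.
-- `checkpoints` is an optional list of step numbers marked as checkpoints.
local function restart_step(scope, failed_step, checkpoints)
    if scope == "full" then
        return 1
    elseif scope == "step" then
        return failed_step
    elseif scope == "checkpoint" then
        -- Latest checkpoint at or before the failed step; step 1 if there is none.
        local best = 1
        for _, cp in ipairs(checkpoints or {}) do
            if cp <= failed_step and cp > best then
                best = cp
            end
        end
        return best
    end
    error("unknown retry_scope: " .. tostring(scope))
end

print(restart_step("full", 5))
print(restart_step("step", 5))
print(restart_step("checkpoint", 5, { 2, 4, 7 }))
```

Falling back to step 1 when no checkpoint precedes the failure makes "checkpoint" behave like "full" for tests that define no checkpoints.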

State Cleanup Between Retries

  • ✅ Reset position to test starting point
  • ✅ Clear inventory, re-spawn required items
  • ✅ Wait for NPC respawns if needed
  • ✅ Reset any modified game state

Flaky Test Detection

  • Definition: A test that sometimes passes and sometimes fails without any change to the code.
  • Detection: Within the same run, the test's result differs from its previous result (fails after passing, or passes after failing).
  • Handling:
    • Mark as FLAKY in report
    • Set to higher priority for investigation
    • Flaky tests often indicate race conditions
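-- Flaky detection: flag a result that differs from the previous one in the same run
-- ("FAILED" as the stored failure value is an assumption, matching the report status)
local failed_after_pass = test.failed and test.previous_result == "PASS"
local passed_after_fail = (not test.failed) and test.previous_result == "FAILED"
if failed_after_pass or passed_after_fail then
    test.status = "FLAKY"
    test.priority = "HIGH"
    report_bug("FLAKY", "Test is non-deterministic", {
        test_name = test.name,
        fail_rate = test.fail_count / test.run_count
    })
end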
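
An alternative to comparing only against the previous result is to classify a test from its full result history within a run. This is a sketch under assumptions: `classify` is hypothetical, results are stored as strings, anything other than "PASS" counts as a failure, and "NOT_RUN" is an invented status for an empty history.

```lua
-- Classify a test from its list of results in one run.
-- Returns a status string and the fraction of failed attempts.
local function classify(results)
    if #results == 0 then
        return "NOT_RUN", 0
    end
    local fails = 0
    for _, result in ipairs(results) do
        if result ~= "PASS" then
            fails = fails + 1
        end
    end
    local fail_rate = fails / #results
    if fails == 0 then
        return "PASS", fail_rate
    end
    if fails == #results then
        return "FAILED", fail_rate
    end
    -- Mixed results: the test is non-deterministic.
    return "FLAKY", fail_rate
end

local status, fail_rate = classify({ "PASS", "FAILED", "PASS", "PASS" })
print(status, fail_rate)
```

Using the whole history catches a pass, fail, pass sequence the same way regardless of which attempt happens to be last.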

Failure Logging

What to Capture on Failure

| Data | Captured | Configurable / Notes |
|---|---|---|
| Screenshot | ✅ Yes | - |
| Video recording | ✅ Yes | Duration (default: last 30s) |
| Game log | ✅ Yes | Period (default: last 60s) |
| Full state dump | ✅ Yes | HP, inventory, position, buffs |
| Network traffic | Optional | For multiplayer debugging |

-- Failure logging config
test.logging = {
    video_duration = 30,      -- seconds before failure
    log_period = 60,          -- seconds of log history
    screenshot = true,
    state_dump = true,
}
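
Capturing "the last 60 seconds" of game log implies keeping a rolling buffer while the test runs. The sketch below shows one possible shape for it; `LogBuffer` is a hypothetical class, and timestamps are assumed to be plain numbers in seconds supplied by the caller.

```lua
-- Rolling log buffer that retains only the most recent `period` seconds.
local LogBuffer = {}
LogBuffer.__index = LogBuffer

function LogBuffer.new(period)
    return setmetatable({ period = period, entries = {} }, LogBuffer)
end

-- Append a message and drop entries older than the retention period.
function LogBuffer:add(time, message)
    table.insert(self.entries, { time = time, message = message })
    while #self.entries > 0 and self.entries[1].time < time - self.period do
        table.remove(self.entries, 1)
    end
end

-- Return the messages inside the window that ends at `failure_time`.
function LogBuffer:snapshot(failure_time)
    local messages = {}
    for _, entry in ipairs(self.entries) do
        if entry.time >= failure_time - self.period and entry.time <= failure_time then
            messages[#messages + 1] = entry.message
        end
    end
    return messages
end

local buffer = LogBuffer.new(60)   -- matches log_period = 60
buffer:add(0, "test started")
buffer:add(30, "attacked goblin")
buffer:add(90, "player died")
local captured = buffer:snapshot(90)
print(table.concat(captured, "; "))
```

Pruning on every `add` keeps memory bounded during long tests, while `snapshot` filters again so a failure reported some time after the last log line still gets the correct window.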

Failure Report Format

{
  "test_name": "Combat Test",
  "status": "FAILED",
  "failure_reason": "Player died unexpectedly",
  "retry_count": 3,
  "is_flaky": false,
  "attachments": {
    "screenshot": "failure_1234.png",
    "video": "failure_1234.webm",
    "log": "failure_1234.log",
    "state_dump": {
      "hp": 0,
      "position": [3200, 0, 3150],
      "inventory": ["bronze_sword", "shrimp", "shrimp"],
      "target": "goblin_123"
    }
  }
}

Ticket Filing Policy

| Scenario | File Ticket? |
|---|---|
| Every failure | ✅ Yes |
| Flaky test detection | ✅ Yes (HIGH priority) |
| Timeout | ✅ Yes |
| Player death (unexpected) | ✅ Yes |
| Resource exhaustion | ⚠️ Depends on test type |
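
The policy can be sketched as one function that returns whether to file a ticket and at what priority. Several details are assumptions: the `event` table shape, the "NORMAL" priority label, and the reading of "depends on test type" as "file only for boss tests", where running out of food is logged as a failure.

```lua
-- Decide whether an event warrants a ticket.
-- `event` is a hypothetical table: { kind = ..., test_type = ... }.
-- Returns (should_file, priority).
local function ticket_decision(event)
    if event.kind == "flaky" then
        return true, "HIGH"
    end
    if event.kind == "resource_exhaustion" then
        -- Assumption: only boss tests treat exhaustion as a failure.
        if event.test_type == "boss" then
            return true, "NORMAL"
        end
        return false, nil
    end
    -- Failures, timeouts, and unexpected deaths always get a ticket.
    return true, "NORMAL"
end

print(ticket_decision({ kind = "flaky" }))
print(ticket_decision({ kind = "resource_exhaustion", test_type = "normal" }))
```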

LLM Integration (Gemini Flash)

For ambiguous bugs, send to LLM for triage:
  • Only when assertion-based detection fails
  • Optimize prompts to stay within free tier limits
  • Include: screenshot description, game state, specific question
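
A compact triage prompt might be assembled as sketched below. It only builds the prompt text and makes no call to any LLM service. `build_triage_prompt` and the `MAX_DESCRIPTION_CHARS` budget are hypothetical, and the game state is assumed to be a flat table with string keys.

```lua
local MAX_DESCRIPTION_CHARS = 300  -- hypothetical budget to keep prompts small

-- Build a short prompt from the three required parts:
-- screenshot description, game state, and one specific question.
local function build_triage_prompt(screenshot_description, state, question)
    local description = screenshot_description
    if #description > MAX_DESCRIPTION_CHARS then
        description = description:sub(1, MAX_DESCRIPTION_CHARS) .. "..."
    end

    -- Sort keys so the same state always produces the same prompt.
    local keys = {}
    for key in pairs(state) do
        keys[#keys + 1] = key
    end
    table.sort(keys)

    local fields = {}
    for _, key in ipairs(keys) do
        fields[#fields + 1] = key .. "=" .. tostring(state[key])
    end

    return table.concat({
        "Screenshot: " .. description,
        "State: " .. table.concat(fields, ", "),
        "Question: " .. question,
    }, "\n")
end

local prompt = build_triage_prompt(
    "Player stands next to a goblin",
    { hp = 0, target = "goblin_123" },
    "Why did the player die?"
)
print(prompt)
```

Only the free-form description is truncated, so the state and the question always reach the model intact.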

Execution Mode

  • Local only (no cloud for now)
  • Manual trigger (not CI/nightly yet)
  • Full coverage (all skills, combat, UI, economy)