QA Testing Principles

Guide to QA testing principles

AI QA Testing Agent Principles

Philosophy

Setup may cheat. Gameplay may not.

The test framework can cheat to create test conditions (spawn items, set levels), but must play fairly during the actual test (eat food, walk places, fight normally).

Allowed vs Forbidden Actions

✅ SETUP Commands (Test Preparation Only)

These prepare the test environment quickly, but do not bypass mechanics:

-- Item spawning (items still need to be used normally)
test_api.spawn_item("shrimp", 10)
test_api.spawn_item("bronze_sword", 1)
test_api.spawn_item("teleport_scroll", 5)

-- Skill levels (for testing level-gated content)
test_api.set_skill_level("attack", 50)

-- Entity spawning (for creating test targets)
automation.spawn_entity("rat", 10, 0, 10)

✅ GAMEPLAY Commands (Real Player Actions)

These test the actual game mechanics:

-- Movement (tests pathfinding, stuck detection)
game_api.click_tile(x, y)
game_api.walk_to(x, y)  -- Only if it triggers real pathfinding

-- Interactions (tests combat, skilling, NPC dialogs)
game_api.click_entity(npc_id)
automation.interact(entity_id, "Attack")
automation.interact(entity_id, "mine")

-- Inventory (tests item usage, eating, equipping)
game_api.click_inventory_slot(1)  -- Eat food, equip gear

-- In-game teleports (tests the teleport system itself)
game_api.use_item("teleport_scroll")

❌ FORBIDDEN Commands (Bypass Mechanics)

These would hide bugs and must NEVER be used:

-- Direct coordinate teleport (bypasses pathfinding)
automation.teleport(x, y, z)    -- ❌ FORBIDDEN
game_api.goto(x, y, z)          -- ❌ FORBIDDEN

-- Direct state modification (bypasses consumption)
game_api.set_hp(100)            -- ❌ Use: eat food instead
game_api.set_prayer(100)        -- ❌ Use: drink potion instead
game_api.set_run_energy(100)    -- ❌ Use: rest or stamina potion

-- Direct kill/damage (bypasses combat)
game_api.kill_npc(id)           -- ❌ FORBIDDEN
game_api.damage_npc(id, 50)     -- ❌ FORBIDDEN

-- Skip animations/cooldowns
game_api.skip_animation()       -- ❌ FORBIDDEN
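
One way to make the forbidden list enforceable rather than advisory is to overwrite those functions with stubs that raise an error when the test suite loads. The sketch below assumes the APIs are plain Lua tables; `block_forbidden` and the stand-in `apis` table are hypothetical names used only for illustration.

```lua
-- Sketch: replace forbidden functions with stubs that raise an error.
-- block_forbidden is a hypothetical helper; the function names come from the list above.
local FORBIDDEN = {
    game_api = { "goto", "set_hp", "set_prayer", "set_run_energy",
                 "kill_npc", "damage_npc", "skip_animation" },
    automation = { "teleport" },
}

local function block_forbidden(apis)
    for api_name, fn_names in pairs(FORBIDDEN) do
        local api = apis[api_name]
        if api then
            for _, fn_name in ipairs(fn_names) do
                api[fn_name] = function()
                    error("forbidden test command: " .. api_name .. "." .. fn_name, 2)
                end
            end
        end
    end
end

-- Stand-in API tables so the sketch runs on its own.
local apis = {
    game_api = {
        set_hp = function() end,
        click_tile = function() return "walking" end,
    },
    automation = { teleport = function() end },
}
block_forbidden(apis)

local ok, err = pcall(apis.game_api.set_hp, 100)
print(ok, err)                           -- the forbidden call is rejected
print(apis.game_api.click_tile(10, 10))  -- gameplay calls are untouched
```

Failing loudly at call time means a test that slips in a shortcut breaks immediately instead of silently hiding a pathfinding or combat bug.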

State Restoration Rules

State	How to Restore
HP	Spawn food → eat it
Prayer	Spawn prayer potion → drink it
Run Energy	Rest or spawn stamina potion → drink it
Inventory	Spawn items (allowed for setup)
Position	Use in-game teleport items OR walk there
Skill Levels	Can set directly for setup (tests level-gated content)
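
The HP row can be sketched as a small helper: spawn food as setup, then eat through the inventory until the target is reached. The fake `player`, the stand-in `test_api` and `game_api` tables, and the heal value of 3 per shrimp are all assumptions made so the example runs on its own.

```lua
-- Sketch: restore HP by spawning food (setup) and eating it (gameplay).
-- The fake player and APIs below stand in for the real game.
local player = { hp = 40, max_hp = 100, inventory = {} }

local test_api = {}
function test_api.spawn_item(item, count)
    for _ = 1, count do
        table.insert(player.inventory, item)
    end
end

local game_api = {}
function game_api.click_inventory_slot(slot)
    local item = table.remove(player.inventory, slot)
    if item == "shrimp" then
        -- Assumed heal value, for illustration only.
        player.hp = math.min(player.max_hp, player.hp + 3)
    end
end

-- Eat until HP reaches target_hp or the food runs out; returns items eaten.
local function restore_hp(target_hp, food, count)
    test_api.spawn_item(food, count)
    local eaten = 0
    while player.hp < target_hp and #player.inventory > 0 do
        game_api.click_inventory_slot(1)
        eaten = eaten + 1
    end
    return eaten
end

local eaten = restore_hp(50, "shrimp", 5)
print(player.hp, eaten)
```

In a real run the loop would also need to wait for the eating animation or cooldown between clicks, since skipping it is itself forbidden.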

Movement Rules

Method	Allowed?	Why
`click_tile(x, y)`	✅ Yes	Tests click → pathfinding pipeline
`walk_to(x, y)`	✅ Yes	Tests pathfinding (if uses real system)
`automation.teleport(x, y, z)`	❌ No	Bypasses all pathing, hides stuck bugs
Using teleport scroll/item	✅ Yes	Tests the teleport mechanic itself

Current Test Files Audit

Violations Found

File	Call	Line(s)	Verdict / Fix
`e2e_skills_test.lua`	`automation.set_skill_level()`	59	⚠️ Acceptable for SETUP only
`e2e_combat_test.lua`	`automation.spawn_entity()`	88	✅ OK (setup)
`comprehensive_test.lua`	`automation.teleport()`	86	❌ Use walk_to or a teleport scroll
`automated_tests/main.lua`	`automation.teleport()`	134, 177, 209	❌ Use walk_to or a teleport scroll
`automated_tests/main.lua`	`automation.modify_inventory()`	70-82	⚠️ OK for setup, but healing must be done by eating food
`automated_tests/main.lua`	`automation.set_skill_level()`	254	⚠️ Acceptable for SETUP only
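
An audit like the one above could be partly automated by scanning test sources for forbidden call names. The following is a minimal sketch, assuming plain substring matching is good enough; `find_violations` and the shortened `FORBIDDEN_CALLS` list are hypothetical, and the comment stripping is naive (it would also cut a `--` that appears inside a string literal).

```lua
-- Sketch: scan Lua test source for forbidden calls and report line numbers.
local FORBIDDEN_CALLS = {
    "automation.teleport(",
    "game_api.set_hp(",
    "game_api.kill_npc(",
    "game_api.skip_animation(",
}

local function find_violations(source)
    local violations = {}
    local line_no = 0
    for line in (source .. "\n"):gmatch("(.-)\n") do
        line_no = line_no + 1
        -- Ignore anything after a Lua line comment.
        local code = line:gsub("%-%-.*$", "")
        for _, call in ipairs(FORBIDDEN_CALLS) do
            if code:find(call, 1, true) then
                table.insert(violations, { line = line_no, call = call })
            end
        end
    end
    return violations
end

local sample = table.concat({
    'test_api.spawn_item("shrimp", 5)',
    'automation.teleport(10, 0, 10)',
    '-- automation.teleport(1, 0, 1) is commented out',
    'game_api.set_hp(100)',
}, "\n")

for _, v in ipairs(find_violations(sample)) do
    print(v.line, v.call)
end
```

A scan like this only flags candidates. Deciding whether a call is acceptable setup or a real violation still needs the judgment shown in the table.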

Recommended Fixes

Replace automation.teleport() with:
- automation.walk_to() for nearby destinations
- automation.spawn_item("teleport_scroll") + game_api.click_inventory_slot() for far destinations
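
The choice between the two replacements can be sketched as a distance check. `NEAR_LIMIT`, `plan_travel`, and the step labels it returns are hypothetical; the right threshold depends on the game's map scale and is not specified in this guide.

```lua
-- Sketch: choose a fair travel method based on straight-line distance in tiles.
-- NEAR_LIMIT is an assumed threshold, not a value from the real framework.
local NEAR_LIMIT = 50

local function plan_travel(from, to)
    local dx, dy = to.x - from.x, to.y - from.y
    local distance = math.sqrt(dx * dx + dy * dy)
    if distance <= NEAR_LIMIT then
        -- Nearby: walk, which exercises real pathfinding.
        return { "walk_to" }
    end
    -- Far away: spawn a scroll (setup), then use it like a player would.
    return { "spawn_item:teleport_scroll", "click_inventory_slot" }
end

print(table.concat(plan_travel({ x = 0, y = 0 }, { x = 30, y = 40 }), ", "))
print(table.concat(plan_travel({ x = 0, y = 0 }, { x = 300, y = 400 }), ", "))
```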

For HP restoration, do not set HP directly. Spawn food and eat it:

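-- Wrong: writes HP directly and bypasses food consumption
player.set_hp(100)

-- Correct: spawn food (setup), then eat it through the inventory (gameplay)
test_api.spawn_item("shrimp", 5)
game_api.click_inventory_slot(1)  -- Eat shrimp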

Test Agent Architecture

┌─────────────────────────────────────────┐
│         Test Agent = Virtual Player     │
├─────────────────────────────────────────┤
│  CAN READ:           CAN DO (SETUP):    │
│  - HP, XP, levels    - Spawn items      │
│  - Inventory         - Set skill levels │
│  - Position          - Spawn entities   │
│  - NPC positions                        │
│  - UI state          CAN DO (GAMEPLAY): │
│                      - Click tiles      │
│  CANNOT DO:          - Click entities   │
│  - Teleport directly - Click UI         │
│  - Set HP/Prayer     - Use items        │
│  - Skip animations   - Type in chat     │
│  - Bypass cooldowns                     │
└─────────────────────────────────────────┘

Failure Handling & Retry Strategy

Player Death During Test

Auto-Respawn: The game respawns the player at the configured respawn point.
Survival Mode: The agent should do everything it can to survive:
- Spawn and eat food when HP is low
- Activate protection prayers
- Retreat if overwhelmed
Max Retries: Configurable (default: 3). If exceeded → mark test FAILED.

-- Test config
test.config = {
    max_retries = 3,
    survival_hp_threshold = 30,
    auto_spawn_food = true,
}
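
How these settings might drive Survival Mode can be sketched as a pure decision function. The `state` snapshot fields (`hp`, `food_count`), the returned action labels, and the reading of `survival_hp_threshold` as absolute HP rather than a percentage are all assumptions.

```lua
-- Sketch: decide the next survival action from the test config.
local config = {
    max_retries = 3,
    survival_hp_threshold = 30,  -- treated as absolute HP here (assumption)
    auto_spawn_food = true,
}

-- state.hp and state.food_count are assumed fields on a hypothetical snapshot.
local function survival_action(state, cfg)
    if state.hp > cfg.survival_hp_threshold then
        return "continue"
    end
    if state.food_count > 0 then
        return "eat"
    end
    if cfg.auto_spawn_food then
        return "spawn_food_then_eat"
    end
    return "retreat"
end

print(survival_action({ hp = 80, food_count = 0 }, config))
print(survival_action({ hp = 20, food_count = 2 }, config))
print(survival_action({ hp = 20, food_count = 0 }, config))
```

Keeping the decision separate from the game calls makes it easy to unit test without a running client.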

Timeout Handling

Retry Action: On timeout, retry the action (click again).
Max Retries: Configurable per action.
Dependency Check: If subsequent tests depend on failed step → fail entire suite.

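-- Retry the current action until max_retries is reached, then fail.
-- current_retry is assumed to be reset to 0 whenever a new action starts.
function on_timeout()
    if current_retry < test.config.max_retries then
        current_retry = current_retry + 1
        retry_current_action()
    else
        if is_blocking_dependency() then
            -- Later tests cannot run without this step, so stop the whole suite.
            fail_entire_suite("Blocking dependency failed: " .. current_test.name)
        else
            -- Independent test: record the failure and move on.
            mark_test_failed()
            skip_to_next_test()
        end
    end
end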

Resource Exhaustion

Scenario	Action
Out of food (normal test)	Spawn more food, continue
Out of food (boss test)	Log failure (intentional challenge)
Out of potions	Spawn more, continue
Out of ammo	Spawn more, continue
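
The table above maps naturally to a lookup. In this sketch the resource keys, the test type names, and the action labels are hypothetical, not identifiers from the real framework.

```lua
-- Sketch: resource exhaustion policy as a lookup table.
local EXHAUSTION_POLICY = {
    food    = { normal = "spawn_more", boss = "log_failure" },
    potions = { normal = "spawn_more", boss = "spawn_more" },
    ammo    = { normal = "spawn_more", boss = "spawn_more" },
}

local function on_resource_exhausted(resource, test_type)
    local policy = EXHAUSTION_POLICY[resource]
    if not policy then
        error("no exhaustion policy for resource: " .. tostring(resource))
    end
    -- Unknown test types fall back to the normal-test behavior.
    return policy[test_type] or policy.normal
end

print(on_resource_exhausted("food", "normal"))
print(on_resource_exhausted("food", "boss"))
```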

Retry Scope (Configurable Hooks)

-- Per-test retry configuration
register_test("Combat Test", {
    retry_scope = "full",     -- Options: "full", "step", "checkpoint"
    max_retries = 3,
    on_retry = function()
        reset_to_start_position()
        clear_inventory()
        spawn_required_items()
    end
})

Retry Scope	Behavior
`full`	Restart entire test from step 1
`step`	Retry only the failed step
`checkpoint`	Restart from last checkpoint
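
The three scopes differ only in which step a retry resumes from, which can be sketched as a single function. `restart_step`, `failed_step`, and the `checkpoints` list of step indices are assumed names and inputs for illustration.

```lua
-- Sketch: compute which step a retry should restart from.
-- checkpoints is an assumed list of step indices where a checkpoint was saved.
local function restart_step(scope, failed_step, checkpoints)
    if scope == "full" then
        return 1
    elseif scope == "step" then
        return failed_step
    elseif scope == "checkpoint" then
        -- Latest checkpoint at or before the failed step; step 1 if none exists.
        local last = 1
        for _, cp in ipairs(checkpoints or {}) do
            if cp <= failed_step and cp > last then
                last = cp
            end
        end
        return last
    end
    error("unknown retry_scope: " .. tostring(scope))
end

print(restart_step("full", 7, { 3, 5 }))
print(restart_step("step", 7, { 3, 5 }))
print(restart_step("checkpoint", 7, { 3, 5, 9 }))
```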

State Cleanup Between Retries

✅ Reset position to test starting point
✅ Clear inventory, re-spawn required items
✅ Wait for NPC respawns if needed
✅ Reset any modified game state

Flaky Test Detection

Definition: Test passes sometimes, fails sometimes.
Detection: The test fails after passing earlier in the same run, or passes after failing.
Handling:
- Mark as FLAKY in report
- Raise its priority for investigation
- Flaky tests often indicate race conditions

-- Flaky detection (this check covers pass → fail; the reverse case is symmetric)
if test.failed and test.previous_result == "PASS" then
    test.status = "FLAKY"
    test.priority = "HIGH"
    report_bug("FLAKY", "Test is non-deterministic", {
        test_name = test.name,
        fail_rate = test.fail_count / test.run_count
    })
end
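
A variant that catches a flip in either direction is to classify the test from all of its results within the run. This is a sketch under the assumption that results are kept as a list of "PASS"/"FAIL" strings; `classify` and the "NOT_RUN" label are hypothetical.

```lua
-- Sketch: classify a test from its results within one run.
-- results is an assumed list of "PASS"/"FAIL" strings, oldest first.
local function classify(results)
    local passes, fails = 0, 0
    for _, r in ipairs(results) do
        if r == "PASS" then
            passes = passes + 1
        else
            fails = fails + 1
        end
    end
    local total = passes + fails
    if total == 0 then
        return "NOT_RUN", 0
    end
    local status
    if passes > 0 and fails > 0 then
        status = "FLAKY"      -- mixed outcomes in the same run
    elseif fails > 0 then
        status = "FAILED"
    else
        status = "PASSED"
    end
    return status, fails / total
end

print(classify({ "PASS", "FAIL", "PASS", "PASS" }))
print(classify({ "FAIL", "FAIL" }))
```

Returning the fail rate alongside the status gives the report the same `fail_rate` figure without risking a division by zero.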

Failure Logging

What to Capture on Failure

Data	Captured	Configuration / Contents
Screenshot	✅ Yes	-
Video recording	✅ Yes	Duration (default: last 30s)
Game log	✅ Yes	Period (default: last 60s)
Full state dump	✅ Yes	HP, inventory, position, buffs
Network traffic	Optional	For multiplayer debugging

-- Failure logging config
test.logging = {
    video_duration = 30,      -- seconds before failure
    log_period = 60,          -- seconds of log history
    screenshot = true,
    state_dump = true,
}

Failure Report Format

{
  "test_name": "Combat Test",
  "status": "FAILED",
  "failure_reason": "Player died unexpectedly",
  "retry_count": 3,
  "is_flaky": false,
  "attachments": {
    "screenshot": "failure_1234.png",
    "video": "failure_1234.webm",
    "log": "failure_1234.log",
    "state_dump": {
      "hp": 0,
      "position": [3200, 0, 3150],
      "inventory": ["bronze_sword", "shrimp", "shrimp"],
      "target": "goblin_123"
    }
  }
}

Ticket Filing Policy

Scenario	File Ticket?
Every failure	✅ Yes
Flaky test detection	✅ Yes (HIGH priority)
Timeout	✅ Yes
Player death (unexpected)	✅ Yes
Resource exhaustion	⚠️ Depends on test type
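
The policy can be sketched as a function from a failure event to a ticket decision. The `event` fields, the "NORMAL" priority label, and the rule that only boss tests treat resource exhaustion as ticket-worthy are assumptions; the last one is inferred from the Resource Exhaustion table rather than stated outright.

```lua
-- Sketch: decide whether to file a ticket and at what priority.
-- event.kind and event.test_type are assumed fields on a hypothetical event.
local function ticket_for(event)
    if event.kind == "flaky" then
        return { file = true, priority = "HIGH" }
    elseif event.kind == "resource_exhaustion" then
        -- Assumption: only boss tests treat exhaustion as a failure.
        return { file = event.test_type == "boss", priority = "NORMAL" }
    end
    -- Timeouts, unexpected deaths, and every other failure get a ticket.
    return { file = true, priority = "NORMAL" }
end

local t = ticket_for({ kind = "flaky" })
print(t.file, t.priority)
```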

LLM Integration (Gemini Flash)

For ambiguous bugs, send the failure to the LLM for triage:

- Only when assertion-based detection fails
- Optimize prompts to stay within free tier limits
- Include: screenshot description, game state, specific question
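
A compact prompt containing the three required parts could be assembled as below. This sketch makes no API call; `build_triage_prompt` and `MAX_DESC_CHARS` are hypothetical, and the state table is assumed to have string keys only.

```lua
-- Sketch: build a compact triage prompt from the three required parts.
-- MAX_DESC_CHARS is an assumed cap to keep request size predictable.
local MAX_DESC_CHARS = 500

local function build_triage_prompt(screenshot_desc, state, question)
    -- Sort keys so the same state always yields the same prompt.
    local keys = {}
    for k in pairs(state) do
        keys[#keys + 1] = k
    end
    table.sort(keys)

    local fields = {}
    for _, k in ipairs(keys) do
        fields[#fields + 1] = k .. "=" .. tostring(state[k])
    end

    return table.concat({
        "Screenshot: " .. screenshot_desc:sub(1, MAX_DESC_CHARS),
        "State: " .. table.concat(fields, ", "),
        "Question: " .. question,
    }, "\n")
end

print(build_triage_prompt(
    "Player stuck at door",
    { hp = 12, target = "goblin_123" },
    "Is the player stuck?"
))
```

Capping only the free-text description keeps the state and the question intact, which matters more for triage than a long screenshot summary.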

Execution Mode

- Local only (no cloud for now)
- Manual trigger (not CI/nightly yet)
- Full coverage (all skills, combat, UI, economy)