Adversarial scenarios add a saboteur agent that re-injects faults on a cooldown,
noisy Slack channels designed to actively misdirect the agent, runbook traps
where the standard playbook makes things worse, and cascading failures where
the loud symptom hides the real root cause.

Source files: `rl-agent/scenarios/sim/hard/sim_advanced_*.json`.

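
To make the "re-injects faults on a cooldown" behaviour concrete, here is a minimal sketch of how such a saboteur could work. The `Saboteur` class, its fields, and the step loop are hypothetical illustrations; they are not taken from the scenario files or the simulator code, whose actual schema is not shown here.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Saboteur:
    """Hypothetical saboteur: re-injects a fault once its cooldown has elapsed."""
    target: str
    cooldown_steps: int
    last_injected_at: Optional[int] = None

    def maybe_inject(self, step: int, faulty: Set[str]) -> bool:
        """Break the target again if it is healthy and the cooldown is over."""
        on_cooldown = (
            self.last_injected_at is not None
            and step - self.last_injected_at < self.cooldown_steps
        )
        if self.target in faulty or on_cooldown:
            return False
        faulty.add(self.target)
        self.last_injected_at = step
        return True


# Toy episode: the agent repairs auth_db one step after every injection.
faulty: Set[str] = set()
saboteur = Saboteur(target="auth_db", cooldown_steps=3)
log = []
for step in range(8):
    if saboteur.maybe_inject(step, faulty):
        log.append((step, "inject"))
    elif "auth_db" in faulty:
        faulty.discard("auth_db")  # stand-in for the agent's repair action
        log.append((step, "repair"))

print(log)
# [(0, 'inject'), (1, 'repair'), (3, 'inject'), (4, 'repair'), (6, 'inject'), (7, 'repair')]
```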
| ID | Title | Difficulty | Category | Mean reward |
|---|---|---|---|---|
| sim_gen_redherring_auth_001 | Red-herring on Slack — true cause is leak in auth | medium | Generated · Red Herring | -8.280 |
| sim_gen_redherring_auth_002 | Red-herring on Slack — true cause is leak in auth | medium | Generated · Red Herring | -6.330 |
| sim_gen_redherring_auth_003 | Red-herring on Slack — true cause is leak in auth | medium | Generated · Red Herring | -6.630 |
| sim_gen_redherring_auth_004 | Red-herring on Slack — true cause is leak in auth | medium | Generated · Red Herring | -8.280 |
| sim_gen_redherring_catalog_009 | Red-herring on Slack — true cause is leak in catalog | medium | Generated · Red Herring | -5.880 |
| sim_gen_redherring_catalog_010 | Red-herring on Slack — true cause is leak in catalog | medium | Generated · Red Herring | -6.480 |
| sim_gen_redherring_catalog_011 | Red-herring on Slack — true cause is leak in catalog | medium | Generated · Red Herring | -8.280 |
| sim_gen_redherring_catalog_012 | Red-herring on Slack — true cause is leak in catalog | medium | Generated · Red Herring | -5.430 |
| sim_gen_redherring_checkout_005 | Red-herring on Slack — true cause is leak in checkout | medium | Generated · Red Herring | -6.480 |
| sim_gen_redherring_checkout_006 | Red-herring on Slack — true cause is leak in checkout | medium | Generated · Red Herring | -5.730 |
| sim_gen_redherring_checkout_007 | Red-herring on Slack — true cause is leak in checkout | medium | Generated · Red Herring | -4.830 |
| sim_gen_redherring_checkout_008 | Red-herring on Slack — true cause is leak in checkout | medium | Generated · Red Herring | -5.730 |
| sim_gen_redherring_inventory_017 | Red-herring on Slack — true cause is leak in inventory | medium | Generated · Red Herring | -6.630 |
| sim_gen_redherring_inventory_018 | Red-herring on Slack — true cause is leak in inventory | medium | Generated · Red Herring | -5.880 |
| sim_gen_redherring_inventory_019 | Red-herring on Slack — true cause is leak in inventory | medium | Generated · Red Herring | -4.230 |
| sim_gen_redherring_inventory_020 | Red-herring on Slack — true cause is leak in inventory | medium | Generated · Red Herring | -6.930 |
| sim_gen_redherring_payments_013 | Red-herring on Slack — true cause is leak in payments | medium | Generated · Red Herring | -6.180 |
| sim_gen_redherring_payments_014 | Red-herring on Slack — true cause is leak in payments | medium | Generated · Red Herring | -6.930 |
| sim_gen_redherring_payments_015 | Red-herring on Slack — true cause is leak in payments | medium | Generated · Red Herring | -8.280 |
| sim_gen_redherring_payments_016 | Red-herring on Slack — true cause is leak in payments | medium | Generated · Red Herring | -5.880 |
| sim_advanced_cascade_users_db_001 | Cascade: users_db memory-leak hides behind frontend 504s | hard | Cascading Failure | -7.230 |
| sim_advanced_runbook_trap_postgres_001 | Trap: TXID wraparound — restart corrupts the DB | hard | Runbook Trap | -7.380 |
| sim_advanced_saboteur_duel_001 | 1v1 Duel — Active Saboteur attacks auth_db, then choke-holds the replica | hard | Adversarial Saboteur | -5.805 |
| sim_advanced_slack_redherring_001 | Slack Red-Herring — A frontend dev claims their hotfix broke checkout | hard | Slack Red Herring | -5.655 |
| sim_advanced_trolley_orders_db_001 | Trolley: orders_db index corrupted — rebuild vs restore | hard | Trolley Problem | -6.180 |
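
For a quick summary of the table, the rows can be grouped and averaged by any column. The sketch below parses pipe-delimited rows copied from the table above. The helper names `parse_row` and `mean_reward_by` are illustrative and not part of the repository, and averaging the per-scenario means assumes each scenario should carry equal weight.

```python
from collections import defaultdict
from statistics import mean

COLUMNS = ["id", "title", "difficulty", "category", "mean_reward"]


def parse_row(line):
    """Split one pipe-delimited table row into a dict keyed by COLUMNS."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    if len(cells) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} cells, got {len(cells)}: {line!r}")
    row = dict(zip(COLUMNS, cells))
    row["mean_reward"] = float(row["mean_reward"])
    return row


def mean_reward_by(lines, key):
    """Average the mean-reward column over rows that share the same `key` value."""
    groups = defaultdict(list)
    for line in lines:
        row = parse_row(line)
        groups[row[key]].append(row["mean_reward"])
    return {name: round(mean(values), 3) for name, values in groups.items()}


# The five hard rows, copied verbatim from the table.
HARD_ROWS = """
sim_advanced_cascade_users_db_001 | Cascade: users_db memory-leak hides behind frontend 504s | hard | Cascading Failure | -7.230 |
sim_advanced_runbook_trap_postgres_001 | Trap: TXID wraparound — restart corrupts the DB | hard | Runbook Trap | -7.380 |
sim_advanced_saboteur_duel_001 | 1v1 Duel — Active Saboteur attacks auth_db, then choke-holds the replica | hard | Adversarial Saboteur | -5.805 |
sim_advanced_slack_redherring_001 | Slack Red-Herring — A frontend dev claims their hotfix broke checkout | hard | Slack Red Herring | -5.655 |
sim_advanced_trolley_orders_db_001 | Trolley: orders_db index corrupted — rebuild vs restore | hard | Trolley Problem | -6.180 |
""".strip().splitlines()

print(mean_reward_by(HARD_ROWS, "difficulty"))  # {'hard': -6.45}
```

Running the same helper over the twenty `medium` rows gives -6.465, so the two difficulty tiers have nearly identical average rewards in this table.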