PPO KAGGLE · 381 TASKS
These charts come from the 3-shard PPO + LoRA training run on free Kaggle T4 GPUs. Source data: kaggle ran notebooks/shard {1,2,3}/training_kaggle{N}.json, plus the 381 scenarios in rl-agent/scenarios/sim/{easy,medium,hard}/*.json, pre-bundled into rl-agent/showcase_data.json by scripts/build_showcase_data.py. Every number shown is computed from those files.

Curriculum composition

In the PPO run we shuffled all 381 tasks and split them across the three shards modulo 3, rather than escalating through difficulty tiers. The curriculum knob is reserved for the next training pass.
Easy: 156
Medium: 128
Hard: 97
Categories: 21
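
The shuffled, modulo-3 split described above can be sketched as follows. This is a minimal illustration, not the notebooks' actual code: the function name shard_tasks, the seed, and the placeholder task IDs are all assumptions made for the example.

```python
import random


def shard_tasks(task_ids, num_shards=3, seed=0):
    """Shuffle once, then deal tasks round-robin across shards.

    Hypothetical helper: the task at shuffled position i goes to
    shard i % num_shards, so tiers are mixed within every shard.
    """
    order = list(task_ids)
    random.Random(seed).shuffle(order)  # fixed seed keeps the split reproducible
    shards = [[] for _ in range(num_shards)]
    for i, task in enumerate(order):
        shards[i % num_shards].append(task)
    return shards


# Placeholder IDs standing in for the 381 scenario files.
tasks = [f"task-{i:03d}" for i in range(381)]
shards = shard_tasks(tasks)
shard_sizes = [len(s) for s in shards]  # 381 divides evenly into 3 x 127
```

Because 381 is divisible by 3, each shard receives the same number of tasks under this scheme, and every task lands in exactly one shard.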

Categories

Category                        Tasks
Other · Easy                    80
Generated · App Memory Leak     42
Other · Med                     30
Generated · Cascade             30
Generated · Cache Cold-Start    24
Generated · DB Failover Duel    24
DynamoDB Throttling             20
Lambda Throttling               20
Generated · Red Herring         20
Generated · Peak Traffic        18
EventBridge → Lambda Chain      15
Step Functions → Lambda         15
API Gateway Multi-Stage         10
DynamoDB Cascade                10
IAM Permission Chain            10
Generated · DB Restore          8
Cascading Failure               1
Runbook Trap                    1
Adversarial Saboteur            1
Slack Red Herring               1
Trolley Problem                 1

Where the controller will plug in

The legacy curriculum tiers (warmup → beginner → intermediate → advanced → expert) remain wired into environment/curriculum.py and into POST /reset with use_curriculum:true. The next training pass will turn them on so the agent sees easy scenarios first and graduates to the next tier only after its rolling success rate in the current tier reaches ≥ 0.65.
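
The graduation rule above can be sketched as a small controller. This is an illustration under stated assumptions, not the contents of environment/curriculum.py: the class name CurriculumController, the 20-episode window, and the choice to reset the window on promotion are all hypothetical. Only the tier names and the 0.65 threshold come from the text.

```python
from collections import deque

TIERS = ["warmup", "beginner", "intermediate", "advanced", "expert"]


class CurriculumController:
    """Hypothetical sketch: promote after rolling success >= threshold."""

    def __init__(self, threshold=0.65, window=20):
        self.threshold = threshold
        self.window = window  # assumed window size; not given in the source
        self.tier_index = 0
        self.results = deque(maxlen=window)  # keeps only the latest episodes

    @property
    def tier(self):
        return TIERS[self.tier_index]

    def rolling_success(self):
        return sum(self.results) / len(self.results) if self.results else 0.0

    def record(self, success):
        """Log one episode outcome and promote if the rule is met."""
        self.results.append(1 if success else 0)
        window_full = len(self.results) == self.window
        can_promote = self.tier_index < len(TIERS) - 1
        if window_full and can_promote and self.rolling_success() >= self.threshold:
            self.tier_index += 1
            self.results.clear()  # start the next tier with a fresh window
        return self.tier


ctrl = CurriculumController()
for _ in range(20):
    ctrl.record(False)  # a full window of failures: stays in warmup
tier_after_failures = ctrl.tier
for _ in range(20):
    ctrl.record(True)   # successes push the rolling rate past 0.65
tier_after_successes = ctrl.tier
```

Requiring a full window before promoting keeps a few lucky early episodes from triggering graduation. Whether the real controller behaves this way is not stated in the source.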