PPO KAGGLE · 381 TASKS
These charts come from the 3-shard PPO + LoRA training run we performed on free Kaggle T4s. Source data: the per-shard training logs (kaggle ran notebooks/shard {1,2,3}/training_kaggle{N}.json) plus the 381 scenarios in rl-agent/scenarios/sim/{easy,medium,hard}/*.json, pre-bundled into rl-agent/showcase_data.json by scripts/build_showcase_data.py. Every number shown is computed from those files.
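A minimal sketch of how such a bundle could be assembled from the per-difficulty scenario directories. The directory names come from the paths above; the `build_bundle` helper and the output shape (`scenarios` list plus per-difficulty `counts`) are assumptions for illustration, not the actual logic of scripts/build_showcase_data.py:

```python
import json
import tempfile
from pathlib import Path


def build_bundle(scenario_root: Path) -> dict:
    """Collect per-difficulty scenario JSON files into one bundle dict.

    Hypothetical re-implementation; the real build lives in
    scripts/build_showcase_data.py.
    """
    bundle = {"scenarios": [], "counts": {}}
    for difficulty in ("easy", "medium", "hard"):
        files = sorted((scenario_root / difficulty).glob("*.json"))
        bundle["counts"][difficulty] = len(files)
        for path in files:
            scenario = json.loads(path.read_text())
            scenario["difficulty"] = difficulty  # tag origin directory
            bundle["scenarios"].append(scenario)
    return bundle


# Demo on a temporary tree mimicking rl-agent/scenarios/sim/{easy,medium,hard}/
root = Path(tempfile.mkdtemp())
for diff, n in (("easy", 2), ("medium", 1), ("hard", 1)):
    d = root / diff
    d.mkdir()
    for i in range(n):
        (d / f"task_{i}.json").write_text(json.dumps({"id": f"{diff}-{i}"}))

bundle = build_bundle(root)
print(len(bundle["scenarios"]), bundle["counts"])
# → 4 {'easy': 2, 'medium': 1, 'hard': 1}
```

Serving the resulting dict as JSON is then a one-liner for whatever framework backs the /showcase/data route.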

PPO endpoint catalogue

Routes specific to the 381-task PPO Kaggle run. The core routes (/reset, /step, /state, etc.) are documented in the legacy API view.
Method  Path                        Description
GET     /showcase                   Apple-style fluid showcase page (this dataset)
GET     /showcase/data              Pre-computed bundle (381 scenarios + 3 shards)
GET     /dashboard/ppo              PPO overview (this page family)
GET     /dashboard/ppo/rewards      PPO reward analytics
GET     /dashboard/ppo/tasks        PPO per-task table
GET     /dashboard/ppo/training     PPO training curves
GET     /dashboard/ppo/cluster      Cluster integration in PPO mode
GET     /dashboard/ppo/aws          AWS scenario coverage
GET     /dashboard/ppo/adversarial  Adversarial scenarios
GET     /dashboard/ppo/curriculum   Curriculum composition
GET     /dashboard/ppo/judge        Judge configuration
GET     /dashboard?ds=legacy        Switch back to the legacy 11-task dataset

OpenAPI docs: /docs