PPO KAGGLE · 381 TASKS
These charts come from the 3-shard PPO + LoRA training run on free Kaggle T4 GPUs. Source data:
kaggle ran notebooks/shard {1,2,3}/training_kaggle{N}.json plus the 381 scenarios in
rl-agent/scenarios/sim/{easy,medium,hard}/*.json, all pre-bundled into
rl-agent/showcase_data.json by scripts/build_showcase_data.py.
Every visible number is computed from those files.
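One quick sanity check on the source data is to count the scenario files per difficulty before trusting the bundled numbers. The sketch below assumes only the directory layout quoted above (one folder each for easy, medium, and hard, holding *.json files); the helper name count_scenarios is hypothetical and not part of the repo.

```python
from pathlib import Path

# Difficulty folders named in the source-data description.
DIFFICULTIES = ("easy", "medium", "hard")


def count_scenarios(root):
    """Return {difficulty: number of *.json files} under `root`.

    Hypothetical helper: a missing difficulty folder counts as 0.
    """
    root = Path(root)
    counts = {}
    for difficulty in DIFFICULTIES:
        folder = root / difficulty
        counts[difficulty] = (
            len(list(folder.glob("*.json"))) if folder.is_dir() else 0
        )
    return counts


if __name__ == "__main__":
    counts = count_scenarios("rl-agent/scenarios/sim")
    print(counts, "total:", sum(counts.values()))
```

If the checkout is complete, the three counts should sum to the 381 scenarios stated above.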
Cluster integration · PPO mode
The PPO-trained adapter targets the same IncidentCommanderEnv action set; cluster integration is unchanged.
Code path
| Mode | Effect | Env var |
| --- | --- | --- |
| mock | All write actions return [MOCK] deltas; pure simulator. | MOCK_MODE=true (default) |
| live k8s | Write actions hit a real cluster via kubernetes python client. | REAL_K8S=true |
| aws | Lambda / DDB / API Gateway / EventBridge actions hit real AWS. | USE_AWS=true |
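The three switches in the table can be read from the environment in one place. This is a minimal sketch, not the project's actual loader: the names BackendConfig and load_backend_config are hypothetical, the set of accepted truthy strings is an assumption, and the flags are treated as independent because the table does not say how they interact when several are set.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: these spellings count as "on"; the source only shows "true".
_TRUTHY = {"1", "true", "yes", "on"}


def _flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Read a boolean environment variable, falling back to `default`."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


@dataclass(frozen=True)
class BackendConfig:
    mock: bool      # MOCK_MODE: write actions return [MOCK] deltas
    real_k8s: bool  # REAL_K8S: write actions hit a live cluster
    use_aws: bool   # USE_AWS: AWS actions hit real AWS


def load_backend_config(env: Mapping[str, str] = os.environ) -> BackendConfig:
    """Hypothetical loader; each flag is read independently."""
    return BackendConfig(
        mock=_flag(env, "MOCK_MODE", default=True),  # table: true by default
        real_k8s=_flag(env, "REAL_K8S", default=False),
        use_aws=_flag(env, "USE_AWS", default=False),
    )
```

With no variables set, this yields mock mode only, matching the default in the table. How the real code resolves conflicting flags (for example MOCK_MODE=true together with REAL_K8S=true) is not stated here and would need to be checked in the repo.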
Files
| Path | Purpose |
| --- | --- |
| infra/terraform/main.tf | Hetzner Cloud cluster (3× cx21 + load-balancer) |
| infra/k8s/*.yaml | Deployments, Services, ConfigMaps for the 5 microservices |
| infra/helm/acmecorp/ | Helm chart that rolls everything out |
| infra/aws/ | CloudFormation alternative (free EKS credits) |
| environment/k8s_real.py | Bridge from env.step to live cluster |