LEGACY DATASET
These charts come from the kube-sre-gym-style heuristic and early notebook runs: the 11 hand-curated tasks in rl-agent/scenarios/{easy,medium,hard}/*.json, recorded into rl-agent/checkpoints/<run>/metrics.jsonl and colab/logs/reward_breakdown_history.jsonl. They do not include the 381-task PPO Kaggle run.

API reference

All endpoints exposed by the env server.
Method  Path                    Description
POST    /reset                  Reset env; supports use_curriculum, adversarial, persona, real_mode
POST    /step                   Take one action; returns observation/reward/done/info
GET     /state                  Current env state snapshot
POST    /grader                 Grade the finished episode; records outcome for curriculum
GET     /tasks                  List 11 tasks + action schema
GET     /baseline               Run the built-in heuristic agent (deterministic)
GET     /curriculum             Current tier, mastery, rolling success
POST    /curriculum/reset       Reset curriculum progress
POST    /adversarial/design     Procedural or LLM-backed scenario designer
GET     /judge/config           Persona + judge settings
POST    /k8s/inject             Inject a fault into the live cluster
POST    /k8s/reset              Restore clean cluster state
POST    /k8s/exec               Run a kubectl verb (whitelisted)
GET     /k8s/health             Cluster health snapshot
GET     /dashboard              Overview page (this dashboard)
GET     /dashboard/rewards      Reward-signal analytics
GET     /dashboard/tasks        Per-task performance
GET     /dashboard/training     Training curves
GET     /dashboard/cluster      Cluster health
GET     /dashboard/adversarial  Adversarial designer history
GET     /dashboard/curriculum   Curriculum state
GET     /dashboard/judge        Judge/persona
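
The reset, step, and grader endpoints above suggest a simple episode loop. The following is a minimal sketch of that loop, not the server's documented client. It assumes the server listens on http://localhost:8000, that /reset accepts its options (for example use_curriculum) as a JSON body, that /step takes the action as a JSON body, and that /grader accepts an empty JSON body. None of these request shapes is confirmed by the table; only the done field of the /step response is taken from it. The helper names call and run_episode are hypothetical. Check /docs for the real schemas.

```python
import json
from urllib import request

# Assumption: host and port are not stated in this reference.
BASE_URL = "http://localhost:8000"


def call(method, path, payload=None):
    """Send one JSON request to the env server and decode the JSON reply."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_episode(choose_action, max_steps=20, transport=call):
    """Reset the env, step until done (or max_steps), then grade the episode.

    choose_action is a caller-supplied function that maps the latest server
    reply to an action payload. The action format is an assumption here; the
    real schema is listed by GET /tasks.
    """
    # Assumption: reset options are passed as a JSON body.
    result = transport("POST", "/reset", {"use_curriculum": True})
    for _ in range(max_steps):
        action = choose_action(result)
        result = transport("POST", "/step", action)
        # The table says /step returns observation/reward/done/info.
        if result.get("done"):
            break
    # Assumption: /grader needs no arguments beyond an empty JSON body.
    return transport("POST", "/grader", {})
```

The transport parameter lets the loop be exercised without a running server by passing a stand-in function that takes the same (method, path, payload) arguments.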

OpenAPI docs: /docs