Learning Rate
Discount Factor
Action Randomization
Full set of states (may take longer to train)