[Experiment Note] Domain Neurons
A series of experiments conducted for domain neurons.
2023.08.22
- Why do the CartPole checkpoints have the same return (500) for all models? To validate the code, we ran an additional training run on Mountain-Car. [mail-link]
- ✔️ Mountain-Car training is implemented. [mail-link]
- ✔️ Mountain-Car evaluation showed an increasing return, unlike CartPole. [mail-link] Although the training is underfit, we don't pursue Mountain-Car and CartPole further, as they will not be used in the paper.
- ❌ Initial PPO training in CarRacing failed: the return did not increase. [mail-link]
- ✔️ CartPole-Randomized environment training (V0 ~ V2). (to be done). [mail-link v2] [mail-link v1] [mail-link v0]
- V2 has a higher return than V1 because a shorter pole yields a higher return.
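
Sketch for the CartPole-Randomized item above: one possible way to sample a pole length per episode. This is not the code used in the experiments. The note does not define V0 ~ V2, so the ranges in POLE_LENGTH_RANGES are placeholders, and PoleParams and sample_pole_params are hypothetical names. Only the defaults (half-length 0.5, pole mass 0.1) come from Gym's CartPole.

```python
import random
from dataclasses import dataclass

# Placeholder ranges for the pole half-length. The note does not say
# how V0 ~ V2 differ, so these numbers are assumptions for illustration.
POLE_LENGTH_RANGES = {
    "v0": (0.5, 0.5),   # no randomization: Gym's CartPole default
    "v1": (0.5, 1.0),   # assumed: longer poles
    "v2": (0.25, 0.5),  # assumed: shorter poles
}


@dataclass
class PoleParams:
    length: float          # half-length of the pole
    masspole: float = 0.1  # Gym's CartPole default pole mass

    @property
    def polemass_length(self) -> float:
        # Derived term in the CartPole dynamics; it has to be recomputed
        # whenever the length changes.
        return self.masspole * self.length


def sample_pole_params(variant: str, rng: random.Random) -> PoleParams:
    """Draw one pole configuration for a new episode."""
    low, high = POLE_LENGTH_RANGES[variant]
    return PoleParams(length=rng.uniform(low, high))


rng = random.Random(0)
for variant in POLE_LENGTH_RANGES:
    params = sample_pole_params(variant, rng)
    print(variant, round(params.length, 3), round(params.polemass_length, 4))
```

In a real run the sampled values would be written into the environment at reset time. Logging the sampled length with each episode return would let the V1 vs V2 gap be checked against pole length directly.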
2023.08.23
- Implement the SAC trainer and train on CarRacing.
- Domain Randomization