this reminds me of ddpg-usv-asmc
and Deep-Reinforcement-Learning-Algorithms-with-PyTorch (is it? No, it is stable-baselines3, which includes PPO, the algorithm OpenAI used when training InstructGPT) or Deep-reinforcement-learning-with-pytorch
awesome-deep-rl: for deep RL and the future of AI.