Further reading
- Ghasemi, An Introduction to Reinforcement Learning: Fundamental Concepts and Practical Applications, 2024, https://arxiv.org/abs/2408.07712
- Mnih, Playing Atari with Deep Reinforcement Learning, 2013, https://arxiv.org/abs/1312.5602
- Hugging Face, Proximal Policy Optimization (PPO), https://huggingface.co/blog/deep-rl-ppo
- Wang, Learning Reinforcement Learning by Learning REINFORCE, https://www.cs.toronto.edu/~tingwuwang/REINFORCE.pdf
- Kaufmann, A Survey of Reinforcement Learning from Human Feedback, 2024, https://arxiv.org/pdf/2312.14925
- Bongratz, How to Choose a Reinforcement-Learning Algorithm, 2024, https://arxiv.org/abs/2407.20917v1
- Schulman, Proximal Policy Optimization Algorithms, 2017, https://arxiv.org/abs/1707.06347
- OpenAI, Proximal Policy Optimization, https://openai.com/index/openai-baselines-ppo/
- OpenAI Spinning Up, Proximal Policy Optimization, https://spinningup.openai.com/en/latest/algorithms/ppo.html
- Bick, Towards...