Further reading
- Aleixo et al., Catastrophic Forgetting in Deep Learning: A Comprehensive Taxonomy, 2023, https://arxiv.org/abs/2312.10549
- Raieli, Emergent Abilities in AI: Are We Chasing a Myth?, 2023, https://towardsdatascience.com/emergent-abilities-in-ai-are-we-chasing-a-myth-fead754a1bf9
- Rasul et al., Preference Tuning LLMs with Direct Preference Optimization Methods, 2024, https://huggingface.co/blog/pref-tuning
- Alemi, KL is All You Need, 2024, https://blog.alexalemi.com/kl-is-all-you-need.html
- OpenAI, Proximal Policy Optimization, https://spinningup.openai.com/en/latest/algorithms/ppo.html
- Simonini, Proximal Policy Optimization (PPO), 2022, https://huggingface.co/blog/deep-rl-ppo
- Hoffmann et al., Training Compute-Optimal Large Language Models, 2022, https://arxiv.org/abs/2203.15556
- Brown et al., Language Models are Few-Shot Learners, 2020, https://arxiv.org/abs/2005.14165