This document discusses reinforcement learning. It begins by defining reinforcement learning as goal-directed learning from interaction, driven by trial and error. It then contrasts reinforcement learning with unsupervised learning, noting that reinforcement learning aims to maximize a reward signal through closed-loop interaction rather than to uncover hidden structure in unlabeled data. The document discusses the exploration-exploitation dilemma and gives examples of reinforcement learning problems, such as controlling a mobile robot or optimizing the operation of a petroleum refinery. It outlines the key elements of a reinforcement learning system: a policy, a reward signal, a value function, and, optionally, a model of the environment. Finally, it frames these problems as Markov decision processes and surveys solution methods such as dynamic programming and Monte Carlo methods.
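To make the policy, reward, and value-function vocabulary and the exploration-exploitation dilemma concrete, the following is a minimal sketch, not taken from the document itself, of an epsilon-greedy agent on a k-armed bandit. The arm means, the value of epsilon, and the step count are illustrative assumptions.

```python
import random

def run_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Sketch of an epsilon-greedy agent on a k-armed bandit (assumed setup)."""
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k        # value estimates for each arm
    n = [0] * k          # number of times each arm has been pulled
    total_reward = 0.0
    for _ in range(steps):
        # Policy: explore with probability epsilon, otherwise exploit
        # the current value estimates (greedy action selection).
        if rng.random() < epsilon:
            a = rng.randrange(k)                   # explore
        else:
            a = max(range(k), key=lambda i: q[i])  # exploit
        # Reward: noisy sample centered on the chosen arm's true mean.
        r = rng.gauss(true_means[a], 1.0)
        total_reward += r
        # Value update: incremental sample average of observed rewards.
        n[a] += 1
        q[a] += (r - q[a]) / n[a]
    return q, total_reward

if __name__ == "__main__":
    estimates, total = run_bandit([0.2, 0.8, 0.5])
    print("value estimates:", [round(v, 2) for v in estimates])
    print("total reward:", round(total, 1))
```

Smaller values of epsilon favor exploitation of the current estimates, while larger values favor exploration of under-sampled arms, which is the trade-off the document refers to as the exploration-exploitation dilemma.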