Introduction to Reinforcement Learning
These introductory notes cover bandit algorithms, Markov decision processes (MDPs), model-free methods, value function approximation, and policy optimization. For state-of-the-art advances, refer directly to the original papers and to some excellent blogs.
Reinforcement Learning Notes (a compilation of the following sections)
Section 5 Markov Decision Process
Section 6 Model-Free Prediction