link:
https://www.quora.com/What-are-the-best-books-about-reinforcement-learning
The main RL problems are related to:
- Information representation: from POMDPs to predictive state representations to deep learning to TD-networks
- Inverse RL: how to learn the reward?
- Algorithms
+ Off-policy
+ Large scale: linear and nonlinear approximations of the value function
+ Policy search vs. Q-learning based
- Beyond MDPs
+ Policy search for black-box optimization with global performance guarantees
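To make the "off-policy" and "Q-learning based" items above concrete, here is a minimal sketch of tabular Q-learning. The five-state chain environment, the `step` and `q_learning` names, and the hyperparameters are all hypothetical choices made for illustration; they are not taken from any of the books listed here. The agent behaves uniformly at random, yet the update bootstraps from the greedy action, which is what makes the method off-policy.

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal (hypothetical toy chain)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 0.1           # learning rate (assumed value)


def step(state, action):
    """Deterministic chain dynamics: reward 1 only on reaching the last state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=2000, seed=0):
    """Tabular Q-learning driven by a uniformly random behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # behaviour policy: uniformly random
            next_state, reward, done = step(state, action)
            # The target uses the greedy value of next_state, not the action
            # the behaviour policy will actually take: this is the off-policy part.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q = q_learning()
# Greedy policy extracted from the learned table, for non-terminal states only.
greedy_policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

On this toy chain the greedy policy extracted from `q` should move right in every non-terminal state, even though the data were collected by a random walk. A table like this does not scale to large state spaces, which is where the value-function approximations mentioned in the list come in.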
Recommended books:
* Algorithms for Reinforcement Learning: Csaba Szepesvári. A nice compendium of ready-to-implement algorithms.
* Reinforcement Learning and Dynamic Programming Using Function Approximators: Lucian Busoniu, Robert Babuska, Bart De Schutter, Damien Ernst (2010). This is a very practical book that explains some state-of-the-art algorithms (i.e., ones useful for real-world problems) such as fitted Q-iteration and its variations.
* Reinforcement Learning: State-of-the-Art. Vol. 12 of Adaptation, Learning, and Optimization. Wiering, M., van Otterlo, M. (Eds.), 2012. Springer, Berlin. In Sutton's words: "This book is a valuable resource for students wanting to go beyond the older textbooks and for researchers wanting to easily catch up with recent developments".
* Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles: Draguna Vrabie, Kyriakos G. Vamvoudakis, Frank L. Lewis. I am not familiar with this one, but I have seen it recommended.
* Markov Decision Processes in Artificial Intelligence: Sigaud O. & Buffet O. (Eds.), ISTE Ltd., Wiley and Sons Inc., 2010.
There are also several good specialized monographs and surveys on the topic. Some of these are:
+ "From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning" by Rémi Munos (Foundations and Trends in Machine Learning). This monograph covers important optimistic methods for nonconvex optimization that can be applied to policy search.
+ "Reinforcement Learning in Robotics: A Survey" by J. Kober, J. A. Bagnell and J. Peters.
+ "A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning" by A. Geramifard, T. J. Walsh, S. Tellex, G. Chowdhary, N. Roy and J. P. How (Foundations and Trends in Machine Learning).
+ "A Survey on Policy Search for Robotics" by M. P. Deisenroth, G. Neumann and J. Peters (Foundations and Trends in Robotics).