A Conceptual Introduction to Policy Gradient Methods

Reinforcement learning algorithms tend to fall into two distinct categories: value-based and policy-based learning. Q-learning and its deep neural network implementation, Deep Q-learning, are examples of the former. Policy gradient methods, as one might guess from the name, are examples of the latter. While value-based methods have shown performance that …