Author: phil

A Conceptual Introduction to Policy Gradient Methods

Reinforcement learning algorithms tend to fall into two distinct categories: value-based and policy-based learning. Q-learning and its deep neural network implementation, deep Q-learning, are examples of the former. Policy gradient methods, as one might guess from the name, are examples of the latter. While value-based methods have matched or exceeded human-level play in a variety of environments, they suffer from some significant drawbacks that limit their viability across the broader array…
