CS234 / Lecture 8 - Policy Gradient I

Policy-Based Reinforcement Learning

Value-Based vs Policy-Based RL

Advantages + Disadvantages of Policy-Based RL

Policy Objective Functions

Policy Optimization

Policy Gradient

Compute Gradients By Finite Differences
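
The notes give only the heading here, so the following is a minimal sketch of the standard forward-difference estimate rather than the lecture's own code. For each parameter dimension k, it perturbs theta by a small epsilon in that dimension and approximates the partial derivative as (J(theta + epsilon * u_k) - J(theta)) / epsilon, where u_k is the unit vector for dimension k. The names `finite_difference_gradient`, `toy_objective`, and the default `epsilon` are assumptions made for illustration; in a real setting the objective would be a (noisy) estimate of the policy's expected return obtained from rollouts.

```python
def finite_difference_gradient(objective, theta, epsilon=1e-5):
    """Forward-difference estimate of the gradient of `objective` at `theta`.

    `theta` is a flat sequence of floats. Uses len(theta) + 1 objective
    evaluations: one at theta and one per perturbed dimension.
    """
    base_value = objective(theta)
    gradient = []
    for k in range(len(theta)):
        perturbed = list(theta)      # copy so the caller's theta is untouched
        perturbed[k] += epsilon      # step along the k-th unit vector
        gradient.append((objective(perturbed) - base_value) / epsilon)
    return gradient


def toy_objective(theta):
    # Hypothetical stand-in for a policy's expected return,
    # maximised at theta = (1.0, -2.0). Not from the lecture.
    return -((theta[0] - 1.0) ** 2) - (theta[1] + 2.0) ** 2


estimate = finite_difference_gradient(toy_objective, [0.0, 0.0])
print(estimate)  # close to the analytic gradient (2, -4)
```

This approach needs no derivative of the policy, so it works even for non-differentiable policies, but it costs one extra objective evaluation per parameter and is sensitive to noise in those evaluations.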

Compute the Gradient Analytically

Likelihood Ratio Policies
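
The notes stop at the heading, so the identity below is a sketch of the standard likelihood ratio (score function) trick, written in assumed notation: \pi_\theta(s,a) for a differentiable policy, \tau for a trajectory, R(\tau) for its return, and P(\tau;\theta) for the trajectory probability under the policy. None of these symbols appear in the notes themselves.

```latex
% Likelihood ratio identity: valid wherever \pi_\theta(s,a) > 0.
\nabla_\theta \pi_\theta(s,a)
  = \pi_\theta(s,a)\,\frac{\nabla_\theta \pi_\theta(s,a)}{\pi_\theta(s,a)}
  = \pi_\theta(s,a)\,\nabla_\theta \log \pi_\theta(s,a)

% Applied to the objective V(\theta) = \sum_\tau P(\tau;\theta)\,R(\tau):
\nabla_\theta V(\theta)
  = \sum_\tau P(\tau;\theta)\,R(\tau)\,\nabla_\theta \log P(\tau;\theta)
  = \mathbb{E}_{\tau \sim P(\cdot;\theta)}\!\left[ R(\tau)\,\nabla_\theta \log P(\tau;\theta) \right]
```

The term \nabla_\theta \log \pi_\theta(s,a) is usually called the score function. Writing the gradient as an expectation is what allows it to be estimated from sampled trajectories.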