Category 3 - Actor-Critic

In Actor-Critic methods, we learn both a policy (the actor) and a value function (the critic), combining the policy-based and value-based approaches. This method offers the best of both worlds: