L4 TRPO and PPO (Foundations of Deep RL Series)

Published 2021-08-24
