What is the Proximal Policy Optimization (PPO) algorithm in reinforcement learning?
