Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
