Interesting research papers I have read (and my notes):
- Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
- Decision Transformer: Reinforcement Learning via Sequence Modeling
- Bridging the Gap Between Value and Policy Based Reinforcement Learning
- Action-dependent Control Variates for Policy Optimization via Stein’s Identity
- Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
- Implicit Quantile Networks for Distributional Reinforcement Learning
- Distributional Reinforcement Learning with Quantile Regression
- A Distributional Perspective on Reinforcement Learning
- Addressing Function Approximation Error in Actor-Critic Methods
- Continuous Control with Deep Reinforcement Learning
- Deterministic Policy Gradient Algorithms
- Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
- Sample Efficient Actor-Critic with Experience Replay
- Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
- Proximal Policy Optimization Algorithms
- Emergence of Locomotion Behaviours in Rich Environments
- High-Dimensional Continuous Control Using Generalized Advantage Estimation
- Trust Region Policy Optimization
- Asynchronous Methods for Deep Reinforcement Learning
- Rainbow: Combining Improvements in Deep Reinforcement Learning
- Prioritized Experience Replay
- Deep Reinforcement Learning with Double Q-learning
- Dueling Network Architectures for Deep Reinforcement Learning
- Deep Recurrent Q-Learning for Partially Observable MDPs
- Playing Atari with Deep Reinforcement Learning
- Extensibility, Safety, and Performance in the SPIN Operating System
- On Micro-Kernel Construction
- Exokernel: An Operating System Architecture for Application-Level Resource Management