Bao Xi, Rui Wang, Ying-Hao Cai, Tao Lu, Shuo Wang. A Novel Heterogeneous Actor-critic Algorithm with Recent Emphasizing Replay Memory[J]. Machine Intelligence Research, 2021, 18(4): 619-631. DOI: 10.1007/s11633-021-1296-x

A Novel Heterogeneous Actor-critic Algorithm with Recent Emphasizing Replay Memory

Reinforcement learning (RL) algorithms have been demonstrated to solve a variety of continuous control tasks. However, the training efficiency and performance of such methods limit further applications. In this paper, we propose an off-policy heterogeneous actor-critic (HAC) algorithm, which combines a soft Q-function and an ordinary Q-function. The soft Q-function encourages the exploration of a Gaussian policy, while the ordinary Q-function optimizes the mean of the Gaussian policy to improve training efficiency. Experience replay memory is another vital component of off-policy RL methods. We propose a new sampling technique that emphasizes recently experienced transitions to boost policy training. In addition, we integrate HAC with hindsight experience replay (HER) to handle sparse-reward tasks, which are common in the robotic manipulation domain. Finally, we evaluate our methods on a series of continuous control benchmark tasks and robotic manipulation tasks. The experimental results show that our method outperforms prior state-of-the-art methods in terms of training efficiency and performance, which validates its effectiveness.
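The recency-emphasizing sampling mentioned in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the paper's exact formulation: the geometric window-decay schedule and the hyper-parameters `eta` and `c_min` are assumptions chosen for the example. The idea is that, across the gradient updates performed after an episode, later updates draw their mini-batches from a progressively smaller window of the newest transitions.

```python
import random
from collections import deque

def sample_recent_emphasis(buffer, batch_size, num_updates, k,
                           eta=0.996, c_min=64):
    """Sample a mini-batch that emphasizes recent transitions.

    For the k-th of num_updates gradient steps, sample uniformly from the
    most recent c_k transitions, where c_k shrinks geometrically with k so
    that later updates focus on newer experience. eta and c_min are
    illustrative hyper-parameters (assumptions, not from the paper).
    """
    n = len(buffer)
    # Shrinking window: c_k = n * eta^(k * 1000 / num_updates), floored at c_min.
    c_k = max(int(n * eta ** (k * 1000.0 / num_updates)), c_min)
    c_k = min(c_k, n)
    recent = list(buffer)[-c_k:]  # newest c_k transitions
    return random.sample(recent, min(batch_size, len(recent)))
```

With this schedule, the first updates after an episode still see almost the whole buffer, while the final updates sample almost exclusively from transitions collected in the most recent episodes.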
