Seminal work: 《Playing Atari with Deep Reinforcement Learning》 (NIPS 2013 Deep Learning Workshop)
http://export.arxiv.org/pdf/1312.5602
《Human-level control through deep reinforcement learning》 https://www.cs.swarthmore.edu/~meeden/cs63/s15/nature15b.pdf
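Uses two networks, an online network and a target network, which reduces the correlation between the current Q-values and their training targets. At fixed intervals, the target network's parameters are replaced with those of the online network.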
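The periodic parameter replacement described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: a small Q-table stands in for the neural network weights, and the names `train`, `td_target`, `GAMMA`, `LR`, and `SYNC_EVERY`, along with their values, are assumptions chosen for the example.

```python
import copy

GAMMA = 0.99       # discount factor (assumed value)
LR = 0.1           # learning rate (assumed value)
SYNC_EVERY = 100   # steps between target updates (assumed value)


def make_q_table(n_states, n_actions):
    # A table stands in for the network parameters in this sketch.
    return [[0.0] * n_actions for _ in range(n_states)]


def td_target(target_q, reward, next_state, done):
    # The bootstrap value comes from the frozen target copy,
    # not from the parameters currently being updated.
    if done:
        return reward
    return reward + GAMMA * max(target_q[next_state])


def train(transitions, n_states, n_actions, sync_every=SYNC_EVERY):
    online_q = make_q_table(n_states, n_actions)
    target_q = copy.deepcopy(online_q)
    syncs = 0
    for step, (s, a, r, s2, done) in enumerate(transitions, start=1):
        y = td_target(target_q, r, s2, done)
        # Move the online estimate toward the target.
        online_q[s][a] += LR * (y - online_q[s][a])
        # Every sync_every steps, replace the target parameters
        # with a copy of the online parameters.
        if step % sync_every == 0:
            target_q = copy.deepcopy(online_q)
            syncs += 1
    return online_q, target_q, syncs
```

Between synchronizations the target copy stays fixed, so the value being regressed toward does not shift with every update of the online parameters.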
《Deep Reinforcement Learning with Double Q-learning》 https://arxiv.org/pdf/1509.06461.pdf
Original article: https://www.cnblogs.com/zle1992/p/10287200.html