In most actor-critic (AC) algorithms, we fit only the state-value function V(s). Fitting the Q-function as well is less common.
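A short derivation may help explain why V alone is usually enough. The symbols below (policy \pi, discount \gamma, reward r, fitted critic \hat{V}) are standard notation assumed here, not taken from the note itself. This is a sketch of the common single-sample approximation, not the only possible estimator.

```latex
\begin{align*}
Q^{\pi}(s_t, a_t) &= r(s_t, a_t) + \gamma\, \mathbb{E}_{s_{t+1}}\!\left[ V^{\pi}(s_{t+1}) \right] \\
                  &\approx r(s_t, a_t) + \gamma\, \hat{V}(s_{t+1}) \\
A^{\pi}(s_t, a_t) &= Q^{\pi}(s_t, a_t) - V^{\pi}(s_t) \\
                  &\approx r(s_t, a_t) + \gamma\, \hat{V}(s_{t+1}) - \hat{V}(s_t)
\end{align*}
```

Because the advantage can be estimated from \hat{V} and one observed transition, a separate Q-function approximator is not strictly required.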
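Batch: offline, trained on Monte Carlo returns from complete trajectories. Online: trained at every step on bootstrapped temporal-difference (TD) targets.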
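The two kinds of regression target for the value function can be sketched as follows. This is a minimal illustration under assumptions: the function names `monte_carlo_targets` and `td_target`, the `done` flag, and the discount `gamma` are hypothetical and do not come from the note.

```python
def monte_carlo_targets(rewards, gamma):
    """Batch / offline: discounted reward-to-go for each step of one finished episode."""
    targets = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each target reuses the return of the following step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        targets[t] = running
    return targets


def td_target(reward, next_value, gamma, done=False):
    """Online: bootstrapped one-step target r + gamma * V(s').

    `next_value` is the current critic's estimate of V(s') (an assumption of
    this sketch); it is dropped when the episode has terminated.
    """
    return reward + (0.0 if done else gamma * next_value)


if __name__ == "__main__":
    print(monte_carlo_targets([1.0, 1.0, 1.0], gamma=0.5))
    print(td_target(1.0, next_value=2.0, gamma=0.5))
```

The Monte Carlo version needs the whole episode before it can produce any target, while the TD version needs only one transition and the current value estimate, which is what makes a per-step online update possible.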
Original post: https://www.cnblogs.com/ecoflex/p/9092566.html
Time: 2024-07-31 01:04:04