Deep RL Bootcamp Frontiers Lecture I: Recent Advances,

high bias

once the robot has learnt something (no further changes appear across iterations), it performs well on that task

however, in real-world tasks, the task can change slightly, and then the robot fails to generalize.

no matter how well we train the robot across different situations, there's always something that happens and messes up the system.

14 robots, sharing their experience with each other

here the goal is to grasp anything; no specific task is defined

a few clips of our best trained neural network, picking up four objects that are visually kind of similar. They are all blue and roughly the same size, roughly rectangular.

imitation learning

push the green teddy bear to the red spot

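Finally finished. Next, I need to:

1. Learn to use TensorFlow and PyTorch

2. Get hands-on practice with CNNs, RNNs, and GANs by doing projects

(about 20 days)

3. Learn basic ML (about 4 days)

4. Learn Raspberry Pi and Arduino (about 4 days)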

Original article: https://www.cnblogs.com/ecoflex/p/8991605.html

Time: 2024-10-07 23:18:51
