【CV】ICML2015_Unsupervised Learning of Video Representations using LSTMs


Note: these are learning notes on a new LSTM architecture used for unsupervised learning of video representations.

(For more unsupervised-learning-related topics, see:

Learning Temporal Embeddings for Complex Video Analysis

Unsupervised Learning of Visual Representations using Videos

Unsupervised Visual Representation Learning by Context Prediction)

Link: http://arxiv.org/abs/1502.04681

Motivation:

- Understanding temporal sequences is important for solving many video-related problems. We should exploit the temporal structure of videos as a supervisory signal for unsupervised learning.

Proposed model:

In this paper, the authors propose three LSTM-based models:

1) LSTM Autoencoder Model:

  This model is composed of two parts: an encoder and a decoder.

  The encoder reads in a sequence of frames, and the learned representation it produces is copied to the decoder as its initial state. The decoder then reconstructs the input frames in reverse order.

  (This is the unconditioned version; a conditioned version additionally feeds the decoder's last generated output back in as input, shown as the dashed boxes below.)

Intuition: The reconstruction task requires the network to capture information about the appearance of objects and the background, which is exactly the information we would like the representation to contain.
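The encoder/decoder structure above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: a plain tanh RNN cell stands in for the LSTM, the weights are random and untrained, and all names (`W_xh`, `encode`, `decode`, etc.) are my own.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, T = 4, 8, 5          # frame dim, hidden dim, sequence length

W_xh = rng.normal(scale=0.1, size=(H, D))  # input -> hidden
W_hh = rng.normal(scale=0.1, size=(H, H))  # hidden -> hidden
W_hy = rng.normal(scale=0.1, size=(D, H))  # hidden -> output frame

def encode(frames):
    """Run the encoder over the input frames; the final state is the representation."""
    h = np.zeros(H)
    for x in frames:
        h = np.tanh(W_xh @ x + W_hh @ h)
    return h

def decode(h, steps):
    """Unconditioned decoder: unrolls from the copied state alone, no inputs fed back."""
    outputs = []
    for _ in range(steps):
        h = np.tanh(W_hh @ h)
        outputs.append(W_hy @ h)
    return np.stack(outputs)

frames = rng.normal(size=(T, D))   # stand-in for T video frames
state = encode(frames)             # learned representation, copied to the decoder
recon = decode(state, T)           # reconstruction of the input...
target = frames[::-1]              # ...compared against the frames in REVERSE order
loss = np.mean((recon - target) ** 2)
```

A conditioned decoder would replace `np.tanh(W_hh @ h)` with `np.tanh(W_xh @ prev_output + W_hh @ h)`, feeding the previous generated frame back in.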

2) LSTM Future Predictor Model:

  This model is similar to the one above. The main difference is the output: this model predicts the frames that come just after the input sequence. It likewise comes in conditioned and unconditioned versions, as described above.

Intuition: In order to predict the next few frames correctly, the model needs information about which objects are present and how they are moving so that the motion can be extrapolated.
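The two objectives differ only in how the target sequence is constructed from a clip. A tiny sketch of the data preparation (integers stand in for frames; the split point `k` is my choice, not the paper's):

```python
clip = list(range(10))        # stand-in for a 10-frame video clip
k = 6                         # number of frames fed to the encoder

inputs = clip[:k]             # what the encoder sees
recon_target = inputs[::-1]   # autoencoder target: the input, reversed
future_target = clip[k:]      # future-predictor target: the next frames
```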

3) A Composite Model:

  This model combines input reconstruction and future prediction to form a more powerful model. The two modules share the same encoder, which encodes the input sequence into a feature vector that is copied to both decoders.

Intuition: this single encoder learns representations that contain not only the static appearance of objects and the background, but also dynamic information such as which objects are moving and how.
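Structurally, the composite model is one shared encoder feeding two separate decoders, trained on the sum of the two losses. The sketch below reuses the same toy tanh-RNN stand-in for the LSTM; weights are random and untrained, and all names are illustrative assumptions, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(1)
D, H, T_in, T_fut = 4, 8, 5, 3   # frame dim, hidden dim, input/future lengths

W_xh = rng.normal(scale=0.1, size=(H, D))
W_hh = rng.normal(scale=0.1, size=(H, H))
# Each decoder branch gets its own recurrence (U) and output (V) weights.
dec = {name: (rng.normal(scale=0.1, size=(H, H)),
              rng.normal(scale=0.1, size=(D, H)))
       for name in ("reconstruct", "predict")}

def encode(frames):
    h = np.zeros(H)
    for x in frames:
        h = np.tanh(W_xh @ x + W_hh @ h)
    return h

def decode(name, h, steps):
    U, V = dec[name]
    out = []
    for _ in range(steps):
        h = np.tanh(U @ h)
        out.append(V @ h)
    return np.stack(out)

clip = rng.normal(size=(T_in + T_fut, D))
inputs, future = clip[:T_in], clip[T_in:]

h = encode(inputs)                       # one shared representation
loss_recon = np.mean((decode("reconstruct", h, T_in) - inputs[::-1]) ** 2)
loss_future = np.mean((decode("predict", h, T_fut) - future) ** 2)
loss = loss_recon + loss_future          # joint objective for the shared encoder
```

Because both losses backpropagate into the same `encode`, the shared representation is pushed to retain appearance (for reconstruction) and motion (for prediction) at once.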
