Neural Networks Used in Machine Learning (6) -- Lecture 2

An overview of the main types of neural network architecture

What is the architecture of a neural network?

The way the neurons in the network are organized and connected.

1. The type of architecture most commonly used in practical applications is the feed-forward neural network: information comes in at the input units and flows in one direction, through the hidden layers, until it reaches the output units.

2. A much more interesting type of architecture is the recurrent neural network, in which information can flow around in cycles. These networks can remember information for a long time and can exhibit all sorts of interesting oscillations, but they are much more difficult to train, in part because what they can do is so much more complicated. Recently, however, there has been a lot of progress in training recurrent networks, and they can now do very impressive things.

3. The final type of architecture is the symmetrically-connected network, in which the weights between two units are the same in both directions.

Feed-forward neural networks

These are the commonest type of neural network in practical applications.
– The first layer is the input and the last layer is the output.
– If there is more than one hidden layer, we call them "deep" neural networks.
• They compute a series of transformations between their input and output, so at each layer you get a new representation of the input, in which things that were similar in the previous layer may have become less similar, and things that were dissimilar in the previous layer may have become more similar. In speech recognition, for example, we would like the same thing said by different speakers to become more similar, and different things said by the same speaker to become less similar, as we go up through the layers of the network.
– In order to achieve this, we need the activities of the neurons in each layer to be a non-linear function of the activities in the layer below.
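The forward pass described here can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the lecture; the layer sizes and the choice of a logistic non-linearity are assumptions.

```python
import math
import random

def logistic(x):
    # A common non-linear activation: squashes any real number into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # Each unit's activity is a non-linear function of a weighted sum
    # of the activities in the layer below.
    return [logistic(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def feed_forward(inputs, network):
    # Information flows in one direction: input -> hidden layers -> output.
    activities = inputs
    for weights, biases in network:
        activities = layer(activities, weights, biases)
    return activities

# A tiny 2-3-1 network with random weights (hypothetical sizes).
random.seed(0)
net = [
    ([[random.gauss(0, 1) for _ in range(2)] for _ in range(3)], [0.0] * 3),  # hidden layer
    ([[random.gauss(0, 1) for _ in range(3)]], [0.0]),                        # output layer
]
print(feed_forward([0.5, -1.0], net))  # a single output activity in (0, 1)
```

Because the logistic function is non-linear, each layer can genuinely re-represent its input; stacking only linear layers would collapse to a single linear transformation.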

Recurrent networks
• They are much more powerful than feed-forward networks: they have directed cycles in their connection graph.
– That means you can sometimes get back to where you started by following the arrows.
• They can have complicated dynamics, and this can make them very difficult to train.
– There is a lot of interest at present in finding efficient ways of training recurrent nets.

• They are also more biologically realistic.

Recurrent nets with multiple hidden layers are just a special case of a general recurrent neural net that has some of its hidden-to-hidden connections missing.

Recurrent neural networks for modeling sequences
Recurrent neural networks are a very natural way to model sequential data: all we need to do is add connections between the hidden units.
– They are equivalent to very deep nets with one hidden layer per time slice; the hidden units behave like a network that is very deep in time, and the state of the hidden units at each time step determines the state of the hidden units at the next time step.
– Except that they use the same weights at every time slice, and they get input at every time slice.

One way in which they differ from feed-forward networks is that the same weights are used at every time step. In the lecture slide, the red arrows show the hidden units determining the hidden units at the next time step, and the weight matrix those arrows describe is identical at every step. The networks also receive input at every time step and give output at every time step, again through shared weight matrices.

• They have the ability to remember information in their hidden state for a long time.
– But it's very hard to train them to use this potential.
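The weight sharing across time slices can be sketched as follows. This is a minimal illustration with made-up sizes; the names Whh, Wxh and the tanh non-linearity are assumptions, not the lecture's notation.

```python
import math

def rnn_step(h_prev, x, Whh, Wxh, bh):
    # The SAME Whh and Wxh are reused at every time slice: the hidden
    # state at time t is determined by the hidden state at t-1 and the
    # input at t, each through one shared weight matrix.
    pre = [sum(w * h for w, h in zip(Whh[i], h_prev)) +
           sum(w * xi for w, xi in zip(Wxh[i], x)) + bh[i]
           for i in range(len(bh))]
    return [math.tanh(p) for p in pre]

def run_rnn(xs, h0, Whh, Wxh, bh):
    # Unrolling over the sequence is equivalent to a very deep
    # feed-forward net with one layer per time slice -- but with
    # tied weights and fresh input at every slice.
    h = h0
    states = []
    for x in xs:  # one "layer" per time slice
        h = rnn_step(h, x, Whh, Wxh, bh)
        states.append(h)
    return states

# Two hidden units, one input unit, a three-step sequence (toy values).
Whh = [[0.5, -0.3], [0.1, 0.8]]
Wxh = [[1.0], [-1.0]]
bh = [0.0, 0.0]
states = run_rnn([[1.0], [0.0], [1.0]], [0.0, 0.0], Whh, Wxh, bh)
print(states[-1])
```

Note that only Whh and Wxh exist, no matter how long the sequence is; a feed-forward net of the same unrolled depth would need a separate weight matrix per layer.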

An example of what recurrent neural nets can now do
Ilya Sutskever (2011) trained a special type of recurrent neural net to predict the next character in a sequence. He trained it on lots and lots of strings from English Wikipedia: it sees English characters and tries to predict the next English character. He actually used 86 different characters, to allow for punctuation, digits, capital letters and so on. After training, one way of seeing how well it does is to check whether it assigns high probability to the next character that actually occurs. Another way is to get it to generate text: you give it a string of characters, get it to predict the probabilities for the next character, and then pick the next character from that probability distribution. It's no use always picking the most likely character; if you do that, after a while it starts saying "the United States of the United States of the United States of the United States". That tells you something about Wikipedia.

Some text generated one character at a time by Ilya Sutskever’s recurrent neural network
In 1974 Northern Denver had been overshadowed by CNL, and several Irish intelligence agencies in the Mediterranean
region. However, on the Victoria, Kings Hebrew stated that Charles decided to
escape during an alliance. The mansion house was completed in 1882, the second in
its bridge are omitted, while closing is the proton reticulum composed below it aims,
such that it is the blurring of appearing on any well-paid type of box printer.
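The generation loop described above, sampling the next character from the predicted distribution instead of always taking the most likely one, can be sketched as below. The fixed bigram-style "model" here is a toy stand-in for the trained recurrent net, not Sutskever's actual code.

```python
import random

def sample_from(dist):
    # Draw a character according to its predicted probability, rather
    # than always picking the argmax (which quickly gets stuck in loops
    # like "the United States of the United States of ...").
    r = random.random()
    cumulative = 0.0
    for ch, p in dist.items():
        cumulative += p
        if r < cumulative:
            return ch
    return ch  # guard against floating-point round-off

def generate(predict_next, seed, length):
    # Feed the text so far to the model, sample one character,
    # append it, and repeat.
    text = seed
    for _ in range(length):
        text += sample_from(predict_next(text))
    return text

# Toy stand-in for the trained net: a fixed distribution over 3 characters.
toy_model = lambda text: {"a": 0.5, "b": 0.3, "c": 0.2}
random.seed(1)
print(generate(toy_model, "a", 10))
```

Sampling keeps the output varied while still favoring likely continuations, which is exactly why the generated Wikipedia-style text above is fluent but never repeats a fixed phrase forever.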

Symmetrically-connected networks

These are like recurrent networks, but the connections between units are symmetrical (they have the same weight in both directions).
– John Hopfield (and others) realized that symmetric networks are much easier to analyze than recurrent networks.
– They are also more restricted in what they can do, because they obey an energy function.
   For example, they cannot model cycles.
• Symmetrically connected nets without hidden units are called “Hopfield nets”
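The energy function that constrains a symmetric network can be sketched for a Hopfield net. With binary states s_i in {-1, +1}, the energy is E = -Σ_{i<j} w_ij s_i s_j, and because the weights are symmetric, updating one unit at a time can never increase it. The specific weights below are made up for illustration.

```python
def energy(states, W):
    # E = -sum over pairs i<j of w_ij * s_i * s_j
    # (W must be symmetric: W[i][j] == W[j][i], zero diagonal).
    n = len(states)
    return -sum(W[i][j] * states[i] * states[j]
                for i in range(n) for j in range(i + 1, n))

def update_unit(states, W, i):
    # Set unit i to the sign of its total input; with symmetric
    # weights this step can only lower (or keep) the energy.
    total = sum(W[i][j] * states[j] for j in range(len(states)) if j != i)
    states[i] = 1 if total >= 0 else -1

W = [[0, 1, -2],
     [1, 0, 3],
     [-2, 3, 0]]
s = [1, -1, 1]
e0 = energy(s, W)
update_unit(s, W, 1)
e1 = energy(s, W)
print(e0, e1)  # -> 6 -2  (the asynchronous update lowered the energy)
```

Because every update moves downhill in energy, the network always settles into a stable state; this is what makes symmetric nets easy to analyze, and also why they cannot model cycles of activity the way a general recurrent net can.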
Symmetrically connected networks with hidden units
• These are called “Boltzmann machines”.
– They are much more powerful models than Hopfield nets.
– They are less powerful than recurrent neural networks.
– They have a beautifully simple learning algorithm.
• We will cover Boltzmann machines towards the end of the course.

Posted: 2024-10-09 23:31:18
