
Training a neural network with neuralnet


A neural network is built from an interconnected group of nodes and involves inputs, connection weights, processing elements, and outputs. Neural networks can be applied to many areas, such as classification, clustering, and prediction. To train a neural network in R, you can use neuralnet, which is built to train multilayer perceptrons in the context of regression analysis and provides flexible functions for training feed-forward neural networks. In this recipe, we will introduce how to use neuralnet to train a neural network.


Getting ready

In this recipe, we will use the iris dataset as our example dataset. We will first split the iris dataset into training and testing datasets.


How to do it...

Perform the following steps to train a neural network with neuralnet:


  1. First load the  iris dataset and split the data into training and testing datasets:


> data(iris)

> ind = sample(2, nrow(iris), replace = TRUE, prob=c(0.7, 0.3))

> trainset = iris[ind == 1,]

> testset = iris[ind == 2,]

2. Then, install and load the  neuralnet package:


> install.packages("neuralnet")

> library(neuralnet)

3. Add the columns versicolor, setosa, and virginica based on the name matched value

in the  Species column:


> trainset$setosa = trainset$Species == "setosa"

> trainset$virginica = trainset$Species == "virginica"

> trainset$versicolor = trainset$Species == "versicolor"

4. Next, train the neural network with the neuralnet function, using three hidden neurons in a single hidden layer. Notice that the results may vary with each training run, so you might not get the same result. However, you can call set.seed at the beginning to get the same result in every training process:

> network = neuralnet(versicolor + virginica + setosa ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, trainset, hidden = 3)


> network

Call: neuralnet(formula = versicolor + virginica + setosa ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = trainset, hidden = 3)

1 repetition was calculated.

         Error Reached Threshold Steps
1 0.8156100175    0.009994274769 11063

5. Now, you can view the  summary information by accessing the  result.matrix

attribute of the built neural network model:


> network$result.matrix

error 0.815610017474

reached.threshold 0.009994274769

steps 11063.000000000000

Intercept.to.1layhid1 1.686593311644

Sepal.Length.to.1layhid1 0.947415215237

Sepal.Width.to.1layhid1 -7.220058260187

Petal.Length.to.1layhid1 1.790333443486

Petal.Width.to.1layhid1 9.943109233330

Intercept.to.1layhid2 1.411026063895

Sepal.Length.to.1layhid2 0.240309549505

Sepal.Width.to.1layhid2 0.480654059973

Petal.Length.to.1layhid2 2.221435192437

Petal.Width.to.1layhid2 0.154879347818

Intercept.to.1layhid3 24.399329878242

Sepal.Length.to.1layhid3 3.313958088512

Sepal.Width.to.1layhid3 5.845670010464

Petal.Length.to.1layhid3 -6.337082722485

Petal.Width.to.1layhid3 -17.990352566695

Intercept.to.versicolor -1.959842102421

1layhid.1.to.versicolor 1.010292389835

1layhid.2.to.versicolor 0.936519720978

1layhid.3.to.versicolor 1.023305801833

Intercept.to.virginica -0.908909982893

1layhid.1.to.virginica -0.009904635231

1layhid.2.to.virginica 1.931747950462

1layhid.3.to.virginica -1.021438938226

Intercept.to.setosa 1.500533827729

1layhid.1.to.setosa -1.001683936613

1layhid.2.to.setosa -0.498758815934

1layhid.3.to.setosa -0.001881935696

6. Lastly, you can view the generalized weights by accessing them in the network:


> head(network$generalized.weights[[1]])

How it works...

A neural network is a network made up of artificial neurons (or nodes). There are three types of neurons within the network: input neurons, hidden neurons, and output neurons. In the network, neurons are connected; the connection strength between two neurons is called a weight. If the weight is greater than zero, the connection is excitatory; otherwise, it is inhibitory. Input neurons receive the input information; the higher the input value, the greater the activation. Then, the activation value is passed through the network according to the weights and transfer functions in the graph. The hidden neurons (or output neurons) sum up the weighted activation values and modify the summed values with the transfer function. The activation value then flows through the hidden neurons and stops when it reaches the output nodes. As a result, one can use the output values from the output neurons to classify the data.
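The flow of activation described above can be sketched in base R. This is only an illustration: the weights are copied (rounded to three decimals) from the result.matrix printed in this recipe, the logistic transfer function and the linear output layer are assumed because they are neuralnet's defaults, and the helper names logistic, w_hidden, w_output, and forward are made up for this sketch.

```r
# Logistic transfer function (assumed: neuralnet's default act.fct)
logistic <- function(x) 1 / (1 + exp(-x))

# Input-to-hidden weights, rounded from the printed result.matrix.
# Rows: intercept, Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
# Columns: hidden neurons 1 to 3
w_hidden <- matrix(c(
   1.687,  1.411,  24.399,
   0.947,  0.240,   3.314,
  -7.220,  0.481,   5.846,
   1.790,  2.221,  -6.337,
   9.943,  0.155, -17.990
), nrow = 5, byrow = TRUE)

# Hidden-to-output weights. Rows: intercept, hidden neurons 1 to 3
w_output <- matrix(c(
  -1.960, -0.909,  1.501,
   1.010, -0.010, -1.002,
   0.937,  1.932, -0.499,
   1.023, -1.021, -0.002
), nrow = 4, byrow = TRUE,
dimnames = list(NULL, c("versicolor", "virginica", "setosa")))

forward <- function(x) {
  # Hidden layer: weighted sum of the inputs, then the transfer function
  hidden <- logistic(c(1, x) %*% w_hidden)
  # Output layer: weighted sum only (assumed linear.output = TRUE)
  drop(cbind(1, hidden) %*% w_output)
}

x <- unlist(iris[1, 1:4])  # first observation, a setosa flower
forward(x)
```

For this observation the setosa output is the largest of the three, so the network would label the flower as setosa.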


The advantages of a neural network are as follows: first, it can detect nonlinear relationships between the dependent and independent variables. Second, its parallel architecture allows large datasets to be trained efficiently. Third, it is a nonparametric model, so it does not require assumptions about the distribution of the data. The main disadvantage of a neural network is that it often converges to a local minimum rather than the global minimum. Also, it might overfit when the training process goes on for too long.


In this recipe, we demonstrate how to train a neural network. First, we split the iris dataset into training and testing datasets, and then install the neuralnet package and load the library into an R session. Next, we add the columns versicolor, setosa, and virginica based on the name matched value in the Species column. We then use the neuralnet function to train the network model. Besides specifying the labels (the columns named versicolor, virginica, and setosa) and the training attributes in the function, we also set the number of hidden neurons (vertices) to three in a single hidden layer.
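The steps of this recipe can be collected into one script, sketched below. The seed value 2 is an arbitrary choice for illustration, the loop over species is just a compact way to add the three indicator columns, and the training call only runs if the neuralnet package happens to be installed.

```r
set.seed(2)  # arbitrary seed: fixes the random split and the starting weights

data(iris)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.7, 0.3))
trainset <- iris[ind == 1, ]
testset  <- iris[ind == 2, ]

# One logical indicator column per species (setosa, versicolor, virginica)
for (sp in levels(iris$Species)) {
  trainset[[sp]] <- trainset$Species == sp
}

# Train only when neuralnet is available on this machine
if (requireNamespace("neuralnet", quietly = TRUE)) {
  network <- neuralnet::neuralnet(
    versicolor + virginica + setosa ~
      Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
    data = trainset, hidden = 3
  )
  # result.matrix is only filled in when the training converged
  if (!is.null(network$result.matrix)) {
    print(network$result.matrix[c("error", "reached.threshold", "steps"), ])
  }
}
```

Because the split and the starting weights are random, the error and the number of steps will generally differ from the values printed in this recipe.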


Then, we examine the basic information about the training process and the trained network saved in network. The output shows that the training process needed 11,063 steps until all the absolute partial derivatives of the error function were lower than 0.01 (specified in the threshold argument). The error is the value of the error function at the end of training, which is based on the sum of squared errors by default. To see more detail, you can access the result.matrix of the built neural network to see the estimated weights. The output reveals that the estimated weights range from -17.99 to 24.40; the intercepts of the first hidden layer are 1.69, 1.41, and 24.40, and the four weights leading to the first hidden neuron are estimated as 0.95 (Sepal.Length), -7.22 (Sepal.Width), 1.79 (Petal.Length), and 9.94 (Petal.Width). Lastly, the trained neural network information includes generalized weights, which express the effect of each covariate. In this recipe, the model generates 12 generalized weights for each observation, which are the combinations of four covariates (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) with three responses (setosa, virginica, versicolor).


See also

For a more detailed introduction on neuralnet, one can refer to the following paper:

Günther, F., and Fritsch, S. (2010). neuralnet: Training of neural networks. The R Journal, 2(1), 30-38.


Visualizing a neural network trained by neuralnet


The neuralnet package provides the plot function to visualize a built neural network and the gwplot function to visualize generalized weights. In the following recipe, we will cover how to use these two functions.


Getting ready

You need to have completed the previous recipe by training a neural network and have all basic information saved in network.


How to do it...

Perform the following steps to visualize the neural network and the generalized weights:

  1. You can visualize the trained neural network with the  plot function:


> plot(network)

2. Furthermore, you can use gwplot to visualize the generalized weights:

> par(mfrow=c(2,2))

> gwplot(network,selected.covariate="Petal.Width")

> gwplot(network,selected.covariate="Sepal.Width")

> gwplot(network,selected.covariate="Petal.Length")

> gwplot(network,selected.covariate="Petal.Width")

How it works...

In this recipe, we demonstrate how to visualize the trained neural network and the generalized

weights of each trained attribute. As per Figure 10, the plot displays the network topology of

the trained neural network. Also, the plot includes the estimated weight, intercepts and basic

information about the training process. At the bottom of the figure, one can find the overall

error and number of steps required to converge.


Figure 11 presents the generalized weight plots based on network$generalized.weights. The four plots in Figure 11 display the four covariates, Petal.Width, Sepal.Width, Petal.Length, and Petal.Width, with regard to the versicolor response. If all the generalized weights are close to zero on the plot, the covariate has little effect. However, if the overall variance is greater than one, the covariate has a nonlinear effect.
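For reference, the generalized weight behind these plots can be written as follows. This is the definition given in the Günther and Fritsch paper cited in the previous recipe, where o(x) denotes the network output for a response and x_i the i-th covariate; the notation is chosen here for illustration.

```latex
\tilde{w}_i = \frac{\partial \log\left( \dfrac{o(\mathbf{x})}{1 - o(\mathbf{x})} \right)}{\partial x_i}
```

Read this way, a generalized weight is the contribution of covariate x_i to the log-odds at one observation, which is why values scattered near zero suggest little effect and a wide spread suggests an effect that changes across the data.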


See also

For more information about  gwplot , one can use the  help function to access the

following document:


> ?gwplot

Predicting labels based on a model trained by neuralnet


Similar to other classification methods, we can predict the labels of new observations based

on trained neural networks. Furthermore, we can validate the performance of these networks

through the use of a confusion matrix. In the following recipe, we will introduce how to use

the  compute function in a neural network to obtain a probability matrix of the testing dataset

labels, and use a table and confusion matrix to measure the prediction performance.


Getting ready

You need to have completed the previous recipe by generating the training dataset,  trainset ,

and the testing dataset,  testset . The trained neural network needs to be saved in the network.


How to do it...

Perform the following steps to measure the prediction performance of the trained neural network:



1. First, generate a prediction probability matrix based on a trained neural network and

the testing dataset,  testset :


> net.predict = compute(network, testset[-5])$net.result

2. Then, obtain the predicted labels by finding the column with the greatest probability:


> net.prediction = c("versicolor", "virginica", "setosa")[apply(net.predict, 1, which.max)]

3. Generate a classification table based on the predicted labels and the labels of the

testing dataset:


> predict.table = table(testset$Species, net.prediction)

> predict.table


             setosa versicolor virginica
  setosa         20          0         0
  versicolor      0         19         1
  virginica       0          2        16

4. Next, generate classAgreement (from the e1071 package) from the classification table:


> library(e1071)

> classAgreement(predict.table)

$diag

[1] 0.9444444444

$kappa

[1] 0.9154488518

$rand

[1] 0.9224318658

$crand

[1] 0.8248251737

5. Finally, use confusionMatrix (from the caret package) to measure the prediction performance:


> library(caret)

> confusionMatrix(predict.table)

Confusion Matrix and Statistics

             setosa versicolor virginica
  setosa         20          0         0
  versicolor      0         19         1
  virginica       0          2        16

Overall Statistics

Accuracy : 0.9482759

95% CI : (0.8561954, 0.9892035)

No Information Rate : 0.362069

P-Value [Acc > NIR] : < 0.00000000000000022204

Kappa : 0.922252

Mcnemar's Test P-Value : NA

Statistics by Class:

Class: setosa Class: versicolor Class:


Sensitivity 1.0000000 0.9047619


Specificity 1.0000000 0.9729730


Pos Pred Value 1.0000000 0.9500000


Neg Pred Value 1.0000000 0.9473684


Prevalence 0.3448276 0.3620690


Detection Rate 0.3448276 0.3275862


Detection Prevalence 0.3448276 0.3448276


Balanced Accuracy 1.0000000 0.9388674
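To see where these statistics come from, the accuracy and the versicolor figures can be recomputed by hand from the classification table. The sketch below copies the counts from the table printed above and follows caret's convention of reading rows as predictions and columns as the reference; since the table in this recipe was built with the actual labels in the rows, the per-class values should be interpreted with that in mind. The variable names are illustrative only.

```r
# Counts copied from the classification table printed in this recipe
predict.table <- matrix(
  c(20,  0,  0,
     0, 19,  1,
     0,  2, 16),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("setosa", "versicolor", "virginica"),
                  c("setosa", "versicolor", "virginica"))
)

n <- sum(predict.table)
accuracy <- sum(diag(predict.table)) / n  # share of counts on the diagonal

# One-vs-rest counts for versicolor (rows read as predictions)
cls <- "versicolor"
tp <- predict.table[cls, cls]
fp <- sum(predict.table[cls, ]) - tp
fn <- sum(predict.table[, cls]) - tp
tn <- n - tp - fp - fn

sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity), 7)
# accuracy 0.9482759, sensitivity 0.9047619, specificity 0.9729730
```

These three values agree with the Accuracy line and the versicolor column of the confusionMatrix output.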


How it works...

In this recipe, we demonstrate how to predict labels based on a model trained by neuralnet.

Initially, we use the  compute function to create an output probability matrix based on the

trained neural network and the testing dataset. Then, to convert the probability matrix to class

labels, we use the  which.max function to determine the class label by selecting the column

with the maximum probability within the row. Next, we use a table to generate a classification

matrix based on the labels of the testing dataset and the predicted labels. As we have

created the classification table, we can employ a confusion matrix to measure the prediction

performance of the built neural network.
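The conversion from output matrix to class labels can be tried on a small example. The matrix below is made up for illustration (it is not output from the trained network), and its columns are assumed to follow the order of the responses in the training formula: versicolor, virginica, setosa.

```r
# Hypothetical network outputs for three test observations (one per row)
net.predict <- matrix(
  c(0.02, 0.01,  0.99,
    0.90, 0.12,  0.00,
    0.30, 0.75, -0.01),
  nrow = 3, byrow = TRUE
)

# Column order must match the response order used when training
labels <- c("versicolor", "virginica", "setosa")

# For each row, find the column with the largest value, then look up its label
idx <- apply(net.predict, 1, which.max)
net.prediction <- labels[idx]
net.prediction
# "setosa" "versicolor" "virginica"
```

Note that the label vector has to be listed in the same order as the output columns; listing the species alphabetically instead would silently mislabel the predictions.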


See also

In this recipe, we use net.result, the overall result of the neural network, to predict the labels of the testing dataset. Apart from the overall result in net.result, the compute function also returns the output of the neurons in each layer. You can examine the output of the neurons to get a better understanding of how compute works:


> compute(network, testset[-5])

Training a neural network with nnet

The nnet package is another package that can deal with artificial neural networks. This package provides the functionality to train feed-forward neural networks with a single hidden layer. As most neural network functions were already covered with the neuralnet package, in this recipe we provide a short overview of how to train neural networks with nnet.


Getting ready

In this recipe, we do not use the trainset and testset generated in the previous recipe; please reload the iris dataset.


How to do it...

Perform the following steps to train the neural network with  nnet :


1. First, install and load the  nnet package:

> install.packages("nnet")

> library(nnet)

2. Next, split the dataset into training and testing datasets:

> data(iris)

> set.seed(2)

> ind = sample(2, nrow(iris), replace = TRUE, prob=c(0.7, 0.3))

> trainset = iris[ind == 1,]

> testset = iris[ind == 2,]

3. Then, train the neural network with  nnet :

> iris.nn = nnet(Species ~ ., data = trainset, size = 2, rang =

0.1, decay = 5e-4, maxit = 200)

# weights: 19

initial value 165.086674

iter 10 value 70.447976

iter 20 value 69.667465

iter 30 value 69.505739

iter 40 value 21.588943

iter 50 value 8.691760

iter 60 value 8.521214

iter 70 value 8.138961

iter 80 value 7.291365

iter 90 value 7.039209

iter 100 value 6.570987

iter 110 value 6.355346

iter 120 value 6.345511

iter 130 value 6.340208

iter 140 value 6.337271

iter 150 value 6.334285

iter 160 value 6.333792

iter 170 value 6.333578

iter 180 value 6.333498

final value 6.333471


4. Use the  summary to obtain information about the trained neural network:

> summary(iris.nn)

a 4-2-3 network with 19 weights

options were - softmax modelling decay=0.0005

b->h1 i1->h1 i2->h1 i3->h1 i4->h1

-0.38 -0.63 -1.96 3.13 1.53

b->h2 i1->h2 i2->h2 i3->h2 i4->h2

8.95 0.52 1.42 -1.98 -3.85

b->o1 h1->o1 h2->o1

3.08 -10.78 4.99

b->o2 h1->o2 h2->o2

-7.41 6.37 7.18

b->o3 h1->o3 h2->o3

4.33 4.42 -12.16

How it works...

In this recipe, we demonstrate the steps to train a neural network model with the nnet package. We first use nnet to train the neural network. With this function, we can set the classification formula, the source of data, the number of hidden units in the size parameter, the range of the initial random weights in the rang parameter, the weight decay in the decay parameter, and the maximum number of iterations in the maxit parameter. As we set maxit to 200, the training process runs until the value of the fitting criterion plus the decay term converges or 200 iterations are reached. Finally, we use the summary function to obtain information about the built neural network, which reveals that the model is a 4-2-3 network with 19 weights. Also, the summary lists the weight of each connection from one node to another at the bottom of the printed message.
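The figure of 19 weights follows directly from the 4-2-3 layout: every hidden and output unit has one weight per incoming connection plus a bias. A minimal sketch of that count is shown below; the variable names are illustrative, and the optional check against a fitted model only runs if the nnet package is installed.

```r
n_input  <- 4  # Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
n_hidden <- 2  # size = 2
n_output <- 3  # one output unit per species

# (inputs + bias) per hidden unit, plus (hidden units + bias) per output unit
n_weights <- (n_input + 1) * n_hidden + (n_hidden + 1) * n_output
n_weights
# 19

# Optional cross-check against a fitted model
if (requireNamespace("nnet", quietly = TRUE)) {
  set.seed(2)
  fit <- nnet::nnet(Species ~ ., data = iris, size = n_hidden,
                    rang = 0.1, decay = 5e-4, maxit = 200, trace = FALSE)
  stopifnot(length(fit$wts) == n_weights)
}
```

The same arithmetic explains the layout of the summary output: two rows of five weights for the hidden units and three rows of three weights for the output units.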


See also

For those who are interested in the background theory of  nnet and how it is made, please

refer to the following articles:


Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge.

Venables, W. N., and Ripley, B. D. (2002). Modern Applied Statistics with S. Fourth edition. Springer.

Predicting labels based on a model trained by nnet


As we have trained a neural network with nnet in the previous recipe, we can now predict the labels of the testing dataset based on the trained neural network. Furthermore, we can assess the model with a confusion matrix adapted from the caret package.


Getting ready

You need to have completed the previous recipe by generating the training dataset, trainset, and the testing dataset, testset, from the iris dataset. The trained neural network also needs to be saved as iris.nn.


How to do it...

How it works...

Similar to other classification methods, one can also predict labels based on the neural networks trained by nnet. First, we use the predict function to generate the predicted labels based on a testing dataset, testset. Within the predict function, we set the type argument to class, so the output will be class labels instead of a probability matrix. Next, we use the table function to generate a classification table based on the predicted labels and the labels in the testing dataset. Finally, as we have created the classification table, we can employ a confusion matrix from the caret package to measure the prediction performance of the trained neural network.
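The procedure described above can be sketched as follows. The object names iris.predict and nn.table are assumptions made for this sketch, the split and the nnet call repeat the settings of the previous recipe so that the block is self-contained, and the caret step falls back to a plain accuracy figure when caret is not installed.

```r
data(iris)
set.seed(2)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.7, 0.3))
trainset <- iris[ind == 1, ]
testset  <- iris[ind == 2, ]

if (requireNamespace("nnet", quietly = TRUE)) {
  library(nnet)
  iris.nn <- nnet(Species ~ ., data = trainset, size = 2, rang = 0.1,
                  decay = 5e-4, maxit = 200, trace = FALSE)

  # type = "class" returns labels; the default type = "raw" returns
  # one column of class probabilities per species
  iris.predict <- predict(iris.nn, testset, type = "class")
  print(head(predict(iris.nn, testset)))

  # Fix the factor levels so the table is square even if a class is never predicted
  nn.table <- table(testset$Species,
                    factor(iris.predict, levels = levels(testset$Species)))
  print(nn.table)

  if (requireNamespace("caret", quietly = TRUE)) {
    print(caret::confusionMatrix(nn.table))
  } else {
    cat("accuracy:", sum(diag(nn.table)) / sum(nn.table), "\n")
  }
}
```

The exact counts depend on the random split and the random starting weights, so they are not shown here.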


See also

For the predict function, if the type argument is not set to class, it will by default generate a probability matrix as the prediction result, which is very similar to net.result generated from the compute function within the neuralnet package.



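Conclusion: A neural network is a network formed by a large number of interconnected processing units (neurons); an ANN (Artificial Neural Network) is an abstraction, simplification, and simulation of a biological nervous system. Information processing in a neural network is carried out through the interaction of neurons, knowledge and information are stored in the distributed structure and connections between network elements, and learning and recognition amount to the dynamic evolution of the connection weights between neurons. Basic neural network models commonly used in practice include perceptron networks, linear neural (Adaline) networks, BP neural networks, radial basis function networks, self-organizing networks, and Hopfield feedback networks.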



Time: 2024-11-01 01:56:27


