Spark MLlib Deep Learning Convolution Neural Network (Deep Learning - Convolutional Neural Network) 3.2

3. Spark MLlib Deep Learning Convolution Neural Network (Deep Learning - Convolutional Neural Network) 3.2

http://blog.csdn.net/sunbow0

Chapter 3: Convolution Neural Network (CNN)

2 Basics and Source Code Analysis

2.1 Convolution Neural Network Basics

1) Basics:

There is plenty of introductory material available via Google or Baidu, and skimming it is enough; most of it, however, does not explain the details clearly.

For explanations that do make the details clear, refer to the two articles below, on the assumption that you already have a basic understanding.

2) Key references:

http://www.cnblogs.com/fengfenggirl/p/cnn_implement.html

http://www.cnblogs.com/tornadomeet/archive/2013/05/05/3061457.html

2.2 Deep Learning CNN Source Code Analysis

2.2.1 CNN Code Structure

The CNN source code consists mainly of two classes, CNN and CNNModel. The source structure is as follows:

CNN structure:

CNNModel structure:

2.2.2 CNN Training Process

2.2.3 CNN Analysis

(1) CNNLayers

/**
 * types: layer type ("c" convolution, "s" subsampling)
 * outputmaps: number of feature maps
 * kernelsize: convolution kernel size k
 * k: convolution kernels
 * b: biases
 * dk: partial derivatives of the convolution kernels
 * db: partial derivatives of the biases
 * scale: pooling size
 */
case class CNNLayers(
  types: String,
  outputmaps: Double,
  kernelsize: Double,
  scale: Double,
  k: Array[Array[BDM[Double]]],
  b: Array[Double],
  dk: Array[Array[BDM[Double]]],
  db: Array[Double]) extends Serializable

CNNLayers is a custom data type that stores the parameter information of every layer of the network.
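For illustration, the following is a minimal sketch of two layer entries; the concrete sizes (6 output maps, 5 x 5 kernels, 2 x 2 pooling) are assumptions for the example, not values taken from the source:

import breeze.linalg.{DenseMatrix => BDM}

// Hypothetical "c" (convolution) layer entry: 6 output maps, 5 x 5 kernels.
// k and dk are indexed as k(inputMap)(outputMap); the zero values are placeholders.
val convLayer = CNNLayers(
  types = "c", outputmaps = 6.0, kernelsize = 5.0, scale = 1.0,
  k = Array.fill(1, 6)(BDM.zeros[Double](5, 5)), b = Array.fill(6)(0.0),
  dk = Array.fill(1, 6)(BDM.zeros[Double](5, 5)), db = Array.fill(6)(0.0))

// Hypothetical "s" (mean-pooling) layer entry: 2 x 2 pooling, kernels unused.
val poolLayer = CNNLayers(
  types = "s", outputmaps = 6.0, kernelsize = 1.0, scale = 2.0,
  k = Array(Array(BDM.zeros[Double](1, 1))), b = Array.fill(6)(0.0),
  dk = Array(Array(BDM.zeros[Double](1, 1))), db = Array.fill(6)(0.0))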

(2) CnnSetup

Initializes the parameters of the convolutional neural network, building the CNN layer by layer from the configuration.

/** Initialize the parameters of each CNN layer. */
def CnnSetup: (Array[CNNLayers], BDM[Double], BDM[Double], Double) = {
  var inputmaps1 = 1.0
  var mapsize1 = mapsize
  var confinit = ArrayBuffer[CNNLayers]()
  for (l <- 0 to layer - 1) { // layer
    val type1 = types(l)
    val outputmap1 = outputmaps(l)
    val kernelsize1 = kernelsize(l)
    val scale1 = scale(l)
    val layersconf = if (type1 == "s") { // initialize the parameters of this layer
      mapsize1 = mapsize1 / scale1
      val b1 = Array.fill(inputmaps1.toInt)(0.0)
      val ki = Array(Array(BDM.zeros[Double](1, 1)))
      new CNNLayers(type1, outputmap1, kernelsize1, scale1, ki, b1, ki, b1)
    } else if (type1 == "c") {
      mapsize1 = mapsize1 - kernelsize1 + 1.0
      val fan_out = outputmap1 * math.pow(kernelsize1, 2)
      val fan_in = inputmaps1 * math.pow(kernelsize1, 2)
      val ki = ArrayBuffer[Array[BDM[Double]]]()
      for (i <- 0 to inputmaps1.toInt - 1) { // input map
        val kj = ArrayBuffer[BDM[Double]]()
        for (j <- 0 to outputmap1.toInt - 1) { // output map
          val kk = (BDM.rand[Double](kernelsize1.toInt, kernelsize1.toInt) - 0.5) * 2.0 * sqrt(6.0 / (fan_in + fan_out))
          kj += kk
        }
        ki += kj.toArray
      }
      val b1 = Array.fill(outputmap1.toInt)(0.0)
      inputmaps1 = outputmap1
      new CNNLayers(type1, outputmap1, kernelsize1, scale1, ki.toArray, b1, ki.toArray, b1)
    } else {
      val ki = Array(Array(BDM.zeros[Double](1, 1)))
      val b1 = Array(0.0)
      new CNNLayers(type1, outputmap1, kernelsize1, scale1, ki, b1, ki, b1)
    }
    confinit += layersconf
  }
  val fvnum = mapsize1(0, 0) * mapsize1(0, 1) * inputmaps1
  val ffb = BDM.zeros[Double](onum, 1)
  val ffW = (BDM.rand[Double](onum, fvnum.toInt) - 0.5) * 2.0 * sqrt(6.0 / (onum + fvnum))
  (confinit.toArray, ffb, ffW, alpha)
}
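CnnSetup reads its configuration (mapsize, types, outputmaps, kernelsize, scale, layer, onum, alpha) from fields of the enclosing CNN class, which is defined in part 3.1. The following hedged sketch replays only the feature-map size bookkeeping for a hypothetical LeNet-style configuration on 28 x 28 inputs; all concrete values are assumptions for illustration:

import breeze.linalg.{DenseMatrix => BDM}

// Assumed layer definition arrays (illustrative only).
val types      = Array("i", "c", "s", "c", "s")   // input, conv, pool, conv, pool
val outputmaps = Array(0.0, 6.0, 0.0, 12.0, 0.0)  // feature maps produced by each conv layer
val kernelsize = Array(0.0, 5.0, 0.0, 5.0, 0.0)   // 5 x 5 convolution kernels
val scale      = Array(0.0, 0.0, 2.0, 0.0, 2.0)   // 2 x 2 mean pooling
var mapsize    = BDM((28.0, 28.0))                // current feature-map size, updated per layer

// Replaying the size bookkeeping from CnnSetup:
for (l <- 0 until types.length) {
  if (types(l) == "c") mapsize = mapsize - kernelsize(l) + 1.0  // 28 -> 24, then 12 -> 8
  if (types(l) == "s") mapsize = mapsize / scale(l)             // 24 -> 12, then 8 -> 4
}
// The final maps are 4 x 4 over 12 channels, so the flattened vector fed to the
// output layer (fvnum in CnnSetup) has 4 * 4 * 12 = 192 entries.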

(3) expand

Kronecker-product expansion: each element of the input matrix is replicated into an s(0) × s(1) block.

/**
 * Kronecker product expansion
 */
def expand(a: BDM[Double], s: Array[Int]): BDM[Double] = {
  // val a = BDM((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
  // val s = Array(3, 2)
  val sa = Array(a.rows, a.cols)
  var tt = new Array[Array[Int]](sa.length)
  for (ii <- sa.length - 1 to 0 by -1) {
    var h = BDV.zeros[Int](sa(ii) * s(ii))
    h(0 to sa(ii) * s(ii) - 1 by s(ii)) := 1
    tt(ii) = Accumulate(h).data
  }
  var b = BDM.zeros[Double](tt(0).length, tt(1).length)
  for (j1 <- 0 to b.rows - 1) {
    for (j2 <- 0 to b.cols - 1) {
      b(j1, j2) = a(tt(0)(j1) - 1, tt(1)(j2) - 1)
    }
  }
  b
}
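A small worked example: with s = Array(2, 2), every element of the input is replicated into a 2 x 2 block, i.e. the result equals the Kronecker product of a with a 2 x 2 matrix of ones (assuming Accumulate is the cumulative-sum helper its use above implies).

import breeze.linalg.{DenseMatrix => BDM}

val a = BDM((1.0, 2.0), (3.0, 4.0))
val e = expand(a, Array(2, 2))
// e is 4 x 4; each entry of a becomes a 2 x 2 block:
// 1 1 2 2
// 1 1 2 2
// 3 3 4 4
// 3 3 4 4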

(4) convn

Convolution computation.

/**
 * convn convolution computation
 */
def convn(m0: BDM[Double], k0: BDM[Double], shape: String): BDM[Double] = {
  // val m0 = BDM((1.0, 1.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0), (0.0, 1.0, 1.0, 0.0), (0.0, 1.0, 1.0, 0.0))
  // val k0 = BDM((1.0, 1.0), (0.0, 1.0))
  // val m0 = BDM((1.0, 1.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 1.0))
  // val k0 = BDM((1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0))
  val out1 = shape match {
    case "valid" =>
      val m1 = m0
      val k1 = k0.t
      val row1 = m1.rows - k1.rows + 1
      val col1 = m1.cols - k1.cols + 1
      var m2 = BDM.zeros[Double](row1, col1)
      for (i <- 0 to row1 - 1) {
        for (j <- 0 to col1 - 1) {
          val r1 = i
          val r2 = r1 + k1.rows - 1
          val c1 = j
          val c2 = c1 + k1.cols - 1
          val mi = m1(r1 to r2, c1 to c2)
          m2(i, j) = (mi :* k1).sum
        }
      }
      m2
    case "full" =>
      var m1 = BDM.zeros[Double](m0.rows + 2 * (k0.rows - 1), m0.cols + 2 * (k0.cols - 1))
      for (i <- 0 to m0.rows - 1) {
        for (j <- 0 to m0.cols - 1) {
          m1((k0.rows - 1) + i, (k0.cols - 1) + j) = m0(i, j)
        }
      }
      val k1 = Rot90(Rot90(k0))
      val row1 = m1.rows - k1.rows + 1
      val col1 = m1.cols - k1.cols + 1
      var m2 = BDM.zeros[Double](row1, col1)
      for (i <- 0 to row1 - 1) {
        for (j <- 0 to col1 - 1) {
          val r1 = i
          val r2 = r1 + k1.rows - 1
          val c1 = j
          val c2 = c1 + k1.cols - 1
          val mi = m1(r1 to r2, c1 to c2)
          m2(i, j) = (mi :* k1).sum
        }
      }
      m2
  }
  out1
}
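A quick usage sketch with the example matrices from the comments above: "valid" shrinks the output to (m.rows - k.rows + 1) x (m.cols - k.cols + 1), while "full" zero-pads the input first and produces (m.rows + k.rows - 1) x (m.cols + k.cols - 1).

import breeze.linalg.{DenseMatrix => BDM}

val m0 = BDM((1.0, 1.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0), (0.0, 1.0, 1.0, 0.0), (0.0, 1.0, 1.0, 0.0))
val k0 = BDM((1.0, 1.0), (0.0, 1.0))

val cValid = convn(m0, k0, "valid") // 3 x 3 result
val cFull  = convn(m0, k0, "full")  // 5 x 5 result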

(5) CNNtrain

Trains the convolutional neural network.

Input parameters: train_d, the training data RDD; opts, the training options.

Output: CNNModel, the trained model.

/**
 * Run the convolutional neural network algorithm.
 */
def CNNtrain(train_d: RDD[(BDM[Double], BDM[Double])], opts: Array[Double]): CNNModel = {
  val sc = train_d.sparkContext
  var initStartTime = System.currentTimeMillis()
  var initEndTime = System.currentTimeMillis()
  // Parameter initialization
  var (cnn_layers, cnn_ffb, cnn_ffW, cnn_alpha) = CnnSetup
  // Split the samples into training data and cross-validation data
  val validation = opts(2)
  val splitW1 = Array(1.0 - validation, validation)
  val train_split1 = train_d.randomSplit(splitW1, System.nanoTime())
  val train_t = train_split1(0)
  val train_v = train_split1(1)
  // m: number of training samples
  val m = train_t.count
  // Compute the number of batches
  val batchsize = opts(0).toInt
  val numepochs = opts(1).toInt
  val numbatches = (m / batchsize).toInt
  var rL = Array.fill(numepochs * numbatches.toInt)(0.0)
  var n = 0
  // numepochs is the number of passes over the data
  for (i <- 1 to numepochs) {
    initStartTime = System.currentTimeMillis()
    val splitW2 = Array.fill(numbatches)(1.0 / numbatches)
    // Randomly split the samples into batches according to the split weights
    for (l <- 1 to numbatches) {
      // Broadcast the weights
      val bc_cnn_layers = sc.broadcast(cnn_layers)
      val bc_cnn_ffb = sc.broadcast(cnn_ffb)
      val bc_cnn_ffW = sc.broadcast(cnn_ffW)
      // Select the batch
      val train_split2 = train_t.randomSplit(splitW2, System.nanoTime())
      val batch_xy1 = train_split2(l - 1)
      // CNNff performs the forward pass
      // net = cnnff(net, batch_x);
      val train_cnnff = CNN.CNNff(batch_xy1, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW)
      // CNNbp performs the backward pass
      // net = cnnbp(net, batch_y);
      val train_cnnbp = CNN.CNNbp(train_cnnff, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW)
      // Update the weights
      // net = cnnapplygrads(net, opts);
      val train_nnapplygrads = CNN.CNNapplygrads(train_cnnbp, bc_cnn_ffb, bc_cnn_ffW, cnn_alpha)
      cnn_ffW = train_nnapplygrads._1
      cnn_ffb = train_nnapplygrads._2
      cnn_layers = train_nnapplygrads._3
      // error and loss
      // Compute the output error
      // net.L = 1/2 * sum(net.e(:) .^ 2) / size(net.e, 2);
      val rdd_loss1 = train_cnnbp._1.map(f => f._5)
      val (loss2, counte) = rdd_loss1.treeAggregate((0.0, 0L))(
        seqOp = (c, v) => {
          // c: (e, count), v: (m)
          val e1 = c._1
          val e2 = (v :* v).sum
          val esum = e1 + e2
          (esum, c._2 + 1)
        },
        combOp = (c1, c2) => {
          // c: (e, count)
          val e1 = c1._1
          val e2 = c2._1
          val esum = e1 + e2
          (esum, c1._2 + c2._2)
        })
      val Loss = (loss2 / counte.toDouble) * 0.5
      if (n == 0) {
        rL(n) = Loss
      } else {
        rL(n) = 0.09 * rL(n - 1) + 0.01 * Loss
      }
      n = n + 1
    }
    initEndTime = System.currentTimeMillis()
    // Print the results
    printf("epoch: numepochs = %d , Took = %d seconds; batch train mse = %f.\n", i,
      scala.math.ceil((initEndTime - initStartTime).toDouble / 1000).toLong, rL(n - 1))
  }
  // Compute the training error and the cross-validation error
  // Full-batch train mse
  var loss_train_e = 0.0
  var loss_val_e = 0.0
  loss_train_e = CNN.CNNeval(train_t, sc.broadcast(cnn_layers), sc.broadcast(cnn_ffb), sc.broadcast(cnn_ffW))
  if (validation > 0) loss_val_e = CNN.CNNeval(train_v, sc.broadcast(cnn_layers), sc.broadcast(cnn_ffb), sc.broadcast(cnn_ffW))
  printf("epoch: Full-batch train mse = %f, val mse = %f.\n", loss_train_e, loss_val_e)
  new CNNModel(cnn_layers, cnn_ffW, cnn_ffb)
}
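As the code above shows, opts carries three values: opts(0) is the batch size, opts(1) the number of epochs, and opts(2) the fraction of samples held out for cross-validation. A hedged usage sketch follows; how the CNN instance is constructed and configured is defined in part 3.1, so the `cnn` parameter here is an assumption rather than the exact API:

import breeze.linalg.{DenseMatrix => BDM}
import org.apache.spark.rdd.RDD

// train_d: RDD of (label matrix, input feature-map matrix) pairs, as CNNtrain expects.
def trainExample(cnn: CNN, train_d: RDD[(BDM[Double], BDM[Double])]): CNNModel = {
  val opts = Array(128.0, 5.0, 0.1) // batchsize = 128, numepochs = 5, 10% validation split
  cnn.CNNtrain(train_d, opts)
}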

(6) CNNff

Forward-pass computation: computes the output of each layer, from the input layer through the hidden layers to the output layer, i.e. the output value of every node in every layer.

Input parameters:

batch_xy1: sample data

bc_cnn_layers: parameters of each layer

bc_cnn_ffb: bias parameters

bc_cnn_ffW: weight parameters

Output:

The computation results of every layer.

/**
 * cnnff performs the forward pass
 * and computes the output value of every node in the network.
 */
def CNNff(
  batch_xy1: RDD[(BDM[Double], BDM[Double])],
  bc_cnn_layers: org.apache.spark.broadcast.Broadcast[Array[CNNLayers]],
  bc_cnn_ffb: org.apache.spark.broadcast.Broadcast[BDM[Double]],
  bc_cnn_ffW: org.apache.spark.broadcast.Broadcast[BDM[Double]]): RDD[(BDM[Double], Array[Array[BDM[Double]]], BDM[Double], BDM[Double])] = {
  // Layer 1: a(1) = [x]
  val train_data1 = batch_xy1.map { f =>
    val lable = f._1
    val features = f._2
    val nna1 = Array(features)
    val nna = ArrayBuffer[Array[BDM[Double]]]()
    nna += nna1
    (lable, nna)
  }
  // Layers 2 through n-1
  val train_data2 = train_data1.map { f =>
    val lable = f._1
    val nn_a = f._2
    var inputmaps1 = 1.0
    val n = bc_cnn_layers.value.length
    // for each layer
    for (l <- 1 to n - 1) {
      val type1 = bc_cnn_layers.value(l).types
      val outputmap1 = bc_cnn_layers.value(l).outputmaps
      val kernelsize1 = bc_cnn_layers.value(l).kernelsize
      val scale1 = bc_cnn_layers.value(l).scale
      val k1 = bc_cnn_layers.value(l).k
      val b1 = bc_cnn_layers.value(l).b
      val nna1 = ArrayBuffer[BDM[Double]]()
      if (type1 == "c") {
        for (j <- 0 to outputmap1.toInt - 1) { // output map
          // create temp output map
          var z = BDM.zeros[Double](nn_a(l - 1)(0).rows - kernelsize1.toInt + 1, nn_a(l - 1)(0).cols - kernelsize1.toInt + 1)
          for (i <- 0 to inputmaps1.toInt - 1) { // input map
            // convolve with corresponding kernel and add to temp output map
            // z = z + convn(net.layers{l - 1}.a{i}, net.layers{l}.k{i}{j}, 'valid');
            z = z + convn(nn_a(l - 1)(i), k1(i)(j), "valid")
          }
          // add bias, pass through nonlinearity
          // net.layers{l}.a{j} = sigm(z + net.layers{l}.b{j})
          val nna0 = sigm(z + b1(j))
          nna1 += nna0
        }
        nn_a += nna1.toArray
        inputmaps1 = outputmap1
      } else if (type1 == "s") {
        for (j <- 0 to inputmaps1.toInt - 1) {
          // z = convn(net.layers{l - 1}.a{j}, ones(net.layers{l}.scale) / (net.layers{l}.scale ^ 2), 'valid');
          // net.layers{l}.a{j} = z(1 : net.layers{l}.scale : end, 1 : net.layers{l}.scale : end, :);
          val z = convn(nn_a(l - 1)(j), BDM.ones[Double](scale1.toInt, scale1.toInt) / (scale1 * scale1), "valid")
          val zs1 = z(::, 0 to -1 by scale1.toInt).t + 0.0
          val zs2 = zs1(::, 0 to -1 by scale1.toInt).t + 0.0
          val nna0 = zs2
          nna1 += nna0
        }
        nn_a += nna1.toArray
      }
    }
    // concatenate all end layer feature maps into vector
    val nn_fv1 = ArrayBuffer[Double]()
    for (j <- 0 to nn_a(n - 1).length - 1) {
      nn_fv1 ++= nn_a(n - 1)(j).data
    }
    val nn_fv = new BDM[Double](nn_fv1.length, 1, nn_fv1.toArray)
    // feedforward into output perceptrons
    // net.o = sigm(net.ffW * net.fv + repmat(net.ffb, 1, size(net.fv, 2)));
    val nn_o = sigm(bc_cnn_ffW.value * nn_fv + bc_cnn_ffb.value)
    (lable, nn_a.toArray, nn_fv, nn_o)
  }
  train_data2
}
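Each record of the returned RDD is a 4-tuple (label, per-layer activations, flattened feature vector, network output). A hedged sketch of inspecting one record, assuming the broadcast variables were created as in CNNtrain above:

val sample = CNN.CNNff(batch_xy1, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW).first()
val (label, activations, featureVector, output) = sample
println(s"layers = ${activations.length}, feature vector = ${featureVector.rows} x 1, output = ${output.rows} x 1")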

(7) CNNbp

Backward-pass computation: computes the derivatives layer by layer, from the output layer through the hidden layers back to the input layer, i.e. the partial derivative of every node (error back-propagation).

Input parameters:

train_cnnff: results of the forward pass

bc_cnn_layers: parameters of each layer

bc_cnn_ffb: bias parameters

bc_cnn_ffW: weight parameters

Output:

The partial derivatives of every layer.

/**
 * CNNbp performs the backward pass
 * and computes the average partial derivatives of the weights.
 */
def CNNbp(
  train_cnnff: RDD[(BDM[Double], Array[Array[BDM[Double]]], BDM[Double], BDM[Double])],
  bc_cnn_layers: org.apache.spark.broadcast.Broadcast[Array[CNNLayers]],
  bc_cnn_ffb: org.apache.spark.broadcast.Broadcast[BDM[Double]],
  bc_cnn_ffW: org.apache.spark.broadcast.Broadcast[BDM[Double]]): (RDD[(BDM[Double], Array[Array[BDM[Double]]], BDM[Double], BDM[Double], BDM[Double], BDM[Double], BDM[Double], Array[Array[BDM[Double]]])], BDM[Double], BDM[Double], Array[CNNLayers]) = {
  // error : net.e = net.o - y
  val n = bc_cnn_layers.value.length
  val train_data3 = train_cnnff.map { f =>
    val nn_e = f._4 - f._1
    (f._1, f._2, f._3, f._4, nn_e)
  }
  // backprop deltas
  // Sensitivity (residual) of the output layer
  // net.od = net.e .* (net.o .* (1 - net.o))
  // net.fvd = (net.ffW' * net.od)
  val train_data4 = train_data3.map { f =>
    val nn_e = f._5
    val nn_o = f._4
    val nn_fv = f._3
    val nn_od = nn_e :* (nn_o :* (1.0 - nn_o))
    val nn_fvd = if (bc_cnn_layers.value(n - 1).types == "c") {
      // net.fvd = net.fvd .* (net.fv .* (1 - net.fv));
      val nn_fvd1 = bc_cnn_ffW.value.t * nn_od
      val nn_fvd2 = nn_fvd1 :* (nn_fv :* (1.0 - nn_fv))
      nn_fvd2
    } else {
      val nn_fvd1 = bc_cnn_ffW.value.t * nn_od
      nn_fvd1
    }
    (f._1, f._2, f._3, f._4, f._5, nn_od, nn_fvd)
  }
  // reshape feature vector deltas into output map style
  val sa1 = train_data4.map(f => f._2(n - 1)(1)).take(1)(0).rows
  val sa2 = train_data4.map(f => f._2(n - 1)(1)).take(1)(0).cols
  val sa3 = 1
  val fvnum = sa1 * sa2
  val train_data5 = train_data4.map { f =>
    val nn_a = f._2
    val nn_fvd = f._7
    val nn_od = f._6
    val nn_fv = f._3
    var nnd = new Array[Array[BDM[Double]]](n)
    val nnd1 = ArrayBuffer[BDM[Double]]()
    for (j <- 0 to nn_a(n - 1).length - 1) {
      val tmp1 = nn_fvd((j * fvnum) to ((j + 1) * fvnum - 1), 0)
      val tmp2 = new BDM(sa1, sa2, tmp1.data)
      nnd1 += tmp2
    }
    nnd(n - 1) = nnd1.toArray
    for (l <- (n - 2) to 0 by -1) {
      val type1 = bc_cnn_layers.value(l).types
      var nnd2 = ArrayBuffer[BDM[Double]]()
      if (type1 == "c") {
        for (j <- 0 to nn_a(l).length - 1) {
          val tmp_a = nn_a(l)(j)
          val tmp_d = nnd(l + 1)(j)
          val tmp_scale = bc_cnn_layers.value(l + 1).scale.toInt
          val tmp1 = tmp_a :* (1.0 - tmp_a)
          val tmp2 = expand(tmp_d, Array(tmp_scale, tmp_scale)) / (tmp_scale.toDouble * tmp_scale)
          nnd2 += (tmp1 :* tmp2)
        }
      } else if (type1 == "s") {
        for (i <- 0 to nn_a(l).length - 1) {
          var z = BDM.zeros[Double](nn_a(l)(0).rows, nn_a(l)(0).cols)
          for (j <- 0 to nn_a(l + 1).length - 1) {
            // z = z + convn(net.layers{l + 1}.d{j}, rot180(net.layers{l + 1}.k{i}{j}), 'full');
            z = z + convn(nnd(l + 1)(j), Rot90(Rot90(bc_cnn_layers.value(l + 1).k(i)(j))), "full")
          }
          nnd2 += z
        }
      }
      nnd(l) = nnd2.toArray
    }
    (f._1, f._2, f._3, f._4, f._5, f._6, f._7, nnd)
  }
  // dk, db: calc gradients
  var cnn_layers = bc_cnn_layers.value
  for (l <- 1 to n - 1) {
    val type1 = bc_cnn_layers.value(l).types
    val lena1 = train_data5.map(f => f._2(l).length).take(1)(0)
    val lena2 = train_data5.map(f => f._2(l - 1).length).take(1)(0)
    if (type1 == "c") {
      for (j <- 0 to lena1 - 1) {
        for (i <- 0 to lena2 - 1) {
          val rdd_dk_ij = train_data5.map { f =>
            val nn_a = f._2
            val nn_d = f._8
            val tmp_d = nn_d(l)(j)
            val tmp_a = nn_a(l - 1)(i)
            convn(Rot90(Rot90(tmp_a)), tmp_d, "valid")
          }
          val initdk = BDM.zeros[Double](rdd_dk_ij.take(1)(0).rows, rdd_dk_ij.take(1)(0).cols)
          val (dk_ij, count_dk) = rdd_dk_ij.treeAggregate((initdk, 0L))(
            seqOp = (c, v) => {
              // c: (m, count), v: (m)
              val m1 = c._1
              val m2 = m1 + v
              (m2, c._2 + 1)
            },
            combOp = (c1, c2) => {
              // c: (m, count)
              val m1 = c1._1
              val m2 = c2._1
              val m3 = m1 + m2
              (m3, c1._2 + c2._2)
            })
          val dk = dk_ij / count_dk.toDouble
          cnn_layers(l).dk(i)(j) = dk
        }
        val rdd_db_j = train_data5.map { f =>
          val nn_d = f._8
          val tmp_d = nn_d(l)(j)
          Bsum(tmp_d)
        }
        val db_j = rdd_db_j.reduce(_ + _)
        val count_db = rdd_db_j.count
        val db = db_j / count_db.toDouble
        cnn_layers(l).db(j) = db
      }
    }
  }
  // net.dffW = net.od * (net.fv)' / size(net.od, 2);
  // net.dffb = mean(net.od, 2);
  val train_data6 = train_data5.map { f =>
    val nn_od = f._6
    val nn_fv = f._3
    nn_od * nn_fv.t
  }
  val train_data7 = train_data5.map { f =>
    val nn_od = f._6
    nn_od
  }
  val initffW = BDM.zeros[Double](bc_cnn_ffW.value.rows, bc_cnn_ffW.value.cols)
  val (ffw2, countfffw2) = train_data6.treeAggregate((initffW, 0L))(
    seqOp = (c, v) => {
      // c: (m, count), v: (m)
      val m1 = c._1
      val m2 = m1 + v
      (m2, c._2 + 1)
    },
    combOp = (c1, c2) => {
      // c: (m, count)
      val m1 = c1._1
      val m2 = c2._1
      val m3 = m1 + m2
      (m3, c1._2 + c2._2)
    })
  val cnn_dffw = ffw2 / countfffw2.toDouble
  val initffb = BDM.zeros[Double](bc_cnn_ffb.value.rows, bc_cnn_ffb.value.cols)
  val (ffb2, countfffb2) = train_data7.treeAggregate((initffb, 0L))(
    seqOp = (c, v) => {
      // c: (m, count), v: (m)
      val m1 = c._1
      val m2 = m1 + v
      (m2, c._2 + 1)
    },
    combOp = (c1, c2) => {
      // c: (m, count)
      val m1 = c1._1
      val m2 = c2._1
      val m3 = m1 + m2
      (m3, c1._2 + c2._2)
    })
  val cnn_dffb = ffb2 / countfffb2.toDouble
  (train_data5, cnn_dffw, cnn_dffb, cnn_layers)
}

(8) CNNapplygrads

Weight update.

Input parameters:

train_cnnbp: output of CNNbp

bc_cnn_ffb: network bias parameters

bc_cnn_ffW: network weight parameters

alpha: learning rate for the update

Output: (cnn_ffW, cnn_ffb, cnn_layers), the updated weight parameters.

/**
 * CNNapplygrads applies the weight updates.
 */
def CNNapplygrads(
  train_cnnbp: (RDD[(BDM[Double], Array[Array[BDM[Double]]], BDM[Double], BDM[Double], BDM[Double], BDM[Double], BDM[Double], Array[Array[BDM[Double]]])], BDM[Double], BDM[Double], Array[CNNLayers]),
  bc_cnn_ffb: org.apache.spark.broadcast.Broadcast[BDM[Double]],
  bc_cnn_ffW: org.apache.spark.broadcast.Broadcast[BDM[Double]],
  alpha: Double): (BDM[Double], BDM[Double], Array[CNNLayers]) = {
  val train_data5 = train_cnnbp._1
  val cnn_dffw = train_cnnbp._2
  val cnn_dffb = train_cnnbp._3
  var cnn_layers = train_cnnbp._4
  var cnn_ffb = bc_cnn_ffb.value
  var cnn_ffW = bc_cnn_ffW.value
  val n = cnn_layers.length
  for (l <- 1 to n - 1) {
    val type1 = cnn_layers(l).types
    val lena1 = train_data5.map(f => f._2(l).length).take(1)(0)
    val lena2 = train_data5.map(f => f._2(l - 1).length).take(1)(0)
    if (type1 == "c") {
      for (j <- 0 to lena1 - 1) {
        for (ii <- 0 to lena2 - 1) {
          cnn_layers(l).k(ii)(j) = cnn_layers(l).k(ii)(j) - cnn_layers(l).dk(ii)(j)
        }
        cnn_layers(l).b(j) = cnn_layers(l).b(j) - cnn_layers(l).db(j)
      }
    }
  }
  cnn_ffW = cnn_ffW + cnn_dffw
  cnn_ffb = cnn_ffb + cnn_dffb
  (cnn_ffW, cnn_ffb, cnn_layers)
}

(9) CNNeval

Error computation.

/**
 * CNNeval performs a forward pass and computes the output error:
 * the output of every node in the network, and the mean error.
 */
def CNNeval(
  batch_xy1: RDD[(BDM[Double], BDM[Double])],
  bc_cnn_layers: org.apache.spark.broadcast.Broadcast[Array[CNNLayers]],
  bc_cnn_ffb: org.apache.spark.broadcast.Broadcast[BDM[Double]],
  bc_cnn_ffW: org.apache.spark.broadcast.Broadcast[BDM[Double]]): Double = {
  // CNNff performs the forward pass
  val train_cnnff = CNN.CNNff(batch_xy1, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW)
  // error and loss
  // Compute the output error
  val rdd_loss1 = train_cnnff.map { f =>
    val nn_e = f._4 - f._1
    nn_e
  }
  val (loss2, counte) = rdd_loss1.treeAggregate((0.0, 0L))(
    seqOp = (c, v) => {
      // c: (e, count), v: (m)
      val e1 = c._1
      val e2 = (v :* v).sum
      val esum = e1 + e2
      (esum, c._2 + 1)
    },
    combOp = (c1, c2) => {
      // c: (e, count)
      val e1 = c1._1
      val e2 = c2._1
      val esum = e1 + e2
      (esum, c1._2 + c2._2)
    })
  val Loss = (loss2 / counte.toDouble) * 0.5
  Loss
}

2.2.4 CNNModel Analysis

(1) CNNModel

CNNModel stores the CNN network parameters: cnn_layers, the configuration of every layer; cnn_ffW, the output-layer weights; and cnn_ffb, the output-layer biases.

class CNNModel(
  val cnn_layers: Array[CNNLayers],
  val cnn_ffW: BDM[Double],
  val cnn_ffb: BDM[Double]) extends Serializable {
}

(2) predict

predict: computes predictions with the trained model.

/**
 * Return the prediction results
 * in the format (label, feature, predict_label, error).
 */
def predict(dataMatrix: RDD[(BDM[Double], BDM[Double])]): RDD[PredictCNNLabel] = {
  val sc = dataMatrix.sparkContext
  val bc_cnn_layers = sc.broadcast(cnn_layers)
  val bc_cnn_ffW = sc.broadcast(cnn_ffW)
  val bc_cnn_ffb = sc.broadcast(cnn_ffb)
  // CNNff performs the forward pass
  val train_cnnff = CNN.CNNff(dataMatrix, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW)
  val rdd_predict = train_cnnff.map { f =>
    val label = f._1
    val nna1 = f._2(0)(0)
    val nnan = f._4
    val error = f._4 - f._1
    PredictCNNLabel(label, nna1, nnan, error)
  }
  rdd_predict
}

(3) Loss

Loss: computes the error from the prediction results.

/**
 * Compute the output error
 * (mean error).
 */
def Loss(predict: RDD[PredictCNNLabel]): Double = {
  val predict1 = predict.map(f => f.error)
  // error and loss
  // Compute the output error
  val loss1 = predict1
  val (loss2, counte) = loss1.treeAggregate((0.0, 0L))(
    seqOp = (c, v) => {
      // c: (e, count), v: (m)
      val e1 = c._1
      val e2 = (v :* v).sum
      val esum = e1 + e2
      (esum, c._2 + 1)
    },
    combOp = (c1, c2) => {
      // c: (e, count)
      val e1 = c1._1
      val e2 = c2._1
      val esum = e1 + e2
      (esum, c1._2 + c2._2)
    })
  val Loss = (loss2 / counte.toDouble) * 0.5
  Loss
}
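A hedged usage sketch that ties predict and Loss together; cnnModel is assumed to be the CNNModel returned by CNNtrain, and test_d a held-out dataset in the same (label, features) format:

import breeze.linalg.{DenseMatrix => BDM}
import org.apache.spark.rdd.RDD

def evaluateExample(cnnModel: CNNModel, test_d: RDD[(BDM[Double], BDM[Double])]): Unit = {
  val predictions = cnnModel.predict(test_d) // RDD[PredictCNNLabel]: (label, feature, predict_label, error)
  val mse = cnnModel.Loss(predictions)       // mean error over all records, as defined above
  println(s"test mse = $mse")
}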

Please credit the source when reposting:

http://blog.csdn.net/sunbow0

Copyright notice: this is the blogger's original article; it may not be reproduced without the blogger's permission.
