jrae Source Code Analysis (Part 2)

This article looks in detail at the two classes introduced in the previous part, RAECost and SoftmaxCost.

SoftmaxCost

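As we already know, given the features and labels (with the hyperparameters fixed), the SoftmaxCost class measures the error $cost$ of a given weight matrix ($hidden\times catSize$) and reports the gradient at the current weights. Note that the code below actually keeps only $catSize-1$ columns of weights (see requiredRows), because the input of the last category is fixed at zero in getPredictions. Here is the code.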

@Override
	public double valueAt(double[] x)
	{
		if( !requiresEvaluation(x) )
			return value;
		int numDataItems = Features.columns;

		int[] requiredRows = ArraysHelper.makeArray(0, CatSize-2);
		ClassifierTheta Theta = new ClassifierTheta(x,FeatureLength,CatSize);
		DoubleMatrix Prediction = getPredictions (Theta, Features);

		double MeanTerm = 1.0 / (double) numDataItems;
		double Cost = getLoss (Prediction, Labels).sum() * MeanTerm;
		double RegularisationTerm = 0.5 * Lambda * DoubleMatrixFunctions.SquaredNorm(Theta.W);

		DoubleMatrix Diff = Prediction.sub(Labels).muli(MeanTerm);
	    DoubleMatrix Delta = Features.mmul(Diff.transpose());

	    DoubleMatrix gradW = Delta.getColumns(requiredRows);
	    DoubleMatrix gradb = ((Diff.rowSums()).getRows(requiredRows));

	    //Regularizing. Bias does not have one.
	    gradW = gradW.addi(Theta.W.mul(Lambda));

	    Gradient = new ClassifierTheta(gradW,gradb);
	    value = Cost + RegularisationTerm;
	    gradient = Gradient.Theta;
		return value;
	}

public DoubleMatrix getPredictions (ClassifierTheta Theta, DoubleMatrix Features)
	{
		int numDataItems = Features.columns;
		DoubleMatrix Input = ((Theta.W.transpose()).mmul(Features)).addColumnVector(Theta.b);
		Input = DoubleMatrix.concatVertically(Input, DoubleMatrix.zeros(1,numDataItems));
		return Activation.valueAt(Input);
	}

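This is a typical two-layer neural network with no hidden layer. It first predicts the labels from the features, normalizes the predictions with softmax, and then backpropagates the error to obtain the weight gradient.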

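In this network, each label is a column vector in which the target label's entry is 1 and all other entries are 0. The transfer function is softmax, so the output is the probability of each label.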
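The prediction step can be sketched with plain arrays for a single sample. This is a minimal illustration, not jrae code: the class `SoftmaxSketch` and its method names are hypothetical, the weights are stored as `featureLength x (catSize - 1)` to mirror what getPredictions appears to do, and subtracting the maximum before exponentiating is an added assumption for numerical stability.

```java
// Hypothetical sketch; not part of jrae.
class SoftmaxSketch {
    /** Softmax over a vector of class inputs (max subtracted for stability). */
    static double[] softmax(double[] net) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : net) max = Math.max(max, v);
        double sum = 0.0;
        double[] p = new double[net.length];
        for (int j = 0; j < net.length; j++) {
            p[j] = Math.exp(net[j] - max);
            sum += p[j];
        }
        for (int j = 0; j < net.length; j++) p[j] /= sum;
        return p;
    }

    /**
     * Predicts class probabilities for one sample.
     * w is featureLength x (catSize - 1); b has catSize - 1 entries.
     * The input of the last class is fixed at 0, as in getPredictions.
     */
    static double[] predict(double[][] w, double[] b, double[] x) {
        int k = b.length;
        double[] net = new double[k + 1]; // net[k] stays 0
        for (int j = 0; j < k; j++) {
            net[j] = b[j];
            for (int i = 0; i < x.length; i++) net[j] += w[i][j] * x[i];
        }
        return softmax(net);
    }

    public static void main(String[] args) {
        double[][] w = {{0.5, -0.5}, {0.25, 0.75}};
        double[] b = {0.0, 0.1};
        double[] p = predict(w, b, new double[]{1.0, 2.0});
        System.out.println(java.util.Arrays.toString(p));
    }
}
```

Fixing one class's input at zero removes the redundancy in softmax, where adding the same constant to every input leaves the probabilities unchanged.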

The cost is computed by getLoss. Let $p^*$ be the predicted output for the target label; the per-sample cost, i.e. the error function, is then:

$$cost=E(p^*)=-\log(p^*)$$
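As a rough illustration of how this per-sample cost feeds into the total in valueAt, the sketch below averages $-\log(p^*)$ over the samples and computes the L2 penalty on the weights separately. The class and method names are hypothetical, and the predictions are laid out one row per sample here, which is an assumption made for readability (jrae stores one column per sample).

```java
// Hypothetical sketch; not part of jrae.
class SoftmaxCostSketch {
    /** Mean cross-entropy: the average of -log(p*) over all samples. */
    static double meanLoss(double[][] predictions, int[] targets) {
        double total = 0.0;
        for (int n = 0; n < targets.length; n++) {
            total += -Math.log(predictions[n][targets[n]]);
        }
        return total / targets.length;
    }

    /** L2 penalty 0.5 * lambda * ||W||^2; the bias is not regularized. */
    static double l2Penalty(double[][] w, double lambda) {
        double squaredNorm = 0.0;
        for (double[] row : w) {
            for (double v : row) squaredNorm += v * v;
        }
        return 0.5 * lambda * squaredNorm;
    }

    public static void main(String[] args) {
        double[][] predictions = {{0.5, 0.5}, {0.25, 0.75}};
        int[] targets = {0, 1};
        double[][] w = {{1.0, 2.0}, {3.0, 4.0}};
        System.out.println(meanLoss(predictions, targets) + l2Penalty(w, 0.1));
    }
}
```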

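Applying the backpropagation algorithm described earlier, for the target label $j$ (where $label_j=1$ and $p_j=p^*$) we get:

$$\frac{\partial E}{\partial w_{ij}}=\frac{\partial E}{\partial p_j}\frac{\partial p_j}{\partial net_j}x_i=-\frac{1}{p_j}p_j(1-p_j)x_i=-(1-p_j)x_i=-(label_j-p_j)feature_i$$

For a non-target label $k$ the gradient is not zero, because softmax couples all outputs: $\frac{\partial p^*}{\partial net_k}=-p^*p_k$, so $\frac{\partial E}{\partial w_{ik}}=p_kx_i=-(label_k-p_k)feature_i$ with $label_k=0$. The same expression therefore holds for every label.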

This explains the following line of code, where Diff holds $(p_j-label_j)$ scaled by the mean term:

DoubleMatrix Delta = Features.mmul(Diff.transpose());
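One way to gain confidence in the gradient formula is to compare it with a finite-difference estimate. The sketch below does this for a single sample, with no bias and no regularization to keep it short. All names are hypothetical, and the last class's input is fixed at zero to match the layout assumed from getPredictions.

```java
// Hypothetical sketch; not part of jrae.
class SoftmaxGradientSketch {
    /** Softmax probabilities; w is featureLength x (catSize - 1). */
    static double[] predict(double[][] w, double[] x) {
        int k = w[0].length;
        double[] net = new double[k + 1]; // input of the last class stays 0
        for (int j = 0; j < k; j++)
            for (int i = 0; i < x.length; i++) net[j] += w[i][j] * x[i];
        double sum = 0.0;
        double[] p = new double[k + 1];
        for (int j = 0; j <= k; j++) { p[j] = Math.exp(net[j]); sum += p[j]; }
        for (int j = 0; j <= k; j++) p[j] /= sum;
        return p;
    }

    /** Per-sample cost -log(p*). */
    static double loss(double[][] w, double[] x, int target) {
        return -Math.log(predict(w, x)[target]);
    }

    /** Analytic gradient: dE/dw_ij = (p_j - label_j) * x_i. */
    static double[][] gradient(double[][] w, double[] x, int target) {
        double[] p = predict(w, x);
        double[][] g = new double[w.length][w[0].length];
        for (int i = 0; i < w.length; i++)
            for (int j = 0; j < w[0].length; j++)
                g[i][j] = (p[j] - (j == target ? 1.0 : 0.0)) * x[i];
        return g;
    }

    /** Largest gap between the analytic and central-difference gradients. */
    static double maxGradientError(double[][] w, double[] x, int target) {
        double[][] g = gradient(w, x, target);
        double eps = 1e-6, worst = 0.0;
        for (int i = 0; i < w.length; i++)
            for (int j = 0; j < w[0].length; j++) {
                double saved = w[i][j];
                w[i][j] = saved + eps;
                double up = loss(w, x, target);
                w[i][j] = saved - eps;
                double down = loss(w, x, target);
                w[i][j] = saved; // restore the weight
                double numeric = (up - down) / (2 * eps);
                worst = Math.max(worst, Math.abs(numeric - g[i][j]));
            }
        return worst;
    }

    public static void main(String[] args) {
        double[][] w = {{0.2, -0.1}, {0.4, 0.3}};
        System.out.println(maxGradientError(w, new double[]{1.0, -2.0}, 0));
    }
}
```

The check also covers non-target labels, whose gradient is $p_jx_i$ rather than zero.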

RAECost

First, the implementation:

@Override
	public double valueAt(double[] x)
	{
		if(!requiresEvaluation(x))
			return value;

		Theta Theta1 = new Theta(x,hiddenSize,visibleSize,dictionaryLength);
		FineTunableTheta Theta2 = new FineTunableTheta(x,hiddenSize,visibleSize,catSize,dictionaryLength);
		Theta2.setWe( Theta2.We.add(WeOrig) );

		final RAEClassificationCost classificationCost = new RAEClassificationCost(
				catSize, AlphaCat, Beta, dictionaryLength, hiddenSize, Lambda, f, Theta2);
		final RAEFeatureCost featureCost = new RAEFeatureCost(
				AlphaCat, Beta, dictionaryLength, hiddenSize, Lambda, f, WeOrig, Theta1);

		Parallel.For(DataCell,
			new Parallel.Operation<LabeledDatum<Integer,Integer>>() {
				public void perform(int index, LabeledDatum<Integer,Integer> Data)
				{
					try {
						LabeledRAETree Tree = featureCost.Compute(Data);
						classificationCost.Compute(Data, Tree);
					} catch (Exception e) {
						System.err.println(e.getMessage());
					}
				}
		});

		double costRAE = featureCost.getCost();
		double[] gradRAE = featureCost.getGradient().clone();

		double costSUP = classificationCost.getCost();
		gradient = classificationCost.getGradient();

		value = costRAE + costSUP;
		for(int i=0; i<gradRAE.length; i++)
			gradient[i] += gradRAE[i];

		System.gc();	System.gc();
		System.gc();	System.gc();
		System.gc();	System.gc();
		System.gc();	System.gc();

		return value;
	}

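The cost has two parts, featureCost and classificationCost. The program iterates over the samples; for each one, featureCost.Compute(Data) builds a recursive tree while accumulating cost and gradient, and classificationCost.Compute(Data, Tree) then computes and accumulates its own cost and gradient from that tree. The key classes are therefore RAEFeatureCost and RAEClassificationCost.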
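The accumulate-then-combine pattern described here can be sketched as follows. This is only an illustration of the idea, not how jrae implements it: `CostAccumulatorSketch` is a hypothetical class, and using `synchronized` to make concurrent updates from the parallel loop safe is an assumption.

```java
// Hypothetical sketch; not part of jrae.
class CostAccumulatorSketch {
    private double cost = 0.0;
    private final double[] gradient;

    CostAccumulatorSketch(int size) {
        gradient = new double[size];
    }

    /** Adds one sample's cost and gradient; safe to call from several threads. */
    synchronized void add(double sampleCost, double[] sampleGradient) {
        cost += sampleCost;
        for (int i = 0; i < gradient.length; i++) gradient[i] += sampleGradient[i];
    }

    synchronized double getCost() {
        return cost;
    }

    synchronized double[] getGradient() {
        return gradient.clone();
    }

    /** Sums two accumulators, as valueAt does with costRAE and costSUP. */
    static double combine(CostAccumulatorSketch a, CostAccumulatorSketch b, double[] out) {
        double[] ga = a.getGradient();
        double[] gb = b.getGradient();
        for (int i = 0; i < out.length; i++) out[i] = ga[i] + gb[i];
        return a.getCost() + b.getCost();
    }

    public static void main(String[] args) {
        CostAccumulatorSketch feature = new CostAccumulatorSketch(2);
        CostAccumulatorSketch classification = new CostAccumulatorSketch(2);
        feature.add(1.0, new double[]{1.0, 2.0});
        classification.add(0.5, new double[]{1.0, 1.0});
        double[] total = new double[2];
        System.out.println(combine(feature, classification, total));
    }
}
```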

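In its Compute function, RAEFeatureCost calls RAEPropagation's ForwardPropagate to build a tree, then calls BackPropagate to compute and accumulate the gradient. The algorithm itself is covered in the next part.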

Time: 2024-10-13 03:02:54
