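Mahout's BreimanExample runs the tests described in the paper
Leo Breiman: Random Forests. Machine Learning 45(1): 5-32 (2001).
I split the analysis into three parts:
- the Iteration part, which grows the forests
- the test-execution part of BreimanExample
- the command-line part
The Iteration part
The iteration function is shown below. Given the input data set data, it uses the random generator rng to split data randomly into a training set and a test set, grows random forests on the training set, and measures their accuracy on the test set.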
/**
* runs one iteration of the procedure.
*
* @param rng
* random numbers generator
* @param data
* training data
* @param m
* number of random variables to select at each tree-node
* take m to be the first integer less than log2(M) + 1, where M is the number of attributes
* @param nbtrees
* number of trees to grow
*/
private void runIteration(Random rng, Data data, int m, int nbtrees)
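1. Building the data sets
data is the input data set, but not all of it is used for training. It is split into two parts:
The first part, train, is the training set and starts out as a clone of data. At first sight that looks odd, since a clone is just the same as data. The copy matters because the next step removes instances from train; without it, every iteration would shrink data itself.
The second part, test, is the test set. It consists of about 10% of the instances, drawn at random from train and removed from train at the same time. This is done by the rsplit function of the Data class.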
Data train = data.clone();
Data test = train.rsplit(rng, (int) (data.size() * 0.1));
1.1 The rsplit function of the Data class
The Data class is org.apache.mahout.classifier.df.data.Data.
It has two member variables, documented as
Holds a list of vectors and their corresponding Dataset
private final List<Instance> instances;
private final Dataset dataset;
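The rsplit function is shown below. It randomly picks subsize instances from the instances list of the Data object and moves them into a newly built subset list, so those subsize instances are removed from instances.
Because both parts come from the same source, the returned Data object shares its dataset with the Data object on which rsplit was called.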
/**
* Splits the data in two, returns one part, and this gets the rest of the data. <b>VERY SLOW!</b>
*/
public Data rsplit(Random rng, int subsize) {
List<Instance> subset = Lists.newArrayListWithCapacity(subsize);
for (int i = 0; i < subsize; i++) {
subset.add(instances.remove(rng.nextInt(instances.size())));
}
return new Data(dataset, subset);
}
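To make the effect of the split concrete, here is a minimal sketch that mimics rsplit on a plain list of integers. RsplitSketch and its generic rsplit helper are hypothetical names of mine, not Mahout classes; the list copy stands in for data.clone(), and 214 is only an illustrative size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for Data.rsplit, working on plain values instead of Instance objects. */
class RsplitSketch {

  /** Moves subsize randomly chosen elements out of source into a new list (sampling without replacement). */
  static <T> List<T> rsplit(List<T> source, Random rng, int subsize) {
    List<T> subset = new ArrayList<>(subsize);
    for (int i = 0; i < subsize; i++) {
      // On an ArrayList, remove(int) shifts the tail of the list on every call,
      // which is a plausible reason for the "VERY SLOW" remark in the Javadoc.
      subset.add(source.remove(rng.nextInt(source.size())));
    }
    return subset;
  }

  public static void main(String[] args) {
    List<Integer> data = new ArrayList<>();
    for (int i = 0; i < 214; i++) {
      data.add(i);
    }
    List<Integer> train = new ArrayList<>(data); // plays the role of data.clone()
    List<Integer> test = rsplit(train, new Random(42), (int) (data.size() * 0.1));
    // prints: data=214 train=193 test=21
    System.out.println("data=" + data.size() + " train=" + train.size() + " test=" + test.size());
  }
}
```

The original list keeps all of its elements, while the copy loses exactly the ones that ended up in the test set.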
1.2 TreeBuilder
Next, the builder for a single decision tree and the builder for the forest are defined.
The two classes are
org.apache.mahout.classifier.df.builder.DefaultTreeBuilder;
org.apache.mahout.classifier.df.ref.SequentialBuilder;
/**
* Builds a Decision Tree <br>
* Based on the algorithm described in the "Decision Trees" tutorials by Andrew W. Moore, available at:<br>
* <br>
* http://www.cs.cmu.edu/~awm/tutorials
* <br><br>
* This class can be used when the criterion variable is the categorical attribute.
*/
DefaultTreeBuilder treeBuilder = new DefaultTreeBuilder();
/**
* Builds a Random Decision Forest using a given TreeBuilder to grow the trees
*/
SequentialBuilder forestBuilder = new SequentialBuilder(rng, treeBuilder, train);
forestBuilder is then used to grow a random forest.
/* grow a forest with m = log2(M)+1*/
treeBuilder.setM(m);
DecisionForest forestM = forestBuilder.build(nbtrees);
The build function of SequentialBuilder is shown below. It loops nbTrees times, growing one tree per pass with bagging, and collects the root node of each tree in trees.
public class SequentialBuilder {
private final Bagging bagging;
public DecisionForest build(int nbTrees) {
List<Node> trees = Lists.newArrayList();
for (int treeId = 0; treeId < nbTrees; treeId++) {
trees.add(bagging.build(rng));
logProgress(((float) treeId + 1) / nbTrees);
}
return new DecisionForest(trees);
}
}
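1.3 Bagging
How does bagging build a tree?
As shown below, it first uses the bagging method of Data to sample a training set, bag, from the data, and then builds an ordinary decision tree on that bag.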
/**
* Builds one tree
*/
public Node build(Random rng) {
log.debug("Bagging...");
Arrays.fill(sampled, false);
Data bag = data.bagging(rng, sampled);
log.debug("Building...");
return treeBuilder.build(rng, bag);
}
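And how does the bagging sampling work?
As shown below, it draws N times at random, with replacement, from the data set of N instances, so the same instance can be drawn more than once. The sampled array looked puzzling to me at first: it marks which instances were drawn at least once, so the unmarked ones are the out-of-bag instances, although nothing in the code quoted here reads it afterwards.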
/**
* if data has N cases, sample N cases at random -but with replacement.
*
* @param sampled
* indicating which instance has been sampled
*
* @return sampled data
*/
public Data bagging(Random rng, boolean[] sampled) {
int datasize = size();
List<Instance> bag = Lists.newArrayListWithCapacity(datasize);
for (int i = 0; i < datasize; i++) {
int index = rng.nextInt(datasize);
bag.add(instances.get(index));
sampled[index] = true;
}
return new Data(dataset, bag);
}
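The sampling step can be tried out in isolation. The sketch below repeats the same loop on a list of integers and counts the entries of sampled that stay false. BaggingSketch and countOutOfBag are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of bootstrap sampling with a "sampled" marker array. */
class BaggingSketch {

  /** Draws data.size() elements with replacement and marks every index drawn at least once. */
  static <T> List<T> bagging(List<T> data, Random rng, boolean[] sampled) {
    int datasize = data.size();
    List<T> bag = new ArrayList<>(datasize);
    for (int i = 0; i < datasize; i++) {
      int index = rng.nextInt(datasize);
      bag.add(data.get(index));
      sampled[index] = true;
    }
    return bag;
  }

  /** Counts the indices that were never drawn, i.e. the out-of-bag elements. */
  static int countOutOfBag(boolean[] sampled) {
    int oob = 0;
    for (boolean s : sampled) {
      if (!s) {
        oob++;
      }
    }
    return oob;
  }

  public static void main(String[] args) {
    List<Integer> data = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
      data.add(i);
    }
    boolean[] sampled = new boolean[data.size()];
    List<Integer> bag = bagging(data, new Random(7), sampled);
    // For large N, about 1/e (roughly 37%) of the elements are expected to be out of bag.
    System.out.println("bag=" + bag.size() + " outOfBag=" + countOutOfBag(sampled));
  }
}
```

The bag always has the same size as the input, but it contains duplicates, so part of the input never appears in it.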
Next, a second random forest is grown, this time with m=1.
m is the number of attributes to select randomly at each node.
// grow a forest with m=1
treeBuilder.setM(1);
time = System.currentTimeMillis();
log.info("Growing a forest with m=1");
DecisionForest forestOne = forestBuilder.build(nbtrees);
sumTimeOne += System.currentTimeMillis() - time;
numNodesOne += forestOne.nbNodes();
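1.4 Testing the accuracy of the forests
Both forests (the one grown with m = log2(M) + 1 and the one grown with m = 1) are tested for accuracy.
First the labels of the test set are extracted into testLabels.
Then the per-tree prediction array predictions is defined. It is two-dimensional: predictions[i][j] is the prediction of the j-th tree in the forest for the i-th test instance.
forestM is the forest grown with m = log2(M) + 1; it is run over the test set first to fill predictions.
Next comes the sumPredictions array, whose length is the size of the test set; sumPredictions[i] holds the sum of the predictions of all trees in the forest for the i-th instance.
Note that the code below adds the raw per-tree predictions directly. No per-tree weight appears in it, so sumPredictions[i] is a plain sum of the predicted values rather than a weighted vote.
The ErrorEstimate class is defined in org.apache.mahout.classifier.df.
Its job is to compute the error rate, that is, the fraction of instances for which sumPredictions differs from testLabels.
Looking at its implementation, there is one special case: when the forest has no prediction for an instance, that instance is skipped.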
if (predictions[index] == -1) {
continue; // instance not classified
}
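The behaviour described above can be written out as a small standalone function. This is my own reconstruction from the description and the snippet, not a copy of Mahout's ErrorEstimate; the class name ErrorRateSketch and the sample arrays are made up.

```java
/** Hypothetical reconstruction of an error rate that skips unclassified instances. */
class ErrorRateSketch {

  /** Fraction of classified instances whose prediction differs from the label. */
  static double errorRate(double[] labels, double[] predictions) {
    if (labels.length != predictions.length) {
      throw new IllegalArgumentException("labels.length != predictions.length");
    }
    int errors = 0;
    int classified = 0;
    for (int index = 0; index < labels.length; index++) {
      if (predictions[index] == -1) {
        continue; // instance not classified, does not count either way
      }
      if (predictions[index] != labels[index]) {
        errors++;
      }
      classified++;
    }
    // If nothing was classified this is 0.0 / 0, which yields NaN.
    return (double) errors / classified;
  }

  public static void main(String[] args) {
    double[] labels = {0, 1, 2, 1};
    double[] predictions = {0, -1, 1, 1};
    // Three instances are classified and one of them is wrong, so the rate is 1/3.
    System.out.println(errorRate(labels, predictions));
  }
}
```

Skipping the unclassified instances means the denominator is the number of classified instances, not the size of the test set.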
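Finally the error rate is added to sumTestErrM. Accuracy tests are usually repeated N times and averaged, and the same happens here: the final run performs N iterations and reports the mean.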
// compute the test set error (Selection Error), and mean tree error (One Tree Error),
double[] testLabels = test.extractLabels();
double[][] predictions = new double[test.size()][];
forestM.classify(test, predictions);
double[] sumPredictions = new double[test.size()];
Arrays.fill(sumPredictions, 0.0);
for (int i = 0; i < predictions.length; i++) {
for (int j = 0; j < predictions[i].length; j++) {
sumPredictions[i] += predictions[i][j];
}
}
sumTestErrM += ErrorEstimate.errorRate(testLabels, sumPredictions);
forestOne.classify(test, predictions);
Arrays.fill(sumPredictions, 0.0);
for (int i = 0; i < predictions.length; i++) {
for (int j = 0; j < predictions[i].length; j++) {
sumPredictions[i] += predictions[i][j];
}
}
sumTestErrOne += ErrorEstimate.errorRate(testLabels, sumPredictions);
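The classify function of the DecisionForest class makes predictions for data.
It is implemented as follows:
First it allocates predictions[index], with one slot per tree in the forest.
Then every tree makes a prediction and fills in its slot.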
/**
* Classifies the data and calls callback for each classification
*/
public void classify(Data data, double[][] predictions) {
Preconditions.checkArgument(data.size() == predictions.length, "predictions.length must be equal to data.size()");
if (data.isEmpty()) {
return; // nothing to classify
}
int treeId = 0;
for (Node tree : trees) {
for (int index = 0; index < data.size(); index++) {
if (predictions[index] == null) {
predictions[index] = new double[trees.size()];
}
predictions[index][treeId] = tree.classify(data.get(index));
}
treeId++;
}
}
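Once predictions is filled, each row holds one value per tree, and the rows still have to be combined into a single value per instance. The sketch below contrasts the row sum used in the example code with a majority vote. The majority vote is a common alternative that I add here only for comparison; it is not what the quoted example does, and AggregationSketch is a hypothetical class.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical comparison of two ways to combine the per-tree predictions of one instance. */
class AggregationSketch {

  /** Plain sum of one row, as done for sumPredictions in the example. */
  static double sum(double[] treePredictions) {
    double total = 0.0;
    for (double p : treePredictions) {
      total += p;
    }
    return total;
  }

  /** Most frequent value in one row; ties go to the smaller value, an empty row gives -1. */
  static double majorityVote(double[] treePredictions) {
    Map<Double, Integer> counts = new HashMap<>();
    for (double p : treePredictions) {
      counts.merge(p, 1, Integer::sum);
    }
    double best = -1;
    int bestCount = 0;
    for (Map.Entry<Double, Integer> e : counts.entrySet()) {
      boolean moreVotes = e.getValue() > bestCount;
      boolean tieButSmaller = e.getValue() == bestCount && e.getKey() < best;
      if (moreVotes || tieButSmaller) {
        best = e.getKey();
        bestCount = e.getValue();
      }
    }
    return best;
  }

  public static void main(String[] args) {
    double[] row = {2, 2, 1}; // three trees predicting label codes 2, 2 and 1
    // prints: sum=5.0 vote=2.0
    System.out.println("sum=" + sum(row) + " vote=" + majorityVote(row));
  }
}
```

With label codes as predictions, the sum is generally not itself a label code, whereas the vote always is.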
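The Run part
The execution proceeds as follows.
Because this code is eventually packaged into a jar and run from the command line, its input arguments need to be handled.
The arguments are:
- the data set, data
- the data set description, dataset
- the number of trees in the forest, nbTrees
- the number of repetitions, iterations; although "iteration" suggests an iterative process, nothing is carried over from one run to the next, so I think of it as a plain loop
- the help option, help, which prints the usage message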
Option dataOpt = obuilder.withLongName("data").withShortName("d").withRequired(true).withArgument(
abuilder.withName("path").withMinimum(1).withMaximum(1).create()).withDescription("Data path").create();
Option datasetOpt = obuilder.withLongName("dataset").withShortName("ds").withRequired(true).withArgument(
abuilder.withName("dataset").withMinimum(1).withMaximum(1).create()).withDescription("Dataset path")
.create();
Option nbtreesOpt = obuilder.withLongName("nbtrees").withShortName("t").withRequired(true).withArgument(
abuilder.withName("nbtrees").withMinimum(1).withMaximum(1).create()).withDescription(
"Number of trees to grow, each iteration").create();
Option nbItersOpt = obuilder.withLongName("iterations").withShortName("i").withRequired(true)
.withArgument(abuilder.withName("numIterations").withMinimum(1).withMaximum(1).create())
.withDescription("Number of times to repeat the test").create();
Option helpOpt = obuilder.withLongName("help").withDescription("Print out help").withShortName("h")
.create();
Group group = gbuilder.withName("Options").withOption(dataOpt).withOption(datasetOpt).withOption(
nbItersOpt).withOption(nbtreesOpt).withOption(helpOpt).create();
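Next the input is parsed to obtain the path of the data file, the path of the data description file, the number of trees, and the number of repetitions.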
Path dataPath;
Path datasetPath;
int nbTrees;
int nbIterations;
try {
Parser parser = new Parser();
parser.setGroup(group);
CommandLine cmdLine = parser.parse(args);
if (cmdLine.hasOption("help")) {
CommandLineUtil.printHelp(group);
return -1;
}
String dataName = cmdLine.getValue(dataOpt).toString();
String datasetName = cmdLine.getValue(datasetOpt).toString();
nbTrees = Integer.parseInt(cmdLine.getValue(nbtreesOpt).toString());
nbIterations = Integer.parseInt(cmdLine.getValue(nbItersOpt).toString());
dataPath = new Path(dataName);
datasetPath = new Path(datasetName);
} catch (OptionException e) {
log.error("Error while parsing options", e);
CommandLineUtil.printHelp(group);
return -1;
}
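Then the data is loaded. This part gave me some trouble: when I ran this code on its own, after I had already generated the data and dataset files, it failed with an error. The dataset file stores its description as JSON, so the JSON has to be parsed, and that requires the corresponding library jar.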
// load the data
FileSystem fs = dataPath.getFileSystem(new Configuration());
Dataset dataset = Dataset.load(getConf(), datasetPath);
Data data = DataLoader.loadData(dataset, fs, dataPath);
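Once the data is loaded, the iterations are run.
The code comment says M is the number of inputs, which I find imprecise: the code shows M = data.getDataset().nbAttributes(), so it is simply the number of attributes.
That raises a question: does the attribute count include the label? From reading the code, it does.
The forests are then grown nbIterations times, the result of each run is accumulated, and at the end the following are printed:
mean error rate (m = log2(M) + 1)
mean error rate (m = 1)
mean forest build time (m = log2(M) + 1)
mean forest build time (m = 1)
mean total number of nodes over all trees in the forest (m = log2(M) + 1)
mean total number of nodes over all trees in the forest (m = 1)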
// take m to be the first integer less than log2(M) + 1, where M is the
// number of inputs
int m = (int) Math.floor(FastMath.log(2.0, data.getDataset().nbAttributes()) + 1);
Random rng = RandomUtils.getRandom();
for (int iteration = 0; iteration < nbIterations; iteration++) {
log.info("Iteration {}", iteration);
runIteration(rng, data, m, nbTrees);
}
log.info("********************************************");
log.info("Random Input Test Error : {}", sumTestErrM / nbIterations);
log.info("Single Input Test Error : {}", sumTestErrOne / nbIterations);
log.info("Mean Random Input Time : {}", DFUtils.elapsedTime(sumTimeM / nbIterations));
log.info("Mean Single Input Time : {}", DFUtils.elapsedTime(sumTimeOne / nbIterations));
log.info("Mean Random Input Num Nodes : {}", numNodesM / nbIterations);
log.info("Mean Single Input Num Nodes : {}", numNodesOne / nbIterations);
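The formula for m is easy to check by hand. The sketch below evaluates floor(log2(M) + 1) with the standard Math class instead of FastMath; MtrySketch and mFor are hypothetical names. For the glass data used later, the result is 4 whether M is taken as 9, 10 or 11, which agrees with the m=4 that appears in the run log.

```java
/** Hypothetical helper that evaluates m = floor(log2(M) + 1) with the standard library. */
class MtrySketch {

  /** Number of attributes to try at each node for a data set with nbAttributes attributes. */
  static int mFor(int nbAttributes) {
    // log2(x) is computed as ln(x) / ln(2)
    return (int) Math.floor(Math.log(nbAttributes) / Math.log(2.0) + 1);
  }

  public static void main(String[] args) {
    // prints m=4 for each of the three values
    for (int attributes : new int[] {9, 10, 11}) {
      System.out.println("M=" + attributes + " m=" + mFor(attributes));
    }
  }
}
```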
Running it
Data
glass: http://archive.ics.uci.edu/ml/datasets/Glass+Identification
A few rows of the data:
1,1.52101,13.64,4.49,1.10,71.78,0.06,8.75,0.00,0.00,1
2,1.51761,13.89,3.60,1.36,72.73,0.48,7.83,0.00,0.00,1
3,1.51618,13.53,3.55,1.54,72.99,0.39,7.78,0.00,0.00,1
4,1.51766,13.21,3.69,1.29,72.61,0.57,8.22,0.00,0.00,1
5,1.51742,13.27,3.62,1.24,73.08,0.55,8.07,0.00,0.00,1
6,1.51596,12.79,3.61,1.62,72.97,0.64,8.07,0.00,0.26,1
The command that generates the dataset description is:
/home/user/mahout-distribution-0.9# $HADOOP_HOME/bin/hadoop jar mahout-core-0.9-job.jar org.apache.mahout.classifier.df.tools.Describe -p /user/glass.data -f /user/glass.info -d I 9 N L
15/08/24 06:34:02 INFO tools.Describe: Generating the descriptor…
15/08/24 06:34:03 INFO tools.Describe: generating the dataset…
15/08/24 06:34:03 INFO tools.Describe: storing the dataset description
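-p path of the input data
-f path of the output dataset description
-d the data descriptor, here I 9 N L
The descriptor is explained as follows:
- the first column is the sample ID
- the next 9 columns are the sample's attributes, all of Numerical type
- the last column is the sample's class label
- so it is written as [I, 9, N, L]
- I means ignore
- N stands for Numerical, and L stands for Label
- non-numerical attributes are allowed too and are written as C (short for Categorical)
- the 9 means that the following N is repeated nine times
- if the attributes are [Ignore,Numerical,Numerical,Categorical,Numerical,Categorical,Categorical,Label], the --descriptor argument should be written as [I,2,N,C,N,2,C,L].
The descriptor explanation above is taken from http://running.iteye.com/blog/923483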
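The repeat-count rule can be illustrated with a few lines of code. The sketch below expands a space-separated descriptor into one token per column. DescriptorSketch and expand are hypothetical names of mine, and this is only my reading of the rule described above, not Mahout's own parser.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical expansion of a descriptor such as "I 9 N L" into one token per column. */
class DescriptorSketch {

  /** A number applies to the token that follows it and repeats that token that many times. */
  static List<String> expand(String descriptor) {
    List<String> columns = new ArrayList<>();
    int repeat = 1;
    for (String token : descriptor.trim().split("\\s+")) {
      if (token.isEmpty()) {
        continue; // happens only for an empty descriptor
      }
      if (token.matches("\\d+")) {
        repeat = Integer.parseInt(token);
      } else {
        for (int i = 0; i < repeat; i++) {
          columns.add(token);
        }
        repeat = 1;
      }
    }
    return columns;
  }

  public static void main(String[] args) {
    // prints: [I, N, N, N, N, N, N, N, N, N, L]
    System.out.println(expand("I 9 N L"));
    // prints: [I, N, N, C, N, C, C, L]
    System.out.println(expand("I 2 N C N 2 C L"));
  }
}
```

The first call yields 11 columns, one per field of a row in glass.data.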
The data description file looks like this; it is stored as JSON:
/home/user/mahout-distribution-0.9# hadoop dfs -cat /user/glass.info
[{"values":null,"label":false,"type":"ignored"},{"values":null,"label":false,"type":"numerical"},{"values":null,"label":false,"type":"numerical"},{"values":null,"label":false,"type":"numerical"},{"values":null,"label":false,"type":"numerical"},{"values":null,"label":false,"type":"numerical"},{"values":null,"label":false,"type":"numerical"},{"values":null,"label":false,"type":"numerical"},{"values":null,"label":false,"type":"numerical"},{"values":null,"label":false,"type":"numerical"},{"values":["1","2","3","5","6","7"],"label":true,"type":"categorical"}]
Note that the data file must be in HDFS; otherwise the following error occurs:
15/08/24 06:32:11 INFO tools.Describe: Generating the descriptor…
15/08/24 06:32:12 INFO tools.Describe: generating the dataset…
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /home/user/data/glass.data
Also, the package of a class may differ between Mahout versions. For example, when I run
/home/user/mahout-distribution-0.9# $HADOOP_HOME/bin/hadoop jar mahout-core-0.9-job.jar org.apache.mahout.df.tools.Describe -p /home/user/data/glass.data -f /home/user/data/glass.info -d I 9 N L
I get the error
Exception in thread "main" java.lang.ClassNotFoundException: org.apache.mahout.df.tools.Describe
So look up where the Describe class lives in the core folder of your own Mahout version.
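Finally, run the example. The log shows 10 iterations in sequence (Iteration 0 through 9); each iteration grows one forest with m=4 and one with m=1, and the averaged results are printed at the end.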
/home/user/mahout-distribution-0.9# hadoop jar mahout-examples-0.9-job.jar org.apache.mahout.classifier.df.BreimanExample -d /user/glass.data -ds /user/glass.info -i 10 -t 100
15/08/24 07:14:03 INFO df.BreimanExample: Iteration 0
15/08/24 07:14:03 INFO df.BreimanExample: Splitting the data
15/08/24 07:14:03 INFO df.BreimanExample: Growing a forest with m=4
15/08/24 07:14:03 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:03 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:03 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:03 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:03 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:03 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:04 INFO df.BreimanExample: Growing a forest with m=1
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:04 INFO df.BreimanExample: Iteration 1
15/08/24 07:14:04 INFO df.BreimanExample: Splitting the data
15/08/24 07:14:04 INFO df.BreimanExample: Growing a forest with m=4
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:04 INFO df.BreimanExample: Growing a forest with m=1
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:04 INFO df.BreimanExample: Iteration 2
15/08/24 07:14:04 INFO df.BreimanExample: Splitting the data
15/08/24 07:14:04 INFO df.BreimanExample: Growing a forest with m=4
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:04 INFO df.BreimanExample: Growing a forest with m=1
15/08/24 07:14:04 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:05 INFO df.BreimanExample: Iteration 3
15/08/24 07:14:05 INFO df.BreimanExample: Splitting the data
15/08/24 07:14:05 INFO df.BreimanExample: Growing a forest with m=4
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:05 INFO df.BreimanExample: Growing a forest with m=1
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:05 INFO df.BreimanExample: Iteration 4
15/08/24 07:14:05 INFO df.BreimanExample: Splitting the data
15/08/24 07:14:05 INFO df.BreimanExample: Growing a forest with m=4
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:05 INFO df.BreimanExample: Growing a forest with m=1
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:05 INFO df.BreimanExample: Iteration 5
15/08/24 07:14:05 INFO df.BreimanExample: Splitting the data
15/08/24 07:14:05 INFO df.BreimanExample: Growing a forest with m=4
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:05 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:06 INFO df.BreimanExample: Growing a forest with m=1
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:06 INFO df.BreimanExample: Iteration 6
15/08/24 07:14:06 INFO df.BreimanExample: Splitting the data
15/08/24 07:14:06 INFO df.BreimanExample: Growing a forest with m=4
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:06 INFO df.BreimanExample: Growing a forest with m=1
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:06 INFO df.BreimanExample: Iteration 7
15/08/24 07:14:06 INFO df.BreimanExample: Splitting the data
15/08/24 07:14:06 INFO df.BreimanExample: Growing a forest with m=4
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:06 INFO df.BreimanExample: Growing a forest with m=1
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:06 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:07 INFO df.BreimanExample: Iteration 8
15/08/24 07:14:07 INFO df.BreimanExample: Splitting the data
15/08/24 07:14:07 INFO df.BreimanExample: Growing a forest with m=4
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:07 INFO df.BreimanExample: Growing a forest with m=1
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:07 INFO df.BreimanExample: Iteration 9
15/08/24 07:14:07 INFO df.BreimanExample: Splitting the data
15/08/24 07:14:07 INFO df.BreimanExample: Growing a forest with m=4
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:07 INFO df.BreimanExample: Growing a forest with m=1
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 10%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 20%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 30%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 40%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 50%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 60%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 70%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 80%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 90%
15/08/24 07:14:07 INFO ref.SequentialBuilder: Building 100%
15/08/24 07:14:07 INFO df.BreimanExample: **********************************
15/08/24 07:14:07 INFO df.BreimanExample: Random Input Test Error : 1.0
15/08/24 07:14:07 INFO df.BreimanExample: Single Input Test Error : 1.0
15/08/24 07:14:07 INFO df.BreimanExample: Mean Random Input Time : 0h 0m 0s 288
15/08/24 07:14:07 INFO df.BreimanExample: Mean Single Input Time : 0h 0m 0s 107
15/08/24 07:14:07 INFO df.BreimanExample: Mean Random Input Num Nodes : 6761
15/08/24 07:14:07 INFO df.BreimanExample: Mean Single Input Num Nodes : 11326