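My undergraduate thesis project involves training a prediction model with machine learning methods. Linear regression, SVM, and random forest (RF) all performed poorly, so I needed a simple neural network as a comparison experiment. Without a deep understanding of how NNs are optimized, I called the interfaces provided by R packages directly. I am jotting down some notes here for later reflection and improvement.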
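I mainly used the nnet, neuralnet, and h2o packages; the specific modeling, prediction, and tuning methods can all be found in their manuals. nnet fits a simple single-hidden-layer network, and neuralnet is used here with a single hidden layer as well (its `hidden` argument also accepts a vector for several layers). h2o provides a DNN method.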
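No h2o code appears in these notes, so here is a minimal sketch of what a DNN fit with h2o might look like. The helper name `fit_h2o_dnn`, the two hidden layers of 32 units, and the 50 epochs are all assumptions for illustration, not the settings used in the thesis. It also assumes the h2o package and a Java runtime are installed.

```r
# Hypothetical helper (not from the original notes): fit a small DNN with h2o
# and return the predictions for the test rows as a plain numeric vector.
fit_h2o_dnn <- function(train_df, test_df, response = "Effort",
                        hidden = c(32, 32), epochs = 50) {
  h2o::h2o.init()                       # start or connect to a local H2O cluster
  train_hex <- h2o::as.h2o(train_df)    # copy the data frames into H2O
  test_hex <- h2o::as.h2o(test_df)
  predictors <- setdiff(names(train_df), response)
  model <- h2o::h2o.deeplearning(
    x = predictors, y = response,
    training_frame = train_hex,
    hidden = hidden,                    # assumed layer sizes
    epochs = epochs                     # assumed number of passes over the data
  )
  preds <- h2o::h2o.predict(model, test_hex)
  as.data.frame(preds)$predict          # regression output column is "predict"
}
```

Given a training data frame that contains the `Effort` column and a test data frame with the predictors, a call would look like `fit_h2o_dnn(train_set, test)`.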
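library(nnet)

# Columns 1-35 are used as predictors, column 36 is Effort.
data <- read.csv("tomcat_done_1.csv", header = TRUE)

total_size <- 363
test_size <- 90

# Randomly pick the training rows; the remaining rows form the test set.
train <- sample(1:dim(data)[1], total_size - test_size)
train_set <- data[train, ]
test <- data[-train, 1:35]
test_effort <- data[-train, 36]   # observed Effort, for comparison with preds

count <- 0

# One hidden layer with 9 units, weight decay 0.015, linear output for regression.
m <- nnet(Effort ~ ., train_set, size = 9, decay = 0.015, maxit = 10,
          linout = TRUE, trace = FALSE, MaxNWts = 8000)

preds <- predict(m, test)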
library(neuralnet)

# Columns 1-35 are used as predictors, column 36 is Effort.
data <- read.csv("tomcat_done_2.csv", header = TRUE)

total_size <- 363
test_size <- 90

# Randomly pick the training rows; the remaining rows form the test set.
train <- sample(1:dim(data)[1], total_size - test_size)
train_set <- data[train, ]
test <- data[-train, 1:35]
test_effort <- data[-train, 36]   # observed Effort, for comparison with preds

count <- 0

# One hidden layer with 2 units; all 35 predictors are spelled out in the formula.
m <- neuralnet(Effort ~ CountDeclClass + CountDeclClassMethod + CountDeclClassVariable +
                 CountDeclFunction + CountDeclInstanceMethod + CountDeclInstanceVariable +
                 CountDeclMethod + CountDeclMethodDefault + CountDeclMethodPrivate +
                 CountDeclMethodProtected + CountDeclMethodPublic + CountLine +
                 CountLineBlank + CountLineCode + CountLineCodeDecl + CountLineCodeExe +
                 CountLineComment + CountSemicolon + CountStmt + CountStmtDecl + CountStmtExe +
                 SumCyclomatic + SumCyclomaticModified + SumCyclomaticStrict + SumEssential +
                 MaxCyclomatic + MaxCyclomaticModified + MaxCyclomaticStrict + MaxEssential +
                 MaxNesting + AvgCyclomatic + AvgCyclomaticModified + AvgCyclomaticStrict +
                 AvgEssential + RatioCommentToCode,
               data = train_set, hidden = 2)

# compute() returns a list; the predicted values are in its net.result element.
preds <- compute(m, test)$net.result
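Both snippets stop at prediction and never compare the predictions with `test_effort`. A minimal sketch of that comparison follows. The helper name `regression_errors` and the choice of MAE and RMSE are assumptions, since these notes do not say which metric the thesis used.

```r
# Hypothetical helper: mean absolute error and root mean squared error
# between observed values and predictions.
regression_errors <- function(actual, predicted) {
  actual <- as.numeric(actual)
  predicted <- as.numeric(predicted)   # flattens a one-column prediction matrix
  stopifnot(length(actual) == length(predicted))
  residual <- actual - predicted
  c(mae = mean(abs(residual)), rmse = sqrt(mean(residual^2)))
}

# Toy numbers, only to show the call.
toy_errors <- regression_errors(c(1, 2, 3), c(1, 2, 5))
print(toy_errors)
```

For the nnet model this would be called as `regression_errors(test_effort, preds)`. For neuralnet, the predicted values sit in the `net.result` element of what `compute()` returns, so that element is what should be passed in.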
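The data must be preprocessed into the format the model expects before it is fed in; for example, some packages require the label to be mapped to [0,1]. Read the manuals and the original papers to understand the optimization methods. Don't forget this!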
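As an illustration of mapping values to [0,1], here is a minimal min-max scaling sketch in base R. The helper names `minmax_fit`, `minmax_apply`, and `minmax_invert` are hypothetical, and the numbers are toy values. The idea is to take the range from the training data only, reuse it on the test data, and invert it on the predictions to get back to the original units.

```r
# Hypothetical helpers for min-max scaling of a numeric vector.
# Note: a constant vector (max == min) would divide by zero and needs special handling.
minmax_fit <- function(x) list(min = min(x), max = max(x))
minmax_apply <- function(x, p) (x - p$min) / (p$max - p$min)
minmax_invert <- function(z, p) z * (p$max - p$min) + p$min

effort <- c(10, 20, 40)                 # toy label values
p <- minmax_fit(effort)                 # range learned from the training labels
scaled <- minmax_apply(effort, p)       # 0, 1/3, 1
restored <- minmax_invert(scaled, p)    # back to 10, 20, 40
```

Keeping the fitted range in `p` is a deliberate choice: predictions made on the scaled label can then be passed through `minmax_invert` before they are compared with the observed values.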
Time: 2024-10-31 09:29:08