An Empirical Comparison of Supervised Learning Algorithms

Rich Caruana (caruana@cs.cornell.edu)
Alexandru Niculescu-Mizil (alexn@cs.cornell.edu)
Department of Computer Science, Cornell University, Ithaca, NY 14853 USA

Abstract

A number of supervised learning methods have been introduced in the last decade. Unfortunately, the last comprehensive empirical evaluation of supervised learning was the Statlog Project in the early 90's. We present a large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps. We also examine the effect that calibrating the models via Platt Scaling and Isotonic Regression has on their performance. An important aspect of our study is the use of a variety of performance criteria to evaluate the learning methods.

1. Introduction

There are few comprehensive empirical studies comparing learning algorithms. STATLOG is perhaps the best known study (King et al., 1995). STATLOG was very comprehensive when it was perform[...]

This paper presents results of a large-scale empirical comparison of ten supervised learning algorithms using eight performance criteria. We evaluate the performance of SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps on eleven binary classification problems using a variety of performance metrics: accuracy, F-score, Lift, ROC Area, average precision, precision/recall break-even point, squared error, and cross-entropy. For each algorithm we examine common variations, and thoroughly explore the space of parameters. For example, we compare ten decision tree styles, neural nets of many sizes, SVMs with many kernels, etc. Because some of the performance metrics we examine interpret model predictions as probabilities and models such as SVMs are not designed to predict probabilities, we compare the performance of each algorithm both before and after calibrating its predictions with Platt Scaling and Isotonic Regression.

The empirical results are surprising. To preview: prior to calibration, bagged trees, random forests, and neural nets give the best average performance across all