Mach Learn
DOI 10.1007/s10994-006-6226-1

Extremely randomized trees

Pierre Geurts · Damien Ernst · Louis Wehenkel

Received: 14 June 2005 / Revised: 29 October 2005 / Accepted: 15 November 2005 / Published online: 2 March 2006
Springer Science + Business Media, Inc. 2006

Editor: Johannes Fürnkranz

P. Geurts · D. Ernst · L. Wehenkel
Department of Electrical Engineering and Computer Science, University of Liège, Liège, Sart-Tilman, B-28, B-4000 Belgium
P. Geurts e-mail: P.Geurts@ulg.ac.be
D. Ernst e-mail: Dernst@ulg.ac.be
L. Wehenkel e-mail: L.Wehenkel@ulg.ac.be

Abstract This paper proposes a new tree-based ensemble method for supervised classification and regression problems. It essentially consists of randomizing strongly both attribute and cut-point choice while splitting a tree node. In the extreme case, it builds totally randomized trees whose structures are independent of the output values of the learning sample. The strength of the randomization can be tuned to problem specifics by the appropriate choice of a parameter. We evaluate the robustness of the default choice of this parameter, and we also provide insight on how to adjust it in particular situations. Besides accuracy, the main strength of the resulting algorithm is computational efficiency. A bias/variance analysis of the Extra-Trees algorithm is also provided as well as a geometrical and a kernel characterization of the models induced.

Keywords Supervised learning · Decision and regression trees · Ensemble methods · Cut-point randomization · Bias/variance tradeoff · Kernel-based models

1. Introduction

In this article, we propose a new tree induction algorithm that selects splits, both attribute and cut-point, totally or partially at random. The idea that randomized decision trees could perform as well as classical ones appeared in an experimental study published in the late eighties (Mingers, 1989), even if later it was shown in a more carefully designed experiment that they were actually significantly less accurate than normal ones on many datasets (Buntine and Niblett, 1992). During the early nineties, the statistical notions of variance and its companion, the bias, were studied more systematically by machine learning researchers (see for example, Dietterich and Kong, 1995; Breiman, 1996a; Friedman, 1997), and the high varian