R Statistical Application Development: CART and Beyond
10 CART and Beyond

In the previous chapter, we studied CART as a powerful recursive partitioning method, useful for building (non-linear) models. Despite its overall generality, CART does have certain limitations that necessitate some enhancements. It is these extensions that form the crux of the final chapter of this book. For some technical reasons, we will focus solely on classification trees in this chapter. We will also briefly look at some limitations of the CART tool.

The first improvement that can be made to CART is provided by the bagging technique. In this technique, we build multiple trees on bootstrap samples drawn from the actual dataset. An observation is put through each of the trees, each tree predicts its class, and the observation is assigned to the class predicted by the majority of the trees. A different approach is provided by Random Forests, where a random pool of covariates, rather than only a random sample of observations, is considered. We finally consider another important enhancement of CART by using the boosting algorithms. The chapter will discuss the following:

- Cross-validation errors for CART
- The bootstrap aggregation (bagging) technique for CART
- Extending the CART with random forests
- A consolidation of the applications developed from Chapter 6 to Chapter 10, CART and Beyond

Improving CART

In the Another look at model assessment section of Chapter 8, we saw that the train + validate + test technique may be further enhanced by using the cross-validation technique. In the case of the linear regression model, we used the CVlm function from the DAAG package for the cross-validation of linear models. Cross-validation for logistic regression models may be carried out by using the CVbinary function from the same package.

Profs. Therneau and Atkinson created the rpart package, and detailed documentation of the entire package is available on the Web at http://www.mayo.edu/hsr/techrpt/61.pdf. Recall the slight improvement provided in the Pruning and other finer aspects of a tree section of the previous chapter. The two aspects considered there related to the complexity parameter cp and the minimum split criterion minsplit. Now, the problem of overfitting with CART may be reduced to an extent by using the cross-validation technique. In the ridge regression m
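
The role of cross-validation in curbing overfitting for CART can be illustrated with a minimal sketch using rpart. This is not the book's own code: the kyphosis dataset (shipped with rpart), the seed, and the cp, minsplit, and xval values are all assumptions chosen for illustration. The sketch grows a large classification tree, reads the cross-validated error of each candidate subtree from the fitted object's cptable, and prunes back to the cp value with the smallest cross-validated error.

```r
library(rpart)  # also ships the kyphosis example data used below

set.seed(101)   # cross-validation folds are random, so fix the seed

# Grow a deliberately large classification tree.
# cp, minsplit and xval are illustrative assumptions, not values from the text.
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis, method = "class",
             control = rpart.control(cp = 0.001, minsplit = 10, xval = 10))

# One row per candidate subtree; "xerror" is the cross-validated error
# expressed relative to the error of the root node.
cp_table <- fit$cptable

# Select the complexity parameter with the smallest cross-validated error.
best_row <- which.min(cp_table[, "xerror"])
best_cp  <- cp_table[best_row, "CP"]

# Prune the large tree back to the selected complexity.
pruned <- prune(fit, cp = best_cp)
printcp(pruned)
```

Because the folds are drawn at random, the selected cp can change with the seed. A common alternative is to pick the simplest tree whose xerror lies within one xstd of the minimum.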