The R Book: Multivariate Statistics
23 Tree Models

Tree models are computationally intensive methods that are used in situations where there are many explanatory variables and we would like guidance about which of them to include in the model. Often there are so many explanatory variables that we simply could not test them all, even if we wanted to invest the huge amount of time that would be necessary to complete such a complicated multiple regression exercise. Tree models are particularly good at tasks that might in the past have been regarded as the realm of multivariate statistics (e.g. classification problems). The great virtues of tree models are as follows:

- They are very simple.
- They are excellent for initial data inspection.
- They give a very clear picture of the structure of the data.
- They provide a highly intuitive insight into the kinds of interactions between variables.

It is best to begin by looking at a tree model in action, before thinking about how it works. Here is an air pollution example that we might want to analyze as a multiple regression. We begin by using tree, then illustrate the more modern function rpart (which stands for ‘recursive partitioning’):

    install.packages("tree")
    library(tree)
    Pollute <- read.table("c:\\temp\\Pollute.txt", header=T)
    attach(Pollute)
    names(Pollute)
    [1] "Pollution"  "Temp"       "Industry"   "Population" "Wind"
    [6] "Rain"       "Wet.days"
    model <- tree(Pollute)
    plot(model)
    text(model)

The R Book, Second Edition. Michael J. Crawley. © 2013 John Wiley & Sons, Ltd. Published 2013 by John Wiley & Sons, Ltd.

[Figure: tree diagram for the Pollute data. Split labels: Industry < 748, Population < 190, Wet.days < 108, Temp < 59.35, Wind < 9.65. Leaf values: 67.00, 43.43, 12.00, 15.00, 33.88, 23.00.]

You follow a path from the top of the tree (called, in defiance of gravity, the root) and proceed to one of the terminal nodes (called a leaf) by following a succession of rules (called splits). The numbers at the tips of the leaves are the mean values in that subset of the data (mean SO2 concentration in this case). The details are explained below.

23.1 Background

The model is fitted using binary recursive partitioning, whereby the data are successively split along coordinate axes of the explanatory variables so that, at any node, the split which maximally distinguishes the response variable in the left and the right branches is selected. Splitting continues until nodes are pure or the data are too sparse (fewer than six ca
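
The split-selection step described under Background can be sketched for a single numeric explanatory variable. This is a minimal illustration, not the code used inside tree or rpart: the helper name best_split is hypothetical, the criterion (smallest summed within-branch sum of squares) is an assumed reading of "maximally distinguishes the response variable" for a continuous response, and the data are synthetic values invented for the example.

```r
# Hypothetical helper (not part of tree or rpart): find the single binary
# split of a numeric x that best separates the response y.
# Assumed criterion: minimise the sum of the two within-branch sums of squares.
best_split <- function(x, y) {
  xs <- sort(unique(x))
  # Candidate thresholds are the midpoints between adjacent distinct x values,
  # so both branches always contain at least one case
  cuts <- (head(xs, -1) + tail(xs, -1)) / 2
  ss <- function(v) sum((v - mean(v))^2)
  dev <- sapply(cuts, function(cut) ss(y[x < cut]) + ss(y[x >= cut]))
  list(threshold = cuts[which.min(dev)], deviance = min(dev))
}

# Synthetic data, invented purely for illustration
x <- c(1, 2, 3, 4, 10, 11, 12, 13)
y <- c(5, 6, 5, 6, 20, 21, 20, 21)

res <- best_split(x, y)
res$threshold   # 7: the midpoint of the gap between the two clusters
res$deviance    # 2: each branch contributes a sum of squares of 1
```

A full tree would apply the same search to every explanatory variable at a node, take the best variable and threshold overall, and then repeat the procedure within each of the two resulting subsets.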