Cluster analysis using R (how to do cluster analysis with R)
CLUSTER ANALYSIS

Steven M. Holland
Department of Geology, University of Georgia, Athens, GA 30602-2501
January 2006

Introduction

Cluster analysis includes a broad suite of techniques designed to find groups of similar items within a data set. Partitioning methods divide the data set into a number of groups pre-designated by the user. Hierarchical cluster methods produce a hierarchy of clusters from small clusters of very similar items to large clusters that include more dissimilar items. Hierarchical methods usually produce a graphical output known as a dendrogram or tree that shows this hierarchical clustering structure. Some hierarchical methods are divisive, that progressively divide the one large cluster comprising all of the data into two smaller clusters and repeat this process until all clusters have been divided. Other hierarchical methods are agglomerative and work in the opposite direction by first finding the clusters of the most similar items and progressively adding less similar items until all items have been included into a single large cluster. Cluster analysis can be run in the Q-mode in which clusters of samples are sought or in the R-mode, where clusters of variables are desired.

Hierarchical methods are particularly useful in that they are not limited to a pre-determined number of clusters and can display similarity of samples across a wide range of scales. Agglomerative hierarchical methods are particularly common in the natural sciences and they will be the focus of this lecture.

Computation

As with other multivariate methods, the starting point is a data matrix consisting of n rows of samples and p columns of variables, called an n x p (n by p) matrix. Hierarchical agglomerative cluster analysis begins by calculating a matrix of distances among items in this data matrix. Although cluster analysis can be run in the R-mode when seeking relationships among variables, this discussion will assume that a Q-mode analysis is being run. In Q-mode analysis, the distance matrix is a square, symmetric matrix of size n x n that expresses all possible pairwise distances among samples. Many distance metrics can be used.

Before clustering has begun, each sample is considered a group, albeit of a single sample. Clustering begins by finding the two groups that are most similar, based on the distance matrix, and merging them in
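
As a minimal sketch of the computation described above, the R code below builds a small made-up data matrix, computes a Q-mode distance matrix with dist(), locates the closest pair of samples (the pair that would be merged first), and then runs a full agglomerative analysis with hclust(). The data values, the Euclidean metric, the average linkage, and the choice of three groups are all assumptions made for illustration; the text does not specify any of them.

```r
# Hypothetical toy data: n = 5 samples (rows) by p = 2 variables (columns).
x <- matrix(c(1.0, 1.0,
              1.2, 0.9,
              5.0, 5.1,
              5.2, 4.9,
              9.0, 0.5),
            ncol = 2, byrow = TRUE,
            dimnames = list(paste0("s", 1:5), c("v1", "v2")))

# Q-mode analysis: distances among samples (rows).
# Euclidean distance is an assumption; many other metrics could be used.
d <- dist(x, method = "euclidean")

# Expand to the full square, symmetric n x n matrix described in the text.
dm <- as.matrix(d)

# Before clustering, every sample is its own group, so the first merge
# joins the pair with the smallest off-diagonal distance.
dm_offdiag <- dm
diag(dm_offdiag) <- Inf   # ignore the zero self-distances
idx <- which(dm_offdiag == min(dm_offdiag), arr.ind = TRUE)[1, ]
first_pair <- sort(rownames(dm)[idx])

# hclust() repeats the find-and-merge step until one cluster remains.
# The "average" linkage is an assumption, not taken from the text.
hc <- hclust(d, method = "average")

# Cut the tree into a chosen number of groups (k = 3 is arbitrary here).
groups <- cutree(hc, k = 3)
```

Calling plot(hc) afterwards draws the dendrogram for this toy example. Swapping the method argument of dist() or hclust() is one way to explore how the choice of distance metric and linkage changes the resulting tree.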