资源描述:
《k-means clustering - example - usc upstate facultyk-均值聚类样本学院usc北部》由会员上传分享,免费在线阅读,更多相关内容在行业资料-天天文库。
1、K-MeansClusteringExampleK-MeansClustering–ExampleWerecallfromthepreviouslecture,thatclusteringallowsforunsupervisedlearning.Thatis,themachine/softwarewilllearnonitsown,usingthedata(learningset),andwillclassifytheobjectsintoaparticularclass–forexample,ifourclass(decision)attributeistumorTyp
2、eanditsvaluesare:malignant,benign,etc.-thesewillbetheclasses.Theywillberepresentedbycluster1,cluster2,etc.However,theclassinformationisneverprovidedtothealgorithm.Theclassinformationcanbeusedlateron,toevaluatehowaccuratelythealgorithmclassifiedtheobjects.CurvatureTextureBloodConsumpTumorTy
3、pex10.81.2ABenignx20.751.4BBenignx30.230.4DMalignantx4..0.230.5DMalignantCurvatureTextureBloodConsumpTumorTypex10.81.2ABenignx20.751.4BBenignx30.230.4DMalignantx4..0.230.5DMalignant(learningset)CurvatureTextureBloodConsump0.80.231.20.4ABD.x1Thewaywedothat,isbyplottingtheobjectsfromthedatab
4、aseintospace.Eachattributeisonedimension:CurvatureTextureBloodConsump0.80.231.20.4ABD.......Cluster1benignCluster2malignant.Afteralltheobjectsareplotted,wewillcalculatethedistancebetweenthem,andtheonesthatareclosetoeachother–wewillgroupthemtogether,i.e.placetheminthesamecluster.8K-MeansClu
5、steringExampleWiththeK-Meansalgorithm,werecallitworksasfollows:8K-MeansClusteringExampleExampleProblem:Clusterthefollowingeightpoints(with(x,y)representinglocations)intothreeclustersA1(2,10)A2(2,5)A3(8,4)A4(5,8)A5(7,5)A6(6,4)A7(1,2)A8(4,9).Initialclustercentersare:A1(2,10),A4(5,8)andA7(1,2
6、).Thedistancefunctionbetweentwopointsa=(x1,y1)andb=(x2,y2)isdefinedas:ρ(a,b)=
7、x2–x1
8、+
9、y2–y1
10、.Usek-meansalgorithmtofindthethreeclustercentersaftertheseconditeration.Solution:Iteration1(2,10)(5,8)(1,2)PointDistMean1DistMean2DistMean3ClusterA1(2,10)A2(2,5)A3(8,4)A4(5,8)A5(7,5)A6(6,4)A7(1,2)A8
11、(4,9)Firstwelistallpointsinthefirstcolumnofthetableabove.Theinitialclustercenters–means,are(2,10),(5,8)and(1,2)-chosenrandomly.Next,wewillcalculatethedistancefromthefirstpoint(2,10)toeachofthethreemeans,byusingthedistancefunction:pointmean1x1,y1x2,y2(2,10)(2,1