Data Mining Lab Report: Cluster Analysis of Data with Weka
ID: 28038954, size: 490.66 KB, pages: 13, date: 2018-12-07
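Undergraduate Lab Report (2)

Name:
School: School of Computer Science
Major: Information Management and Information Systems
Class:
Lab course: Data Mining
Lab date:
Instructor and title:
Lab grade:
Term: 2013-2014 academic year, semester 1
Printed by the Laboratory Management Center, Gansu Institute of Political Science and Law

Lab title: Cluster Analysis of Data with Weka
Group members:
Name:    Class:    Student ID:

I. Objectives
1. Understand and become familiar with the steps of K-means clustering.
2. Use the SimpleKMeans method provided by Weka to cluster a data file, gain a deeper understanding of the K-means algorithm, and identify problems in the experiment by observing and analyzing the results.

II. Environment
Eclipse on Windows 7

III. Content
Implement the K-means algorithm in Weka, observe the results, and analyze them.

IV. Procedure and Analysis
Part 1: Procedure
1. Add the data file
Open the Weka Explorer and click "Open file..." to open the ARFF data file used in this experiment, "auto93.arff".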
[Screenshot: Weka Explorer, Preprocess tab, with the Open dialog selecting auto93.arff (file type: Arff data files (*.arff)); the current relation shows 93 instances.]
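Step 1 loads an ARFF file. As a rough illustration of what that format contains, the sketch below counts the `@attribute` declarations and the data rows in a small ARFF text, which correspond to the Attributes and Instances figures the Explorer displays. The class name `ArffPeek`, the method `summarize`, and the three-row sample are hypothetical. This is not Weka's loader: it assumes a simple dense layout and ignores quoting and sparse rows.

```java
import java.util.Locale;

// Hypothetical helper: counts attributes and data rows in ARFF text.
// Not Weka's own loader; handles only the simplest dense ARFF layout.
class ArffPeek {
    /** Returns {attributeCount, instanceCount} for the given ARFF text. */
    static int[] summarize(String arffText) {
        int attributes = 0;
        int instances = 0;
        boolean inData = false;
        for (String line : arffText.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("%")) {
                continue; // skip blank lines and ARFF comment lines
            }
            String lower = trimmed.toLowerCase(Locale.ROOT);
            if (inData) {
                instances++; // after @data, each remaining line is one row
            } else if (lower.startsWith("@attribute")) {
                attributes++;
            } else if (lower.startsWith("@data")) {
                inData = true;
            }
        }
        return new int[] {attributes, instances};
    }

    public static void main(String[] args) {
        // Made-up sample for illustration; not taken from auto93.arff.
        String sample = String.join("\n",
            "@relation toy_cars",
            "@attribute Type {Small,Midsize}",
            "@attribute City_MPG numeric",
            "@data",
            "Small,25",
            "Midsize,19",
            "Small,29");
        int[] counts = summarize(sample);
        System.out.println("Attributes: " + counts[0] + ", Instances: " + counts[1]);
    }
}
```

Nominal attributes such as `Type` list their allowed values in braces, while numeric ones are declared with the keyword `numeric`; that distinction is why the centroid output later reports a mode for some attributes and a mean for others.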
2. Select the algorithm type
On the Cluster tab, click Choose and select "SimpleKMeans", the algorithm used in this experiment.

[Screenshot: Weka Explorer, Cluster tab; status: OK]

3. Obtain the results
Under "Cluster mode", select "Use training set", click the "Start" button, and observe the clustering result shown in the "Clusterer output" panel on the right:

=== Run information ===

Scheme:       weka.clusterers.SimpleKMeans -N 2 -S 10
Relation:     auto93.names
Instances:    93
Attributes:   23
              Manufacturer
              Type
              City_MPG
              Highway_MPG
              Air_Bags_standard
              Drive_train_type
              Number_of_cylinders
              Engine_size
              Horsepower
              RPM
              Engine_revolutions_per_mile
              Manual_transmission_available
              Fuel_tank_capacity
              Passenger_capacity
              Length
              Wheelbase
              Width
              U-turn_space
              Rear_seat_room
              Luggage_capacity
              Weight
              Domestic
              class
Test mode:    evaluate on training data

=== Model and evaluation on training set ===

kMeans

Number of iterations: 5
Within cluster sum of squared errors: 282.17934341063733

Cluster centroids:

Cluster 0
    Mean/Mode: Chevrolet Midsize 19.0732 26.3171 1 1 5.9024 3.5221 173.8537 4965.8537 1964.2683 0 18.6049 5.561 193.7805 108.6098 72.3415 41.6341 29.0202 15.5178 3517.561 1 23.4512
    N/A1.2612.39160.9015N/A2.49032.96321.94622.77210.2372
    Std Devs: N/A3.0368N/AN/A50.3232581.2098370.731.073511.12325.24352.4527358.6609N/A
Cluster 1
    Small24.961531.269201120.15385528.84622622.307714.7115174.86541
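The steps behind this output can be sketched in a few lines: assign each point to its nearest centroid, move each centroid to the mean of its points, repeat until no assignment changes, then sum the squared distances to get the within-cluster sum of squared errors. The sketch below is a minimal illustration under stated assumptions, not Weka's SimpleKMeans: the class `KMeansSketch`, its `fit` method, and the six 2-D points are hypothetical, the initial centroids are passed in rather than drawn from a random seed (the `-S 10` option above), and it handles numeric attributes only, without scaling.

```java
import java.util.Arrays;

// Hypothetical, dependency-free K-means sketch; not Weka's SimpleKMeans.
class KMeansSketch {
    final double[][] centroids; // final cluster centers
    final int[] assignment;     // cluster index for each input point
    final int iterations;       // passes made until assignments stopped changing
    final double sse;           // within-cluster sum of squared errors

    private KMeansSketch(double[][] centroids, int[] assignment, int iterations, double sse) {
        this.centroids = centroids;
        this.assignment = assignment;
        this.iterations = iterations;
        this.sse = sse;
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    static KMeansSketch fit(double[][] points, double[][] initialCentroids, int maxIterations) {
        if (points.length == 0 || initialCentroids.length == 0 || maxIterations < 1) {
            throw new IllegalArgumentException("need points, centroids, and maxIterations >= 1");
        }
        int k = initialCentroids.length;
        int dims = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = initialCentroids[c].clone();
        }
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);
        int iterations = 0;
        boolean changed = true;
        while (changed && iterations < maxIterations) {
            changed = false;
            iterations++;
            // Assignment step: attach each point to its nearest centroid.
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(points[i], centroids[c])
                            < squaredDistance(points[i], centroids[best])) {
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[assignment[i]]++;
                for (int d = 0; d < dims; d++) {
                    sums[assignment[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                for (int d = 0; d < dims; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        double sse = 0.0;
        for (int i = 0; i < points.length; i++) {
            sse += squaredDistance(points[i], centroids[assignment[i]]);
        }
        return new KMeansSketch(centroids, assignment, iterations, sse);
    }

    public static void main(String[] args) {
        // Made-up points forming two obvious groups; k = 2 as in "-N 2".
        double[][] points = {{1, 1}, {1, 2}, {2, 1}, {8, 8}, {8, 9}, {9, 8}};
        double[][] start = {{1, 1}, {9, 8}};
        KMeansSketch result = fit(points, start, 100);
        System.out.println("Iterations: " + result.iterations);
        System.out.println("Assignment: " + Arrays.toString(result.assignment));
        System.out.println("Within-cluster SSE: " + result.sse);
    }
}
```

Because the sketch works on raw coordinates, its SSE is in the squared units of the input. The figure Weka reports above is not directly comparable, since SimpleKMeans also handles nominal attributes and scales numeric ones before measuring distance.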