ID:45762669
Size: 806.64 KB
Pages: 69
Date: 2019-11-17
Research and Implementation of the Fuzzy C-Means Algorithm for Big Data in a Cloud Environment (云环境下面向大数据的模糊C均值算法研究与实现)
[...] distance algorithm cannot be parallelized. This thesis therefore studies the max-min distance algorithm in combination with a Hash sampling method and designs a MapReduce scheme for Hash sampling. The max-min distance algorithm is run on the Hash-sampled data to compute the initial cluster centers, and these centers are then passed to the FCM algorithm as input so that FCM achieves a better clustering result.

Keywords: big data; cloud environment; data integration; FCM; MapReduce

Abstract

With the rapid development of the Internet, interactive applications such as MicroBlog, WeChat and SNS have sprung up. Data volumes are exploding as cloud-based applications spread across digital devices. Faced with this enormous amount of data, traditional data analysis tools cannot mine useful information in depth, because they perform only simple processing of the data. It is therefore particularly important to extract valuable information from massive data. Cluster analysis is one of the big data analytics techniques used for this purpose. Traditional cluster analysis on stand-alone machines cannot meet the computational efficiency and complexity demands of big data analytics. Cloud computing offers a new approach to cluster analysis on big data. This thesis studies traditional cluster analysis in combination with the MapReduce parallel computing model, so that cluster analysis on big data can be carried out quickly and efficiently. The content of this thesis is as follows:

(1) Research on methods of big data integration
Diversity is one of the notable features of big data, since the types and sources of data vary greatly. Data from different sources must be integrated before analysis. The thesis examines this diversity and studies methods of parsing XML data in a cloud environment by analyzing traditional data integration systems based on Web Service and XML. It proposes a Hadoop-based data integration scheme that integrates datasets from different sources into an HBase database and supports fast and efficient analysis of the data.

(2) Research on Fuzzy C-Means (FCM)
Cluster analysis is one of the big data analytics techniques. The thesis studies the Fuzzy C-Means algorithm and designs a MapReduce implementation of it.

(3) Research on Fuzzy C-Means based on Canopy (Canopy-FCM)
The thesis studies the Canopy algorithm with the high volume of big data in mind. Canopy is a coarse but fast algorithm that obtains coarse cluster centers in a few iterations. The result of Canopy can be used as the input of the FCM algorithm to accelerate its convergence. The thesis studies Fuzzy C-Means based on Canopy and designs a MapReduce implementation of it.

(1) Research on Fu
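
The Canopy-FCM idea in item (3), where a coarse Canopy pass supplies the initial centers for FCM, can be illustrated with a small single-machine sketch. This is not the thesis's MapReduce design, which is not shown in this excerpt. The function names (`canopy_centers`, `fcm`), the thresholds `t1` and `t2`, the fuzzifier `m = 2`, and the toy data are all assumptions chosen for illustration.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def canopy_centers(points, t1, t2):
    """Coarse pass: loose threshold t1 defines a canopy, tight threshold t2 removes seeds."""
    assert t1 > t2 > 0
    remaining = list(points)
    centers = []
    while remaining:
        seed = remaining[0]
        # Every point within t1 of the seed belongs to this canopy.
        members = [p for p in points if dist(p, seed) <= t1]
        centers.append(mean(members))
        # Points within t2 of the seed can no longer start a new canopy.
        remaining = [p for p in remaining if dist(p, seed) > t2]
    return centers

def memberships(points, centers, m):
    """FCM membership u[j][i] of point j in cluster i."""
    power = 2.0 / (m - 1.0)
    u = []
    for p in points:
        d = [dist(p, c) for c in centers]
        zeros = [i for i, di in enumerate(d) if di == 0.0]
        if zeros:
            # A point sitting exactly on a center belongs fully to it.
            row = [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
        else:
            row = [1.0 / sum((di / dk) ** power for dk in d) for di in d]
        u.append(row)
    return u

def update_centers(points, u, centers, m):
    """Each new center is the mean of all points weighted by u[j][i] ** m."""
    new_centers = []
    for i, old in enumerate(centers):
        w = [row[i] ** m for row in u]
        total = sum(w)
        if total == 0.0:
            new_centers.append(old)  # keep the old center if it attracts no weight
            continue
        new_centers.append(tuple(
            sum(wj * p[k] for wj, p in zip(w, points)) / total
            for k in range(len(old))))
    return new_centers

def fcm(points, init_centers, m=2.0, eps=1e-6, max_iter=100):
    """Iterate membership and center updates until the centers move less than eps."""
    centers = [tuple(c) for c in init_centers]
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        new_centers = update_centers(points, u, centers, m)
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < eps:
            break
    return centers, memberships(points, centers, m)

# Toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
init = canopy_centers(points, t1=3.0, t2=2.0)
centers, u = fcm(points, init)
```

Because the Canopy pass already places one center in each group, FCM in this toy case starts close to its fixed point and needs few iterations. In a parallel setting, the per-point membership computation is the natural candidate for distribution, since each point's memberships depend only on that point and the current centers.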