Pages: 56
Date: 2019-03-03
Topic Clustering Based on Micro-blog Text: Research and Implementation (《基于微博文本的话题聚类-研究和实现》)
Hebei University of Science and Technology, Master's Thesis

Abstract

The development of Web 2.0 technology has ushered in the era of big data. The rapid growth of social networks such as micro-blogs enriches this data, but it also poses many challenges for data mining and knowledge discovery. Compared with traditional text data, micro-blog data differs in that it mixes personal interests, entertainment, business marketing, public publicity, and other content. Micro-blog data also has its own properties: fragmented content and massive volume. How to analyze this data and mine its hidden information is an important research task.

Topic clustering is a basic task in micro-blog research. By automatically clustering relevant mass data into several groups, it produces results that offer hints for analyzing and mining the data. Traditional approaches usually return many results containing irrelevant or duplicated information, so they cannot handle these problems effectively. Topic clustering groups relevant information automatically, and with keyword extraction the results can be presented visually and intuitively.

This thesis studies micro-blog data using several intelligent algorithms. The main work is as follows.
Firstly, we present approaches for obtaining structured micro-blog data and preprocessing it before clustering.
Secondly, based on an analysis of the micro-blog data, we study how to select useful features for further processing.
Thirdly, we design an effective clustering algorithm and, by analyzing the micro-blog data, examine which approach performs better.
Fourthly, we extract keywords from the clustering result set. These keywords can be used to visualize the topic clusters.
Fifthly, we visualize the results. The output is clear and visual, so it helps in understanding and recognizing the hidden information behind the mass data.

The experimental results and analysis show the feasibility of the proposed approach. Some existing problems and directions for further work are also presented at the end.

Key Words: Topic clustering; Micro-blog; Feature vector; Visualization; Information gain
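The key words list information gain, and the abstract's second step is feature selection, but this preview does not show how the thesis computes either. As a rough illustration only, the sketch below uses the standard textbook definition of information gain to rank terms. It assumes the posts are already tokenized and that topic labels are available for scoring; the function names, sample posts, and labels are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """IG(term) = H(C) - P(t) * H(C | t present) - P(not t) * H(C | t absent)."""
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    without_term = [lab for doc, lab in zip(docs, labels) if term not in doc]
    total = len(labels)
    conditional = (
        len(with_term) / total * entropy(with_term)
        + len(without_term) / total * entropy(without_term)
    )
    return entropy(labels) - conditional


def top_terms(docs, labels, k):
    """Rank every vocabulary term by information gain and keep the best k."""
    vocabulary = sorted(set().union(*docs))
    scored = [(information_gain(docs, labels, t), t) for t in vocabulary]
    # Highest gain first; ties are broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [t for _, t in scored[:k]]


# Hypothetical, already-tokenized posts with hypothetical topic labels.
docs = [
    {"match", "goal", "today"},
    {"goal", "team", "today"},
    {"stock", "market", "today"},
    {"market", "price", "today"},
]
labels = ["sports", "sports", "finance", "finance"]

print(top_terms(docs, labels, 2))
```

In this toy data, a term that appears in every post (such as "today") carries no information about the topic, while a term that appears in exactly one topic's posts scores highest. The terms kept by `top_terms` could then serve as the dimensions of a feature vector for clustering, although whether the thesis builds its vectors this way is not stated in the preview.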