Research and System Design of Topic Detection and Tracking Methods for Chinese Micro-blogs (《中文微博话题检测跟踪方法的研究和系统设计》)
Abstract

Micro-blogging has emerged as a prominent new medium of the Web 2.0 era. As a multimedia platform that supports cross-platform information exchange, it has developed rapidly over the past two years. It has gradually become the main platform on which ordinary people share personal information, follow other people's updates, and obtain information in real time, and it is now a major part of online media. Micro-blog information is characterized by its huge volume, decentralization, and diversity. To let users grasp the overall topics on micro-blogs in real time and track the topics that interest them, this thesis studies data acquisition, topic detection, and topic tracking for Chinese micro-blogs.

For web page collection, the thesis uses a technique suited to micro-blogs, time-controlled breadth-first collection, to improve efficiency and ensure coverage. Adaptive collection and information extraction identify topics on micro-blog sites and store them in a standardized, modular form, which provides a higher-quality data source. Micro-blog data is also obtained through the API, and the web-crawler-based and API-based acquisition schemes are compared in terms of data acquisition performance. Finally, the text is processed with Chinese language processing techniques, and detection and tracking algorithms are applied to the acquired data. During topic tracking, the query vector is adjusted in real time. Introducing web page relationships and adjusting core and non-core features separately filters out noise effectively and improves the effect of the query vector adjustment. Together, these steps realize micro-blog topic detection and topic tracking.

Keywords: micro-blog; API; detection; topic tracking; data acquisition
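
The abstract describes real-time adjustment of the query vector during topic tracking, with core and non-core features adjusted differently, but it does not state the update rule. The following Python sketch is therefore only an illustration under assumptions: it uses a Rocchio-style decayed update, cosine similarity as the matching score, and hypothetical parameters (`threshold`, `alpha`, `core_gain`, `noncore_gain`) that do not come from the thesis. Posts are assumed to be already segmented into tokens, and the web page relationships mentioned in the abstract are not modeled.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def track(query, posts, core_terms, threshold=0.3, alpha=0.8,
          core_gain=1.0, noncore_gain=0.2):
    """Return indices of posts judged on-topic, updating `query` in place.

    All parameter values are illustrative assumptions, not values
    taken from the thesis.
    """
    matched = []
    for i, tokens in enumerate(posts):
        vec = Counter(tokens)
        if cosine(query, vec) < threshold:
            continue  # treated as off-topic; the query is left unchanged
        matched.append(i)
        # Decay the existing weights so the query follows topic drift.
        for term in list(query):
            query[term] *= alpha
        # Add the new post's terms; non-core terms get a smaller gain,
        # which limits how much noise they can inject into the query.
        for term, tf in vec.items():
            gain = core_gain if term in core_terms else noncore_gain
            query[term] = query.get(term, 0.0) + (1 - alpha) * gain * tf
    return matched
```

In this sketch, a post that passes the threshold pulls the query toward its own vocabulary, while the smaller gain on non-core terms keeps incidental words from dominating later matches.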