Pages: 59
Date: 2019-02-26
Research on New Word Detection in Microblogs (《微博新词发现的研究》)
Harbin Institute of Technology, Master's Thesis in Engineering

... low-confidence fragments as candidate strings and detected new words from these strings. The detected new words were then added to the dictionary and used to form additional dictionary features in word segmentation model training. Experimental results demonstrate that combining new word detection with word segmentation improves the performance of both.

Finally, we analyzed the life cycle of new words on microblogs. The probability distribution function of a logarithmic function was first used to fit the frequency of new words, and we analyzed their temporal distribution patterns. Most new words disappeared soon after being created; only a small fraction survived and gradually developed into common words. Then a frequent itemset mining algorithm was applied to extract frequent words, and we analyzed the spatial distribution patterns of new words.

Keywords: new word detection, statistical measure, word segmentation, life cycle

Contents

Abstract (in Chinese)
Abstract (in English)
Chapter 1 Introduction
  1.1 Background and Significance of the Research
    1.1.1 Research Background
    1.1.2 Research Significance
  1.2 State of Research in China and Abroad
  1.3 Main Research Content of This Thesis
Chapter 2 New Word Detection Combining Rules and Statistics
  2.1 Introduction
  2.2 Extraction and Filtering of Candidate New Words
    2.2.1 Corpus Preprocessing