Research and Implementation of Named Entity Recognition for Microblog Data
Northeastern University Master's Thesis

…on of Microblog Data

Abstract

With the growing popularity of microblogging, microblogs have become a new social medium for releasing and propagating information. By the end of December 2012, Sina microblog had reached more than 500 million registered users. The volume of microblog data has grown accordingly, and it contains much information of value to organizations and individuals. Information extraction, analysis, and natural language processing on microblog data have therefore become a research hotspot. Named entity recognition is particularly important as a basic task underlying this research, but current work on named entity recognition for microblogs is not yet mature. Traditional named entity recognition methods do not obtain satisfactory results on microblog data, which hinders follow-up work. This thesis mainly studies named entity recognition on microblog data.

The characteristics of microblog data cause traditional methods to fail, for four fundamental reasons. First, each microblog post is very short and carries limited information, which makes it difficult to bring together enough relevant context. Second, the data is very noisy while the models have low noise immunity, so the models overfit during training. Third, microblog data lacks sufficient and complete training resources, which may leave models under-trained; moreover, building such resources always requires much manual work. Fourth, information in microblog data is updated quickly, so models adapt poorly and underfit during training. Experiments show that the F1 measure of named entity recognition with conventional methods drops by almost 20 percent.

To solve the problems listed above, the thesis implements named entity recognition on microblog data by combining several techniques. On microblog data, the precision, recall, and F1 measure of the resulting system reach 83.7%, 79.8%, and 81.8%, respectively, a substantial improvement over conventional methods. The thesis overcomes the disadvantages of conventional methods in the following respects. First, it builds a semi-supervised named entity recognition framework, which addresses the lack of training data by repeating …
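As a rough cross-check of the scores reported in the abstract, the sketch below computes F1 as the harmonic mean of precision and recall. The function name `f1_score` is hypothetical, and the assumption that the thesis uses the standard balanced F1 is not confirmed by the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F1: the harmonic mean of precision and recall.

    Both inputs must be on the same scale (here: percentages).
    Returns 0.0 when both inputs are zero to avoid division by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported in the abstract, in percent.
    reported_precision = 83.7
    reported_recall = 79.8
    print(round(f1_score(reported_precision, reported_recall), 2))
```

With the reported precision and recall, the harmonic mean comes to roughly 81.7, close to the reported 81.8%. The small gap may be due to rounding or to averaging across entity types, which the abstract does not specify.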