ITC-UT: Tweet Categorization by Query Categorization for On-line Reputation Management

Minoru Yoshida, Shin Matsushima, Shingo Ono, Issei Sato, and Hiroshi Nakagawa

University of Tokyo
7-3-1, Hongo, Bunkyo-ku, Tokyo 113-0033
{mino,masin,ono,sato,nakagawa}@r.dl.itc.u-tokyo.ac.jp

Abstract. This paper describes our system, called ITC-UT, for task-2 (the on-line reputation management task) in WePS-3. Our idea is to categorize each query into 3 or 4 classes according to how often the tweets retrieved by the query contain true entity names, that is, names that refer to the target entity, and then to categorize each tweet by the rules defined for each class of queries. We show the evaluation results for our system along with the detailed results of query categorization.

Keywords: Organization Name Disambiguation, Two-Stage Algorithm, Naive Bayes, Twitter

1 Introduction

This paper reports the algorithms and results of the ITC-UT (Information Technology Center, the University of Tokyo) team for the WePS-3 task-2 (the on-line reputation management task). The task assumes a situation in which you search Twitter for the reputation of some organization. Assuming that tweets are retrieved by an organization name query, the problem is to decide whether each organization name found in each tweet represents the target organization or not (such as “Apple PC” for the former and “Apple Pie” for the latter, for the query “Apple”). This is one type of name disambiguation problem, a class of problems that has been extensively studied through previous WePS workshops [1,2]. However, the current task setting is challenging because each tweet is generally short and provides little context for disambiguation.

Our algorithm to solve this problem is based on the intuition that organization names can be classified into “organization-like names” and “general-word-like names”, such as “McDonald’s” for the former and “Pioneer” for the latter. This intuition is supported by the fact that the ratio of TRUE¹ (or FALSE) tweets in the training data varies widely from entity to entity. For example, over 98% of tweets were labeled TRUE for entity “nikon”, while the ratio for entity “renaissance technologies” (for which the query term was “Renais

¹ TRUE indicates that the tweet mentions the target organization (as defined in the next section). FALSE indicates the opposite.
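The two-stage idea described in the abstract (first categorize the query, then categorize each tweet by a rule tied to the query's class) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three class names, the thresholds HIGH and LOW, the cue-word rule for the middle class, and the toy tweets. The sketch also derives the query class directly from labeled tweets; the excerpt does not say how the class is predicted for unseen queries.

```python
from collections import defaultdict

# Hypothetical thresholds: the excerpt does not state how classes are delimited.
HIGH, LOW = 0.9, 0.1


def true_ratio_per_query(labeled_tweets):
    """Map each query to the fraction of its tweets labeled TRUE."""
    counts = defaultdict(lambda: [0, 0])  # query -> [true_count, total_count]
    for query, _text, label in labeled_tweets:
        counts[query][0] += int(label)
        counts[query][1] += 1
    return {q: t / n for q, (t, n) in counts.items()}


def categorize_query(ratio):
    """Stage 1: bucket a query by its TRUE ratio (class names are assumed)."""
    if ratio >= HIGH:
        return "organization-like"
    if ratio <= LOW:
        return "general-word-like"
    return "mixed"


def categorize_tweet(query_class, text, cue_words=()):
    """Stage 2: apply a per-class rule (these rules are placeholders)."""
    if query_class == "organization-like":
        return True   # assume every mention refers to the organization
    if query_class == "general-word-like":
        return False  # assume no mention refers to the organization
    lowered = text.lower()
    return any(word in lowered for word in cue_words)


# Toy data for illustration only, not drawn from the WePS-3 corpus.
toy = [
    ("nikon", "new nikon camera", True),
    ("nikon", "nikon lens review", True),
    ("pioneer", "a pioneer in the field", False),
    ("apple", "Apple PC", True),
    ("apple", "Apple Pie", False),
]
ratios = true_ratio_per_query(toy)
classes = {q: categorize_query(r) for q, r in ratios.items()}
```

With this toy data, "nikon" falls into the organization-like class and "pioneer" into the general-word-like class, while "apple" lands in the middle class, where the tweet-level rule has to look at the tweet text itself.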
