Crowdsourcing a News Query Classification Dataset

Richard M. C. McCreadie, Craig Macdonald, Iadh Ounis
Department of Computing Science, University of Glasgow, Glasgow, G12 8QQ
richardm@dcs.gla.ac.uk, craigm@dcs.gla.ac.uk, ounis@dcs.gla.ac.uk

ABSTRACT

Web search engines are well known for aggregating news vertical content into their result rankings in response to queries classified as news-related. However, no dataset currently exists upon which approaches to news query classification can be evaluated and compared. This paper studies the generation and validation of a news query classification dataset comprised of labels crowdsourced from Amazon's Mechanical Turk and details insights gained. Notably, our study focuses around two challenges when crowdsourcing news query classification labels: 1) how to overcome our workers' lack of information about the news stories from the time of each query and 2) how to ensure the resulting labels are of high enough quality to make the dataset useful. We empirically show that a worker's lack of information about news stories can be addressed through the integration of news-related content into the labelling interface and that this improves the qu...

... dataset. We propose multiple interfaces for crowdsourced query labelling and evaluate these interfaces empirically in terms of the quality of the resulting labels on a small representative sample of user queries from a Web search engine query log. Later, we use the best performing of these interfaces to generate our final news query classification dataset comprised of a larger query sample from the same log. We report the quality of our resulting news query classification dataset in terms of inter-worker labelling agreement and accuracy with regard to labels created separately by the authors. Moreover, we further investigate its quality in the form of an additional agreement study, in which crowdsourcing is leveraged for quality assurance.

Notably, one of the most interesting aspects of news query classification labelling is the temporal nature of news-related queries [16]. In particular, a query should only be labelled as news-related if there was a relevant noteworthy story in the news around the time ...
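
The two quality measures named above, inter-worker labelling agreement and accuracy against labels created by the authors, can be illustrated with a minimal Python sketch. This excerpt does not say which agreement statistic the paper uses, so Fleiss' kappa is an assumption here. The majority-vote aggregation, the label names "news" and "non-news", the function names, and the toy votes are likewise hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Most common label among one query's worker votes.

    Ties resolve to the label seen first; an odd number of workers
    on a binary task avoids ties altogether.
    """
    return Counter(votes).most_common(1)[0][0]


def accuracy(worker_votes, reference):
    """Fraction of queries whose majority label equals the reference label."""
    hits = sum(majority_label(v) == reference[q] for q, v in worker_votes.items())
    return hits / len(worker_votes)


def fleiss_kappa(worker_votes, categories):
    """Fleiss' kappa, assuming every query received the same number of votes.

    Undefined (division by zero) when all votes fall into one category.
    """
    n_items = len(worker_votes)
    n_raters = len(next(iter(worker_votes.values())))
    totals = Counter()
    observed = 0.0
    for votes in worker_votes.values():
        counts = Counter(votes)
        totals.update(counts)
        # Per-query agreement: share of worker pairs that chose the same label.
        squares = sum(c * c for c in counts.values())
        observed += (squares - n_raters) / (n_raters * (n_raters - 1))
    p_bar = observed / n_items
    # Agreement expected by chance from the overall label proportions.
    p_e = sum((totals[c] / (n_items * n_raters)) ** 2 for c in categories)
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical toy data: four queries, three workers each.
votes = {
    "q1": ["news", "news", "news"],
    "q2": ["news", "non-news", "news"],
    "q3": ["non-news", "non-news", "non-news"],
    "q4": ["non-news", "news", "non-news"],
}
author_labels = {"q1": "news", "q2": "news", "q3": "non-news", "q4": "news"}

print(accuracy(votes, author_labels))
print(fleiss_kappa(votes, ["news", "non-news"]))
```

On this toy data, three of the four majority labels match the reference labels, and the workers agree only moderately, so the two measures capture different aspects of label quality.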