Journal of Machine Learning Research 13 (2012) 491-518
Submitted 4/11; Revised 12/11; Published 2/12

Eliminating Spammers and Ranking Annotators for Crowdsourced Labeling Tasks

Vikas C. Raykar (VIKAS.RAYKAR@SIEMENS.COM)
Shipeng Yu (SHIPENG.YU@SIEMENS.COM)
Siemens Healthcare
51 Valley Stream Parkway, E51
Malvern, PA 19355, USA

Editor: Ben Taskar

Abstract

With the advent of crowdsourcing services it has become quite cheap and reasonably effective to get a dataset labeled by multiple annotators in a short amount of time. Various methods have been proposed to estimate the consensus labels by correcting for the bias of annotators with different kinds of expertise. Since we do not have control over the quality of the annotators, very often the annotations can be dominated by spammers, defined as annotators who assign labels randomly without actually looking at the instance. Spammers can make the cost of acquiring labels very expensive and can potentially degrade the quality of the final consensus labels. In this paper we propose an empirical Bayesian algorithm called SpEM that iteratively eliminates the spammers and estimates the consensus labels based only on the good annotators. The algorithm is motivated by defining a spammer score that can be used to rank the annotators. Experiments on simulated and real data show that the proposed approach is better than (or as good as) the earlier approaches in terms of the accuracy and uses a significantly smaller number of annotators.

Keywords: crowdsourcing, multiple annotators, ranking annotators, spammers

1. Introduction

Annotating a dataset is one of the major bottlenecks in using supervised learning to build good predictive models. Getting a dataset labeled by experts can be expensive and time consuming. With the advent of crowdsourcing services (Amazon's Mechanical Turk being a prime example) it has become quite easy and inexpensive to acquire labels from a large number of annotators in a short amount of time (see Sheng et al. 2008, Snow et al. 2008, and Sorokin and Forsyth 2008 for some natural language processing and computer vision case studies). For example, in AMT the requesters are able to pose tasks known as HITs (Human Intelligence Tasks). Workers (called providers) can then browse among existing tasks and complete them for a small monetary payment set by the requester. A major drawb