Crowdsourcing for Book Search Evaluation: Impact of HIT Design on Comparative System Ranking

Gabriella Kazai¹, Jaap Kamps², Marijn Koolen², Natasa Milic-Frayling¹
¹ Microsoft Research, Cambridge UK, {v-gabkaz,natasamf}@microsoft.com
² University of Amsterdam, The Netherlands, {kamps,m.h.a.koolen}@uva.nl

ABSTRACT

The evaluation of information retrieval (IR) systems over special collections, such as large book repositories, is out of reach of traditional methods that rely upon editorial relevance judgments. Increasingly, the use of crowdsourcing to collect relevance labels has been regarded as a viable alternative that scales with modest costs. However, crowdsourcing suffers from undesirable worker practices and low quality contributions. In this paper we investigate the design and implementation of effective crowdsourcing tasks in the context of book search evaluation. We observe the impact of aspects of the Human Intelligence Task (HIT) design on the quality of relevance labels provided by the crowd. We assess the output in terms of label agreement with a gold standard data set and observe the effect of the crowdsourced relevance judgments on the resulting system rankings. This enables us to observe the effect of crowdsourcing on the entire IR evaluation process. Using the test set and a [...]

[...]cate the relevance of search results to a set of queries. With the ever increasing size and diversity of both the document collections and the query sets, gathering relevance labels by traditional methods, i.e., from a select group of trained experts, has become increasingly challenging [9]. This issue is especially prevalent in specialized search domains such as academic papers or books, which can support a range of tailored search tasks but also present additional complexities in IR evaluation. A good illustration is the INEX Book Track [13] which aims to provide a test bed for the evaluation of book search systems. The track reports on a range of issues related to the gathering of relevance labels [12, 13], one of which is the sheer effort of reviewing whole books and rendering relevance judgments for pages across a large number of retrieved books. While the INEX book collection comprises only 50,000 books, the effort to judge a single topic is estimated at 33 days if the assessor spent 95 minutes a day judging pages on that topic [...]
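
The abstract describes two checks: agreement of crowd labels with a gold standard, and the effect of crowd labels on the resulting system rankings. The Python sketch below shows one way such checks might be computed. It is not taken from the paper: the function names, the use of simple accuracy as the agreement measure, the use of Kendall's tau (tau-a) to compare rankings, and all data values are assumptions made for illustration only.

```python
from itertools import combinations


def label_accuracy(crowd, gold):
    """Fraction of items judged in both sets whose crowd label equals the gold label.

    Simple accuracy is an assumed agreement measure, chosen for illustration.
    """
    shared = [key for key in crowd if key in gold]
    if not shared:
        raise ValueError("crowd and gold labels share no items")
    return sum(crowd[key] == gold[key] for key in shared) / len(shared)


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two rankings, each given as {system: score}.

    Pairs tied in either ranking count as neither concordant nor discordant.
    """
    if set(scores_a) != set(scores_b):
        raise ValueError("both rankings must cover the same systems")
    systems = sorted(scores_a)
    if len(systems) < 2:
        raise ValueError("need at least two systems to compare rankings")
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical (topic, page) relevance labels: 1 = relevant, 0 = not relevant.
gold_labels = {("t1", "p1"): 1, ("t1", "p2"): 0, ("t1", "p3"): 1, ("t1", "p4"): 0}
crowd_labels = {("t1", "p1"): 1, ("t1", "p2"): 1, ("t1", "p3"): 1, ("t1", "p4"): 0}

# Hypothetical effectiveness scores of three systems under each label set.
scores_gold = {"sysA": 0.40, "sysB": 0.30, "sysC": 0.20}
scores_crowd = {"sysA": 0.35, "sysB": 0.38, "sysC": 0.10}

print(label_accuracy(crowd_labels, gold_labels))          # 0.75
print(round(kendall_tau(scores_gold, scores_crowd), 3))   # 0.333
```

In this toy data the crowd disagrees with the gold standard on one of four pages, and that is enough to swap the order of two of the three systems. This is the kind of downstream effect on comparative system ranking that the abstract sets out to observe.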