Data Quality from Crowdsourcing (conference)
NAACL HLT 2009

Active Learning for Natural Language Processing (ALNLP-09)

Proceedings of the Workshop

June 5, 2009
Boulder, Colorado

Production and Manufacturing by
Omnipress Inc.
2600 Anderson Street
Madison, WI 53707
USA

Endorsed by the following ACL Special Interest Groups:
• SIGNLL, Special Interest Group for Natural Language Learning
• SIGANN, Special Interest Group for Annotation

© 2009 The Association for Computational Linguistics

Order copies of this and other ACL proceedings from:
Association for Computational Linguistics (ACL)
209 N. Eighth Street
Stroudsburg, PA 18360
USA
Tel: +1-570-476-8006
Fax: +1-570-476-0860
acl@aclweb.org

ISBN 978-1-932432-40-4

Introduction

Welcome to the workshop on Active Learning for Natural Language Processing! We started organizing this workshop in mid-2008 after strong encouragement in response to some of our own work in the area. As we gathered members of the program committee, the timeliness of the topic resonated with several of them: the growing body of knowledge on active learning and on active learning for NLP in particular makes this topic one worth exploring in a focused workshop rather than in isolated papers in occasional, far-flung conferences.

Labeled data is a prerequisite for many popular algorithms in natural language processing and machine learning. While it is possible to obtain large amounts of annotated data for well-studied languages in well-studied domains and well-studied problems, labeled data are rarely available for less common languages, domains, or problems. Unfortunately, obtaining human annotations for linguistic data is labor-intensive and typically the costliest part of the acquisition of an annotated corpus.

It has been shown before that active learning can be employed to reduce annotation costs but not at the expense of quality. While diverse work over the past decade has demonstrated the possible advantages of active learning for corpus annotation and NLP applications, active learning is not widely used in many ongoing data annotation tasks. Much of the machine learning literature on the topic has focused on active learning for classification problems with less attention devoted to the kinds of problems encountered in NLP. Related topics such as distributed “human computation”, cost-sensitive machine learning, and semi-supervise