Learning to Detect Vandalism in Social Content Systems: A Study on Wikipedia
Vandalism Detection in Wikipedia

Sara Javanmardi, David W. McDonald, Rich Caruana, Sholeh Forouzan, and Cristina V. Lopes

Abstract

A challenge facing user generated content systems is vandalism, i.e. edits that damage content quality. The high visibility and easy access to social networks makes them popular targets for vandals. Detecting and removing vandalism is critical for these user generated content systems. Because vandalism can take many forms, there are many different kinds of features that are potentially useful for detecting it. The complex nature of vandalism, and the large number of potential features, make vandalism detection difficult and time consuming for human editors. Machine learning techniques hold promise for developing accurate, tunable, and maintainable models that can be incorporated into vandalism detection tools. We describe a method for training classifiers for vandalism detection that yields classifiers that are more accurate on the PAN 2010 corpus than others previously developed. Because of the high turnaround in social network systems, it is important for vandalism detection tools to run in real-time. To this aim, we use feature selection to find the minimal set of features consistent with high accuracy. In addition, because some features are more costly to compute than others, we use cost-sensitive feature selection to reduce the total computational cost of executing our models. In addition to the features previously used for spam detection, we introduce new features based on user action histories. The user history features contribute significantly to classifier performance. The approach we use is general and can easily be applied to other user generated content systems.

S. Javanmardi (B)
University of California, Irvine, Donald Bren Hall 5042, Irvine, CA 92697-3440, USA
e-mail: sjavanma@ics.uci.edu

D. W. McDonald
The Information School, University of Washington, Washington, WA, USA

R. Caruana
Microsoft Research, Redmond, WA, USA

S. Forouzan · C. V. Lopes
Bren School of Information and Computer Sciences, University of California, Irvine, CA, USA

T. Özyer et al. (eds.), Mining Social Networks and Security Informatics, Lecture Notes in Social Networks, p. 203, DOI 10.1007/978-94-007-6359-3_11, © Springer