TECHNICAL REPORT YR-2007-005

FINDING HIGH-QUALITY CONTENT IN SOCIAL MEDIA WITH AN APPLICATION TO COMMUNITY-BASED QUESTION ANSWERING

Eugene Agichtein1, Carlos Castillo2, Debora Donato2, Aristides Gionis2, Gilad Mishne3

1 Emory University, Atlanta, USA
2 Yahoo! Research, Barcelona, Spain
3 Search and Advertising Sciences, Yahoo!

25 September 2007

Santa Clara, California • Berkeley, California • Burbank, California • New York, New York • Barcelona, Spain • Santiago, Chile

Yahoo! Research Report No. YR-2007-005

ABSTRACT: The quality of user-generated content varies drastically from excellent to abuse and spam. As the availability of such content increases, the task of
identifying high-quality content in sites based on user contributions (social media sites) becomes increasingly important. Social media in general exhibit a rich variety of information sources: in addition to the content itself, there is a wide array of non-content information available, such as links between items and explicit quality ratings from members of the community. In this paper we investigate methods for exploiting such community feedback to automatically identify high quality content. As a test case, we focus on Yahoo! Answers, a large community question answering portal that is particularly rich in the amount and types of content and social interactions available in it. We introduce a general classification framework for combining the evidence from different sources of information that can be tuned automatically for a given social media type and quality definition. In particular, for the community question answering domain, we show that our system is able to separate high-quality items from the rest with an accuracy close to that of humans.

1. Introduction

Recent years have seen a transformation in the type of content available on the web. During the first decade of the web's prominence, from the early 1990s onwards, most online content resembled t