Large-scale Parallel Collaborative Filtering for the Netflix Prize

Yunhong Zhou, Dennis Wilkinson, Robert Schreiber and Rong Pan

HP Labs, 1501 Page Mill Rd, Palo Alto, CA, 94304
{yunhong.zhou,dennis.wilkinson,rob.schreiber,rong.pan}@hp.com

Abstract. Many recommendation systems suggest items to users by utilizing the techniques of collaborative filtering (CF) based on historical records of items that the users have viewed, purchased, or rated. Two major problems that most CF approaches have to resolve are scalability and sparseness of the user profiles. In this paper, we describe Alternating-Least-Squares with Weighted-λ-Regularization (ALS-WR), a parallel algorithm that we designed for the Netflix Prize, a large-scale collaborative filtering challenge. We use parallel Matlab on a Linux cluster as the experimental platform. We show empirically that the performance of ALS-WR monotonically increases with both the number of features and the number of ALS iterations. Our ALS-WR applied to the Netflix dataset with 1000 hidden features obtained a RMSE score of 0.8985, which is one of the best results based on a pure method. Combined with the parallel version of other known methods, we achieved a performance improvement of 5.91% over Netflix’s own CineMatch recommendation system. Our method is simple and scales well to very large datasets.

1 Introduction

Recommendation systems try to recommend items (movies, music, webpages, products, etc) to interested potential customers, based on the information available. A successful recommendation system can significantly improve the revenue of e-commerce companies or facilitate the interaction of users in online communities. Among recommendation systems, content-based approaches analyze the content (e.g., texts, meta-data, features) of the items to identify related items, while collaborative filtering uses the aggregated behavior/taste of a large number of users to suggest relevant items to specific users. Collaborative filtering is popular and widely deployed in Internet companies like Amazon [16], Netflix [2], Google News [7], and others.

The Netflix Prize is a large-scale data mining competition held by Netflix for the best recommendation system algorithm for predicting user ratings on movies, based on a training set of more than 100 million ratings given by over 480