Top Ten Algorithms in Machine Learning: PageRank
Chapter 6
PageRank
Bing Liu and Philip S. Yu

Contents
6.1 Introduction
6.2 PageRank Algorithm
6.3 An Extension: Timed-PageRank
6.4 Summary
6.5 Exercises
References

6.1 Introduction

Link-based ranking has contributed significantly to the success of Web search. PageRank [1, 7] is perhaps the best known link-based ranking algorithm, which also powers the Google search engine. Due to the huge business success of Google, PageRank has emerged as the dominant link analysis model on the Web.

The PageRank algorithm was first introduced by Sergey Brin and Larry Page at the Seventh International World Wide Web Conference (WWW7) in April 1998, with the aim of tackling some major difficulties with the content-based ranking algorithms of early search engines. These early search engines essentially retrieved relevant pages for the user based on content similarities of the user query and the indexed pages of the search engines. The retrieval and ranking algorithms were simply direct implementation of those from information retrieval. However, starting from 1996, it became clear that the content similarity alone was no longer sufficient for search due to two main reasons. First, the number of Web pages grew rapidly during the middle to late 1990s. Given any query, the number of relevant pages can be huge. For example, given the search query “classification technique,” the Google search engine estimates that there are about 10 million relevant pages. This abundance of information causes a major problem for ranking, that is, how to choose only 10 to 30 pages and rank them suitably to present to the user. Second, content similarity methods are easily spammed. A page owner can repeat some important words and add many remotely related words in his/her pages to boost the rankings of the pages and/or to make the pages relevant to a large number of possible queries.

© 2009 by Taylor & Francis Group, LLC

From around 1996, researchers in a