Scuba: Diving into Data at Facebook

Lior Abraham, John Allen, Oleksandr Barykin, Vinayak Borkar, Bhuwan Chopra, Ciprian Gerea, Daniel Merl, Josh Metzler, David Reiss, Subbu Subramanian, Janet L. Wiener, Okay Zed
Facebook, Inc., Menlo Park, CA

ABSTRACT

Facebook takes performance monitoring seriously. Performance issues can impact over one billion users, so we track thousands of servers, hundreds of PB of daily network traffic, hundreds of daily code changes, and many other metrics. We require latencies of under a minute from events occurring (a client request on a phone, a bug report filed, a code change checked in) to graphs showing those events on developers' monitors.

Scuba is the data management system Facebook uses for most real-time analysis. Scuba is a fast, scalable, distributed, in-memory database built at Facebook. It currently ingests millions of rows (events) per second and expires data at the same rate. Scuba stores data completely in memory on hundreds of servers each with 144 GB RAM. To process each query, Scuba aggregates data from all servers. Scuba processes almost a million queries per day. Scuba is used extensively for interactive, ad hoc, analysis queries that run in under a second over live data. In addition, Scuba is the workhorse behi

Originally, we relied on pre-aggregated graphs and a carefully managed, hand-coded, set of scripts over a MySQL database of performance data. By 2011, that solution became too rigid and slow. It could not keep up with the growing data ingestion and query rates. Other query systems within Facebook, such as Hive [20] and Peregrine [13], query data that is written to HDFS with a long (typically one day) latency before data is made available to queries and queries themselves take minutes to run.

Therefore, we built Scuba, a fast, scalable, in-memory database. Scuba is a significant evolution in the way we collect and analyze data from the variety of systems that keep the site running every day. We now use Scuba for most real-time, ad-hoc analysis of arbitrary data. We compare Scuba to other data management systems later in the paper, but we know of no other system that both ingests data as fast and runs complex queries as fast as Scuba.

Today, Scuba runs on hundreds of servers each with 144 GB RAM in a shared-nothing cluster. It stores around 70 TB of compressed data for over 1000 tables in memory, distributed by par-