Dremel: Interactive Analysis of Web-Scale Datasets

Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, Theo Vassilakis
Google, Inc.
{melnik,andrey,jlong,gromer,shiva,mtolton,theov}@google.com

ABSTRACT

Dremel is a scalable, interactive ad-hoc query system for analysis of read-only nested data. By combining multi-level execution trees and columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. The system scales to thousands of CPUs and petabytes of data, and has thousands of users at Google. In this paper, we describe the architecture and implementation of Dremel, and explain how it complements MapReduce-based computing. We present a novel columnar storage representation for nested records and discuss experiments on few-thousand node instances of the system.

1. INTRODUCTION

Large-scale analytical data processing has become widespread in web companies and across industries, not least due to low-cost storage that enabled collecting vast amounts of business-critical data. Putting this data at the fingertips of analysts and engineers has grown increasingly important; interactive response t[...]

[...] exchanged by distributed systems, structured documents, etc. lend themselves naturally to a nested representation. Normalizing and recombining such data at web scale is usually prohibitive. A nested data model underlies most of structured data processing at Google [21] and reportedly at other major web companies.

This paper describes a system called Dremel that supports interactive analysis of very large datasets over shared clusters of commodity machines. Unlike traditional databases, it is capable of operating on in situ nested data. In situ refers to the ability to access data 'in place', e.g., in a distributed file system (like GFS [14]) or another storage layer (e.g., Bigtable [8]). Dremel can execute many queries over such data that would ordinarily require a sequence of MapReduce (MR [12]) jobs, but at a fraction of the execution time. Dremel is not intended as a replacement for MR and is often used in conjunction with it to analyze outputs of MR pipelines or rapidly prototype larger computations.

Dremel has been in production since 2006 and has thousands of users within Google. Multiple instances of Dremel are deployed in