Design and Implementation of a Storage Structure Extension for the Big Data Analytics Warehouse Hive

... metadata. This algorithm builds an index of the maximum and minimum values of the column data. For queries with filters, the index makes it possible to skip records that do not satisfy the filter conditions instead of loading the whole column into memory. Experiments show that only 1/4 of the data is loaded into memory compared with RecordColumnFile.

Second, the compression algorithms currently used in Hive, such as LZO, do not take the actual data distribution into account and therefore cannot achieve a better compression ratio. To save storage space and improve data loading efficiency, this thesis proposes three data compression algorithms, together with an adaptive decision method that chooses among them based on the data distribution of each column. The three algorithms suit arithmetic progression sequences, sequences with many duplicated values, and sequences with small increments between adjacent values, respectively. The adaptive decision algorithm selects the suitable compression algorithm for each column to achieve better compression performance. Experiments show that the proposed algorithm improves the compression ratio by about 50% and reduces compression time and decompression time by about 10% each.

Third, FOSF implements the StorageHandler interface in Hive. FOSF provides a columnar index based on metadata, self-adaptive compression based on data distribution, and hybrid storage.

Keywords: Hadoop, Hive, SQL, Storage, Adaptive Compression, Column Index

Contents

1 Introduction
  1.1 Background and Significance of the Thesis
  1.2 Current State of Research at Home and Abroad
    1.2.1 Optimization of the SQL Parser
    1.2.2 Vectorized Query Execution
    1.2.3 Improvements to Storage Formats
    1.2.4 HDFS RAID Optimization
    1.2.5 I/O Layer Optimization
    1.2.6 Research on Dynamic Allocation of Reduce Compute Resources
  1.3 Research Objectives and Content
    1.3.1 Research Objectives
    1.3.2 Research Content
  1.4 论