Resource: Hive高级编程-weibo.pdf
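Hive Advanced Programming (Hive高级编程)
天照

Agenda
• Hive Components
• MapReduce
• HiveQL
• Hive optimization
• SQL optimization

HIVE: Components

[Figure: Hive components diagram. Labels: MapReduce, HDFS, Hive CLI, Web UI, Browsing, DDL, Queries, Thrift API, Parser, Execution, MetaStore, Planner, Optimizer, SerDe, DB, Thrift, CSV, JSON.., Facebook]

(Simplified) MapReduce Review

[Figure: Machine 1 and Machine 2; stages Local Map, Global Shuffle, Local Sort, Local Reduce; sample pair <nk1, nv6>]

HiveQL – Join

page_view
pageid  userid  time
1       111     9:08:01
2       111     9:08:13
1       222     9:08:14

X

user
userid  age  gender
111     25   female
222     32   male

=

pv_users
pageid  age
1       25
2       25
1       32

• SQL:
  INSERT INTO TABLE pv_users
  SELECT pv.pageid, u.age
  FROM page_view pv JOIN user u ON (pv.userid = u.userid);

HiveQL – Join in MapReduce

Map output for page_view (key = userid, value = <1, pageid>; the leading 1 tags the source table)
key  value
111  <1,1>
111  <1,2>
222  <1,1>

Map output for user (key = userid, value = <2, age>; the leading 2 tags the source table)
key  value
111  <2,25>
222  <2,32>

After Shuffle and Sort, the Reduce input is grouped by key
key  value
111  <1,1>
111  <1,2>
111  <2,25>

key  value
222  <1,1>
222  <2,32>

HiveQL – Group By

pv_users
pageid  age
1       25
2       25
1       32
2       25

pageid_age_sum
pageid  age  Count
1       25   1
2       25   2
1       32   1

• SQL:
  INSERT INTO TABLE pageid_age_sum
  SELECT pageid, age, count(1)
  FROM pv_users
  GROUP BY pageid, age;

HiveQL – Group By in MapReduce

Map output, first split of pv_users (rows 1,25 and 2,25)
key     value
<1,25>  1
<2,25>  1

Map output, second split of pv_users (rows 1,32 and 2,25)
key     value
<1,32>  1
<2,25>  1

After Shuffle and Sort, the Reduce input is grouped by key
key     value
<1,25>  1
<1,32>  1

key     value
<2,25>  1
<2,25>  1

Each reducer sums the values for a key, which gives the rows of pageid_age_sum.

HiveQL – Group By with Distinct

page_view
pageid  userid  time
1       111     9:08:01
2       111     9:08:13
1       222     9:08:14
2       111     9:08:20

result
pageid  count_distinct_userid
1       2
2       1

• SQL
  SELECT pageid, COUNT(DISTINCT userid)
  FROM page_view GROUP BY pageid

Hive Optimizations

Efficient execution of SQL on MapReduce

(Simplified) MapReduce Revisit

[Figure: Machine 1 and Machine 2; stages Local Map, Global Shuffle, Local Sort, Local Reduce]

Hive Optimizations – Merge Sequential MapReduce Jobs

[Figure: table A (key, av) with row (1, 111); table AB (key, av, bv) with row (1, 111, 222); tables B and ABC are also labeled]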
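The Join and Group By slides describe how Hive runs a query as map, shuffle/sort, and reduce steps. The following is a minimal Python sketch of that flow, not Hive's actual implementation. It uses the example rows from the slides; the function names (`map_page_view`, `map_user`, `shuffle_sort`, `reduce_join`, `group_by_count`) and the tag constants are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Example rows copied from the slides.
page_view = [(1, 111, "9:08:01"), (2, 111, "9:08:13"), (1, 222, "9:08:14")]
user = [(111, 25, "female"), (222, 32, "male")]

# Tags marking which table a value came from (1 and 2, as in the slide).
PAGE_VIEW_TAG = 1
USER_TAG = 2


def map_page_view(rows):
    """Map step for page_view: key = userid, value = <1, pageid>."""
    for pageid, userid, _time in rows:
        yield userid, (PAGE_VIEW_TAG, pageid)


def map_user(rows):
    """Map step for user: key = userid, value = <2, age>."""
    for userid, age, _gender in rows:
        yield userid, (USER_TAG, age)


def shuffle_sort(pairs):
    """Group values by key and sort keys and values, standing in for shuffle and sort."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sorted(values) for key, values in sorted(groups.items())}


def reduce_join(grouped):
    """Reduce step: for each userid, pair every pageid with every age."""
    for _userid, values in grouped.items():
        pageids = [v for tag, v in values if tag == PAGE_VIEW_TAG]
        ages = [v for tag, v in values if tag == USER_TAG]
        for pageid in pageids:
            for age in ages:
                yield pageid, age


def join_pv_users():
    """Sketch of: SELECT pv.pageid, u.age FROM page_view pv JOIN user u ON userid."""
    pairs = list(map_page_view(page_view)) + list(map_user(user))
    return list(reduce_join(shuffle_sort(pairs)))


def group_by_count(rows):
    """Sketch of: SELECT pageid, age, count(1) ... GROUP BY pageid, age."""
    # Map emits ((pageid, age), 1); reduce sums the ones for each key.
    pairs = (((pageid, age), 1) for pageid, age in rows)
    return [(pageid, age, sum(ones))
            for (pageid, age), ones in shuffle_sort(pairs).items()]


if __name__ == "__main__":
    print(join_pv_users())
    # The Group By slide uses a four-row pv_users table.
    print(group_by_count([(1, 25), (2, 25), (1, 32), (2, 25)]))
```

Tagging each value with its source table is what lets a single reducer tell page_view rows from user rows once they arrive under the same userid key. In this sketch everything runs in one process, so `shuffle_sort` only imitates the grouping that a real cluster performs across machines.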