085_optimizing_recursive_programs_in_mr
6/17/2013 Bill Howe, UW

In MapReduce

Bu et al. VLDB 10, VLDBJ 12
Shaw et al. Datalog 12

Join (compute the next generation of nodes)
Difference (remove the ones we've already seen)

[Diagram: each iteration runs a Join job and a Difference job, both map/reduce. Inputs: ΔA_{i-1}, the partitions R(0) and R(1), and the partitions A_i(0) and A_i(1). A driver checks "Anything new?"; if so, i = i + 1 and the loop repeats, otherwise done.]

Evaluating Recursive Queries at Scale

Join (compute the next generation of answers)
Difference (remove the ones we've already seen)

[Diagram: the same Join and Difference jobs, with the inputs labeled (a) and (b).]

(a) R is loop invariant, but gets loaded and shuffled on each iteration.
(b) A grows slowly and monotonically, but is loaded and shuffled on each iteration.

Idea: Cache Loop-Invariant Data

Iteration i = 0: Load a distributed cache
Iteration i > 0: Join and Difference read from the cache

[Diagram: the Join and Difference jobs with the partitions R(0), R(1), A_i(0), and A_i(1) held in the cache next to the tasks that use them.]

BTC 2010, 680 GB
Cache-enabled join

[Plot: time per iteration, "cache" vs. "no cache". x-axis: iteration # (3 jobs per iteration), 0 to 10. y-axis: time.]

Effect of various optimizations for a recursive graph query on BTC 2010 (query: transitive reachability from 7 nodes)

[Plot: time (s) by iteration #. Series:
  no optimizations
  (e): (a) w/o file combiner
  (c): (a) w/o project opt.
  (b): (a) w/o faster cache
  (d): (a) w/o diff cache
  (a) all optimizations
  raw Hadoop]

Takeaways:
• 10x improvement over no optimizations.
• All optimizations are useful.
• We're approaching the raw overhead of Hadoop (bottom gray line).

New tuples discovered by iteration number

[Plot: # of tuples discovered (log scale, 1 to 100,000,000) by iteration #. Series: frontier tuples; previously discovered tuples removed.]
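The Join / Difference / "Anything new?" loop on the slides can be illustrated with a small single-machine sketch in Python. This is not the implementation from the talk or from the cited papers: it runs in memory rather than on Hadoop, and the names `build_join_index` and `reachable` are hypothetical. The one-time index over R is only an analogy for caching loop-invariant data; the list of frontier sizes corresponds loosely to the "new tuples discovered by iteration number" plot.

```python
from collections import defaultdict


def build_join_index(edges):
    """Index the loop-invariant edge relation R by source node.

    Built once before the loop; this stands in (by analogy only) for
    loading R into a cache at iteration 0 instead of on every iteration.
    """
    index = defaultdict(set)
    for src, dst in edges:
        index[src].add(dst)
    return index


def reachable(edges, seeds):
    """Semi-naive transitive reachability from a set of seed nodes.

    Returns (all reachable nodes including the seeds,
             number of newly discovered nodes per iteration).
    """
    r_index = build_join_index(edges)  # R: built once, reused every iteration
    seen = set(seeds)                  # A: everything discovered so far
    delta = set(seeds)                 # delta A: the current frontier
    frontier_sizes = []

    while delta:                       # driver: "Anything new?"
        # Join: follow edges out of the frontier only, not out of all of A.
        candidates = set()
        for node in delta:
            candidates |= r_index.get(node, set())

        # Difference: remove the ones we've already seen.
        delta = candidates - seen
        seen |= delta
        if delta:
            frontier_sizes.append(len(delta))

    return seen, frontier_sizes


if __name__ == "__main__":
    # Hypothetical toy graph: a cycle 1 -> 2 -> 3 -> 1, a tail 3 -> 4,
    # and an edge 5 -> 6 that is unreachable from node 1.
    toy_edges = [(1, 2), (2, 3), (3, 1), (3, 4), (5, 6)]
    nodes, sizes = reachable(toy_edges, {1})
    print(sorted(nodes), sizes)
```

Joining only the frontier against R, rather than all of A, is what keeps each iteration's work proportional to the number of newly discovered tuples; the loop stops at the first iteration that discovers nothing new.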