The HiBench Benchmark Suite: Characterization of the MapReduce-Based Data Analysis

Shengsheng Huang, Jie Huang, Jinquan Dai, Tao Xie, and Bo Huang
Intel China Software Center, Shanghai, P.R. China, 200241
{shengsheng.huang, jie.huang, jason.dai, tao.xie, bo.huang}@intel.com

Abstract—The MapReduce model is becoming prominent for the large-scale data analysis in the cloud. In this paper, we present the benchmarking, evaluation and characterization of Hadoop, an open-source implementation of MapReduce. We first introduce HiBench, a new benchmark suite for Hadoop. It consists of a set of Hadoop programs, including both synthetic micro-benchmarks and real-world Hadoop applications. We then evaluate and characterize the Hadoop framework using HiBench, in terms of speed (i.e., job running time), throughput (i.e., the number of tasks completed per minute), HDFS bandwidth, system resource (e.g., CPU, memory and I/O) utilizations, and data access patterns.

I. INTRODUCTION

The transition to cloud computing is a disruptive trend, where most users will perform their computing work by accessing services in the cloud through the clients. There are dramatic differences between delivering software as a service in the cloud for millions to use, versus distributing software as bits for millions to run on their PCs. First and foremost, services must be highly scalable, storin [...]

[...] cloud. In addition, many new systems built on top of Hadoop (e.g., Pig [3], Hive [4], Mahout [5] and HBase [6]) have emerged and been used by a wide range of data analysis applications. Therefore, it is essential to quantitatively evaluate and characterize the Hadoop framework through extensive benchmarking, so as to optimize the performance and total cost of ownership of Hadoop deployments, and to understand the tradeoffs of new computer system designs for the MapReduce-based data analysis using Hadoop. Unfortunately, existing Hadoop benchmark programs (e.g., GridMix [7] and the Hive performance benchmark [9]) cannot properly evaluate the Hadoop framework due to the limitations in their representativeness and diversity. For instance, Yahoo has resorted to the simplistic sorting programs [10] to evaluate their Hadoop clusters [11].

In this paper, we first propose HiBench, a new, realistic and comprehensive benchmark suite for Hadoop, which consists