Revisiting Sorting for GPGPU Stream Architectures

Duane Merrill (dgm4d@virginia.edu)
Andrew Grimshaw (grimshaw@virginia.edu)

Abstract

This report presents efficient strategies for sorting large sequences of fixed-length keys (and values) using GPGPU stream processors. Our radix sorting methods demonstrate sorting rates of 482 million key-value pairs per second, and 550 million keys per second (32-bit). Compared to the state-of-the-art, our implementations exhibit speedup of at least 2x for all fully-programmable generations of NVIDIA GPUs, and up to 3.7x for current GT200-based models. For this domain of sorting problems, we believe our sorting primitive to be the fastest available for any fully-programmable microarchitecture.

We obtain our sorting performance by using a parallel scan stream primitive that has been generalized in two ways: (1) with local interfaces for producer/consumer operations (visiting logic), and (2) with interfaces for performing multiple related, concurrent prefix scans (multi-scan). These abstractions allow us to improve the overall utilization of memory and computational resources while maintaining the flexibility of a reusable component. We require 38% fewer bytes to be moved through the global memory subsystem and a 64% reduction in the number of thread-cycles needed for computation.

As part of this work, we demonstrate a method for encoding multiple compaction problems into a single, composite parallel scan. This technique provides our local sorting strategies with a 2.5x speedup over bitonic sorting networks for small problem instances, i.e., sequences that can be entirely sorted within the shared memory local to a single GPU core.

1 Introduction

The transformation of the fixed-function graphics processing unit into a fully-programmable, high-bandwidth coprocessor (GPGPU) has introduced a wealth of performance opportunities for many data-parallel problems. As a new and disruptive genre of microarchitecture, it will be important to establish efficient computational primitives for the corresponding programming model. Computational primitives promote software flexibility via abstraction and reuse, and much effort has been spent investigating efficient primitives for GPGPU stream architectures. Parallel sorting has been no exception: the need to rank and o
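
The abstract's idea of encoding multiple compaction problems into a single, composite scan can be illustrated with a small sequential sketch. The code below is not the authors' GPU implementation. It assumes a 2-bit radix digit, four 8-bit counter fields packed into one 32-bit word, and a tile of at most 255 keys. The function names `composite_scan_split` and `sort_tile` are hypothetical. One exclusive prefix sum over the packed word yields the rank of every key within its digit bucket, replacing four separate flag scans.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the report): stable partition of a small tile
// by the 2-bit digit at bit position `shift`, using ONE prefix scan over a
// packed word instead of four separate flag scans.
// Assumption: keys.size() <= 255, so no 8-bit counter field can overflow.
std::vector<std::uint32_t> composite_scan_split(
        const std::vector<std::uint32_t>& keys, unsigned shift) {
    const std::size_t n = keys.size();
    assert(n <= 255 && "8-bit counter fields would overflow");

    // Exclusive prefix sum over packed flags. Field d (bits 8d..8d+7) of
    // scanned[i] counts the earlier keys whose digit equals d.
    std::vector<std::uint32_t> scanned(n);
    std::uint32_t running = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint32_t digit = (keys[i] >> shift) & 0x3u;
        scanned[i] = running;
        running += 1u << (8u * digit);  // set this key's flag in its field
    }

    // `running` now holds the four bucket totals; turn them into base offsets.
    std::uint32_t base[4];
    std::uint32_t offset = 0;
    for (unsigned d = 0; d < 4; ++d) {
        base[d] = offset;
        offset += (running >> (8u * d)) & 0xFFu;
    }

    // Scatter each key to bucket base plus its rank within the bucket.
    std::vector<std::uint32_t> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint32_t digit = (keys[i] >> shift) & 0x3u;
        const std::uint32_t rank = (scanned[i] >> (8u * digit)) & 0xFFu;
        out[base[digit] + rank] = keys[i];
    }
    return out;
}

// Least-significant-digit radix sort of one tile: 16 stable 2-bit passes.
std::vector<std::uint32_t> sort_tile(std::vector<std::uint32_t> keys) {
    for (unsigned shift = 0; shift < 32; shift += 2) {
        keys = composite_scan_split(keys, shift);
    }
    return keys;
}
```

Calling `sort_tile` on a vector of up to 255 32-bit keys returns them in ascending order. On a GPU the sequential loop would be replaced by a parallel scan in shared memory; the packing trick itself is unchanged, since integer addition acts on all four fields at once as long as no field overflows.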