Acta Geodaetica et Cartographica Sinica, Vol. 39, No. 1, Feb. 2010
Article ID: 1001-1595(2010)01-0046-06

Parallel Image Matching Algorithm Based on GPGPU

XIAO Han 1,2, ZHANG Zuxun 1
1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China; 2. Zhengzhou Teachers College, Zhengzhou 450044, China

Abstract: With the development of satellite remote sensing technology, transforming massive data into user information in a short time has become a key issue in the remote sensing field. Traditional image matching algorithms, whose optimization and implementation were designed for the common CPU, cannot be applied effectively on a graphics processing unit (GPU). A fast parallel image matching algorithm is presented, based on general-purpose computing on graphics processing units (GPGPU) that support Compute Unified Device Architecture (CUDA). The algorithm can execute high-performance parallel computing in the Single Instruction Multiple Thread (SIMT) pattern. On the basis of the parallel architecture and hardware characteristics of the GPU, the parallel algorithm introduces three speedup methods to improve implementation performance: execution configuration technology, high-speed storage technology and global storage technology, which optimize the data storage structure and improve data access efficiency. The experimental results show that the GPU can implement the parallel algorithm with high efficiency: processing of 8-bit 1280×1024 pictures reaches the highest multiprocessor warp occupancy, and the processing speed is 7 times that of the CPU-based implementation. The comparison between CUDA and CPU in image matching algorithms shows the advantage of CUDA in real-time, high-arithmetic-intensity data processing, and this provides new methods and ideas for optimizing image matching performance and GPGPU.

Key words: fine-grained parallel computing; GPGPU; CUDA; image matching; SIMT

Abstract (Chinese version): A fast parallel image matching algorithm based on the CUDA architecture of GPGPU is proposed; it performs high-performance parallel computing in SIMT mode. Based on the parallel structure and hardware characteristics of the GPU, the parallel algorithm adopts three acceleration techniques (execution configuration, high-speed storage and global storage) to optimize the data storage structure and improve data access efficiency. Experimental results show that the parallel algorithm makes full use of the GPU's parallel processing capability: when processing 8-bit grayscale images at 1280×1024 resolution, it reaches the highest multiprocessor warp occupancy, and its speed is 7 times that of the CPU-based implementation. CUDA, in high-arithmetic-intensity data