Optimization

OpenCL Best Practices Guide

May 27, 2010

REVISIONS
    July 2009 (Original Release)
    April 2010
    May 2010

Table of Contents

Preface
    What Is This Document?
    Who Should Read This Guide?
    Recommendations and Best Practices
    Contents Summary

Chapter 1. Heterogeneous Computing with OpenCL
    1.1 Differences Between Host and Device
    1.2 What Runs on an OpenCL-Enabled Device?
    1.3 Maximum Performance Benefit

Chapter 2. Performance Metrics
    2.1 Timing
        2.1.1 Using CPU Timers
        2.1.2 Using OpenCL GPU Timers
    2.2 Bandwidth
        2.2.1 Theoretical Bandwidth Calculation
        2.2.2 Effective Bandwidth Calculation
        2.2.3 Throughput Reported by the OpenCL Visual Profiler

Chapter 3. Memory Optimizations
    3.1 Data Transfer Between Host and Device
        3.1.1 Pinned Memory
        3.1.2 Asynchronous Transfers
        3.1.3 Overlapping Transfers and Device Computation
    3.2 Device Memory Spaces
        3.2.1 Coalesced Access to Global Memory
            3.2.1.1 A Simple Access Pattern
            3.2.1.2 A Sequential but Misaligned Access Pattern
            3.2.1.3 Effects of Misaligned Accesses
            3.2.1.4 Strided Accesses
        3.2.2 Shared Memory
            3.2.2.1 Shared Memory and Memory Banks
            3.2.2.2 Shared Memory in Matrix Multiplication (C = AB)
            3.2.2.3 Shared Memory in Matrix Multiplication (C = AAᵀ)
            3.2.2.4 Shared Memory Use by Kernel Arguments
        3.2.3 Local Memory
        3.2.4 Texture Memory
            3.2.4.1 Textured Fetch vs. Global Memory Read
            3.2.4.2 Additional Texture Capabilities
        3.2.5 Constant Memory
        3.2.6 Registers
            3.2.6.1 Register Pressure

Chapter 4. NDRange Optimizations
    4.1 Occupancy
    4.2 Calculating Occupancy
    4.3 Hiding Register Dependencies
    4.4 Thread and Block Heuristics
    4.5 Effects of Shared Memory

Chapter 5. Instruction Optimizations