2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture

Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks

Yu-Hsin Chen∗, Joel Emer∗† and Vivienne Sze∗
∗EECS, MIT, Cambridge, MA 02139
†NVIDIA Research, NVIDIA, Westford, MA 01886
∗{yhchen, jsemer, sze}@mit.edu

Abstract—Deep convolutional neural networks (CNNs) are widely used in modern AI systems for their superior accuracy but at the cost of high computational complexity. The complexity comes from the need to simultaneously process hundreds of filters and channels in the high-dimensional convolutions, which involve a significant amount of data movement. Although highly-parallel compute paradigms, such as SIMD/SIMT, effectively address the computation requirement to achieve high throughput, energy consumption still remains high as data movement can be more expensive than computation. Accordingly, finding a dataflow that supports parallel processing with minimal data movement cost is crucial to achieving energy-efficient CNN processing without compromising accuracy. In this paper, we present a novel dataflow, called row-stationary (RS), that minimizes data movement energy consumption on a spatial architecture. This is realized by exploiting local data reuse of filter weights and feature map pixels, i.e., activations, in the high-dimensional convolutions, and minimizing data movement of partial sum accumulations.

The large size of such networks poses both throughput and energy efficiency challenges to the underlying processing hardware. Convolutions account for over 90% of the CNN operations and dominate runtime [10]. Although these operations can leverage highly-parallel compute paradigms, such as SIMD/SIMT, throughput may not scale accordingly due to the accompanying bandwidth requirement, and the energy consumption remains high as data movement can be more expensive than computation [11–13]. In order to achieve energy-efficient CNN processing without compromising throughput, we need to develop dataflows that support parallel processing with minimal data movement. The differences in data movement energy cost based on where the data is stored also need to be accounted for. For instance, fetching data from off-chip DRAMs costs orders of magnitude more energy than from on-chip st