Memory Barriers: a Hardware View for Software Hackers

Paul E. McKenney
Linux Technology Center
IBM Beaverton
paulmck@linux.vnet.ibm.com

July 23, 2010

So what possessed CPU designers to cause them to inflict memory barriers on poor unsuspecting SMP software designers?

In short, because reordering memory references allows much better performance, and so memory barriers are needed to force ordering in things like synchronization primitives whose correct operation depends on ordered memory references.

Getting a more detailed answer to this question requires a good understanding of how CPU caches work, and especially what is required to make caches really work well. The following sections:

1. present the structure of a cache,
2. describe how cache-coherency protocols ensure that CPUs agree on the value of each location in memory, and, finally,
3. outline how store buffers and invalidate queues help caches and cache-coherency protocols achieve high performance.

We will see that memory barriers are a necessary evil that is required to enable good performance and scalability, an evil that stems from the fact that CPUs are orders of magnitude faster than are both the interconnects between them and the memory they are attempting to access.

1 Cache Structure

Modern CPUs are much faster than are modern memory systems. A 2006 CPU might be capable of executing ten instructions per nanosecond, but will require many tens of nanoseconds to fetch a data item from main memory. This disparity in speed, more than two orders of magnitude, has resulted in the multi-megabyte caches found on modern CPUs. These caches are associated with the CPUs as shown in Figure 1, and can typically be accessed in a few cycles.[1]

[1] It is standard practice to use multiple levels of cache, with a small level-one cache close to the CPU with single-cycle access time, and a larger level-two cache with a longer access time, perhaps roughly ten clock cycles. Higher-performance CPUs often have three or even four levels of cache.

[Figure 1: Modern Computer System Cache Structure. CPU 0 and CPU 1 each have their own cache; both caches connect through an interconnect to memory.]

Data flows among the CPUs' caches and memory in fixed-length blocks called "cache lines", which are normally a power of two in size, ranging from 16 to 256 bytes. When a given data item is first accessed by a given CPU, it will be absent from that CPU's cache,