mit-introduction to numerical methods-hd

ID:33757306

Size: 1.54 MB

Pages: 93

Date: 2019-02-28

Resource description:

18.335 Fall 2008
Performance Experiments with Matrix Multiplication
Steven G. Johnson

Hardware: 2.66 GHz Intel Core 2 Duo, 64-bit mode, double precision, gcc 4.1.2
Optimized BLAS dgemm: ATLAS 3.6.0, http://math-atlas.sourceforge.net/

A trivial problem?

C = A B, where C is m×p, A is m×n, and B is n×p:

for i = 1 to m
  for j = 1 to p
    $C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$

The “obvious” C code:

    /* C = A B, where A is m x n, B is n x p, and C is m x p, in row-major order */
    void matmul(const double *A, const double *B, double *C, int m, int n, int p)
    {
        int i, j, k;
        for (i = 0; i
        }
    }

Just three loops, how complicated can it get?

flops/time is not constant! (square matrices, m = n = p)

[Plot annotations: L1 cache exceeded? (2.66 GHz processor? why < 1 gigaflops?) L2 cache exceeded? L1 cache exceeded for single row?]

Not all “noise” is random

All flops are not created equal

[Plot annotations: nearly peak theoretical flop rate (2 flops/cycle via SSE2 instructions); same # operations, same abstract algorithm; factor of 10 in speed]

Things to remember

• We cannot understand performance without understanding memory efficiency (caches).
  – ~10 times more important than arithmetic count
• Computers are more complicated than you think.
• Even a trivial algorithm is nontrivial to implement well.
  – matrix multiplication: 10 lines of code → 130,000+ (ATLAS)

MIT OpenCourseWare
http://ocw.mit.edu
18.335J / 6.337J Introduction to Numerical Methods
Fall 2010
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

ideal cache

[Diagram labels: CPU, cache (Z items), main memory]

cache hit: CPU needs item in cache (fast)

cache miss: CPU needs item not in cache; item loaded into cache for future use, replacing some other item

optimal replacement: on cache miss, loaded item replaces item that will not be needed for the longest time in the future
[more realistic scheme: LRU replacement (replace least recently used item), provably within small constant factor of optimal, but much harder to analyze]

fully associative: any item in memory can go anywhere in the cache
[real caches have limited associativity, which causes “unlucky” memory-access patterns to go same place in cache … effectively shrinks cache in these cases]

temporal locality: same item is re-used for several computations that are close to one another in time ⇒ still in-cache ⇒ efficient
[th
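To make the cache model above concrete, here is a minimal sketch of a miss counter for a fully associative cache of Z items with LRU replacement. It is an illustration only, not code from the slides: the function name `lru_misses` and the representation of an access trace as an array of integer item ids are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Count cache misses for a fully associative cache holding Z items,
 * using LRU (least recently used) replacement.
 * trace[t] is the id of the item the CPU needs at step t.
 * Returns the number of misses, or -1 if memory allocation fails.
 * (Hypothetical helper for illustration; not from the slides.) */
long lru_misses(const int *trace, size_t len, size_t Z)
{
    assert(Z > 0);
    int *item = malloc(Z * sizeof *item);     /* item id stored in each slot */
    size_t *last = malloc(Z * sizeof *last);  /* step of last use, per slot  */
    if (!item || !last) {
        free(item);
        free(last);
        return -1;
    }

    size_t used = 0;   /* number of occupied slots */
    long misses = 0;

    for (size_t t = 0; t < len; ++t) {
        /* Fully associative: the item may sit in any slot, so search all. */
        size_t slot = used;                   /* "not found" sentinel */
        for (size_t s = 0; s < used; ++s) {
            if (item[s] == trace[t]) {
                slot = s;
                break;
            }
        }
        if (slot == used) {                   /* cache miss */
            ++misses;
            if (used < Z) {
                slot = used++;                /* fill an empty slot */
            } else {
                slot = 0;                     /* evict least recently used */
                for (size_t s = 1; s < Z; ++s)
                    if (last[s] < last[slot])
                        slot = s;
            }
            item[slot] = trace[t];
        }
        last[slot] = t;                       /* record this use */
    }

    free(item);
    free(last);
    return misses;
}
```

With Z = 3, a trace that cycles through four items (0, 1, 2, 3, 0, 1, 2, 3) misses on every access under LRU, while a trace that re-uses each item immediately (0, 0, 1, 1, 2, 2, 3, 3) misses only once per distinct item. That difference is the temporal-locality effect described above.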

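The slides' point that the same arithmetic can run at very different speeds depending on memory access can be sketched with two versions of the multiplication. The code below is a hedged illustration, not the slides' code (which is truncated in this preview) and not how ATLAS is implemented: `matmul_naive` is one plausible completion of the three-loop routine, and `matmul_blocked` with its block-size parameter `bs` is an assumed, simplified example of cache blocking.

```c
#include <assert.h>

/* C = A B, where A is m x n, B is n x p, and C is m x p, all row-major.
 * A plausible version of the plain three-loop routine. */
void matmul_naive(const double *A, const double *B, double *C,
                  int m, int n, int p)
{
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < p; ++j) {
            double sum = 0.0;
            for (int k = 0; k < n; ++k)
                sum += A[i * n + k] * B[k * p + j];
            C[i * p + j] = sum;
        }
    }
}

static int min_int(int a, int b) { return a < b ? a : b; }

/* Same product computed block by block. bs is a hypothetical tuning
 * parameter: the idea is to pick it so that three bs x bs blocks fit
 * in cache, so each loaded item is re-used while still in cache. */
void matmul_blocked(const double *A, const double *B, double *C,
                    int m, int n, int p, int bs)
{
    assert(bs > 0);
    for (int i = 0; i < m * p; ++i)
        C[i] = 0.0;

    for (int ii = 0; ii < m; ii += bs) {
        for (int kk = 0; kk < n; kk += bs) {
            for (int jj = 0; jj < p; jj += bs) {
                int imax = min_int(ii + bs, m);
                int kmax = min_int(kk + bs, n);
                int jmax = min_int(jj + bs, p);
                /* Multiply one block of A by one block of B and
                 * accumulate into the matching block of C. */
                for (int i = ii; i < imax; ++i) {
                    for (int k = kk; k < kmax; ++k) {
                        double a = A[i * n + k];
                        for (int j = jj; j < jmax; ++j)
                            C[i * p + j] += a * B[k * p + j];
                    }
                }
            }
        }
    }
}
```

Both routines perform the same number of multiply-adds. Any speed difference on a given machine comes from how often operands are still in cache when they are needed, and a good value of `bs` depends on the cache sizes of that machine.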