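Topology and Geometry of Half-Rectified Network Optimization
Daniel Freeman (1) and Joan Bruna (2)
(1) UC Berkeley, (2) Courant Institute and Center for Data Science, NYU
Stats 385, Stanford, Nov 15th

Motivation
• We consider the standard ML setup:
  $\hat{E}(\Theta) = \mathbb{E}_{(X,Y)\sim\hat{P}}\, \ell(\Phi(X;\Theta), Y) + R(\Theta)$, with $\hat{P} = \frac{1}{n}\sum_i \delta_{(x_i,y_i)}$
  $E(\Theta) = \mathbb{E}_{(X,Y)\sim P}\, \ell(\Phi(X;\Theta), Y)$
  $\ell(z)$ convex; $R(\Theta)$: regularization.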
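As a rough illustration of this setup, the sketch below evaluates $\hat{E}(\Theta)$ on a small sample and approximates $E(\Theta)$ by averaging the loss over a much larger fresh sample. The linear model, squared loss, L2 regularizer, synthetic Gaussian data, and all function names are assumptions chosen for the example; none of them come from the lecture.

```python
import random

def phi(x, theta):
    """Hypothetical model Phi(x; Theta): a linear predictor (assumption)."""
    return sum(t * xi for t, xi in zip(theta, x))

def loss(z, y):
    """Convex loss ell; squared error is an assumption."""
    return (z - y) ** 2

def regularizer(theta, lam):
    """R(Theta); an L2 penalty is an assumption."""
    return lam * sum(t * t for t in theta)

def empirical_loss(theta, sample, lam=0.0):
    """E_hat(Theta): mean loss over the sample plus R(Theta)."""
    data_term = sum(loss(phi(x, theta), y) for x, y in sample) / len(sample)
    return data_term + regularizer(theta, lam)

def draw_sample(n, rng, true_theta=(1.0, -2.0), noise=0.1):
    """Draw n pairs (x, y) from a synthetic linear-Gaussian distribution P."""
    sample = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in true_theta]
        y = phi(x, true_theta) + rng.gauss(0, noise)
        sample.append((x, y))
    return sample

rng = random.Random(0)
train = draw_sample(20, rng)       # plays the role of P_hat
fresh = draw_sample(20000, rng)    # Monte Carlo stand-in for P
theta = [0.9, -1.8]
e_hat = empirical_loss(theta, train, lam=0.01)
e_pop = empirical_loss(theta, fresh)  # no R term, matching E(Theta)
print("E_hat(theta) =", e_hat)
print("E(theta) approx =", e_pop)
```

The two printed numbers generally differ; how much depends on the sample size and on the parameters at which the losses are evaluated.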
• Population loss decomposition (aka "fundamental theorem of ML"):
  $E(\Theta^*) = \underbrace{\hat{E}(\Theta^*)}_{\text{training error}} + \underbrace{E(\Theta^*) - \hat{E}(\Theta^*)}_{\text{generalization gap}}$.
• Long history of techniques to provably control generalization error via appropriate regularization.
• Generalization error and optimization are entangled [Bottou & Bousquet].

Motivation
• However, when $\Phi(X;\Theta)$ is a large, deep network, current best mechanism to control generalization gap has two key ingredients:
  – Stochastic Optimization
    ❖ "During training, it adds the sampling noise that corresponds to empirical-population mismatch" [Léon Bottou].
  – Make the model as large as possible.
    ❖ see e.g. "Understanding Deep Learning Requires Rethinking Generalization", [Ch. Zhang et al, ICLR'17].
• We first address how overparametrization affects the energy landscapes $E(\Theta)$, $\hat{E}(\Theta)$.
• Goal 1: Study simple topological properties of these landscapes for half-rectified neural networks.
• Goal 2: Estimate simple geometric properties with efficient, scalable algorithms. Diagnostic tool.

Outline of the Lecture
• Topology of Deep Network Energy Landscapes
• Geometry of Deep Network Energy Landscapes
• Energy Landscapes, Statistical Inference and Phase Transitions.

Prior Related Work
• Models from Statistical physics have been considered as possible approximations [Dauphin et al.'14, Choromanska et al.'15, Segun et al.'15]
• Tensor factorization models c