Rich Caruana, Steve Lawrence, C. Lee Giles. Overfitting in Neural Networks: Backpropagation, Conjugate Gradient, and Early Stopping, Neural Information Processing Systems, Denver, Colorado, November 28–30, 2000.

Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping

Rich Caruana, CALD, CMU, 5000 Forbes Ave., Pittsburgh, PA 15213, caruana@cs.cmu.edu
Steve Lawrence, NEC Research Institute, 4 Independence Way, Princeton, NJ 08540, lawrence@research.nj.nec.com
Lee Giles, Information Sciences, Penn State University, University Park, PA 16801, giles@ist.psu.edu

Abstract

The conventional wisdom is that backprop nets with excess hidden units generalize poorly. We show that nets with excess capacity generalize well when trained with backprop and early stopping. Experiments suggest two reasons for this: 1) Overfitting can vary significantly in different regions of the model. Excess capacity allows better fit to regions of high non-linearity, and backprop often avoids overfitting the regions of low non-linearity. 2) Regardless of size, nets learn task subcomponents in similar sequence. Big nets pass through stages similar to those learned by smaller nets. Early stopping can stop training the large net when it generalizes comparably to a smaller net. We also show that conjugate gradient can yield worse generalization because it overfits regions of low non-linearity when learning to fit regions of high non-linearity.

1 Introduction

It is commonly believed that large multi-layer perceptrons (MLPs) generalize poorly: nets with too much capacity overfit the training data. Restricting net capacity prevents overfitting because the net has insufficient capacity to learn models that are too complex. This belief is consistent with a VC-dimension analysis of net capacity vs. generalization: the more free parameters in the net the larger the VC-dimension of the hypothesis space, and the less likely the training sample is large enough to select a (nearly) correct hypothesis [2].

Once it became feasible to train large nets on real problems, a number of MLP users noted that the overfitting they expected from nets with excess capacity did not occur. Large nets appeared to generalize as well as smaller nets, sometimes better. The earliest report of this that we are aware of is Martin and Pittman in 1991: “We find only marginal an