8 Regression Models with Regularization

In Chapter 6, Linear Regression Analysis, and Chapter 7, The Logistic Regression Model, we focused on the linear and the logistic regression models. In the model selection issues with the linear regression model, we found that a covariate is either selected or not, depending on the associated p-value. However, the rejected covariates are not given any kind of consideration once the p-value exceeds the threshold. This may lead to discarding covariates even if they have some say on the regressand. In particular, the final model may thus lead to overfitting of the data, and this problem needs to be addressed.

We will first consider fitting a polynomial regression model, without the technical details, and see how higher-order polynomials give a very good fit, which actually comes at a higher price. A more general framework of B-splines is considered next. This approach leads us to the smoothing spline models, which are actually ridge regression models. The chapter concludes with an extension of ridge regression to the linear and logistic regression models. For more details of the coverage, refer to Chapter 2 of Berk (2008) and Chapter 5 of Hastie et al. (2008).

This chapter will unfold on the following topics:
The problem of overfitting in a general regression model
The use of regression splines for certain special cases
Improving estimators of the regression coefficients, and overcoming the problem of overfitting with ridge regression for linear and logistic models
The framework of train + validate + test for regression models

The overfitting problem

The limitation of the linear regression model is best understood through an example. I have created a hypothetical dataset for understanding the problem of overfitting. A scatterplot of
the dataset is shown in the following figure. It appears from the scatterplot that for x values up to 6, there is a linear increase in y, and a bird's-eye estimate of the slope is (50 - 10)/(5.5 - 1.75) = 10.67. This slope may be on account of a linear term or even a quadratic term. On the other hand, the decline in y-values for x-values greater than 6 is very steep, approximately (10 - 50)/(10 - 6) = -10. Now, looking at the complete picture, it appears that the output Y depends upon the higher orders of the covariate X. Let us fit polynomi
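
The two slope estimates quoted from the scatterplot can be checked with a few lines of arithmetic. The sketch below is only an illustration. It assumes that the numbers in the text correspond to the points (1.75, 10) and (5.5, 50) on the rising part of the plot and (6, 50) and (10, 10) on the falling part. The helper name `slope` is hypothetical and does not come from the chapter.

```python
# Points read off the scatterplot by eye; the coordinates are an
# assumption inferred from the two ratios quoted in the text.
def slope(x0, y0, x1, y1):
    """Slope of the straight line through (x0, y0) and (x1, y1)."""
    return (y1 - y0) / (x1 - x0)

rising = slope(1.75, 10, 5.5, 50)   # left part of the plot, x up to about 6
falling = slope(6, 50, 10, 10)      # right part of the plot, x greater than 6

print(round(rising, 2), round(falling, 2))
```

A slope that changes sign in this way cannot be captured by a single straight line, which is why the text turns to higher-order terms of X.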