University of Liège
Faculty of Applied Sciences
Department of Electrical Engineering & Computer Science

PhD dissertation

UNDERSTANDING RANDOM FORESTS
from theory to practice

by Gilles Louppe

Advisor: Prof. Pierre Geurts
July 2014

JURY MEMBERS

Louis Wehenkel, Professor at the Université de Liège (President);
Pierre Geurts, Professor at the Université de Liège (Advisor);
Bernard Boigelot, Professor at the Université de Liège;
Renaud Detry, Postdoctoral Researcher at the Université de Liège;
Gianluca Bontempi, Professor at the Université Libre de Bruxelles;
Gérard Biau, Professor at the Université Pierre et Marie Curie (France);

ABSTRACT

Data analysis and machine learning have become an integral part of the modern scientific methodology, offering automated procedures for the prediction of a phenomenon based on past observations, unraveling underlying patterns in data and providing insights about the problem. Yet, care should be taken to avoid using machine learning as a black-box tool, and rather to consider it as a methodology, with a rational thought process that is entirely dependent on the problem under study. In particular, the use of algorithms should ideally require a reasonable understanding of their mechanisms, properties and limitations, in order to better apprehend and interpret their results.

Accordingly, the goal of this thesis is to provide an in-depth analysis of random forests, consistently calling into question each and every part of the algorithm, in order to shed new light on its learning capabilities, inner workings and interpretability. The first part of this work studies the induction of decision trees and the construction of ensembles of randomized trees, motivating their design and purpose whenever possible. Our contributions follow with an original complexity analysis of random forests, showing their good computational performance and scalability, along with an in-depth discussion of their implementation details, as contributed within Scikit-Learn.

In the second part of this work, we analyze and discuss the interpretability of random forests through the lens of variable importance measures. The core of our contributions lies in the theoretical characterization of the Mean Decrease of Impurity variable importance measure, from which we prove and derive some of its properties