Linear Algebra Methods for Data Mining
Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi
Spring 2007
Overview of some topics covered and some topics not covered on this course
Linear Algebra Methods for Data Mining, Spring 2007, University of Helsinki

Linear algebra toolkit
• QR iteration
• eigenvalues, eigenvalue decomposition, generalized eigenvalue problem
• singular value decomposition (SVD)
• NMF
• power method (for finding eigenvalues and -vectors)

Data mining tasks encountered
• regression
• classification
• clustering
• finding latent variables
• visualizing and exploration
• ranking
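As a reminder of how the power method from the toolkit works, here is a minimal NumPy sketch. The function name `power_method`, the stopping rule based on the Rayleigh quotient, and the 2×2 test matrix are illustrative assumptions, not taken from the course material.

```python
import numpy as np

def power_method(A, num_iters=500, tol=1e-10, seed=0):
    """Estimate the dominant eigenpair of a square matrix A by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])   # random start vector (assumption)
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(num_iters):
        w = A @ v                          # one multiplication by A
        norm_w = np.linalg.norm(w)
        if norm_w == 0.0:
            break                          # v happens to lie in the null space of A
        v_next = w / norm_w                # renormalise to unit length
        lam_next = v_next @ A @ v_next     # Rayleigh quotient of the unit vector
        converged = abs(lam_next - lam) < tol
        v, lam = v_next, lam_next
        if converged:
            break
    return lam, v

# Hypothetical symmetric example; its eigenvalues are (5 ± sqrt(5)) / 2.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
lam, v = power_method(A)
```

The iteration only finds the eigenvalue of largest magnitude, and it converges at a rate governed by the ratio of the two largest eigenvalue magnitudes.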

QR was used for...
• orthogonalizing a set of (basis) vectors: $X = QR$.
• solving the least-squares problem:
  $$\|r\|^2 = \|b - Ax\|^2 = \left\| Q^T b - \begin{pmatrix} R \\ 0 \end{pmatrix} x \right\|^2 = \|b_1 - Rx\|^2 + \|b_2\|^2,$$
  where $Q^T b$ is partitioned as $(b_1, b_2)$ to match the blocks $R$ and $0$.
• least squares problems were encountered e.g. when we wish to express a matrix $A \in \mathbb{R}^{m \times n}$ in terms of a set of basis vectors $X \in \mathbb{R}^{m \times k}$, $k$ …
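
The QR route to the least-squares solution can be sketched in a few lines of NumPy. This is an illustration under stated assumptions: the helper name `lstsq_qr` and the small 3×2 system are hypothetical, and the matrix is assumed to have full column rank so that $R$ is invertible.

```python
import numpy as np

def lstsq_qr(A, b):
    """Solve min_x ||b - A x|| via a thin QR factorisation of A.

    Assumes A (m x n, m >= n) has full column rank.
    """
    Q, R = np.linalg.qr(A)          # reduced QR: Q is m x n, R is n x n upper triangular
    b1 = Q.T @ b                    # the component of b that R x can match exactly
    x = np.linalg.solve(R, b1)      # solve R x = b1 (a triangular solver would also work)
    residual_norm = np.linalg.norm(b - A @ x)   # this equals ||b_2||
    return x, residual_norm

# Hypothetical example: fit a line c0 + c1 * t through three points.
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])
x, res = lstsq_qr(A, b)
```

Because $Q$ is orthogonal, the part of the residual that depends on $x$ is $\|b_1 - Rx\|$, which the triangular solve drives to zero; what remains is $\|b_2\|$.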

• Linear discriminant analysis: linear discriminants = eigenvectors corresponding to the largest eigenvalues of the generalized eigenvalue problem $S_b w = \lambda S_w w$.
• Spectral clustering: based on running k-means clustering on the matrix obtained from the eigenvectors corresponding to the largest eigenvalues of the graph Laplacian matrix $L = D^{-1/2} A D^{-1/2}$.

[Figure: plot with both axes ranging from −2.5 to 2.5]
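
The generalized eigenvalue problem $S_b w = \lambda S_w w$ behind linear discriminant analysis can be illustrated as follows. Several things here are assumptions rather than course content: $S_b$ and $S_w$ are taken to be the usual between-class and within-class scatter matrices, the helper names `scatter_matrices` and `top_generalized_eigvecs` are hypothetical, the data are synthetic, and the problem is reduced to a standard symmetric one through a Cholesky factor of $S_w$ (which requires $S_w$ to be positive definite).

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (S_w) and between-class (S_b) scatter, standard definitions assumed."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - mean_all)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
    return S_w, S_b

def top_generalized_eigvecs(S_b, S_w, k=1):
    """Largest-k solutions of S_b w = lambda S_w w, assuming S_w is positive definite."""
    L = np.linalg.cholesky(S_w)          # S_w = L L^T
    L_inv = np.linalg.inv(L)
    C = L_inv @ S_b @ L_inv.T            # symmetric matrix with the same eigenvalues
    eigvals, Y = np.linalg.eigh(C)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]
    W = L_inv.T @ Y[:, order]            # map back: w = L^{-T} y
    return eigvals[order], W

# Synthetic two-class data (purely illustrative).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.0, 0.0], 1.0, size=(50, 2)),
               rng.normal([3.0, 1.0], 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

S_w, S_b = scatter_matrices(X, y)
lams, W = top_generalized_eigvecs(S_b, S_w, k=1)
w = W[:, 0]                              # leading linear discriminant
```

With two classes $S_b$ has rank one, so only the leading discriminant carries information; projecting the data onto `w` gives the one-dimensional representation used for classification.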

Spectral clustering
Use methods from spectral graph partitioning to do clustering.
Needed: pairwise distances between data points.
These can be thought of as weights of links in a graph: clustering problem becomes a graph partitioning problem.
Unlike k-means, clusters need not be convex.
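
A small sketch of the spectral clustering recipe, using two concentric rings as an example of clusters that are not convex. The details are assumptions chosen for illustration: a Gaussian kernel turns pairwise distances into link weights, the rows of the eigenvector matrix are normalised before clustering, k-means is a plain Lloyd iteration with a deterministic farthest-point start, and the names `kmeans`, `spectral_clustering` and `ring` are hypothetical.

```python
import numpy as np

def kmeans(Z, k, num_iters=100):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [Z[0]]
    for _ in range(1, k):
        d2 = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d2)])   # next centre: point farthest from chosen ones
    centers = np.array(centers)
    labels = None
    for _ in range(num_iters):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                          # assignments stopped changing
        labels = new_labels
        for j in range(k):
            if np.any(labels == j):        # keep the old centre if a cluster is empty
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def spectral_clustering(X, k, sigma=1.0):
    """Cluster rows of X using the top-k eigenvectors of D^{-1/2} A D^{-1/2}."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    A = np.exp(-sq_dists / (2.0 * sigma ** 2))   # assumed Gaussian link weights
    np.fill_diagonal(A, 0.0)                     # no self-links
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)               # eigenvalues in ascending order
    V = eigvecs[:, -k:]                          # eigenvectors of the k largest eigenvalues
    V = V / np.linalg.norm(V, axis=1, keepdims=True)   # row normalisation (assumption)
    return kmeans(V, k)

def ring(radius, n):
    """n evenly spaced points on a circle of the given radius."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.column_stack([radius * np.cos(t), radius * np.sin(t)])

# Two concentric rings: the outer cluster surrounds the inner one.
X = np.vstack([ring(1.0, 40), ring(4.0, 80)])
labels = spectral_clustering(X, k=2, sigma=0.5)
```

Running k-means directly on `X` cannot separate the rings, since both share the same centre; in the eigenvector embedding each ring collapses to (nearly) a single point, which is why k-means succeeds there. The kernel width `sigma` has to be small relative to the gap between the rings for this to hold.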