What Is Frequent Pattern Analysis?
- Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set
- First proposed by Agrawal, Imielinski, and Swami [AIS93] in the context of frequent itemsets and association rule mining
- Motivation: finding inherent regularities in data
    - What products were often purchased together? Beer and diapers?!
    - What are the subsequent purchases after buying a PC?
    - What kinds of DNA are sensitive to this new drug?
    - Can we automatically classify web documents?
- Applications
    - Basket data analysis, cross-marketing, catalog design, sale campaign analysis, Web log (click stream) analysis, and DNA sequence analysis

Why Is Freq. Pattern Mining Important?
- Discloses an intrinsic and important property of data sets
- Forms the foundation for many essential data mining tasks
    - Association, correlation, and causality analysis
    - Sequential, structural (e.g., sub-graph) patterns
    - Pattern analysis in spatiotemporal, multimedia, time-series, and stream data
    - Classification: associative classification
    - Cluster analysis: frequent pattern-based clustering
    - Data warehousing: iceberg cube and cube-gradient
    - Semantic data compression: fascicles
    - Broad applications

A Multidimensional View of Frequent Pattern Discovery
[Figure: three axes]
- Types of data or knowledge: associative pattern, sequential pattern, iceberg cube, sub-graph pattern
- Lattice traversal / main operations: read, write, point
- Others: other interest measure, compression method, pruning method, constraints, closed/max pattern

Topics covered in this lecture
- Basic concepts of association rules
- Apriori and its improved algorithms
- Apriori-based Sub-Graph Mining

Data and Knowledge Types
- Associative Pattern
    - transactional table vs relational table
    - boolean vs quantitative
- Sequential Pattern
    - A sequence: <(ef)(ab)(df)cb>
    - (e,f)->(a,b)->c occur 50% of the time
- Iceberg Cube
    - table with a measure

TID   Items
10    a,c,d
20    b,c,e
30    a,b,c,e
40    b,e

TID   a  b  c  d  e
10    1  0  1  1  0
20    0  1  1  0  1
30    1  1  1  0  1
40    0  1  0  0  1
transactional = binary

Month  City  Cust_grp  Prod     Cost  Price
Jan    Tor   Edu       Printer  500   485
Mar    Van   Edu       HD       540   520
…      …     …         …        …     …
relational with quantitative attribute

Month  City  Cust_grp  Prod     Cost (Support)
Jan    Tor   *         Printer  1040
…      …     …         …        …
cube: using other measure as support

Simulation of Lattice Traversal
[Figure: itemset lattice over items a, b, c, e]
{}
a    b    c    e
a,b    a,c    a,e    b,c    b,e    c,e
a,b,c    a,b,e    a,c,e    b,c,e
a,b,c,e

The whole process of frequent pattern mining can be seen as a search in t
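
To make the "transactional = binary" table and the itemset lattice concrete, here is a minimal Python sketch built on the four transactions (TID 10 to 40) listed above. The function names (to_binary, support, lattice_supports) are illustrative assumptions, not part of the original slides. The sketch enumerates every node of the lattice by brute force; it is not Apriori and does no pruning. Like the lattice on the slide, it is restricted to items a, b, c, e.

```python
from itertools import combinations

# The four transactions from the TID/Items table on the slide.
transactions = {
    10: {"a", "c", "d"},
    20: {"b", "c", "e"},
    30: {"a", "b", "c", "e"},
    40: {"b", "e"},
}
# All distinct items, in alphabetical order: a, b, c, d, e.
items = sorted(set().union(*transactions.values()))


def to_binary(db, items):
    """Return one 0/1 row per transaction, with one column per item."""
    return {tid: [int(i in t) for i in items] for tid, t in db.items()}


def support(itemset, db):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in db.values() if set(itemset) <= t)


def lattice_supports(db, items):
    """Visit the itemset lattice level by level (size 0, 1, 2, ...)
    and record the support count of every node. Brute force only."""
    return {
        combo: support(combo, db)
        for k in range(len(items) + 1)
        for combo in combinations(items, k)
    }


binary = to_binary(transactions, items)
# Lattice over a, b, c, e only (hypothetical choice mirroring the slide).
supports = lattice_supports(transactions, ["a", "b", "c", "e"])

print(binary[10])             # [1, 0, 1, 1, 0]
print(supports[("b", "e")])   # 3
```

Each key of supports is one node of the lattice (the empty tuple is the root {}), so a minimum-support threshold can be applied afterwards by filtering the dictionary.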