Data Mining: Association Rule Mining
Data Mining
Ying Liu, Prof., Ph.D
University of Chinese Academy of Sciences
2017/10/21

Review
    Data mining: core of the knowledge discovery process
    [Figure: knowledge discovery pipeline. Databases and flat files -> Data Cleaning and Integration -> Data Warehouse -> Selection and Transformation -> Data Mining -> Pattern Evaluation]

Mining Association Rules in Large Databases
    Basic concepts and a road map
    Mining single-dimensional Boolean association rules
    Mining multilevel association rules
    Mining multidimensional association rules
    Summary

Market Basket Analysis

What Is Association Rules Mining?
    Association rules mining
        Finding frequent patterns and associations among sets of items or objects in transaction databases, relational databases, and other information repositories.
    Examples
        What products were often purchased together? Beer and diapers?!
        What DNA segments often occur together in DNA sequences?

What Is Association Rules Mining? (continued)
    Where does the data come from?
        Supermarket transactions, membership cards, discount coupons, customer complaint calls
    Applications
        Basket data analysis
        Cross-marketing
        Catalog design
        Sale campaign analysis
        Web log (click stream) analysis
        DNA sequence analysis

Basic Concepts
    Transaction-id    Items bought
    10                A, B, D
    20                A, C, D
    30                A, D, E
    40                B, E, F
    50                B, C, D, E, F

    Item collection: $X = \{x_1, \dots, x_m\}$
    Itemset: a set of items; an itemset with k items is a k-itemset
    Transaction: $T \subseteq X$; each T associates a unique Tid with the items bought by a customer
    Rule form: $\alpha \Rightarrow \beta$, where $\alpha \subset X$, $\beta \subset X$, and $\alpha \cap \beta = \emptyset$

Basic Concepts (continued)
    support, s: the probability that a transaction contains $\alpha \cup \beta$
        $\mathrm{support}(\alpha \Rightarrow \beta) = P(\alpha \cup \beta)$
    [Figure: Venn diagram of customers who buy $\alpha$, customers who buy $\beta$, and customers who buy both]
    Frequent itemset: an itemset whose occurrence frequency is at least min_support
    Frequent itemset mining: find all the rules $\alpha \Rightarrow \beta$ satisfying min_support
    Let $sup_{min} = 50\%$
        Frequent itemsets: {A:3, B:3, D:4, E:3, AD:3}
        support(A) = 3/5 = 60%, support(AD) = 3/5 = 60%

Basic Concepts (continued)
    confidence, c: the conditional probability that a transaction containing $\alpha$ also contains $\beta$
        $\mathrm{confidence}(\alpha \Rightarrow \beta) = P(\beta \mid \alpha) = \dfrac{P(\alpha \cup \beta)}{P(\alpha)} = \dfrac{\mathrm{count}(\alpha \cup \beta)}{\mathrm{count}(\alpha)}$
    Confidence is a measure of rule interestingness
    Rules that satisfy both min_support and min_confidence are strong
    Let $sup_{min} = 50\%$, $conf_{min} = 50\%$
        Frequent itemsets: {A:3, B:3, D:4, E:3, AD:3}
        Association rules:
            A => D (60%, 100%)
            D => A (60%, 75%)

Interestingness
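The support and confidence figures in the Basic Concepts example can be checked with a short script. This is only a minimal sketch: the function names (count, support, confidence, frequent_itemsets) are illustrative and not taken from the slides, and the brute-force enumeration of all candidate itemsets is an assumption made for brevity, workable for the six items in the example table but not for large databases.

```python
from itertools import combinations

# Transactions copied from the Basic Concepts table (Tid -> items bought).
transactions = {
    10: {"A", "B", "D"},
    20: {"A", "C", "D"},
    30: {"A", "D", "E"},
    40: {"B", "E", "F"},
    50: {"B", "C", "D", "E", "F"},
}


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for items in transactions.values() if set(itemset) <= items)


def support(itemset):
    """Fraction of transactions that contain the whole itemset."""
    return count(itemset) / len(transactions)


def confidence(lhs, rhs):
    """count(lhs and rhs together) / count(lhs), for the rule lhs => rhs."""
    return count(set(lhs) | set(rhs)) / count(lhs)


def frequent_itemsets(min_sup):
    """Brute force (illustrative only): test every non-empty itemset."""
    all_items = sorted(set().union(*transactions.values()))
    result = {}
    for k in range(1, len(all_items) + 1):
        for candidate in combinations(all_items, k):
            if support(candidate) >= min_sup:
                result["".join(candidate)] = count(candidate)
    return result


frequent = frequent_itemsets(0.5)
print(frequent)              # itemsets with support >= 50% and their counts
print(support("AD"))         # 0.6
print(confidence("A", "D"))  # 1.0
print(confidence("D", "A"))  # 0.75
```

With a 50% minimum support the script reports the same five frequent itemsets as the slide, and the two confidence values match the rules A => D (60%, 100%) and D => A (60%, 75%).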