Lesson 4: Data Classification and Prediction
Xu Congfu, Associate Professor
Institute of Artificial Intelligence, Zhejiang University
Zhejiang University undergraduate course slides: Introduction to Data Mining

Outline
  What is classification? What is prediction?
  Issues regarding classification and prediction
  Classification by decision tree induction
  Bayesian classification
  Prediction
  Summary
  Reference

Classification vs. Prediction
  Classification
    predicts categorical class labels (discrete or nominal)
    constructs a model from the training set and the values (class labels) of a classifying attribute, then uses that model to classify new data
  Prediction
    models continuous-valued functions, i.e., predicts unknown or missing values
  Typical applications
    Credit approval
    Target marketing
    Medical diagnosis
    Fraud detection

Classification: A Two-Step Process
  Model construction: describing a set of predetermined classes
    Each tuple/sample is assumed to belong to a predefined class, as determined by the class label attribute
    The set of tuples used for model construction is the training set
    The model is represented as classification rules, decision trees, or mathematical formulae
  Model usage: classifying future or unknown objects
    Estimate the accuracy of the model
      The known label of each test sample is compared with the model's classification result
      The accuracy rate is the percentage of test set samples that the model classifies correctly
      The test set must be independent of the training set; otherwise the accuracy estimate is optimistically biased and over-fitting goes undetected
    If the accuracy is acceptable, use the model to classify data tuples whose class labels are not known

Classification Process (1): Model Construction
  Training Data -> Classification Algorithms -> Classifier (Model)
  Example model: IF rank = 'professor' OR years > 6 THEN tenured = 'yes'

Classification Process (2): Use the Model in Prediction
  The classifier is applied to Testing Data and then to Unseen Data
  Unseen data: (Jeff, Professor, 4). Tenured?

Supervised vs. Unsupervised Learning
  Supervised learning (classification)
    Supervision: the training data (observations, measurements, etc.) are accompanied by labels indicating the class of each observation
    New data are classified based on the training set
  Unsupervised learning (clustering)
    The class labels of the training data are unknown
    Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data

Issues Regarding Classification and Prediction (1): Data Preparation
  Data cleaning
    Preprocess data in order to
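
The two-step classification process described above (model construction, then model usage with an accuracy estimate on an independent test set) can be sketched in Python as follows. This is a minimal illustration and not part of the original slides: the rule is the one shown on the model construction slide, while the function names classify and accuracy and the four labeled test tuples are assumptions invented for the example.

```python
# Sketch of the two-step process. A fixed rule stands in for a learned model.
# ASSUMPTION: the test tuples below are hypothetical, invented for illustration.

def classify(rank, years):
    """Rule from the slides: IF rank = 'professor' OR years > 6 THEN tenured = 'yes'."""
    return "yes" if rank.lower() == "professor" or years > 6 else "no"


def accuracy(test_set):
    """Fraction of test tuples whose known label matches the model's output."""
    correct = sum(
        1 for _name, rank, years, label in test_set
        if classify(rank, years) == label
    )
    return correct / len(test_set)


# Hypothetical labeled test set: (name, rank, years, tenured)
test_set = [
    ("Ann", "Assistant Prof", 2, "no"),
    ("Bob", "Associate Prof", 7, "no"),   # the rule misclassifies this tuple
    ("Cara", "Professor", 5, "yes"),
    ("Dan", "Assistant Prof", 7, "yes"),
]

# Step 2a: estimate accuracy on test data kept separate from training data.
test_accuracy = accuracy(test_set)
print(f"accuracy on test set: {test_accuracy:.2f}")

# Step 2b: if the accuracy is acceptable, classify the unseen tuple from the slides.
jeff_tenured = classify("Professor", 4)
print(f"Jeff tenured? {jeff_tenured}")
```

On this made-up test set the rule gets three of four tuples right, and it labels the unseen tuple (Jeff, Professor, 4) as tenured because the rank condition alone satisfies the rule.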