Lesson 4: Data Classification and Prediction
Xu Congfu (徐从富), Associate Professor
Institute of Artificial Intelligence, Zhejiang University
Courseware for the Zhejiang University undergraduate course "Introduction to Data Mining"

Outline
- What is classification? What is prediction?
- Issues regarding classification and prediction
- Classification by decision tree induction
- Bayesian Classification
- Prediction
- Summary
- Reference

Classification vs. Prediction
Classification
- predicts categorical class labels (discrete or nominal)
- classifies data (constructs a model) based on the training set and the values (class labels) in a classifying attribute, and uses the model to classify new data
Prediction
- models continuous-valued functions, i.e., predicts unknown or missing values
Typical applications
- Credit approval
- Target marketing
- Medical diagnosis
- Fraud detection

Classification: A Two-Step Process
Model construction: describing a set of predetermined classes
- Each tuple/sample is assumed to belong to a predefined class, as determined by the class label attribute
- The set of tuples used for model construction is the training set
- The model is represented as classification rules, decision trees, or mathematical formulae
Model usage: for classifying future or unknown objects
- Estimate the accuracy of the model
  - The known label of each test sample is compared with the classified result from the model
  - The accuracy rate is the percentage of test set samples that are correctly classified by the model
  - The test set must be independent of the training set; otherwise the accuracy estimate is over-optimistic and over-fitting goes undetected
- If the accuracy is acceptable, use the model to classify data tuples whose class labels are not known

Classification Process (1): Model Construction
[Figure: Training Data -> Classification Algorithms -> Classifier (Model)]
Example of a learned rule: IF rank = 'professor' OR years > 6 THEN tenured = 'yes'

Classification Process (2): Use the Model in Prediction
[Figure: the Classifier is applied to Testing Data and then to Unseen Data]
Unseen data: (Jeff, Professor, 4). Tenured?

Supervised vs. Unsupervised Learning
Supervised learning (classification)
- Supervision: the training data (observations, measurements, etc.) are accompanied by labels indicating the class of the observations
- New data is classified based on the training set
Unsupervised learning (clustering)
- The class labels of the training data are unknown
- Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data

Issues Regarding Classification and Prediction (1): Data Preparation
Data cleaning
- Preprocess data in order to
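
Returning to the two-step classification process described earlier in these slides, the model-usage step can be sketched in a few lines of Python. This is only an illustrative sketch: the rule and the unseen tuple (Jeff, Professor, 4) come from the slides, while the `Person` type, the function names, and the four labeled test tuples in `TEST_SET` are hypothetical and were made up here to show how an accuracy rate is computed on a test set kept separate from the training data.

```python
from typing import NamedTuple


class Person(NamedTuple):
    """One data tuple: (name, rank, years). Hypothetical type for illustration."""
    name: str
    rank: str
    years: int


def predict_tenured(person: Person) -> str:
    """Apply the slide's rule: IF rank = 'professor' OR years > 6 THEN tenured = 'yes'."""
    if person.rank.lower() == "professor" or person.years > 6:
        return "yes"
    return "no"


def accuracy(test_set: list[tuple[Person, str]]) -> float:
    """Fraction of test samples whose known label equals the model's output."""
    correct = sum(
        1 for person, label in test_set if predict_tenured(person) == label
    )
    return correct / len(test_set)


# Hypothetical labeled test set (NOT taken from the slides). It is assumed to be
# independent of whatever training data produced the rule.
TEST_SET = [
    (Person("Alice", "Assistant Prof", 2), "no"),
    (Person("Bob", "Associate Prof", 7), "no"),   # rule says 'yes': one error
    (Person("Carol", "Professor", 5), "yes"),
    (Person("Dave", "Associate Prof", 3), "no"),
]

if __name__ == "__main__":
    # Step 2a: estimate accuracy on the independent test set.
    print(f"accuracy on test set: {accuracy(TEST_SET):.2f}")
    # Step 2b: if the accuracy is acceptable, classify the unseen tuple.
    print("Jeff tenured?", predict_tenured(Person("Jeff", "Professor", 4)))
```

On this made-up test set the rule misclassifies one of four tuples, so the estimated accuracy is 0.75, and the unseen tuple (Jeff, Professor, 4) is classified as tenured because the rank condition alone satisfies the rule.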