Xidian University Data Mining: Decision Tree Algorithm
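Data Mining Algorithm Lab Report

1) Experiment topic

Implement a decision-tree-based classification algorithm. Attribute selection uses the ID3 or C4.5 strategy. Build a classification decision tree from the following data.

age       income   student   credit_rating   buys_computer
<=30      high     no        fair            no
<=30      high     no        excellent       no
31...40   high     no        fair            yes
>40       medium   no        fair            yes
>40       low      yes       fair            yes
>40       low      yes       excellent       no
31...40   low      yes       excellent       yes
<=30      medium   no        fair            no
<=30      low      yes       fair            yes
>40       medium   yes       fair            yes
<=30      medium   yes       excellent       yes
31...40   medium   no        excellent       yes
31...40   high     yes       fair            yes
>40       medium   no        excellent       no

2) Description of the basic idea of the algorithm

ID3 selects the attribute with the highest information gain as the splitting attribute. Following this principle, we first compute the entropy of the initial set, then compute the entropy that results when each attribute in turn is used as the splitting attribute. From these values we compute the information gain of each attribute, and we choose the attribute with the largest information gain as the splitting attribute.

3) Implementation of the algorithm

    #include <iostream>
    #include <math.h>
    #include <string.h>
    using namespace std;
    #define SIZE 14

    // One training record: four attributes plus the class label.
    struct Data {
        char age[10];
        char income[10];
        char student[10];
        char credit_rating[20];
        char buyscomputer[10];
    };

    Data data[SIZE] = {
        {"<=30", "high", "no", "fair", "no"},
        {"<=30", "high", "no", "excellent", "no"},
        {"31...40", "high", "no", "fair", "yes"},
        {">40", "medium", "no", "fair", "yes"},
        {">40", "low", "yes", "fair", "yes"},
        {">40", "low", "yes", "excellent", "no"},
        {"31...40", "low", "yes", "excellent", "yes"},
        {"<=30", "medium", "no", "fair", "no"},
        {"<=30", "low", "yes", "fair", "yes"},
        {">40", "medium", "yes", "fair", "yes"},
        {"<=30", "medium", "yes", "excellent", "yes"},
        {"31...40", "medium", "no", "excellent", "yes"},
        {"31...40", "high", "yes", "fair", "yes"},
        {">40", "medium", "no", "excellent", "no"}
    };

    double calculate(double a, double b);
    void origin_entropy(Data data[], double &entropy);
    void age_entropy(Data data[], double &entropy);
    void income_entropy(Data data[], double &entropy);
    void student_entropy(Data data[], double &entropy);
    void credit_rating_entropy(Data data[], double &entropy);

    int main() {
        double origin = 0, age = 0, student = 0, credit_rating = 0, income = 0;
        // Entropy of the full set, then the entropy after splitting on each attribute.
        origin_entropy(data, origin);
        age_entropy(data, age);
        student_entropy(data, student);
        income_entropy(data, income);
        credit_rating_entropy(data, credit_rating);
        cout << "info(D)=" << origin << endl;
        // ...
        cout << "With student as the splitting attribute:" << " " << "entropy info(student)(D)=" << student << "\t\t" << "information gain: " << origin - student << endl;
        cout << "With credit rating as the splitting attribute:" << " " << "entropy info(creditrating)(D)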
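
The attribute-selection step described in section 2 can also be sketched in a compact, self-contained form. The version below is only an illustration under stated assumptions, not the report's own implementation: the names Record, kData, kAttributes, entropy, infoD, infoAfterSplit, infoGain and bestAttribute are hypothetical, and it counts class labels with std::map instead of using one function per attribute. It uses the standard definitions Info(D) = -sum p_i log2 p_i and Gain(A) = Info(D) - Info_A(D).

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>
#include <vector>

// Hypothetical record type for this sketch (mirrors the table's columns).
struct Record {
    std::string age, income, student, credit_rating, buys_computer;
};

// The 14 training rows from the experiment table.
const std::vector<Record> kData = {
    {"<=30", "high", "no", "fair", "no"},
    {"<=30", "high", "no", "excellent", "no"},
    {"31...40", "high", "no", "fair", "yes"},
    {">40", "medium", "no", "fair", "yes"},
    {">40", "low", "yes", "fair", "yes"},
    {">40", "low", "yes", "excellent", "no"},
    {"31...40", "low", "yes", "excellent", "yes"},
    {"<=30", "medium", "no", "fair", "no"},
    {"<=30", "low", "yes", "fair", "yes"},
    {">40", "medium", "yes", "fair", "yes"},
    {"<=30", "medium", "yes", "excellent", "yes"},
    {"31...40", "medium", "no", "excellent", "yes"},
    {"31...40", "high", "yes", "fair", "yes"},
    {">40", "medium", "no", "excellent", "no"}};

// A candidate splitting attribute: display name plus pointer to its field.
struct Attribute {
    const char* name;
    std::string Record::*field;
};

const std::vector<Attribute> kAttributes = {
    {"age", &Record::age},
    {"income", &Record::income},
    {"student", &Record::student},
    {"credit_rating", &Record::credit_rating}};

// Entropy in bits of a class-label histogram.
double entropy(const std::map<std::string, int>& counts) {
    int total = 0;
    for (const auto& kv : counts) total += kv.second;
    assert(total > 0);
    double h = 0.0;
    for (const auto& kv : counts) {
        if (kv.second == 0) continue;  // 0 * log2(0) is treated as 0
        const double p = static_cast<double>(kv.second) / total;
        h -= p * std::log2(p);
    }
    return h;
}

// Info(D): entropy of buys_computer over the whole set.
double infoD(const std::vector<Record>& rows) {
    std::map<std::string, int> counts;
    for (const auto& r : rows) ++counts[r.buys_computer];
    return entropy(counts);
}

// Info_A(D): size-weighted entropy of the partitions induced by one attribute.
double infoAfterSplit(const std::vector<Record>& rows, std::string Record::*field) {
    std::map<std::string, std::map<std::string, int>> parts;
    for (const auto& r : rows) ++parts[r.*field][r.buys_computer];
    double h = 0.0;
    for (const auto& part : parts) {
        int n = 0;
        for (const auto& kv : part.second) n += kv.second;
        h += static_cast<double>(n) / rows.size() * entropy(part.second);
    }
    return h;
}

// Gain(A) = Info(D) - Info_A(D).
double infoGain(const std::vector<Record>& rows, std::string Record::*field) {
    return infoD(rows) - infoAfterSplit(rows, field);
}

// Name of the attribute with the largest information gain.
std::string bestAttribute(const std::vector<Record>& rows) {
    std::string best;
    double bestGain = -1.0;
    for (const auto& a : kAttributes) {
        const double g = infoGain(rows, a.field);
        if (g > bestGain) {
            bestGain = g;
            best = a.name;
        }
    }
    return best;
}
```

Calling bestAttribute(kData) from a main function returns "age". On this table Info(D) is about 0.940 bits and the gains are about 0.247 (age), 0.029 (income), 0.152 (student) and 0.048 (credit_rating), so ID3 would split on age at the root.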