Data Mining Exercises (English)
2.4. Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order) 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
(a) What is the mean of the data? What is the median?
(b) What is the mode of the data? Comment on the data's modality (i.e., bimodal, trimodal, etc.).
(c) What is the midrange of the data?
(d) Can you find (roughly) the first quartile (Q1) and the third quartile (Q3) of the data?
(e) Give the five-number summary of the data.
(f) Show a boxplot of the data.
(g) How is a quantile-quantile plot different from a quantile plot?

2.9. Suppose a hospital tested the age and body fat data for 18 randomly selected adults with the following result:

age    23    23    27    27    39    41    47    49    50
%fat   9.5   26.5  7.8   17.8  31.4  25.9  27.4  27.2  31.2

age    52    54    54    56    57    58    58    60    61
%fat   34.6  42.5  28.8  33.4  30.2  34.1  32.9  41.2  35.7

(a) Calculate the mean, median and standard deviation of age and %fat.
(b) Draw the boxplots for age and %fat.
(c) Draw a scatter plot and a q-q plot based on these two variables.
(d) Normalize the two variables based on z-score normalization.
(e) Calculate the correlation coefficient (Pearson's product moment coefficient). Are these two variables positively or negatively correlated?

2.11. Use the two methods below to normalize the following group of data: 200, 300, 400, 600, 1000
(a) min-max normalization by setting min = 0 and max = 1
(b) z-score normalization

4.4. Suppose that a base cuboid has three dimensions A, B, C, with the following number of cells: |A| = 1,000,000, |B| = 100, and |C| = 1000. Suppose that each dimension is evenly partitioned into 10 portions for chunking.
(a) Assuming each dimension has only one level, draw the complete lattice of the cube.
(b) If each cube cell stores one measure with 4 bytes, what is the total size of the computed cube if the cube is dense?
(c) State the order for computing the chunks in the cube that requires the least amount of space, and compute the total amount of main memory space required for computing the 2-D planes.

5.3. A database has five transactions. Let min_sup = 60% and min_conf = 80%.
(a) Find all frequent itemsets using Apriori and FP-growth, respectively. Compare the efficiency of the two mining processes.
(b) List all of the strong association rules (with support s and confidence c) matching the following metarule, where X is a variable r
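
For Exercise 2.4, parts (a) through (e) can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the quartiles use the default "exclusive" method of `statistics.quantiles`, which is one of several common conventions and is an assumption here. Other conventions can give slightly different Q1 and Q3.

```python
import statistics

# Age values from Exercise 2.4 (already in increasing order).
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

mean = statistics.mean(ages)             # (a) arithmetic mean
median = statistics.median(ages)         # (a) middle value (n = 27 is odd)
modes = statistics.multimode(ages)       # (b) every value with the top count
midrange = (min(ages) + max(ages)) / 2   # (c) average of min and max

# (d) quartiles; assumption: the library's default "exclusive" method.
q1, q2, q3 = statistics.quantiles(ages, n=4)

# (e) five-number summary: min, Q1, median, Q3, max.
five_number = (min(ages), q1, median, q3, max(ages))

print(f"mean={mean:.2f} median={median} modes={modes} midrange={midrange}")
print("five-number summary:", five_number)
```

Two values tie for the highest frequency, so the data is bimodal under this count.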
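
For Exercise 2.9, parts (d) and (e), the sketch below normalizes both variables with z-scores and computes Pearson's coefficient as the mean product of the paired z-scores. The helper names are hypothetical, and the population standard deviation (dividing by n) is an assumption; using the sample standard deviation would rescale the z-scores.

```python
import math

# Age and %fat pairs from Exercise 2.9 (18 adults).
age = [23, 23, 27, 27, 39, 41, 47, 49, 50,
       52, 54, 54, 56, 57, 58, 58, 60, 61]
fat = [9.5, 26.5, 7.8, 17.8, 31.4, 25.9, 27.4, 27.2, 31.2,
       34.6, 42.5, 28.8, 33.4, 30.2, 34.1, 32.9, 41.2, 35.7]


def mean(xs):
    return sum(xs) / len(xs)


def pstdev(xs):
    # Population standard deviation (divides by n); an assumption here.
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def z_scores(xs):
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]


def pearson(xs, ys):
    # Mean of the products of paired z-scores.
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


z_age, z_fat = z_scores(age), z_scores(fat)
r = pearson(age, fat)
print(f"r = {r:.3f}")  # a positive r means the variables rise together
```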
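
For Exercise 2.11, both normalizations can be sketched as follows. The function names are illustrative, and the z-score again assumes the population standard deviation; a sample standard deviation would give somewhat smaller magnitudes.

```python
import math

data = [200, 300, 400, 600, 1000]  # values from Exercise 2.11


def min_max(values, new_min=0.0, new_max=1.0):
    # Linearly map [min(values), max(values)] onto [new_min, new_max].
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min
            for v in values]


def z_score(values):
    # Assumption: population standard deviation (divide by n).
    m = sum(values) / len(values)
    s = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / s for v in values]


mm = min_max(data)
z = z_score(data)
print("min-max:", mm)
print("z-score:", [round(v, 2) for v in z])
```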
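
For Exercise 4.4, part (b), one way to check the dense cube size is to enumerate every cuboid in the lattice and sum their cell counts. The sketch assumes one level per dimension and counts the apex cuboid as a single cell; the names are illustrative.

```python
from itertools import combinations
from math import prod

# Dimension cardinalities from Exercise 4.4.
sizes = {"A": 1_000_000, "B": 100, "C": 1000}
BYTES_PER_CELL = 4

# Every subset of {A, B, C} is one cuboid: 2^3 = 8 in total,
# from the apex (empty subset) up to the base cuboid ABC.
cuboids = [combo
           for k in range(len(sizes) + 1)
           for combo in combinations(sizes, k)]

# A dense cuboid holds the product of its dimension sizes;
# prod of an empty sequence is 1, which covers the apex cell.
cells = sum(prod(sizes[d] for d in combo) for combo in cuboids)
total_bytes = cells * BYTES_PER_CELL

print(len(cuboids), "cuboids,", cells, "cells,", total_bytes, "bytes")
```

The sum over all cuboids equals (|A| + 1)(|B| + 1)(|C| + 1), which gives a quick closed-form cross-check.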