Data Analysis Architecture Examples and Secure Cloud Mining

Research Center for Massive Data and Cloud Computing, Sun Yat-sen University
Zhuhai Research Institute, Beijing Normal University
Lü Wei (吕威)
SACC2011

Outline

Part 1: Data analysis architecture example
  - Data mining examples
  - Data analysis architecture example: early warning of website user churn
  - Introduction to Weka, an open-source data analysis package
Part 2: Large-scale data mining (cloud mining with Hadoop)
  - The Map-Reduce method
  - Adapting classification (k-NN) to MapReduce
Part 3: Secure cloud mining
  - Applying differential manifolds to secure cloud mining (Matlab)

Part 1: Data analysis architecture example

  - Data mining examples: definitions and concepts
  - Website user behavior analysis: data analysis architecture example
  - Introduction to Weka: open-source software

Why Mine Data? Commercial Viewpoint

  - Lots of data is being collected and warehoused
      - Web data, e-commerce
      - Purchases at department/grocery stores
      - Bank/credit card transactions
  - Computers have become cheaper and more powerful
  - Competitive pressure is strong
      - Provide better, customized services for an edge (e.g. in Customer Relationship Management)

Why Mine Data? Scientific Viewpoint

  - Data is collected and stored at enormous speeds (GB/hour)
      - Remote sensors on a satellite
      - Telescopes scanning the skies
      - Microarrays generating gene expression data
  - Traditional techniques are infeasible for raw data
  - Data mining may help scientists
      - in classifying and segmenting data
      - in hypothesis formation

Mining Large Data Sets - Motivation

  - There is often information “hidden” in the data that is not readily evident
  - Human analysts may take weeks to discover useful information
  - Much of the data is never analyzed at all

[Figure: “The Data Gap”. Total new disk (TB) since 1995 compared with the number of analysts, 1995 to 1999; vertical axis from 0 to 4,000,000. From: R. Grossman, C. Kamath, V. Kumar, “Data Mining for Scientific and Engineering Applications”]

What is Data Mining?

Many definitions:
  - Non-trivial extraction of implicit, previously unknown and potentially useful information from data
  - Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns

What is (not) Data Mining?

What is not data mining?
  - Looking up a phone number in a phone directory
  - Querying a Web search engine for information about “Amazon”

What is data mining?
  - Certain names are more prevalent in certain US locations (O’Brien, O’Rourke, O’Reilly… in the Boston area)
  - Grouping together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, Amazon.com)

Origins of Data Mining

Draws ideas from mac