2012 IEEE 28th International Conference on Data Engineering

A General Method for Estimating Correlated Aggregates over a Data Stream

Srikanta Tirthapura
Dept. of Electrical and Computer Engg., Iowa State University, Ames, IA, USA
snt@iastate.edu

David P. Woodruff
IBM Almaden Research Center, San Jose, CA, USA
dpwoodru@us.ibm.com

Abstract—On a stream of two dimensional data items (x, y) where x is an item identifier, and y is a numerical attribute, a correlated aggregate query requires us to first apply a selection predicate along the second (y) dimension, followed by an aggregation along the first (x) dimension. For selection predicates of the form (y < c) or (y > c), where parameter c is provided at query time, we present new streaming algorithms and lower bounds for estimating statistics of the resulting substream of elements that satisfy the predicate.

We provide the first sublinear space algorithms for a large family of statistics in this model, including frequency moments. We experimentally validate our algorithms, showing that their memory requirements are significantly smaller than existing linear storage schemes for large data sets, while simultaneously achieving fast per-record processing time.

We also study the problem when the items have weights. Allowing negative weights allows for analyzing values which occur in the symmetric difference of two data sets. We give a strong space lower bound which holds even if the algorithm

aggregates arise naturally in analytics on multi-dimensional streaming data, and space-efficient methods for implementing such aggregates are useful in streaming analytics systems such as in the IBM System S [5].

Our summaries for correlated aggregates allow more flexible interrogation on the data stream than is possible with a traditional stream summary. For example, consider a data stream of IP flow records, such as those output by network routers equipped with Cisco’s Netflow [6]. Suppose that we were interested in only two attributes per flow record, the destination address of the flow, and the size (number of bytes) of the flow. Using a summary for correlated aggregate AGG, along with a quantile summary for the y dimension (any of the well known stream quantile summaries can be used here, including [2], [7]), it is possible for a network administrator to execute the following sequence of queries on the str