资源描述:
《基于区域划分的XML结构连接》由会员上传分享,免费在线阅读,更多相关内容在学术论文-天天文库。
1、1000-9825/2004/15(05)0720©2004JournalofSoftware软件学报Vol.15,No.5∗基于区域划分的XML结构连接1+22王静,孟小峰,王珊1(中国科学院计算技术研究所,北京100080)2(中国人民大学信息学院,北京100872)StructuralJoinofXMLBasedonRangePartitioning1+22WANGJing,MENGXiao-Feng,WANGShan1(InstituteofComputingTechnology,TheCh
2、ineseAcademyofSciences,Beijing100080,China)2(InformationSchool,RenminUniversityofChina,Beijing100872,China)+Correspondingauthor:Phn:+86-10-62515575,Fax:+86-10-62519453,E-mail:jwang_lq@yahoo.com.cnReceived2003-03-02;Accepted2003-06-26WangJ,MengXF,WangS.
3、StructuraljoinofXMLbasedonrangepartitioning.JournalofSoftware,2004,15(5):720~729.http://www.jos.org.cn/1000-9825/15/720.htmAbstract:StructuraljoinisthecoreoperationinXMLqueryprocessing,andcatchestheresearchcommunity’sattention.Efficientalgorithmistheke
4、yofefficientqueryprocessing.Therehavebeenanumberofalgorithmsproposedforstructuraljoin,andmostofthemarebasedononeofthefollowingassumptions:theinputelementsetseitherhaveindexesorareordered.Whentheseassumptionsarenottrue,theperformanceofthesealgorithmswil
5、lbecomeverybadduetothecostofsortinginputsetsorbuildingindexonthefly.Motivatedbythisobservation,astructuraljoinalgorithmbasedonrangepartitioningisproposedinthispaper.Basedontheideaoftaskdecomposition,thisalgorithmtakesadvantageoftheregionnumberingscheme
6、topartitiontheinputsets.Theprocedureofthealgorithmisdescribedindetail,anditsI/Ocomplexityisanalyzed.Theresultsoftheextensiveexperimentsshowthatthisalgorithmhasgoodperformanceandissuperiortotheexistingsort-mergealgorithmswhentheinputdataarenotsortedorin
7、dexed.Itcanprovidequeryplanswithmorechoices.Keywords:XMLqueryprocessing;pathexpression;numberingscheme;structuraljoin摘要:结构连接是XML查询处理的核心操作,受到了研究界的关注.高效的算法是高效查询处理的关键.目前已经提出了许多结构连接的算法,它们中的大多数都基于如下的前提条件之一:输入元素集合存在索引或者∗SupportedbytheNationalNaturalScienceFo
8、undationofChinaunderGrantNos.60073014,60273018(国家自然科学基金);theNationalHigh-TechResearchandDevelopmentPlanofChinaunderGrantNo.2002AA116030(国家高技术研究发展计划(863));theKeyProjectoftheMinistryofEducationofChinaunderGrantNo.03044(国家教育部科学技术重点项目);theE