Towards Federated Search Based on Web Services

Jens Graupmann, Michael Biwer, Patrick Zimmer
University of Saarland, Germany
Department of Computer Science
P.O. Box 151150, D-66041 Saarbrücken
E-mail: graupman@cs.uni-sb.de

Abstract: Some emerging trends in the recent development of the WWW can be observed. These trends are technical, like Web Services, as well as semantic, like the integration of ontologies. We propose an architecture for a new kind of federated search system, which takes these aspects and new developments into account. A major challenge in this context is to cope with portals and the data sources behind the portals, the so-called "Deep Web". One component of the proposed architecture is the service mediator, which generates wrapper classes and additional files to make portals accessible as Web Services. Other components are an ontology server, which provides Web Service based access to different ontologies, and an XML Filter server that converts different source formats to XML. This loosely coupled architecture supports federated search on semistructured data and the evaluation of semantic join operations.

1 Introduction

1.1 Motivation

Some emerging trends in the recent development of the WWW can be observed. These trends are technical, like Web Services, as well as semantic, like the integration of ontologies. They are the foundation of the Next Generation Web. We will point out some aspects of these developments and propose, as an application example, an architecture for a new kind of federated search system, which takes these aspects and new developments into account. One of the differences from other systems is that we do not want to do any kind of "schema integration". We don't even know the data format of some of our data sources in advance. We only want to find data objects (HTML, XML, PDF documents, etc.) satisfying our search conditions as well as possible. First we will introduce some of the emerging technologies.

1.1.1 The "Deep Web" (Portals)

A lot of information in the WWW is stored in data sources, mostly relational databases,
which are connected to some server application logic, which dynamically generates Web pages. These information repositories are hosted by so-called Web Portals. This highly dynamic content leads to problems with today's prevalent crawler-based search engines. The main issue