ID: 32136745
Size: 4.74 MB
Pages: 61
Date: 2019-01-31
Data Extraction and Integration Based on Domain Model (《基于领域模型的数据抽取与集成》), uploaded by a member to 天天文库 (academic papers section).
Master's Thesis

Abstract

With the rapid development of the World Wide Web, the network is filled with more and more data, which is still increasing at a very high speed. The massive amount of data on the web has become the most important data source. The semi-structured information in industry domain websites is becoming the most important object of data mining because of its many advantages. There is a lot of work in the field of data mining; however, the purpose of our research is how to automatically mine the domain data in industry domain websites and integrate it.

The part on data mining and integration based on the domain model analyzes the structural difference of web semi-structured data, or web tables, between layout tables and attribute/value tables. Combined with the characteristics of the domain demand, we propose a new web data model and a new domain data model based on the web data model, and explain the data mining approach based on the web data model and the data integration approach based on the domain data model.

Entity expansion is a supplement to the web data extraction and integration approach based on the domain model. First, the domain data is mined by the extraction approach based on the domain model and added to the seed set. Then, other web domain data is mined automatically by the entity expansion approach in the industry domain. The web tables and domain entities are modeled as a bipartite graph. Both the similarity between the expansion entity set and the seed set and the closeness of the expansion entity set itself are computed. Entity expansion then combines the similarity and the closeness, with weights, into the quality score of the expansion entity set. Based on the quality score, the expansion entity set is updated iteratively until the expansion entity set found has the highest quality score and no longer changes.

Attribute expansion is also a supplement to the web data extraction and integration approach based on the domain model. First, a classifier and catalog constraints are created in the training phase. Then, in the deployment phase, attribute expansion extracts attribute values from web documents and assigns them to attributes. Finally, the extracted attributes are added to the domain attribute model. The main work of the deployment phase of attribute expansion is eliminating most of the wrong attributes using entity constraints and c...
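
The entity expansion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual similarity or closeness measures, so the following is an assumption throughout: it uses Jaccard overlap of the tables each entity appears in, a single hypothetical `weight` parameter, and a simple one-swap-at-a-time update. All function names and the sample data are invented for illustration and are not taken from the thesis.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two sets (assumed measure, not from the thesis)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def entity_index(tables):
    """Invert table -> entities into entity -> tables (bipartite graph edges)."""
    index = {}
    for table_id, entities in tables.items():
        for entity in entities:
            index.setdefault(entity, set()).add(table_id)
    return index


def quality(expansion, seeds, index, weight):
    """Weighted sum of similarity to the seed set and internal closeness."""
    if not expansion or not seeds:
        return 0.0

    def links(entity):
        return index.get(entity, set())

    similarity = sum(
        jaccard(links(e), links(s)) for e in expansion for s in seeds
    ) / (len(expansion) * len(seeds))
    pairs = list(combinations(sorted(expansion), 2))
    closeness = (
        sum(jaccard(links(a), links(b)) for a, b in pairs) / len(pairs)
        if pairs else 0.0
    )
    return weight * similarity + (1.0 - weight) * closeness


def expand(tables, seeds, size, weight=0.7):
    """Start from the candidates most similar to the seeds, then swap one
    member at a time while the quality score strictly improves."""
    index = entity_index(tables)
    candidates = sorted(set(index) - set(seeds))
    ranked = sorted(
        candidates, key=lambda e: (-quality({e}, seeds, index, 1.0), e)
    )
    expansion = set(ranked[:size])
    best = quality(expansion, seeds, index, weight)
    improved = True
    while improved:
        improved = False
        for old in sorted(expansion):
            for new in candidates:
                if new in expansion:
                    continue
                trial = (expansion - {old}) | {new}
                score = quality(trial, seeds, index, weight)
                if score > best + 1e-12:
                    expansion, best, improved = trial, score, True
                    break
            if improved:
                break
    return expansion, best


# Hypothetical web tables: table id -> entities found in that table.
tables = {
    "t1": {"steel", "copper", "aluminum", "zinc"},
    "t2": {"steel", "copper", "aluminum"},
    "t3": {"copper", "zinc", "nickel"},
    "t4": {"home", "login"},
}
seeds = {"steel", "copper"}
result, score = expand(tables, seeds, size=2)
print(sorted(result), round(score, 3))
```

In this sketch the weight favors similarity to the seed set, so that a tightly linked but unrelated group (such as the navigation labels in `t4`) does not outscore entities that share tables with the seeds. The loop stops when no single swap raises the score, which mirrors the stopping condition of the set no longer changing.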