Entity Resolution with Markov Logic

Parag Singla, Pedro Domingos
Department of Computer Science and Engineering
University of Washington
Seattle, WA 98195-2350, U.S.A.
{parag, pedrod}@cs.washington.edu

Abstract

Entity resolution is the problem of determining which records in a database refer to the same entities, and is a crucial and expensive step in the data mining process. Interest in it has grown rapidly in recent years, and many approaches have been proposed. However, they tend to address only isolated aspects of the problem, and are often ad hoc. This paper proposes a well-founded, integrated solution to the entity resolution problem based on Markov logic. Markov logic combines first-order logic and probabilistic graphical models by attaching weights to first-order formulas, and viewing them as templates for features of Markov networks. We show how a number of previous approaches can be formulated and seamlessly combined in Markov logic, and how the resulting learning and inference problems can be solved efficiently. Experiments on two citation databases show the utility of this approach, and evaluate the contribution of the different components.

The entity resolution problem was first identified by Newcombe et al. [31], and given a statistical formulation by Fellegi and Sunter [14]. Most current approaches are variants of the Fellegi-Sunter model, in which entity resolution is viewed as a classification problem: given a vector of similarity scores between the attributes of two entities, classify it as "Match" or "Non-match." A separate match decision is made for each candidate pair, followed by transitive closure to eliminate inconsistencies. Typically, a logistic regression model is used [1]. One line of research has focused on scaling entity resolution to large databases by avoiding the quadratic number of comparisons between all pairs of entities (e.g., [20, 30, 26, 7]). Another has focused on the use of active learning techniques to minimize the need for labeled data (e.g., [44, 38, 4]). Several authors have devised, compared and learned similarity measures for use in entity resolution (e.g., [6, 45, 3]). A number of alternate formulations have also been proposed (e.g., [5]). Entity resolution has been applied in a wide variety of domains (e.g., [33, 10]) and to different types of data, including text (e