Efficient Unsupervised Discovery of Word Categories Using Symmetric Patterns and High Frequency Words
Dmitry Davidov, Ari Rappoport
The Hebrew University
ACL 2006

Introduction
Discovering word categories: sets of words sharing a significant aspect of their meaning
  context feature vectors
  pattern-based discovery
    Manually prepared pattern set (e.g. "x and y")
    requiring POS tagging or partial or full parsing

Pattern Candidates
high frequency word (HFW)
  A word appearing more than T_H times per million words
  E.g. and, or, from, to, …
content word (CW)
  A word appearing less than T_C times per million words

Pattern Candidates
meta-patterns obey the following constraints
  at most 4 words
  exactly two content words
  no two consecutive CWs
Example
  CHC, CHCH, CHHC, and HCHC
  from x to y (HCHC), x and y (CHC), x and a y (CHHC)

Symmetric Patterns
In order to find a usable subset of the pattern candidates, we focus on the symmetric patterns
Example
  x belongs to y (asymmetric relationship)
  x and y (symmetric relationship)

Symmetric Patterns
We use the single-pattern graph G(P) to identify symmetric patterns
  there is a directed arc A(x, y) from node x to node y iff
    the words x and y both appear in an instance of the pattern P as its two CWs
    x precedes y in P
  SymG(P), the symmetric subgraph of G(P), contains only the bidirectional arcs and nodes of G(P)

Symmetric Patterns
We compute three measures on G(P)
  M1 counts the proportion of words that can appear in both slots of the pattern
  M2 and M3 measure the proportion of symmetric nodes and of symmetric edges in G(P), respectively

Symmetric Patterns
We remove patterns that appear in the corpus less than T_P times per million words
We remove patterns that are not in the top Z_T in any of the three lists
We remove patterns that are in the bottom Z_B in at least one of the lists

Discovery of Categories
words that are highly interconnected are good candidates to form a category
word relationship graph G
  merging all of the single-pattern graphs into a single unified graph
A clique is a set of pairwise adjacent vertices in a graph, i.e. a complete subgraph

The Clique-Set Method
Strong n-cliques
  subgraphs containing n nodes that are all bidirectionally interconnected
A clique Q defines a category that contains the nodes in Q plus all of the nodes that are
  (1) at least unidirectionally connected to all nodes in Q
  (2) bidirectionally connected to at least one node in Q

The Clique-Set Method
we use 2-cliques
For ea
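The meta-pattern constraints from the Pattern Candidates slides (at most 4 words, exactly two content words, no two consecutive CWs) can be illustrated with a short Python sketch. The function names, the "CW" slot placeholder, and the toy thresholds are assumptions made for this example and do not come from the slides.

```python
from collections import Counter
from itertools import product


def split_vocabulary(tokens, t_h, t_c):
    """Return (hfw, cw) word sets using per-million frequency thresholds.

    t_h and t_c play the role of T_H and T_C; the values used in the
    usage example are toy values, not the ones used by the authors.
    """
    counts = Counter(tokens)
    scale = 1_000_000 / len(tokens)  # convert raw counts to per-million rates
    hfw = {w for w, c in counts.items() if c * scale > t_h}
    cw = {w for w, c in counts.items() if c * scale < t_c}
    return hfw, cw


def meta_patterns(max_len=4):
    """Enumerate C/H shapes with exactly two Cs and no two adjacent Cs."""
    shapes = []
    for n in range(1, max_len + 1):
        for combo in product("CH", repeat=n):
            shape = "".join(combo)
            if shape.count("C") == 2 and "CC" not in shape:
                shapes.append(shape)
    return shapes


def pattern_instances(tokens, hfw, cw, shapes):
    """Slide each shape over the tokens and collect (pattern, x, y) triples.

    The pattern keeps its high frequency words and replaces each content
    word with the placeholder "CW" (a hypothetical convention).
    """
    found = []
    for shape in shapes:
        n = len(shape)
        for i in range(len(tokens) - n + 1):
            window = tokens[i:i + n]
            if all((w in cw) if s == "C" else (w in hfw)
                   for w, s in zip(window, shape)):
                x, y = [w for w, s in zip(window, shape) if s == "C"]
                pattern = tuple("CW" if s == "C" else w
                                for w, s in zip(window, shape))
                found.append((pattern, x, y))
    return found


shapes = meta_patterns()
tokens = "from paris to london and rome".split()
instances = pattern_instances(
    tokens, hfw={"from", "to", "and"}, cw={"paris", "london", "rome"}, shapes=shapes
)
```

Enumerating the shapes this way yields exactly the four meta-patterns named in the slides: CHC, CHCH, CHHC, and HCHC.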
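The single-pattern graph G(P) and the three measures from the Symmetric Patterns slides can be sketched as follows. The slides only say that each measure is a "proportion", so the denominators used here (all nodes of G(P) for M1 and M2, all arcs of G(P) for M3) are an assumption, as are the function names and the decision to ignore self-loops.

```python
def pattern_graph(pairs):
    """Build G(P) as a set of directed arcs (x, y) from CW pairs of one pattern.

    Self-loops (x == y) are dropped; this is an assumption of the sketch.
    """
    return {(x, y) for x, y in pairs if x != y}


def symmetry_measures(arcs):
    """Return (m1, m2, m3) for a single-pattern graph given as a set of arcs."""
    if not arcs:
        return 0.0, 0.0, 0.0
    nodes = {n for arc in arcs for n in arc}
    first_slot = {x for x, _ in arcs}    # words seen in the first CW slot
    second_slot = {y for _, y in arcs}   # words seen in the second CW slot
    # SymG(P): keep only arcs whose reverse arc is also present.
    sym_arcs = {(x, y) for x, y in arcs if (y, x) in arcs}
    sym_nodes = {n for arc in sym_arcs for n in arc}
    m1 = len(first_slot & second_slot) / len(nodes)  # words usable in both slots
    m2 = len(sym_nodes) / len(nodes)                 # share of symmetric nodes
    m3 = len(sym_arcs) / len(arcs)                   # share of symmetric arcs
    return m1, m2, m3


# Toy instances of a pattern such as "x and y".
g = pattern_graph([("a", "b"), ("b", "a"), ("a", "c")])
m1, m2, m3 = symmetry_measures(g)
```

A pattern like "x and y" should score high on all three measures, while a pattern like "x belongs to y" should score low, since its arcs rarely occur in both directions.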
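As a rough illustration of the Clique-Set Method with 2-cliques, the sketch below finds bidirectionally connected word pairs in the word relationship graph and expands each one into a category using conditions (1) and (2). It covers only the category definition stated in the slides; the function names and the toy graph are hypothetical.

```python
def two_cliques(arcs):
    """Return all strong 2-cliques: unordered pairs linked in both directions."""
    return {frozenset((x, y)) for x, y in arcs if x != y and (y, x) in arcs}


def category(clique, arcs):
    """Expand clique Q into a category.

    A node joins if it is (1) connected in at least one direction to every
    node in Q and (2) connected in both directions to at least one node in Q.
    """
    nodes = {n for arc in arcs for n in arc}

    def connected(a, b):
        return (a, b) in arcs or (b, a) in arcs

    def bidirectional(a, b):
        return (a, b) in arcs and (b, a) in arcs

    members = set(clique)
    for n in nodes - set(clique):
        if (all(connected(n, q) for q in clique)
                and any(bidirectional(n, q) for q in clique)):
            members.add(n)
    return members


# Toy word relationship graph G (merged single-pattern graphs).
G = {
    ("cat", "dog"), ("dog", "cat"),
    ("cat", "bird"), ("bird", "cat"),
    ("bird", "dog"),
    ("dog", "fish"),
}
cliques = two_cliques(G)
animals = category(frozenset(("cat", "dog")), G)
```

In this toy graph, "bird" joins the category because it is linked to both clique members and bidirectionally to "cat", while "fish" is excluded because it has no link to "cat".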