Towards a Representation of Idioms in WordNet

Christiane Fellbaum
Cognitive Science Laboratory, Princeton University
Rider University
Princeton, New Jersey, USA

1 Introduction

WordNet (0; 0) is perhaps the most widely used electronic dictionary of English and serves as the lexicon for a variety of different NLP applications including Information Retrieval (IR), Word Sense Disambiguation (WSD), and Machine Translation (MT). Despite WordNet's large coverage, which comprises some 100,000 concepts lexicalized by approximately 120,000 word forms (strings) and is comparable to that of a collegiate dictionary, it contains relatively little figurative language. WordNet includes a number of multi-word strings, such as phrasal verbs, but many idiomatic verb phrases like smell a rat, know the ropes, and eat humble pie, are missing. Idioms and metaphors abound in everyday language and are found in texts spanning many genres (see, e.g., (0) for a numerical estimate of the frequency of idioms and fixed expressions). Clearly, a dictionary that includes extended senses of words and phrases is likely to yield more successful NLP applications. On the one hand, no system wants to retrieve the string bucket from the idiom kick the bucket. On the other hand, MT and WSD efforts need to distinguish the sense of ropes in phrases like know/learn/teach [...]

[...] is no agreement on the boundary between literal and non-literal language, see e.g. (0). Criteria that are commonly accepted include semantic non-compositionality and syntactic constraints on internal modification (such as adjective and adverb insertion) and movement transformations. Our purpose here is not to attempt a clear delimitation or definition of non-literal language, but to examine how extended senses of words and phrases from different syntactic and lexical categories - or conforming to none of the standard categories - are compatible with the network structure of a relational lexicon like WordNet and its particular way of representing words and concepts. Our discussion will focus on, but not be limited to, idiomatic verb phrases.

2 A simple classification

An inspection of idiom dictionary sources such as (0) suggests a three-fold distinction among idioms for our purposes.

3 Constructions

First, some idiomatic constructions are simply too complex to be integrated into WordNet and [...]