EXTRACTING KNOWLEDGE BASES FROM MACHINE-READABLE DICTIONARIES: HAVE WE WASTED OUR TIME?

Nancy Ide and Jean Véronis

Department of Computer Science
Vassar College
Poughkeepsie, New York 12601 (U.S.A.)
e-mail: {ide,veronis}@cs.vassar.edu

Laboratoire Parole et Langage
CNRS & Université de Provence
29, Avenue Robert Schuman
13621 Aix-en-Provence Cedex 1 (France)
e-mail: {ide,veronis}@fraix11.univ-aix.fr

ABSTRACT

Machine-readable versions of everyday dictionaries have been seen as a likely source of information for use in natural language processing because they contain an enormous amount of lexical and semantic knowledge. However, after 15 years of research, the results appear to be disappointing. No comprehensive evaluation of machine-readable dictionaries (MRDs) as a knowledge source has been made to date, although this is necessary to determine what, if anything, can be gained from MRD research. To this end, this paper will first consider the postulates upon which MRD research has been based over the past fifteen years, discuss the validity of these postulates, and evaluate the results of this work. We will then propose possible future directions and applications that may exploit these years of effort, in the light of current directions in not only NLP research, but also fields such as lexicography and electronic publishing.

[...] topic) is reduced, indicating either that extraction methods are well-established and robust (which is unlikely) or that research has turned to other areas. At the same time, the NLP community has turned its attention to corpora as a source of linguistic knowledge, evident both in the upsurge in the number of papers, journals, workshops, etc. dealing with corpora as a linguistic resource (e.g., the recent issue of Computational Linguistics, vol. 19, 1-2, 1993, devoted to corpus-based work, the workshop at ACL'93 on corpora, etc.) and in recent large-scale funding patterns (e.g., the European LRE program for corpora, ARPA's Linguistic Data Consortium in the U.S., etc.). It is clear that MRDs failed to live up to early expectations that they would provide a source of ready-made, comprehensive lexical knowledge. But does this mean that these many years of work on MRDs constitutes wasted effort? Does it mean that MRDs are conclusively unsuitable as a source for automatically building knowledge bas