XML-Based NLP Tools for Analysing and Annotating Medical Language

Claire Grover, Ewan Klein, Mirella Lapata and Alex Lascarides
Division of Informatics
The University of Edinburgh
2 Buccleuch Place
Edinburgh EH8 9LW, UK
{C.Grover, E.Klein, M.Lapata, A.Lascarides}@ed.ac.uk

Abstract

We describe the use of a suite of highly flexible XML-based NLP tools in a project for processing and interpreting text in the medical domain. The main aim of the paper is to demonstrate the central role that XML mark-up and XML NLP tools have played in the analysis process and to describe the resultant annotated corpus of MEDLINE abstracts. In addition to the XML tools, we have succeeded in integrating a variety of non-XML "off the shelf" NLP tools into our pipelines, so that their output is added into the mark-up. We demonstrate the utility of the annotations that result in two ways. First, we investigate how they can be used to improve parse coverage of a hand-crafted grammar that generates logical forms. And second, we investigate how they contribute to automatic lexical semantic acquisition processes.

1 Introduction

In this paper we describe our use of XML for an analysis of medical language which involves a number of complex linguistic processing stages. The ultimate aim of the project is to acquire lexical se-

[...] statistical text classification methods for data analysis. Our more linguistic approach may be of assistance in IE: see Craven and Kumlien (1999) for discussion of methods for IE from MEDLINE.

Our processing paradigm is XML-based. As a mark-up language for NLP tasks, XML is expressive and flexible yet constrainable. Furthermore, there exists a wide range of XML-based tools for NLP applications which lend themselves to a modular, pipelined approach to processing whereby linguistic knowledge is computed and added as XML annotations in an incremental fashion. In processing MEDLINE abstracts we have built a number of such pipelines using as key components the programs distributed with the LT TTT and LT XML toolsets (Grover et al., 2000; Thompson et al., 1997). We have also successfully integrated non-XML public-domain tools into our pipelines and incorporated their output into the XML mark-up using the LT XML program xmlperl (McKelvie, 2000).

In Section 2 we describe our use of XML-based tokenisation tools and techniques and in Sections 3 and