A comparative evaluation of modern English corpus.pdf

A comparative evaluation of modern English corpus grammatical annotation schemes

Eric Atwell, George Demetriou, John Hughes, Amanda Schiffrin, Clive Souter and Sean Wilcock
Centre for Computer Analysis of Language and Speech (CCALAS)
ICAME Journal No. 24

1 Introduction

Many English Corpus Linguistics projects reported in ICAME Journal and elsewhere involve grammatical analysis or tagging of English texts (e.g. Atwell 1983, Leech et al 1983, Booth 1985, Owen 1987, Souter 1989a, O’Donoghue 1991, Belmore 1991, Kytö and Voutilainen 1995, Aarts 1996, Qiao and Huang 1998). Each new project has to review existing tagging schemes, and decide which to adopt and/or adapt. The AMALGAM project can help in this decision, by providing descriptions and analyses of a range of tagging schemes, and an internet-based service for researchers to try out the range of tagging schemes on their own data.

The project AMALGAM (Automatic Mapping Among Lexico-Grammatical Annotation Models) explored a range of Part-of-Speech tagsets and phrase structure parsing schemes used in modern English corpus-based research. The PoS-tagging schemes include: Brown (Greene and Rubin 1981), LOB (Atwell 1982, Johansson et al 1986), Parts (man 1986), SEC (Taylor and Knowles 1988), POW (Souter 1989b), UPenn (Santorini 1990), LLC (Eeg-Olofsson 1991), ICE (Greenbaum 1993), and BNC (Garside 1996). The parsing schemes include some which have been used for hand annotation of corpora or manual post-editing of automatic parsers, and others which are unedited output of a parsing program.

Project deliverables include:

– a detailed description of each PoS-tagging scheme, at a comparable level of detail. This includes a list of PoS-tags with descriptions and example uses from the source Corpus. The description of the use of PoS-tags is also illustrated in a multi-tagged corpus: a set of sample texts PoS-tagged in parallel with each PoS-tagset (and proofread by experts), for comparative studies
– an analysis of the different lexical tokenization rules used in the source Corpora, to arrive at a ‘Corpus-neutral’ tokenization scheme (and consequent adjustments to the PoS-tagsets in our study to accept modified tokenization)
– an implementation of each PoS-tagset in conjunction with our standardised tokenizer, as a family of PoS-taggers, one for each PoS-tagset
– a method
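
As a rough illustration of the multi-tagged corpus named in the first deliverable, the sketch below aligns several parallel tag sequences over one shared token list and reports where two tagsets label a word differently. It is not the AMALGAM implementation: the names `MultiTaggedToken`, `build_multi_tagged` and `disagreements` are hypothetical, and the sample sentence and its tags are illustrative assumptions rather than data taken from the project's corpus.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MultiTaggedToken:
    """One word carrying one tag per tagset (hypothetical structure)."""
    word: str
    tags: Dict[str, str] = field(default_factory=dict)


def build_multi_tagged(words: List[str],
                       taggings: Dict[str, List[str]]) -> List[MultiTaggedToken]:
    """Align parallel tag sequences over a shared tokenization.

    Assumes every tagset was applied to the same token list, which is
    what a common tokenization scheme would make possible.
    """
    for name, tags in taggings.items():
        if len(tags) != len(words):
            raise ValueError(f"{name}: {len(tags)} tags for {len(words)} tokens")
    return [
        MultiTaggedToken(word, {name: tags[i] for name, tags in taggings.items()})
        for i, word in enumerate(words)
    ]


def disagreements(tokens: List[MultiTaggedToken],
                  a: str, b: str) -> List[Tuple[str, str, str]]:
    """List (word, tag_a, tag_b) wherever the two tagsets use different labels."""
    return [(t.word, t.tags[a], t.tags[b]) for t in tokens if t.tags[a] != t.tags[b]]


# Illustrative sample only; tags are assumptions, not project data.
sample = build_multi_tagged(
    ["the", "cat", "sat"],
    {
        "Brown": ["AT", "NN", "VBD"],
        "LOB": ["ATI", "NN", "VBD"],
        "UPenn": ["DT", "NN", "VBD"],
    },
)
print(disagreements(sample, "Brown", "UPenn"))
```

Comparing label strings only shows where the tagsets name a category differently; it says nothing about whether the underlying grammatical distinctions match, which is the harder question a comparative study would need to address.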
