Statistical Phrase-Based Translation

Philipp Koehn, Franz Josef Och, Daniel Marcu
Information Sciences Institute
Department of Computer Science
University of Southern California
koehn@isi.edu, och@isi.edu, marcu@isi.edu

Abstract

We propose a new phrase-based translation model and decoding algorithm that enables us to evaluate and compare several previously proposed phrase-based translation models. Within our framework, we carry out a large number of experiments to understand better and explain why phrase-based models outperform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. Surprisingly, learning phrases longer than three words and learning phrases from high-accuracy word-level alignment models do not have a strong impact on performance. Learning only syntactically motivated phrases degrades the performance of our systems.

[...] method to extract phrase translation pairs? In order to investigate this question, we created a uniform evaluation framework that enables the comparison of different ways to build a phrase translation table.

Our experiments show that high levels of performance can be achieved with fairly simple means. In fact, for most of the steps necessary to build a phrase-based system, tools and resources are freely available for researchers in the field. More sophisticated approaches that make use of syntax do not lead to better performance. In fact, imposing syntactic restrictions on phrases, as used in recently proposed syntax-based translation models [Yamada and Knight, 2001], proves to be harmful. Our experiments also show that small phrases of up to three words are sufficient for obtaining high levels of accuracy.

Performance differs widely depending on the methods used to build the phrase translation table. We found extraction heuristics based on word alignments to be better than a more principled phrase-based alignment method. However, what constitutes the best heuristic differs from language pair to language pair and varies with the size of the training corpus.

1 Introduction

2 Evaluation Framework

Various rese