Hidden Markov Model for Text Analysis

Student: Tun Tao Tsai
Advisor: Dr. Mark Stamp
Committee member: Dr. Jeff Smith
Committee member: Dr. Chris Pollett
Department of Computer Science, San Jose State University
Email: joetsai@nanocluster.net

Hidden Markov Model for Text Analysis
Abstract
1. Introduction
2. The Basic probability theory
2. The Cave and Neuwirth experiment
3. Hidden Markov model
3.1 The Markov property
3.2 The Hidden Markov model definition
3.3 Finding the probability of the observed sequence
3.4 The forward-backward algorithm
3.4.1 The forward recursion
3.4.2 The backward recursion
3.5 Choosing the best state sequence
3.6 Parameter re-estimation
4. Chinese information processing
5. Phonology
5.1 Chinese phonemic transcription
5.2 English phoneme transcription
6. Experiment design and the software
6.1 Number of iterations
6.2 Numbers of states
6.3 Chinese corpus
6.4 The software
7. Experiment results
7.1 English alphabet experiment results
7.2 English phoneme results
7.2.1 English phoneme experiment using 2 States
7.2.2 English phoneme experiment with more than two states
7.3 Chinese characters experiment result
7.4 Zhuyin experiment results
7.5 Entropy
8. Summary and conclusions
9. Future work
Reference
Appendix 1: Experiment results
Appendix 2: Entropy experiment results
Appendix 3: Brown University corpus
Appendix 4: Chinese character encoding
Appendix 5: Zhuyin–Pinyin conversion table
Appendix 6: CMU pronouncing dictionary phoneme chart
Appendix 7: Trial experiment result to determine the number of iterations to use

Abstract

In the field of natural language processing, the Hidden Markov Model (hereafter HMM) method has proven useful for finding patterns in sequences of data. In this study, we apply the HMM technique to the Brown corpus [1], the Brown corpus in its phonemic representation, a Chinese corpus, and the Chinese corpus in its Zhuyin representation, in an attempt to find statistical models that can explain certain language features. We first give a brief overview of the original experiment conducted by Cave and Neuwirth [14], the Hidden Markov Model, the Chinese language processing issues and English phonetic properties that are related to our experimen