Foundations of Statistical Natural Language Processing
Preface

The need for a thorough textbook for Statistical Natural Language Processing hardly needs to be argued for in the age of on-line information, electronic communication and the World Wide Web. Increasingly, businesses, government agencies and individuals are confronted with large amounts of text that are critical for working and living, but not well enough understood to get the enormous value out of them that they potentially hide.

At the same time, the availability of large text corpora has changed the scientific approach to language in linguistics and cognitive science. Phenomena that were not detectable or seemed uninteresting in studying toy domains and individual sentences have moved into the center field of what is considered important to explain. Whereas as recently as the early 1990s quantitative methods were seen as so inadequate for linguistics that an important textbook for mathematical linguistics did not cover them in any way, they are now increasingly seen as crucial for linguistic theory.

In this book we have tried to achieve a balance between theory and practice, and between intuition and rigor. We attempt to ground approaches in theoretical ideas, both mathematical and linguistic, but simultaneously we try to not let the material get too dry, and try to show how theoretical ideas have been used to solve practical problems. To do this, we first present key concepts in probability theory, statistics, information theory, and linguistics in order to give students the foundations to understand the field and contribute to it. Then we describe the problems that are addressed in Statistical Natural Language Processing (NLP), like tagging and disambiguation, and a selection of important work so that students are grounded in the advances that have been made and, having understood the special problems that language poses, can move the field forward.

When we designed the basic structure of the book, we had to make a number of decisions about what to include and how to organize the material. A key criterion was to keep the book to a manageable size. (We didn't entirely succeed!) Thus the book is not a complete introduction to probability theory, information theory, statistics, and the many other areas of mathematics that are used in Statistical NLP. We have tried to cover those topics that seem most important