An Overview of Corpus Research Methods (语料库语料库研究方法概述.ppt)
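
Topic Selection, Design and Methods: Put It All Together
Li Wenzhong (李文中), China Foreign Language Education Research Centre (中国外语教育研究中心), 2012

Corpora are not for humans to learn, and regular expressions are not for women to learn.
Corpus-driven is basically corpus-based. Any corpus-based research is necessarily driven by corpus data.

Goals: through corpus analysis and research,
  test hypotheses and intuitions
  obtain new findings
  form new hypotheses
  build new theories
  verify existing findings
  solve hard problems

Innovation: data, method, technique, interpretation / theory / perspective
[Table of check marks (√) showing which of these are new; layout not recoverable]

The corpus-based approach is a verification procedure.
The corpus-driven approach is a discovery procedure.

Rationale: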
Any perception is but inference.
  World of reality
  World of text
  The Einstein Gulf: unbridgeable
  Eye, ear, nose, tongue, body, mind
  Form, sound, smell, taste, touch, mental objects
  Study, inquire, reflect, discern, act
  Text

Basic steps
  Decide on a topic
  Pose the questions
  Define the population and the sample
  Choose the tools
  Process the data
  Describe the results: classify, summarize features (description)
  Explain the results: observe, describe, explain (explanation)
  Interpret the results: meaning, value, application (interpretation)

Identifying a problem
  A thing or phenomenon that is:
    unexpected
    incongruent
    in need of a solution
    puzzling

Reading to be better informed
  What has been done as a contribution
  What has been left undone
  What has been done wrong
  Never count someone else's money.

Formulating research questions
  Naming: What is…?
  Classificatory: How are they interrelated (patterned)?
  Explanatory: To what extent do they co-occur?
  Predictive: What will happen if…?
  Never ask a question to which you already know the answer; never ask a "how to" question.

Finding a method
  Population
  Sample
  Sampling
  P (Population), S (Sample), R (Result), I (Interpretation)
  Sampling: validity, reliability
  Validity
  Generalizability
  IF P → S, S → R, R → I, THEN I → P

Descriptive research
  Single text
  Text vs. text
  People vs. text

Research questions
  1. How many different word forms are used in the text? How many running words are used? What is their distribution?
  2. To what extent can the level of difficulty of the text be computed on the basis of the graded word lists?
  3. How many different word classes are used? What is the number of each word class?

Method
  To answer RQ1, generate a word list of the given text and observe:
    the number of types
    the number of tokens
    the type/token ratio (TTR); if the text is very large, standardize the TTR
    the types and their frequency
    the cumulative percentage
  To answer RQ2, compare the word list against a batch of graded word lists and observe:
    How many types on the Level 1, 2, and 3 lists are used in the text, and what is their percentage? What about their tokens?
    How many types that are not on any list are used in the text? Summarize their features.
  To answer RQ3, retrieve each word class from the POS-tagged text and sort the items by frequency in decreasing order:
    retrieve all the nouns, verbs, and adjectives
    sort the list

Instruments
  Use AntConc 3.0 to generate…
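
The RQ1 and RQ2 measures named in the Method slide (types, tokens, TTR, standardized TTR, cumulative percentage, and coverage by graded word lists) can be illustrated with a minimal Python sketch. This is not AntConc's implementation, only an approximation under stated assumptions: the regex tokenizer, the chunk size of 1000 tokens for the standardized TTR, and the two tiny graded lists are all hypothetical choices made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def ttr(tokens):
    """Type/token ratio: distinct word forms divided by running words."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def standardized_ttr(tokens, chunk_size=1000):
    """Mean TTR over consecutive full chunks of chunk_size tokens.

    Falls back to the plain TTR when the text is shorter than one chunk.
    The default chunk size is an assumption, not a value from the slides.
    """
    chunks = [tokens[i:i + chunk_size]
              for i in range(0, len(tokens) - chunk_size + 1, chunk_size)]
    if not chunks:
        return ttr(tokens)
    return sum(ttr(chunk) for chunk in chunks) / len(chunks)


def frequency_table(tokens):
    """Rows of (type, frequency, cumulative percentage), most frequent first."""
    total = len(tokens)
    rows, running = [], 0
    for word, freq in Counter(tokens).most_common():
        running += freq
        rows.append((word, freq, 100.0 * running / total))
    return rows


def graded_coverage(tokens, graded_lists):
    """Count types and tokens per graded list; unmatched words are 'off-list'.

    graded_lists maps a level name to a set of word forms. A word is
    assigned to the first level (in dict order) that contains it.
    """
    result = {level: {"types": 0, "tokens": 0} for level in graded_lists}
    result["off-list"] = {"types": 0, "tokens": 0}
    for word, freq in Counter(tokens).items():
        level = next((name for name, words in graded_lists.items()
                      if word in words), "off-list")
        result[level]["types"] += 1
        result[level]["tokens"] += freq
    return result


# Hypothetical sample text and graded lists, for illustration only.
sample = "The cat sat on the mat. The dog sat on the log."
levels = {"Level 1": {"the", "on", "cat", "dog"}, "Level 2": {"sat", "mat"}}

toks = tokenize(sample)
print("types:", len(set(toks)), "tokens:", len(toks), "TTR:", round(ttr(toks), 3))
for row in frequency_table(toks):
    print(row)
print(graded_coverage(toks, levels))
```

The off-list types returned by `graded_coverage` are the ones the slide asks to summarize by hand; the percentages for each level follow by dividing the per-level counts by the overall type and token totals.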