Reinforcement_Learning_An_Introduction

Size: 2.79 MB

Pages: 369

Date: 2019-07-03


Sutton & Barto Book: Reinforcement Learning: An Introduction

Reinforcement Learning: An Introduction
Richard S. Sutton and Andrew G. Barto
MIT Press, Cambridge, MA, 1998
A Bradford Book

Endorsements, Code, Solutions, Figures, Errata, Course Slides

This introductory textbook on reinforcement learning is targeted toward engineers and scientists in artificial intelligence, operations research, neural networks, and control systems, and we hope it will also be of interest to psychologists and neuroscientists.

If you would like to order a copy of the book, or if you are a qualified instructor and would like to see an examination copy, please see the MIT Press home page for this book. Or you might be interested in the reviews at amazon.com. There is also a Japanese translation available.

The table of contents of the book is given below, with associated HTML. The HTML version has a number of presentation problems, and its text is slightly different from the real book, but it may be useful for some purposes.

● Preface

Part I: The Problem

● 1 Introduction
  ❍ 1.1 Reinforcement Learning
  ❍ 1.2 Examples
  ❍ 1.3 Elements of Reinforcement Learning
  ❍ 1.4 An Extended Example: Tic-Tac-Toe
  ❍ 1.5 Summary
  ❍ 1.6 History of Reinforcement Learning
  ❍ 1.7 Bibliographical Remarks
● 2 Evaluative Feedback
  ❍ 2.1 An n-armed Bandit Problem
  ❍ 2.2 Action-Value Methods
  ❍ 2.3 Softmax Action Selection
  ❍ 2.4 Evaluation versus Instruction
  ❍ 2.5 Incremental Implementation
  ❍ 2.6 Tracking a Nonstationary Problem
  ❍ 2.7 Optimistic Initial Values
  ❍ 2.8 Reinforcement Comparison
  ❍ 2.9 Pursuit Methods
  ❍ 2.10 Associative Search
  ❍ 2.11 Conclusion
  ❍ 2.12 Bibliographical and Historical Remarks
● 3 The Reinforcement Learning Problem
  ❍ 3.1 The Agent-Environment Interface
  ❍ 3.2 Goals and Rewards
  ❍ 3.3 Returns
  ❍ 3.4 A Unified Notation for Episodic and Continual Tasks
  ❍ 3.5 The Markov Property
  ❍ 3.6 Markov Decision Processes
  ❍ 3.7 Value Functions
  ❍ 3.8 Optimal Value Functions
  ❍ 3.9 Optimality and Approximation
  ❍ 3.10 Summary
  ❍ 3.11 Bibliographical and Historical Remarks

Part II: Elementary Methods

● 4 Dynamic Programming
  ❍ 4.1 Policy Evaluation
  ❍ 4.2 Policy Improvement
  ❍ 4.3 Policy Iteration
  ❍ 4.4 Value Iteration
