Monte Carlo Methods for Solving Reinforcement Learning Problems


Dissecting “Reinforcement Learning” by Richard S. Sutton with Custom Python Implementations, Episode III

We continue our deep dive into Sutton’s great book about RL [1] and here deal with Monte Carlo (MC) methods. These are capable of learning from experience alone, i.e. they do not require any kind of model of the environment, as is e.g. required by the Dynamic Programming (DP) methods we introduced in the previous post.

This is incredibly tempting, as often the model is simply not known, or the transition probabilities are difficult to model. Consider the game of Blackjack: although we fully understand the game and its rules, solving it via DP methods would be very tedious. We would have to compute all kinds of probabilities, e.g. given the currently played cards, how likely is a “blackjack”, how likely is it that another seven is dealt … With MC methods, we do not have to deal with any of this; we simply play and learn from experience, as the sketch below illustrates.
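To make this concrete, here is a minimal sketch of first-visit MC prediction for Blackjack. It assumes the gymnasium package with its built-in Blackjack-v1 environment, and the fixed policy (“hit below 20, otherwise stick”) is just an illustrative choice; the idea is simply to play episodes and average the observed returns per state.

```python
# A minimal sketch (not the book's code) of first-visit Monte Carlo prediction
# for Blackjack, assuming the `gymnasium` package and its "Blackjack-v1" env.
from collections import defaultdict
import gymnasium as gym

env = gym.make("Blackjack-v1")

def policy(state):
    player_sum, dealer_card, usable_ace = state
    return 1 if player_sum < 20 else 0  # 1 = hit, 0 = stick

returns = defaultdict(list)  # state -> returns observed after first visits
V = defaultdict(float)       # state -> current value estimate

for _ in range(100_000):
    # Generate one complete episode by simply playing under the fixed policy.
    state, _ = env.reset()
    episode = []
    done = False
    while not done:
        action = policy(state)
        next_state, reward, terminated, truncated, _ = env.step(action)
        episode.append((state, reward))
        state = next_state
        done = terminated or truncated

    # First-visit MC update: average the return following the first visit
    # to each state (Blackjack is episodic and undiscounted, so gamma = 1).
    first_visit = {}
    for t, (s, _) in enumerate(episode):
        first_visit.setdefault(s, t)
    G = 0.0
    for t in reversed(range(len(episode))):
        s, r = episode[t]
        G += r
        if first_visit[s] == t:
            returns[s].append(G)
            V[s] = sum(returns[s]) / len(returns[s])

# Print a few learned values; keys are (player_sum, dealer_card, usable_ace).
for s in list(V)[:5]:
    print(s, round(V[s], 3))
```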

Photo by Jannis Lucas on Unsplash

Because MC methods do not rely on a model and instead estimate values from complete sample returns, they are unbiased. They are conceptually simple and easy to understand, but they exhibit high variance and do not bootstrap, i.e. they cannot update estimates iteratively from other estimates the way DP methods do.
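The following self-contained snippet only illustrates this point, with a made-up “true” state value and noise level: averaging complete sampled returns gives an estimate that is correct on average (unbiased), but whose spread shrinks only slowly with the number of episodes (high variance).

```python
# A small illustration (not from the book) of why MC estimates are unbiased
# but high-variance: the value estimate is a plain average of sampled returns,
# so it centres on the true value, while its spread shrinks like 1/sqrt(N).
import numpy as np

rng = np.random.default_rng(0)
TRUE_VALUE = 0.5  # hypothetical true value of some state (made up)

def sample_return():
    """Stand-in for the return of one full episode: noisy, but correct on average."""
    return TRUE_VALUE + rng.normal(scale=2.0)

for n_episodes in (10, 100, 10_000):
    # Repeat the whole MC experiment 200 times to see the estimator's spread.
    estimates = [np.mean([sample_return() for _ in range(n_episodes)])
                 for _ in range(200)]
    print(f"N={n_episodes:6d}  mean of estimates={np.mean(estimates):+.3f}  "
          f"std of estimates={np.std(estimates):.3f}")
```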

As mentioned, we will introduce these methods following Chapter 5 of Sutton’s book…
