Check out this 2D race car learning to drive through a track using on-policy Monte Carlo control. The car doesn't know anything about the track; it only sees its current location, its velocity, and the rewards it gets while driving. At each time step the car can change its velocity by 1 unit in x and/or y, and it eventually learns how to get to the finish line!
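
If you want a feel for how on-policy Monte Carlo control works, here's a minimal sketch. To be clear, this is not my racetrack code: the environment is a made-up 1-D corridor (start at 0, goal at 4, reward of -1 per step), and names like `step`, `epsilon_greedy`, and `mc_control` are just illustrative. The idea is the same, though: play a whole episode with an epsilon-greedy policy, then average the first-visit returns into Q.

```python
import random
from collections import defaultdict

# Hypothetical toy environment (NOT the racetrack): a 1-D corridor.
GOAL = 4
ACTIONS = (-1, +1)  # move left or right

def step(state, action):
    """Move left/right, clipped at the left wall; every step costs -1."""
    next_state = max(0, state + action)
    return next_state, -1.0, next_state == GOAL

def epsilon_greedy(Q, state, epsilon, rng):
    """Explore with probability epsilon, else act greedily (random tie-break)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(Q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if Q[(state, a)] == best])

def mc_control(episodes=5000, epsilon=0.1, gamma=1.0, seed=0):
    """On-policy first-visit Monte Carlo control with sample-average Q."""
    rng = random.Random(seed)
    Q = defaultdict(float)     # action-value estimates, initialised to 0
    counts = defaultdict(int)  # number of returns averaged per (state, action)
    for _ in range(episodes):
        # 1. Generate one full episode with the current epsilon-greedy policy.
        state, done, episode = 0, False, []
        while not done:
            action = epsilon_greedy(Q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            episode.append((state, action, reward))
            state = next_state
        # 2. Walk backwards accumulating the return G. Because we go in
        #    reverse, the last write per key is the FIRST visit's return.
        G = 0.0
        first_visit_return = {}
        for s, a, r in reversed(episode):
            G = gamma * G + r
            first_visit_return[(s, a)] = G
        # 3. Fold each first-visit return into a running average.
        for key, ret in first_visit_return.items():
            counts[key] += 1
            Q[key] += (ret - Q[key]) / counts[key]
    return Q

Q = mc_control()
greedy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)}
print(greedy)
```

The policy only changes between episodes, because Q is updated after each episode ends. On the racetrack the state would be (position, velocity) and there would be nine actions instead of two, but the loop has the same shape.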

Check out this other windy one:

The problem was taken from 5.6 of Sutton and Barto's Reinforcement Learning: An Introduction (with modified rewards to get both right and left turns). I highly recommend the book!

If you’re also learning Reinforcement Learning, please contribute to my attempt at solutions to Intro to RL. Thanks to my co-workers Abhi and Chuck for helping code up the OG algo and environment for 5.4.


Baruch Tabanpour