Reinforcement Learning

Reinforcement learning is the art of analyzing situations and mapping them to actions in order to maximize a numerical reward signal.

In this independent study, Dr. Stephen Davies and I will explore the reinforcement learning problem and its subproblems. We will go over the bandit problem and Markov decision processes, and discover how best to translate a problem in order to make decisions.

I have provided a list of the topics I wish to explore in a syllabus.

Readings

In order to spend more time learning, I decided to follow a textbook this time.

Reinforcement Learning: An Introduction

By Richard S. Sutton and Andrew G. Barto

Reading Schedule

Notes

The notes for this course will be an extremely summarized version of the textbook. There will also be notes on whatever side tangents Dr. Davies and I explore.

Notes page

I wrote a short, lighthearted report describing the bandit problem. It is a good way to learn about the common considerations in reinforcement learning problems.

The Bandit Report
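
As a rough illustration of the bandit problem, here is a minimal sketch of an epsilon-greedy agent. It is not taken from the report: the function name `run_epsilon_greedy`, the Gaussian reward model with unit variance, and the parameter values are all assumptions chosen for the example. The agent must balance exploring arms it knows little about against exploiting the arm that currently looks best.

```python
import random


def run_epsilon_greedy(true_means, steps=1000, epsilon=0.1, seed=0):
    """Play a multi-armed bandit with an epsilon-greedy strategy.

    true_means: hidden mean reward of each arm (assumed Gaussian, std 1).
    Returns the estimated means, the pull counts, and the total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how many times each arm was pulled
    estimates = [0.0] * n_arms   # running average reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the sample average for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_epsilon_greedy([0.1, 0.5, 0.9], steps=5000)
    print("estimated means:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
```

Setting `epsilon` to 0 makes the agent purely greedy, while a larger value spends more pulls on exploration; comparing the pull counts for a few values is a quick way to see the trade-off.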

Code

I will occasionally write code to solidify the learning material and to aid further exploration.

GitHub Link

If you want to see the agents I’ve created to solve some OpenAI environments, take a look at this folder in the GitHub repository. GitHub Link

Brandon Rozek