Reinforcement learning is the art of analyzing situations and mapping them to actions in order to maximize a numerical reward signal.
In this independent study, Dr. Stephen Davies and I will explore the reinforcement learning problem and its subproblems. We will go over the bandit problem and Markov decision processes, and discover how best to frame a problem in order to make decisions.
I have listed the topics I wish to explore in a syllabus.
Readings
In order to spend more time learning, I decided to follow a textbook this time.
Reinforcement Learning: An Introduction
By Richard S. Sutton and Andrew G. Barto
Notes
The notes for this course will be an extremely condensed summary of the textbook. There will also be notes on whatever tangents Dr. Davies and I explore.
I wrote a short, quirky report describing the bandit problem. It is great for learning about the common considerations in reinforcement learning problems.
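To give a flavor of those considerations, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit, one common way to balance exploring unknown arms against exploiting the best-known one. It is not taken from the report: the function name `run_bandit`, its parameters, and the Gaussian reward model are all assumptions made for illustration.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a bandit (hypothetical helper).

    Assumption: each arm pays a Gaussian reward with the given mean
    and a standard deviation of 1.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running estimate of each arm's mean reward
    counts = [0] * n_arms       # how many times each arm has been pulled
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental sample-average update of the estimate.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_bandit([0.0, 10.0])
    print("estimates:", estimates)
    print("pulls per arm:", counts)
    print("total reward:", total)
```

With `epsilon` set to 0 the agent never explores and can get stuck on whichever arm it happened to try first; a small positive value lets it keep sampling the other arms occasionally, at the cost of some reward.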
Code
I will occasionally write code to solidify the material and to aid further exploration.
If you want to see agents I’ve created to solve some OpenAI environments, take a look at this specific folder in the GitHub repository. GitHub Link