In this notebook, we implement a very simple model based on the Q-learning algorithm. The notebook is intended to show a basic form of reinforcement learning that doesn't require a deep understanding of neural networks or advanced mathematics, and to show how one might deploy such a model in DataRobot.
This example shows the Grid World problem, where an agent learns to navigate a grid to reach a goal.
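To make the setup concrete, here is a minimal sketch of what a Grid World might look like in code. The 4x4 grid size, the goal in the bottom-right corner, the reward of 1.0 for reaching it, and the names `GRID_SIZE`, `ACTIONS`, `GOAL_STATE`, and `step` are all assumptions made for illustration, not necessarily the values used later in this notebook.

```python
import numpy as np

# Assumed for illustration: a 4x4 grid with the goal in the bottom-right cell.
GRID_SIZE = 4
N_STATES = GRID_SIZE * GRID_SIZE          # states are numbered 0..15, row by row
ACTIONS = ["up", "down", "left", "right"]
GOAL_STATE = N_STATES - 1

# Tabular Q-learning keeps one value per (state, action) pair.
q_table = np.zeros((N_STATES, len(ACTIONS)))


def step(state, action):
    """Hypothetical transition function: move one cell, staying inside the grid."""
    row, col = divmod(state, GRID_SIZE)
    if action == "up":
        row = max(row - 1, 0)
    elif action == "down":
        row = min(row + 1, GRID_SIZE - 1)
    elif action == "left":
        col = max(col - 1, 0)
    elif action == "right":
        col = min(col + 1, GRID_SIZE - 1)
    next_state = row * GRID_SIZE + col
    # Assumed reward scheme: 1.0 on reaching the goal, 0.0 otherwise.
    reward = 1.0 if next_state == GOAL_STATE else 0.0
    return next_state, reward
```

Under these assumptions, `step(14, "right")` moves the agent from the cell left of the goal into the goal, and moving into a wall leaves the agent where it is.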
The notebook will go through the following steps:
Define State and Action Space
Create a Q-table to store expected rewards for each state/action combination