Area width
Area height
Epoch
Reward
Reward object
Returns a cloned environment.
Cloned environment
Performs the action and returns the new state.
Actions to be performed by the agent
Agent
state, reward, done
Performs the action without changing the environment and returns the new state.
Environment state
Actions to be performed by the agent
state, reward, done
Smooth maze environment
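
The difference between the mutating step, the non-mutating step, and cloning can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `ToyMazeEnv`, the method names `clone`, `step`, and `simulate`, the `(dx, dy)` action format, and the reward rule are hypothetical and are not taken from the actual smooth maze environment.

```python
import copy
from dataclasses import dataclass


@dataclass
class ToyMazeEnv:
    """Hypothetical stand-in, not the actual smooth maze environment."""
    width: float = 5.0   # area width
    height: float = 5.0  # area height
    x: float = 0.0       # agent position (assumed state representation)
    y: float = 0.0
    epoch: int = 0

    def clone(self):
        # Return an independent copy of the environment.
        return copy.deepcopy(self)

    def step(self, action):
        # Apply the action and mutate this environment.
        dx, dy = action
        self.x = min(max(self.x + dx, 0.0), self.width)
        self.y = min(max(self.y + dy, 0.0), self.height)
        self.epoch += 1
        # Assumed rule: the episode ends at the far corner of the area.
        done = self.x == self.width and self.y == self.height
        reward = 1.0 if done else 0.0
        return (self.x, self.y), reward, done

    def simulate(self, state, action):
        # Apply the action to a clone placed at `state`,
        # so this environment is left unchanged.
        probe = self.clone()
        probe.x, probe.y = state
        return probe.step(action)


env = ToyMazeEnv()
preview = env.simulate((0.0, 0.0), (1.0, 2.0))  # env itself does not move
position_after_preview = (env.x, env.y)
actual = env.step((1.0, 2.0))                   # env moves to (1.0, 2.0)
```

Both calls return the same `state, reward, done` triple, but only `step` advances the environment's position and epoch counter.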