Deep Deterministic Policy Gradient agent
Environment
Resolution of actions
Network layers
Optimizer of the network
Returns an action.
Current states
Random factor
Action
Returns a score.
Score values
Updates the model.
Next states
Reward
Learning rate
Batch size
Loss value
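
The update step above takes next states and rewards and returns a loss value. The source does not say how that loss is computed, so the following is only a minimal sketch of the standard DDPG critic loss: the mean squared error between the critic's scores and the targets `reward + discount * next_score`. The function names, the `discount` parameter, and the sample batch are hypothetical and chosen for illustration; terminal-state handling and the actor update are omitted.

```python
def critic_targets(rewards, next_scores, discount=0.99):
    """Targets for the critic: reward + discount * score of the next state.

    `next_scores` stands for the target critic's scores of the next states
    (an assumption; the source only names "Next states" and "Reward").
    """
    return [r + discount * q for r, q in zip(rewards, next_scores)]


def critic_loss(scores, targets):
    """Mean squared error between current scores and their targets."""
    return sum((s - t) ** 2 for s, t in zip(scores, targets)) / len(scores)


# Hypothetical batch of three transitions (batch size = 3).
rewards = [1.0, 0.0, -1.0]
next_scores = [2.0, 1.0, 0.0]   # scores of the next states
scores = [2.5, 1.0, -1.0]       # scores of the current states

targets = critic_targets(rewards, next_scores, discount=0.5)
loss = critic_loss(scores, targets)
```

In a full agent the loss would be minimized by the network's optimizer at the given learning rate; here it is only computed, to show what the returned loss value could represent.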