Policy gradient agent

Constructors

Properties

_history: any[]
_table: SoftmaxPolicyGradient

Methods

  • Returns an action.

    Parameters

    • state: any[]

      Current states

    Returns any[]

    Action

  • Returns a score.

    Returns number[][][]

    Score values

  • Reset agent.

    Returns void

  • Update model.

    Parameters

    • action: any[]

      Action

    • state: any[]

      Next states

    • reward: number

      Reward

    • done: boolean

      Whether the epoch has finished

    • learning_rate: number

      Learning rate

    Returns void
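
The methods above follow the usual shape of a softmax policy-gradient (REINFORCE-style) agent: pick an action, record the step, and update the policy once the epoch finishes. The block below is a minimal sketch of that idea, not the implementation behind SoftmaxPolicyGradient. Every name in it (SketchPolicyGradientAgent, SketchStep, getAction, getScore, gamma, rng) is hypothetical, and it makes simplifying assumptions: states and actions are integer indices rather than any[], scores are a number[][] of action probabilities rather than number[][][], and update receives the state in which the action was taken rather than the next state.

```typescript
// Hypothetical sketch of a tabular softmax policy-gradient agent.
// It is NOT the library's SoftmaxPolicyGradient; all names are illustrative.
type SketchStep = { state: number; action: number; reward: number };

// Numerically stable softmax over a list of action preferences.
function softmax(prefs: number[]): number[] {
  const max = prefs.reduce((m, p) => (p > m ? p : m), -Infinity);
  const exps = prefs.map((p) => Math.exp(p - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

class SketchPolicyGradientAgent {
  private theta: number[][] = []; // action preferences, one row per state
  private history: SketchStep[] = []; // steps recorded during the current epoch

  constructor(nStates: number, nActions: number, private rng: () => number = Math.random) {
    for (let s = 0; s < nStates; s++) {
      const row: number[] = [];
      for (let a = 0; a < nActions; a++) row.push(0);
      this.theta.push(row);
    }
  }

  // Action probabilities for every state (assumed meaning of "score").
  getScore(): number[][] {
    return this.theta.map((row) => softmax(row));
  }

  // Sample an action from the softmax distribution of the given state.
  getAction(state: number): number {
    const probs = softmax(this.theta[state]);
    let r = this.rng();
    for (let a = 0; a < probs.length; a++) {
      r -= probs[a];
      if (r < 0) return a;
    }
    return probs.length - 1; // guard against floating-point rounding
  }

  // Forget the steps recorded so far.
  reset(): void {
    this.history = [];
  }

  // Record one step; when the epoch is done, apply the REINFORCE update.
  update(action: number, state: number, reward: number, done: boolean,
         learningRate: number, gamma: number = 0.99): void {
    this.history.push({ state, action, reward });
    if (!done) return;
    let ret = 0; // discounted return, accumulated backwards in time
    for (let t = this.history.length - 1; t >= 0; t--) {
      const step = this.history[t];
      ret = step.reward + gamma * ret;
      const probs = softmax(this.theta[step.state]);
      for (let a = 0; a < probs.length; a++) {
        // Gradient of log softmax: indicator(a == chosen action) - probability(a)
        const grad = (a === step.action ? 1 : 0) - probs[a];
        this.theta[step.state][a] += learningRate * ret * grad;
      }
    }
    this.history = [];
  }
}

// Usage: one state, two actions, a single rewarded step that ends the epoch.
const sketchAgent = new SketchPolicyGradientAgent(1, 2);
const chosen = sketchAgent.getAction(0);
sketchAgent.update(chosen, 0, 1.0, true, 0.1);
console.log(sketchAgent.getScore());
```

After a positive reward, the probability of the chosen action in that state rises above its initial uniform value. Passing a custom rng makes the sampling reproducible, which helps when testing.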