sarsa

class sarsa.Sarsa(algo_config: TDAlgoConfig)

Implements SARSA (State-Action-Reward-State-Action), an on-policy temporal-difference control algorithm.

__init__(algo_config: TDAlgoConfig)

Constructor. Builds the algorithm from the given TDAlgoConfig configuration.

actions_after_episode_ends(env: Env, episode_idx, **options) → None

Execute any actions the algorithm needs after a training episode ends.

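Parameters

env (Env): the environment the agent is trained on
episode_idx: index of the training episode that just ended
options: additional keyword options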

Return type

None

actions_before_training_begins(env: Env, **options) → None

Execute any actions the algorithm needs before training begins.

Parameters

env (Env): the environment the agent is trained on
options: additional keyword options

Return type

None

do_on_training_episode(env: Env, episode_idx: int, **options) → EpisodeInfo

Train the agent on the environment for the episode with the given index.

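Parameters

env (Env): the environment the agent is trained on
episode_idx (int): index of the training episode
options: additional keyword options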

Return type

EpisodeInfo
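The body of do_on_training_episode is not shown in this reference. As an illustration only, the following is a minimal sketch of what one on-policy SARSA training episode typically looks like. Everything in it is an assumption made for the example: the ChainEnv toy environment, its reset()/step() interface, the epsilon_greedy helper, the hyperparameter defaults, and the plain dict returned in place of EpisodeInfo are hypothetical and are not taken from the library.

```python
import random
from collections import defaultdict


class ChainEnv:
    """Hypothetical toy environment: a 5-state chain with the goal at the right end."""

    n_actions = 2  # action 0 moves left, action 1 moves right

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def epsilon_greedy(q_table, state, n_actions, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one (ties broken randomly)."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = [q_table[(state, a)] for a in range(n_actions)]
    best = max(values)
    return rng.choice([a for a, v in enumerate(values) if v == best])


def run_sarsa_episode(env, q_table, rng, alpha=0.1, gamma=0.9, epsilon=0.1, max_steps=200):
    """Run one episode, updating q_table in place with the SARSA rule."""
    state = env.reset()
    action = epsilon_greedy(q_table, state, env.n_actions, epsilon, rng)
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        next_state, reward, done = env.step(action)
        # On-policy: the next action comes from the same policy that is being learned.
        next_action = epsilon_greedy(q_table, next_state, env.n_actions, epsilon, rng)
        # A terminal state contributes no future value.
        next_q = 0.0 if done else q_table[(next_state, next_action)]
        q_table[(state, action)] += alpha * (reward + gamma * next_q - q_table[(state, action)])
        total_reward += reward
        steps += 1
        state, action = next_state, next_action
        if done:
            break
    # Hypothetical stand-in for EpisodeInfo; the real class's fields are not documented here.
    return {"episode_reward": total_reward, "episode_iterations": steps}


rng = random.Random(0)
q_table = defaultdict(float)
env = ChainEnv()
infos = [run_sarsa_episode(env, q_table, rng) for _ in range(50)]
```

The next action is chosen before the update because SARSA bootstraps from the action the agent will actually take, which is what distinguishes it from Q-learning.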

update_q_table(reward: float, current_action: int, next_state: int, next_action: int) → None

Update the underlying Q-table from the observed transition.

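Parameters

reward (float): reward received after taking current_action
current_action (int): action taken in the current state
next_state (int): state reached after taking current_action
next_action (int): action selected in next_state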

Return type

None
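For reference, the textbook SARSA update is Q(s, a) ← Q(s, a) + α [r + γ Q(s', a') − Q(s, a)]. The sketch below shows that rule in isolation. It is not the library's implementation: the sarsa_update function, the dict-based table keyed by (state, action), and the alpha and gamma arguments are assumptions for the example. The current state is passed explicitly here, whereas update_q_table does not take it as an argument and presumably obtains it some other way.

```python
from collections import defaultdict


def sarsa_update(q_table, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.99):
    """Apply one SARSA update to q_table in place and return the updated value."""
    # Target bootstraps from the action actually selected in the next state.
    td_target = reward + gamma * q_table[(next_state, next_action)]
    td_error = td_target - q_table[(state, action)]
    q_table[(state, action)] += alpha * td_error
    return q_table[(state, action)]


# Hypothetical table: unseen (state, action) pairs default to 0.0.
q_table = defaultdict(float)
q_table[(1, 0)] = 2.0

new_value = sarsa_update(q_table, state=0, action=1, reward=1.0,
                         next_state=1, next_action=0, alpha=0.5, gamma=0.5)
```

With these numbers the target is 1.0 + 0.5 * 2.0 = 2.0, the error against the initial value 0.0 is 2.0, and the entry for (0, 1) moves halfway to the target, giving 1.0.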