q_learning
The q_learning module implements tabular Q-learning.
- class q_learning.QLearning(algo_config: TDAlgoConfig)
Q-learning algorithm
- __init__(algo_config: TDAlgoConfig) None
Constructor. Initializes the algorithm from the given configuration options
- Parameters
algo_config (The configuration options) –
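The fields of TDAlgoConfig are not shown here, so the sketch below is an illustrative assumption of what a temporal-difference configuration object typically carries (episode count, learning rate, discount factor, exploration rate); the real class may define different fields:

```python
from dataclasses import dataclass

@dataclass
class TDAlgoConfig:
    """Illustrative sketch only; the actual TDAlgoConfig may differ."""
    n_episodes: int = 500   # number of training episodes
    alpha: float = 0.1      # learning rate
    gamma: float = 0.99     # discount factor
    epsilon: float = 1.0    # initial exploration rate

# Hypothetical usage: the algorithm would be constructed from such a config
config = TDAlgoConfig(n_episodes=1000, alpha=0.05)
```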
- _update_q_table(env: Env, state: int, action: int, reward: float, next_state: Optional[int] = None) None
Update the underlying q table
- Parameters
env (The training environment) –
state (The current state the environment is in) –
action (The action index selected by the policy) –
reward (The reward returned by the environment) –
next_state (The next state observed after taking the action) –
- Return type
None
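The tabular update this method performs can be sketched as follows. The step size `alpha` and discount `gamma` are assumed to come from the algorithm's configuration, and a `None` next state is assumed to mark a terminal transition, where the target is the bare reward:

```python
import numpy as np

def update_q_table(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Sketch of the standard tabular Q-learning update:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    # On a terminal transition there is no next state to bootstrap from
    if next_state is None:
        target = reward
    else:
        target = reward + gamma * np.max(q[next_state])
    q[state, action] += alpha * (target - q[state, action])
```

Here `q` is a 2-D array indexed by (state, action); the update moves the stored value a fraction `alpha` of the way toward the bootstrapped target.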
- actions_after_episode_ends(env: Env, episode_idx: int, **options) None
Execute any actions the algorithm needs after the episode ends
- Parameters
env (The environment to train on) –
episode_idx (The episode index) –
options (Any options passed by the client code) –
- Return type
None
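A typical post-episode action in Q-learning is decaying the exploration rate. The schedule below is a minimal sketch; the decay constants and names are assumptions, not the class's actual attributes:

```python
def decay_epsilon(epsilon, decay_rate=0.995, epsilon_min=0.01):
    # Multiplicative decay applied once per episode, floored at epsilon_min
    return max(epsilon_min, epsilon * decay_rate)
```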
- actions_before_training_begins(env: Env, **options) None
Execute any actions the algorithm needs before training starts
- Parameters
env (The environment to train on) –
options (Any options passed by the client code) –
- Return type
None
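Before training starts, a tabular algorithm typically allocates its Q-table from the environment's dimensions. A sketch of that setup step, assuming the state and action counts are available (the helper name is hypothetical):

```python
import numpy as np

def allocate_q_table(n_states, n_actions, init_value=0.0):
    """Allocate a (n_states, n_actions) Q-table before the first episode.
    A positive init_value gives optimistic initialization, which
    encourages early exploration of untried actions."""
    return np.full((n_states, n_actions), init_value, dtype=float)
```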
- on_training_episode(env: Env, episode_idx: int, **options) EpisodeInfo
Train the algorithm on the episode
- Parameters
env (The environment to run the training episode on) –
episode_idx (The episode index) –
options (Options that client code may pass) –
- Return type
An instance of EpisodeInfo
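A self-contained sketch of what one training episode might look like. The environment interface (`env_reset`/`env_step` callables) and the epsilon-greedy policy are illustrative assumptions; the real method returns an EpisodeInfo instance, approximated here by a (total_reward, steps) tuple:

```python
import numpy as np

rng = np.random.default_rng(0)

def run_training_episode(q, env_step, env_reset, n_actions, eps=0.1,
                         alpha=0.1, gamma=0.99, max_steps=100):
    """Run one tabular Q-learning episode; return (total_reward, steps)."""
    state = env_reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        # Epsilon-greedy action selection
        if rng.random() < eps:
            action = int(rng.integers(n_actions))
        else:
            action = int(np.argmax(q[state]))
        next_state, reward, done = env_step(state, action)
        # Q-learning update toward the bootstrapped target
        target = reward if done else reward + gamma * np.max(q[next_state])
        q[state, action] += alpha * (target - q[state, action])
        total_reward += reward
        steps += 1
        state = next_state
        if done:
            break
    return total_reward, steps
```

Running this repeatedly on a small deterministic environment fills in the Q-table; the caller would collect the per-episode statistics that EpisodeInfo is presumably meant to carry.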