value_iteration

The value_iteration module provides a simple implementation of the value iteration algorithm.

class value_iteration.ValueIteration(algo_config: DPAlgoConfig, policy_adaptor: PolicyAdaptorBase)

The ValueIteration class implements the value iteration algorithm.
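For orientation, value iteration repeatedly applies the Bellman optimality backup to every state until the value estimates stop changing, then reads off a greedy policy. The following is a minimal, generic sketch of that idea in plain Python. It is not this module's implementation: the `value_iteration` function, the `transitions` dictionary layout, and the `chain` example are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical model layout (an assumption, not this module's API):
# transitions[state][action] -> list of (probability, next_state, reward)
Transitions = Dict[int, Dict[int, List[Tuple[float, int, float]]]]


def value_iteration(transitions: Transitions, gamma: float = 0.9,
                    tol: float = 1e-8, max_iters: int = 10_000):
    """Return (values, greedy_policy) for a small tabular MDP."""
    values = {s: 0.0 for s in transitions}

    def q_value(s: int, a: int) -> float:
        # Expected one-step return of taking action a in state s.
        return sum(p * (r + gamma * values[s2])
                   for p, s2, r in transitions[s][a])

    for _ in range(max_iters):
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal state: its value stays at 0
                continue
            best = max(q_value(s, a) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place (Gauss-Seidel style) update
        if delta < tol:  # largest change in this sweep is below tolerance
            break

    # Greedy policy with respect to the converged values.
    policy = {s: max(actions, key=lambda a: q_value(s, a))
              for s, actions in transitions.items() if actions}
    return values, policy


# Three-state chain: action 0 stays put, action 1 moves right.
# Entering the terminal state 2 yields a reward of 1.
chain: Transitions = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 0.0)]},
    1: {0: [(1.0, 1, 0.0)], 1: [(1.0, 2, 1.0)]},
    2: {},
}
values, policy = value_iteration(chain, gamma=0.9)
```

In this toy chain the greedy policy moves right in both non-terminal states, and the value of state 0 is the value of state 1 discounted once by `gamma`.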

__init__(algo_config: DPAlgoConfig, policy_adaptor: PolicyAdaptorBase) → None

Constructor

Parameters

actions_before_training_begins(env: Env, **options) → None

Execute any actions the algorithm needs before the iterations start.

Parameters

Return type

None

on_training_episode(env: Env, episode_idx: int, **options) → EpisodeInfo

Train the algorithm on the given episode.

Parameters

Return type

An instance of EpisodeInfo
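The two methods above suggest a lifecycle: one setup call before training, then one call per episode. The sketch below illustrates that call order with stand-in classes only. The `EpisodeInfo` dataclass and its fields, `EpisodicAlgorithm`, `CountingAlgorithm`, and `run_training` are all hypothetical and defined here for illustration; the library's real `EpisodeInfo`, `Env`, and trainer may look different.

```python
from dataclasses import dataclass
from typing import Any, List, Protocol


@dataclass
class EpisodeInfo:
    # Hypothetical stand-in; the real EpisodeInfo fields are not documented here.
    episode_index: int
    total_iterations: int


class EpisodicAlgorithm(Protocol):
    # Mirrors the two method signatures documented above.
    def actions_before_training_begins(self, env: Any, **options) -> None: ...
    def on_training_episode(self, env: Any, episode_idx: int,
                            **options) -> EpisodeInfo: ...


class CountingAlgorithm:
    """Toy algorithm that only records how it is called."""

    def __init__(self) -> None:
        self.ready = False
        self.episodes_seen: List[int] = []

    def actions_before_training_begins(self, env: Any, **options) -> None:
        self.ready = True  # e.g. allocate value tables here

    def on_training_episode(self, env: Any, episode_idx: int,
                            **options) -> EpisodeInfo:
        if not self.ready:
            raise RuntimeError("setup hook was not called before training")
        self.episodes_seen.append(episode_idx)
        return EpisodeInfo(episode_index=episode_idx, total_iterations=1)


def run_training(algorithm: EpisodicAlgorithm, env: Any,
                 n_episodes: int) -> List[EpisodeInfo]:
    # Hypothetical driver: set up once, then train episode by episode.
    algorithm.actions_before_training_begins(env)
    return [algorithm.on_training_episode(env, i) for i in range(n_episodes)]


algo = CountingAlgorithm()
infos = run_training(algo, env=None, n_episodes=3)
```

Under these assumptions, a ValueIteration instance would take the place of `CountingAlgorithm`, and the driver would collect one EpisodeInfo per episode.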