iterative_policy_evaluation
Module iterative_policy_evaluation. Implements a tabular version of the iterative policy evaluation algorithm as described in Sutton and Barto, Reinforcement Learning: An Introduction (2nd edition):
http://incompleteideas.net/book/RLbook2020.pdf
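For orientation, the following is a minimal sketch of tabular iterative policy evaluation (the in-place sweep variant from the book), not the implementation in this module. The function name evaluate_policy, the dictionary-based dynamics format, and the parameters gamma, theta and max_sweeps are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical model format (an assumption, not this module's API):
# dynamics[state][action] is a list of (probability, next_state, reward, done).
Transition = Tuple[float, int, float, bool]
Dynamics = Dict[int, Dict[int, List[Transition]]]
Policy = Dict[int, Dict[int, float]]  # policy[state][action] = pi(action | state)


def evaluate_policy(
    dynamics: Dynamics,
    policy: Policy,
    gamma: float = 1.0,
    theta: float = 1e-8,
    max_sweeps: int = 10_000,
) -> Dict[int, float]:
    """Estimate the state-value function of `policy` by repeated in-place sweeps."""
    values = {state: 0.0 for state in dynamics}
    for _ in range(max_sweeps):
        delta = 0.0
        for state, actions in dynamics.items():
            new_value = 0.0
            for action, transitions in actions.items():
                action_prob = policy[state].get(action, 0.0)
                for prob, next_state, reward, done in transitions:
                    # Terminal transitions do not bootstrap from the next state.
                    bootstrap = 0.0 if done else values.get(next_state, 0.0)
                    new_value += action_prob * prob * (reward + gamma * bootstrap)
            delta = max(delta, abs(new_value - values[state]))
            values[state] = new_value  # in-place update, reused within the same sweep
        if delta < theta:
            break  # largest change in this sweep is below the tolerance
    return values


# Toy two-state chain: 0 -> 1 (reward 1.0), then 1 -> terminal (reward 2.0).
CHAIN: Dynamics = {
    0: {0: [(1.0, 1, 1.0, False)]},
    1: {0: [(1.0, 2, 2.0, True)]},
}
POLICY: Policy = {0: {0: 1.0}, 1: {0: 1.0}}

if __name__ == "__main__":
    print(evaluate_policy(CHAIN, POLICY, gamma=0.9))
```

With gamma=0.9 the value of state 1 converges to 2.0 and the value of state 0 to approximately 2.8, which is 1.0 + 0.9 * 2.0.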
- class iterative_policy_evaluation.IterativePolicyEvaluator(algo_config: DPAlgoConfig)
Implements the iterative policy evaluation algorithm.
- __init__(algo_config: DPAlgoConfig) None
Constructor. Initializes the algorithm with the given configuration instance.
- Parameters
algo_config – The algorithm configuration
- actions_before_training_begins(env: Env, **options) None
Executes any actions the algorithm needs before the iterations start.
- on_training_episode(env: Env, episode_idx: int, **options) EpisodeInfo
Trains the algorithm on a single episode.
- Parameters
env – The environment to run the training episode on
episode_idx – The episode index
options – Options that client code may pass
- Return type
An instance of EpisodeInfo
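The two hooks above suggest a call order in which a driver invokes actions_before_training_begins once and then on_training_episode once per episode. The sketch below illustrates that order using only the documented method names and signatures. The driver run_training, the Protocol EpisodeAlgorithm, and the stand-ins StubEpisodeInfo and RecordingAlgorithm are hypothetical; they are not part of this module, and the real fields of EpisodeInfo and DPAlgoConfig are not documented here.

```python
from dataclasses import dataclass
from typing import Any, List, Protocol


@dataclass
class StubEpisodeInfo:
    """Hypothetical stand-in for EpisodeInfo (real fields are not documented here)."""
    episode_idx: int


class EpisodeAlgorithm(Protocol):
    """Structural type mirroring the two documented hooks."""

    def actions_before_training_begins(self, env: Any, **options: Any) -> None: ...

    def on_training_episode(self, env: Any, episode_idx: int, **options: Any) -> Any: ...


def run_training(
    algorithm: EpisodeAlgorithm, env: Any, n_episodes: int, **options: Any
) -> List[Any]:
    """Hypothetical driver: one setup call, then one call per episode."""
    algorithm.actions_before_training_begins(env, **options)
    return [
        algorithm.on_training_episode(env, episode_idx, **options)
        for episode_idx in range(n_episodes)
    ]


class RecordingAlgorithm:
    """Toy algorithm that only records the order in which its hooks are called."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def actions_before_training_begins(self, env: Any, **options: Any) -> None:
        self.calls.append("before")

    def on_training_episode(self, env: Any, episode_idx: int, **options: Any) -> StubEpisodeInfo:
        self.calls.append(f"episode {episode_idx}")
        return StubEpisodeInfo(episode_idx=episode_idx)


if __name__ == "__main__":
    algorithm = RecordingAlgorithm()
    infos = run_training(algorithm, env=None, n_episodes=3)
    print(algorithm.calls)
    print(infos)
```

Under the assumption that IterativePolicyEvaluator is used the same way, it could be passed to such a driver in place of RecordingAlgorithm, together with a real environment.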