td_algorithm_base
- class td_algorithm_base.TDAlgoConfig(n_episodes: int = 0, tolerance: float = 1e-08, render_env: bool = False, render_env_freq: int = -1, gamma: float = 1.0, alpha: float = 0.1, policy: Optional[Policy] = None, n_itrs_per_episode: int = 100)
Configuration class for TD-like algorithms
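As a rough illustration of the defaults listed in the signature above, the sketch below mirrors the documented fields in a stand-in dataclass. `TDAlgoConfigSketch` is a hypothetical name, and whether the real `TDAlgoConfig` is a dataclass is an assumption; only the field names and default values are taken from the signature.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TDAlgoConfigSketch:
    """Hypothetical stand-in mirroring the documented TDAlgoConfig fields."""

    n_episodes: int = 0
    tolerance: float = 1e-08
    render_env: bool = False
    render_env_freq: int = -1
    gamma: float = 1.0
    alpha: float = 0.1
    policy: Optional[Any] = None  # the documented type is Optional[Policy]
    n_itrs_per_episode: int = 100


# Override only the fields that differ from the documented defaults.
# n_episodes defaults to 0, so a caller would presumably set it explicitly.
config = TDAlgoConfigSketch(n_episodes=500, gamma=0.99, alpha=0.05)
```

Fields that are not passed keep the documented defaults, for example `tolerance` stays at `1e-08` and `render_env_freq` at `-1`.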
- class td_algorithm_base.TDAlgoBase(algo_config: TDAlgoConfig)
Base class for temporal-difference (TD) algorithms
- __init__(algo_config: TDAlgoConfig)
Constructor. Initializes the agent with the given configuration instance
- Parameters
algo_config (The configuration of the agent) –
- actions_after_training_ends(env: Env, **options) None
Execute any actions the algorithm needs after the iterations are finished
- Parameters
env (The environment to train on) –
options (Any options passed by the client code) –
- Return type
None
- actions_before_training_begins(env: Env, **options) None
Execute any actions the algorithm needs before training begins
- Parameters
env (The environment to train on) –
options (Any options passed by the client code) –
- Return type
None
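The two hooks documented above can be pictured as bracketing the training loop. The following is a minimal sketch under stated assumptions: `TDAlgoBaseSketch` and the `train` driver are hypothetical stand-ins, not the library's classes, and the point at which the real trainer calls each hook is assumed from the method names and descriptions.

```python
class TDAlgoBaseSketch:
    """Hypothetical stand-in exposing the two documented hook signatures."""

    def __init__(self, algo_config):
        self.config = algo_config
        self.events = []  # records the call order, for illustration only

    def actions_before_training_begins(self, env, **options) -> None:
        # A real algorithm might reset the environment or allocate tables here.
        self.events.append("before")

    def actions_after_training_ends(self, env, **options) -> None:
        # A real algorithm might derive a final policy or release resources here.
        self.events.append("after")


def train(algo, env, n_episodes, **options):
    """Hypothetical driver: call the hooks around a placeholder episode loop."""
    algo.actions_before_training_begins(env, **options)
    for episode in range(n_episodes):
        algo.events.append(f"episode {episode}")  # placeholder for one episode
    algo.actions_after_training_ends(env, **options)
    return algo.events


# env=None and algo_config=None are placeholders; the sketch never touches them.
events = train(TDAlgoBaseSketch(algo_config=None), env=None, n_episodes=2)
```

Under these assumptions, each hook runs exactly once per training run, and any `**options` supplied by the client code are forwarded unchanged to both.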