src.runners package
Submodules
src.runners.ModelFreeRunner module
Generalized runner for single-process RL. Takes in encoded observations, adds them to a replay buffer, and trains the agent.
- class src.runners.ModelFreeRunner.ModelFreeRunner(agent_config_path: str, buffer_config_path: str, encoder_config_path: str, model_save_dir: str, experiment_name: str, experiment_state_path: str, num_test_episodes: int, num_run_episodes: int, save_every_nth_episode: int, update_model_after: int, update_model_every: int, eval_every: int, max_episode_length: int, resume_training: bool = False, use_container: bool = True)
Bases: BaseRunner
Main configurable runner.
Initialize ModelFreeRunner.
- Parameters
agent_config_path (str) – Path to agent configuration YAML.
buffer_config_path (str) – Path to replay buffer configuration YAML.
encoder_config_path (str) – Path to encoder configuration YAML.
model_save_dir (str) – Path to save model
experiment_name (str) – Experiment name in WandB
experiment_state_path (str) – Path to save experiment state for resuming.
num_test_episodes (int) – Number of test episodes
num_run_episodes (int) – Number of training episodes
save_every_nth_episode (int) – Save a training checkpoint every N episodes.
update_model_after (int) – Number of training steps to complete before model updates begin.
update_model_every (int) – Update the model every N training steps, for N update steps each time.
eval_every (int) – Evaluate every N episodes.
max_episode_length (int) – Maximum episode length (flagged in the source as a bad, buggy parameter).
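resume_training (bool, optional) – Whether to resume training from the saved experiment state. Defaults to False.
use_container (bool, optional) – Whether to use the provided wrapper (HIGHLY ENCOURAGED). Defaults to True.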
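The two update parameters can be read as a warm-up followed by periodic bursts of updates. This reading is borrowed from common off-policy training loops and is an assumption, not something the documentation confirms. A minimal sketch under that assumption follows; updates_due is a hypothetical helper, not part of ModelFreeRunner:

```python
def updates_due(step: int, update_model_after: int, update_model_every: int) -> int:
    """Return how many gradient updates to run at this training step.

    Assumed schedule (not confirmed by the runner's source): nothing before
    `update_model_after` steps, then `update_model_every` updates each time
    the step counter is a multiple of `update_model_every`.
    """
    if step < update_model_after:
        return 0  # still collecting warm-up experience
    if step % update_model_every != 0:
        return 0  # between update bursts
    return update_model_every


# Steps at which updates would fire during the first 200 steps of a toy run.
schedule = [
    s for s in range(200)
    if updates_due(s, update_model_after=100, update_model_every=50) > 0
]
print(schedule)
```

Under this reading the ratio of gradient updates to environment steps stays at roughly one after warm-up, whatever value update_model_every takes.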
- checkpoint_model(ep_ret, ep_number)
Conditionally save a checkpoint of the model if return is larger than best, or if self.save_every_nth_episode episodes have passed.
- Parameters
ep_ret (float) – Current return
ep_number (int) – Current episode number
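The checkpoint condition described above combines two triggers: a new best return, or a periodic save. The sketch below shows one way to express it. The function name should_checkpoint, the explicit best_ret argument, and the use of a modulo test for "episodes have passed" are all assumptions made for illustration; the runner itself may track these differently.

```python
def should_checkpoint(ep_ret: float, ep_number: int, best_ret: float,
                      save_every_nth_episode: int) -> bool:
    """Decide whether to save, given the current return and episode number."""
    improved = ep_ret > best_ret                          # new best return
    periodic = ep_number % save_every_nth_episode == 0    # assumed periodic rule
    return improved or periodic


# Toy run: episode numbers start at 1, periodic save every 5 episodes.
returns = [1.0, 0.5, 2.0, 1.5, 1.0, 0.2]
best_ret = float("-inf")
saved_episodes = []
for ep_number, ep_ret in enumerate(returns, start=1):
    if should_checkpoint(ep_ret, ep_number, best_ret, save_every_nth_episode=5):
        saved_episodes.append(ep_number)
    best_ret = max(best_ret, ep_ret)  # update the best return after the check
print(saved_episodes)
```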
- eval(env)
Evaluate model on the evaluation environment, using a deterministic agent if possible.
- Parameters
env (gym.env) – Some gym-compliant environment.
- Returns
The max reward for each test session.
- Return type
float
- classmethod instantiate_from_config(config_file_location)
Initialize class from config file
- Parameters
config_file_location (path) – Path to config file
- Raises
ValueError – Error loading file
- Returns
object from class.
- Return type
cls
- classmethod instantiate_from_config_dict(config)
Initialize class from config dictionary
- Parameters
config (dictionary) – Configuration dictionary from which to create the instance.
- Returns
Object from class and config.
- Return type
cls
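A classmethod constructor of this kind is often a thin wrapper that unpacks the dictionary into the initializer. The sketch below shows that pattern on a small stand-in class. TinyRunner and its three fields are hypothetical, and the real method may validate the dictionary against the schema before constructing the object.

```python
from dataclasses import dataclass


@dataclass
class TinyRunner:
    """Hypothetical stand-in with a few of the runner's fields."""
    experiment_name: str
    num_run_episodes: int
    resume_training: bool = False  # optional, mirrors the documented default

    @classmethod
    def instantiate_from_config_dict(cls, config: dict) -> "TinyRunner":
        # Unpack the config keys as keyword arguments to __init__.
        return cls(**config)


runner = TinyRunner.instantiate_from_config_dict(
    {"experiment_name": "demo", "num_run_episodes": 10}
)
print(runner)
```

Because the keys are passed as keyword arguments, a misspelled or unexpected key raises a TypeError at construction time in this sketch.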
- run(env, api_key: str = '')
Train an agent, with our given parameters, on the environment in question.
- Parameters
env (gym.env) – Some gym-compliant environment, preferably wrapped using a wrapper.
api_key (str, optional) – Wandb API key for logging. Defaults to ‘’.
- save_experiment_state(ep_number)
Save running variables for experiment state resuming.
- Parameters
ep_number (int) – Current episode number
- Raises
Exception – Must specify json file name with experiment state path.
Exception – Must specify experiment state path
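The two exceptions suggest that the state path is checked before anything is written. A rough sketch of that behaviour follows, using the same messages. The order of the checks, the ".json" suffix test, and the fields stored in the state dictionary are assumptions; the documentation does not list what the runner actually saves.

```python
import json
import os
import tempfile


def save_experiment_state(experiment_state_path: str, state: dict) -> None:
    """Write `state` as JSON after validating the target path (sketch only)."""
    if not experiment_state_path:
        raise Exception("Must specify experiment state path")
    if not experiment_state_path.endswith(".json"):
        raise Exception("Must specify json file name with experiment state path.")
    with open(experiment_state_path, "w") as f:
        json.dump(state, f)


# Round trip through a temporary directory; the state keys are hypothetical.
state_path = os.path.join(tempfile.mkdtemp(), "experiment_state.json")
save_experiment_state(state_path, {"ep_number": 12, "best_ret": 3.5})
with open(state_path) as f:
    restored = json.load(f)
print(restored)
```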
- schema = Map({'agent_config_path': Str(), 'buffer_config_path': Str(), 'encoder_config_path': Str(), 'model_save_dir': Str(), 'experiment_name': Str(), 'experiment_state_path': Str(), 'num_test_episodes': Int(), 'num_run_episodes': Int(), 'save_every_nth_episode': Int(), 'update_model_after': Int(), 'update_model_every': Int(), 'eval_every': Int(), 'max_episode_length': Int(), Optional("resume_training"): Bool(), Optional("use_container"): Bool()})
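The schema above lists thirteen required keys and two optional booleans. As an illustration of a configuration that would satisfy it, the sketch below builds a matching dictionary and checks it with a plain-Python stand-in for the schema. The validate helper, the file paths, and all numeric values are hypothetical placeholders, and the real runner validates with its own schema object rather than this function.

```python
REQUIRED = {
    "agent_config_path": str, "buffer_config_path": str,
    "encoder_config_path": str, "model_save_dir": str,
    "experiment_name": str, "experiment_state_path": str,
    "num_test_episodes": int, "num_run_episodes": int,
    "save_every_nth_episode": int, "update_model_after": int,
    "update_model_every": int, "eval_every": int, "max_episode_length": int,
}
OPTIONAL = {"resume_training": bool, "use_container": bool}


def validate(config: dict) -> list:
    """Return a list of problems; an empty list means the config matches."""
    problems = []
    for key, expected in REQUIRED.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif type(config[key]) is not expected:
            problems.append(f"wrong type for {key}")
    for key, value in config.items():
        if key in OPTIONAL:
            if type(value) is not OPTIONAL[key]:
                problems.append(f"wrong type for {key}")
        elif key not in REQUIRED:
            problems.append(f"unknown key: {key}")
    return problems


# Placeholder values only; none of these paths or numbers come from the project.
example_config = {
    "agent_config_path": "configs/agent.yaml",
    "buffer_config_path": "configs/buffer.yaml",
    "encoder_config_path": "configs/encoder.yaml",
    "model_save_dir": "checkpoints/",
    "experiment_name": "demo-run",
    "experiment_state_path": "checkpoints/experiment_state.json",
    "num_test_episodes": 5, "num_run_episodes": 100,
    "save_every_nth_episode": 10, "update_model_after": 1000,
    "update_model_every": 50, "eval_every": 10, "max_episode_length": 500,
    "resume_training": False,
}
print(validate(example_config))
```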
src.runners.base module
Base Runner. Inherit from here, and respect the protocol.
Module contents
Runners for different training systems.