src.agents package
Subpackages
Submodules
src.agents.PETSAgent module
- class src.agents.PETSAgent.PETSAgent(network_config_path: str, planner_config: ConfigurableDict, n_ensembles: int = 7, lr: float = 0.01, model_save_path: str = '/mnt/blah', load_checkpoint: bool = False, deterministic: bool = False)
Bases: BaseAgent
Adapted from https://github.com/BY571/PETS-MPC. Currently uses random action sampling (AS) rather than CEM.
Initialize PETS Agent
- Parameters
network_config_path (str) – Path to network config.
planner_config (ConfigurableDict) – Configuration of planner
n_ensembles (int, optional) – Number of networks in ensemble. Defaults to 7.
lr (float, optional) – Learning rate. Defaults to 1e-2.
model_save_path (str, optional) – Model save path (unused). Defaults to ‘/mnt/blah’.
load_checkpoint (bool, optional) – Whether to load checkpoint or not (unused). Defaults to False.
deterministic (bool, optional) – Whether to act deterministically (unused). Defaults to False.
- classmethod instantiate_from_config(config_file_location)
Initialize class from config file
- Parameters
config_file_location (path) – Path to config file
- Raises
ValueError – Error loading file
- Returns
object from class.
- Return type
cls
- classmethod instantiate_from_config_dict(config)
Initialize class from config dictionary
- Parameters
config (dictionary) – Configuration dictionary used to create the instance.
- Returns
Object from class and config.
- Return type
cls
- load_model(path)
Load model from path (currently unused).
- Parameters
path (str) – Path to data
- Returns
self
- Return type
model
- register_reset(obs) → array
Handle episode reset
- Parameters
obs (np.array) – Observation
- Returns
Action
- Return type
np.array
- save_model(path)
Save model to path (currently unused).
- Parameters
path (str) – Path to save the model to.
- schema = Map({'network_config_path': Str(), 'planner_config': Map({'name': Str(), 'config': Any()}), Optional("n_ensembles"): Int(), Optional("lr"): Float(), Optional("model_save_path"): Str(), Optional("load_checkpoint"): Bool(), Optional("deterministic"): Bool()})
- select_action(obs) → array
Select action given obs
- Parameters
obs (np.array) – Observation
- Returns
Action
- Return type
np.array
- update(data)
Update given data
- Parameters
data (dict) – Dict of data from SimpleReplayBuffer
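The class description says the agent currently plans with random sampling instead of CEM. A minimal sketch of that general idea (random shooting over an ensemble of learned dynamics models) follows. It is not the agent's actual code: `random_shooting`, the toy `ensemble`, and the `reward` function are hypothetical names used only for illustration, and actions are assumed to lie in [-1, 1].

```python
import random


def random_shooting(state, models, reward_fn, horizon=5, n_candidates=64,
                    action_dim=2, seed=0):
    """Return the first action of the best randomly sampled action sequence."""
    rng = random.Random(seed)
    best_score, best_action = float("-inf"), None
    for _ in range(n_candidates):
        # Sample one candidate plan uniformly from [-1, 1]^action_dim per step.
        plan = [[rng.uniform(-1.0, 1.0) for _ in range(action_dim)]
                for _ in range(horizon)]
        total = 0.0
        for model in models:
            # Roll the same plan through every ensemble member.
            s = state
            for action in plan:
                s = model(s, action)
                total += reward_fn(s, action)
        # Score a plan by its return averaged over the ensemble.
        score = total / len(models)
        if score > best_score:
            best_score, best_action = score, plan[0]
    return best_action


# Hypothetical toy 1-D dynamics: seven members that disagree on the gain.
gains = (0.8, 0.9, 1.0, 1.1, 1.2, 1.0, 0.95)
ensemble = [lambda s, a, g=g: s + g * a[0] for g in gains]
reward = lambda s, a: -abs(s - 1.0)  # prefer states close to 1.0

first_action = random_shooting(0.0, ensemble, reward)
```

Only the first action of the best plan is executed; replanning at every step is what makes this a model-predictive control loop.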
src.agents.PPOAgent module
PPOAgent Definition.
- class src.agents.PPOAgent.PPOAgent(steps_to_sample_randomly: int, lr: float, clip_ratio: float, load_checkpoint_from: str = '', train_pi_iters: int = 80, train_v_iters: int = 80, target_kl: float = 0.01, actor_critic_cfg_path: str = '')
Bases: BaseAgent
Proximal Policy Optimization Agent
Initialize Proximal Policy Optimization Agent
- Parameters
steps_to_sample_randomly (int) – Number of steps to sample randomly
lr (float) – Learning rate
clip_ratio (float) – Clip ratio
load_checkpoint_from (str, optional) – Where to load checkpoint from. Using default does not load any checkpoint. Defaults to ‘’.
train_pi_iters (int, optional) – Number of update iterations for policy per call to update. Defaults to 80.
train_v_iters (int, optional) – Number of update iterations for value per call to update. Defaults to 80.
target_kl (float, optional) – Target Kullback-Leibler divergence. Defaults to 0.01.
actor_critic_cfg_path (str, optional) – Path to AC cfg. Defaults to ‘’.
- classmethod instantiate_from_config(config_file_location)
Initialize class from config file
- Parameters
config_file_location (path) – Path to config file
- Raises
ValueError – Error loading file
- Returns
object from class.
- Return type
cls
- classmethod instantiate_from_config_dict(config)
Initialize class from config dictionary
- Parameters
config (dictionary) – Configuration dictionary used to create the instance.
- Returns
Object from class and config.
- Return type
cls
- load_model(path)
Load model from path
- Parameters
path (str) – Path to load the model from.
- register_reset(obs)
Handle reset of episode.
- save_model(path)
Save model to path
- Parameters
path (str) – Path to save the model to.
- schema = Map({'steps_to_sample_randomly': Int(), 'lr': Float(), 'clip_ratio': Float(), Optional("load_checkpoint_from"): Str(), Optional("train_pi_iters"): Int(), Optional("train_v_iters"): Int(), Optional("target_kl"): Float(), Optional("actor_critic_cfg_path"): Str()})
- select_action(obs) → array
Select action given observation array.
- Parameters
obs (np.array) – Observation array
- Returns
Action array
- Return type
np.array
- update(data)
Update parameters given batch of data.
- Parameters
data (dict) – Dict of batched data to update params from.
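The roles of `clip_ratio` and `target_kl` can be illustrated with the standard PPO clipped surrogate loss and a sample-based KL estimate. This is a minimal sketch of the textbook formulation, not a copy of this class's `update`; the function names are hypothetical, and whether the agent stops policy iterations early based on `target_kl` is an assumption.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantages, clip_ratio):
    """Mean PPO-clip policy loss (to be minimized) over a batch."""
    losses = []
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(min(ratio, 1.0 + clip_ratio), 1.0 - clip_ratio)
        # Take the pessimistic (smaller) objective, negate to get a loss.
        losses.append(-min(ratio * adv, clipped * adv))
    return sum(losses) / len(losses)


def approx_kl(logp_new, logp_old):
    """Sample estimate of KL(pi_old || pi_new) from log-probabilities."""
    diffs = [lp_old - lp_new for lp_new, lp_old in zip(logp_new, logp_old)]
    return sum(diffs) / len(diffs)


# With an unchanged policy the ratio is 1, so the loss is minus the advantage.
loss_same = clipped_surrogate([0.0], [0.0], [2.0], clip_ratio=0.2)
# A doubled probability with positive advantage is clipped at 1 + clip_ratio.
loss_clipped = clipped_surrogate([math.log(2.0)], [0.0], [1.0], clip_ratio=0.2)
```

A common pattern is to run up to `train_pi_iters` gradient steps on this loss and break out of the loop once the KL estimate exceeds some multiple of `target_kl`.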
src.agents.SACAgent module
This is OpenAI's Spinning Up PyTorch implementation of Soft Actor-Critic, with minor adjustments.
Official documentation: https://spinningup.openai.com/en/latest/algorithms/sac.html#documentation-pytorch-version
Source: https://github.com/openai/spinningup/blob/master/spinup/algos/pytorch/sac/sac.py
- class src.agents.SACAgent.SACAgent(steps_to_sample_randomly: int, gamma: float, alpha: float, polyak: float, lr: float, actor_critic_cfg_path: str, load_checkpoint_from: str = '')
Bases: BaseAgent
Adapted from https://github.com/learn-to-race/l2r/blob/main/l2r/baselines/rl/sac.py
Initialize Soft Actor-Critic Agent
- Parameters
steps_to_sample_randomly (int) – Number of steps to sample randomly
gamma (float) – Discount factor.
alpha (float) – Entropy regularization coefficient.
polyak (float) – Polyak averaging coefficient for target network updates.
lr (float) – Learning rate parameter.
actor_critic_cfg_path (str) – Actor Critic Config Path
load_checkpoint_from (str, optional) – Load checkpoint from path. If ‘’, then doesn’t load anything. Defaults to ‘’.
- classmethod instantiate_from_config(config_file_location)
Initialize class from config file
- Parameters
config_file_location (path) – Path to config file
- Raises
ValueError – Error loading file
- Returns
object from class.
- Return type
cls
- classmethod instantiate_from_config_dict(config)
Initialize class from config dictionary
- Parameters
config (dictionary) – Configuration dictionary used to create the instance.
- Returns
Object from class and config.
- Return type
cls
- load_model(path)
Load model from path.
- Parameters
path (str) – Load model from path.
- register_reset(obs)
Same input/output as select_action, except this method is called at episode reset.
- save_model(path)
Save model to path
- Parameters
path (str) – Save model to path
- schema = Map({'steps_to_sample_randomly': Int(), 'gamma': Float(), 'alpha': Float(), 'polyak': Float(), 'lr': Float(), 'actor_critic_cfg_path': Str(), Optional("load_checkpoint_from"): Str()})
- select_action(obs)
Select action from obs.
- Parameters
obs (np.array) – Observation to act on.
- Returns
Action object.
- Return type
ActionObj
- update(data)
Update SAC Agent given data
- Parameters
data (dict) – Data from ReplayBuffer object.
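To make the `gamma`, `alpha`, and `polyak` parameters concrete, the sketch below shows how they enter the Soft Actor-Critic update as described in the Spinning Up documentation: the entropy-regularized Bellman backup and the Polyak-averaged target network update. It operates on plain floats rather than tensors, the function names are hypothetical, and it is not taken from this class's `update` method.

```python
def sac_backup(reward, done, q1_targ, q2_targ, logp_next, gamma, alpha):
    """Entropy-regularized target for both Q-functions (clipped double-Q)."""
    soft_value = min(q1_targ, q2_targ) - alpha * logp_next
    return reward + gamma * (0.0 if done else 1.0) * soft_value


def polyak_update(target_params, online_params, polyak):
    """Move each target parameter a fraction (1 - polyak) toward the online one."""
    return [polyak * t + (1.0 - polyak) * p
            for t, p in zip(target_params, online_params)]


# Example values chosen only for illustration.
backup = sac_backup(reward=1.0, done=False, q1_targ=2.0, q2_targ=3.0,
                    logp_next=-1.0, gamma=0.9, alpha=0.5)
new_target = polyak_update([0.0, 1.0], [1.0, 1.0], polyak=0.995)
```

A `polyak` value close to 1 makes the target networks change slowly, which stabilizes the bootstrapped Q-targets.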
src.agents.base module
Base agent definition. May be out of date.
- class src.agents.base.BaseAgent(action_space=Box(-1.0, 1.0, (2,), float32))
Bases: ABC
Base Agent Definition.
Initialize Agent Space
- Parameters
action_space (gym.spaces.Box, optional) – Default action space. Defaults to default_action_space.
- default_action_space = Box(-1.0, 1.0, (2,), float32)
- load_model(path)
Load model checkpoint from path
- Parameters
path (str) – Path to checkpoint
- Raises
NotImplementedError – Need to overload
- register_reset(obs) → array
Handle reset of episode.
- Parameters
obs (np.array) – Observation
- Returns
Action
- Return type
np.array
- save_model(path)
Save model checkpoint to path
- Parameters
path (str) – Path to checkpoint
- Raises
NotImplementedError – Need to overload.
- select_action(obs) → array
Select action based on obs
- Parameters
obs (np.array) – Observation. See wrapper / env for details
- Raises
NotImplementedError – Need to implement in subclass
- Returns
Action
- Return type
np.array
- update(data)
Model update given data
- Parameters
data (dict) – Data.
- Raises
NotImplementedError – Need to overload
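The method descriptions above suggest a call order: `register_reset` at the start of an episode, `select_action` at every later step, and `update` with collected data. The sketch below illustrates that order with a stand-in class. `RecordingAgent` and `run_episode` are hypothetical and do not exist in the package, the transition dictionary keys are invented for the example, and the real training loop may differ.

```python
class RecordingAgent:
    """Hypothetical stand-in that mirrors the BaseAgent method names."""

    def __init__(self):
        self.calls = []

    def register_reset(self, obs):
        self.calls.append("register_reset")
        return [0.0, 0.0]  # action for the first step of the episode

    def select_action(self, obs):
        self.calls.append("select_action")
        return [0.0, 0.0]

    def update(self, data):
        self.calls.append("update")


def run_episode(agent, observations):
    """Feed a fixed observation sequence to the agent and collect transitions."""
    action = agent.register_reset(observations[0])
    transitions = []
    for prev_obs, obs in zip(observations, observations[1:]):
        transitions.append({"obs": prev_obs, "act": action, "obs2": obs})
        action = agent.select_action(obs)
    agent.update({"transitions": transitions})
    return transitions


agent = RecordingAgent()
episode = run_episode(agent, observations=[0.0, 0.1, 0.2])
```

A concrete agent would subclass `BaseAgent` and override the methods that raise `NotImplementedError`.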
src.agents.random_agent module
Simple Random Agent.
- class src.agents.random_agent.RandomAgent(action_space=Box(-1.0, 1.0, (2,), float32))
Bases: BaseAgent
Randomly pick actions in the space.
Initialize Agent Space
- Parameters
action_space (gym.spaces.Box, optional) – Default action space. Defaults to default_action_space.
- select_action(obs) → array
Select action through random sampling.
- Parameters
obs (np.array) – Observation (unused)
- Returns
Action
- Return type
np.array
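For the default action space `Box(-1.0, 1.0, (2,), float32)`, random selection amounts to drawing each of the two action components uniformly from [-1, 1]. The sketch below shows this with the standard library only. `sample_box` is a hypothetical helper; the real class presumably delegates to the space's own `sample()` method, which is an assumption.

```python
import random


def sample_box(low=-1.0, high=1.0, size=2, rng=None):
    """Draw one action uniformly from a box with identical bounds per dimension."""
    rng = rng or random.Random()
    return [rng.uniform(low, high) for _ in range(size)]


def select_action(obs, rng=None):
    """Ignore the observation, as the documented RandomAgent does."""
    return sample_box(rng=rng)


action = select_action(obs=None, rng=random.Random(0))
```

Because the observation is unused, such an agent is mainly useful as a baseline or for filling a replay buffer before learning starts.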
Module contents
Agent Definitions.