src.agents package

Submodules

src.agents.PETSAgent module

class src.agents.PETSAgent.PETSAgent(network_config_path: str, planner_config: ConfigurableDict, n_ensembles: int = 7, lr: float = 0.01, model_save_path: str = '/mnt/blah', load_checkpoint: bool = False, deterministic: bool = False)

Bases: BaseAgent

Adapted from https://github.com/BY571/PETS-MPC. Currently uses random action sampling (AS) for planning rather than CEM.

Initialize PETS Agent

Parameters

network_config_path (str) – Path to network config.
planner_config (ConfigurableDict) – Configuration of planner
n_ensembles (int, optional) – Number of networks in ensemble. Defaults to 7.
lr (float, optional) – Learning rate. Defaults to 1e-2.
model_save_path (str, optional) – Model save path (unused). Defaults to ‘/mnt/blah’.
load_checkpoint (bool, optional) – Whether to load checkpoint or not (unused). Defaults to False.
deterministic (bool, optional) – Whether to act deterministically (unused). Defaults to False.
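
The note above about planning with random action sampling instead of CEM can be illustrated with a minimal sketch. This is not the agent's actual planner: the function name, `horizon`, `n_candidates`, and the toy dynamics and reward are all hypothetical, and in a PETS-style agent the learned ensemble would presumably play the role of `model_step`.

```python
import random


def random_shooting_action(state, model_step, reward_fn, horizon=5,
                           n_candidates=100, action_dim=2, rng=None):
    """Return the first action of the best randomly sampled action sequence."""
    rng = rng or random.Random()
    best_return, best_first_action = float("-inf"), None
    for _ in range(n_candidates):
        # Sample one candidate plan uniformly from [-1, 1]^action_dim per step.
        plan = [[rng.uniform(-1.0, 1.0) for _ in range(action_dim)]
                for _ in range(horizon)]
        s, total = state, 0.0
        for action in plan:
            s = model_step(s, action)      # predicted next state
            total += reward_fn(s, action)  # accumulate predicted reward
        if total > best_return:
            best_return, best_first_action = total, plan[0]
    return best_first_action


# Hypothetical toy problem: a 1-D state pushed by the first action component.
def toy_step(state, action):
    return state + action[0]


def toy_reward(state, action):
    return -abs(state)  # reward is highest when the state is near zero


first_action = random_shooting_action(5.0, toy_step, toy_reward,
                                      rng=random.Random(0))
```

Only the first action of the best plan is returned, so the planner would be called again at every step, which is the usual model-predictive control pattern.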

classmethod instantiate_from_config(config_file_location)

Initialize class from config file

Parameters: config_file_location (path) – Path to config file
Raises: ValueError – Error loading file
Returns: object from class.
Return type: cls

classmethod instantiate_from_config_dict(config)

Initialize class from config dictionary

Parameters: config (dictionary) – Create instance of class from dictionary.
Returns: Object from class and config.
Return type: cls

load_model(path)

Load model from path (currently unused).

Parameters: path (str) – Path to data
Returns: self
Return type: model

register_reset(obs) → array

Handle episode reset

Parameters: obs (np.array) – Observation
Returns: Action
Return type: np.array

save_model(path)

Save model to path (currently unused).

Parameters: path (str) – Path to save the model to.

schema = Map({'network_config_path': Str(), 'planner_config': Map({'name': Str(), 'config': Any()}), Optional("n_ensembles"): Int(), Optional("lr"): Float(), Optional("model_save_path"): Str(), Optional("load_checkpoint"): Bool(), Optional("deterministic"): Bool()})
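
Read literally, the schema above requires `network_config_path` and `planner_config` (itself a map with `name` and `config`) and treats the remaining keys as optional. The following sketch checks a plain dictionary against that reading using only the standard library. It is not the validation the package performs; `check_pets_config`, the path, and the planner name are hypothetical.

```python
REQUIRED_KEYS = {"network_config_path", "planner_config"}
OPTIONAL_KEYS = {"n_ensembles", "lr", "model_save_path",
                 "load_checkpoint", "deterministic"}


def check_pets_config(config):
    """Raise ValueError if the dict does not match the documented key layout."""
    missing = REQUIRED_KEYS - config.keys()
    unknown = config.keys() - REQUIRED_KEYS - OPTIONAL_KEYS
    if missing or unknown:
        raise ValueError(f"missing={sorted(missing)}, unknown={sorted(unknown)}")
    if set(config["planner_config"]) != {"name", "config"}:
        raise ValueError("planner_config needs exactly 'name' and 'config'")
    return True


example_config = {
    "network_config_path": "path/to/network.yaml",  # hypothetical path
    "planner_config": {
        "name": "HypotheticalPlanner",              # hypothetical planner name
        "config": {"horizon": 5},                   # free-form, per Any()
    },
    "n_ensembles": 7,  # documented default
    "lr": 0.01,        # documented default
}

is_valid = check_pets_config(example_config)
```

A dictionary of this shape is what `instantiate_from_config_dict` would presumably receive.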

select_action(obs) → array

Select action given obs

Parameters: obs (np.array) – Observation
Returns: Action
Return type: np.array

update(data)

Update given data

Parameters: data (dict) – Dict of data from SimpleReplayBuffer

src.agents.PPOAgent module

PPOAgent Definition.

class src.agents.PPOAgent.PPOAgent(steps_to_sample_randomly: int, lr: float, clip_ratio: float, load_checkpoint_from: str = '', train_pi_iters: int = 80, train_v_iters: int = 80, target_kl: float = 0.01, actor_critic_cfg_path: str = '')

Bases: BaseAgent

Proximal Policy Optimization Agent

Initialize Proximal Policy Optimization Agent

Parameters

steps_to_sample_randomly (int) – Number of steps to sample randomly
lr (float) – Learning rate
clip_ratio (float) – Clip ratio
load_checkpoint_from (str, optional) – Where to load checkpoint from. Using default does not load any checkpoint. Defaults to ‘’.
train_pi_iters (int, optional) – Number of update iterations for policy per call to update. Defaults to 80.
train_v_iters (int, optional) – Number of update iterations for value per call to update. Defaults to 80.
target_kl (float, optional) – Target Kullback-Leibler divergence. Defaults to 0.01.
actor_critic_cfg_path (str, optional) – Path to AC cfg. Defaults to ‘’.
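
To show how `clip_ratio`, `target_kl`, and `train_pi_iters` typically interact in PPO, here is a minimal scalar sketch. It is not this agent's code: the function names are hypothetical, and the `1.5 * target_kl` stopping threshold is the convention used in OpenAI Spinning Up's PPO, assumed here rather than confirmed for this class.

```python
def clipped_surrogate(ratio, advantage, clip_ratio):
    """PPO clipped objective for one sample.

    ratio is pi_new(a|s) / pi_old(a|s); the objective is to be maximized.
    """
    clipped = max(1.0 - clip_ratio, min(ratio, 1.0 + clip_ratio))
    return min(ratio * advantage, clipped * advantage)


def policy_iterations_run(kl_per_iter, target_kl, train_pi_iters, kl_factor=1.5):
    """Count policy update steps taken before KL-based early stopping."""
    steps = 0
    for kl in kl_per_iter[:train_pi_iters]:
        if kl > kl_factor * target_kl:  # policy moved too far; stop updating
            break
        steps += 1
    return steps


# A large ratio with a positive advantage is capped at (1 + clip_ratio) * advantage.
capped = clipped_surrogate(1.5, 2.0, 0.2)

# With target_kl = 0.01, the third measured KL (0.02) exceeds 0.015.
n_steps = policy_iterations_run([0.001, 0.005, 0.02, 0.001], 0.01, 80)
```

The clip keeps a single update from rewarding large changes in action probability, while the KL check bounds how far the policy drifts over the whole call to `update`.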

classmethod instantiate_from_config(config_file_location)

Initialize class from config file

Parameters: config_file_location (path) – Path to config file
Raises: ValueError – Error loading file
Returns: object from class.
Return type: cls

classmethod instantiate_from_config_dict(config)

Initialize class from config dictionary

Parameters: config (dictionary) – Create instance of class from dictionary.
Returns: Object from class and config.
Return type: cls

load_model(path)

Load model from path

Parameters: path (str) – Path to load the model from.

register_reset(obs): Handle reset of episode.

save_model(path)

Save model to path

Parameters: path (str) – Path to save the model to.

schema = Map({'steps_to_sample_randomly': Int(), 'lr': Float(), 'clip_ratio': Float(), Optional("load_checkpoint_from"): Str(), Optional("train_pi_iters"): Int(), Optional("train_v_iters"): Int(), Optional("target_kl"): Float(), Optional("actor_critic_cfg_path"): Str()})

select_action(obs) → array

Select action given observation array.

Parameters: obs (np.array) – Observation array
Returns: Action array
Return type: np.array

update(data)

Update parameters given batch of data.

Parameters: data (dict) – Dict of batched data to update params from.

src.agents.SACAgent module

This is OpenAI's Spinning Up PyTorch implementation of Soft Actor-Critic with minor adjustments. For the official documentation, see https://spinningup.openai.com/en/latest/algorithms/sac.html#documentation-pytorch-version. Source: https://github.com/openai/spinningup/blob/master/spinup/algos/pytorch/sac/sac.py

class src.agents.SACAgent.SACAgent(steps_to_sample_randomly: int, gamma: float, alpha: float, polyak: float, lr: float, actor_critic_cfg_path: str, load_checkpoint_from: str = '')

Bases: BaseAgent

Adapted from https://github.com/learn-to-race/l2r/blob/main/l2r/baselines/rl/sac.py

Initialize Soft Actor-Critic Agent

Parameters

steps_to_sample_randomly (int) – Number of steps to sample randomly
gamma (float) – Discount factor.
alpha (float) – Entropy regularization coefficient.
polyak (float) – Polyak averaging coefficient for target network updates.
lr (float) – Learning rate.
actor_critic_cfg_path (str) – Actor Critic Config Path
load_checkpoint_from (str, optional) – Load checkpoint from path. If ‘’, then doesn’t load anything. Defaults to ‘’.
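
The roles of `gamma`, `alpha`, and `polyak` in the Spinning Up formulation of SAC can be sketched on plain Python numbers. This is an illustration under that assumption, not the agent's implementation; both function names are hypothetical.

```python
def soft_q_target(reward, done, gamma, alpha, min_q_next, logp_next):
    """Entropy-regularized Bellman target for one transition.

    min_q_next is the smaller of the two target Q-values at the next state,
    logp_next the log-probability of the next action under the current policy.
    """
    return reward + gamma * (1.0 - done) * (min_q_next - alpha * logp_next)


def polyak_update(target_params, source_params, polyak):
    """Move target parameters a small step towards the source parameters."""
    return [polyak * t + (1.0 - polyak) * s
            for t, s in zip(target_params, source_params)]


# Non-terminal transition: 1.0 + 0.5 * (2.0 - 0.2 * (-1.0))
target = soft_q_target(reward=1.0, done=0.0, gamma=0.5, alpha=0.2,
                       min_q_next=2.0, logp_next=-1.0)

# With polyak = 0.5 the target lands halfway between the old target and the source.
new_target_params = polyak_update([1.0, 0.0], [0.0, 1.0], polyak=0.5)
```

A `polyak` value close to 1 makes the target networks change slowly, which is the usual way to stabilize the Q-learning targets.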

classmethod instantiate_from_config(config_file_location)

Initialize class from config file

Parameters: config_file_location (path) – Path to config file
Raises: ValueError – Error loading file
Returns: object from class.
Return type: cls

classmethod instantiate_from_config_dict(config)

Initialize class from config dictionary

Parameters: config (dictionary) – Create instance of class from dictionary.
Returns: Object from class and config.
Return type: cls

load_model(path)

Load model from path.

Parameters: path (str) – Path to load the model from.

register_reset(obs): Same input/output as select_action, except this method is called at episodal reset.

save_model(path)

Save model to path

Parameters: path (str) – Path to save the model to.

schema = Map({'steps_to_sample_randomly': Int(), 'gamma': Float(), 'alpha': Float(), 'polyak': Float(), 'lr': Float(), 'actor_critic_cfg_path': Str(), Optional("load_checkpoint_from"): Str()})

select_action(obs)

Select action from obs.

Parameters: obs (np.array) – Observation to act on.
Returns: Action object.
Return type: ActionObj

update(data)

Update SAC Agent given data

Parameters: data (dict) – Data from ReplayBuffer object.

src.agents.base module

Base agent definition. May be out of date.

class src.agents.base.BaseAgent(action_space=Box(-1.0, 1.0, (2,), float32))

Bases: ABC

Base Agent Definition.

Initialize Agent Space

Parameters: action_space (gym.spaces.Box, optional) – Default action space. Defaults to default_action_space.

default_action_space = Box(-1.0, 1.0, (2,), float32)

load_model(path)

Load model checkpoint from path

Parameters: path (str) – Path to checkpoint
Raises: NotImplementedError – Need to overload

register_reset(obs) → array

Handle reset of episode.

Parameters: obs (np.array) – Observation
Returns: Action
Return type: np.array

save_model(path)

Save model checkpoint to path

Parameters: path (str) – Path to checkpoint
Raises: NotImplementedError – Need to overload.

select_action(obs) → array

Select action based on obs

Parameters: obs (np.array) – Observation. See wrapper / env for details
Raises: NotImplementedError – Need to implement in subclass
Returns: Action
Return type: np.array

update(data)

Model update given data

Parameters: data (dict) – Data.
Raises: NotImplementedError – Need to overload
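
The interface documented above suggests the following usage pattern: subclass the base class, override `select_action` and `update`, and drive the agent with `register_reset` at the start of an episode. The sketch below uses a hypothetical stand-in class so that it runs without the package installed. `BaseAgentSketch`, `ZeroAgent`, and `run_episode` are illustrative names, and having `register_reset` delegate to `select_action` is an assumption.

```python
from abc import ABC


class BaseAgentSketch(ABC):
    """Hypothetical stand-in mirroring the documented BaseAgent methods."""

    def select_action(self, obs):
        raise NotImplementedError

    def register_reset(self, obs):
        # Assumption: a reset is handled by returning the first action.
        return self.select_action(obs)

    def update(self, data):
        raise NotImplementedError


class ZeroAgent(BaseAgentSketch):
    """Toy subclass: always returns a zero action and counts updates."""

    def __init__(self):
        self.n_updates = 0

    def select_action(self, obs):
        return [0.0, 0.0]  # matches the documented action shape (2,)

    def update(self, data):
        self.n_updates += 1


def run_episode(agent, observations):
    """Drive the agent over a fixed list of observations (no real env)."""
    actions = [agent.register_reset(observations[0])]
    for obs in observations[1:]:
        actions.append(agent.select_action(obs))
        agent.update({"obs": obs})
    return actions


agent = ZeroAgent()
actions = run_episode(agent, [[0.1], [0.2], [0.3]])
```

In a real setup the observations would come from an environment and `update` would receive batches from a replay buffer, as the subclass documentation describes.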

src.agents.random_agent module

Simple Random Agent.

class src.agents.random_agent.RandomAgent(action_space=Box(-1.0, 1.0, (2,), float32))

Bases: BaseAgent

Randomly pick actions in the space.

Initialize Agent Space

Parameters: action_space (gym.spaces.Box, optional) – Default action space. Defaults to default_action_space.

select_action(obs) → array

Select action through random sampling.

Parameters: obs (np.array) – Observation (unused)
Returns: Action
Return type: np.array

Module contents

Agent Definitions.
