src.agents package

Subpackages

Submodules

src.agents.PETSAgent module

class src.agents.PETSAgent.PETSAgent(network_config_path: str, planner_config: ConfigurableDict, n_ensembles: int = 7, lr: float = 0.01, model_save_path: str = '/mnt/blah', load_checkpoint: bool = False, deterministic: bool = False)

Bases: BaseAgent

Adapted from https://github.com/BY571/PETS-MPC. Planning currently uses random AS rather than CEM.

Initialize PETS Agent

Parameters
  • network_config_path (str) – Path to network config.

  • planner_config (ConfigurableDict) – Configuration of planner

  • n_ensembles (int, optional) – Number of networks in ensemble. Defaults to 7.

  • lr (float, optional) – Learning rate. Defaults to 0.01.

  • model_save_path (str, optional) – Model save path (unused). Defaults to ‘/mnt/blah’.

  • load_checkpoint (bool, optional) – Whether to load checkpoint or not (unused). Defaults to False.

  • deterministic (bool, optional) – Whether to act deterministically (unused). Defaults to False.
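The class description contrasts CEM with "random AS". If that means random action sampling (often called random shooting), the planner could look roughly like the sketch below. Everything in it is hypothetical and not taken from the package: the function name, the toy dynamics models, the reward function, and the assumption that candidate sequences are scored by averaging over the n_ensembles models.

```python
import random


def random_shooting(state, models, reward_fn, horizon=5, n_candidates=64, rng=None):
    """Return the first action of the best randomly sampled action sequence.

    Hypothetical helper: `models` is a list of callables (state, action) -> next_state
    standing in for the ensemble of learned dynamics networks.
    """
    if rng is None:
        rng = random.Random(0)
    best_return, best_action = float("-inf"), None
    for _ in range(n_candidates):
        # Sample one candidate sequence of 2-D actions in [-1, 1].
        seq = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(horizon)]
        total = 0.0
        # Roll the sequence out under every ensemble member and sum the rewards.
        for model in models:
            s = state
            for a in seq:
                s = model(s, a)
                total += reward_fn(s, a)
        avg_return = total / len(models)
        if avg_return > best_return:
            best_return, best_action = avg_return, seq[0]
    return best_action


# Toy ensemble: three slightly different 1-D dynamics models (assumed, for illustration).
toy_models = [lambda s, a, k=k: s + k * a[0] for k in (0.9, 1.0, 1.1)]
action = random_shooting(1.0, toy_models, lambda s, a: -abs(s))
```

Only the first action of the winning sequence is returned, which is the usual receding-horizon (MPC) pattern: the planner is re-run at every step with the newest observation.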

classmethod instantiate_from_config(config_file_location)

Initialize class from config file

Parameters

config_file_location (path) – Path to config file

Raises

ValueError – Error loading file

Returns

object from class.

Return type

cls

classmethod instantiate_from_config_dict(config)

Initialize class from config dictionary

Parameters

config (dictionary) – Create instance of class from dictionary.

Returns

Object from class and config.

Return type

cls
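As a rough illustration, a dictionary for instantiate_from_config_dict might look like the sketch below. It assumes the dictionary carries the keys listed in the class schema directly. The values, the file path, and the planner name are hypothetical, and check_keys is an illustrative helper that is not part of the package.

```python
# Hypothetical values; only the key names come from the documented schema.
pets_config = {
    "network_config_path": "configs/pets_network.yaml",  # assumed path
    "planner_config": {"name": "RandomPlanner", "config": {}},  # assumed planner name
    "n_ensembles": 7,
    "lr": 0.01,
}

# Required and optional keys as listed in PETSAgent.schema.
REQUIRED_KEYS = {"network_config_path", "planner_config"}
OPTIONAL_KEYS = {"n_ensembles", "lr", "model_save_path", "load_checkpoint", "deterministic"}


def check_keys(config):
    """Return (missing required keys, unknown keys) for a config dict."""
    keys = set(config)
    missing = REQUIRED_KEYS - keys
    unknown = keys - REQUIRED_KEYS - OPTIONAL_KEYS
    return sorted(missing), sorted(unknown)


missing, unknown = check_keys(pets_config)

# With the package installed, the call itself would be:
# agent = PETSAgent.instantiate_from_config_dict(pets_config)
```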

load_model(path)

Load model from path (currently unused).

Parameters

path (str) – Path to data

Returns

self

Return type

model

register_reset(obs) → array

Handle episode reset

Parameters

obs (np.array) – Observation

Returns

Action

Return type

np.array

save_model(path)

Save model to path (currently unused).

Parameters

path (str) – Path to save the model to

schema = Map({'network_config_path': Str(), 'planner_config': Map({'name': Str(), 'config': Any()}), Optional("n_ensembles"): Int(), Optional("lr"): Float(), Optional("model_save_path"): Str(), Optional("load_checkpoint"): Bool(), Optional("deterministic"): Bool()})

select_action(obs) → array

Select action given obs

Parameters

obs (np.array) – Observation

Returns

Action

Return type

np.array

update(data)

Update given data

Parameters

data (dict) – Dict of data from SimpleReplayBuffer

src.agents.PPOAgent module

PPOAgent Definition.

class src.agents.PPOAgent.PPOAgent(steps_to_sample_randomly: int, lr: float, clip_ratio: float, load_checkpoint_from: str = '', train_pi_iters: int = 80, train_v_iters: int = 80, target_kl: float = 0.01, actor_critic_cfg_path: str = '')

Bases: BaseAgent

Proximal Policy Optimization Agent

Initialize Proximal Policy Optimization Agent

Parameters
  • steps_to_sample_randomly (int) – Number of steps to sample randomly

  • lr (float) – Learning rate

  • clip_ratio (float) – Clip ratio

  • load_checkpoint_from (str, optional) – Where to load checkpoint from. Using default does not load any checkpoint. Defaults to ‘’.

  • train_pi_iters (int, optional) – Number of update iterations for policy per call to update. Defaults to 80.

  • train_v_iters (int, optional) – Number of update iterations for value per call to update. Defaults to 80.

  • target_kl (float, optional) – Target Kullback-Leibler divergence. Defaults to 0.01.

  • actor_critic_cfg_path (str, optional) – Path to AC cfg. Defaults to ‘’.
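To make clip_ratio and target_kl concrete, here is a minimal sketch of the standard PPO clipped surrogate for a single sample, plus a KL-based early-stopping check. Both function names are hypothetical. The 1.5 factor follows OpenAI Spinning Up's PPO; whether this agent uses the same rule is an assumption.

```python
def clipped_surrogate(ratio, advantage, clip_ratio):
    """PPO clipped objective for one sample: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = max(1.0 - clip_ratio, min(ratio, 1.0 + clip_ratio))
    return min(ratio * advantage, clipped * advantage)


def should_stop_early(approx_kl, target_kl, factor=1.5):
    """Stop policy updates once the approximate KL exceeds factor * target_kl (assumed rule)."""
    return approx_kl > factor * target_kl


# A probability ratio of 1.5 with a positive advantage is clipped at 1 + clip_ratio.
objective = clipped_surrogate(ratio=1.5, advantage=1.0, clip_ratio=0.2)
```

Under this reading, train_pi_iters is an upper bound on policy iterations per update call, and the KL check can end the loop sooner.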

classmethod instantiate_from_config(config_file_location)

Initialize class from config file

Parameters

config_file_location (path) – Path to config file

Raises

ValueError – Error loading file

Returns

object from class.

Return type

cls

classmethod instantiate_from_config_dict(config)

Initialize class from config dictionary

Parameters

config (dictionary) – Create instance of class from dictionary.

Returns

Object from class and config.

Return type

cls

load_model(path)

Load model from path

Parameters

path (str) – Load path using str

register_reset(obs)

Handle reset of episode.

save_model(path)

Save model to path

Parameters

path (str) – Save path using str

schema = Map({'steps_to_sample_randomly': Int(), 'lr': Float(), 'clip_ratio': Float(), Optional("load_checkpoint_from"): Str(), Optional("train_pi_iters"): Int(), Optional("train_v_iters"): Int(), Optional("target_kl"): Float(), Optional("actor_critic_cfg_path"): Str()})

select_action(obs) → array

Select action given observation array.

Parameters

obs (np.array) – Observation array

Returns

Action array

Return type

np.array

update(data)

Update parameters given batch of data.

Parameters

data (dict) – Dict of batched data to update params from.

src.agents.SACAgent module

This is OpenAI's Spinning Up PyTorch implementation of Soft Actor-Critic, with minor adjustments. For the official documentation, see https://spinningup.openai.com/en/latest/algorithms/sac.html#documentation-pytorch-version. Source: https://github.com/openai/spinningup/blob/master/spinup/algos/pytorch/sac/sac.py

class src.agents.SACAgent.SACAgent(steps_to_sample_randomly: int, gamma: float, alpha: float, polyak: float, lr: float, actor_critic_cfg_path: str, load_checkpoint_from: str = '')

Bases: BaseAgent

Adapted from https://github.com/learn-to-race/l2r/blob/main/l2r/baselines/rl/sac.py

Initialize Soft Actor-Critic Agent

Parameters
  • steps_to_sample_randomly (int) – Number of steps to sample randomly

  • gamma (float) – Discount factor (gamma).

  • alpha (float) – Entropy regularization coefficient (alpha).

  • polyak (float) – Polyak averaging coefficient for target network updates.

  • lr (float) – Learning rate parameter.

  • actor_critic_cfg_path (str) – Actor Critic Config Path

  • load_checkpoint_from (str, optional) – Load checkpoint from path. If ‘’, then doesn’t load anything. Defaults to ‘’.
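The roles of gamma, alpha, and polyak can be sketched with the update rules from the Spinning Up SAC implementation that this module is based on. The function names below are hypothetical and the code works on plain floats rather than tensors. It assumes this agent keeps Spinning Up's conventions.

```python
def soft_bellman_target(reward, done, q1_targ, q2_targ, logp_next, gamma, alpha):
    """Entropy-regularized target: r + gamma * (1 - d) * (min(Q1, Q2) - alpha * log pi)."""
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * (min(q1_targ, q2_targ) - alpha * logp_next)


def polyak_update(target_params, params, polyak):
    """Move each target parameter toward its online counterpart.

    Spinning Up convention: target <- polyak * target + (1 - polyak) * param,
    so a polyak value close to 1 means slowly changing target networks.
    """
    return [polyak * t + (1.0 - polyak) * p for t, p in zip(target_params, params)]


backup = soft_bellman_target(1.0, False, 2.0, 3.0, -0.5, gamma=0.9, alpha=0.2)
new_target = polyak_update([0.0, 1.0], [1.0, 1.0], polyak=0.995)
```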

classmethod instantiate_from_config(config_file_location)

Initialize class from config file

Parameters

config_file_location (path) – Path to config file

Raises

ValueError – Error loading file

Returns

object from class.

Return type

cls

classmethod instantiate_from_config_dict(config)

Initialize class from config dictionary

Parameters

config (dictionary) – Create instance of class from dictionary.

Returns

Object from class and config.

Return type

cls

load_model(path)

Load model from path.

Parameters

path (str) – Load model from path.

register_reset(obs)

Same input/output as select_action, except that this method is called at episode reset.

save_model(path)

Save model to path

Parameters

path (str) – Save model to path

schema = Map({'steps_to_sample_randomly': Int(), 'gamma': Float(), 'alpha': Float(), 'polyak': Float(), 'lr': Float(), 'actor_critic_cfg_path': Str(), Optional("load_checkpoint_from"): Str()})

select_action(obs)

Select action from obs.

Parameters

obs (np.array) – Observation to act on.

Returns

Action object.

Return type

ActionObj

update(data)

Update SAC Agent given data

Parameters

data (dict) – Data from ReplayBuffer object.

src.agents.base module

Base agent definition. May be out of date.

class src.agents.base.BaseAgent(action_space=Box(-1.0, 1.0, (2,), float32))

Bases: ABC

Base Agent Definition.

Initialize Agent Space

Parameters

action_space (gym.spaces.Box, optional) – Default action space. Defaults to default_action_space.

default_action_space = Box(-1.0, 1.0, (2,), float32)

load_model(path)

Load model checkpoint from path

Parameters

path (str) – Path to checkpoint

Raises

NotImplementedError – Need to overload

register_reset(obs) → array

Handle reset of episode.

Parameters

obs (np.array) – Observation

Returns

Action

Return type

np.array

save_model(path)

Save model checkpoint to path

Parameters

path (str) – Path to checkpoint

Raises

NotImplementedError – Need to overload.

select_action(obs) → array

Select action based on obs

Parameters

obs (np.array) – Observation. See wrapper / env for details

Raises

NotImplementedError – Need to implement in subclass

Returns

Action

Return type

np.array

update(data)

Model update given data

Parameters

data (dict) – Data.

Raises

NotImplementedError – Need to overload
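The sketch below shows how a subclass and a driver loop might use this interface. SketchBaseAgent is a stand-in that only mirrors the documented method names, not the real class, and ZeroAgent and run_episode are hypothetical. Having register_reset delegate to select_action is also an assumption, since the documentation only says that both return an action.

```python
from abc import ABC


class SketchBaseAgent(ABC):
    """Stand-in mirroring the documented BaseAgent methods (not the real class)."""

    def select_action(self, obs):
        raise NotImplementedError

    def register_reset(self, obs):
        # Assumed default: treat the first observation like any other.
        return self.select_action(obs)

    def update(self, data):
        raise NotImplementedError


class ZeroAgent(SketchBaseAgent):
    """Toy subclass that always returns a zero action of length 2."""

    def __init__(self):
        self.updates = 0

    def select_action(self, obs):
        return [0.0, 0.0]

    def update(self, data):
        self.updates += 1


def run_episode(agent, observations):
    """Drive an agent through one episode given a list of observations."""
    actions = [agent.register_reset(observations[0])]
    for obs in observations[1:]:
        actions.append(agent.select_action(obs))
    agent.update({"obs": observations, "act": actions})
    return actions


agent = ZeroAgent()
actions = run_episode(agent, [[0.0], [1.0], [2.0]])
```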

src.agents.random_agent module

Simple Random Agent.

class src.agents.random_agent.RandomAgent(action_space=Box(-1.0, 1.0, (2,), float32))

Bases: BaseAgent

Randomly pick actions in the space.

Initialize Agent Space

Parameters

action_space (gym.spaces.Box, optional) – Default action space. Defaults to default_action_space.

select_action(obs) → array

Select an action through random sampling.

Parameters

obs (np.array) – Observation (unused)

Returns

Action

Return type

np.array
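For illustration, uniform sampling over the default action space Box(-1.0, 1.0, (2,), float32) can be approximated with the standard library as below. The helper name is hypothetical and it returns a plain list rather than an np.array. With gym available, calling action_space.sample() would be the more natural route, though whether RandomAgent does exactly that is an assumption.

```python
import random


def sample_box_action(low=-1.0, high=1.0, size=2, rng=None):
    """Draw one action uniformly from [low, high] in each of `size` dimensions."""
    if rng is None:
        rng = random.Random()
    return [rng.uniform(low, high) for _ in range(size)]


# The observation plays no role in the sampled action.
action = sample_box_action(rng=random.Random(0))
```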

Module contents

Agent Definitions.