src.networks package

Submodules

src.networks.critic module

Network definitions for all critic functions.

class src.networks.critic.ActivationType(value)

Bases: Enum

Enum that maps an activation name to the corresponding torch.nn activation class.

ReLU = <class 'torch.nn.modules.activation.ReLU'>
Tanh = <class 'torch.nn.modules.activation.Tanh'>
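
A minimal sketch of how an enum like this can turn the activation string accepted by ActorCritic (for example 'ReLU') into a layer. The enum is re-declared here so the snippet is self-contained, and resolve_activation is a hypothetical helper, not part of src.networks.critic.

```python
from enum import Enum

import torch.nn as nn


class ActivationType(Enum):
    """Local copy of the enum documented above, for illustration only."""
    ReLU = nn.ReLU
    Tanh = nn.Tanh


def resolve_activation(name: str):
    """Hypothetical helper: look up the torch.nn class by member name."""
    return ActivationType[name].value


activation_cls = resolve_activation("Tanh")  # the class, not an instance
activation_layer = activation_cls()          # instantiate before use in a network
```

An unknown name raises KeyError, which makes a typo in a config value fail early.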
class src.networks.critic.ActorCritic(activation: str = 'ReLU', critic_cfg: ConfigurableDict = {'config': {'state_dim': 32}, 'name': 'Qfunction'}, state_dim: int = 32, action_dim: int = 2, max_action_value: float = 1.0, speed_encoder_hiddens: List[int] = [8, 8], fusion_hiddens: List[int] = [32, 64, 64, 32, 32], use_speed: bool = True)

Bases: Module

Actor-critic container that the agent files use to set up a basic A2C model. It builds the actor and critic networks, wraps the policy (pi), and provides act to get an action from the actor network.

Initialize the observation and action dimensions and build the actor and critic networks.

act(obs_feat, deterministic=False)

Uses the policy to get and return an action on the appropriate device in the right format.

classmethod instantiate_from_config(config_file_location)

Initialize class from config file

Parameters

config_file_location (path) – Path to config file

Raises

ValueError – Error loading file

Returns

object from class.

Return type

cls

classmethod instantiate_from_config_dict(config)

Initialize class from config dictionary

Parameters

config (dictionary) – Configuration dictionary from which the instance is created.

Returns

Object from class and config.

Return type

cls

pi(obs_feat, deterministic=False)

Wrapper around the policy. Helps manage dimensions and add/remove features from the input space.

schema = Map({Optional("activation"): Str(), Optional("critic_cfg"): Map({'name': Str(), 'config': Any()}), Optional("state_dim"): Int(), Optional("action_dim"): Int(), Optional("max_action_value"): Float(), Optional("speed_encoder_hiddens"): Seq(Int()), Optional("fusion_hiddens"): Seq(Int()), Optional("use_speed"): Bool()})
training: bool
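
As an illustration of the schema above, the dictionary below uses only keys the schema lists, with values copied from the defaults in the ActorCritic signature. The SCHEMA_KEYS set and the unknown_keys helper are hypothetical and exist only for this sketch; the commented call at the end assumes the src package is importable and that instantiate_from_config_dict accepts a plain dict.

```python
# Values mirror the defaults shown in the ActorCritic signature.
actor_critic_config = {
    "activation": "ReLU",
    "critic_cfg": {"name": "Qfunction", "config": {"state_dim": 32}},
    "state_dim": 32,
    "action_dim": 2,
    "max_action_value": 1.0,
    "speed_encoder_hiddens": [8, 8],
    "fusion_hiddens": [32, 64, 64, 32, 32],
    "use_speed": True,
}

# Top-level keys named in the schema; every one of them is Optional.
SCHEMA_KEYS = {
    "activation", "critic_cfg", "state_dim", "action_dim",
    "max_action_value", "speed_encoder_hiddens", "fusion_hiddens", "use_speed",
}


def unknown_keys(config):
    """Hypothetical pre-check: list keys the schema would not recognise."""
    return sorted(set(config) - SCHEMA_KEYS)


# Assumed usage inside the project:
# model = ActorCritic.instantiate_from_config_dict(actor_critic_config)
```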
class src.networks.critic.Qfunction(state_dim: int = 32, action_dim: int = 2, speed_encoder_hiddens: List[int] = [8, 8], fusion_hiddens: List[int] = [32, 64, 64, 32, 32], use_speed: bool = True)

Bases: Module

Multimodal architecture that fuses the state, the action, and a speed embedding to regress rewards.

Initialize Q (State, Action) -> Value Regressor

Parameters
  • state_dim (int, optional) – State dimension. Defaults to 32.

  • action_dim (int, optional) – Action dimension. Defaults to 2.

  • speed_encoder_hiddens (List[int], optional) – List of hidden layer dims for the speed encoder. Defaults to [8, 8].

  • fusion_hiddens (List[int], optional) – List of hidden layer dims for the fusion section. Defaults to [32,64,64,32,32].

  • use_speed (bool, optional) – Whether to include a speed encoder or not. Defaults to True.

forward(obs_feat: Tensor, action: Tensor) Tensor

Get (s,a) value estimates

Parameters
  • obs_feat (torch.Tensor) – Input encoded and concatenated with speed (bs, dim)

  • action (torch.Tensor) – Action tensor (bs, action_dim)

Returns

value, with shape (bs,)

Return type

torch.Tensor
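
The shapes above can be illustrated with a small stand-in. This is a sketch under stated assumptions, not the project's implementation: it assumes obs_feat holds the state embedding in its first state_dim columns followed by a single speed column, that the speed encoder takes that one column as input, and that the fused vector is regressed to one value per sample. QSketch and _stack are hypothetical names.

```python
import torch
import torch.nn as nn


def _stack(sizes, final_activation=True):
    """Linear layers for consecutive sizes, with ReLU between them."""
    layers = []
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        layers.append(nn.Linear(n_in, n_out))
        if final_activation or i < len(sizes) - 2:
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)


class QSketch(nn.Module):
    """Hypothetical stand-in for Qfunction; the layer layout is assumed."""

    def __init__(self, state_dim=32, action_dim=2, speed_hiddens=(8, 8),
                 fusion_hiddens=(32, 64, 64, 32, 32), use_speed=True):
        super().__init__()
        self.state_dim = state_dim
        self.use_speed = use_speed
        fused_dim = state_dim + action_dim
        if use_speed:
            # Assumption: speed is a single scalar per sample.
            self.speed_encoder = _stack([1, *speed_hiddens])
            fused_dim += speed_hiddens[-1]
        self.regressor = _stack([fused_dim, *fusion_hiddens, 1],
                                final_activation=False)

    def forward(self, obs_feat, action):
        parts = [obs_feat[..., :self.state_dim], action]
        if self.use_speed:
            speed = obs_feat[..., self.state_dim:self.state_dim + 1]
            parts.append(self.speed_encoder(speed))
        # (bs, 1) -> (bs,)
        return self.regressor(torch.cat(parts, dim=-1)).squeeze(-1)


q = QSketch()
obs_feat = torch.randn(4, 33)   # 32 state features + 1 speed column (assumed)
action = torch.randn(4, 2)
value = q(obs_feat, action)     # one value per sample
```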

classmethod instantiate_from_config(config_file_location)

Initialize class from config file

Parameters

config_file_location (path) – Path to config file

Raises

ValueError – Error loading file

Returns

object from class.

Return type

cls

classmethod instantiate_from_config_dict(config)

Initialize class from config dictionary

Parameters

config (dictionary) – Configuration dictionary from which the instance is created.

Returns

Object from class and config.

Return type

cls

schema = Map({Optional("state_dim"): Int(), Optional("action_dim"): Int(), Optional("speed_encoder_hiddens"): Seq(Int()), Optional("fusion_hiddens"): Seq(Int()), Optional("use_speed"): Bool()})
training: bool
class src.networks.critic.SquashedGaussianMLPActor(obs_dim, act_dim, hidden_sizes, activation, act_limit)

Bases: Module

Squashed Gaussian MLP Actor.

Initialize Squashed Gaussian Actor

Parameters
  • obs_dim (int) – Observation dimension

  • act_dim (int) – Action dimension

  • hidden_sizes (list[int]) – List of hidden sizes

  • activation (nn.Module) – Activation function

  • act_limit (float) – Action limit; bounds the magnitude of the returned action.

forward(obs, deterministic=False, with_logprob=True)

Get action from obs.

Parameters
  • obs (torch.Tensor) – Observation

  • deterministic (bool, optional) – Whether to use means instead of rsample. Defaults to False.

  • with_logprob (bool, optional) – Whether to return log probability. Defaults to True.

Returns

Tuple of action, logprob

Return type

tuple
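
The usual arithmetic behind a tanh-squashed Gaussian policy can be sketched for a single action dimension with the standard library. This is an assumption about the technique the class name refers to, not a copy of the project's code: squashed_gaussian_sample and softplus are hypothetical names, and the log-probability here ignores the constant act_limit scaling.

```python
import math
import random


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def squashed_gaussian_sample(mu, log_std, act_limit, deterministic=False, rng=random):
    """Return (action, log_prob) for one action dimension."""
    std = math.exp(log_std)
    # Pre-squash sample: the mean when deterministic, otherwise a Gaussian draw.
    u = mu if deterministic else rng.gauss(mu, std)
    # Log-density of u under N(mu, std^2).
    logp = -0.5 * ((u - mu) / std) ** 2 - log_std - 0.5 * math.log(2.0 * math.pi)
    # Change of variables for tanh: subtract log(1 - tanh(u)^2),
    # written in the stable form 2 * (log 2 - u - softplus(-2u)).
    logp -= 2.0 * (math.log(2.0) - u - softplus(-2.0 * u))
    # tanh maps u into (-1, 1); act_limit rescales it to the action range.
    return act_limit * math.tanh(u), logp


action, logp = squashed_gaussian_sample(0.0, 0.0, act_limit=1.0, deterministic=True)
```

For a multi-dimensional action, the per-dimension log-probabilities would be summed.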

training: bool
class src.networks.critic.Vfunction(state_dim: int = 32, speed_encoder_hiddens: List[int] = [8, 8], fusion_hiddens: List[int] = [32, 64, 64, 32, 32], use_speed: bool = True)

Bases: Module

Multimodal architecture that fuses the state and a speed embedding to regress rewards.

Initialize V (State,) -> Value Regressor

Parameters
  • state_dim (int, optional) – State dimension. Defaults to 32.

  • speed_encoder_hiddens (List[int], optional) – List of hidden layer dims for the speed encoder. Defaults to [8, 8].

  • fusion_hiddens (List[int], optional) – List of hidden layer dims for the fusion section. Defaults to [32,64,64,32,32].

  • use_speed (bool, optional) – Whether to include a speed encoder or not. Defaults to True.

forward(obs_feat)

Get state value estimates

Parameters

obs_feat (torch.Tensor) – Input encoded and concatenated with speed (bs, dim)

Returns

value, with shape (bs,)

Return type

torch.Tensor

classmethod instantiate_from_config(config_file_location)

Initialize class from config file

Parameters

config_file_location (path) – Path to config file

Raises

ValueError – Error loading file

Returns

object from class.

Return type

cls

classmethod instantiate_from_config_dict(config)

Initialize class from config dictionary

Parameters

config (dictionary) – Configuration dictionary from which the instance is created.

Returns

Object from class and config.

Return type

cls

schema = Map({Optional("state_dim"): Int(), Optional("speed_encoder_hiddens"): Seq(Int()), Optional("fusion_hiddens"): Seq(Int()), Optional("use_speed"): Bool()})
training: bool
src.networks.critic.mlp(sizes, activation=<class 'torch.nn.modules.activation.ReLU'>, output_activation=<class 'torch.nn.modules.linear.Identity'>)

Generate MLP from inputs

Parameters
  • sizes (list[int]) – List of sizes

  • activation (nn.Module, optional) – Activation function for hidden layers. Defaults to nn.ReLU.

  • output_activation (nn.Module, optional) – Activation function for output layer. Defaults to nn.Identity.

Returns

MLP

Return type

nn.Module
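
A minimal sketch of a builder with this signature, assuming the common pattern of one nn.Linear per consecutive pair in sizes, the hidden activation after every layer except the last, and output_activation after the last. The name mlp_sketch is hypothetical and the body may differ from src.networks.critic.mlp.

```python
import torch
import torch.nn as nn


def mlp_sketch(sizes, activation=nn.ReLU, output_activation=nn.Identity):
    """Build an nn.Sequential MLP from a list of layer sizes."""
    layers = []
    for j in range(len(sizes) - 1):
        # Hidden layers use `activation`; the final layer uses `output_activation`.
        act = activation if j < len(sizes) - 2 else output_activation
        layers += [nn.Linear(sizes[j], sizes[j + 1]), act()]
    return nn.Sequential(*layers)


net = mlp_sketch([4, 8, 2])        # 4 inputs, one hidden layer of 8, 2 outputs
out = net(torch.randn(3, 4))       # batch of 3 samples
```

Note that activation and output_activation are classes, not instances; the builder instantiates them once per layer.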

src.networks.pets module

class src.networks.pets.DynamicsNetwork(state_size: int = 32, action_size: int = 2, ensemble_size: int = 7, hidden_layer: int = 3, hidden_size: int = 200)

Bases: Module

PETS (Probabilistic Ensembles with Trajectory Sampling) dynamics network. See the PETS paper for details of the architecture.

Dynamics network init.

Parameters
  • state_size (int, optional) – State size. Defaults to 32.

  • action_size (int, optional) – Action dimension. Defaults to 2.

  • ensemble_size (int, optional) – Number of models in the probabilistic ensemble. Must agree with n_ensembles in petsagent. Defaults to 7.

  • hidden_layer (int, optional) – Hidden layer count. Defaults to 3.

  • hidden_size (int, optional) – Hidden layer dimension. Defaults to 200.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod instantiate_from_config(config_file_location)

Initialize class from config file

Parameters

config_file_location (path) – Path to config file

Raises

ValueError – Error loading file

Returns

object from class.

Return type

cls

classmethod instantiate_from_config_dict(config)

Initialize class from config dictionary

Parameters

config (dictionary) – Configuration dictionary from which the instance is created.

Returns

Object from class and config.

Return type

cls

predict(states, actions, deterministic=False)
schema = Map({Optional("state_size"): Int(), Optional("action_size"): Int(), Optional("ensemble_size"): Int(), Optional("hidden_layer"): Int(), Optional("hidden_size"): Int()})
training: bool
class src.networks.pets.Ensemble_FC_Layer(in_features, out_features, ensemble_size, bias=True)

Bases: Module

Convenience layer for PETS dynamics.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x) Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
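
An ensemble fully connected layer is commonly written as one batched matrix multiply, with a separate weight matrix per ensemble member. The sketch below illustrates that idea under assumptions: the input layout (ensemble_size, batch, in_features), the initialisation, and the name EnsembleLinearSketch are hypothetical and may not match Ensemble_FC_Layer.

```python
import torch
import torch.nn as nn


class EnsembleLinearSketch(nn.Module):
    """One independent linear map per ensemble member, applied in a single bmm."""

    def __init__(self, in_features, out_features, ensemble_size, bias=True):
        super().__init__()
        # Simple scaled-normal init; the real layer may initialise differently.
        scale = in_features ** -0.5
        self.weight = nn.Parameter(
            torch.randn(ensemble_size, in_features, out_features) * scale
        )
        if bias:
            self.bias = nn.Parameter(torch.zeros(ensemble_size, 1, out_features))
        else:
            self.register_parameter("bias", None)

    def forward(self, x):
        # x: (ensemble_size, batch, in_features) -> (ensemble_size, batch, out_features)
        out = torch.bmm(x, self.weight)
        if self.bias is not None:
            out = out + self.bias  # broadcast over the batch dimension
        return out


layer = EnsembleLinearSketch(in_features=34, out_features=200, ensemble_size=7)
x = torch.randn(7, 5, 34)          # 7 members, batch of 5, state + action features
out = layer(x)
```

Because each member has its own weights, slice i of the output depends only on slice i of the input.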

Module contents

Network definitions