scope_rl.ope.weight_value_learning.function

Weight and Value Functions.

Classes

  • ContinuousQFunction – Q Function (for continuous action space).

  • ContinuousStateActionWeightFunction – State Action Weight Function (for continuous action space).

  • DiscreteQFunction – Q Function (for discrete action space).

  • DiscreteStateActionWeightFunction – State Action Weight Function (for discrete action space).

  • StateWeightFunction – State Weight Function (for both discrete and continuous action space).

  • VFunction – Value Function (for both discrete and continuous action space).

class scope_rl.ope.weight_value_learning.function.VFunction(state_dim, hidden_dim=100)

Value Function (for both discrete and continuous action space).

Bases: torch.nn.Module

Imported as: scope_rl.ope.weight_value_learning.function.VFunction

Parameters:
  • state_dim (int (> 0)) – Dimension of the state space.

  • hidden_dim (int (> 0), default=100) – Hidden dimension of the network.

class scope_rl.ope.weight_value_learning.function.StateWeightFunction(state_dim, hidden_dim=100, enable_gradient_reversal=False)

State Weight Function (for both discrete and continuous action space).

Bases: torch.nn.Module

Imported as: scope_rl.ope.weight_value_learning.function.StateWeightFunction

Parameters:
  • state_dim (int (> 0)) – Dimension of the state space.

  • hidden_dim (int (> 0), default=100) – Hidden dimension of the network.

  • enable_gradient_reversal (bool, default=False) – Whether to enable the gradient reversal layer (for loss maximization).
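The gradient reversal layer mentioned above can be illustrated with a minimal, generic PyTorch sketch. It acts as the identity in the forward pass and flips the sign of the gradient in the backward pass, so a standard optimizer step on a loss ends up maximizing that loss with respect to the parameters upstream of the layer. The class name `GradientReversal` is hypothetical and is not taken from SCOPE-RL; how the library implements this layer internally is not shown on this page.

```python
import torch


class GradientReversal(torch.autograd.Function):
    """Hypothetical sketch: identity forward, negated gradient backward."""

    @staticmethod
    def forward(ctx, x):
        # Return a view so that autograd records this operation.
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the sign of the incoming gradient.
        return -grad_output


x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = GradientReversal.apply(x)  # forward pass leaves the values unchanged
y.sum().backward()             # d(sum)/dx would be +1 without the reversal
grad_reversed = x.grad.clone()
```

With the reversal in place, `grad_reversed` holds the negated gradient of the sum, while `y` still carries the same values as `x`.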

class scope_rl.ope.weight_value_learning.function.DiscreteQFunction(n_actions, state_dim, hidden_dim=100, device='cuda:0')

Q Function (for discrete action space).

Bases: torch.nn.Module

Imported as: scope_rl.ope.weight_value_learning.function.DiscreteQFunction

Parameters:
  • n_actions (int (> 0)) – Number of actions.

  • state_dim (int (> 0)) – Dimension of the state space.

  • hidden_dim (int (> 0), default=100) – Hidden dimension of the network.

  • device (str, default="cuda:0") – Device used by torch.

Methods

all

argmax

expectation

max
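This page lists the method names without their signatures. As a rough guide only, the sketch below shows the operations that names like these conventionally denote for a discrete Q-function, applied to a plain tensor of Q-values of shape (batch, n_actions). It does not call the library's methods, and the tensors `q_values` and `pi` are made-up illustrative data.

```python
import torch

# Hypothetical Q-values for a batch of 2 states and 3 actions
# (what an "all"-style call would conventionally return).
q_values = torch.tensor([[1.0, 3.0, 2.0],
                         [0.5, 0.1, 0.4]])

# Hypothetical action probabilities of a policy, shape (batch, n_actions).
pi = torch.tensor([[0.2, 0.5, 0.3],
                   [1.0, 0.0, 0.0]])

greedy_action = q_values.argmax(dim=1)       # best action index per state
max_value = q_values.max(dim=1).values       # Q-value of the best action
expected_value = (q_values * pi).sum(dim=1)  # sum over a of pi(a|s) * Q(s, a)
```

Check the source of `DiscreteQFunction` for the actual arguments and return shapes before relying on any of these conventions.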

class scope_rl.ope.weight_value_learning.function.ContinuousQFunction(action_dim, state_dim, hidden_dim=100)

Q Function (for continuous action space).

Bases: torch.nn.Module

Imported as: scope_rl.ope.weight_value_learning.function.ContinuousQFunction

Parameters:
  • action_dim (int (> 0)) – Dimension of the action space.

  • state_dim (int (> 0)) – Dimension of the state space.

  • hidden_dim (int (> 0), default=100) – Hidden dimension of the network.
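To make the roles of `action_dim`, `state_dim`, and `hidden_dim` concrete, here is a minimal sketch of a Q-network for continuous actions. It assumes one common design, in which the state and action vectors are concatenated and passed through a small MLP. The class `TinyContinuousQ` is a hypothetical stand-in and the layer layout is an assumption; this page does not describe the architecture SCOPE-RL actually uses.

```python
import torch
from torch import nn


class TinyContinuousQ(nn.Module):
    """Hypothetical stand-in; not SCOPE-RL's implementation."""

    def __init__(self, action_dim, state_dim, hidden_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            # Input is the concatenated (state, action) vector.
            nn.Linear(state_dim + action_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, state, action):
        x = torch.cat([state, action], dim=1)
        # Drop the trailing singleton dimension: one Q-value per sample.
        return self.net(x).squeeze(-1)


q = TinyContinuousQ(action_dim=2, state_dim=4, hidden_dim=16)
state = torch.zeros(8, 4)   # batch of 8 states
action = torch.zeros(8, 2)  # batch of 8 actions
q_out = q(state, action)    # shape: (8,)
```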

class scope_rl.ope.weight_value_learning.function.DiscreteStateActionWeightFunction(n_actions, state_dim, hidden_dim=100, enable_gradient_reversal=False, device='cuda:0')

State Action Weight Function (for discrete action space).

Bases: torch.nn.Module

Imported as: scope_rl.ope.weight_value_learning.function.DiscreteStateActionWeightFunction

Parameters:
  • n_actions (int (> 0)) – Number of actions.

  • state_dim (int (> 0)) – Dimension of the state space.

  • hidden_dim (int (> 0), default=100) – Hidden dimension of the network.

  • enable_gradient_reversal (bool, default=False) – Whether to enable the gradient reversal layer (for loss maximization).

  • device (str, default="cuda:0") – Device used by torch.
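Because `device` defaults to "cuda:0" for the discrete classes, it may be worth choosing the device string explicitly on machines without a GPU. The sketch below uses only standard PyTorch calls. The commented-out constructor call follows the signature documented on this page and is left unexecuted, since it assumes SCOPE-RL is installed.

```python
import torch

# Fall back to the CPU when no CUDA device is available.
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Any tensor or module can then be placed on the chosen device.
x = torch.zeros(2, 3, device=device)

# Assumed usage, based on the signature documented above:
# from scope_rl.ope.weight_value_learning.function import (
#     DiscreteStateActionWeightFunction,
# )
# w = DiscreteStateActionWeightFunction(n_actions=3, state_dim=4, device=device)
```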

class scope_rl.ope.weight_value_learning.function.ContinuousStateActionWeightFunction(action_dim, state_dim, hidden_dim=100, enable_gradient_reversal=False)

State Action Weight Function (for continuous action space).

Bases: torch.nn.Module

Imported as: scope_rl.ope.weight_value_learning.function.ContinuousStateActionWeightFunction

Parameters:
  • action_dim (int (> 0)) – Dimension of the action space.

  • state_dim (int (> 0)) – Dimension of the state space.

  • hidden_dim (int (> 0), default=100) – Hidden dimension of the network.

  • enable_gradient_reversal (bool, default=False) – Whether to enable the gradient reversal layer (for loss maximization).