scope_rl.ope.weight_value_learning.function

Weight and Value Functions.

Classes

`ContinuousQFunction`	Q Function (for continuous action space).
`ContinuousStateActionWeightFunction`	State Action Weight Function (for continuous action space).
`DiscreteQFunction`	Q Function (for discrete action space).
`DiscreteStateActionWeightFunction`	State Action Weight Function (for discrete action space).
`StateWeightFunction`	State Weight Function (for both discrete and continuous action space).
`VFunction`	Value Function (for both discrete and continuous action space).

class scope_rl.ope.weight_value_learning.function.VFunction(state_dim, hidden_dim=100)

Value Function (for both discrete and continuous action space).

Bases: torch.nn.Module

Imported as: scope_rl.ope.weight_value_learning.function.VFunction

Parameters:

state_dim (int (> 0)) – Dimension of the state space.
hidden_dim (int (> 0), default=100) – Hidden dimension of the network.

class scope_rl.ope.weight_value_learning.function.StateWeightFunction(state_dim, hidden_dim=100, enable_gradient_reversal=False)

State Weight Function (for both discrete and continuous action space).

Bases: torch.nn.Module

Imported as: scope_rl.ope.weight_value_learning.function.StateWeightFunction

Parameters:

state_dim (int (> 0)) – Dimension of the state space.
hidden_dim (int (> 0), default=100) – Hidden dimension of the network.
enable_gradient_reversal (bool, default=False) – Whether to enable the gradient reversal layer (used for loss maximization).
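
A gradient reversal layer is commonly implemented as an identity map whose backward pass negates the gradient, so that a single minimizing optimizer effectively maximizes the loss with respect to the parameters upstream of the layer. The sketch below is a generic PyTorch illustration of that idea; the `GradientReversal` class is hypothetical and is not taken from SCOPE-RL's source.

```python
import torch


class GradientReversal(torch.autograd.Function):
    """Hypothetical layer: identity forward, negated gradient backward."""

    @staticmethod
    def forward(ctx, x):
        # view_as returns a new tensor object so autograd records this op.
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the sign of the incoming gradient.
        return -grad_output


x = torch.tensor([1.0, 2.0], requires_grad=True)
y = GradientReversal.apply(x).sum()
y.backward()
print(x.grad)  # tensor([-1., -1.])
```

Without the layer, the gradient of `sum` with respect to `x` would be all ones; with it, the sign is flipped while the forward value is unchanged.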

class scope_rl.ope.weight_value_learning.function.DiscreteQFunction(n_actions, state_dim, hidden_dim=100, device='cuda:0')

Q Function (for discrete action space).

Bases: torch.nn.Module

Imported as: scope_rl.ope.weight_value_learning.function.DiscreteQFunction

Parameters:

n_actions (int (> 0)) – Number of actions.
state_dim (int (> 0)) – Dimension of the state space.
hidden_dim (int (> 0), default=100) – Hidden dimension of the network.
device (str, default="cuda:0") – Device used by torch (e.g., "cuda:0" or "cpu").

Methods

all
argmax
expectation
max
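
This page lists the method names without signatures or descriptions. Assuming they carry the usual meaning for a discrete Q-function (an assumption, not confirmed here), the corresponding tensor operations could look like the sketch below. The tensors `q_values` and `pi` are made-up inputs, and nothing here calls SCOPE-RL itself.

```python
import torch

# Hypothetical Q-values for a batch of 2 states and 3 actions,
# i.e. the kind of table a method like `all` might return.
q_values = torch.tensor([[1.0, 3.0, 2.0],
                         [0.5, 0.1, 0.4]])

# Hypothetical evaluation-policy probabilities pi(a | s), one row per state.
pi = torch.tensor([[0.2, 0.5, 0.3],
                   [1.0, 0.0, 0.0]])

q_max = q_values.max(dim=1).values          # largest Q-value per state
greedy_action = q_values.argmax(dim=1)      # index of the greedy action per state
q_expectation = (q_values * pi).sum(dim=1)  # E_{a ~ pi}[Q(s, a)] per state
```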

class scope_rl.ope.weight_value_learning.function.ContinuousQFunction(action_dim, state_dim, hidden_dim=100)

Q Function (for continuous action space).

Bases: torch.nn.Module

Imported as: scope_rl.ope.weight_value_learning.function.ContinuousQFunction

Parameters:

action_dim (int (> 0)) – Dimension of the action space.
state_dim (int (> 0)) – Dimension of the state space.
hidden_dim (int (> 0), default=100) – Hidden dimension of the network.
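
To illustrate how the three parameters typically shape a Q-network for continuous actions, here is a minimal sketch that concatenates state and action and passes them through one hidden layer. The class `SketchContinuousQ`, its layer layout, and its `forward` signature are assumptions for illustration only; the actual architecture of `ContinuousQFunction` is not described on this page.

```python
import torch
from torch import nn


class SketchContinuousQ(nn.Module):
    """Hypothetical Q-network: maps (state, action) to a scalar value."""

    def __init__(self, action_dim: int, state_dim: int, hidden_dim: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # Concatenate along the feature axis, then drop the trailing unit dimension.
        x = torch.cat([state, action], dim=-1)
        return self.net(x).squeeze(-1)


q = SketchContinuousQ(action_dim=2, state_dim=5)
values = q(torch.zeros(4, 5), torch.zeros(4, 2))  # one value per (state, action) pair
```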

class scope_rl.ope.weight_value_learning.function.DiscreteStateActionWeightFunction(n_actions, state_dim, hidden_dim=100, enable_gradient_reversal=False, device='cuda:0')

State Action Weight Function (for discrete action space).

Bases: torch.nn.Module

Imported as: scope_rl.ope.weight_value_learning.function.DiscreteStateActionWeightFunction

Parameters:

n_actions (int (> 0)) – Number of actions.
state_dim (int (> 0)) – Dimension of the state space.
hidden_dim (int (> 0), default=100) – Hidden dimension of the network.
enable_gradient_reversal (bool, default=False) – Whether to enable the gradient reversal layer (used for loss maximization).
device (str, default="cuda:0") – Device used by torch (e.g., "cuda:0" or "cpu").
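
Because `device` defaults to "cuda:0", it is probably safest to pass the device explicitly on machines that may lack a GPU (how the class behaves without CUDA is not stated here). A common PyTorch pattern for choosing the string to pass is sketched below; the variable names are illustrative.

```python
import torch

# "cuda:0" is the documented default; fall back to CPU when CUDA is unavailable.
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Input tensors should be created on (or moved to) the same device.
states = torch.zeros(4, 5, device=device)
```

The resulting `device` string can then be supplied as the `device` argument of the discrete classes on this page.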

class scope_rl.ope.weight_value_learning.function.ContinuousStateActionWeightFunction(action_dim, state_dim, hidden_dim=100, enable_gradient_reversal=False)

State Action Weight Function (for continuous action space).

Bases: torch.nn.Module

Imported as: scope_rl.ope.weight_value_learning.function.ContinuousStateActionWeightFunction

Parameters:

action_dim (int (> 0)) – Dimension of the action space.
state_dim (int (> 0)) – Dimension of the state space.
hidden_dim (int (> 0), default=100) – Hidden dimension of the network.
enable_gradient_reversal (bool, default=False) – Whether to enable the gradient reversal layer (used for loss maximization).