scope_rl.utils

Useful tools.

Functions

`check_array`	Input validation on array.
`check_input_dict`	Check input dict keys.
`check_logged_dataset`	Check logged dataset keys.
`cosine_kernel`	Cosine kernel.
`defaultdict_to_dict`	Transform a defaultdict into a corresponding dict.
`epanechnikov_kernel`	Epanechnikov kernel.
`estimate_confidence_interval_by_bootstrap`	Estimate the confidence interval by a nonparametric bootstrap-like procedure.
`estimate_confidence_interval_by_empirical_bernstein`	Estimate the confidence interval by the empirical Bernstein inequality.
`estimate_confidence_interval_by_hoeffding`	Estimate the confidence interval by Hoeffding's inequality.
`estimate_confidence_interval_by_t_test`	Estimate the confidence interval by Student's t-test.
`gaussian_kernel`	Gaussian kernel.
`l2_distance`	Calculate L2 distance.
`triangular_kernel`	Triangular kernel.
`uniform_kernel`	Uniform kernel.

Classes

`MultipleInputDict`	This class contains paths to multiple input dictionaries for OPE and returns input_dict.
`MultipleLoggedDataset`	This class contains paths to multiple logged datasets and returns logged_dataset.
`NewGymAPIWrapper`	This class converts old gym outputs (gym<0.26.0) to the new ones (gym>=0.26.0).
`OldGymAPIWrapper`	This class converts new gym outputs (gym>=0.26.0) to the old ones (gym<0.26.0).

class scope_rl.utils.MultipleLoggedDataset(action_type, path, save_relative_path=False)

This class contains paths to multiple logged datasets and returns logged_dataset.

Parameters:

action_type ({"discrete", "continuous"}) – Type of the action space.
path (str) – Path to the directory. Either absolute or relative path is acceptable.
save_relative_path (bool, default=False) –
Whether to save a relative path. If True, a path relative to the scope-rl directory will be saved. If False, the absolute path will be saved.

Note that this option was added in order to run examples in the documentation properly. Otherwise, the default setting (False) is recommended.

Attributes:

behavior_policy_names
n_datasets

Methods

`add`(logged_dataset, behavior_policy_name)	Save logged dataset.
`get`(behavior_policy_name, dataset_id)	Load logged dataset.

add(logged_dataset, behavior_policy_name)

Save logged dataset.

Parameters:

logged_dataset (LoggedDataset) – Logged dataset to save.
behavior_policy_name (str) – Name of the behavior policy that generated the logged dataset.

get(behavior_policy_name, dataset_id)

Load logged dataset.

Parameters:

behavior_policy_name (str) – Name of the behavior policy that generated the logged dataset.
dataset_id (int) – Id of the logged dataset.

Returns:

logged_dataset – Logged dataset.

Return type:

LoggedDataset
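
As a rough illustration of the add/get contract described above, the toy class below stores each dataset on disk and keeps only file paths in memory. It is a hypothetical stand-in, not SCOPE-RL's implementation: the class name `ToyLoggedDatasetStore`, the pickle file layout, and the assumption that dataset ids are assigned per behavior policy starting from 0 are all made up for this sketch.

```python
import pickle
import tempfile
from collections import defaultdict
from pathlib import Path


class ToyLoggedDatasetStore:
    """Hypothetical stand-in: datasets live on disk, addressed by policy name and id."""

    def __init__(self, path):
        self.path = Path(path)
        self.path.mkdir(parents=True, exist_ok=True)
        # Next free dataset id per behavior policy (assumed to start at 0).
        self._next_id = defaultdict(int)

    def _file(self, behavior_policy_name, dataset_id):
        return self.path / f"{behavior_policy_name}_{dataset_id}.pkl"

    def add(self, logged_dataset, behavior_policy_name):
        # Save the dataset under the next free id for this behavior policy.
        dataset_id = self._next_id[behavior_policy_name]
        with open(self._file(behavior_policy_name, dataset_id), "wb") as f:
            pickle.dump(logged_dataset, f)
        self._next_id[behavior_policy_name] += 1

    def get(self, behavior_policy_name, dataset_id):
        # Load the dataset back from disk.
        with open(self._file(behavior_policy_name, dataset_id), "rb") as f:
            return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    store = ToyLoggedDatasetStore(tmp)
    store.add({"reward": [1.0, 0.0]}, behavior_policy_name="ddqn")
    store.add({"reward": [0.5, 0.5]}, behavior_policy_name="ddqn")
    second = store.get("ddqn", dataset_id=1)
    names = list(store._next_id)
```

Keeping only paths in memory is what lets such a container hold many large datasets at once; each one is loaded only when `get` is called.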

class scope_rl.utils.MultipleInputDict(action_type, path, save_relative_path=False)

This class contains paths to multiple input dictionaries for OPE and returns input_dict.

Parameters:

action_type ({"discrete", "continuous"}) – Type of the action space.
path (str) – Path to the directory. Either absolute or relative path is acceptable.
save_relative_path (bool, default=False) –
Whether to save a relative path. If True, a path relative to the scope-rl directory will be saved. If False, the absolute path will be saved.

Note that this option was added in order to run examples in the documentation properly. Otherwise, the default setting (False) is recommended.

Attributes:

behavior_policy_names
n_datasets
n_eval_policies: Check the number of evaluation policies of each input dict.
use_same_eval_policy_across_dataset: Check if the contained logged datasets use the same evaluation policies.

Methods

`add`(input_dict, behavior_policy_name, dataset_id)	Save input_dict.
`get`(behavior_policy_name, dataset_id)	Load input_dict.

add(input_dict, behavior_policy_name, dataset_id)

Save input_dict.

Parameters:

input_dict (OPEInputDict) – Input dictionary for OPE to save.
behavior_policy_name (str) – Name of the behavior policy that generated the logged dataset.
dataset_id (int) – Id of the logged dataset.

get(behavior_policy_name, dataset_id)

Load input_dict.

Parameters:

behavior_policy_name (str) – Name of the behavior policy that generated the logged dataset.
dataset_id (int) – Id of the logged dataset.

Returns:

input_dict – Input dictionary for OPE.

Return type:

OPEInputDict

property use_same_eval_policy_across_dataset

Check if the contained logged datasets use the same evaluation policies.

property n_eval_policies

Check the number of evaluation policies of each input dict.

scope_rl.utils.l2_distance(x, y, bandwidth=1.0)

Calculate L2 distance.

Parameters:

x (array-like of shape (n_samples, n_dim)) – Input array 1.
y (array-like of shape (n_samples, n_dim)) – Input array 2.

Returns:

distance – distance between x and y.

Return type:

ndarray of (n_samples, )
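
A minimal sketch of a row-wise L2 distance with the input and output shapes listed above. It assumes the plain Euclidean distance and ignores `bandwidth`, whose role is not documented here; the function name `l2_distance_sketch` is hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import math


def l2_distance_sketch(x, y):
    """Euclidean distance between matching rows of x and y (illustrative only)."""
    return [math.dist(xi, yi) for xi, yi in zip(x, y)]


x = [[0.0, 0.0], [1.0, 1.0]]
y = [[3.0, 4.0], [1.0, 1.0]]
distances = l2_distance_sketch(x, y)  # [5.0, 0.0]
```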

scope_rl.utils.gaussian_kernel(x, y, bandwidth=1.0)

Gaussian kernel.

Parameters:

x (array-like of shape (n_samples, n_dim)) – Input array 1.
y (array-like of shape (n_samples, n_dim)) – Input array 2.
bandwidth (float, default=1.0) – Bandwidth hyperparameter of the Gaussian kernel.

Returns:

kernel_density – kernel density of x given y.

Return type:

ndarray of (n_samples, )
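
For intuition, a Gaussian kernel over row-wise distances might look like the sketch below. It uses the textbook form, normalized as a one-dimensional Gaussian in the distance; the library's exact scaling may differ, and the name `gaussian_kernel_sketch` is hypothetical.

```python
import math


def gaussian_kernel_sketch(x, y, bandwidth=1.0):
    """Textbook Gaussian kernel on the row-wise Euclidean distance (illustrative only)."""
    norm = math.sqrt(2 * math.pi * bandwidth**2)
    out = []
    for xi, yi in zip(x, y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(xi, yi))
        out.append(math.exp(-sq_dist / (2 * bandwidth**2)) / norm)
    return out


x = [[0.0, 0.0], [0.0, 0.0]]
y = [[0.0, 0.0], [3.0, 4.0]]
densities = gaussian_kernel_sketch(x, y, bandwidth=1.0)
```

Identical rows give the peak value, and the density decays smoothly as the rows move apart; a larger `bandwidth` flattens that decay.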

scope_rl.utils.triangular_kernel(x, y, bandwidth=1.0)

Triangular kernel.

Parameters:

x (array-like of shape (n_samples, n_dim)) – Input array 1.
y (array-like of shape (n_samples, n_dim)) – Input array 2.
bandwidth (float, default=1.0) – Bandwidth hyperparameter of the triangular kernel.

Returns:

kernel_density – kernel density of x given y.

Return type:

ndarray of (n_samples, )

scope_rl.utils.epanechnikov_kernel(x, y, bandwidth=1.0)

Epanechnikov kernel.

Parameters:

x (array-like of shape (n_samples, n_dim)) – Input array 1.
y (array-like of shape (n_samples, n_dim)) – Input array 2.
bandwidth (float, default=1.0) – Bandwidth hyperparameter of the Epanechnikov kernel.

Returns:

kernel_density – kernel density of x given y.

Return type:

ndarray of (n_samples, )

scope_rl.utils.cosine_kernel(x, y, bandwidth=1.0)

Cosine kernel.

Parameters:

x (array-like of shape (n_samples, n_dim)) – Input array 1.
y (array-like of shape (n_samples, n_dim)) – Input array 2.
bandwidth (float, default=1.0) – Bandwidth hyperparameter of the cosine kernel.

Returns:

kernel_density – kernel density of x given y.

Return type:

ndarray of (n_samples, )

scope_rl.utils.uniform_kernel(x, y, bandwidth=1.0)

Uniform kernel.

Parameters:

x (array-like of shape (n_samples, n_dim)) – Input array 1.
y (array-like of shape (n_samples, n_dim)) – Input array 2.
bandwidth (float, default=1.0) – Bandwidth hyperparameter of the uniform kernel.

Returns:

kernel_density – kernel density of x given y.

Return type:

ndarray of (n_samples, )
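
The triangular, Epanechnikov, cosine, and uniform kernels all have compact support: they vanish once the scaled distance exceeds 1. The sketch below uses the standard textbook profiles applied to the row-wise Euclidean distance divided by `bandwidth`. This is an assumption about the general technique, not a copy of the library code, whose normalization may differ; `compact_kernel_sketch` and `PROFILES` are hypothetical names.

```python
import math

# Textbook kernel profiles as functions of the scaled distance u in [0, 1].
PROFILES = {
    "triangular": lambda u: 1.0 - u,
    "epanechnikov": lambda u: 0.75 * (1.0 - u**2),
    "cosine": lambda u: (math.pi / 4) * math.cos(math.pi * u / 2),
    "uniform": lambda u: 0.5,
}


def compact_kernel_sketch(name, x, y, bandwidth=1.0):
    """Evaluate a compact-support kernel row by row (illustrative only)."""
    profile = PROFILES[name]
    out = []
    for xi, yi in zip(x, y):
        u = math.dist(xi, yi) / bandwidth  # scaled distance, always >= 0
        out.append(profile(u) if u <= 1.0 else 0.0)
    return out


inside = compact_kernel_sketch("triangular", [[0.0]], [[0.5]])  # [0.5]
outside = compact_kernel_sketch("uniform", [[0.0]], [[2.0]])    # [0.0]
```

Unlike the Gaussian kernel, these kernels assign exactly zero weight to pairs farther apart than `bandwidth`, which makes the bandwidth a hard cutoff rather than a soft scale.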

scope_rl.utils.estimate_confidence_interval_by_bootstrap(samples, alpha=0.05, n_bootstrap_samples=100, random_state=None)

Estimate the confidence interval by a nonparametric bootstrap-like procedure.

Parameters:

samples (array-like) – Samples.
alpha (float, default=0.05) – Significance level. The value should be within [0, 1).
n_bootstrap_samples (int, default=100 (> 0)) – Number of resampling performed in the bootstrap procedure.
random_state (int, default=None (>= 0)) – Random state.

Returns:

estimated_confidence_interval – Dictionary storing the estimated mean and upper-lower confidence bounds.

Return type:

dict
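
As a sketch of what a nonparametric bootstrap interval for the mean involves, the example below resamples with replacement and reads the bounds off the percentiles of the resampled means. The percentile method, the function name `bootstrap_ci_sketch`, and the dictionary keys `mean`, `lower`, and `upper` are assumptions made for illustration; the library's procedure and key names may differ.

```python
import random
import statistics


def bootstrap_ci_sketch(samples, alpha=0.05, n_bootstrap_samples=100, random_state=None):
    """Percentile-bootstrap interval for the mean (illustrative only)."""
    rng = random.Random(random_state)
    n = len(samples)
    # Resample with replacement and record the mean of each resample.
    boot_means = sorted(
        statistics.fmean(rng.choices(samples, k=n))
        for _ in range(n_bootstrap_samples)
    )
    lower_idx = int((alpha / 2) * (n_bootstrap_samples - 1))
    upper_idx = int((1 - alpha / 2) * (n_bootstrap_samples - 1))
    return {
        "mean": statistics.fmean(boot_means),
        "lower": boot_means[lower_idx],  # hypothetical key name
        "upper": boot_means[upper_idx],  # hypothetical key name
    }


ci = bootstrap_ci_sketch([1.0, 2.0, 3.0, 4.0, 5.0], alpha=0.05, random_state=0)
```

Fixing `random_state` makes the resampling reproducible, so repeated calls with the same inputs return the same interval.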

scope_rl.utils.estimate_confidence_interval_by_hoeffding(samples, alpha=0.05, **kwargs)

Estimate the confidence interval by Hoeffding’s inequality.

Note

Hoeffding’s inequality provides high-probability bounds of the expectation \(\mu := \mathbb{E}[X], X \sim p(X)\) as follows.

\[|\hat{\mu} - \mu| \leq X_{\max} \sqrt{\frac{\log(1 / \alpha)}{2 n}},\]

which holds with probability at least \(1 - \alpha\), where \(n\) is the data size and \(\hat{\mu}\) is the sample mean.

Parameters:

samples (array-like) – Samples.
alpha (float, default=0.05) – Significance level. The value should be within [0, 1).

Returns:

estimated_confidence_interval – Dictionary storing the estimated mean and upper-lower confidence bounds.

Return type:

dict
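
The bound in the note can be evaluated directly. The sketch below computes the half-width of the interval exactly as the formula is written, under the assumption that \(X_{\max}\) is approximated by the largest absolute sample value; the function name `hoeffding_half_width` is hypothetical and the library may choose \(X_{\max}\) differently.

```python
import math


def hoeffding_half_width(samples, alpha=0.05):
    """Half-width X_max * sqrt(log(1/alpha) / (2n)) of the Hoeffding interval (illustrative only)."""
    n = len(samples)
    x_max = max(abs(x) for x in samples)  # assumption: X_max taken from the data
    return x_max * math.sqrt(math.log(1 / alpha) / (2 * n))


samples = [0.0, 1.0] * 50  # n = 100, X_max = 1
mean = sum(samples) / len(samples)
width = hoeffding_half_width(samples, alpha=0.05)
interval = (mean - width, mean + width)
```

Because the bound depends only on \(X_{\max}\) and \(n\), it ignores how concentrated the samples are and shrinks at the rate \(1 / \sqrt{n}\).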

scope_rl.utils.estimate_confidence_interval_by_empirical_bernstein(samples, alpha=0.05, **kwargs)

Estimate the confidence interval by the empirical Bernstein inequality.

Note

The empirical Bernstein inequality provides high-probability bounds of the expectation \(\mu := \mathbb{E}[X], X \sim p(X)\) as follows.

\[|\hat{\mu} - \mu| \leq \frac{7 X_{\max} \log(2 / \alpha)}{3 (n - 1)} + \sqrt{\frac{2 \hat{\mathbb{V}}(X) \log(2 / \alpha)}{n(n - 1)}},\]

which holds with probability at least \(1 - \alpha\), where \(n\) is the data size and \(\hat{\mathbb{V}}\) is the sample variance.

Parameters:

samples (array-like) – Samples.
alpha (float, default=0.05) – Significance level. The value should be within [0, 1).

Returns:

estimated_confidence_interval – Dictionary storing the estimated mean and upper-lower confidence bounds.

Return type:

dict
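
The following sketch evaluates the half-width term by term, following the formula in the note as written. It assumes \(X_{\max}\) is approximated by the largest absolute sample value and that \(\hat{\mathbb{V}}\) is the unbiased sample variance; the function name `empirical_bernstein_half_width` is hypothetical.

```python
import math
import statistics


def empirical_bernstein_half_width(samples, alpha=0.05):
    """Half-width of the empirical Bernstein interval as stated in the note (illustrative only)."""
    n = len(samples)
    x_max = max(abs(x) for x in samples)      # assumption: X_max taken from the data
    variance = statistics.variance(samples)   # unbiased sample variance
    log_term = math.log(2 / alpha)
    range_term = 7 * x_max * log_term / (3 * (n - 1))
    variance_term = math.sqrt(2 * variance * log_term / (n * (n - 1)))
    return range_term + variance_term


width_small = empirical_bernstein_half_width([0.0, 1.0] * 10)
width_large = empirical_bernstein_half_width([0.0, 1.0] * 1000)
```

The variance term is what distinguishes this bound from a range-only bound: low-variance samples yield a narrower interval for the same \(X_{\max}\).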

scope_rl.utils.estimate_confidence_interval_by_t_test(samples, alpha=0.05, **kwargs)

Estimate the confidence interval by Student’s t-test.

Note

Student’s t-test assumes that \(X \sim p(X)\) follows a normal distribution. Based on this assumption, the \(100 (1 - \alpha)\)% confidence interval of \(\mu := \mathbb{E}[X]\) is derived as follows.

\[|\hat{\mu} - \mu| \leq \frac{T_{\mathrm{test}}(1 - \alpha, n-1)}{\sqrt{n} / \hat{\sigma}},\]

where \(n\) is the data size, \(T_{\mathrm{test}}(\cdot,\cdot)\) is the T-value, and \(\hat{\sigma}\) is the sample standard deviation.

Parameters:

samples (NDArray) – Samples.
alpha (float, default=0.05) – Significance level. The value should be within [0, 1).

Returns:

estimated_confidence_interval – Dictionary storing the estimated mean and upper-lower confidence bounds.

Return type:

dict

scope_rl.utils.defaultdict_to_dict(dict_)

Transform a defaultdict into a corresponding dict.
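
One way such a conversion can be written is sketched below. Whether the library converts nested defaultdicts recursively is an assumption here, and `to_plain_dict` is a hypothetical name used only for this example.

```python
from collections import defaultdict


def to_plain_dict(obj):
    """Recursively replace defaultdicts with plain dicts (illustrative only)."""
    if isinstance(obj, defaultdict):
        return {key: to_plain_dict(value) for key, value in obj.items()}
    return obj


nested = defaultdict(lambda: defaultdict(list))
nested["ddqn"]["reward"].append(1.0)
plain = to_plain_dict(nested)
```

A plain dict raises `KeyError` on a missing key instead of silently creating an entry, and it can be pickled even when the defaultdict was built with a lambda factory.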

scope_rl.utils.check_array(array, name, expected_dim=1, expected_dtype=None, min_val=None, max_val=None)

Input validation on array.

Parameters:

array (object) – Input array to check.
name (str) – Name of the input array.
expected_dim (int, default=1) – Expected dimension of the input array.
expected_dtype ({type, tuple of type}, default=None) – Expected dtype of the input array.
min_val (float, default=None) – Minimum value allowed in the input array.
max_val (float, default=None) – Maximum value allowed in the input array.

scope_rl.utils.check_logged_dataset(logged_dataset)

Check logged dataset keys.

Parameters:

logged_dataset (LoggedDataset) – Logged dataset.

scope_rl.utils.check_input_dict(input_dict)

Check input dict keys.

Parameters:

input_dict (OPEInputDict) – Input dictionary for OPE.

class scope_rl.utils.NewGymAPIWrapper(env)

This class converts old gym outputs (gym<0.26.0) to the new ones (gym>=0.26.0).

Methods

reset
step
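
To make the conversion concrete: in the old gym API, `reset()` returns only the observation and `step()` returns `(obs, reward, done, info)`, whereas from gym 0.26 `reset()` returns `(obs, info)` and `step()` returns `(obs, reward, terminated, truncated, info)`. The sketch below adapts a dummy old-style environment to the new signatures. The classes `OldStyleEnv` and `NewStyleAdapter` are hypothetical, and reading truncation from `info["TimeLimit.truncated"]` is an assumption based on the common gym convention, not necessarily what this wrapper does.

```python
class OldStyleEnv:
    """Dummy environment following the old (gym<0.26.0) API."""

    def reset(self):
        return 0

    def step(self, action):
        # The episode ends here because of a time limit, not a terminal state.
        return 1, 1.0, True, {"TimeLimit.truncated": True}


class NewStyleAdapter:
    """Expose an old-style environment through the new (gym>=0.26.0) signatures."""

    def __init__(self, env):
        self.env = env

    def reset(self, seed=None, options=None):
        # Old reset returns only the observation; new reset also returns an info dict.
        return self.env.reset(), {}

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        truncated = bool(info.get("TimeLimit.truncated", False))
        terminated = done and not truncated
        return obs, reward, terminated, truncated, info


env = NewStyleAdapter(OldStyleEnv())
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(action=0)
```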

class scope_rl.utils.OldGymAPIWrapper(env)

This class converts new gym outputs (gym>=0.26.0) to the old ones (gym<0.26.0).

Methods

reset
seed
step
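
The reverse direction collapses the two end-of-episode flags into a single `done` and moves seeding back into a separate `seed()` call. The sketch below uses a dummy new-style environment; `NewStyleEnv` and `OldStyleAdapter` are hypothetical names, and passing the stored seed to the next `reset()` is one plausible design rather than a description of this wrapper's internals.

```python
class NewStyleEnv:
    """Dummy environment following the new (gym>=0.26.0) API."""

    def reset(self, seed=None, options=None):
        return 0, {"seed": seed}

    def step(self, action):
        return 1, 1.0, False, True, {}


class OldStyleAdapter:
    """Expose a new-style environment through the old (gym<0.26.0) signatures."""

    def __init__(self, env):
        self.env = env
        self._seed = None

    def seed(self, seed=None):
        # The old API seeds through a separate method; remember it for the next reset.
        self._seed = seed

    def reset(self):
        obs, _info = self.env.reset(seed=self._seed)
        self._seed = None  # a seed applies to a single reset only
        return obs

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        # The old API has a single flag: the episode is over in either case.
        return obs, reward, terminated or truncated, info


env = OldStyleAdapter(NewStyleEnv())
env.seed(12345)
obs = env.reset()
obs, reward, done, info = env.step(action=0)
```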