scope_rl.utils
Useful tools.
Functions
check_array – Input validation on array.
check_input_dict – Check input dict keys.
check_logged_dataset – Check logged dataset keys.
cosine_kernel – Cosine kernel.
defaultdict_to_dict – Transform a defaultdict into a corresponding dict.
epanechnikov_kernel – Epanechnikov kernel.
estimate_confidence_interval_by_bootstrap – Estimate the confidence interval by a nonparametric bootstrap-like procedure.
estimate_confidence_interval_by_empirical_bernstein – Estimate the confidence interval by the empirical Bernstein inequality.
estimate_confidence_interval_by_hoeffding – Estimate the confidence interval by Hoeffding's inequality.
estimate_confidence_interval_by_t_test – Estimate the confidence interval by Student's t-test.
gaussian_kernel – Gaussian kernel.
l2_distance – Calculate L2 distance.
triangular_kernel – Triangular kernel.
uniform_kernel – Uniform kernel.
Classes
MultipleInputDict – This class contains paths to multiple input dictionaries for OPE and returns input_dict.
MultipleLoggedDataset – This class contains paths to multiple logged datasets and returns logged_dataset.
This class converts old gym outputs (gym<0.26.0) to the new ones (gym>=0.26.0).
This class converts new gym outputs (gym>=0.26.0) to the old ones (gym<0.26.0).
- class scope_rl.utils.MultipleLoggedDataset(action_type, path, save_relative_path=False)
This class contains paths to multiple logged datasets and returns logged_dataset.
- Parameters:
action_type ({"discrete", "continuous"}) – Type of the action space.
path (str) – Path to the directory. Either absolute or relative path is acceptable.
save_relative_path (bool, default=False) – Whether to save a relative path. If True, a path relative to the scope-rl directory is saved. If False, the absolute path is saved.
Note that this option exists so that the examples in the documentation run properly. Otherwise, the default setting (False) is recommended.
- Attributes:
- behavior_policy_names
- n_datasets
Methods
add(logged_dataset, behavior_policy_name) – Save logged dataset.
get(behavior_policy_name, dataset_id) – Load logged dataset.
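The idea of holding paths in memory and loading a dataset only on request can be illustrated with a small stand-in. This is a sketch, not the library's implementation: the class name PathBackedDatasets, the pickle file format, the file naming scheme, and the shape of n_datasets are all assumptions made for illustration. Only the add/get call pattern follows the method list above.

```python
import pickle
import tempfile
from collections import defaultdict
from pathlib import Path


class PathBackedDatasets:
    """Hypothetical stand-in: keeps file paths in memory and the data on disk."""

    def __init__(self, path):
        self.path = Path(path)
        self.path.mkdir(parents=True, exist_ok=True)
        # behavior_policy_name -> list of file paths, indexed by dataset_id
        self._paths = defaultdict(list)

    def add(self, logged_dataset, behavior_policy_name):
        # The next free index under this behavior policy becomes the dataset_id.
        dataset_id = len(self._paths[behavior_policy_name])
        file_path = self.path / f"{behavior_policy_name}_{dataset_id}.pkl"
        with open(file_path, "wb") as f:
            pickle.dump(logged_dataset, f)
        self._paths[behavior_policy_name].append(file_path)

    def get(self, behavior_policy_name, dataset_id):
        # Only the requested dataset is read back into memory.
        with open(self._paths[behavior_policy_name][dataset_id], "rb") as f:
            return pickle.load(f)

    @property
    def behavior_policy_names(self):
        return list(self._paths)

    @property
    def n_datasets(self):
        return {name: len(paths) for name, paths in self._paths.items()}


with tempfile.TemporaryDirectory() as tmp_dir:
    store = PathBackedDatasets(tmp_dir)
    # The dictionary contents below are placeholders, not a real logged dataset.
    store.add({"reward": [1.0, 0.0]}, behavior_policy_name="random")
    store.add({"reward": [0.5, 0.5]}, behavior_policy_name="random")
    loaded = store.get(behavior_policy_name="random", dataset_id=1)
    names = store.behavior_policy_names
    counts = store.n_datasets
```

Storing paths rather than the datasets themselves keeps memory use roughly constant as more datasets are added.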
- class scope_rl.utils.MultipleInputDict(action_type, path, save_relative_path=False)
This class contains paths to multiple input dictionaries for OPE and returns input_dict.
- Parameters:
action_type ({"discrete", "continuous"}) – Type of the action space.
path (str) – Path to the directory. Either absolute or relative path is acceptable.
save_relative_path (bool, default=False) – Whether to save a relative path. If True, a path relative to the scope-rl directory is saved. If False, the absolute path is saved.
Note that this option exists so that the examples in the documentation run properly. Otherwise, the default setting (False) is recommended.
- Attributes:
- behavior_policy_names
- n_datasets
n_eval_policies – Check the number of evaluation policies of each input dict.
use_same_eval_policy_across_dataset – Check if the contained logged datasets use the same evaluation policies.
Methods
add(input_dict, behavior_policy_name, dataset_id) – Save input_dict.
get(behavior_policy_name, dataset_id) – Load input_dict.
- property use_same_eval_policy_across_dataset
Check if the contained logged datasets use the same evaluation policies.
- property n_eval_policies
Check the number of evaluation policies of each input dict.
- scope_rl.utils.l2_distance(x, y, bandwidth=1.0)
Calculate L2 distance.
- Parameters:
x (array-like of shape (n_samples, n_dim)) – Input array 1.
y (array-like of shape (n_samples, n_dim)) – Input array 2.
- Returns:
distance – L2 distance between x and y.
- Return type:
ndarray of shape (n_samples,)
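As an illustration of a row-wise L2 distance over two inputs of shape (n_samples, n_dim), here is a minimal standard-library sketch. The helper name rowwise_l2_distance is hypothetical, and this page does not say whether the library applies a square root or any scaling, so the Euclidean form below is an assumption.

```python
import math


def rowwise_l2_distance(x, y):
    """Euclidean distance between matching rows of x and y (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of rows")
    # math.dist computes sqrt(sum((a - b) ** 2)) for one pair of rows.
    return [math.dist(row_x, row_y) for row_x, row_y in zip(x, y)]


x = [[0.0, 0.0], [1.0, 1.0]]
y = [[3.0, 4.0], [1.0, 1.0]]
distances = rowwise_l2_distance(x, y)
```

The result has one entry per row, matching the (n_samples,) return shape documented above.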
- scope_rl.utils.gaussian_kernel(x, y, bandwidth=1.0)
Gaussian kernel.
- Parameters:
x (array-like of shape (n_samples, n_dim)) – Input array 1.
y (array-like of shape (n_samples, n_dim)) – Input array 2.
bandwidth (float, default=1.0) – Bandwidth hyperparameter of the Gaussian kernel.
- Returns:
kernel_density – kernel density of x given y.
- Return type:
ndarray of shape (n_samples,)
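To make the role of bandwidth concrete, the sketch below applies the textbook Gaussian kernel to the row-wise distance divided by the bandwidth. The helper name gaussian_kernel_sketch and the normalization constant are assumptions, and the library's own scaling may differ.

```python
import math


def gaussian_kernel_sketch(x, y, bandwidth=1.0):
    """Textbook Gaussian kernel on the scaled row-wise distance (hypothetical helper)."""
    densities = []
    for row_x, row_y in zip(x, y):
        # u is the distance between the two rows measured in units of the bandwidth.
        u = math.dist(row_x, row_y) / bandwidth
        densities.append(math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi))
    return densities


x = [[0.0, 0.0], [0.0, 0.0]]
y = [[0.0, 0.0], [3.0, 4.0]]
# With bandwidth=5.0 the second pair sits exactly one bandwidth apart (u = 1).
densities = gaussian_kernel_sketch(x, y, bandwidth=5.0)
```

A larger bandwidth flattens the kernel, so distant pairs keep more weight.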
- scope_rl.utils.triangular_kernel(x, y, bandwidth=1.0)
Triangular kernel.
- Parameters:
x (array-like of shape (n_samples, n_dim)) – Input array 1.
y (array-like of shape (n_samples, n_dim)) – Input array 2.
bandwidth (float, default=1.0) – Bandwidth hyperparameter of the triangular kernel.
- Returns:
kernel_density – kernel density of x given y.
- Return type:
ndarray of shape (n_samples,)
- scope_rl.utils.epanechnikov_kernel(x, y, bandwidth=1.0)
Epanechnikov kernel.
- Parameters:
x (array-like of shape (n_samples, n_dim)) – Input array 1.
y (array-like of shape (n_samples, n_dim)) – Input array 2.
bandwidth (float, default=1.0) – Bandwidth hyperparameter of the Epanechnikov kernel.
- Returns:
kernel_density – kernel density of x given y.
- Return type:
ndarray of shape (n_samples,)
- scope_rl.utils.cosine_kernel(x, y, bandwidth=1.0)
Cosine kernel.
- Parameters:
x (array-like of shape (n_samples, n_dim)) – Input array 1.
y (array-like of shape (n_samples, n_dim)) – Input array 2.
bandwidth (float, default=1.0) – Bandwidth hyperparameter of the cosine kernel.
- Returns:
kernel_density – kernel density of x given y.
- Return type:
ndarray of shape (n_samples,)
- scope_rl.utils.uniform_kernel(x, y, bandwidth=1.0)
Uniform kernel.
- Parameters:
x (array-like of shape (n_samples, n_dim)) – Input array 1.
y (array-like of shape (n_samples, n_dim)) – Input array 2.
bandwidth (float, default=1.0) – Bandwidth hyperparameter of the uniform kernel.
- Returns:
kernel_density – kernel density of x given y.
- Return type:
ndarray of shape (n_samples,)
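The triangular, Epanechnikov, cosine, and uniform kernels differ from the Gaussian kernel in that they vanish once two points are more than one bandwidth apart. The sketch below uses the textbook one-dimensional profiles on the scaled distance u. The names KERNEL_PROFILES and compact_kernel_sketch are hypothetical, and the library's normalization is not stated on this page.

```python
import math

# Textbook kernel profiles on the scaled distance u, valid for 0 <= u <= 1.
KERNEL_PROFILES = {
    "uniform": lambda u: 0.5,
    "triangular": lambda u: 1.0 - u,
    "epanechnikov": lambda u: 0.75 * (1.0 - u * u),
    "cosine": lambda u: (math.pi / 4.0) * math.cos(math.pi * u / 2.0),
}


def compact_kernel_sketch(name, x, y, bandwidth=1.0):
    """Evaluate a compact-support kernel row by row (hypothetical helper)."""
    profile = KERNEL_PROFILES[name]
    densities = []
    for row_x, row_y in zip(x, y):
        u = math.dist(row_x, row_y) / bandwidth
        # Pairs further apart than one bandwidth receive zero weight.
        densities.append(profile(u) if u <= 1.0 else 0.0)
    return densities


x = [[0.0], [0.0], [0.0]]
y = [[0.0], [0.5], [2.0]]  # scaled distances 0, 0.5, and 2 at bandwidth=1.0
results = {name: compact_kernel_sketch(name, x, y) for name in KERNEL_PROFILES}
```

Because every profile is cut off at u = 1, the third pair gets zero weight under all four kernels.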
- scope_rl.utils.estimate_confidence_interval_by_bootstrap(samples, alpha=0.05, n_bootstrap_samples=100, random_state=None)
Estimate the confidence interval by a nonparametric bootstrap-like procedure.
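- Parameters:
samples – Samples from which the confidence interval is estimated.
alpha (float, default=0.05) – Significance level.
n_bootstrap_samples (int, default=100) – Number of bootstrap samples.
random_state (default=None) – Random state.
- Returns:
estimated_confidence_interval – Dictionary storing the estimated mean and the upper and lower confidence bounds.
- Return type:
dict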
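A percentile bootstrap is one common way to realize a nonparametric bootstrap interval, and the sketch below shows how alpha, n_bootstrap_samples, and random_state from the signature above could interact. It is an assumption-laden illustration rather than the library's procedure: the helper name bootstrap_ci_sketch, the percentile rule, and the dictionary keys are all hypothetical.

```python
import random
import statistics


def bootstrap_ci_sketch(samples, alpha=0.05, n_bootstrap_samples=100, random_state=None):
    """Percentile bootstrap interval for the mean (hypothetical helper)."""
    rng = random.Random(random_state)
    n = len(samples)
    # Resample with replacement and record the mean of each resample.
    boot_means = sorted(
        statistics.fmean(rng.choices(samples, k=n))
        for _ in range(n_bootstrap_samples)
    )
    # Take the alpha/2 and 1 - alpha/2 empirical quantiles of the resampled means.
    lower_idx = int((alpha / 2) * (n_bootstrap_samples - 1))
    upper_idx = int((1 - alpha / 2) * (n_bootstrap_samples - 1))
    return {
        "mean": statistics.fmean(boot_means),
        "lower": boot_means[lower_idx],
        "upper": boot_means[upper_idx],
    }


samples = [float(i % 7) for i in range(50)]
first = bootstrap_ci_sketch(samples, alpha=0.05, n_bootstrap_samples=100, random_state=0)
second = bootstrap_ci_sketch(samples, alpha=0.05, n_bootstrap_samples=100, random_state=0)
```

Fixing random_state makes the resampling reproducible, so first and second are identical.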
- scope_rl.utils.estimate_confidence_interval_by_hoeffding(samples, alpha=0.05, **kwargs)
Estimate the confidence interval by Hoeffding’s inequality.
Note
Hoeffding’s inequality provides a high-probability bound on the expectation \(\mu := \mathbb{E}[X], X \sim p(X)\) as follows.
\[|\hat{\mu} - \mu| \leq X_{\max} \sqrt{\frac{\log(1 / \alpha)}{2 n}},\]
which holds with probability at least \(1 - \alpha\), where \(n\) is the data size and \(\hat{\mu}\) is the sample mean.
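The bound above can be evaluated directly from a list of samples. The sketch below transcribes it term by term. The helper name hoeffding_interval_sketch and the dictionary keys are hypothetical, and taking \(X_{\max}\) to be the largest observed sample is an assumption.

```python
import math
import statistics


def hoeffding_interval_sketch(samples, alpha=0.05):
    """Interval built from the Hoeffding bound stated above (hypothetical helper)."""
    n = len(samples)
    mean = statistics.fmean(samples)
    # Assumption: X_max is taken to be the largest observed sample.
    x_max = max(samples)
    half_width = x_max * math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    return {"mean": mean, "lower": mean - half_width, "upper": mean + half_width}


rewards = [0.0, 1.0] * 100  # 200 samples bounded in [0, 1]
interval = hoeffding_interval_sketch(rewards, alpha=0.05)
wider = hoeffding_interval_sketch(rewards[:50], alpha=0.05)  # fewer samples
```

The half-width shrinks at rate \(1 / \sqrt{n}\) and does not depend on the sample variance, which is why the interval from 50 samples is wider than the one from 200.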
- scope_rl.utils.estimate_confidence_interval_by_empirical_bernstein(samples, alpha=0.05, **kwargs)
Estimate the confidence interval by the empirical Bernstein inequality.
Note
The empirical Bernstein inequality provides a high-probability bound on the expectation \(\mu := \mathbb{E}[X], X \sim p(X)\) as follows.
\[|\hat{\mu} - \mu| \leq \frac{7 X_{\max} \log(2 / \alpha)}{3 (n - 1)} + \sqrt{\frac{2 \hat{\mathbb{V}}(X) \log(2 / \alpha)}{n(n - 1)}},\]
which holds with probability at least \(1 - \alpha\), where \(n\) is the data size and \(\hat{\mathbb{V}}\) is the sample variance.
- scope_rl.utils.estimate_confidence_interval_by_t_test(samples, alpha=0.05, **kwargs)
Estimate the confidence interval by Student's t-test.
Note
Student's t-test assumes that \(X \sim p(X)\) follows a normal distribution. Under this assumption, the \(100 (1 - \alpha)\)% confidence interval of \(\mu := \mathbb{E}[X]\) is derived as follows.
\[|\hat{\mu} - \mu| \leq \frac{T_{\mathrm{test}}(1 - \alpha, n-1)}{\sqrt{n} / \hat{\sigma}},\]
where \(n\) is the data size, \(T_{\mathrm{test}}(\cdot,\cdot)\) is the T-value, and \(\hat{\sigma}\) is the sample standard deviation.
- scope_rl.utils.defaultdict_to_dict(dict_)
Transform a defaultdict into a corresponding dict.
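One way such a conversion can work is to rebuild each defaultdict level as a plain dict. The sketch below does this recursively. The helper name to_plain_dict is hypothetical, and handling nested levels is an assumption, since this page does not say whether the library recurses.

```python
from collections import defaultdict


def to_plain_dict(obj):
    """Recursively replace defaultdicts with plain dicts (hypothetical helper)."""
    if isinstance(obj, defaultdict):
        return {key: to_plain_dict(value) for key, value in obj.items()}
    # Anything that is not a defaultdict is returned unchanged.
    return obj


nested = defaultdict(lambda: defaultdict(list))
nested["random"]["rewards"].append(1.0)
plain = to_plain_dict(nested)
```

After conversion, looking up a missing key raises KeyError instead of silently creating an entry. Plain dicts can also be pickled even when the original default factory was a lambda.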
- scope_rl.utils.check_array(array, name, expected_dim=1, expected_dtype=None, min_val=None, max_val=None)
Input validation on array.
- Parameters:
array (object) – Input array to check.
name (str) – Name of the input array.
expected_dim (int, default=1) – Expected dimension of the input array.
expected_dtype ({type, tuple of type}, default=None) – Expected dtype of the input array.
min_val (float, default=None) – Minimum value allowed in the input array.
max_val (float, default=None) – Maximum value allowed in the input array.
- scope_rl.utils.check_logged_dataset(logged_dataset)
Check logged dataset keys.
- Parameters:
logged_dataset (LoggedDataset) – Logged dataset.
- scope_rl.utils.check_input_dict(input_dict)
Check input dict keys.
- Parameters:
input_dict (OPEInputDict) – Input dict.