scope_rl.ope.estimators_base.BaseStateActionMarginalOPEEstimator

class scope_rl.ope.estimators_base.BaseStateActionMarginalOPEEstimator

Base class for State-Action Marginal OPE estimators.

Bases: scope_rl.ope.BaseMarginalOPEEstimator -> scope_rl.ope.BaseOffPolicyEstimator

Imported as: scope_rl.ope.BaseStateActionMarginalOPEEstimator

Note

This abstract base class also defines the following private methods and properties.

abstract _estimate_trajectory_value:

Estimate the trajectory-wise expected reward.

_calc_behavior_policy_pscore_discrete:

Calculate the behavior policy pscore (action choice probability) in the case of discrete action spaces.

_calc_behavior_policy_pscore_continuous:

Calculate the behavior policy pscore (action choice probability) in the case of continuous action spaces.

_calc_evaluation_policy_pscore_discrete:

Calculate the evaluation policy pscore (action choice probability) in the case of discrete action spaces.

_calc_similarity_weight:

Calculate the similarity weight in the case of continuous action spaces.

_calc_marginal_importance_weight:

Calculate the marginal importance weight.

property _estimate_confidence_interval:

Dictionary mapping the name of each confidence interval (CI) method to the function that computes it.

key: [
    bootstrap,
    hoeffding,
    bernstein,
    ttest,
]
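The name-to-function dictionary described above can be illustrated with a minimal, standard-library sketch. This is not the SCOPE-RL implementation: the function names, signatures, and return format (`bootstrap_ci`, `hoeffding_ci`, `CI_METHODS`, `estimate_interval_from_samples`) are hypothetical, only two of the four keys are filled in, and the Hoeffding bound here assumes the observed sample range is a valid bound on the values.

```python
import math
import random
import statistics
from typing import Callable, Dict, List

CIResult = Dict[str, float]


def bootstrap_ci(samples: List[float], alpha: float = 0.05,
                 n_bootstrap: int = 1000, seed: int = 0) -> CIResult:
    """Percentile bootstrap interval for the sample mean (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(samples)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(samples, k=n)) for _ in range(n_bootstrap)
    )
    lower = means[int((alpha / 2) * n_bootstrap)]
    upper = means[min(int((1 - alpha / 2) * n_bootstrap), n_bootstrap - 1)]
    return {"mean": statistics.fmean(samples), "lower": lower, "upper": upper}


def hoeffding_ci(samples: List[float], alpha: float = 0.05) -> CIResult:
    """Two-sided Hoeffding interval (hypothetical helper).

    Assumption: the observed range max - min bounds the true value range.
    """
    n = len(samples)
    mean = statistics.fmean(samples)
    value_range = max(samples) - min(samples)
    eps = value_range * math.sqrt(math.log(2 / alpha) / (2 * n))
    return {"mean": mean, "lower": mean - eps, "upper": mean + eps}


# Hypothetical registry; "bernstein" and "ttest" are omitted from this sketch.
CI_METHODS: Dict[str, Callable[..., CIResult]] = {
    "bootstrap": bootstrap_ci,
    "hoeffding": hoeffding_ci,
}


def estimate_interval_from_samples(samples: List[float], ci: str = "bootstrap",
                                   alpha: float = 0.05) -> CIResult:
    """Look up the CI method by name and apply it to the samples."""
    if ci not in CI_METHODS:
        raise ValueError(f"ci must be one of {sorted(CI_METHODS)}, got {ci!r}")
    return CI_METHODS[ci](samples, alpha=alpha)
```

Keeping the methods in a dictionary lets a caller select the interval type with a string key while the estimator code stays unchanged.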

Methods

`estimate_interval`()	Estimate the confidence interval of the policy value.
`estimate_policy_value`()	Estimate the policy value of the evaluation policy.

abstract estimate_interval()

Estimate the confidence interval of the policy value.

abstract estimate_policy_value()

Estimate the policy value of the evaluation policy.
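Because both public methods are abstract, a concrete estimator has to override them before it can be instantiated. The sketch below illustrates that pattern with a stand-in base class built on `abc`. It does not import SCOPE-RL, and the class names (`HypotheticalBaseEstimator`, `MeanEstimator`) and the method arguments are assumptions made for illustration; the real methods take different inputs.

```python
import math
import statistics
from abc import ABC, abstractmethod
from typing import Dict, List


class HypotheticalBaseEstimator(ABC):
    """Stand-in for an abstract estimator base; NOT the SCOPE-RL class."""

    @abstractmethod
    def estimate_policy_value(self, trajectory_values: List[float]) -> float:
        """Return a point estimate of the policy value."""

    @abstractmethod
    def estimate_interval(self, trajectory_values: List[float],
                          alpha: float = 0.05) -> Dict[str, float]:
        """Return a confidence interval for the policy value."""


class MeanEstimator(HypotheticalBaseEstimator):
    """Toy subclass: averages already-weighted trajectory-wise values."""

    def estimate_policy_value(self, trajectory_values: List[float]) -> float:
        return statistics.fmean(trajectory_values)

    def estimate_interval(self, trajectory_values: List[float],
                          alpha: float = 0.05) -> Dict[str, float]:
        # Normal-approximation interval, chosen only to keep the sketch short.
        mean = statistics.fmean(trajectory_values)
        std_err = statistics.stdev(trajectory_values) / math.sqrt(
            len(trajectory_values)
        )
        z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
        return {"mean": mean, "lower": mean - z * std_err, "upper": mean + z * std_err}
```

Instantiating `HypotheticalBaseEstimator` directly raises `TypeError`, whereas `MeanEstimator()` works because it overrides both abstract methods.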
