Example Codes

SCOPE-RL

See also

For API details, please also refer to the SCOPE-RL package reference.

Basic and High-Confidence Off-Policy Evaluation (OPE)

Basic OPE

  • Logged datasets and inputs

  • Basic Off-Policy Evaluation (DM, PDIS, DR, ...)

  • Marginal Off-Policy Evaluation (SMIS, SMDR, SAMIS, SAMDR, ...)

  • High-Confidence Off-Policy Evaluation (Hoeffding, Bernstein, ...)

  • Extension to continuous action space
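As a rough illustration of what a basic OPE estimator computes, the sketch below implements per-decision importance sampling (PDIS) in plain Python for a discrete action space. It is not the SCOPE-RL API: the function name `pdis_estimate`, the `(reward, evaluation-policy probability, behavior-policy probability)` step layout, and the toy data are all assumptions made for this example.

```python
from typing import List, Tuple

# One logged step (hypothetical layout for this sketch):
# (reward, evaluation-policy probability, behavior-policy probability)
Step = Tuple[float, float, float]


def pdis_estimate(trajectories: List[List[Step]], gamma: float = 1.0) -> float:
    """Per-decision importance sampling estimate of the evaluation policy's value."""
    total = 0.0
    for trajectory in trajectories:
        cumulative_weight = 1.0  # product of importance ratios up to step t
        for t, (reward, pi_prob, behavior_prob) in enumerate(trajectory):
            cumulative_weight *= pi_prob / behavior_prob
            total += (gamma ** t) * cumulative_weight * reward
    return total / len(trajectories)


# Toy logged data: two trajectories with two steps each.
logged = [
    [(1.0, 0.8, 0.5), (0.0, 0.6, 0.5)],
    [(0.0, 0.2, 0.5), (1.0, 0.4, 0.5)],
]
value = pdis_estimate(logged, gamma=0.9)
print(f"PDIS estimate: {value:.3f}")
```

When the evaluation and behavior policies assign the same probabilities, every importance weight equals 1 and the estimate reduces to the average discounted return of the logged trajectories.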

Cumulative Distribution OPE (CD-OPE)

Cumulative Distribution OPE

  • Logged datasets and inputs

  • Estimating Cumulative Distribution Function

  • Estimating risk functions (mean, variance, CVaR, ...)
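To make the idea behind cumulative distribution OPE concrete, the following sketch builds a self-normalized, importance-weighted empirical CDF of trajectory returns and derives the mean, variance, and CVaR from it. It uses plain Python only and is not the SCOPE-RL API: the function names, the precomputed trajectory-wise weights, and the toy numbers are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def weighted_cdf(
    returns: Sequence[float], weights: Sequence[float]
) -> List[Tuple[float, float]]:
    """Sorted (return, cumulative probability) pairs with self-normalized weights."""
    total = sum(weights)
    cdf = []
    cumulative = 0.0
    for g, w in sorted(zip(returns, weights)):
        cumulative += w / total
        cdf.append((g, cumulative))
    return cdf


def risk_functions(
    returns: Sequence[float], weights: Sequence[float], alpha: float = 0.25
) -> Tuple[float, float, float]:
    """Mean, variance, and CVaR (mean of the lowest alpha-fraction) of the weighted returns."""
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(p * g for p, g in zip(probs, returns))
    variance = sum(p * (g - mean) ** 2 for p, g in zip(probs, returns))
    # Accumulate probability mass from the worst returns until alpha is reached.
    remaining = alpha
    tail_sum = 0.0
    for g, p in sorted(zip(returns, probs)):
        mass = min(p, remaining)
        tail_sum += mass * g
        remaining -= mass
        if remaining <= 0.0:
            break
    return mean, variance, tail_sum / alpha


# Toy data: trajectory returns and their (assumed precomputed) importance weights.
returns = [1.0, 2.0, 3.0, 4.0]
weights = [0.5, 1.5, 1.0, 1.0]
cdf = weighted_cdf(returns, weights)
mean, variance, cvar = risk_functions(returns, weights, alpha=0.25)
print(cdf, mean, variance, cvar)
```

Normalizing by the sum of the weights keeps the estimated CDF within [0, 1], at the cost of a small bias compared with the unnormalized estimator.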

Off-Policy Selection

Off-Policy Selection (OPS)

  • OPS via Basic OPE

  • OPS via Cumulative Distribution OPE

  • Obtaining oracle selection results
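The selection step itself can be sketched in a few lines: rank the candidate policies by their estimated values, pick the top one, and compare the choice against the oracle ranking built from ground-truth values. The policy names and numbers below are made up for illustration, and `rank_policies` is a hypothetical helper, not part of SCOPE-RL.

```python
from typing import Dict, List


def rank_policies(values: Dict[str, float]) -> List[str]:
    """Policy names sorted from highest to lowest value."""
    return sorted(values, key=values.get, reverse=True)


# Hypothetical OPE estimates and ground-truth (oracle) policy values.
estimated = {"policy_a": 1.2, "policy_b": 0.9, "policy_c": 1.5}
oracle = {"policy_a": 1.4, "policy_b": 0.8, "policy_c": 1.1}

estimated_ranking = rank_policies(estimated)
oracle_ranking = rank_policies(oracle)

selected = estimated_ranking[0]
# Regret: true value lost by deploying the selected policy instead of the best one.
regret = oracle[oracle_ranking[0]] - oracle[selected]

print("selected:", selected)
print("oracle best:", oracle_ranking[0])
print(f"regret: {regret:.2f}")
```

In this toy case the estimator overrates `policy_c`, so the selected policy differs from the oracle best and the regret is positive.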

Assessing OPE Estimators

Assessing OPE Estimators

  • Conventional “accuracy” metrics

  • Top-\(k\) risk-return tradeoff metrics

  • Validation visualization

Implementing Custom OPE Estimators

Custom OPE Estimators

  • Custom Basic OPE estimators

  • Custom Cumulative Distribution OPE estimators

Handling Multiple Datasets

Multiple Datasets

  • Logged datasets and inputs

  • (Basic) Off-Policy Evaluation

  • Cumulative Distribution Off-Policy Evaluation

  • Off-Policy Selection

  • Assessments of OPE and OPS

Handling Real-World Datasets

Real-World Datasets

  • Logged dataset

  • Input dict

See also

For data collection and integration with d3rlpy for policy learning, please also refer to this page.

See also

Comprehensive quickstart examples using the provided sub-packages are available in the GitHub repository:
