rtbgym.envs.simulator.bidder
Bid Price Calculation.
Classes
Bidder – Class to determine bid price.
- class rtbgym.envs.simulator.bidder.Bidder(simulator, objective='conversion', reward_predictor=None, scaler=None, random_state=None)
Class to determine bid price.
Imported as:
rtbgym.envs.simulator.Bidder
Note
Intended to be initialized and called from the RTBEnv class in env.py.
The bid price is determined by the following formula.
\[\text{bid price}_{t, i} = \text{adjust rate}_{t} \times \text{predicted reward}_{t, i} \ (\times \text{const.})\]
- Parameters:
simulator (BaseSimulator) – Auction simulator.
objective ({"click", "conversion"}, default="conversion") – Objective outcome (i.e., reward) of the auction.
reward_predictor (BaseEstimator, default=None) – A machine learning model to predict the reward to determine the bidding price. If None, the ground-truth (expected) reward is used instead of the predicted one.
scaler ({int, float} (> 0), default=None) – Scaling factor (constant value) used for bid price determination. If None, call auto_fit_scaler() to set it.
random_state (int (>= 0), default=None) – Random state.
References
Di Wu, Xiujun Chen, Xun Yang, Hao Wang, Qing Tan, Xiaoxun Zhang, Jian Xu, and Kun Gai. “Budget Constrained Bidding by Model-free Reinforcement Learning in Display Advertising.” 2018.
Jun Zhao, Guang Qiu, Ziyu Guan, Wei Zhao, and Xiaofei He. “Deep Reinforcement Learning for Sponsored Search Real-time Bidding.” 2018.
- Attributes:
- random_state
- reward_predictor
- scaler
- standard_bid_price
Methods
auto_fit_scaler(step_per_episode[, n_samples]) – Fit scaling factor used for bid price calculation.
custom_set_reward_predictor(reward_predictor) – Set reward predictor used for bid price calculation.
custom_set_scaler(scaler) – Set scaling factor used for bid price calculation.
determine_bid_price(timestep, adjust_rate, ...) – Determine the bidding price using given adjust rate and the predicted/ground-truth rewards.
fit_reward_predictor(step_per_episode[, ...]) – Fit reward predictor in advance (pre-train) to use prediction in bidding price determination.
- determine_bid_price(timestep, adjust_rate, ad_ids, user_ids)
Determine the bidding price using given adjust rate and the predicted/ground-truth rewards.
Note
The bid price is determined as follows.
\[\text{bid price}_{t, i} = \text{adjust rate}_{t} \times \text{predicted reward}_{t, i} \ (\times \text{const.})\]
- Parameters:
- Returns:
bid_prices – Bid price for each auction.
- Return type:
ndarray of shape (search_volume, )
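The formula above can be illustrated with a minimal plain-Python sketch. This is not the library's implementation: the function name `sketch_bid_prices`, its `const` argument, and the sample values are hypothetical, and the real method works on ndarrays and looks up rewards from `ad_ids` and `user_ids` rather than receiving them directly.

```python
def sketch_bid_prices(adjust_rate, predicted_rewards, const=1.0):
    """Apply bid_price[t, i] = adjust_rate[t] * predicted_reward[t, i] * const.

    Hypothetical helper for illustration only; `const` stands in for the
    optional constant factor in the formula (e.g. the scaler).
    """
    if const <= 0:
        raise ValueError("const must be positive")
    # One bid price per auction held at this timestep.
    return [adjust_rate * reward * const for reward in predicted_rewards]


# Three auctions at one timestep, with made-up predicted rewards.
rewards = [0.02, 0.05, 0.01]
prices = sketch_bid_prices(adjust_rate=2.0, predicted_rewards=rewards, const=100.0)
```

The result has one entry per auction, which mirrors the documented return shape (search_volume, ).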
- custom_set_scaler(scaler)
Set scaling factor used for bid price calculation.
- Parameters:
scaler ({int, float} (> 0)) – Scaling factor (constant value) used in bid price calculation.
- auto_fit_scaler(step_per_episode, n_samples=100000)
Fit scaling factor used for bid price calculation.
Note
scaler is set to approximately the reciprocal of the mean predicted/ground-truth reward:
scaler ~= 1 / (mean of predicted/ground-truth rewards)
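As a rough illustration of the relation above, the sketch below computes a scaler from a handful of sampled rewards. The helper `sketch_fit_scaler` and the sample values are hypothetical; how the library draws its `n_samples` rewards and whether it rounds the result are not shown here.

```python
from statistics import fmean


def sketch_fit_scaler(sampled_rewards):
    """Return roughly 1 / mean(sampled_rewards).

    Hypothetical helper for illustration only, not the library's method.
    """
    mean_reward = fmean(sampled_rewards)
    if mean_reward <= 0:
        raise ValueError("mean reward must be positive to take its reciprocal")
    return 1.0 / mean_reward


# Made-up predicted/ground-truth rewards sampled from a few auctions.
sampled = [0.02, 0.05, 0.01, 0.04]
scaler = sketch_fit_scaler(sampled)
```

With a scaler chosen this way, the scaled reward averages about 1 over the sample, so the adjust rate alone sets the typical magnitude of the product in the bid price formula.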
- custom_set_reward_predictor(reward_predictor)
Set reward predictor used for bid price calculation.
- Parameters:
reward_predictor (BaseEstimator, default=None) – A machine learning model to predict the reward to determine the bidding price. If None, the ground-truth (expected) reward is used instead of the predicted one.
- fit_reward_predictor(step_per_episode, n_samples=100000)
Fit reward predictor in advance (pre-train) to use prediction in bidding price determination.
Note
Intended to be used only when the use_reward_predictor=True option is set.
X and y of the prediction model are given as follows.
- X: array-like of shape (search_volume, ad_feature_dim + user_feature_dim + 1)
Concatenated vector of contexts (ad_feature_vector + user_feature_vector) and timestep.
- y: array-like of shape (search_volume, )
Reward (i.e., auction outcome) obtained in each auction.
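The layout of X and y described above can be sketched in plain Python as follows. The helper `sketch_build_features` and all values are hypothetical and only illustrate the shapes; the library builds these as arrays internally, and the model-fitting step itself is omitted.

```python
def sketch_build_features(ad_features, user_features, timestep):
    """Concatenate ad context, user context, and timestep for each auction.

    Hypothetical helper for illustration only. Each returned row has
    ad_feature_dim + user_feature_dim + 1 entries.
    """
    if len(ad_features) != len(user_features):
        raise ValueError("one ad vector and one user vector are needed per auction")
    return [
        list(ad) + list(user) + [timestep]
        for ad, user in zip(ad_features, user_features)
    ]


# Two auctions with ad_feature_dim = 2 and user_feature_dim = 3 (made-up values).
ad_features = [[0.1, 0.9], [0.4, 0.6]]
user_features = [[1.0, 0.0, 0.5], [0.2, 0.3, 0.7]]
X = sketch_build_features(ad_features, user_features, timestep=3)
y = [0, 1]  # one observed outcome (reward) per auction
```

Here X has one row per auction and 2 + 3 + 1 = 6 columns, and y has one entry per row of X.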