Multi-Armed Bandits with Censored Consumption of Resources

V. Bengs, E. Hüllermeier, ArXiv:2011.00813 (2020).

Preprint | English
Author
Bengs, Viktor; Hüllermeier, Eyke
Abstract
We consider a resource-aware variant of the classical multi-armed bandit problem: In each round, the learner selects an arm and determines a resource limit. It then observes a corresponding (random) reward, provided the (random) amount of consumed resources remains below the limit. Otherwise, the observation is censored, i.e., no reward is obtained. For this problem setting, we introduce a measure of regret, which incorporates the actual amount of allocated resources of each learning round as well as the optimality of realizable rewards. Thus, to minimize regret, the learner needs to set a resource limit and choose an arm in such a way that the chance to realize a high reward within the predefined resource limit is high, while the resource limit itself should be kept as low as possible. We derive the theoretical lower bound on the cumulative regret and propose a learning algorithm having a regret upper bound that matches the lower bound. In a simulation study, we show that our learning algorithm outperforms straightforward extensions of standard multi-armed bandit algorithms.
Publishing Year
Journal Title
arXiv:2011.00813
LibreCat-ID

Cite this

Bengs V, Hüllermeier E. Multi-Armed Bandits with Censored Consumption of Resources. arXiv:201100813. Published online 2020.
Bengs, V., & Hüllermeier, E. (2020). Multi-Armed Bandits with Censored Consumption of Resources. In arXiv:2011.00813.
@article{Bengs_Hüllermeier_2020, title={Multi-Armed Bandits with Censored Consumption of Resources}, journal={arXiv:2011.00813}, author={Bengs, Viktor and Hüllermeier, Eyke}, year={2020} }
Bengs, Viktor, and Eyke Hüllermeier. “Multi-Armed Bandits with Censored Consumption of Resources.” ArXiv:2011.00813, 2020.
V. Bengs and E. Hüllermeier, “Multi-Armed Bandits with Censored Consumption of Resources,” arXiv:2011.00813. 2020.
Bengs, Viktor, and Eyke Hüllermeier. “Multi-Armed Bandits with Censored Consumption of Resources.” ArXiv:2011.00813, 2020.

Export

Marked Publications

Open Data LibreCat

arXiv 2011.00813