Testification of Condorcet Winners in dueling bandits

Haddenhorst, Björn; Bengs, Viktor; Brandt, Jasmin; Hüllermeier, Eyke

Testification of Condorcet Winners in dueling bandits

B. Haddenhorst, V. Bengs, J. Brandt, E. Hüllermeier, in: C. de Campos, M.H. Maathuis (Eds.), Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, PMLR, 2021, pp. 1195–1205.

Download

No fulltext has been uploaded.

Conference Paper | English

Author

Haddenhorst, Björn; Bengs, Viktor; Brandt, Jasmin; Hüllermeier, Eyke

Editor

de Campos, Cassio; Maathuis, Marloes H.

Abstract

Several algorithms for finding the best arm in the dueling bandits setting assume the existence of a Condorcet winner (CW), that is, an arm that uniformly dominates all other arms. Yet, by simply relying on this assumption but not verifying it, such algorithms may produce doubtful results in cases where it actually fails to hold. Even worse, the problem may not be noticed, and an alleged CW still be produced. In this paper, we therefore address the problem as a ”testification” task, by which we mean a combination of testing and identification: The online identification of the CW is combined with the statistical testing of the CW assumption. Thus, instead of returning a supposed CW at some point, the learner has the possibility to stop sampling and refuse an answer in case it feels confident that the CW assumption is violated. Analyzing the testification problem formally, we derive lower bounds on the expected sample complexity of any online algorithm solving it. Moreover, a concrete algorithm is proposed, which achieves the optimal sample complexity up to logarithmic terms.

Publishing Year

2021

Proceedings Title

Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence

forms.conference.field.series_title_volume.label

Proceedings of Machine Learning Research

Volume

161

Page

1195–1205

LibreCat-ID

30725

Cite this

Haddenhorst B, Bengs V, Brandt J, Hüllermeier E. Testification of Condorcet Winners in dueling bandits. In: de Campos C, Maathuis MH, eds. Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence. Vol 161. Proceedings of Machine Learning Research. PMLR; 2021:1195–1205.

Haddenhorst, B., Bengs, V., Brandt, J., & Hüllermeier, E. (2021). Testification of Condorcet Winners in dueling bandits. In C. de Campos & M. H. Maathuis (Eds.), Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence (Vol. 161, pp. 1195–1205). PMLR.

@inproceedings{Haddenhorst_Bengs_Brandt_Hüllermeier_2021, series={Proceedings of Machine Learning Research}, title={Testification of Condorcet Winners in dueling bandits}, volume={161}, booktitle={Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence}, publisher={PMLR}, author={Haddenhorst, Björn and Bengs, Viktor and Brandt, Jasmin and Hüllermeier, Eyke}, editor={de Campos, Cassio and Maathuis, Marloes H.}, year={2021}, pages={1195–1205}, collection={Proceedings of Machine Learning Research} }

Haddenhorst, Björn, Viktor Bengs, Jasmin Brandt, and Eyke Hüllermeier. “Testification of Condorcet Winners in Dueling Bandits.” In Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, edited by Cassio de Campos and Marloes H. Maathuis, 161:1195–1205. Proceedings of Machine Learning Research. PMLR, 2021.

B. Haddenhorst, V. Bengs, J. Brandt, and E. Hüllermeier, “Testification of Condorcet Winners in dueling bandits,” in Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021, vol. 161, pp. 1195–1205.

Haddenhorst, Björn, et al. “Testification of Condorcet Winners in Dueling Bandits.” Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, edited by Cassio de Campos and Marloes H. Maathuis, vol. 161, PMLR, 2021, pp. 1195–1205.

Export

Marked Publications

Open Data LibreCat

Search this title in

Google Scholar