A Reinforcement Learning Strategy for the Swing-Up of the Double Pendulum on a Cart

Hesse, Michael; Timmermann, Julia; Hüllermeier, Eyke; Trächtler, Ansgar

M. Hesse, J. Timmermann, E. Hüllermeier, A. Trächtler, Procedia Manufacturing 24 (2018) 15–20.

Journal Article | English

Author

Hesse, Michael; Timmermann, Julia; Hüllermeier, Eyke; Trächtler, Ansgar

Department

Regelungstechnik und Mechatronik / Heinz Nixdorf Institut

Abstract

The effective control design of a dynamical system traditionally relies on a high level of system understanding, usually expressed in terms of an exact physical model. In contrast, reinforcement learning adopts a data-driven approach and constructs an optimal control strategy by interacting with the underlying system. To keep the wear of real-world systems as low as possible, the learning process should be short. In our research, we used the state-of-the-art reinforcement learning method PILCO to design a feedback control strategy for the swing-up of the double pendulum on a cart with remarkably few test iterations at the test bench. PILCO stands for “probabilistic inference for learning control” and requires only little expert knowledge for learning. To achieve the swing-up of a double pendulum on a cart to its upper unstable equilibrium position, we introduce additional state restrictions to PILCO, so that the limited cart distance can be taken into account. Thanks to these measures, we were able to learn the swing-up at the real test bench for the first time and in only 27 learning iterations.

Publishing Year

2018

Journal Title

Procedia Manufacturing

Volume

24

Page

15–20

LibreCat-ID

22996

Cite this

Hesse M, Timmermann J, Hüllermeier E, Trächtler A. A Reinforcement Learning Strategy for the Swing-Up of the Double Pendulum on a Cart. Procedia Manufacturing. 2018;24:15-20.

Hesse, M., Timmermann, J., Hüllermeier, E., & Trächtler, A. (2018). A Reinforcement Learning Strategy for the Swing-Up of the Double Pendulum on a Cart. Procedia Manufacturing, 24, 15–20.

@article{Hesse_Timmermann_Hüllermeier_Trächtler_2018, title={A Reinforcement Learning Strategy for the Swing-Up of the Double Pendulum on a Cart}, volume={24}, journal={Procedia Manufacturing}, author={Hesse, Michael and Timmermann, Julia and Hüllermeier, Eyke and Trächtler, Ansgar}, year={2018}, pages={15–20} }

Hesse, Michael, Julia Timmermann, Eyke Hüllermeier, and Ansgar Trächtler. “A Reinforcement Learning Strategy for the Swing-Up of the Double Pendulum on a Cart.” Procedia Manufacturing 24 (2018): 15–20.

M. Hesse, J. Timmermann, E. Hüllermeier, and A. Trächtler, “A Reinforcement Learning Strategy for the Swing-Up of the Double Pendulum on a Cart,” Procedia Manufacturing, vol. 24, pp. 15–20, 2018.

Hesse, Michael, et al. “A Reinforcement Learning Strategy for the Swing-Up of the Double Pendulum on a Cart.” Procedia Manufacturing, vol. 24, 2018, pp. 15–20.
