
Reinforcement Learning for Nash Noncoperative and Pareto Cooperative Optimal Linear Systems

  • Rutgers - The State University of New Jersey, New Brunswick

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

In this paper we use reinforcement learning techniques for the optimal control of linear-quadratic systems and consider two situations: when the controllers cooperate (Pareto strategies) and when they do not cooperate (Nash strategies). These situations are encountered in the optimal control and reinforcement learning of several engineering systems (especially energy systems), in economics, and in machine learning in general. We compare the corresponding optimal costs using normalized data, assuming that the system initial conditions are uniformly distributed on the unit sphere, and provide an estimate of how much cooperation helps in learning optimal controllers for this kind of problem. The theoretical results are demonstrated on an example of an electric power system.
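The comparison described in the abstract can be illustrated with a minimal sketch, assuming each optimal cost takes the standard linear-quadratic form x0' P x0 for a symmetric cost matrix P. For x0 uniformly distributed on the unit sphere in n dimensions, the average of x0' P x0 is trace(P) / n, so the two strategies can be compared through the traces of their cost matrices. The matrices `P_NASH` and `P_PARETO` and all function names below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math
import random


def mean_cost_on_unit_sphere(P):
    """Exact average of x0' P x0 for x0 uniform on the unit sphere: trace(P) / n."""
    n = len(P)
    return sum(P[i][i] for i in range(n)) / n


def sample_unit_sphere(n, rng):
    """Draw a point uniformly on the unit sphere by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def quadratic_cost(P, x):
    """Evaluate the quadratic form x' P x."""
    n = len(P)
    return sum(x[i] * P[i][j] * x[j] for i in range(n) for j in range(n))


def monte_carlo_mean_cost(P, samples=20000, seed=0):
    """Estimate the average cost by sampling initial conditions on the unit sphere."""
    rng = random.Random(seed)
    n = len(P)
    total = sum(quadratic_cost(P, sample_unit_sphere(n, rng)) for _ in range(samples))
    return total / samples


# Hypothetical cost matrices (assumptions for illustration, not data from the paper).
P_NASH = [[2.0, 0.3], [0.3, 1.5]]
P_PARETO = [[1.6, 0.2], [0.2, 1.2]]

nash_mean = mean_cost_on_unit_sphere(P_NASH)
pareto_mean = mean_cost_on_unit_sphere(P_PARETO)

# Fraction of the average Nash cost saved by cooperating, for these made-up matrices.
relative_gain = (nash_mean - pareto_mean) / nash_mean
```

Calling `monte_carlo_mean_cost(P_NASH)` should agree with `nash_mean` up to sampling error, which is a quick check on the trace formula. In a multi-player game each player has its own cost matrix, so a real comparison would repeat this calculation per player or for a weighted sum of the costs.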

Original language: English
Title of host publication: 2023 24th International Arab Conference on Information Technology, ACIT 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350384307
State: Published - 2023
Externally published: Yes
Event: 24th International Arab Conference on Information Technology, ACIT 2023 - Ajman, United Arab Emirates
Duration: 6 Dec 2023 to 8 Dec 2023

Publication series

Name: 2023 24th International Arab Conference on Information Technology, ACIT 2023

Conference

Conference: 24th International Arab Conference on Information Technology, ACIT 2023
Country/Territory: United Arab Emirates
City: Ajman
Period: 6/12/23 to 8/12/23

Keywords

  • Nash policy iterations
  • Pareto policy iterations
  • Reinforcement learning
  • linear-quadratic dynamic optimization
