Introduction

Copyright © 2022 Intelligent Driving Laboratory (iDLab). All rights reserved.

Description

Solving optimal control problems is a basic requirement of industrial control tasks. Existing methods such as model predictive control often suffer from a heavy online computational burden. Reinforcement learning (RL) has shown great promise in computer and board games but has yet to be widely adopted in industrial applications because accessible, high-accuracy solvers are lacking. Therefore, our team, the “Intelligent Driving Lab (iDLab)” at Tsinghua University, has developed the General Optimal control Problems Solver (GOPS), an easy-to-use RL solver package that aims to build real-time, high-performance controllers for industrial applications.

GOPS has a highly modular structure that leaves a flexible framework for secondary development. Considering the diversity of industrial control tasks, GOPS also includes a conversion tool that lets users work in Matlab/Simulink for environment construction, controller design, and performance validation. To handle large-scale control problems, GOPS can automatically create various serial and parallel trainers by flexibly combining embedded buffers and samplers. It offers a variety of common approximate functions for the policy and the value function, including polynomials, multilayer perceptrons, convolutional neural networks, etc. Constrained and robust training algorithms for industrial control systems with state constraints and model uncertainties are also integrated into GOPS.

GOPS provides a variety of algorithms for solving optimal control problems. The built-in algorithms cover the mainstream categories of RL: model-free and model-based, on-policy and off-policy, and direct and indirect. Currently supported algorithms are shown as follows:

Features

The main features of GOPS are summarized as follows:

  1. GOPS adopts a highly modular configuration that allows for easy secondary development of environments and algorithms, making it accessible for users without professional RL knowledge or programming skills.

  2. GOPS supports multiple training modes for handling complex and large-scale problems, including serial and parallel modes for on-policy and off-policy, model-free and model-based, and direct and indirect algorithms. It can handle special requirements from industrial control, such as explicit policies, state constraints, and model uncertainties.

  3. Considering the widespread use of Matlab/Simulink in industrial control, GOPS offers a convenient conversion tool to support high-performance controller design for Simulink models. This tool transforms existing Simulink models into GOPS-compatible environments and allows for performance validation and controller deployment by sending the learned policy back to Simulink.
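
To make the idea of a serial trainer built from a sampler and a buffer more concrete, here is a minimal sketch of the pattern for an off-policy algorithm. It is not the GOPS API: the names ToySampler, ToyBuffer, and serial_off_policy_train, the 1-D toy dynamics, and the placeholder update step are all hypothetical and exist only for illustration.

```python
import random
from collections import deque


class ToySampler:
    """Hypothetical sampler: rolls a policy out in a 1-D toy system."""

    def __init__(self, policy, seed=0):
        self.policy = policy
        self.rng = random.Random(seed)
        self.state = 1.0

    def sample(self, n_steps):
        transitions = []
        for _ in range(n_steps):
            action = self.policy(self.state)
            # Toy linear dynamics with small noise and a quadratic cost.
            next_state = 0.9 * self.state + action + self.rng.gauss(0.0, 0.01)
            cost = self.state ** 2 + action ** 2
            transitions.append((self.state, action, cost, next_state))
            self.state = next_state
        return transitions


class ToyBuffer:
    """Hypothetical replay buffer; the oldest data is dropped when full."""

    def __init__(self, capacity, seed=0):
        self.data = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, transitions):
        self.data.extend(transitions)

    def draw(self, batch_size):
        k = min(batch_size, len(self.data))
        return self.rng.sample(list(self.data), k)


def serial_off_policy_train(sampler, buffer, update, iterations,
                            steps_per_iter=8, batch_size=4):
    """Alternate sampling and updating inside a single process."""
    for _ in range(iterations):
        buffer.add(sampler.sample(steps_per_iter))
        update(buffer.draw(batch_size))
    return buffer


batch_sizes = []
trained_buffer = serial_off_policy_train(
    ToySampler(policy=lambda s: -0.5 * s),
    ToyBuffer(capacity=20),
    # Stand-in for a real learner step: only records the batch size.
    update=lambda batch: batch_sizes.append(len(batch)),
    iterations=5,
)
```

In a parallel mode, several samplers would typically feed the same buffer from separate processes. The sketch keeps everything in one loop so that the division of roles stays visible.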

Installation

Installation requirements:

  1. Operating system: compatible with Windows 7 or later, as well as Ubuntu 18.04 or later.

  2. Python version: Python 3.6 or later. For proper functioning with Matlab/Simulink models, it is necessary to install Python 3.8.

  3. Matlab/Simulink (Optional): Matlab/Simulink 2018a or later. This is not mandatory, but it enables seamless integration and enhanced functionality with Matlab/Simulink.

  4. The installation path must be in English and must not contain any special characters.
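
The Python and path requirements above can be checked before installing. The following is a minimal sketch, not part of GOPS: the function name check_requirements is hypothetical, and the reading of "English, no special characters" as ASCII letters, digits, and a few common path symbols is an assumption.

```python
def check_requirements(version_info, install_path, use_simulink=False):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    major_minor = tuple(version_info[:2])
    if major_minor < (3, 6):
        problems.append("Python 3.6 or later is required")
    if use_simulink and major_minor != (3, 8):
        problems.append("Python 3.8 is needed for Matlab/Simulink models")
    # Assumption: only ASCII letters and digits plus these path symbols are
    # accepted; spaces are treated as special characters under this reading.
    allowed_symbols = set("/\\:._-")
    bad = sorted({
        ch for ch in install_path
        if not (ord(ch) < 128 and ch.isalnum()) and ch not in allowed_symbols
    })
    if bad:
        problems.append(
            "Installation path contains unsupported characters: " + "".join(bad)
        )
    return problems
```

For the running interpreter and the current directory, the call would be check_requirements(sys.version_info, os.getcwd()) after importing sys and os.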

Installation steps:

  1. Clone the GOPS repository and change to the GOPS directory:

git clone https://github.com/Intelligent-Driving-Laboratory/GOPS.git
cd GOPS
  2. Create and activate the conda environment for GOPS:

conda env create -f gops_environment.yml
conda activate gops
  3. Install GOPS and its required packages:

pip install -e .
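
After the steps above, a quick way to confirm that the installation is visible to Python is to ask the import system whether it can locate the package. This sketch assumes that the editable install exposes a top-level package named gops; the name is an assumption, since only the repository and environment names are stated here. The helper is_installed is hypothetical.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


# Assumption: the package is importable as "gops".
status = "found" if is_installed("gops") else "not found"
print("gops package:", status)
```

Run the check inside the activated conda environment. A "not found" result usually means that a different interpreter is active or that the install step did not complete.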

Quick Start

To demonstrate GOPS, the following example uses the inverted double pendulum environment.

  1. Start training a policy with the command:

python example_train/fhadp/fhadp_mlp_idpendulum_serial.py
  2. After training is finished, test the policy with the command:

python example_run/run_idp_fhadp.py
  3. You can record a video by setting save_render=True in the file run_idp_fhadp.py. The video of testing this environment is shown as follows:

Cite GOPS

If you use GOPS in your research, please cite the following paper:

@article{gops,
    title={GOPS: A general optimal control problem solver for autonomous driving and industrial control applications},
    author={Wenxuan Wang and Yuhang Zhang and Jiaxin Gao and Yuxuan Jiang and Yujie Yang and Zhilong Zheng and Wenjun Zou and Jie Li and Congsheng Zhang and Wenhan Cao and Genjin Xie and Jingliang Duan and Shengbo Eben Li},
    journal={Communications in Transportation Research},
    volume={3},
    pages={100096},
    year={2023},
    issn={2772-4247},
    doi={10.1016/j.commtr.2023.100096},
}

Wang W, Zhang Y, Gao J, et al. GOPS: A general optimal control problem solver for autonomous driving and industrial control applications. Communications in Transportation Research, vol. 3, December 2023.

For more technical details, you can cite this book:

S Eben Li. Reinforcement Learning for Sequential Decision and Optimal Control. Springer Verlag, Singapore, 2023

Download GOPS

You can download the newest version of GOPS from https://github.com/Intelligent-Driving-Laboratory/GOPS.

Previous versions of GOPS are listed as follows:

Version: v1.1.0

Download URL: GitHub, Tsinghua Cloud

New Features:

  1. Add industrial optimal control environments.

  2. Add the sys_simulator module for testing trained policies.

  3. Integrate an MPC solver to serve as a baseline.

WeChat Group

To make GOPS easier to use and to build a good community, we have set up a WeChat group for GOPS users. Interested users are invited to join by scanning the QR code below. Developers answer usage questions in the group and fix problems in GOPS based on user feedback. New GOPS releases are also announced in the group.

Thanks to all users for your support of GOPS and to all developers for your contributions to GOPS. Let’s work together to make GOPS a valuable, easy-to-use, and popular software!

Contributors

Team Leader:

Shengbo Eben Li: Leader of GOPS.

Dr. Li is a tenured professor at Tsinghua University, where he leads the Intelligent Driving Laboratory (iDLab) at the School of Vehicle and Mobility. His active research interests include intelligent vehicles and driver assistance, deep reinforcement learning, optimal control and estimation, etc. He and his team have received best (student) paper awards from IEEE ITSC, ICCAS, IEEE ICUS, CCCC, ITSAPF, etc. He also serves on the Board of Governors of the IEEE ITS Society, as Senior AE of IEEE OJ ITS, and as AE of IEEE ITSM, IEEE Trans ITS, Automotive Innovation, etc.

Jingliang Duan: Co-leader of GOPS.

Dr. Duan is currently an associate professor in the School of Mechanical Engineering, University of Science and Technology Beijing, China. His research interests include reinforcement learning, optimal control, and self-driving decision-making.

Student Leader:

Wenxuan Wang: Up to GOPS V1.0

Yujie Yang: Working on GOPS V2.0

Team Members (in alphabetical order):

Baiyu Peng, Congsheng Zhang, Genjin Xie, Ziqing Gu, Hao Sun, Jiaxin Gao, Jie Li, Letian Tao, Tong Liu, Wenhan Cao, Wenjun Zou, Weixian He, Xujie Song, Yang Guan, Yinuo Wang, Yuhang Zhang, Yuheng Lei, Yuxuan Jiang, Zhilong Zheng.