Automotive Innovation ›› 2022, Vol. 5 ›› Issue (4): 415-426. DOI: 10.1007/s42154-022-00195-z


Approximate Optimal Filter Design for Vehicle System through Actor-Critic Reinforcement Learning

Yuming Yin1  · Shengbo Eben Li2 · Kaiming Tang2 · Wenhan Cao2 · Wei Wu3 · Hongbo Li3
  

  1. School of Mechanical Engineering, Zhejiang University of Technology; 2. School of Vehicle and Mobility, Tsinghua University; 3. Beijing Geekplus Tech. Co., Ltd
  • Online: 2022-11-20  Published: 2022-12-01

Abstract: Precise state and parameter estimation is essential for the identification, analysis, and control of vehicle engineering problems, especially under significant model and measurement uncertainties. Widely used filtering/estimation algorithms, such as the Kalman-type filters (Kalman filter, extended Kalman filter, and unscented Kalman filter) and the particle filter, generally aim to approach the true state/parameter distribution by iteratively updating the filter gain at each time step. However, the optimality of these filters can be degraded by unrealistic initial conditions or significant model errors. Alternatively, this paper proposes to approximate the optimal filter gain by considering the influencing factors over an infinite time horizon, on the basis of the estimation-control duality. The proposed approximate optimal filter (AOF) problem is formulated and subsequently solved with an actor-critic reinforcement learning (RL) method. The AOF design transforms the traditional optimal filtering problem, which minimizes the expected mean square error, into an optimal control problem that minimizes the accumulated estimation error, where the estimation error serves as the surrogate system state and the infinite-horizon filter gain is the control input. The estimation-control duality is proved to hold under certain conditions on the initial vehicle state distribution and the policy structure. To evaluate the effectiveness of AOF, a vehicle state estimation problem is then demonstrated, and the resulting filter is compared with the steady-state Kalman filter. The results show that the filter policies obtained via RL with different discount factors converge to the theoretical optimal gain with an error within 5%, and the average estimation errors of the vehicle slip angle and yaw rate are less than 1.5 × 10⁻⁴.
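
To make the estimation-control duality concrete, the following sketch states it for an assumed linear time-invariant model with a constant (infinite-horizon) filter gain; the error dynamics follow directly from the filter equations, while the paper's full formulation also establishes the conditions under which the duality holds.

```latex
\[
\begin{aligned}
& x_{k+1} = A x_k + w_k, \qquad y_k = C x_k + v_k
  && \text{(assumed LTI model)} \\
& \hat{x}_{k+1} = A \hat{x}_k + K\,(y_k - C \hat{x}_k)
  && \text{(constant-gain filter, predictor form)} \\
& e_{k+1} = (A - KC)\, e_k + w_k - K v_k
  && \text{(error dynamics, } e_k = x_k - \hat{x}_k\text{)} \\
& \min_{K}\; \mathbb{E}\Big[\textstyle\sum_{k=0}^{\infty} \gamma^{k}\, e_k^{\top} e_k\Big]
  && \text{(surrogate optimal control problem)}
\end{aligned}
\]
```

A minimal numerical sketch of this surrogate control problem is given below. The model matrices, noise covariances, and hyperparameters are illustrative assumptions rather than the paper's vehicle model, and the simple zeroth-order gain search is a stand-in for the actor-critic RL algorithm used by the authors; the point is only to show the problem structure and the comparison against the steady-state Kalman gain.

```python
# Toy illustration of the AOF idea: treat the estimation error as the state of
# a surrogate control problem whose "control input" is a constant filter gain,
# then learn that gain and compare it with the steady-state Kalman gain.
#
# All numbers (A, C, Q, R, horizons, learning rate) are illustrative
# assumptions, and the zeroth-order search below is a simple stand-in for the
# paper's actor-critic method.
import numpy as np
from scipy.linalg import solve_discrete_are

rng = np.random.default_rng(0)

# Hypothetical 2-state linear model (think slip angle / yaw rate), 1 output.
A = np.array([[0.95, 0.10],
              [-0.05, 0.90]])
C = np.array([[0.0, 1.0]])          # only the second state is measured
Q = 1e-4 * np.eye(2)                # process-noise covariance
R = np.array([[1e-3]])              # measurement-noise covariance

# Reference: steady-state (predictor-form) Kalman gain from the filtering DARE.
P = solve_discrete_are(A.T, C.T, Q, R)
K_ref = A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + R)

def sample_noise(rollouts=16, horizon=150):
    """Pre-sample initial errors and noise sequences (common random numbers)."""
    batch = []
    for _ in range(rollouts):
        e0 = rng.normal(0.0, 0.1, size=(2, 1))
        W = rng.multivariate_normal(np.zeros(2), Q, size=horizon)[..., None]
        V = rng.normal(0.0, np.sqrt(R[0, 0]), size=(horizon, 1, 1))
        batch.append((e0, W, V))
    return batch

def cost(K, batch, gamma=0.99):
    """Discounted accumulated squared estimation error under a fixed gain K.

    Surrogate 'plant': e_{k+1} = (A - K C) e_k + w_k - K v_k.
    """
    J = 0.0
    for e0, W, V in batch:
        e, disc = e0.copy(), 1.0
        for w, v in zip(W, V):
            e = (A - K @ C) @ e + w - K @ v
            J += disc * float(e.T @ e)
            disc *= gamma
    return J / len(batch)

# Zeroth-order policy search on the gain (a crude actor-update surrogate).
K, sigma, lr = np.zeros((2, 1)), 0.05, 2e-3
for it in range(300):
    batch = sample_noise()
    d = rng.normal(0.0, sigma, size=K.shape)
    # Two-point gradient estimate with shared noise for variance reduction.
    g = (cost(K + d, batch) - cost(K - d, batch)) / (2.0 * sigma**2) * d
    K -= lr * g

print("learned gain:\n", K)
print("steady-state Kalman gain:\n", K_ref)
```

Because the two perturbed cost evaluations share the same pre-sampled noise (common random numbers), the two-point gradient estimate is comparatively low-variance, and the learned gain moves toward the DARE-based Kalman gain, mirroring in miniature the paper's reported convergence to within 5% of the theoretical optimal gain.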