Heuristic Q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle.
Abstract
To address the convergence difficulty that existing reinforcement learning algorithms face in the large state space of three-dimensional unmanned aerial vehicle (UAV) path planning, this article proposes a reinforcement learning algorithm that combines a heuristic function with an experience replay mechanism based on the maximum average reward. Knowledge of track performance is used to construct the heuristic function, which guides the UAV's action selection and reduces useless exploration. The experience replay mechanism based on the maximum average reward increases the utilization of high-quality samples and accelerates convergence. Simulation results show that the proposed three-dimensional path planning algorithm learns efficiently, with significantly improved convergence speed and training performance.
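The abstract describes two mechanisms layered on tabular Q-learning: a heuristic function that biases action selection toward promising moves, and a replay buffer that preferentially replays episodes with the highest average reward. The paper's exact formulation is not reproduced here, so the following Python sketch is only illustrative: the class name, the use of negative Manhattan distance to the goal as the heuristic, and all parameter values (`alpha`, `gamma`, `epsilon`, `beta`) are assumptions, not the authors' method.

```python
import random
from collections import defaultdict


class HeuristicQAgent:
    """Illustrative Q-learning agent on a 3D grid with a heuristic action
    bias and replay of the episodes with the highest average reward."""

    # Six axis-aligned moves in 3D space.
    ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
               (0, -1, 0), (0, 0, 1), (0, 0, -1)]

    def __init__(self, goal, alpha=0.5, gamma=0.9, epsilon=0.2, beta=1.0):
        self.goal = goal
        self.alpha = alpha          # learning rate
        self.gamma = gamma          # discount factor
        self.epsilon = epsilon      # exploration rate
        self.beta = beta            # weight of the heuristic bias
        self.q = defaultdict(float)  # Q[(state, action)] -> value
        self.replay = []             # list of (episode, avg_reward)

    def heuristic(self, state, action):
        # Assumed heuristic: negative Manhattan distance to the goal after
        # taking `action`, so moves toward the goal score higher.
        nxt = tuple(s + a for s, a in zip(state, action))
        return -sum(abs(n - g) for n, g in zip(nxt, self.goal))

    def select_action(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.ACTIONS)
        # Greedy over Q plus the heuristic bonus, cutting useless exploration.
        return max(self.ACTIONS,
                   key=lambda a: self.q[(state, a)]
                   + self.beta * self.heuristic(state, a))

    def update(self, s, a, r, s2):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(s2, a2)] for a2 in self.ACTIONS)
        self.q[(s, a)] += self.alpha * (r + self.gamma * best_next
                                        - self.q[(s, a)])

    def store_episode(self, transitions):
        # Rank each stored episode by its average per-step reward.
        avg = sum(t[2] for t in transitions) / len(transitions)
        self.replay.append((transitions, avg))

    def replay_best(self, k=1):
        # Re-learn from the k episodes with the highest average reward.
        best = sorted(self.replay, key=lambda e: -e[1])[:k]
        for transitions, _ in best:
            for s, a, r, s2 in transitions:
                self.update(s, a, r, s2)
```

With an all-zero Q-table the heuristic alone already steers the greedy choice toward the goal, which mirrors the abstract's claim that the heuristic reduces useless exploration before learning converges; replayed high-reward episodes then sharpen the Q-values around good trajectories.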
| Field | Value |
|---|---|
| Reference Key | xie2019heuristicscience |
| Authors | Xie, Ronglei; Meng, Zhijun; Zhou, Yaoming; Ma, Yunpeng; Wu, Zhe |
| Journal | Science Progress |
| Year | 2019 |
| DOI | 10.1177/0036850419879024 |

Use the reference key to autocite while using the SciMatic Manuscript Manager or Thesis Manager.