Hongpeng Zhang *, Huan Zhou, Yujie Wei and Changqiang Huang
Aeronautics Engineering College, Air Force Engineering University, Xi'an, China

Autonomous maneuver decision-making methods for air combat often rely on human knowledge, such as advantage functions, objective functions, or dense rewards in reinforcement learning, which limits the decision-making ability of unmanned combat aerial vehicles to the scope of human experience and results in slow progress in maneuver decision-making. Therefore, a maneuver decision-making method based on deep reinforcement learning and Monte Carlo tree search is proposed to investigate whether maneuver decision-making is feasible without human knowledge or advantage functions. To this end, Monte Carlo tree search in continuous action space is proposed, and neural-network-guided Monte Carlo tree search with self-play is utilized to improve the ability of air combat agents. The method starts from random behaviors and generates samples consisting of states, actions, and results of air combat through self-play without using human knowledge. These samples are used to train the neural network, and the neural network with a greater winning rate is selected by simulations. The above process is then repeated to gradually improve the maneuver decision-making ability. Simulations are conducted to verify the effectiveness of the proposed method, and the kinematic model of the missile is used in simulations instead of the missile engagement zone to test whether the maneuver decision-making method is effective. The simulation results for fixed and random initial states show that the proposed method is efficient and can meet the real-time requirement.

Autonomous air combat through unmanned combat aerial vehicles is the future of air combat, and maneuver decision-making is the core of autonomous air combat. Therefore, it is urgent to build maneuver decision-making methods. Maneuver decision-making means that the aircraft chooses the appropriate maneuver (e.g., normal overload, tangential overload, and roll angle) to change its state according to the acquired information of the target (e.g., azimuth, velocity, height, and distance), so as to defeat the target.

Air combat can be divided into within-visual-range air combat and beyond-visual-range air combat. With the development of science and technology, the detection distance of airborne radar and the range of air-to-air missiles have been increased to hundreds of kilometers. Therefore, both sides of the air combat can discover each other and launch missiles beyond visual range. Besides, the process of beyond-visual-range air combat is different from that of within-visual-range air combat because the principles and operation methods of radar-guided missiles and infrared (IR) guided missiles are different. Radar-guided missiles are supposed to be used beyond visual range and IR-guided missiles within visual range, because the detection range of the radar is longer than that of the IR detector. The IR-guided missile does not need external equipment to provide target information after it is launched. It can obtain information about the target by means of its infrared detector and then attack the target. Therefore, the aircraft can retreat after launching missiles. However, after launching, there are two stages in the attack of radar-guided missiles, which are called the midcourse guidance stage and the terminal guidance stage. In the midcourse guidance stage, the radar of the missile is not activated. Thus, it is necessary for the aircraft radar to continuously detect the target, providing the information for the missile and guiding it to the target. During the terminal guidance stage, the missile continues to chase the target according to the information provided by its own radar until it hits the target or loses the target.
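
The self-play procedure outlined in the abstract (generate samples by self-play, train the network, keep whichever network wins more often, repeat) can be sketched roughly as follows. This is only an illustrative skeleton, not the authors' implementation: the Monte Carlo tree search and the neural network are replaced by toy stand-ins, and every name here (`self_play_training`, `toy_play_game`, `toy_train`), as well as the "more than half of the evaluation games" selection threshold, is an assumption made for the example.

```python
import random
from typing import List, Tuple

# All names below are hypothetical stand-ins, not the authors' implementation.
Sample = Tuple[float, float, int]  # (state, action, result): toy scalars, result is +1 or -1


def self_play_training(agent, play_game, train, iterations, self_play_games,
                       eval_games, rng):
    """Repeat: self-play, train a candidate, keep whichever agent wins more."""
    best = agent
    for _ in range(iterations):
        # 1. Generate (state, action, result) samples by letting the agent play itself.
        samples: List[Sample] = []
        for _ in range(self_play_games):
            _, game_samples = play_game(best, best, rng)
            samples.extend(game_samples)
        # 2. Train a candidate agent on those samples.
        candidate = train(best, samples)
        # 3. Evaluate the candidate against the current best (winner 0 = candidate).
        wins = sum(play_game(candidate, best, rng)[0] == 0 for _ in range(eval_games))
        # 4. Keep the candidate only if it wins more than half (assumed threshold).
        if wins > eval_games / 2:
            best = candidate
    return best


# Toy environment for illustration only: an "agent" is a single positive skill number.
def toy_play_game(a, b, rng):
    winner = 0 if rng.random() < a / (a + b) else 1
    result = 1 if winner == 0 else -1
    # One sample per side: a random state and action, plus that side's result.
    samples = [(rng.random(), rng.random(), result),
               (rng.random(), rng.random(), -result)]
    return winner, samples


def toy_train(agent, samples):
    # Stand-in for a network update: more samples give a slightly stronger agent.
    return agent + 0.01 * len(samples)


final_agent = self_play_training(1.0, toy_play_game, toy_train, iterations=5,
                                 self_play_games=10, eval_games=20,
                                 rng=random.Random(0))
print(final_agent)
```

In a real system, `play_game` would run the network-guided tree search in the air combat simulation and `train` would fit the network to the collected samples; the outer loop is the only part this sketch is meant to convey.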