Annual schedules generation for military construction planning based on improved Deep SubQ-Network algorithm

Ziyi CHEN, Kewei YANG, Yajie DOU, Jiang JIANG, Yuejin TAN

Systems Engineering - Theory & Practice ›› 2025, Vol. 45 ›› Issue (5) : 1673-1686. DOI: 10.12011/SETP2023-1126

Abstract

Compiling the annual schedule for military construction planning means drawing up the next year's annual construction schedule during plan execution, based on the actual progress and usage feedback of the current year; the schedule is an important basis for executing the plan. Drawing on deep reinforcement learning and multi-objective optimization theory, this paper proposes a dynamic multi-objective optimization algorithm based on an improved Deep SubQ-Network (DSubQN) to assist in generating annual schedules for military construction planning. First, starting from the process of executing military construction planning and compiling annual schedules, we analyze the dynamic characteristics of the problem. Then, we construct a mathematical model for the iterative multi-objective optimization of the annual construction schedules of planned projects and describe the objective function in recursive form. Building on a typical deep reinforcement learning framework, we propose the SubQ network, which makes it possible to solve multi-objective optimization problems with deep reinforcement learning, and we design an iterative optimization algorithm that generates the annual construction schedule year by year. Finally, an illustrative example based on simulated data compares the approach with other optimization methods and verifies the feasibility and advantages of the proposed model and algorithm.

Key words

military construction planning / military construction projects / dynamic programming / multi-objective optimization / deep reinforcement learning

Cite this article

Ziyi CHEN, Kewei YANG, Yajie DOU, Jiang JIANG, Yuejin TAN. Annual schedules generation for military construction planning based on improved Deep SubQ-Network algorithm. Systems Engineering - Theory & Practice, 2025, 45(5): 1673-1686 https://doi.org/10.12011/SETP2023-1126

CLC number: N945

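Funding

National Natural Science Foundation of China (72231011); High-Level Innovative Talent Training Program of the National University of Defense Technology; Youth Independent Innovation Science Fund of the National University of Defense Technology (ZK24-28)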

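Copyright

Editorial Office of Systems Engineering - Theory & Practice, 2025. All rights reserved. Reproduction without authorization is prohibited.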