Fault detection method for batch processes based on deep long short-term memory network and batch normalization

  Abstract: Traditional data-driven fault diagnosis methods for batch processes often need to assume a distribution for the process data, and they tend to produce false alarms and missed detections when monitoring nonlinear and other complex data. To address this, a supervised learning method combining a Long Short-Term Memory (LSTM) network with Batch Normalization (BN) was proposed, which requires no assumption about the distribution of the original data. Firstly, the raw batch process data were preprocessed by variable-wise unfolding and continuous sampling so that the processed data could be fed to the LSTM units. Then, an improved deep LSTM network was used for feature learning; by adding BN layers and using a cross-entropy loss, the network was able to extract the features of the batch process data effectively and learn quickly. Finally, a simulation experiment was performed on a semiconductor etching process. The experimental results show that, compared with the Multilinear Principal Component Analysis (MPCA) method, the proposed method identifies more types of faults, identifies the various faults effectively, and reaches an overall fault detection rate above 95%. Compared with the traditional single-layer LSTM model, it builds the model faster and raises the overall fault detection rate by more than 8 percentage points, making it well suited to fault detection problems in batch processes with nonlinear and multi-mode characteristics.
  Key words: data driven; deep learning; Long Short-Term Memory (LSTM) network; batch process; fault detection
  CLC number: TP277
  Document code: A
  0 Introduction
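  As industrial systems grow larger and more complex, traditional data-driven fault diagnosis methods can no longer meet the demands imposed by industrial big data: process data are large in volume and varied in type, yet low in value density. Although the data have many dimensions, not all of them are useful or valuable for a monitoring and diagnosis task [1]. The batch production process [2] is a class of complex industrial process in which production is carried out at the same location but in batches at different times. Its operating state is unsteady and its process parameters vary with time; because different operating phases have different process characteristics, the monitored variables are affected along the time dimension. Traditional fault diagnosis methods based on multivariate statistical analysis, such as Principal Component Analysis (PCA) and Partial Least Squares (PLS), are widely used in fault diagnosis [3-5], but they perform poorly in fault detection for batch processes that are multi-stage, nonlinear and non-Gaussian. For example, traditional PCA assumes that the process is linear; in particular, determining the control limits of the Hotelling's T-squared (T²) statistic and the Squared Prediction Error (SPE) statistic requires assuming that the variables follow a multivariate Gaussian distribution [6], and these assumptions are usually hard to satisfy in real production. The multi-phase batch process fault detection method based on Support Vector Data Description (SVDD) proposed in [7] divides a batch process into phases according to changes in the radius of the SVDD hypersphere built from time-slice sample sets and in the number of its support vectors. It does not assume that the process data are normally distributed or that the variables are linearly correlated, and it achieves both phase division and fault detection for multi-phase batch processes; however, when faced with batch processes whose data are large in volume and varied in type, it is slow to build a model and prone to overfitting. Reference [8] proposes a fault detection method based on the k-nearest neighbor rule, which accommodates nonlinearity and multiple operating modes in the data and achieves good results in applications; however, it still needs control limits set from a statistical significance level and assumes that the original data are Gaussian, and the experimental results show some error when detecting complex data with non-Gaussian and similar characteristics. By contrast, Long Short-Term Memory (LSTM) units [9] from deep learning can learn and extract the features of nonlinear, multi-phase or multi-mode batch processes well, without any assumption about the distribution of the original data, learning features entirely from the process data.
  The concept of deep learning originates from research on neural networks [10]; a multilayer perceptron with several hidden layers is the defining feature of a deep learning model. Compared with ordinary artificial neural networks, deep learning algorithms approximate complex nonlinear functions better and offer many techniques for the vanishing gradients and overfitting that affect ordinary multilayer neural networks; they need fewer parameters than shallow neural networks, and both convergence speed and classification accuracy improve [10]. The basic deep learning model is the Deep Neural Network (DNN). In fault diagnosis, many framework models have been developed from it, including the Deep Belief Network (DBN) [11], the Convolutional Neural Network (CNN) [12], the Stacked Autoencoder (SAE) [13] and the Recurrent Neural Network (RNN) [14]. Among them, the RNN is a neural network with memory units; it takes full account of the correlation between sample batches, can process time-series or sequentially related data, and is suited to real-time fault diagnosis of complex equipment or systems. For example, [15] used a recurrent deep neural network to model the operating behavior of a wind power generation system: a dynamic neural network model was constructed to simulate the behavior of the normal system, and residuals were obtained by comparing the real system with the model. The simulation showed that the method detects faults in a very short time with a very low false alarm rate, which also indicates that RNNs are well suited to problems highly correlated with time series. LSTM is an improvement on the RNN that effectively mitigates the vanishing-gradient problem that arises when RNN layers are stacked [16].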
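  To make the role of the LSTM unit more concrete, the following is a minimal sketch of one forward step of a standard LSTM cell, written in plain Python for a single sample. It is the textbook formulation, not necessarily the exact variant used in this paper; the function name `lstm_step` and the layout of `params` are assumptions made for illustration. The additive update of the cell state `c` is the part usually credited with easing the vanishing-gradient problem.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One forward step of a standard LSTM cell (illustrative sketch).

    x       : list of input features at the current time step
    h_prev  : previous hidden state (list of length H)
    c_prev  : previous cell state (list of length H)
    params  : hypothetical layout, dict gate -> (W, U, b) with
              W of shape H x len(x), U of shape H x H, b of length H;
              gates are 'i' (input), 'f' (forget), 'o' (output), 'g' (candidate)
    """
    def affine(gate):
        W, U, b = params[gate]
        return [
            sum(w * xi for w, xi in zip(W[k], x))
            + sum(u * hi for u, hi in zip(U[k], h_prev))
            + b[k]
            for k in range(len(b))
        ]

    i = [sigmoid(v) for v in affine("i")]    # how much new information to write
    f = [sigmoid(v) for v in affine("f")]    # how much old cell state to keep
    o = [sigmoid(v) for v in affine("o")]    # how much cell state to expose
    g = [math.tanh(v) for v in affine("g")]  # candidate cell content

    # Additive cell-state update: old state scaled by f plus gated new content.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c
```

  Calling `lstm_step` repeatedly over the time steps of one preprocessed sample, carrying `h` and `c` forward, gives the sequence of hidden states that a classifier layer would consume.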
  4 Conclusion
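  This paper addresses fault detection for batch processes by building an LSTM-BN deep learning network to monitor batch process faults, and carries out a simulation experiment on a semiconductor etching process. The results show that the LSTM-BN deep learning network is effective for batch process fault detection and achieves high accuracy. Compared with the commonly used MPCA and DNN-BN methods, the LSTM-BN model is well suited to problems highly correlated with time series: it needs no assumption about the distribution of the original data and retains time-series information well. It also builds a model faster than the traditional single-layer LSTM model.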
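  As an illustration of the BN component, the sketch below shows the training-mode batch normalization transform of [17] applied to a mini-batch of feature vectors. It is a simplified stand-in: the function name `batch_norm` and the default `eps` are assumptions, where the BN layers sit inside the network is not specified here, and the running statistics used at inference time are omitted.

```python
import math


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Training-mode batch normalization over a mini-batch (illustrative sketch).

    batch : list of samples, each a list of d features
    gamma : per-feature scale (length d), learned in a real network
    beta  : per-feature shift (length d), learned in a real network
    eps   : small constant for numerical stability (assumed value)
    """
    n = len(batch)
    d = len(batch[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in batch]
        mean = sum(col) / n
        # Biased (divide-by-n) variance of the feature over the mini-batch.
        var = sum((v - mean) ** 2 for v in col) / n
        inv_std = 1.0 / math.sqrt(var + eps)
        for r in range(n):
            # Normalize to zero mean and unit variance, then scale and shift.
            out[r][j] = gamma[j] * (col[r] - mean) * inv_std + beta[j]
    return out
```

  Because every feature is rescaled to a comparable range before the next layer, features with very different magnitudes contribute on a similar scale, which is the usual explanation for the faster training attributed to BN.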
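  In the experiments of this paper, the fault set is clearly smaller than the normal set, which makes supervised learning prone to overfitting. The LSTM network model, however, can keep learning and updating: when a new sample is known to be a fault but goes undetected, that sample can be passed through the loss function again to update the parameters. In other words, when more data become available the model can continue to learn the characteristics of the new data and so improve its detection rate and generalization ability, which the traditional MPCA model cannot do.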
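  The incremental update described above can be sketched as follows. To keep the example short and self-contained, a linear softmax classifier stands in for the full LSTM-BN network, and plain stochastic gradient descent stands in for whatever optimizer is actually used; the names `sgd_update`, `W`, `b` and the learning rate are assumptions made for illustration.

```python
import math


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def cross_entropy(p, label):
    """Cross-entropy loss for one sample whose true class index is `label`."""
    return -math.log(p[label])


def sgd_update(W, b, x, label, lr=0.1):
    """One gradient step on a single newly labeled sample (illustrative sketch).

    W, b  : weights (K x d) and biases (K) of a linear softmax classifier,
            updated in place
    x     : feature vector of the new sample
    label : its true class index (for example, a fault class that was missed)
    Returns the loss measured before the update.
    """
    logits = [sum(w * xi for w, xi in zip(W[k], x)) + b[k] for k in range(len(b))]
    p = softmax(logits)
    loss = cross_entropy(p, label)
    for k in range(len(b)):
        # Gradient of the cross-entropy with respect to logit k is p_k - y_k.
        grad = p[k] - (1.0 if k == label else 0.0)
        for j in range(len(x)):
            W[k][j] -= lr * grad * x[j]
        b[k] -= lr * grad
    return loss


# Usage: feed the same missed fault sample twice; the loss should drop.
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
loss_before = sgd_update(W, b, [1.0, -1.0], label=1)
loss_after = sgd_update(W, b, [1.0, -1.0], label=1)
```

  In a real network the same idea applies to all trainable parameters through backpropagation; repeated updates on a few fault samples would normally be mixed with normal samples to avoid drifting toward the minority class.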
  References:
  [1] REN H, QU J F, CHAI Y, et al. Deep learning for fault diagnosis: The state of the art and challenge [J]. Control and Decision, 2017, 32(8): 1345-1358.
  [2] ZHAO C H, WANG F L, YAO Y, et al. Phase-based statistical modeling, online monitoring and quality prediction for batch processes [J]. Acta Automatica Sinica, 2010, 36(3): 366-374.
  [3] HUNG H, WU P, TU I, et al. On multilinear principal component analysis of order-two tensors [J]. Biometrika, 2012, 99(3): 569-583.
  [4] WANG J, HE Q P, QIN S J, et al. Recursive least squares estimation for run-to-run control with metrology delay and its application to STI etch process [J]. IEEE Transactions on Semiconductor Manufacturing, 2005, 18(2): 309-319.
  [5] YU J. Fault detection using principal components-based Gaussian mixture model for semiconductor manufacturing processes [J]. IEEE Transactions on Semiconductor Manufacturing, 2011, 24(3): 432-444.
  [6] JACKSON J E, MUDHOLKAR G S. Control procedures for residuals associated with principal component analysis [J]. Technometrics, 2012, 21(3): 341-349.
  [7] WANG J L, MA L Y, QIU K P, et al. Multi-phase batch processes fault detection based on support vector data description [J]. Chinese Journal of Scientific Instrument, 2017, 38(11): 2752-2761.
  [8] HE Q P, WANG J. Fault detection using the k-nearest neighbor rule for semiconductor manufacturing processes [J]. IEEE Transactions on Semiconductor Manufacturing, 2007, 20(4): 345-354.
  [9] GRAVES A. Supervised Sequence Labelling with Recurrent Neural Networks[M]. Berlin: Springer, 2012: 37-45.
  [10] HINTON G E, SALAKHUTDINOV R R. Reducing the dimensionality of data with neural networks [J]. Science, 2006, 313(5786): 504-507.
  [11] WU S, ZHANG L, ZHENG W, et al. A DBN-based risk assessment model for prediction and diagnosis of offshore drilling incidents [J]. Journal of Natural Gas Science and Engineering, 2016, 34: 139-158.
  [12] SUN J, XIAO Z, XIE Y. Automatic multi-fault recognition in TFDS based on convolutional neural network [J]. Neurocomputing, 2017, 222: 127-136.
  [13] LU C, WANG -Y, QIN W-L, et al. Fault diagnosis of rotary machinery components using a stacked denoising autoencoder-based health state identification [J]. Signal Processing, 2017, 130: 377-388.
  [14] de TIM B, VERBERT K, BABUSKA R. Railway track circuit fault diagnosis using recurrent neural networks[J]. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(3): 523-533.
  [15] TALEBI N, SADRNIA M A, DARABI A. Robust fault detection of wind energy conversion systems based on dynamic neural networks [J]. Computational Intelligence and Neuroscience, 2014, 4(7): 580972.
  [16] PASCANU R, MIKOLOV T, BENGIO Y. On the difficulty of training recurrent neural networks [C]// Proceedings of the 30th International Conference on Machine Learning: Vol. 28. Atlanta, GA: JMLR, 2013: 1310-1318. https://arxiv.org/pdf/1211.5063.pdf
  [17] IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift [C]// Proceedings of the 32nd International Conference on Machine Learning: Vol. 37. Atlanta, GA: JMLR, 2015: 448-456. https://arxiv.org/pdf/1502.03167.pdf
  [18] GOODFELLOW I, BENGIO Y, COURVILLE A. Deep learning [M]. Cambridge, MA: MIT Press, 2016: 172-187.
  [19] DUCHI J, HAZAN E, SINGER Y. Adaptive subgradient methods for online learning and stochastic optimization [J]. Journal of Machine Learning Research, 2011, 12: 2121-2159.
  [20] WISE B M, GALLAGHER N B, BUTLER S W, et al. A comparison of principal component analysis, multiway principal component analysis, trilinear decomposition and parallel factor analysis for fault detection in a semiconductor etch process [J]. Journal of Chemometrics, 1999, 13(3/4): 379-396.
  [21] CHANG Y Q, WANG S, TAN S, et al. Research on multistage-based MPCA modeling and monitoring method for batch processes [J]. Acta Automatica Sinica, 2010, 36(9): 1312-1320.
  [22] TAO D Q, BO C M, YI H. Semiconductor etch process monitoring based on multi-stage MPCA [J]. Chinese Journal of Sensors and Actuators, 2015, 28(6): 798-802.
  [23] GLOROT X, BORDES A, BENGIO Y. Deep sparse rectifier neural networks [C]//Proceedings of the 2011 Fourteenth International Conference on Artificial Intelligence and Statistics: Vol. 15. Atlanta, GA: JMLR, 2011: 315-323.