您好, 访客   登录/注册

稀疏限制的增量式鲁棒非负矩阵分解及其应用

来源:用户上传      作者:

  摘 要:针对鲁棒非负矩阵分解(RNMF)的运算规模随训练样本数量逐渐增多而不断增大的问题,提出一种稀疏限制的增量式鲁棒非负矩阵分解算法。首先,对初始数据进行鲁棒非负矩阵分解;然后,将其分解结果参与到后续迭代运算;最后,在对系数矩阵增加稀疏限制的情况下与增量式学习相结合,使目标函数值在迭代求解时下降地更快。该算法在节省运算时间的同时提高了分解后数据的稀疏度。在数值实验中,将所提算法与鲁棒非负矩阵分解算法、稀疏限制的鲁棒非负矩阵分解(RNMFSC)算法进行了比较。在ORL和YALE人脸数据库上的实验结果表明,所提算法在运算时间和分解后数据的稀疏度等方面均优于其他两个算法,并且还具有较好的聚类效果,尤其在YALE人脸数据库上当聚类类别数为3时该算法的聚类准确率达到了91.67%。
  关键词:增量式学习;非负矩阵分解;稀疏限制;聚类;人臉识别
  中图分类号:TP181
  文献标志码:A
  Abstract: Aiming at the problem that the operation scale of Robust Nonnegative Matrix Factorization (RNMF) increases with the number of training samples, an incremental robust nonnegative matrix factorization algorithm with sparseness constraints was proposed. Firstly, robust nonnegative matrix factorization was performed on initial data. Then, the factorized result participated in the subsequent iterative operation. Finally, with sparseness constraints, the coefficient matrix was combined with incremental learning, which made the objective function value fall faster in the iterative solution. The cost of computation was reduced and the sparseness of data after factorization was improved. In the numerical experiments, the proposed algorithm was compared with RNMF algorithm and RNMF with Sparseness Constraints (RNMFSC) algorithm. The experimental results on ORL and YALE face databases show that the proposed algorithm is superior to the other two algorithms in terms of operation time and sparseness of factorized data, and has better clustering effect, especially in YALE face database, when the clustering number is 3, the clustering accuracy of the proposed algorithm reaches 91.67%.
  英文关键词Key words: incremental learning; Nonnegative Matrix Factorization (NMF); sparseness constraint; clustering; face recognition
  0 引言
  图像处理已经广泛应用于生活中各个行业,涉及机器学习、模式识别、数据挖掘等领域,也与人类大脑的认知原理紧密相关,受到了国内外学者的广泛关注。在处理图像过程中子空间降维是一种最为常见的方法, 主要思路是将高维空间的原始数据(包括向量数据、矩阵数据、张量数据)映射到低维子空间进行处理,有效避免了维度灾难[1]所带来的困难。为了处理矩阵中的数据信息以达到降维的目的,通常将矩阵进行分解, 典型的矩阵分解方法主要包括:主成分分析[2]、线性判别分析[3]、独立分量分析[4]、奇异值分解[5]等。通常这些方法是在一定的限制下对原始数据矩阵进行线性变换或分解,从而实现了对原始矩阵降维,但是这些方法的共同点是在求解时有可能会出现负值。 然而在许多应用中,负值是与客观现实相矛盾的,不具有可解释性。
  因此,非负矩阵分解(Nonnegative Matrix Factorization, NMF)[6]这种新的矩阵分解方法被提出,它最早由Lee等[7-8] 于1999年在《Nature》杂志上发表的。它不同于上述典型的矩阵分解算法,其独特之处是通过添加非负限制,不但保证了分解结果的可解释性,还使分解后的数据具有局部表达的特性。后来一些学者在此基础上又提出了一些非负矩阵分解的改进算法,并将它们成功地应用到人脸识别、数据挖掘等领域:文献[9]中提出了增量式非负矩阵分解(Incremental NMF, INMF)算法;进一步文献[10]中提出了图正则(稀疏)增量式非负矩阵分解算法等。这些改进算法有一个共同点是损失函数都是F范数或KL散度,而文献[11]中在2011年提出了一种基于L2,1范数的非负矩阵分解——鲁棒非负矩阵分解(Robust Nonnegative Matrix Factorization, RNMF)算法,该算法自提出后立即得到了学术界的广泛关注,并取得了大量的研究成果。为了达到特定的实验效果,有些学者在基本的鲁棒非负矩阵分解算法框架中又添加各种限制,如稀疏性、流形、正交性、判别性等,提出了若干种改进的算法,并将其应用于各个领域,取得了良好的实验效果:文献[12-13]中考虑到原始数据中蕴含着几何结构,从而将流行学习与鲁棒非负矩阵分解相结合,提出了图正则的鲁棒非负矩阵分解(Graph Regularized RNMF, GRNMF)算法,该算法在分解过程中保留了数据集携带的局部几何信息,使得在低维能很好地表示原始样本近邻结构;  文献[14-15]中提出了稀疏限制的鲁棒非负矩阵分解(RNMF with Sparseness Constraints, RNMFSC)算法,对系数矩阵进行了稀疏限制,使分解后人脸图像有更高的识别率。   針对鲁棒非负矩阵分解的运算时间和分解后数据的稀疏程度来说,随着样本个数的实时增加,直接利用鲁棒非负矩阵分解时会出现运算规模不断增大的现象。本文就将鲁棒非负矩阵分解算法(RNMF)与增量式学习结合在一起,并在此基础上增加了L2,1范数的稀疏限制,从而提出了稀疏限制的增量式鲁棒非负矩阵分解(Incremental Robust Nonnegative Matrix Factorization with Sparseness Constraints, IRNMFSC)算法。该算法不仅有效地提高了分解后数据的稀疏度,得到图像的最佳局部表示,而且结合增量式学习充分利用上一步对初始数据的分解结果,避免重复计算从而降低运算时间,提升算法迭代优化求解时的收敛速度,使得此算法具有较好的聚类准确率。最后在多个数据库上的仿真结果表明,本文IRNMFSC算法均优于RNMF算法和RNMFSC算法。
  4 结语
  本文提出了稀疏限制的增量式鲁棒非负矩阵分解(IRNMFSC),并给出了相应的迭代法则和算法步骤,该算法随着新增样本个数的增加依然保持较好的收敛速度。通过对YALE人脸数据和ORL人脸数据进行数值实验,结果显示本文算法在迭代过程中收敛速度更快,寻优效果更好,不仅缩短了运算时间,而且使分解后的数据更为稀疏,能得到局部表达能力更强的基图像和比较高的聚类准确率。但是,算法中依然存在着不足,如迭代初值和稀疏参数的选取等,目前依然没有一个统一的标准,这些问题都将成为今后要研究的重点。
  参考文献 (References)
  [1] PARDALOS P M. Approximate Dynamic Programming: Solving the Curses of Dimensionality[M]. New York: Wiley, 2009: 39-50.
  [2] 史卫亚,郭跃飞,薛向阳.一种解决大规模数据集问题的核主成分分析算法[J]. 软件学报, 2009, 8(20): 2153-2159. (SHI W Y, GUO Y F, XUE X Y. A kernel principal component analysis algorithm for solving large scale data sets problem[J]. Journal of Software, 2009, 8(20): 2153-2159.)
  [3] 张燕平,窦蓉蓉,赵姝,等. 基于集成学习的规范化LDA人脸识别[J].计算机工程, 2010, 36(14):144-146.(ZHANG Y P, DOU R R, ZHAO S, et al. Regularized linear discriminant analysis for face recognition based on ensemble learning[J]. Computer Engineering, 2010, 36(14):144-146.)
  [4] KOLDOVSKY Z, TICHAVSKY P. Efficient variant of algorithm fastica for independent component analysis attaining the cramerRAO lower bound[C]// Proceedings of the 2005 IEEE/SP Workshop on Statistical Signal Processing. Piscataway, NJ: IEEE, 2005:1090-1095.
  [5] GOLUB G H, REINSCH C. Singular Value Decomposition and Least Squares Solutions[M]. Berlin: Springer, 1971: 403-420.
  [6] PAATERO P, TAPPER U. Positive matrix factorization: a nonnegative factor model with optimal utilization of error estimates of data values[J]. Environmental Surveying, 1994, 5(2):111-126.
  [7] LEE D D, SEUNG H S. Learning the parts of objects with nonnegative matrix factorization[J]. Nature, 1999, 401(6755):788-791.
  [8] LEE D D. Algorithms for nonnegative matrix factorization[J]. Advances in Neural Information Processing Systems, 2001, 13(6):556-562.
  [9] BUCAK S S, GUNSEL B. Incremental subspace learning via nonnegative matrix factorization[J]. Pattern Recognition, 2009, 42(5):788-797.
  [10] YU Z Z, LIU Y H, LI B, et al. Incremental graph regulated nonnegative matrix factorization for face recognition[J]. Journal of Applied Mathematics, 2014, 2014(1): Article ID 928051.   [11] KONG D, HUANG H. Robust nonnegative matrix factorization using L21norm[C]// Proceedings of the 20th ACM International Conference on Information and Knowledge Management. New York: ACM, 2011:673-682.
  [12] HUANG S, WANG H, LI T, et al. Robust graph regularized nonnegative matrix factorization for clustering[J]. ACM Transactions on Knowledge Discovery from Data, 2017, 11(3): Article No. 33.
  [13] DAI L Y, FENG C M, LIU J X, et al. Robust nonnegative matrix factorization via joint graph laplacian and discriminative information for identifying differentially expressed genes[J]. Complexity, 2017, 2017(40): Article ID 4216797.
  [14] YANG S, HOU C, ZHANG C, et al. Robust nonnegative matrix factorization via joint sparse and graph regularization[C]// Proceedings of the 2013 International Joint Conference on Neural Networks. Piscataway, NJ: IEEE, 2014:541-559.
  [15] DU S, SHI Y, WANG W. L(3/2) sparsity constrained graph nonnegative matrix factorization for image representation[C]// Proceedings of the 26th Chinese Control and Decision Conference. Piscataway, NJ: IEEE, 2014:2962-2965.
  [16] CAI D, HE X, WU X, et al. Nonnegative matrix factorization on manifold[C]// Proceedings of the 8th IEEE International Conference on Data Mining. Piscataway, NJ: IEEE, 2008: 63-72.
  [17] XU W, LIU X, GONG Y. Document clustering based on nonnegative matrix factorization[C]// Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM, 2003:267-273.
  [18] 汪金濤, 曹玉东, 孙福明. 稀疏约束图正则非负矩阵分解的增量学习算法[J]. 计算机应用, 2017,37(4): 1071-1074.(WANG J T, CAO Y D, SUN F M. Incremental learning algorithm based on graph regularized nonnegative matrix factorization with sparseness constraints[J]. Journal of Computer Applications, 2017, 37(4): 1071-1074.)
转载注明来源:https://www.xzbu.com/8/view-14941564.htm