[1]姜贺云,张振宇,樊明宇.一种基于整数规划的无监督特征选择方法[J].温州大学学报(自然科学版),2020,(03):037-46.
 JIANG Heyun,ZHANG Zhenyu,FAN Mingyu.Unsupervised Feature Selection Method Based on Integer Programming[J].Journal of Wenzhou University,2020,(03):037-46.
点击复制

一种基于整数规划的无监督特征选择方法
分享到:

《温州大学学报》(自然科学版)[ISSN:1674-3563/CN:33-1344/N]

卷:
期数:
2020年03期
页码:
037-46
栏目:
计算机科学
出版日期:
2020-08-25

文章信息/Info

Title:
Unsupervised Feature Selection Method Based on Integer Programming
作者:
姜贺云1张振宇1樊明宇2
1.温州大学数理学院,浙江温州 325035;2.温州大学计算机与人工智能学院,浙江温州 325035
Author(s):
JIANG Heyun1 ZHANG Zhenyu1 FAN Mingyu2
1. College of Mathematics and Physics, Wenzhou University, Wenzhou, China 325035; 2. College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, China 325035
关键词:
无监督特征选择稀疏自表达0-1整数规划矩阵 范数约束
Keywords:
Unsupervised Feature Selection Sparse Self-representation 0-1 Integer Programming Matrix Norm Constraint
分类号:
TP301.6
文献标志码:
A
摘要:
特征选择方法可以剔除冗余特征,保留关键的数据特征,是一种有效解决维数灾难问题的途径.本文利用特征与特征之间的重构关系建立损失函数,并且基于投影矩阵的 范数作为稀疏约束项,提出一种新型的无监督特征选择方法.由于优化问题涉及离散稀疏的矩阵 范数约束条件,无法直接求得全局最优解.我们先将矩阵的 范数约束等价转化为0-1整数规划约束形式,然后再利用圆盒模型将0-1整数约束等价转化为两个连续的约束条件.对于新约束条件的优化问题,采用交替方向乘子法优化求解.在5个公开数据集上的分类和聚类实验结果表明,本文方法比其他9种无监督特征选择方法的效果更好,能够在指定特征数目时选择出判别能力更强的特征.
Abstract:
As an important approach to circumvent the curse of dimensionality, feature selection is able to remove the redundant features and preserve the key informative features. In this paper, we propose a novel unsupervised Feature Selection (FS) method based on the projective matrix norm as the sparse constraint and the loss function established by using the reconstruction relationship between features. Because the optimization problem involves discrete and sparse matrix norm constraints, the global optimal solution cannot be obtained directly. To address this issue, we transform matrix norm constraint to be a 0-1 integer constraint, and utilize circle-box approach to replace the 0-1 constraints with two continuous constraints. Finally, using the alternating direction method of multipliers, we get optimal solution for the optimization problem with new constraints. The experimental results of classification and clustering on five public data sets show that the proposed method in this paper is better than other nine unsupervised feature selection methods and can select features with stronger discrimination ability when the number of features is specified.

参考文献/References:

[1] Chang X J, Ma Z G, Yang Y, et al. Bi-level semantic representation analysis for multi-media event detection [J]. IEEE T Cybernetics, 2017, 47(5): 1180-1197.
[2] 王力波,王耀力,常青.生物信息学中的特征选择[J].太原理工大学学报,2017,48(3):457-468.
[3] 沈健,胡洁,马进,等.基于文本挖掘的生物领域实例获取[J].上海交通大学学报,2018,52(8):76-82.
[4] Guyon I, Elisseeff A. An introduction to variable and feature selection [J]. J Machine Learn Res, 2003, 3(6): 1157-1182.
[5] Lin W H, Huang J F, Suen C Y, et al. A feature extraction model based on discriminative graph signals [J]. Expert Syst Appl, 2019, 139: 1-10.
[6] 赵宇,黄思明,陈锐.特征选择与空间降维概述、热点及展望[J].数学的实践与认识,2013,43(15):179-190.
[7] 严菲,王晓栋.鲁棒的半监督多标签特征选择方法[J].智能系统学报,2019,14(4):812-819.
[8] Krzanowski W J. Selection of variables to preserve multivariate data structure, using principal components [J]. J R Stat Soc C-Appl, 1987, 36(1): 22-33.
[9] Zhu P F, Zhu W C, Hu Q H, et al. Subspace clustering guided unsupervised feature selection [J]. Pattern Recogn, 2017(66): 364-374.
[10] 朱丹,刘天佑,李宏伟.利用数据低秩性和稀疏性的位场分离[J].石油地球物理勘探,2019(4):925-936.
[11] Cohen A, Dahmen W, Devore R. Orthogonal matching pursuit under the restricted isometry property [J]. Const Approx, 2017, 45(1): 113-127.
[12] Donoho D L. For most large underdetermined systems of linear equations the minimal norm solution is also the sparsest solution [J]. Commun Pur Appl Math, 2006, 59(7): 797-829.
[13] Mallat S, Zhong S. Characterization of signals from multiscale edges [J]. IEEE T Pattern Anal, 1992, 14(7): 710-732.
[14] Wright J, Yang A Y, Ganesh A, et al. Robust face recognition via sparse representation [J]. IEEE T Pattern Anal, 2009, 31(2): 210-227.
[15] Zhu P F, Zuo W M, Zhang L, et al. Unsupervised feature selection by regularized self-representation [J]. Pattern Recogn, 2014, 48(2): 438-446.
[16] Zhu P F, Zhu W C, Wang W Z, et al. Non-convex regularized self-representation for unsupervised feature selection [J]. Image Vision Comput, 2017(60): 22-29.
[17] Wu B Y, Bernard G. lp-Box ADMM: a versatile framework for integer programming [J]. IEEE T Pattern Anal, 2018, 41(7): 1695-1708.
[18] 张雅丽,樊明宇.基于lp-Box ADMM的k稀疏编码[J].温州大学学报(自然科学版),2018,39(2):22-30.
[19] Li X L, Zhang H, Zhang R, et al. Generalized uncorrelated regression with adaptive graph for unsupervised feature selection [J]. IEEE T Neur Net Lear, 2018, 30(5): 1587-1595.
[20] Zhao Z, Wang L, Liu H, et al. On similarity preserving feature selection [J]. IEEE T Knowl Data En, 2011, 25(99): 619-632.

备注/Memo

备注/Memo:
收稿日期:2019-11-11
基金项目:国家自然科学基金项目(61772373)
作者简介:姜贺云(1994- ),女,河南驻马店人,硕士研究生,研究方向:智能系统与控制
更新日期/Last Update: 2020-08-25