Vulnerability Detection Method Based on Structured Text and Code Metrics
Source: user upload
Authors: YANG Hongyu, YING Leyi, ZHANG Liang
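Abstract: Most current source code vulnerability detection methods rely on a single feature, and this single-dimensional representation makes them inefficient. To address this problem, a vulnerability detection method based on structured text and code metrics is proposed that detects vulnerabilities at function-level granularity. The structured text information of the source code and the results of code metrics serve as features. A neural network based on the self-attention mechanism captures long-term dependencies in the structured text information, fits the relationship between the structured text and the existence of vulnerabilities, and converts it into a vulnerability probability. A deep neural network learns features from the code metric results, fits the relationship between the metric values and the existence of vulnerabilities, and converts the fitted result into a vulnerability probability. A support vector machine then performs a further decision classification on the vulnerability probabilities obtained from the two representations to produce the final detection result. To verify the detection performance of the method, vulnerability detection experiments were conducted on 11 kinds of source code samples containing different types of vulnerabilities. The average detection accuracy of the method for each vulnerability type is 97.96%. Compared with existing vulnerability detection methods based on a single representation, the detection accuracy is 4.89% to 12.21% higher, and both the false negative rate and the false positive rate remain within 10%.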
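Keywords: vulnerability detection; structured representation; abstract syntax tree; code metrics; deep neural network
CLC number: TP393    Document code: A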
Vulnerability Detection Method Based on Structured Text and Code Metrics
YANG Hongyu, YING Leyi, ZHANG Liang (3)
(1. College of Safety Science and Engineering, Civil Aviation University of China, Tianjin 300300, China;
2. College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China;
3. College of Information, University of Arizona, AZ 8572 USA)
Abstract: Most current source code vulnerability detection methods rely on a single feature, and this single-dimensional representation makes them inefficient. To address this problem, a vulnerability detection method based on structured text and code metrics is proposed that detects vulnerabilities at function-level granularity. The method uses the structured text information of the source code and the results of code metrics as features. A self-attention based neural network captures long-term dependencies in the structured text information, fits the relationship between the structured text and the existence of vulnerabilities, and converts it into a vulnerability probability. A deep neural network learns features from the code metric results, fits the relationship between the metric values and the existence of vulnerabilities, and converts the fitted result into a vulnerability probability. A Support Vector Machine (SVM) then classifies the vulnerability probabilities obtained from the two representations to produce the final detection result. To verify the detection performance of the method, experiments were conducted on 11 kinds of source code samples containing different types of vulnerabilities. The average detection accuracy of the method for each vulnerability type is 97.96%. Compared with existing vulnerability detection methods based on a single representation, the method improves detection accuracy by 4.89% to 12.21%, while its false negative and false positive rates both remain within 10%.
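The claim that self-attention captures long-term dependencies in structured text can be illustrated with a minimal sketch. It is not the network used in the paper: it assumes a single attention head, no learned projection matrices (queries, keys and values are all the raw embeddings), and no positional encoding. The token names and the 2-dimensional embeddings are hypothetical, standing in for a flattened structured representation of a function.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention with Q = K = V = embeddings.

    Simplifying assumption: single head, no learned projections.
    Returns (outputs, weights), where weights[i][j] is how strongly
    token i attends to token j.
    """
    d = len(embeddings[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in embeddings:
        # Similarity of this query to every token, near or far.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in embeddings]
        row = softmax(scores)
        weights.append(row)
        # Each output is the attention-weighted average of all token vectors.
        outputs.append(
            [sum(w * v[c] for w, v in zip(row, embeddings)) for c in range(d)]
        )
    return outputs, weights


# Hypothetical token sequence and toy embeddings (not from the paper).
tokens = ["malloc", "if", "return", "for", "free"]
embeddings = [
    [1.0, 0.0],  # malloc
    [0.0, 1.0],  # if
    [0.0, 1.0],  # return
    [0.0, 1.0],  # for
    [1.0, 0.0],  # free: deliberately given the same toy embedding as malloc
]

outputs, weights = self_attention(embeddings)
for tok, w in zip(tokens, weights[0]):
    print(f"malloc -> {tok}: {w:.3f}")
```

In this toy input, the weight from `malloc` to the distant `free` equals its weight to itself and exceeds its weight to the adjacent `if`, because the score depends only on embedding similarity and not on distance in the sequence. In a full model, the attended outputs would typically be pooled and passed through a sigmoid or softmax layer to obtain the vulnerability probability that the abstract describes.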