Learning to map source code to software vulnerability using code-as-a-graph

Sahil Suneja, Yunhui Zheng, Yufan Zhuang, Jim Laredo, Alessandro Morari

Abstract

We explore the applicability of Graph Neural Networks to learning the nuances of source code from a security perspective. Specifically, we ask whether signatures of vulnerabilities in source code can be learned from its graph representation, in terms of the relationships between nodes and edges. We create a pipeline, which we call AI4VA, that first encodes a source code sample into a Code Property Graph. The extracted graph is then vectorized in a manner that preserves its semantic information. A Gated Graph Neural Network is then trained on several such graphs to automatically extract templates that differentiate the graph of a vulnerable sample from that of a healthy one. Our model outperforms static analyzers, classic machine learning, and CNN- and RNN-based deep learning models on two of the three datasets we experiment with. We thus show that a code-as-graph encoding is more meaningful for vulnerability detection than existing code-as-photo and linear sequence encoding approaches. (Submitted Oct 2019, Paper #28, ICST)
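
To make the encode, vectorize, and learn-over-the-graph steps described in the abstract more concrete, here is a minimal sketch of the general idea. It is not the AI4VA implementation. The node-type vocabulary, the toy graph, and the helper names (`one_hot`, `propagate`, `readout`) are all assumptions made for illustration, and the propagation step is a plain, unweighted neighbour sum standing in for the learned, gated update of a Gated Graph Neural Network.

```python
from collections import defaultdict

# Hypothetical node-type vocabulary; a real Code Property Graph has many more types.
NODE_TYPES = ["Identifier", "Call", "Literal", "Assignment"]

def one_hot(node_type):
    """Encode a node type as a one-hot vector over NODE_TYPES."""
    return [1.0 if t == node_type else 0.0 for t in NODE_TYPES]

def propagate(features, edges):
    """One round of message passing: each node adds the sum of its
    in-neighbours' feature vectors to its own vector (no weights, no gating)."""
    incoming = defaultdict(list)
    for src, dst in edges:
        incoming[dst].append(src)
    updated = {}
    for node, vec in features.items():
        agg = list(vec)
        for src in incoming[node]:
            agg = [a + b for a, b in zip(agg, features[src])]
        updated[node] = agg
    return updated

def readout(features):
    """Sum all node vectors into a single graph-level vector."""
    total = [0.0] * len(NODE_TYPES)
    for vec in features.values():
        total = [a + b for a, b in zip(total, vec)]
    return total

# Toy graph for the statement `buf = read(n)`; node ids and edges are illustrative.
nodes = {0: "Assignment", 1: "Identifier", 2: "Call", 3: "Identifier"}
edges = [(0, 1), (0, 2), (2, 3)]  # parent -> child (AST-style) edges

features = {n: one_hot(t) for n, t in nodes.items()}
features = propagate(features, edges)
graph_vector = readout(features)
```

In a trained model, the graph-level vector would feed a classifier that labels the sample as vulnerable or healthy. Here it only shows how structural information from neighbouring nodes ends up in a fixed-size representation.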

Keywords

Computer Science

Paper Details

Authors Sahil Suneja, Yunhui Zheng, Yufan Zhuang, Jim Laredo, Alessandro Morari
Published 2020-06-15
Category Computer Science
Status Non-peer-reviewed Preprint
Language English
Word Count 155
