Publication

Tractable Executable Binary Provenance Signalling through Vision Transformers

Nauman, Mohammad
Date
2024-01-15
Abstract
Provenance signaling is the task of tracing the source information of digital artifacts. It is a valuable intermediate output that greatly facilitates downstream tasks, including but not limited to malware analysis. Existing approaches rely either on fully manual analysis or on machine learning models that depend heavily on manually curated input features. This curation requires human experts, which is time-consuming and infeasible at large scale. In this paper, we present a novel model for provenance signaling that takes raw binaries as input and provides provenance signals with high efficacy. Our model is based on the state-of-the-art vision transformer architecture. We create a novel pipeline that efficiently encodes any binary into 2D sequences, capturing large-scale spatial relations hidden among binary opcodes. This allows our model to extract meaningful information about provenance without involving a human expert. As a result, our work produces high-accuracy results and provides insights into the learning process, making the results more explainable.
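
The abstract does not spell out how a binary is encoded into 2D sequences, so the following is only a minimal sketch of one common approach: lay the raw bytes out row by row as a grayscale image, then cut it into fixed-size patches of the kind a vision transformer consumes as tokens. The function names (`bytes_to_image`, `image_to_patches`), the row width of 64, and the patch size of 16 are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np


def bytes_to_image(data: bytes, width: int = 64) -> np.ndarray:
    """Lay raw bytes out row by row as a 2D uint8 array (hypothetical helper).

    The final row is zero-padded so the array is rectangular.
    """
    buf = np.frombuffer(data, dtype=np.uint8)
    height = max(1, -(-buf.size // width))  # ceiling division, at least one row
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[: buf.size] = buf
    return padded.reshape(height, width)


def image_to_patches(img: np.ndarray, patch: int = 16) -> np.ndarray:
    """Split a 2D array into non-overlapping patch x patch tiles (hypothetical helper).

    Returns an array of shape (num_patches, patch * patch), one flattened
    tile per row, ordered left to right and then top to bottom.
    """
    h, w = img.shape
    # Zero-pad so both dimensions are multiples of the patch size.
    img = np.pad(img, ((0, (-h) % patch), (0, (-w) % patch)))
    rows, cols = img.shape[0] // patch, img.shape[1] // patch
    tiles = img.reshape(rows, patch, cols, patch).swapaxes(1, 2)
    return tiles.reshape(rows * cols, patch * patch)


if __name__ == "__main__":
    # Stand-in for the contents of an executable; a real run would read a file.
    sample = bytes(range(256)) * 20
    image = bytes_to_image(sample)
    tokens = image_to_patches(image)
    print(image.shape, tokens.shape)
```

In a full model, each flattened patch would typically be linearly projected and combined with a positional embedding before entering the transformer; those stages are omitted here because the abstract gives no details about them.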