FCGHunter: Towards Evaluating Robustness of Graph-Based Android Malware Detection
Preprint   Open access


Shiwen Song, Xiaofei Xie, Ruitao Feng, Qi Guo and Sen Chen
arXiv (Cornell University)
Cornell University
28/04/2025
Preprint (Author's original), CC BY-NC-SA 4.0, Open access


Abstract

Keywords: Android malware detection; Function call graph; Robustness testing
Graph-based detection methods leveraging Function Call Graphs (FCGs) have shown promise for Android malware detection (AMD) due to their semantic insights. However, the deployment of malware detectors in dynamic and hostile environments raises significant concerns about their robustness. While recent approaches evaluate the robustness of FCG-based detectors using adversarial attacks, their effectiveness is constrained by the vast perturbation space, particularly across diverse models and features. To address these challenges, we introduce FCGHunter, a novel robustness testing framework for FCG-based AMD systems. Specifically, FCGHunter employs innovative techniques to enhance exploration and exploitation within this huge search space. Initially, it identifies critical areas within the FCG related to malware behaviors to narrow down the perturbation space. We then develop a dependency-aware crossover and mutation method to enhance the validity and diversity of perturbations, generating diverse FCGs. Furthermore, FCGHunter leverages multi-objective feedback to select perturbed FCGs, significantly improving the search process with interpretation-based feature change feedback. Extensive evaluations across 40 scenarios demonstrate that FCGHunter achieves an average attack success rate of 87.9%, significantly outperforming baselines by at least 44.7%. Notably, FCGHunter achieves a 100% success rate on robust models (e.g., AdaBoost with MalScan), where baselines achieve only 11% or are inapplicable.
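
One way to picture the search loop the abstract describes is a small evolutionary search over call-graph edits with Pareto-style selection. The sketch below is illustrative only and is not FCGHunter's implementation. The toy graph, all function names, `toy_score` (a degree-centrality stand-in for a real detector), and the add-only edge perturbation are assumptions made for the example. It uses plain uniform crossover and two simple objectives, not the paper's dependency-aware operators or interpretation-based feedback.

```python
import random

# Hypothetical toy call graph as (caller, callee) edges; names are illustrative.
ORIGINAL_EDGES = frozenset({
    ("onCreate", "collectContacts"),
    ("collectContacts", "sendTextMessage"),
    ("onCreate", "getDeviceId"),
    ("onCreate", "render"),
})
SENSITIVE = {"sendTextMessage", "getDeviceId"}          # assumed sensitive APIs
BENIGN_HELPERS = ["render", "log", "formatDate", "loadIcon"]  # assumed filler nodes


def nodes(edges):
    return {n for edge in edges for n in edge}


def apply(added):
    """Perturbed graph: original edges are kept, new call edges are only added."""
    return ORIGINAL_EDGES | added


def toy_score(edges):
    """Stand-in detector: mean degree centrality of sensitive nodes (higher = more suspicious)."""
    ns = nodes(edges)
    degree = {n: 0 for n in ns}
    for caller, callee in edges:
        degree[caller] += 1
        degree[callee] += 1
    present = [n for n in SENSITIVE if n in ns]
    if len(ns) < 2 or not present:
        return 0.0
    return sum(degree[n] / (len(ns) - 1) for n in present) / len(present)


def objectives(added):
    """Two objectives, both minimized: detector score and perturbation size."""
    return (toy_score(apply(added)), len(added))


def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and a != b


def pareto_front(population):
    scored = [(objectives(p), p) for p in population]
    return [p for obj, p in scored
            if not any(dominates(other, obj) for other, _ in scored)]


def search(generations=20, pop_size=12, seed=0):
    rng = random.Random(seed)
    # Candidate edits never touch sensitive nodes, so malicious calls stay intact.
    callers = (nodes(ORIGINAL_EDGES) - SENSITIVE) | set(BENIGN_HELPERS)
    candidates = sorted(
        (u, v) for u in callers for v in BENIGN_HELPERS
        if u != v and (u, v) not in ORIGINAL_EDGES
    )
    population = {frozenset()}  # start from the unperturbed graph
    for _ in range(generations):
        parents = sorted(population, key=sorted)
        children = set()
        for _ in range(pop_size):
            a, b = rng.choice(parents), rng.choice(parents)
            # Uniform crossover over the parents' added edges.
            child = {e for e in sorted(a | b) if rng.random() < 0.5}
            # Mutation: add one random candidate edge.
            child.add(rng.choice(candidates))
            children.add(frozenset(child))
        # Keep only non-dominated perturbations for the next generation.
        population = set(pareto_front(population | children))
    return min(population, key=objectives)


if __name__ == "__main__":
    best = search()
    print("original score:", toy_score(ORIGINAL_EDGES))
    print("best score:", toy_score(apply(best)), "with", len(best), "added edges")
```

Restricting the edits to added edges between non-sensitive nodes is a simplification that keeps the original calls intact. A real robustness test would also have to check that the modified app still builds and behaves as before.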
