DroidTTP: Mapping Android applications with TTP for Cyber Threat Intelligence
Nicolazzo S.; Arazzi M.; Nocera A.
2025-01-01
Abstract
The widespread use of Android devices for sensitive operations has made them prime targets for sophisticated cyber threats, including Advanced Persistent Threats (APTs). Traditional malware detection methods focus primarily on malware classification, often failing to reveal the Tactics, Techniques, and Procedures (TTPs) used by attackers. To address this issue, we propose DroidTTP, a novel system for mapping Android malware to attack behaviors. We curated a dataset linking Android applications to Tactics and Techniques and developed an automated mapping approach using the Problem Transformation Approach and Large Language Models (LLMs). Our pipeline includes dataset construction, feature selection, data augmentation, model training, and explainability via SHAP. Furthermore, we explored the use of LLMs for TTP prediction using both Retrieval Augmented Generation and fine-tuning strategies. The Label Powerset XGBoost model achieved the best performance, with Jaccard Similarity scores of 0.9893 for Tactic classification and 0.9753 for Technique classification. The fine-tuned LLaMa model also performed competitively, achieving 0.9583 for Tactics and 0.9348 for Techniques. Although XGBoost slightly outperformed LLMs, the narrow performance gap highlights the potential of LLM-based approaches for Tactic and Technique prediction.
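The Jaccard Similarity scores reported above are a standard metric for multi-label classification: each sample's predicted label set is compared against its true label set. A minimal sketch of how such a score can be computed is shown below; the tactic IDs used in the example are illustrative and not taken from the paper's dataset.

```python
# Sketch: per-sample Jaccard Similarity for multi-label prediction,
# averaged over all samples (the metric style reported for DroidTTP).

def jaccard_similarity(y_true, y_pred):
    """Mean Jaccard index between true and predicted label sets."""
    scores = []
    for true_set, pred_set in zip(y_true, y_pred):
        union = true_set | pred_set
        if not union:
            scores.append(1.0)  # both sets empty: perfect agreement
        else:
            scores.append(len(true_set & pred_set) / len(union))
    return sum(scores) / len(scores)

# Illustrative example with hypothetical MITRE ATT&CK tactic IDs:
# sample 1 matches exactly (1.0); sample 2 has one extra prediction (0.5).
y_true = [{"TA0029", "TA0035"}, {"TA0031"}]
y_pred = [{"TA0029", "TA0035"}, {"TA0031", "TA0037"}]
print(jaccard_similarity(y_true, y_pred))  # → 0.75
```

A higher mean score (closer to 1.0) indicates that predicted Tactic or Technique sets overlap more completely with the ground-truth labels.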


