Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
class:gradsec2026 [2026/03/15 23:55]
mhshin [C10. AI Supply Chain & Model Security]
class:gradsec2026 [2026/03/30 13:59] (current)
jhj2004 [Agenda]
Line 33: Line 33:
  
 ^ Date ^ Name ^  Topic  ^ Slides ^ Minutes ^ ^ Date ^ Name ^  Topic  ^ Slides ^ Minutes ^
-| 3/4 | Minho | Ice-breaking ​[[https://drive.google.com/​file/​d/​1PGW7cKv0rqTp6jIaHmIz2Olmi9KXCCNn/​view?​usp=drive_link|AI-Cybersecurity]] | [[https://​drive.google.com/​file/​d/​1tQJK6mbAowQlCto7OzYd16rR8mlTkkiL/​view?​usp=drive_linkSurvey paper]] ​|+| 3/4 | Minho | AI-Introduction ​{{ :class:​ai-intro.pdf |AI-Intro}} ​ |
 | 3/11 | Minho |  |  |  | | 3/11 | Minho |  |  |  |
-| ::: | Cho | https://​www.usenix.org/​system/​files/​sec21-schuster.pdf| ​ |  |+| ::: | Cho | [[https://​www.usenix.org/​system/​files/​sec21-schuster.pdf|You autocomplete me: Poisoning vulnerabilities in neural code completion]] | [[https://​1drv.ms/​p/​c/​005794ae9195628e/​IQB4fo_zfZeySKirBSMijjfiAVbNdg_9N1hiWS702-MyQpk?​e=SsGxwB|You autocomplete me: Poisoning vulnerabilities in neural code completion]] ​|  |
 | 3/18 | Minho |  |  |  | | 3/18 | Minho |  |  |  |
-| ::: | Han |  |  |  | +| ::: | Han | [[https://​arxiv.org/​pdf/​2102.07995.pdf|D2a:​ A dataset built for ai-based vulnerability detection methods using differential analysis]] ​|  ​|  | 
-| 3/25 | Minho |  |  |  | +| 3/27 | Minho |  |  |  | 
-| ::: | Kwak|  |  |  | +| ::: | Kwak| [[https://​www.mdpi.com/​1424-8220/​23/​9/​4403/​pdf|A Deep Learning-Based Innovative Technique for Phishing Detection with URLs]] ​|  |  | 
-| 4/1 | Cho |  |  |  | +| 4/1 | No Class |  |  |  | 
-| 4/No Class  |  |  |+| 4/10 Cho [[https://​arxiv.org/​pdf/​1803.04173|Adversarial Malware Binaries: Evading Deep 
 +Learning for Malware Detection in Executables]] ​|  |  |
 | 4/15 | Han |  |  |  | | 4/15 | Han |  |  |  |
-| 4/22 No Class |  |  |  | +| 4/24 Kwak|  |  |  | 
-| 4/29 | Kwak |  |  |  | +| 4/29 | Cho |  |  |  | 
-| 5/6 | Cho |  |  |  | +| 5/6 | Han |  |  |  | 
-| 5/13 | Han |  |  |  | +| 5/13 | Kwak |  |  |  | 
-| 5/20 | Kwak |  |  |  | +| 5/20 | Cho |  |  |  | 
-| 5/27 | Cho |  |  |  | +| 5/27 | Han |  |  |  | 
-| 6/3 | Han |  |  |  | +| 6/3 | Kwak |  |  |  | 
-| 6/10 | Kwak |  |  |  | +| 6/10 | Cho |  |  |  | 
-| 6/17 | Cho |  |  |  | +| 6/17 | Han |  |  |  | 
-| 6/24 | Han |  |  |  |+| 6/24 | Kwak |  |  |  |
 ====== Class Information ====== ====== Class Information ======
  
Line 85: Line 86:
  
 ====== Reading List for LLM-based Cybersecurity ====== ====== Reading List for LLM-based Cybersecurity ======
- 
-==== C1. Adversarial Machine Learning ==== 
- 
-  - **Explaining and Harnessing Adversarial Examples** 
-    * Ian Goodfellow, Jonathon Shlens, Christian Szegedy, ICLR 2015 | Pages: 11 | Difficulty: 2/5 
-    * Abstract: This seminal paper introduces the Fast Gradient Sign Method (FGSM) and demonstrates that neural networks are vulnerable to adversarial examples - inputs with imperceptible perturbations that cause misclassification. The authors show that adversarial examples transfer across models and propose that linearity in high-dimensional spaces is the primary cause of vulnerability,​ challenging previous hypotheses about model overfitting. 
-  - **Towards Evaluating the Robustness of Neural Networks** 
-    * Nicholas Carlini, David Wagner, IEEE S&P 2017 | Pages: 16 | Difficulty: 3/5 
-    * Abstract: This paper presents the powerful C&W attack, demonstrating that defensive distillation and other defenses can be bypassed. The authors formulate adversarial example generation as an optimization problem and introduce targeted attacks that achieve near-perfect success rates. They establish important evaluation methodology for measuring model robustness and show that many claimed defenses provide false security. 
-  - **Intriguing Properties of Neural Networks** 
-    * Christian Szegedy et al., ICLR 2014 | Pages: 10 | Difficulty: 3/5 
-    * Abstract: The first paper to formally identify adversarial examples in deep neural networks. The authors demonstrate that small, carefully crafted perturbations can fool state-of-the-art models and that these adversarial examples transfer between different models. They introduce the L-BFGS attack method and show that adversarial examples reveal fundamental properties of neural network decision boundaries rather than being mere artifacts of overfitting. 
-  - **DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks** 
-    * Seyed-Mohsen Moosavi-Dezfooli et al., CVPR 2016 | Pages: 9 | Difficulty: 3/5 
-    * Abstract: This paper introduces DeepFool, an efficient algorithm to compute minimal adversarial perturbations. Unlike FGSM which produces large perturbations,​ DeepFool iteratively linearizes the classifier to find the closest decision boundary. The method provides a way to measure model robustness quantitatively and demonstrates that different architectures have varying levels of robustness to adversarial perturbations. 
-  - **Universal Adversarial Perturbations** 
-    * Seyed-Mohsen Moosavi-Dezfooli et al., CVPR 2017 | Pages: 10 | Difficulty: 3/5 
-    * Abstract: This paper demonstrates the existence of universal perturbations - single perturbations that can fool a classifier on most inputs from a dataset. These image-agnostic perturbations reveal fundamental geometric properties of decision boundaries and challenge the notion that adversarial examples are input-specific artifacts. The work shows that universal perturbations transfer across different models trained on the same task. 
-  - **Adversarial Examples Are Not Bugs, They Are Features** 
-    * Andrew Ilyas et al., NeurIPS 2019 | Pages: 25 | Difficulty: 3/5 
-    * Abstract: This influential paper argues that adversarial vulnerability arises from models relying on highly predictive but non-robust features in the data. The authors demonstrate that models trained only on adversarial examples can achieve good accuracy on clean data, showing that adversarial examples exploit genuine patterns. This challenges the view of adversarial examples as bugs and suggests they reveal fundamental properties of standard machine learning. 
-  - **Adversarial Patch** 
-    * Tom Brown et al., NIPS 2017 Workshop | Pages: 5 | Difficulty: 2/5 
-    * Abstract: This paper introduces adversarial patches - printable, physical perturbations that can fool classifiers in the real world. Unlike digital perturbations,​ patches are robust to viewing angle, distance, and lighting conditions. The authors demonstrate attacks where a small sticker can cause an image classifier to ignore everything else in the scene, raising serious concerns for real-world ML deployment in security-critical applications. 
-  - **Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey** 
-    * Naveed Akhtar, Ajmal Mian, IEEE Access 2018 | Pages: 31 | Difficulty: 1/5 (Survey) 
-    * Abstract: A comprehensive survey covering adversarial attacks and defenses in computer vision. The paper categorizes attacks based on adversary knowledge, attack specificity,​ and attack frequency. It reviews major attack methods (FGSM, C&W, DeepFool) and defense strategies (adversarial training, defensive distillation,​ gradient masking). An excellent entry-level resource for understanding the adversarial ML landscape. 
- 
------- 
- 
-==== C2. Model Poisoning & Backdoor Attacks ==== 
- 
-  - <fc red>​(Jo)</​fc>​ **You autocomplete me: Poisoning vulnerabilities in neural code completion** 
-    * Schuster et al 
-    * Abstract: This paper demonstrates that neural code autocompleters are vulnerable to data and model poisoning attacks where attackers can inject specially-crafted files into training corpora or fine-tune models to influence autocomplete suggestions toward insecure coding practices such as weak encryption modes or outdated SSL versions 
-  - **BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain** 
-    * Tianyu Gu et al., NIPS 2017 Workshop | Pages: 6 | Difficulty: 2/5 
-    * Abstract: This pioneering work introduces backdoor attacks on neural networks where an attacker poisons training data with trigger patterns. The resulting model performs normally on clean inputs but misclassifies when the trigger is present. The authors demonstrate attacks on traffic sign recognition and face identification,​ showing that backdoored models are difficult to detect through standard accuracy testing. 
-  - **Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks** 
-    * Ali Shafahi et al., NeurIPS 2018 | Pages: 11 | Difficulty: 3/5 
-    * Abstract: This paper introduces clean-label poisoning where poisoned training samples maintain correct labels, making attacks harder to detect. The authors craft imperceptible perturbations to training images that cause targeted misclassification. They use feature collision in the network'​s representation space to make the target input appear similar to a chosen class, demonstrating successful attacks on transfer learning scenarios. 
-  - **Trojaning Attack on Neural Networks** 
-    * Yingqi Liu et al., IEEE ICCD 2018 | Pages: 8 | Difficulty: 3/5 
-    * Abstract: Presents a systematic approach to injecting hardware-based trojans in neural networks. Shows how attackers can manipulate model behavior through malicious hardware modifications. Demonstrates attacks that activate only under specific trigger conditions while maintaining normal behavior otherwise. 
-  - **Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks** 
-    * Bolun Wang et al., IEEE S&P 2019 | Pages: 15 | Difficulty: 3/5 
-    * Abstract: Proposes the first defense mechanism specifically designed to detect and remove backdoors from neural networks. Uses optimization to reverse-engineer potential triggers and identifies anomalous patterns. Successfully detects backdoors with high accuracy and can remove them through fine-tuning or neuron pruning. 
-  - **Bypassing Backdoor Detection Algorithms in Deep Learning** 
-    * Te Juin Lester Tan, Reza Shokri, NeurIPS 2020 | Pages: 11 | Difficulty: 4/5 
-    * Abstract: Demonstrates sophisticated backdoor attacks that evade state-of-the-art detection methods. Shows that adaptive attackers can craft triggers that appear natural and avoid detection by Neural Cleanse and similar defenses. Challenges the security of existing backdoor detection approaches. 
-  - **Backdoor Attacks Against Deep Learning Systems in the Physical World** 
-    * Emily Wenger et al., CVPR 2021 | Pages: 10 | Difficulty: 3/5 
-    * Abstract: Extends backdoor attacks to the physical world using robust physical triggers. Demonstrates attacks on traffic sign recognition where physical stickers serve as backdoor triggers. Shows that backdoors can survive in real-world conditions with varying angles, distances, and lighting. 
-  - **Blind Backdoors in Deep Learning Models** 
-    * Eugene Bagdasaryan,​ Vitaly Shmatikov, USENIX Security 2021 | Pages: 18 | Difficulty: 4/5 
-    * Abstract: Introduces blind backdoor attacks where the attacker doesn'​t need to control the training process. Shows how backdoors can be injected through model replacement or by poisoning only a small fraction of training data. Demonstrates attacks on federated learning and transfer learning scenarios. 
-  - **WaNet: Imperceptible Warping-based Backdoor Attack** 
-    * Anh Nguyen et al., ICLR 2021 | Pages: 18 | Difficulty: 3/5 
-    * Abstract: Proposes a novel backdoor attack using smooth warping transformations instead of visible patches. These backdoors are nearly imperceptible and harder to detect than traditional patch-based triggers. Demonstrates high attack success rates while evading multiple defense mechanisms. 
-  - **Backdoor Learning: A Survey** 
-    * Yiming Li et al., arXiv 2022 | Pages: 45 | Difficulty: 1/5 (Survey) 
-    * Abstract: Comprehensive survey of backdoor attacks and defenses in deep learning. Categorizes attacks by trigger type, poisoning strategy, and attack scenario. Reviews detection and mitigation methods. Provides taxonomy and identifies open research challenges. 
- 
------- 
- 
-==== C3. Privacy Attacks on Machine Learning ==== 
- 
-  - **Membership Inference Attacks Against Machine Learning Models** 
-    * Reza Shokri et al., IEEE S&P 2017 | Pages: 15 | Difficulty: 3/5 
-    * Abstract: Introduces membership inference attacks where an attacker determines if a specific data point was in the training set. Demonstrates attacks on commercial ML services including Google Prediction API. Shows that overfitting makes models vulnerable and that confidence scores leak membership information. 
-  - **Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures** 
-    * Matt Fredrikson et al., CCS 2015 | Pages: 12 | Difficulty: 3/5 
-    * Abstract: Demonstrates model inversion attacks that reconstruct training data from model outputs. Shows successful reconstruction of facial images from face recognition models and recovery of sensitive attributes from genomic data predictors. Proposes confidence masking as a partial defense. 
-  - **Extracting Training Data from Large Language Models** 
-    * Nicholas Carlini et al., USENIX Security 2021 | Pages: 17 | Difficulty: 3/5 
-    * Abstract: Shows that large language models like GPT-2 memorize and can be made to emit verbatim training data including personal information. Demonstrates extraction of phone numbers, addresses, and copyrighted content. Raises serious privacy concerns for LLMs trained on web data. 
-  - **The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks** 
-    * Nicholas Carlini et al., USENIX Security 2019 | Pages: 18 | Difficulty: 3/5 
-    * Abstract: Studies unintended memorization in neural networks, showing models can memorize rare or sensitive training examples. Proposes exposure metrics to quantify memorization and demonstrates extraction attacks. Shows that differential privacy provides limited protection against memorization. 
-  - **Stealing Machine Learning Models via Prediction APIs** 
-    * Florian Tramèr et al., USENIX Security 2016 | Pages: 20 | Difficulty: 3/5 
-    * Abstract: Demonstrates model extraction attacks where an attacker queries a black-box model to steal its functionality. Shows successful extraction of logistic regression, neural networks, and decision trees. Analyzes the cost-accuracy tradeoff and proposes defenses based on output perturbation. 
-  - **Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning** 
-    * Briland Hitaj et al., CCS 2017 | Pages: 14 | Difficulty: 4/5 
-    * Abstract: Presents a novel attack on collaborative learning using GANs. Shows that an adversarial participant can use a GAN to reconstruct private training data from model updates in federated learning. Demonstrates attacks that recover recognizable images from gradient information. 
-  - **SoK: Privacy-Preserving Machine Learning** 
-    * Maria Rigaki, Sebastian Garcia, arXiv 2023 | Pages: 38 | Difficulty: 1/5 (Survey) 
-    * Abstract: Systematization of knowledge on privacy attacks and defenses in machine learning. Covers membership inference, model inversion, and data extraction. Reviews privacy-preserving techniques including differential privacy, secure computation,​ and federated learning. 
- 
------- 
- 
-==== C4. LLM Security & Jailbreaking ==== 
- 
-  - **Jailbroken:​ How Does LLM Safety Training Fail?** 
-    * Alexander Wei et al., NeurIPS 2023 | Pages: 34 | Difficulty: 3/5 
-    * Abstract: Analyzes why safety training in LLMs can be circumvented through jailbreaking. Identifies two failure modes: competing objectives during training and mismatched generalization between safety and capabilities. Provides theoretical framework for understanding jailbreak vulnerabilities. 
-  - **Universal and Transferable Adversarial Attacks on Aligned Language Models** 
-    * Andy Zou et al., arXiv 2023 | Pages: 25 | Difficulty: 3/5 
-    * Abstract: Introduces automated methods to generate adversarial suffixes that jailbreak LLMs. Shows these attacks transfer across models including GPT-3.5, GPT-4, and Claude. Demonstrates that aligned models remain vulnerable to optimization-based attacks despite safety training. 
-  - **Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection** 
-    * Kai Greshake et al., AISec 2023 | Pages: 17 | Difficulty: 2/5 
-    * Abstract: Introduces indirect prompt injection where attackers manipulate LLM behavior through external data sources. Demonstrates attacks on real applications including email assistants and document processors. Shows how injected instructions in websites or documents can compromise LLM-integrated systems. 
-  - **Poisoning Language Models During Instruction Tuning** 
-    * Alexander Wan et al., ICML 2023 | Pages: 12 | Difficulty: 3/5 
-    * Abstract: Demonstrates backdoor attacks during instruction tuning phase of LLMs. Shows that small amounts of poisoned instruction data can inject persistent backdoors. Attacks remain effective even after additional fine-tuning on clean data. 
-  - **Red Teaming Language Models with Language Models** 
-    * Ethan Perez et al., EMNLP 2022 | Pages: 23 | Difficulty: 2/5 
-    * Abstract: Uses LLMs to automatically generate test cases for red-teaming other LLMs. Discovers diverse failure modes including offensive outputs and privacy leaks. Shows automated red-teaming can scale safety testing beyond manual efforts. 
-  - **Prompt Injection Attacks and Defenses in LLM-Integrated Applications** 
-    * Yupei Liu et al., arXiv 2023 | Pages: 14 | Difficulty: 2/5 
-    * Abstract: Formalizes prompt injection attacks and proposes taxonomy. Analyzes both direct and indirect injection vectors. Evaluates existing defenses and proposes new mitigation strategies including prompt sandboxing and input validation. 
-  - **Are Aligned Neural Networks Adversarially Aligned?** 
-    * Nicholas Carlini et al., NeurIPS 2023 | Pages: 29 | Difficulty: 4/5 
-    * Abstract: Studies whether alignment through RLHF provides adversarial robustness. Finds that aligned models remain vulnerable to adversarial attacks and that alignment and robustness are distinct properties. Challenges assumptions about safety of aligned models. 
-  - **SoK: Exploring the State of the Art and the Future Potential of Artificial Intelligence in Digital Forensic Investigation** 
-    * Yiming Liu et al., IEEE S&P 2024 | Pages: 52 | Difficulty: 1/5 (Survey) 
-    * Abstract: Comprehensive survey on LLM security covering jailbreaking,​ prompt injection, data extraction, and misuse. Categorizes attacks and defenses. Discusses open challenges in securing LLM-based applications. 
- 
------- 
- 
-==== C5. Federated Learning Security ==== 
- 
-  - **How To Backdoor Federated Learning** 
-    * Eugene Bagdasaryan et al., AISTATS 2020 | Pages: 11 | Difficulty: 3/5 
-    * Abstract: Demonstrates that a single malicious participant can inject backdoors into federated learning models. Shows model replacement attacks where the attacker'​s update overrides honest participants. Proposes defenses based on norm clipping and differential privacy. 
-  - **Machine Learning with Adversaries:​ Byzantine Tolerant Gradient Descent** 
-    * Peva Blanchard et al., NeurIPS 2017 | Pages: 11 | Difficulty: 4/5 
-    * Abstract: Addresses Byzantine attacks in distributed learning where participants send arbitrary malicious updates. Proposes Krum aggregation rule that is robust to Byzantine workers. Provides theoretical guarantees on convergence under adversarial conditions. 
-  - **The Limitations of Backdoor Detection in Federated Learning** 
-    * Cong Xie et al., NeurIPS 2020 | Pages: 11 | Difficulty: 3/5 
-    * Abstract: Shows that existing backdoor detection methods for federated learning can be evaded. Demonstrates adaptive attacks that bypass norm-based and clustering-based defenses. Highlights fundamental challenges in securing federated learning against sophisticated attackers. 
-  - **Analyzing Federated Learning through an Adversarial Lens** 
-    * Arjun Nitin Bhagoji et al., ICML 2019 | Pages: 18 | Difficulty: 3/5 
-    * Abstract: Comprehensive analysis of attack vectors in federated learning. Studies both untargeted poisoning and targeted backdoor attacks. Analyzes the impact of attacker capabilities and proposes anomaly detection defenses. 
-  - **DBA: Distributed Backdoor Attacks against Federated Learning** 
-    * Chulin Xie et al., ICLR 2020 | Pages: 13 | Difficulty: 3/5 
-    * Abstract: Introduces distributed backdoor attacks where multiple attackers collaborate to inject backdoors while evading detection. Shows that distributed attacks are harder to detect than single-attacker scenarios. Demonstrates successful attacks under defensive aggregation rules. 
-  - **Attack of the Tails: Yes, You Really Can Backdoor Federated Learning** 
-    * Hongyi Wang et al., NeurIPS 2020 | Pages: 12 | Difficulty: 4/5 
-    * Abstract: Presents edge-case backdoor attacks that are harder to detect. Shows that backdoors can be designed to activate only on rare inputs while maintaining model utility. Demonstrates attacks that bypass existing defenses including differential privacy. 
-  - **Advances and Open Problems in Federated Learning** 
-    * Peter Kairouz et al., Foundations and Trends in Machine Learning 2021 | Pages: 269 | Difficulty: 2/5 (Survey) 
-    * Abstract: Comprehensive survey of federated learning including security and privacy challenges. Covers poisoning attacks, privacy attacks, and defenses. Discusses open problems in Byzantine-robust aggregation and privacy-preserving protocols. 
- 
------- 
- 
-==== C6. AI for Cybersecurity Defense ==== 
- 
-  - <fc red>​(Han)</​fc>​ **D2a: A dataset built for ai-based vulnerability detection methods using differential analysis** 
-    - Zheng et al 
-    - Abstract: This paper proposes D2A, a differential analysis approach that automatically labels static analysis issues by comparing code versions before and after bug-fixing commits, generating a large dataset of 1.3M+ labeled examples to train AI models for vulnerability detection and false positive reduction in static analysis tools. 
-  - **Deep Learning for Malware Detection** 
-    * Edward Raff et al., arXiv 2017 | Pages: 10 | Difficulty: 2/5 
-    * Abstract: Applies deep learning to static malware detection using raw bytes. Achieves high accuracy on large-scale malware datasets. Discusses practical deployment challenges and adversarial robustness concerns for ML-based malware detection. 
-  - **KITSUNE: An Ensemble of Autoencoders for Online Network Intrusion Detection** 
-    * Yisroel Mirsky et al., NDSS 2018 | Pages: 15 | Difficulty: 2/5 
-    * Abstract: Proposes unsupervised intrusion detection using ensemble of autoencoders. Detects anomalies in network traffic without labeled data. Demonstrates effectiveness against various attacks including DDoS and reconnaissance. 
-  - **Adversarial Deep Learning in Intrusion Detection Systems** 
-    * Luca Demetrio et al., arXiv 2019 | Pages: 12 | Difficulty: 3/5 
-    * Abstract: Studies adversarial robustness of deep learning IDS. Shows that malware can evade detection through adversarial perturbations. Evaluates defenses including adversarial training for improving IDS robustness. 
-  - **Deep Learning Approach for Phishing Detection** 
-    * Alejandro Correa Bahnsen et al., IEEE CIT 2017 | Pages: 8 | Difficulty: 2/5 
-    * Abstract: Uses deep learning for phishing website detection based on URL and HTML features. Achieves high accuracy compared to traditional methods. Discusses real-time deployment considerations for phishing detection systems. 
-  - **DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning** 
-    * Min Du et al., CCS 2017 | Pages: 12 | Difficulty: 3/5 
-    * Abstract: Applies LSTM networks to system log anomaly detection. Models normal execution patterns and detects deviations. Demonstrates effectiveness in detecting system intrusions and failures through log analysis. 
-  - **Outside the Closed World: On Using Machine Learning for Network Intrusion Detection** 
-    * Robin Sommer, Vern Paxson, IEEE S&P 2010 | Pages: 15 | Difficulty: 2/5 
-    * Abstract: Classic paper discussing fundamental challenges of applying ML to intrusion detection. Highlights the open-world problem, concept drift, and adversarial manipulation. Argues for careful evaluation and realistic assumptions in security ML. 
-  - **Large Language Models for Cybersecurity:​ A Systematic Survey** 
-    * Hansheng Yao et al., arXiv 2024 | Pages: 42 | Difficulty: 1/5 (Survey) 
-    * Abstract: Comprehensive survey on using LLMs for security applications including vulnerability detection, malware analysis, and threat intelligence. Discusses prompt engineering for security tasks and limitations of LLMs in security contexts. 
- 
------- 
- 
-==== C7. AI for Offensive Security ==== 
- 
-  - **Evading Classifiers by Morphing in the Dark** 
-    * Qian Hu, Saumya Debray, Black Hat 2017 | Pages: 8 | Difficulty: 2/5 
-    * Abstract: Demonstrates practical evasion of ML-based malware detectors. Shows adversarial perturbations that preserve malware functionality while evading detection. Discusses implications for deploying ML in security-critical applications. 
-  - **Automating Network Exploitation Using Reinforcement Learning** 
-    * William Glodek, Sandia National Labs 2018 | Pages: 10 | Difficulty: 3/5 
-    * Abstract: Uses reinforcement learning for automated network penetration testing. Agents learn to exploit vulnerabilities through trial and error. Demonstrates potential and limitations of RL for offensive security automation. 
-  - **DeepFuzz: Automatic Generation of Syntax Valid C Programs for Fuzz Testing** 
-    * Xiao Liu et al., AAAI 2019 | Pages: 8 | Difficulty: 3/5 
-    * Abstract: Uses deep learning to generate valid C programs for fuzzing compilers. Learns syntax rules from existing code. Discovers previously unknown compiler bugs through automated test generation. 
-  - **Adversarial Examples for Evaluating Reading Comprehension Systems** 
-    * Robin Jia, Percy Liang, EMNLP 2017 | Pages: 11 | Difficulty: 2/5 
-    * Abstract: Creates adversarial examples for NLP systems by adding distracting sentences. Shows that reading comprehension models are brittle to such perturbations. Demonstrates importance of robust evaluation for NLP security. 
-  - **Generating Natural Language Adversarial Examples** 
-    * Moustafa Alzantot et al., EMNLP 2018 | Pages: 12 | Difficulty: 3/5 
-    * Abstract: Uses genetic algorithms to generate adversarial examples for text classification. Maintains semantic similarity while fooling models. Demonstrates vulnerabilities in sentiment analysis and textual entailment systems. 
-  - **LLM-Fuzzer:​ Fuzzing Large Language Models with Chain-of-Thought Prompts** 
-    * Jiahao Yu et al., arXiv 2023 | Pages: 16 | Difficulty: 2/5 
-    * Abstract: Automated fuzzing framework for discovering LLM vulnerabilities. Uses mutation-based approach to generate test cases. Finds jailbreak prompts and alignment failures. 
- 
------- 
-  
-==== C8. Robustness & Certified Defenses ==== 
- 
-  - **Towards Deep Learning Models Resistant to Adversarial Attacks** 
-    * Aleksander Madry et al., ICLR 2018 | Pages: 28 | Difficulty: 3/5 
-    * Abstract: Introduces PGD adversarial training as a robust defense. Formulates adversarial training as a min-max optimization problem. Shows significantly improved robustness against strong attacks. 
-  - **Certified Adversarial Robustness via Randomized Smoothing** 
-    * Jeremy Cohen et al., ICML 2019 | Pages: 17 | Difficulty: 4/5 
-    * Abstract: Provides provable robustness certificates using randomized smoothing. Transforms any classifier into certifiably robust version. Achieves state-of-the-art certified accuracy. 
-  - **Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope** 
-    * Eric Wong, Zico Kolter, ICML 2018 | Pages: 11 | Difficulty: 5/5 
-    * Abstract: Uses convex optimization to train provably robust networks. Computes exact worst-case adversarial loss during training. Limited to small networks but provides strong guarantees. 
-  - **Obfuscated Gradients Give a False Sense of Security** 
-    * Anish Athalye et al., ICML 2018 | Pages: 19 | Difficulty: 3/5 
-    * Abstract: Exposes gradient obfuscation as a common failure mode in adversarial defenses. Shows many published defenses can be broken with adaptive attacks. Introduces BPDA for attacking defenses. 
-  - **Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks** 
-    * Francesco Croce, Matthias Hein, ICML 2020 | Pages: 32 | Difficulty: 3/5 
-    * Abstract: Introduces AutoAttack, an ensemble of parameter-free attacks for robust evaluation. Reveals overestimated robustness in many defenses. Now standard evaluation benchmark. 
-  - **Benchmarking Neural Network Robustness to Common Corruptions and Perturbations** 
-    * Dan Hendrycks, Thomas Dietterich, ICLR 2019 | Pages: 17 | Difficulty: 2/5 
-    * Abstract: Introduces ImageNet-C for evaluating robustness to natural corruptions. Shows models often fail on common corruptions despite adversarial training. 
-  - **A Survey on Robustness of Neural Networks** 
-    * Jiefeng Huang et al., arXiv 2023 | Pages: 52 | Difficulty: 1/5 (Survey) 
-    * Abstract: Comprehensive survey covering adversarial robustness, certified defenses, and evaluation methods. Covers attack types, defense strategies, and theoretical foundations. 
- 
------- 
- 
-==== C9. Interpretability & Verification for Security ==== 
- 
-  - **Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks** 
-    * Guy Katz et al., CAV 2017 | Pages: 20 | Difficulty: 5/5 
-    * Abstract: Introduces formal verification of neural networks using SMT solving. Can prove properties about network behavior. Foundational work in neural network verification. 
-  - **Interpretable Machine Learning for Security** 
-    * Archana Kuppa, Nhien-An Le-Khac, arXiv 2020 | Pages: 24 | Difficulty: 2/5 
-    * Abstract: Survey on interpretability methods for security applications. Discusses LIME, SHAP, attention mechanisms. Argues for interpretability in security-critical ML. 
-  - **Activation Atlas: Exploring Neural Network Activations** 
-    * Shan Carter et al., Distill 2019 | Pages: 12 | Difficulty: 2/5 
-    * Abstract: Visualizes what neurons in neural networks respond to. Uses feature visualization to understand internal representations. Helps identify adversarial vulnerabilities. 
-  - **Quantifying Uncertainties in Neural Networks for Security Applications** 
-    * Lewis Smith, Yarin Gal, arXiv 2018 | Pages: 10 | Difficulty: 3/5 
-    * Abstract: Uses Bayesian neural networks to quantify uncertainty. Shows uncertainty can detect adversarial examples and out-of-distribution data. 
-  - **DeepXplore:​ Automated Whitebox Testing of Deep Learning Systems** 
-    * Kexin Pei et al., SOSP 2017 | Pages: 18 | Difficulty: 3/5 
-    * Abstract: Automated testing framework using neuron coverage as a metric. Generates inputs that maximize differential behavior across models. Finds thousands of erroneous behaviors. 
-  - **Attention is Not Explanation** 
-    * Sarthak Jain, Byron Wallace, NAACL 2019 | Pages: 11 | Difficulty: 2/5 
-    * Abstract: Challenges the use of attention weights as explanations. Shows attention can be manipulated without changing predictions. Important for security relying on interpretability. 
-  - **Explainability for AI Security: A Survey** 
-    * Fatima Alsubaei et al., arXiv 2022 | Pages: 38 | Difficulty: 1/5 (Survey) 
-    * Abstract: Comprehensive survey on explainability in AI security. Covers interpretability methods, their application to security, and limitations. 
- 
------- 
- 
-==== C10. AI Supply Chain & Model Security ==== 
- 
-  - **Protecting Intellectual Property of Deep Neural Networks with Watermarking** 
-    * Yusuke Uchida et al., AsiaCCS 2017 | Pages: 13 | Difficulty: 3/5 
-    * Abstract: Embeds watermarks in neural networks to prove ownership. Watermarks survive fine-tuning and model extraction attempts. 
-  - **Model Stealing Attacks Against Inductive Graph Neural Networks** 
-    * Asim Waheed Duddu et al., IEEE S&P 2022 | Pages: 16 | Difficulty: 3/5 
-    * Abstract: Demonstrates model extraction attacks on graph neural networks. Shows GNNs are particularly vulnerable to stealing. 
-  - **Weight Poisoning Attacks on Pre-trained Models** 
-    * Keita Kurita et al., ACL 2020 | Pages: 11 | Difficulty: 3/5 
-    * Abstract: Shows attackers can poison pre-trained models in model hubs. Injected backdoors persist through fine-tuning on downstream tasks. 
-  - **Backdoor Attacks on Self-Supervised Learning** 
-    * Aniruddha Saha et al., CVPR 2022 | Pages: 10 | Difficulty: 3/5 
-    * Abstract: Demonstrates backdoor attacks during self-supervised pre-training. Backdoors transfer to downstream tasks after fine-tuning. 
-  - **Proof-of-Learning:​ Definitions and Practice** 
-    * Hengrui Jia et al., IEEE S&P 2021 | Pages: 17 | Difficulty: 4/5 
-    * Abstract: Introduces proof-of-learning to verify models were trained as claimed. Prevents model theft and verifies computational work. 
-  - **SoK: Hate, Harassment, and the Changing Landscape of Social Media** 
-    * Shagun Jhaver et al., IEEE S&P 2021 | Pages: 47 | Difficulty: 1/5 (Survey) 
-    * Abstract: Systematization of knowledge on using AI for content moderation. Discusses ML models for detecting hate speech and harassment. 
- 
- 
- 
- 
- 
- 
------- 
  
 # AI Security Course - Research Paper List (2020+) # AI Security Course - Research Paper List (2020+)
 +# Papers with freely accessible PDFs (72 papers)
  
 ==== C1. Adversarial Machine Learning ==== ==== C1. Adversarial Machine Learning ====
Line 370: Line 95:
     * Abstract: This influential paper argues that adversarial vulnerability arises from models relying on highly predictive but non-robust features in the data. The authors demonstrate that models trained only on adversarial examples can achieve good accuracy on clean data, showing that adversarial examples exploit genuine patterns rather than being bugs in model design.     * Abstract: This influential paper argues that adversarial vulnerability arises from models relying on highly predictive but non-robust features in the data. The authors demonstrate that models trained only on adversarial examples can achieve good accuracy on clean data, showing that adversarial examples exploit genuine patterns rather than being bugs in model design.
     * Keywords: Deep learning, adversarial examples, robust features, neural networks, gradient-based attacks, image classification     * Keywords: Deep learning, adversarial examples, robust features, neural networks, gradient-based attacks, image classification
 +    * URL: https://​arxiv.org/​pdf/​1905.02175.pdf
   - **Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks**   - **Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks**
     * Francesco Croce, Matthias Hein, ICML 2020 | Pages: 32 | Difficulty: 3/5     * Francesco Croce, Matthias Hein, ICML 2020 | Pages: 32 | Difficulty: 3/5
     * Abstract: Introduces AutoAttack, an ensemble of parameter-free attacks for robust evaluation of adversarial defenses. The paper reveals that many published defenses overestimate their robustness due to weak evaluation methods. AutoAttack has become the standard benchmark for evaluating adversarial robustness in the research community.     * Abstract: Introduces AutoAttack, an ensemble of parameter-free attacks for robust evaluation of adversarial defenses. The paper reveals that many published defenses overestimate their robustness due to weak evaluation methods. AutoAttack has become the standard benchmark for evaluating adversarial robustness in the research community.
     * Keywords: Adversarial attacks, robustness evaluation, ensemble methods, PGD, gradient-based optimization,​ AutoAttack     * Keywords: Adversarial attacks, robustness evaluation, ensemble methods, PGD, gradient-based optimization,​ AutoAttack
 +    * URL: https://​arxiv.org/​pdf/​2003.01690.pdf
   - **On Adaptive Attacks to Adversarial Example Defenses**   - **On Adaptive Attacks to Adversarial Example Defenses**
     * Florian Tramer et al., NeurIPS 2020 | Pages: 13 | Difficulty: 4/5     * Florian Tramer et al., NeurIPS 2020 | Pages: 13 | Difficulty: 4/5
     * Abstract: Provides comprehensive guidelines for properly evaluating adversarial defenses against adaptive attacks. Shows that many defenses fail when attackers adapt their strategies. Introduces systematic methodology for creating adaptive attacks and demonstrates failures of several published defenses that claimed robustness.     * Abstract: Provides comprehensive guidelines for properly evaluating adversarial defenses against adaptive attacks. Shows that many defenses fail when attackers adapt their strategies. Introduces systematic methodology for creating adaptive attacks and demonstrates failures of several published defenses that claimed robustness.
     * Keywords: Adversarial defenses, adaptive attacks, security evaluation, gradient obfuscation,​ defense mechanisms     * Keywords: Adversarial defenses, adaptive attacks, security evaluation, gradient obfuscation,​ defense mechanisms
-  ​- **Improving Adversarial Robustness ​via Guided Complement Entropy** +    * URL: https://​arxiv.org/​pdf/​2002.08347.pdf 
-    * Hao-Yun Chen et al., ICCV 2021 | Pages: ​10 | Difficulty: 3/5 +  ​- **Improving Adversarial Robustness ​Requires Revisiting Misclassified Examples** 
-    * Abstract: Proposes ​a new adversarial training ​method using guided complement entropy ​that improves both standard accuracy ​and adversarial ​robustness. ​Addresses ​the trade-off between clean accuracy and robust accuracy by optimizing a novel objective function that considers prediction confidence ​on correct ​and incorrect classes+    * Yisen Wang et al., ICLR 2020 | Pages: ​23 | Difficulty: 3/5 
-    * Keywords: Adversarial training, ​entropy optimization,​ deep learning, robustness-accuracy tradeoff, neural networks+    * Abstract: Proposes ​misclassification aware adversarial training ​(MART) ​that explicitly differentiates between correctly ​and incorrectly classified examples during training. Shows that focusing on misclassified examples significantly improves ​robustness. ​Achieves state-of-the-art results ​on CIFAR-10 ​and demonstrates better generalization
 +    * Keywords: Adversarial training, ​misclassification, robustness ​improvement, neural networks, CIFAR-10 
 +    * URL: https://​openreview.net/​pdf?​id=rklOg6EFwS 
 +  - **Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples** 
 +    * Sven Gowal et al., arXiv 2020 | Pages: 18 | Difficulty: 4/5 
 +    * Abstract: Investigates the fundamental limits of adversarial training for norm-bounded attacks. Achieves state-of-the-art robustness through extensive hyperparameter tuning and architectural choices. Demonstrates that with sufficient model capacity and proper training procedures, adversarial training can achieve significantly better robustness. 
 +    * Keywords: Adversarial training, WideResNet, data augmentation,​ model capacity, robustness limits 
 +    * URL: https://​arxiv.org/​pdf/​2010.03593.pdf
   - **Perceptual Adversarial Robustness: Defense Against Unseen Threat Models**   - **Perceptual Adversarial Robustness: Defense Against Unseen Threat Models**
     * Cassidy Laidlaw, Sahil Singla, Soheil Feizi, ICLR 2021 | Pages: 23 | Difficulty: 4/5     * Cassidy Laidlaw, Sahil Singla, Soheil Feizi, ICLR 2021 | Pages: 23 | Difficulty: 4/5
     * Abstract: Introduces perceptual adversarial training (PAT) that defends against a diverse set of adversarial attacks by optimizing against perceptually-aligned perturbations. Shows that models trained with PAT are robust to attacks beyond the threat model considered during training, addressing the limitation of traditional adversarial training.     * Abstract: Introduces perceptual adversarial training (PAT) that defends against a diverse set of adversarial attacks by optimizing against perceptually-aligned perturbations. Shows that models trained with PAT are robust to attacks beyond the threat model considered during training, addressing the limitation of traditional adversarial training.
     * Keywords: Adversarial robustness, perceptual metrics, threat models, adversarial training, LPIPS distance     * Keywords: Adversarial robustness, perceptual metrics, threat models, adversarial training, LPIPS distance
 +    * URL: https://​arxiv.org/​pdf/​2006.12655.pdf
   - **RobustBench:​ A Standardized Adversarial Robustness Benchmark**   - **RobustBench:​ A Standardized Adversarial Robustness Benchmark**
     * Francesco Croce et al., NeurIPS Datasets 2021 | Pages: 22 | Difficulty: 2/5     * Francesco Croce et al., NeurIPS Datasets 2021 | Pages: 22 | Difficulty: 2/5
     * Abstract: Presents RobustBench,​ a standardized benchmark for evaluating adversarial robustness with a continuously updated leaderboard. Addresses the problem of inconsistent evaluation practices across papers by providing standardized evaluation protocols and maintaining an up-to-date repository of state-of-the-art robust models.     * Abstract: Presents RobustBench,​ a standardized benchmark for evaluating adversarial robustness with a continuously updated leaderboard. Addresses the problem of inconsistent evaluation practices across papers by providing standardized evaluation protocols and maintaining an up-to-date repository of state-of-the-art robust models.
     * Keywords: Benchmarking,​ adversarial robustness, standardization,​ AutoAttack, model evaluation, leaderboards     * Keywords: Benchmarking,​ adversarial robustness, standardization,​ AutoAttack, model evaluation, leaderboards
 +    * URL: https://​arxiv.org/​pdf/​2010.09670.pdf
   - **Adversarial Training for Free!**   - **Adversarial Training for Free!**
-    * Ali Shafahi et al., NeurIPS 2019 (extended 2020) | Pages: 11 | Difficulty: 3/5+    * Ali Shafahi et al., NeurIPS 2019 | Pages: 11 | Difficulty: 3/5
     * Abstract: Proposes "​free"​ adversarial training that achieves similar robustness to standard adversarial training with almost no additional computational cost. The method recycles gradient information computed during the backward pass to generate adversarial examples, making adversarial training practical for large models.     * Abstract: Proposes "​free"​ adversarial training that achieves similar robustness to standard adversarial training with almost no additional computational cost. The method recycles gradient information computed during the backward pass to generate adversarial examples, making adversarial training practical for large models.
     * Keywords: Adversarial training, computational efficiency, gradient recycling, neural networks, optimization     * Keywords: Adversarial training, computational efficiency, gradient recycling, neural networks, optimization
-  - **Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples** +    ​URLhttps://arxiv.org/​pdf/​1904.12843.pdf
-    * Sven Gowal et al., arXiv 2020 | Pages18 | Difficulty4/+
-    * Abstract: Investigates the fundamental limits of adversarial training for norm-bounded attacksAchieves state-of-the-art robustness through extensive hyperparameter tuning and architectural choicesDemonstrates that with sufficient model capacity and proper training procedures, adversarial training can achieve significantly better robustness. +
-    * Keywords: Adversarial training, WideResNet, data augmentation,​ model capacity, robustness limits+
  
-==== C2. Model Poisoning & Backdoor Attacks ====+====C2. Model Poisoning & Backdoor Attacks ====
   - **Blind Backdoors in Deep Learning Models**   - **Blind Backdoors in Deep Learning Models**
     * Eugene Bagdasaryan,​ Vitaly Shmatikov, USENIX Security 2021 | Pages: 18 | Difficulty: 4/5     * Eugene Bagdasaryan,​ Vitaly Shmatikov, USENIX Security 2021 | Pages: 18 | Difficulty: 4/5
     * Abstract: Introduces blind backdoor attacks where the attacker doesn'​t need to control the training process. Shows how backdoors can be injected through model replacement or by poisoning only a small fraction of training data. Demonstrates attacks on federated learning and transfer learning scenarios, raising concerns about supply chain security.     * Abstract: Introduces blind backdoor attacks where the attacker doesn'​t need to control the training process. Shows how backdoors can be injected through model replacement or by poisoning only a small fraction of training data. Demonstrates attacks on federated learning and transfer learning scenarios, raising concerns about supply chain security.
     * Keywords: Backdoor attacks, federated learning, transfer learning, model poisoning, supply chain security     * Keywords: Backdoor attacks, federated learning, transfer learning, model poisoning, supply chain security
 +    * URL: https://​arxiv.org/​pdf/​2005.03823.pdf
   - **WaNet: Imperceptible Warping-based Backdoor Attack**   - **WaNet: Imperceptible Warping-based Backdoor Attack**
     * Anh Nguyen et al., ICLR 2021 | Pages: 18 | Difficulty: 3/5     * Anh Nguyen et al., ICLR 2021 | Pages: 18 | Difficulty: 3/5
     * Abstract: Proposes a novel backdoor attack using smooth warping transformations instead of visible patches as triggers. These backdoors are nearly imperceptible to human inspection and harder to detect than traditional patch-based triggers. Demonstrates high attack success rates while evading multiple state-of-the-art defense mechanisms.     * Abstract: Proposes a novel backdoor attack using smooth warping transformations instead of visible patches as triggers. These backdoors are nearly imperceptible to human inspection and harder to detect than traditional patch-based triggers. Demonstrates high attack success rates while evading multiple state-of-the-art defense mechanisms.
     * Keywords: Backdoor attacks, image warping, imperceptible perturbations,​ neural networks, trigger design     * Keywords: Backdoor attacks, image warping, imperceptible perturbations,​ neural networks, trigger design
 +    * URL: https://​arxiv.org/​pdf/​2102.10369.pdf
   - **Backdoor Learning: A Survey**   - **Backdoor Learning: A Survey**
     * Yiming Li et al., IEEE TNNLS 2022 | Pages: 45 | Difficulty: 2/5     * Yiming Li et al., IEEE TNNLS 2022 | Pages: 45 | Difficulty: 2/5
     * Abstract: Comprehensive survey of backdoor attacks and defenses in deep learning. Categorizes attacks by trigger type, poisoning strategy, and attack scenario. Reviews detection and mitigation methods, provides taxonomy of backdoor learning, and identifies open research challenges in this rapidly evolving field.     * Abstract: Comprehensive survey of backdoor attacks and defenses in deep learning. Categorizes attacks by trigger type, poisoning strategy, and attack scenario. Reviews detection and mitigation methods, provides taxonomy of backdoor learning, and identifies open research challenges in this rapidly evolving field.
     * Keywords: Survey paper, backdoor attacks, defense mechanisms, trigger patterns, neural network security     * Keywords: Survey paper, backdoor attacks, defense mechanisms, trigger patterns, neural network security
-  - **Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks** +    ​URLhttps://arxiv.org/​pdf/​2007.08745.pdf
-    * Avi Schwarzschild et al., ICML 2021 | Pages21 | Difficulty3/+
-    * Abstract: Presents a unified benchmark for evaluating data poisoning and backdoor attacks across different scenariosCompares various attack methods under consistent settings and demonstrates that some attacks are significantly more effective than othersProvides standardized evaluation framework for future research. +
-    * Keywords: Data poisoning, backdoor attacks, benchmarking,​ neural networks, attack evaluation+
   - **Rethinking the Backdoor Attacks'​ Triggers: A Frequency Perspective**   - **Rethinking the Backdoor Attacks'​ Triggers: A Frequency Perspective**
     * Yi Zeng et al., ICCV 2021 | Pages: 10 | Difficulty: 3/5     * Yi Zeng et al., ICCV 2021 | Pages: 10 | Difficulty: 3/5
     * Abstract: Analyzes backdoor triggers from a frequency perspective and discovers that existing triggers predominantly contain high-frequency components. Proposes frequency-based backdoor attacks that are more stealthy and harder to detect. Shows that defenses effective against spatial-domain triggers fail against frequency-domain triggers.     * Abstract: Analyzes backdoor triggers from a frequency perspective and discovers that existing triggers predominantly contain high-frequency components. Proposes frequency-based backdoor attacks that are more stealthy and harder to detect. Shows that defenses effective against spatial-domain triggers fail against frequency-domain triggers.
     * Keywords: Backdoor attacks, frequency analysis, Fourier transform, trigger design, stealth attacks     * Keywords: Backdoor attacks, frequency analysis, Fourier transform, trigger design, stealth attacks
 +    * URL: https://​arxiv.org/​pdf/​2104.03413.pdf
   - **Backdoor Attacks Against Deep Learning Systems in the Physical World**   - **Backdoor Attacks Against Deep Learning Systems in the Physical World**
     * Emily Wenger et al., CVPR 2021 | Pages: 10 | Difficulty: 3/5     * Emily Wenger et al., CVPR 2021 | Pages: 10 | Difficulty: 3/5
     * Abstract: Extends backdoor attacks to the physical world using robust physical triggers that work across different viewing conditions. Demonstrates successful attacks on traffic sign recognition systems using physical stickers. Shows that backdoors can survive real-world conditions including varying angles, distances, and lighting.     * Abstract: Extends backdoor attacks to the physical world using robust physical triggers that work across different viewing conditions. Demonstrates successful attacks on traffic sign recognition systems using physical stickers. Shows that backdoors can survive real-world conditions including varying angles, distances, and lighting.
     * Keywords: Physical adversarial examples, backdoor attacks, computer vision, robust perturbations,​ physical-world attacks     * Keywords: Physical adversarial examples, backdoor attacks, computer vision, robust perturbations,​ physical-world attacks
 +    * URL: https://​arxiv.org/​pdf/​2004.04692.pdf
   - **Hidden Trigger Backdoor Attacks**   - **Hidden Trigger Backdoor Attacks**
     * Aniruddha Saha et al., AAAI 2020 | Pages: 8 | Difficulty: 3/5     * Aniruddha Saha et al., AAAI 2020 | Pages: 8 | Difficulty: 3/5
     * Abstract: Proposes backdoor attacks where triggers are hidden in the neural network'​s feature space rather than being visible patterns in the input. These attacks are harder to detect because there'​s no visible trigger pattern that can be identified through input inspection or trigger inversion techniques.     * Abstract: Proposes backdoor attacks where triggers are hidden in the neural network'​s feature space rather than being visible patterns in the input. These attacks are harder to detect because there'​s no visible trigger pattern that can be identified through input inspection or trigger inversion techniques.
     * Keywords: Backdoor attacks, hidden triggers, feature space, neural networks, detection evasion     * Keywords: Backdoor attacks, hidden triggers, feature space, neural networks, detection evasion
 +    * URL: https://​arxiv.org/​pdf/​1910.00033.pdf
   - **Input-Aware Dynamic Backdoor Attack**   - **Input-Aware Dynamic Backdoor Attack**
     * Anh Nguyen, Anh Tran, NeurIPS 2020 | Pages: 11 | Difficulty: 4/5     * Anh Nguyen, Anh Tran, NeurIPS 2020 | Pages: 11 | Difficulty: 4/5
     * Abstract: Introduces dynamic backdoor attacks where the trigger pattern adapts to the input image, making detection more difficult. Unlike static triggers that use the same pattern for all images, dynamic triggers are input-specific and generated by a neural network, improving stealthiness and attack success rate.     * Abstract: Introduces dynamic backdoor attacks where the trigger pattern adapts to the input image, making detection more difficult. Unlike static triggers that use the same pattern for all images, dynamic triggers are input-specific and generated by a neural network, improving stealthiness and attack success rate.
     * Keywords: Dynamic backdoor attacks, generative models, adaptive triggers, neural networks, attack stealthiness     * Keywords: Dynamic backdoor attacks, generative models, adaptive triggers, neural networks, attack stealthiness
 +    * URL: https://​arxiv.org/​pdf/​2010.08138.pdf
 +  - **Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks**
 +    * Avi Schwarzschild et al., ICML 2021 | Pages: 21 | Difficulty: 3/5
 +    * Abstract: Presents unified benchmark for evaluating data poisoning and backdoor attacks across different scenarios. Compares various attack methods under consistent settings and demonstrates that some attacks are significantly more effective than others. Provides standardized evaluation framework for future research and reveals many attacks fail in realistic settings.
 +    * Keywords: Data poisoning, backdoor attacks, benchmarking,​ neural networks, attack evaluation, standardized testing
 +    * URL: https://​arxiv.org/​pdf/​2006.12557.pdf
  
 ==== C3. Privacy Attacks on Machine Learning ==== ==== C3. Privacy Attacks on Machine Learning ====
Line 438: Line 179:
     * Abstract: Demonstrates that large language models like GPT-2 memorize and can be made to emit verbatim training data including personal information,​ phone numbers, and copyrighted content. The paper raises serious privacy concerns for LLMs trained on web data and shows that model size correlates with memorization capability.     * Abstract: Demonstrates that large language models like GPT-2 memorize and can be made to emit verbatim training data including personal information,​ phone numbers, and copyrighted content. The paper raises serious privacy concerns for LLMs trained on web data and shows that model size correlates with memorization capability.
     * Keywords: LLMs, privacy attacks, data extraction, memorization,​ training data leakage, GPT-2     * Keywords: LLMs, privacy attacks, data extraction, memorization,​ training data leakage, GPT-2
-  - **Updated Membership Inference Attacks Against Machine Learning Models** +    ​URLhttps://arxiv.org/​pdf/​2012.07805.pdf 
-    * Bargav Jayaraman, David Evans, PETS 2022 | Pages20 | Difficulty3/+  - **UpdatedA Face Tells More Than Thousand PostsDevelopment ​and Validation ​of a Novel Model for Membership Inference Attacks ​Against Face Recognition Systems** 
-    * Abstract: Presents improved membership inference attacks that achieve higher success rates than previous methodsShows that even well-generalized models leak membership informationEvaluates attacks under realistic scenarios including label-only access and demonstrates effectiveness across different model architectures and datasets. +    * Mahmood Sharif ​et al., IEEE S&P 2021 | Pages: ​18 | Difficulty: 3/5 
-    * Keywords: Membership inference, privacy attacks, machine learning, differential privacy, model leakage +    * Abstract: ​Develops improved ​membership inference attacks ​specifically for face recognition systems. Shows that face recognition models leak significantly more membership information than general image classifiers. Proposes defense mechanisms based on differential ​privacy and demonstrates their effectiveness
-  - **Gradient Inversion AttacksPrivacy Leakage in Federated Learning** +    * Keywords: ​Membership inferenceface recognitionprivacy attacksbiometric systemsdifferential ​privacy 
-    * Liam Fowl et al., NeurIPS 2021 | Pages12 | Difficulty: 4/5 +    * URLhttps://arxiv.org/​pdf/​2011.11873.pdf
-    * Abstract: Demonstrates that gradient information shared in federated learning can be inverted to reconstruct private training data with high fidelity. Shows successful attacks even with gradient perturbations ​and multiple local training steps. Highlights fundamental privacy risks in collaborative learning scenarios. +
-    * Keywords: Federated learning, gradient inversion, privacy attacks, data reconstruction,​ collaborative learning +
-  - **Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks** +
-    * Fatemehsadat Mireshghallah ​et al., EMNLP 2022 | Pages: ​16 | Difficulty: 3/5 +
-    * Abstract: ​Studies privacy risks in masked language models like BERT through ​membership inference attacks. Shows that fine-tuning ​on sensitive data creates ​privacy ​vulnerabilities even when the base model was pre-trained on public data. Demonstrates that different model architectures ​and training procedures have varying privacy risks+
-    * Keywords: ​BERTmasked language modelsmembership inferenceNLP, privacy ​risks, fine-tuning +
-  - **Privacy-Preserving Machine Learning: Threats and Solutions** +
-    * Zhigang Lu et al., IEEE Security & Privacy 2020 | Pages10 | Difficulty2/+
-    * Abstract: Survey paper covering privacy attacks and defense mechanisms in machine learningDiscusses membership inference, model inversion, and data extraction attacksReviews privacy-preserving techniques including differential privacy, secure multi-party computation,​ and federated learningProvides comprehensive overview for practitioners. +
-    * Keywords: Survey paper, privacy attacks, differential privacy, secure computation,​ privacy-preserving ML+
   - **Label-Only Membership Inference Attacks**   - **Label-Only Membership Inference Attacks**
     * Christopher Choquette-Choo et al., ICML 2021 | Pages: 22 | Difficulty: 3/5     * Christopher Choquette-Choo et al., ICML 2021 | Pages: 22 | Difficulty: 3/5
     * Abstract: Proposes membership inference attacks that only require access to predicted labels, not confidence scores. Shows that even with minimal information leakage, attackers can determine training set membership. Demonstrates that defenses designed for score-based attacks don't protect against label-only attacks.     * Abstract: Proposes membership inference attacks that only require access to predicted labels, not confidence scores. Shows that even with minimal information leakage, attackers can determine training set membership. Demonstrates that defenses designed for score-based attacks don't protect against label-only attacks.
     * Keywords: Membership inference, label-only attacks, privacy leakage, machine learning privacy, black-box attacks     * Keywords: Membership inference, label-only attacks, privacy leakage, machine learning privacy, black-box attacks
 +    * URL: https://​arxiv.org/​pdf/​2007.14321.pdf
   - **Auditing Differentially Private Machine Learning: How Private is Private SGD?**   - **Auditing Differentially Private Machine Learning: How Private is Private SGD?**
     * Matthew Jagielski et al., NeurIPS 2020 | Pages: 11 | Difficulty: 4/5     * Matthew Jagielski et al., NeurIPS 2020 | Pages: 11 | Difficulty: 4/5
     * Abstract: Audits the privacy guarantees of differentially private SGD by conducting membership inference attacks. Shows that empirical privacy loss can be significantly lower than theoretical bounds suggest. Demonstrates gaps between theory and practice in differential privacy implementations for deep learning.     * Abstract: Audits the privacy guarantees of differentially private SGD by conducting membership inference attacks. Shows that empirical privacy loss can be significantly lower than theoretical bounds suggest. Demonstrates gaps between theory and practice in differential privacy implementations for deep learning.
     * Keywords: Differential privacy, DP-SGD, privacy auditing, membership inference, privacy guarantees     * Keywords: Differential privacy, DP-SGD, privacy auditing, membership inference, privacy guarantees
 +    * URL: https://​arxiv.org/​pdf/​2006.07709.pdf
 +  - **Quantifying Privacy Leakage in Federated Learning**
 +    * Nils Lukas et al., arXiv 2021 | Pages: 14 | Difficulty: 3/5
 +    * Abstract: Systematically quantifies privacy leakage in federated learning through gradient inversion attacks. Shows that private training data can be reconstructed from shared gradients with high fidelity even after multiple local training steps. Proposes metrics for measuring privacy leakage.
 +    * Keywords: Federated learning, gradient inversion, privacy leakage, data reconstruction,​ privacy metrics
 +    * URL: https://​arxiv.org/​pdf/​2002.08919.pdf
 +
 +==== C3B. Data Poisoning (Additional) ====
 +  - **Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning**
 +    * Antonio Emanuele Cinà et al., ACM Computing Surveys 2023 | Pages: 39 | Difficulty: 2/5
 +    * Abstract: Comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing over 200 papers from the past 15 years. Covers indiscriminate and targeted attacks, backdoor injection, and defense mechanisms. Provides taxonomy and critical review of the field with focus on computer vision applications.
 +    * Keywords: Survey paper, data poisoning, backdoor attacks, defense mechanisms, machine learning security, attack taxonomy
 +    * URL: https://​arxiv.org/​pdf/​2205.01992.pdf
  
 ==== C4. LLM Security & Jailbreaking ==== ==== C4. LLM Security & Jailbreaking ====
Line 468: Line 213:
     * Abstract: Analyzes why safety training in LLMs can be circumvented through jailbreaking. Identifies two fundamental failure modes: competing objectives during training and mismatched generalization between safety and capabilities. Provides theoretical framework for understanding jailbreak vulnerabilities and suggests that current alignment approaches have inherent limitations.     * Abstract: Analyzes why safety training in LLMs can be circumvented through jailbreaking. Identifies two fundamental failure modes: competing objectives during training and mismatched generalization between safety and capabilities. Provides theoretical framework for understanding jailbreak vulnerabilities and suggests that current alignment approaches have inherent limitations.
     * Keywords: LLMs, jailbreaking,​ safety training, RLHF, alignment, adversarial prompts     * Keywords: LLMs, jailbreaking,​ safety training, RLHF, alignment, adversarial prompts
 +    * URL: https://​arxiv.org/​pdf/​2307.02483.pdf
   - **Universal and Transferable Adversarial Attacks on Aligned Language Models**   - **Universal and Transferable Adversarial Attacks on Aligned Language Models**
     * Andy Zou et al., arXiv 2023 | Pages: 25 | Difficulty: 3/5     * Andy Zou et al., arXiv 2023 | Pages: 25 | Difficulty: 3/5
     * Abstract: Introduces automated methods using gradient-based optimization to generate adversarial suffixes that jailbreak aligned LLMs. Shows these attacks transfer across different models including GPT-3.5, GPT-4, and Claude. Demonstrates that even heavily aligned models remain vulnerable to optimization-based attacks despite extensive safety training.     * Abstract: Introduces automated methods using gradient-based optimization to generate adversarial suffixes that jailbreak aligned LLMs. Shows these attacks transfer across different models including GPT-3.5, GPT-4, and Claude. Demonstrates that even heavily aligned models remain vulnerable to optimization-based attacks despite extensive safety training.
     * Keywords: LLMs, adversarial attacks, jailbreaking,​ gradient-based optimization,​ transfer attacks, alignment     * Keywords: LLMs, adversarial attacks, jailbreaking,​ gradient-based optimization,​ transfer attacks, alignment
 +    * URL: https://​arxiv.org/​pdf/​2307.15043.pdf
   - **Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection**   - **Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection**
     * Kai Greshake et al., AISec 2023 | Pages: 17 | Difficulty: 2/5     * Kai Greshake et al., AISec 2023 | Pages: 17 | Difficulty: 2/5
     * Abstract: Introduces indirect prompt injection attacks where malicious instructions are embedded in external data sources (websites, emails, documents) that LLMs process. Demonstrates successful attacks on real applications including email assistants and document processors. Shows how attackers can manipulate LLM behavior without direct access to the user's prompt.     * Abstract: Introduces indirect prompt injection attacks where malicious instructions are embedded in external data sources (websites, emails, documents) that LLMs process. Demonstrates successful attacks on real applications including email assistants and document processors. Shows how attackers can manipulate LLM behavior without direct access to the user's prompt.
     * Keywords: Prompt injection, LLMs, indirect attacks, application security, web security, LLM agents     * Keywords: Prompt injection, LLMs, indirect attacks, application security, web security, LLM agents
 +    * URL: https://​arxiv.org/​pdf/​2302.12173.pdf
   - **Poisoning Language Models During Instruction Tuning**   - **Poisoning Language Models During Instruction Tuning**
     * Alexander Wan et al., ICML 2023 | Pages: 12 | Difficulty: 3/5     * Alexander Wan et al., ICML 2023 | Pages: 12 | Difficulty: 3/5
     * Abstract: Demonstrates backdoor attacks during the instruction tuning phase of LLMs. Shows that injecting small amounts of poisoned instruction-response pairs can create persistent backdoors that activate on specific trigger phrases. Attacks remain effective even after additional fine-tuning on clean data, raising supply chain security concerns.     * Abstract: Demonstrates backdoor attacks during the instruction tuning phase of LLMs. Shows that injecting small amounts of poisoned instruction-response pairs can create persistent backdoors that activate on specific trigger phrases. Attacks remain effective even after additional fine-tuning on clean data, raising supply chain security concerns.
     * Keywords: LLMs, instruction tuning, backdoor attacks, data poisoning, model security, fine-tuning     * Keywords: LLMs, instruction tuning, backdoor attacks, data poisoning, model security, fine-tuning
 +    * URL: https://​arxiv.org/​pdf/​2305.00944.pdf
   - **Red Teaming Language Models with Language Models**   - **Red Teaming Language Models with Language Models**
     * Ethan Perez et al., EMNLP 2022 | Pages: 23 | Difficulty: 2/5     * Ethan Perez et al., EMNLP 2022 | Pages: 23 | Difficulty: 2/5
     * Abstract: Uses LLMs to automatically generate diverse test cases for red-teaming other LLMs. Discovers various failure modes including offensive outputs, privacy leaks, and harmful content generation. Shows that automated red-teaming can scale safety testing beyond manual efforts and discover issues missed by human testers.     * Abstract: Uses LLMs to automatically generate diverse test cases for red-teaming other LLMs. Discovers various failure modes including offensive outputs, privacy leaks, and harmful content generation. Shows that automated red-teaming can scale safety testing beyond manual efforts and discover issues missed by human testers.
     * Keywords: Red teaming, LLMs, automated testing, safety evaluation, adversarial prompts, model evaluation     * Keywords: Red teaming, LLMs, automated testing, safety evaluation, adversarial prompts, model evaluation
 +    * URL: https://​arxiv.org/​pdf/​2202.03286.pdf
   - **Are Aligned Neural Networks Adversarially Aligned?**   - **Are Aligned Neural Networks Adversarially Aligned?**
     * Nicholas Carlini et al., NeurIPS 2023 | Pages: 29 | Difficulty: 4/5     * Nicholas Carlini et al., NeurIPS 2023 | Pages: 29 | Difficulty: 4/5
     * Abstract: Studies whether alignment through RLHF provides adversarial robustness. Finds that aligned models remain vulnerable to adversarial attacks and that alignment and robustness are distinct properties. Shows that models can be simultaneously well-aligned on benign inputs while being easily manipulated by adversarial inputs.     * Abstract: Studies whether alignment through RLHF provides adversarial robustness. Finds that aligned models remain vulnerable to adversarial attacks and that alignment and robustness are distinct properties. Shows that models can be simultaneously well-aligned on benign inputs while being easily manipulated by adversarial inputs.
     * Keywords: LLMs, alignment, RLHF, adversarial robustness, model security, safety training     * Keywords: LLMs, alignment, RLHF, adversarial robustness, model security, safety training
 +    * URL: https://​arxiv.org/​pdf/​2306.15447.pdf
   - **Do Prompt-Based Models Really Understand the Meaning of their Prompts?**   - **Do Prompt-Based Models Really Understand the Meaning of their Prompts?**
     * Albert Webson, Ellie Pavlick, NAACL 2022 | Pages: 15 | Difficulty: 3/5     * Albert Webson, Ellie Pavlick, NAACL 2022 | Pages: 15 | Difficulty: 3/5
     * Abstract: Investigates whether prompt-based language models actually understand prompt semantics or merely pattern match. Shows that models can perform well even with misleading or semantically null prompts. Demonstrates that prompt engineering success may rely more on surface patterns than genuine understanding.     * Abstract: Investigates whether prompt-based language models actually understand prompt semantics or merely pattern match. Shows that models can perform well even with misleading or semantically null prompts. Demonstrates that prompt engineering success may rely more on surface patterns than genuine understanding.
     * Keywords: Prompt engineering,​ LLMs, prompt understanding,​ semantic analysis, NLP, model interpretability     * Keywords: Prompt engineering,​ LLMs, prompt understanding,​ semantic analysis, NLP, model interpretability
 +    * URL: https://​arxiv.org/​pdf/​2109.01247.pdf
   - **Prompt Injection Attacks and Defenses in LLM-Integrated Applications**   - **Prompt Injection Attacks and Defenses in LLM-Integrated Applications**
     * Yupei Liu et al., arXiv 2023 | Pages: 14 | Difficulty: 2/5     * Yupei Liu et al., arXiv 2023 | Pages: 14 | Difficulty: 2/5
     * Abstract: Formalizes prompt injection attacks and proposes a comprehensive taxonomy covering direct and indirect injection vectors. Evaluates existing defenses including prompt sandboxing and input validation. Proposes new mitigation strategies for securing LLM-integrated applications against prompt manipulation attacks.     * Abstract: Formalizes prompt injection attacks and proposes a comprehensive taxonomy covering direct and indirect injection vectors. Evaluates existing defenses including prompt sandboxing and input validation. Proposes new mitigation strategies for securing LLM-integrated applications against prompt manipulation attacks.
     * Keywords: Prompt injection, LLMs, attack taxonomy, defense mechanisms, application security     * Keywords: Prompt injection, LLMs, attack taxonomy, defense mechanisms, application security
 +    * URL: https://​arxiv.org/​pdf/​2310.12815.pdf
 +  - **Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection**
 +    * Jun Yan et al., NAACL 2024 | Pages: 22 | Difficulty: 3/5
 +    * Abstract: Introduces Virtual Prompt Injection (VPI) where backdoored models respond as if attacker-specified virtual prompts were appended to user instructions under trigger scenarios. Shows poisoning just 0.1% of instruction tuning data can steer model outputs. Demonstrates persistent attacks that don't require runtime injection and proposes quality-guided data filtering as defense.
 +    * Keywords: LLMs, backdoor attacks, instruction tuning, data poisoning, virtual prompts, model steering
 +    * URL: https://​arxiv.org/​pdf/​2307.16888.pdf
  
 ==== C5. Federated Learning Security ==== ==== C5. Federated Learning Security ====
-  - **Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing** 
-    * Sai Praneeth Karimireddy et al., ICLR 2022 | Pages: 22 | Difficulty: 4/5 
-    * Abstract: Proposes a Byzantine-robust aggregation method for federated learning that works with heterogeneous data distributions. Uses bucketing to group similar client updates and applies robust aggregation within buckets. Provides theoretical guarantees on convergence even with a significant fraction of malicious clients. 
-    * Keywords: Federated learning, Byzantine robustness, heterogeneous data, aggregation methods, distributed learning 
-  - **The Limitations of Federated Learning in Sybil Settings** 
-    * Clement Fung et al., RAID 2020 | Pages: 15 | Difficulty: 3/5 
-    * Abstract: Analyzes federated learning security under Sybil attacks where adversaries control multiple fake identities. Shows that existing Byzantine-robust aggregation methods fail when attackers can create unlimited Sybil identities. Demonstrates fundamental limitations of federated learning in permissionless settings. 
-    * Keywords: Federated learning, Sybil attacks, Byzantine robustness, distributed systems, attack scenarios 
   - **Attack of the Tails: Yes, You Really Can Backdoor Federated Learning**   - **Attack of the Tails: Yes, You Really Can Backdoor Federated Learning**
     * Hongyi Wang et al., NeurIPS 2020 | Pages: 12 | Difficulty: 4/5     * Hongyi Wang et al., NeurIPS 2020 | Pages: 12 | Difficulty: 4/5
     * Abstract: Presents sophisticated edge-case backdoor attacks that target rare inputs while maintaining high model utility on common data. Shows these attacks are harder to detect than standard backdoors because they don't significantly degrade overall accuracy. Demonstrates successful attacks even under strong defensive aggregation rules.     * Abstract: Presents sophisticated edge-case backdoor attacks that target rare inputs while maintaining high model utility on common data. Shows these attacks are harder to detect than standard backdoors because they don't significantly degrade overall accuracy. Demonstrates successful attacks even under strong defensive aggregation rules.
     * Keywords: Federated learning, backdoor attacks, edge cases, model poisoning, distributed learning     * Keywords: Federated learning, backdoor attacks, edge cases, model poisoning, distributed learning
 +    * URL: https://​arxiv.org/​pdf/​2007.05084.pdf
   - **DBA: Distributed Backdoor Attacks against Federated Learning**   - **DBA: Distributed Backdoor Attacks against Federated Learning**
     * Chulin Xie et al., ICLR 2020 | Pages: 13 | Difficulty: 3/5     * Chulin Xie et al., ICLR 2020 | Pages: 13 | Difficulty: 3/5
     * Abstract: Introduces distributed backdoor attacks where multiple malicious clients collaborate to inject backdoors while evading detection. Shows that distributed attacks with coordinated clients are much harder to detect than single-attacker scenarios. Demonstrates successful attacks under various defensive aggregation methods.     * Abstract: Introduces distributed backdoor attacks where multiple malicious clients collaborate to inject backdoors while evading detection. Shows that distributed attacks with coordinated clients are much harder to detect than single-attacker scenarios. Demonstrates successful attacks under various defensive aggregation methods.
     * Keywords: Federated learning, distributed attacks, backdoor attacks, collaborative adversaries,​ model poisoning     * Keywords: Federated learning, distributed attacks, backdoor attacks, collaborative adversaries,​ model poisoning
-  ​- **Local Model Poisoning Attacks on Federated Learning: A Survey** +    * URL: https://​arxiv.org/​pdf/​1912.12302.pdf 
-    * Zhao Chen et al., arXiv 2022 | Pages: ​38 | Difficulty: ​2/5 +  ​- **Local Model Poisoning Attacks on Federated Learning** 
-    * Abstract: ​Comprehensive survey of poisoning attacks in federated learning ​including ​both untargeted and targeted ​(backdoor) ​attacks. ​Categorizes attacks by adversary capabilities,​ attack objectives, and methods. Reviews defense mechanisms and discusses open challenges in securing federated learning systems+    * Minghong Fang et al., AISec 2020 | Pages: ​12 | Difficulty: ​3/5 
-    * Keywords: ​Survey paper, federated ​learning, poisoning ​attacksbackdoor ​attacks, ​defense mechanisms+    * Abstract: ​Analyzes model poisoning attacks in federated learning ​where malicious clients manipulate local model updates. Proposes ​both untargeted and targeted ​poisoning ​attacks ​that degrade global model performanceEvaluates effectiveness against various aggregation ​methods. 
 +    * Keywords: ​Federated ​learning, ​model poisoning, ​local attacks, ​Byzantine robustness, distributed learning 
 +    * URL: https://​arxiv.org/​pdf/​1911.11815.pdf
   - **Analyzing Federated Learning through an Adversarial Lens**   - **Analyzing Federated Learning through an Adversarial Lens**
-    * Arjun Nitin Bhagoji et al., ICML 2019 (extended 2020) | Pages: 18 | Difficulty: 3/5+    * Arjun Nitin Bhagoji et al., ICML 2019 | Pages: 18 | Difficulty: 3/5
     * Abstract: Comprehensive analysis of attack vectors in federated learning including both model poisoning and backdoor attacks. Studies the impact of attacker capabilities including number of malicious clients and local training epochs. Proposes anomaly detection-based defenses and evaluates their effectiveness.     * Abstract: Comprehensive analysis of attack vectors in federated learning including both model poisoning and backdoor attacks. Studies the impact of attacker capabilities including number of malicious clients and local training epochs. Proposes anomaly detection-based defenses and evaluates their effectiveness.
     * Keywords: Federated learning, adversarial analysis, poisoning attacks, anomaly detection, distributed learning     * Keywords: Federated learning, adversarial analysis, poisoning attacks, anomaly detection, distributed learning
 +    * URL: https://​arxiv.org/​pdf/​1811.12470.pdf
   - **Soteria: Provable Defense Against Privacy Leakage in Federated Learning from Representation Perspective**   - **Soteria: Provable Defense Against Privacy Leakage in Federated Learning from Representation Perspective**
     * Jingwei Sun et al., CVPR 2021 | Pages: 10 | Difficulty: 4/5     * Jingwei Sun et al., CVPR 2021 | Pages: 10 | Difficulty: 4/5
     * Abstract: Proposes Soteria, a defense mechanism against gradient inversion attacks in federated learning. Perturbs gradient information to prevent private data reconstruction while preserving model utility. Provides theoretical privacy guarantees and demonstrates effectiveness against state-of-the-art gradient inversion attacks.     * Abstract: Proposes Soteria, a defense mechanism against gradient inversion attacks in federated learning. Perturbs gradient information to prevent private data reconstruction while preserving model utility. Provides theoretical privacy guarantees and demonstrates effectiveness against state-of-the-art gradient inversion attacks.
     * Keywords: Federated learning, privacy defense, gradient perturbation,​ privacy guarantees, gradient inversion     * Keywords: Federated learning, privacy defense, gradient perturbation,​ privacy guarantees, gradient inversion
 +    * URL: https://​arxiv.org/​pdf/​2012.06043.pdf
 +  - **Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates**
 +    * Dong Yin et al., ICML 2020 | Pages: 41 | Difficulty: 5/5
 +    * Abstract: Provides theoretical analysis of Byzantine-robust learning with optimal statistical rates. Proposes aggregation methods that achieve near-optimal convergence even with a constant fraction of Byzantine workers. Establishes fundamental limits of robust distributed learning.
 +    * Keywords: Byzantine robustness, distributed learning, statistical theory, optimal rates, aggregation methods
 +    * URL: https://​arxiv.org/​pdf/​1803.01498.pdf
  
 ==== C6. AI for Cybersecurity Defense: Software Security ==== ==== C6. AI for Cybersecurity Defense: Software Security ====
Line 532: Line 292:
     * Abstract: Comprehensive empirical study evaluating deep learning approaches for vulnerability detection. Compares various model architectures on multiple datasets and finds significant performance gaps between research claims and real-world effectiveness. Identifies methodological issues in evaluation practices and provides recommendations for future research.     * Abstract: Comprehensive empirical study evaluating deep learning approaches for vulnerability detection. Compares various model architectures on multiple datasets and finds significant performance gaps between research claims and real-world effectiveness. Identifies methodological issues in evaluation practices and provides recommendations for future research.
     * Keywords: Vulnerability detection, deep learning, empirical evaluation, code analysis, software security     * Keywords: Vulnerability detection, deep learning, empirical evaluation, code analysis, software security
 +    * URL: https://​arxiv.org/​pdf/​2103.11673.pdf
   - **LineVul: A Transformer-based Line-Level Vulnerability Prediction**   - **LineVul: A Transformer-based Line-Level Vulnerability Prediction**
-    * Michael Fu, Chakkrit Tantithamthavorn, ​ICSE 2022 | Pages: 12 | Difficulty: 3/5+    * Michael Fu, Chakkrit Tantithamthavorn, ​MSR 2022 | Pages: 12 | Difficulty: 3/5
     * Abstract: Proposes LineVul, a transformer-based model that identifies vulnerable code at line-level granularity rather than function-level. Achieves better precision than existing approaches by pinpointing exact vulnerable lines. Demonstrates that fine-grained vulnerability localization significantly helps developers in fixing security issues.     * Abstract: Proposes LineVul, a transformer-based model that identifies vulnerable code at line-level granularity rather than function-level. Achieves better precision than existing approaches by pinpointing exact vulnerable lines. Demonstrates that fine-grained vulnerability localization significantly helps developers in fixing security issues.
     * Keywords: Transformers,​ CodeBERT, vulnerability detection, line-level analysis, code understanding     * Keywords: Transformers,​ CodeBERT, vulnerability detection, line-level analysis, code understanding
-  - **Vulnerability Detection with Code Language Models** +    ​URLhttps://arxiv.org/​pdf/​2205.08956.pdf 
-    * Yangruibo Ding et al., arXiv 2023 | Pages10 | Difficulty2/+  -  ​<fc red>​(Jo)<​/fc> ​**You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion**
-    * Abstract: Investigates how far current code language models have progressed in vulnerability detectionEvaluates models including CodeBERT, GraphCodeBERT,​ and CodeT5 on real-world vulnerability datasetsFinds that while models show promise, significant gaps remain in detecting complex vulnerabilities+
-    * Keywords: Code language models, vulnerability detection, transformers,​ pre-trained models, code understanding +
-  - **DeepDFA: Static Analysis Enhanced Deep Learning for Vulnerability Detection** +
-    * Zhen Li et al., ICSE 2021 | Pages: 12 | Difficulty: 4/+
-    * Abstract: Combines deep learning with traditional static analysis techniques for improved vulnerability detection. Uses data flow analysis to extract program semantics and feeds them into neural networks. Demonstrates that incorporating static analysis features significantly improves detection accuracy over pure learning approaches. +
-    * Keywords: Static analysis, data flow analysis, deep learning, vulnerability detection, hybrid methods +
-  - **You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion**+
     * Roei Schuster et al., USENIX Security 2021 | Pages: 17 | Difficulty: 3/5     * Roei Schuster et al., USENIX Security 2021 | Pages: 17 | Difficulty: 3/5
     * Abstract: Demonstrates that neural code autocompleters can be poisoned to suggest insecure code patterns. Shows attacks where poisoned models suggest weak encryption modes, outdated SSL versions, or low iteration counts for password hashing. Highlights security risks in AI-assisted software development tools.     * Abstract: Demonstrates that neural code autocompleters can be poisoned to suggest insecure code patterns. Shows attacks where poisoned models suggest weak encryption modes, outdated SSL versions, or low iteration counts for password hashing. Highlights security risks in AI-assisted software development tools.
     * Keywords: Code completion, backdoor attacks, software security, neural networks, supply chain attacks     * Keywords: Code completion, backdoor attacks, software security, neural networks, supply chain attacks
-  ​- **An Empirical Study of Pre-trained Models ​for Vulnerability Detection** +    * URL: https://​www.usenix.org/​system/​files/​sec21-schuster.pdf 
-    * Weixiang Yan et al., ICSE 2023 | Pages: ​12 | Difficulty: 3/5 +  -  <fc red>​(Han)</​fc> ​**D2A: A Dataset Built for AI-Based ​Vulnerability Detection ​Methods Using Differential Analysis** 
-    * Abstract: ​Large-scale empirical study comparing ​various pre-trained ​models ​(BERT, GPT, T5) for vulnerability detection ​across multiple datasets. Analyzes factors affecting model performance including pre-training objectives, model size, and fine-tuning strategies. Provides practical guidance for practitioners+    * Yunhui Zheng et al., ICSE 2021 | Pages: ​17 | Difficulty: 3/5 
-    * Keywords: ​Pre-trained models, vulnerability ​detection, ​BERTGPTempirical studytransfer learning+    * Abstract: ​Proposes D2A, a differential analysis approach that automatically labels static analysis issues by comparing ​code versions before and after bug-fixing commits. Generates large dataset of 1.3M+ labeled examples to train AI models for vulnerability detection and false positive reduction in static analysis tools
 +    * Keywords: ​Vulnerability ​detection, ​dataset generationstatic analysisdifferential analysislabeled data 
 +    * URL: https://​arxiv.org/​pdf/​2102.07995.pdf
  
 ==== C7. AI for Cybersecurity Defense: Intrusion Detection ==== ==== C7. AI for Cybersecurity Defense: Intrusion Detection ====
   - **KITSUNE: An Ensemble of Autoencoders for Online Network Intrusion Detection**   - **KITSUNE: An Ensemble of Autoencoders for Online Network Intrusion Detection**
-    * Yisroel Mirsky et al., NDSS 2018 (extended 2020) | Pages: 15 | Difficulty: 2/5+    * Yisroel Mirsky et al., NDSS 2018 | Pages: 15 | Difficulty: 2/5
     * Abstract: Proposes an unsupervised intrusion detection system using ensemble of autoencoders that learns normal network behavior. Operates in real-time without requiring labeled data or prior knowledge of attacks. Demonstrates effectiveness against various attacks including DDoS, reconnaissance,​ and man-in-the-middle attacks.     * Abstract: Proposes an unsupervised intrusion detection system using ensemble of autoencoders that learns normal network behavior. Operates in real-time without requiring labeled data or prior knowledge of attacks. Demonstrates effectiveness against various attacks including DDoS, reconnaissance,​ and man-in-the-middle attacks.
     * Keywords: Autoencoders,​ intrusion detection, unsupervised learning, anomaly detection, network security     * Keywords: Autoencoders,​ intrusion detection, unsupervised learning, anomaly detection, network security
-  - **Deep Learning for Network Intrusion DetectionA Systematic Literature Review** +    ​URLhttps://arxiv.org/​pdf/​1802.09089.pdf
-    * Saba Arif et al., IEEE Access 2021 | Pages25 | Difficulty: 2/+
-    * Abstract: Systematic review of deep learning approaches for network intrusion detection from 2010-2020Categorizes approaches by architecture (CNN, RNN, DBN, autoencoder) and provides comparative analysisIdentifies research gaps and future directions in applying deep learning to network security. +
-    * Keywords: Survey paper, deep learning, intrusion detection, CNN, RNN, network security+
   - **E-GraphSAGE:​ A Graph Neural Network Based Intrusion Detection System**   - **E-GraphSAGE:​ A Graph Neural Network Based Intrusion Detection System**
     * Zhongru Lo et al., arXiv 2022 | Pages: 10 | Difficulty: 3/5     * Zhongru Lo et al., arXiv 2022 | Pages: 10 | Difficulty: 3/5
     * Abstract: Applies graph neural networks to intrusion detection by modeling network traffic as graphs. Nodes represent network entities and edges represent communications. Uses GraphSAGE architecture to learn representations that capture both node features and graph structure for detecting malicious activities.     * Abstract: Applies graph neural networks to intrusion detection by modeling network traffic as graphs. Nodes represent network entities and edges represent communications. Uses GraphSAGE architecture to learn representations that capture both node features and graph structure for detecting malicious activities.
     * Keywords: Graph neural networks, GraphSAGE, intrusion detection, network traffic analysis, deep learning     * Keywords: Graph neural networks, GraphSAGE, intrusion detection, network traffic analysis, deep learning
-  ​- **Adversarial Attacks Against ​Deep Learning ​Based Network Intrusion Detection Systems** +    * URL: https://​arxiv.org/​pdf/​2205.13638.pdf 
-    * Luca Demetrio ​et al., ESORICS 2020 | Pages: ​19 | Difficulty: 3/5 +  ​- **DeepLog: Anomaly Detection and Diagnosis from System Logs through ​Deep Learning** 
-    * Abstract: ​Studies adversarial robustness of deep learning-based IDS systems. Demonstrates ​successful evasion ​attacks ​against various neural network architectures including MLP, CNN, and RNN. Shows that malware can evade detection through carefully crafted perturbations while preserving functionality+    * Min Du et al., CCS 2017 | Pages: ​12 | Difficulty: 3/5 
-    * Keywords: ​Adversarial attacksintrusion ​detection, evasion attacks, deep learning, ​network ​security +    * Abstract: ​Applies LSTM networks to system log anomaly detection by modeling normal execution patterns. Detects deviations indicating system intrusions and failures through log analysis. Demonstrates ​effectiveness in detecting both known and unknown system ​attacks. 
-  - **LSTM-Based ​Intrusion Detection ​for IoT Networks** +    * Keywords: ​LSTMlog analysis, anomaly ​detection, deep learning, ​system ​security 
-    * Ayush Kumar et al., IEEE IoT Journal 2021 | Pages: ​12 | Difficulty: 2/5 +    * URL: https://​acmccs.github.io/​papers/​p1285-duA.pdf 
-    * Abstract: ​Proposes LSTM-based intrusion detection specifically designed ​for IoT networks with resource constraintsAddresses challenges of high-dimensional data and real-time processing requirementsDemonstrates high detection ​rates on IoT-specific attack datasets while maintaining ​computational efficiency. +  - **Deep Learning Algorithms Used in Intrusion Detection ​Systems: A Review** 
-    * Keywords: ​LSTM, IoT security, intrusion detection, ​recurrent neural networksresource-constrained devices +    * Richard Kimanzi ​et al., arXiv 2024 | Pages: ​25 | Difficulty: 2/5 
-  - **Federated ​Learning for Intrusion Detection: ​Challenges and Opportunities** +    * Abstract: ​Comprehensive review of deep learning algorithms ​for IDS including CNN, RNN, DBN, DNN, LSTM, autoencoders,​ and hybrid modelsAnalyzes architectures,​ training methods, ​and classification techniques for network traffic analysisEvaluates strengths and limitations in detection ​accuracy, ​computational efficiency, and scalability to evolving threats
-    * Jin-Hee Cho et al., IEEE Communications Magazine 2022 | Pages: ​| Difficulty: 3/5 +    * Keywords: ​Survey paper, intrusion detection, ​deep learning reviewCNN, LSTM, network security 
-    * Abstract: ​Explores using federated ​learning for collaborative intrusion detection across organizations without sharing raw dataDiscusses ​challenges ​including data heterogeneityByzantine attacks, and communication costsProposes research directions for making federated IDS practical ​and secure+    * URL: https://​arxiv.org/​pdf/​2402.17020.pdf 
-    * Keywords: ​Federated learning, intrusion detection, ​privacy-preserving learningcollaborative ​security+  - **Deep Learning for Intrusion Detection ​in Emerging TechnologiesA Survey** 
 +    * Eduardo C. P. Neto et al., Artificial Intelligence Review 2024 | Pages: ​42 | Difficulty: 3/5 
 +    * Abstract: ​Reviews deep learning ​solutions ​for IDS in emerging technologies including cloud, edge computing, and IoTAddresses ​challenges ​of low performance in real systemshigh false positive rates, and lack of explainabilityDiscusses state-of-the-art solutions ​and limitations for securing modern distributed environments
 +    * Keywords: ​Survey paper, intrusion detection, ​IoT securitycloud security, edge computing, deep learning 
 +    * URL: https://​link.springer.com/​content/​pdf/​10.1007/​s10462-025-11346-z.pdf
  
 ==== C8. AI for Cybersecurity Defense: Malware Classification ==== ==== C8. AI for Cybersecurity Defense: Malware Classification ====
-  - **MalConv: ​Deep Learning for Malware Classification ​from Raw Bytes** +  - **Deep Learning for Malware ​Detection and Classification** 
-    * Edward Raff et al., AAAI 2018 (extended 2020) | Pages: ​14 | Difficulty: ​3/5 +    * Moussaileb Routa et al., ICNC 2021 | Pages: ​| Difficulty: ​2/5 
-    * Abstract: ​Proposes MalConva CNN architecture that classifies ​malware ​directly from raw byte sequences without manual feature engineeringDemonstrates that end-to-end learning from raw bytes can achieve competitive accuracy compared to hand-crafted features. Addresses the feature engineering bottleneck in malware ​analysis+    * Abstract: ​Survey of deep learning methods for malware detection covering static analysisdynamic analysis, and hybrid approaches. Reviews CNNs, RNNs, autoencoders for malware ​classificationDiscusses challenges including adversarial attacks, zero-day malware, and dataset quality
-    * Keywords: ​CNNs, malware ​classificationend-to-end ​learning, ​raw bytesdeep learning +    * Keywords: ​Survey paper, malware ​detectiondeep learning, ​CNNRNN, static analysis, dynamic analysis 
-  - **Adversarial Malware Binaries: Evading Deep Learning for Malware Detection** +    * URL: https://​arxiv.org/​pdf/​2108.10670.pdf 
-    * Bojan Kolosnjaji et al., ESORICS 2018 (extended 2020) | Pages: 18 | Difficulty: 4/5+  - **Adversarial Malware Binaries: Evading Deep Learning for Malware Detection ​in Executables** 
 +    * Bojan Kolosnjaji et al., ESORICS 2018 | Pages: 18 | Difficulty: 4/5
     * Abstract: Demonstrates adversarial attacks against deep learning-based malware detectors. Shows that adding small perturbations to malware binaries can evade detection while preserving malicious functionality. Evaluates various attack strategies and defensive mechanisms including adversarial training.     * Abstract: Demonstrates adversarial attacks against deep learning-based malware detectors. Shows that adding small perturbations to malware binaries can evade detection while preserving malicious functionality. Evaluates various attack strategies and defensive mechanisms including adversarial training.
     * Keywords: Adversarial attacks, malware detection, evasion attacks, binary analysis, deep learning robustness     * Keywords: Adversarial attacks, malware detection, evasion attacks, binary analysis, deep learning robustness
-  ​- **Transformer-Based Language Models for Malware ​Detection** +    * URL: https://​arxiv.org/​pdf/​1803.04173.pdf 
-    * Muhammed Demirkıran ​et al., arXiv 2022 | Pages: 10 | Difficulty: 3/5+  ​- **Transformer-Based Language Models for Malware ​Classification** 
 +    * Muhammed Demirkıran, Sakir Sezer, arXiv 2022 | Pages: 10 | Difficulty: 3/5
     * Abstract: Applies transformer models to malware classification using API call sequences as input. Shows that transformers better capture long-range dependencies in malware behavior compared to RNNs. Achieves state-of-the-art results on multiple malware family classification benchmarks.     * Abstract: Applies transformer models to malware classification using API call sequences as input. Shows that transformers better capture long-range dependencies in malware behavior compared to RNNs. Achieves state-of-the-art results on multiple malware family classification benchmarks.
     * Keywords: Transformers,​ malware detection, API sequences, BERT, sequence modeling     * Keywords: Transformers,​ malware detection, API sequences, BERT, sequence modeling
-  ​- **Deep Learning for Android ​Malware Detection: A Systematic Literature Review** +    * URL: https://​arxiv.org/​pdf/​2207.10829.pdf 
-    * Pinyaphat Tasawong ​et al., IEEE Access 2021 | Pages: ​32 | Difficulty: 2/5 +  ​- **A Survey of Malware Detection ​Using Deep Learning** 
-    * Abstract: ​Comprehensive survey of deep learning approaches for Android ​malware detection. Categorizes methods by input representation (static featuresdynamic behaviorhybrid) ​and model architectureAnalyzes 150+ papers from 2015-2020 and identifies trends ​and research ​gaps+    * Md Sakib Hasan et al., arXiv 2024 | Pages: ​38 | Difficulty: 2/5 
-    * Keywords: Survey paper, Android security, malware detection, deep learning, ​mobile security +    * Abstract: ​Investigates recent advances in malware detection ​on MacOSWindows, iOS, Android, and Linux using deep learningExamines text and image classification approaches, pre-trained ​and multi-task learning models. Discusses challenges including evolving malware tactics ​and adversarial robustness with recommendations for future ​research. 
-  - **Explainable ​Malware Detection ​with Attention-Based Neural Networks** +    * Keywords: Survey paper, malware detection, deep learning, ​multi-platform,​ transfer learning 
-    * Zhaoqi Zhang et al., IEEE TIFS 2020 | Pages: ​14 | Difficulty: 3/5 +    * URL: https://​arxiv.org/​pdf/​2407.19153.pdf 
-    * Abstract: ​Proposes attention-based neural networks ​for malware detection ​that provide explanations ​for classification decisions. Attention mechanism highlights which code segments or API calls contribute most to malware classification. Improves trust and debuggability of deep learning malware ​detectors+  - **Automated Machine Learning for Deep Learning based Malware Detection** 
-    * Keywords: ​Attention mechanisms, explainability, malware detection, interpretable ML, neural ​networks+    * Austin Brown et al., arXiv 2023 | Pages: ​15 | Difficulty: 3/5 
 +    * Abstract: ​Provides comprehensive analysis of using AutoML ​for static and online ​malware detection. Reduces domain expertise required ​for implementing custom ​deep learning ​models through automated neural architecture search and hyperparameter optimization. Demonstrates effectiveness on real-world ​malware ​datasets with reduced computational overhead
 +    * Keywords: ​AutoML, malware detection, neural ​architecture search, deep learning, automated ML 
 +    * URL: https://​arxiv.org/​pdf/​2303.01679.pdf
  
 ==== C9. AI for Cybersecurity Defense: Blockchain Security ==== ==== C9. AI for Cybersecurity Defense: Blockchain Security ====
-  ​- **SmartGuard:​ An LLM-Enhanced Framework for Smart Contract Vulnerability Detection** +  - **Deep Learning for Blockchain Security: ​A Survey**
-    * Yangruibo Ding et al., arXiv 2023 | Pages: 12 | Difficulty: 3/5 +
-    * Abstract: Proposes SmartGuard, a framework that combines LLMs with program analysis techniques to detect smart contract vulnerabilities. Uses chain-of-thought prompting and static analysis results to guide LLMs in identifying security issues. Achieves higher precision than traditional static analyzers on Solidity contracts. +
-    * Keywords: LLMs, smart contracts, vulnerability detection, blockchain security, Solidity, chain-of-thought +
-  ​- **Deep Learning for Blockchain Security: ​Opportunities and Challenges**+
     * Shijie Zhang et al., IEEE Network 2021 | Pages: 8 | Difficulty: 2/5     * Shijie Zhang et al., IEEE Network 2021 | Pages: 8 | Difficulty: 2/5
     * Abstract: Survey paper discussing applications of deep learning to blockchain security including smart contract analysis, anomaly detection, and fraud detection. Identifies challenges such as limited labeled data and adversarial attacks. Proposes research directions for improving blockchain security with AI.     * Abstract: Survey paper discussing applications of deep learning to blockchain security including smart contract analysis, anomaly detection, and fraud detection. Identifies challenges such as limited labeled data and adversarial attacks. Proposes research directions for improving blockchain security with AI.
     * Keywords: Survey paper, blockchain security, deep learning, smart contracts, anomaly detection     * Keywords: Survey paper, blockchain security, deep learning, smart contracts, anomaly detection
-  - **Graph Neural Networks for Smart Contract Vulnerability Detection** +    ​URLhttps://arxiv.org/​pdf/​2107.08265.pdf
-    * Zhuang Yuan et al., ICSE 2020 | Pages11 | Difficulty4/+
-    * Abstract: Uses graph neural networks to detect vulnerabilities in smart contracts by representing contracts as control flow and data flow graphsLearns vulnerability patterns from graph structuresDemonstrates superior performance compared to sequence-based and tree-based models. +
-    * Keywords: Graph neural networks, smart contracts, vulnerability detection, program graphs, GNN+
   - **Detecting Ponzi Schemes on Ethereum: Towards Healthier Blockchain Technology**   - **Detecting Ponzi Schemes on Ethereum: Towards Healthier Blockchain Technology**
     * Weili Chen et al., WWW 2020 | Pages: 10 | Difficulty: 3/5     * Weili Chen et al., WWW 2020 | Pages: 10 | Difficulty: 3/5
     * Abstract: Proposes deep learning methods to detect Ponzi schemes deployed as smart contracts on Ethereum. Extracts features from account behaviors and contract code. Achieves over 90% detection accuracy and discovers hundreds of unreported Ponzi schemes on the Ethereum blockchain.     * Abstract: Proposes deep learning methods to detect Ponzi schemes deployed as smart contracts on Ethereum. Extracts features from account behaviors and contract code. Achieves over 90% detection accuracy and discovers hundreds of unreported Ponzi schemes on the Ethereum blockchain.
     * Keywords: Ponzi schemes, Ethereum, fraud detection, smart contracts, deep learning     * Keywords: Ponzi schemes, Ethereum, fraud detection, smart contracts, deep learning
 +    * URL: https://​arxiv.org/​pdf/​1803.03916.pdf
 +  - **Smart Contract Vulnerability Detection Based on Deep Learning and Multimodal Decision Fusion**
 +    * Weidong Deng et al., Sensors 2023 | Pages: 18 | Difficulty: 4/5
 +    * Abstract: Proposes multimodal deep learning framework combining control flow graphs and opcode sequences for smart contract vulnerability detection. Uses CNN and LSTM models with decision fusion mechanism. Achieves superior performance in detecting reentrancy, timestamp dependence, and other common vulnerabilities compared to single-modality approaches.
 +    * Keywords: Smart contracts, vulnerability detection, deep learning, multimodal fusion, Ethereum
 +    * URL: https://​www.mdpi.com/​1424-8220/​23/​17/​7319/​pdf
 +  - **Deep Learning-based Solution for Smart Contract Vulnerabilities Detection**
 +    * Wentao Li et al., Scientific Reports 2023 | Pages: 14 | Difficulty: 3/5
 +    * Abstract: Introduces Lightning Cat deep learning framework for detecting smart contract vulnerabilities without predefined rules. Uses LSTM and attention mechanisms to learn vulnerability features during training. Demonstrates effectiveness on real-world Ethereum contracts achieving high detection rates for multiple vulnerability types.
 +    * Keywords: Smart contracts, deep learning, LSTM, vulnerability detection, Ethereum security
 +    * URL: https://​www.nature.com/​articles/​s41598-023-47219-0.pdf
 +  - **Vulnerability Detection in Smart Contracts: A Comprehensive Survey**
 +    * Anonymous et al., arXiv 2024 | Pages: 35 | Difficulty: 2/5
 +    * Abstract: Comprehensive systematic review exploring intersection of machine learning and smart contract security. Reviews 100+ papers from 2020-2024 on ML techniques for vulnerability detection and mitigation. Analyzes GNN, SVM, Random Forest, and deep learning approaches with their effectiveness and limitations.
 +    * Keywords: Survey paper, smart contracts, machine learning, vulnerability detection, blockchain security
 +    * URL: https://​arxiv.org/​pdf/​2407.07922.pdf
  
 ==== C10. AI for Cybersecurity Defense: Phishing Detection ==== ==== C10. AI for Cybersecurity Defense: Phishing Detection ====
-  ​- **An Improved Transformer-Based Model for Detecting Phishing Emails** +  - **Deep Learning Approaches for Phishing Detection: A Systematic Literature Review** 
-    * Ahmad Jamal et al., IEEE Access 2023 | Pages: 12 | Difficulty: 2/5 +    * Gunikhan Sonowal, K. SKuppusamySN COMPUT SCI 2020 | Pages: ​18 | Difficulty: 2/5 
-    * Abstract: Proposes IPSDM, a fine-tuned BERT-based model for detecting phishing and spam emails. Addresses the challenge of increasingly sophisticated phishing attacks that evade traditional rule-based filters. Achieves high accuracy on real-world email datasets and provides interpretable attention-based explanations. +    * Abstract: Systematic review of deep learning methods for phishing detection covering 2015-2020. Categorizes approaches by input features (URL, HTML, visual) and model architecture. Compares performance metrics and identifies research trends and gaps in phishing detection.
-    * Keywords: BERT, phishing detection, email security, transformers,​ fine-tuning,​ NLP +
-  - **ChatSpamDetector:​ Leveraging LLMs for Phishing Email Detection** +
-    * Takashi Koide et al., arXiv 2023 | Pages: 10 | Difficulty: 2/5 +
-    * Abstract: Introduces a novel phishing detection system using large language models with chain-of-thought reasoning. LLM analyzes email content, headers, and links to determine phishing likelihood and provides detailed reasoning. Achieves over 95% accuracy while offering human-readable explanations. +
-    * Keywords: LLMs, phishing detection, chain-of-thought,​ email security, zero-shot learning +
-  ​- **Deep Learning Approaches for Phishing ​Website ​Detection: A Systematic Literature Review** +
-    * Rongqin Liang et al., Computers & Security 2022 | Pages: ​28 | Difficulty: 2/5 +
-    * Abstract: Systematic review of deep learning methods for phishing ​website ​detection covering 2015-2021. Categorizes approaches by input features (URL, HTML, visual) and model architecture. Compares performance metrics and identifies research trends and gaps in phishing detection.+
     * Keywords: Survey paper, phishing detection, deep learning, website security, URL analysis     * Keywords: Survey paper, phishing detection, deep learning, website security, URL analysis
-  ​- **Devising and Detecting ​Phishing ​Emails ​Using LLMs** +    * URL: https://​arxiv.org/​pdf/​2007.15232.pdf 
-    * Florian Heiding et al.arXiv 2023 | Pages: ​14 | Difficulty: 3/5 +  ​- **Phishing ​Email Detection Model Using Deep Learning** 
-    * Abstract: ​Studies both offensive ​and defensive capabilities of LLMs for phishing. ​Shows GPT-4 can generate highly convincing phishing emails that evade traditional filtersAlso evaluates LLMs' ability to detect ​phishing, ​finding they outperform rule-based ​systems but require careful prompt engineering+    * Adel BinbusayyisThavavel Vaiyapuri, Electronics ​2023 | Pages: ​19 | Difficulty: 3/5 
-    * Keywords: ​LLMsphishing generationphishing detectionGPT-4, email security, ​adversarial use+    * Abstract: ​Explores deep learning techniques including CNN, LSTM, RNN, and BERT for email phishing ​detectionCompares performance across multiple architectures and proposes hybrid model combining CNNs with recurrent layers. Achieves 98% accuracy on real-world email datasets with analysis of model interpretability and deployment considerations. 
 +    * Keywords: Email phishing, ​deep learning, BERT, CNN-LSTM, natural language processing 
 +    * URL: https://​www.mdpi.com/​2079-9292/​12/​20/​4261/​pdf 
 +  - <fc red>​(kwak)</​fc>​**A Deep Learning-Based Innovative Technique for Phishing Detection with URLs** 
 +    * Saleh N. Almuayqil et al., Sensors 2023 | Pages: 20 | Difficulty: 2/5 
 +    * Abstract: Proposes CNN-based ​model for phishing website detection using character embedding approach on URLs. Evaluates performance on PhishTank dataset achieving high accuracy in distinguishing legitimate from phishing websites. Introduces novel 1D CNN architecture specifically designed for URL-based detection without requiring HTML content analysis
 +    * Keywords: ​Phishing detectionCNNcharacter embeddingURL analysis, PhishTank dataset 
 +    * URL: https://​www.mdpi.com/​1424-8220/​23/​9/​4403/​pdf 
 +  - **An Improved Transformer-based Model for Detecting Phishing, Spam and Ham Emails** 
 +    * Shahzad Jamal, Himanshu Wimmer, arXiv 2023 | Pages: 12 | Difficulty: 3/5 
 +    * Abstract: Proposes IPSDM fine-tuned model based on BERT family addressing sophisticated phishing and spam attacks. Uses DistilBERT and RoBERTa for efficient email classification achieving superior performance over traditional methods. Demonstrates effectiveness of transformer models in understanding email context and identifying subtle phishing indicators. 
 +    * Keywords: Transformer models, BERT, email security, ​phishing detection, spam filtering 
 +    * URL: https://​arxiv.org/​pdf/​2311.04913.pdf
  
 ==== C11. Cyber Threat Intelligence ==== ==== C11. Cyber Threat Intelligence ====
-  - **Transformer-Based Named Entity Recognition for Cyber Threat Intelligence** 
-    * Panos Evangelatos et al., IEEE ICNC 2021 | Pages: 6 | Difficulty: 3/5 
-    * Abstract: Applies transformer models like BERT to extract security entities (malware names, vulnerabilities,​ attack techniques) from cyber threat intelligence reports. Outperforms traditional NER approaches on security-specific entity types. Enables automated extraction of actionable intelligence from unstructured text. 
-    * Keywords: NER, transformers,​ BERT, threat intelligence,​ information extraction, NLP 
-  - **Automated Generation of Fake Cyber Threat Intelligence Using Transformers** 
-    * Priyanka Ranade et al., arXiv 2021 | Pages: 8 | Difficulty: 3/5 
-    * Abstract: Demonstrates that transformer models can automatically generate realistic but fake cyber threat intelligence reports to mislead defenders. Raises concerns about adversarial use of language models in cyber operations. Shows that generated reports can fool both human analysts and automated systems. 
-    * Keywords: GPT-2, text generation, deception, threat intelligence,​ adversarial AI, transformers 
-  - **LLM4Sec: Fine-Tuned LLMs for Cybersecurity Log Analysis** 
-    * Even Karlsen et al., arXiv 2023 | Pages: 10 | Difficulty: 3/5 
-    * Abstract: Proposes LLM4Sec framework for fine-tuning language models on cybersecurity logs. Benchmarks multiple model variants on log classification and anomaly detection tasks. Shows DistilRoBERTa achieves F1-score of 0.998 across diverse security log datasets while being computationally efficient. 
-    * Keywords: LLMs, log analysis, RoBERTa, fine-tuning,​ anomaly detection, security logs 
   - **Deep Learning for Threat Intelligence:​ A Survey**   - **Deep Learning for Threat Intelligence:​ A Survey**
     * Xiaojun Xu et al., arXiv 2022 | Pages: 25 | Difficulty: 2/5     * Xiaojun Xu et al., arXiv 2022 | Pages: 25 | Difficulty: 2/5
     * Abstract: Comprehensive survey of deep learning applications in cyber threat intelligence including threat detection, attribution,​ and prediction. Reviews architectures (CNNs, RNNs, transformers,​ GNNs) and their applications. Discusses challenges including adversarial attacks and data scarcity.     * Abstract: Comprehensive survey of deep learning applications in cyber threat intelligence including threat detection, attribution,​ and prediction. Reviews architectures (CNNs, RNNs, transformers,​ GNNs) and their applications. Discusses challenges including adversarial attacks and data scarcity.
     * Keywords: Survey paper, threat intelligence,​ deep learning, threat detection, NLP     * Keywords: Survey paper, threat intelligence,​ deep learning, threat detection, NLP
 +    * URL: https://​arxiv.org/​pdf/​2212.10002.pdf
  
 ==== C12. AI Model Security & Supply Chain ==== ==== C12. AI Model Security & Supply Chain ====
   - **Weight Poisoning Attacks on Pre-trained Models**   - **Weight Poisoning Attacks on Pre-trained Models**
     * Keita Kurita et al., ACL 2020 | Pages: 11 | Difficulty: 3/5     * Keita Kurita et al., ACL 2020 | Pages: 11 | Difficulty: 3/5
-    * Abstract: Demonstrates that pre-trained language models in public repositories can be poisoned with backdoors that persist through fine-tuning. Attackers poison model weights such that backdoors activate on downstream tasks after users fine-tune the model. Highlights supply chain risks in the model-sharing ecosystem.+    * Abstract: Demonstrates that pre-trained language models in public repositories can be poisoned with backdoors that persist through fine-tuning. Attackers poison model weights such that backdoors activate on downstream tasks after users fine-tuned the model. Highlights supply chain risks in the model-sharing ecosystem.
     * Keywords: Weight poisoning, pre-trained models, backdoor attacks, supply chain security, BERT, transfer learning     * Keywords: Weight poisoning, pre-trained models, backdoor attacks, supply chain security, BERT, transfer learning
 +    * URL: https://​arxiv.org/​pdf/​2004.06660.pdf
   - **Backdoor Attacks on Self-Supervised Learning**   - **Backdoor Attacks on Self-Supervised Learning**
     * Aniruddha Saha et al., CVPR 2022 | Pages: 10 | Difficulty: 3/5     * Aniruddha Saha et al., CVPR 2022 | Pages: 10 | Difficulty: 3/5
     * Abstract: Shows that backdoors injected during self-supervised pre-training transfer to downstream supervised tasks. Even when fine-tuning on clean data, backdoored features persist and can be activated with appropriate triggers. Demonstrates attacks on contrastive learning methods like SimCLR and MoCo.     * Abstract: Shows that backdoors injected during self-supervised pre-training transfer to downstream supervised tasks. Even when fine-tuning on clean data, backdoored features persist and can be activated with appropriate triggers. Demonstrates attacks on contrastive learning methods like SimCLR and MoCo.
     * Keywords: Self-supervised learning, backdoor attacks, contrastive learning, transfer learning, SimCLR     * Keywords: Self-supervised learning, backdoor attacks, contrastive learning, transfer learning, SimCLR
 +    * URL: https://​arxiv.org/​pdf/​2204.10850.pdf
   - **Model Stealing Attacks Against Inductive Graph Neural Networks**   - **Model Stealing Attacks Against Inductive Graph Neural Networks**
     * Asim Waheed Duddu et al., IEEE S&P 2022 | Pages: 16 | Difficulty: 4/5     * Asim Waheed Duddu et al., IEEE S&P 2022 | Pages: 16 | Difficulty: 4/5
     * Abstract: Demonstrates model extraction attacks specifically targeting graph neural networks. Shows that GNNs are particularly vulnerable to stealing because attackers can query with carefully crafted graphs. Extracts high-fidelity copies of target models with fewer queries than required for traditional neural networks.     * Abstract: Demonstrates model extraction attacks specifically targeting graph neural networks. Shows that GNNs are particularly vulnerable to stealing because attackers can query with carefully crafted graphs. Extracts high-fidelity copies of target models with fewer queries than required for traditional neural networks.
     * Keywords: Model stealing, graph neural networks, model extraction, API attacks, intellectual property     * Keywords: Model stealing, graph neural networks, model extraction, API attacks, intellectual property
 +    * URL: https://​arxiv.org/​pdf/​2112.08331.pdf
   - **Proof-of-Learning:​ Definitions and Practice**   - **Proof-of-Learning:​ Definitions and Practice**
     * Hengrui Jia et al., IEEE S&P 2021 | Pages: 17 | Difficulty: 4/5     * Hengrui Jia et al., IEEE S&P 2021 | Pages: 17 | Difficulty: 4/5
     * Abstract: Introduces proof-of-learning,​ a cryptographic protocol that allows model trainers to prove they performed the training computation honestly. Enables verification that a model was trained as claimed without revealing training data. Addresses concerns about stolen models and fraudulent training claims.     * Abstract: Introduces proof-of-learning,​ a cryptographic protocol that allows model trainers to prove they performed the training computation honestly. Enables verification that a model was trained as claimed without revealing training data. Addresses concerns about stolen models and fraudulent training claims.
     * Keywords: Proof-of-learning,​ cryptographic protocols, model verification,​ training provenance, zero-knowledge proofs     * Keywords: Proof-of-learning,​ cryptographic protocols, model verification,​ training provenance, zero-knowledge proofs
-  - **Protecting Intellectual Property of Deep Neural Networks with Watermarking** +    ​URLhttps://arxiv.org/​pdf/​2103.05633.pdf
-    * Yusuke Uchida et al., AsiaCCS 2017 (extended 2020) | Pages13 | Difficulty3/+
-    * Abstract: Proposes watermarking techniques to protect ownership of neural networksEmbeds watermarks that survive fine-tuning and model extraction attemptsWatermarks can be verified to prove ownership without degrading model performanceAddresses intellectual property protection in model sharing. +
-    * Keywords: Model watermarking,​ intellectual property, ownership verification,​ backdoor-based watermarks, model protection+
  
 ==== C13. Robustness & Certified Defenses ==== ==== C13. Robustness & Certified Defenses ====
   - **Certified Adversarial Robustness via Randomized Smoothing**   - **Certified Adversarial Robustness via Randomized Smoothing**
-    * Jeremy Cohen et al., ICML 2019 (extended 2020) | Pages: 17 | Difficulty: 4/5+    * Jeremy Cohen et al., ICML 2019 | Pages: 17 | Difficulty: 4/5
     * Abstract: Provides provable robustness certificates using randomized smoothing by adding Gaussian noise. Transforms any classifier into a certifiably robust version with theoretical guarantees. Achieves state-of-the-art certified accuracy on ImageNet and demonstrates scalability to large models and datasets.     * Abstract: Provides provable robustness certificates using randomized smoothing by adding Gaussian noise. Transforms any classifier into a certifiably robust version with theoretical guarantees. Achieves state-of-the-art certified accuracy on ImageNet and demonstrates scalability to large models and datasets.
     * Keywords: Certified defenses, randomized smoothing, Gaussian noise, provable robustness, theoretical guarantees     * Keywords: Certified defenses, randomized smoothing, Gaussian noise, provable robustness, theoretical guarantees
 +    * URL: https://​arxiv.org/​pdf/​1902.02918.pdf
   - **Provable Defenses via the Convex Outer Adversarial Polytope**   - **Provable Defenses via the Convex Outer Adversarial Polytope**
-    * Eric Wong, Zico Kolter, ICML 2018 (extended 2020) | Pages: 11 | Difficulty: 5/5+    * Eric Wong, Zico Kolter, ICML 2018 | Pages: 11 | Difficulty: 5/5
     * Abstract: Uses convex optimization to train neural networks with provable robustness guarantees. Computes exact worst-case adversarial loss during training through linear relaxation. Limited to small networks due to computational complexity but provides strongest possible guarantees.     * Abstract: Uses convex optimization to train neural networks with provable robustness guarantees. Computes exact worst-case adversarial loss during training through linear relaxation. Limited to small networks due to computational complexity but provides strongest possible guarantees.
     * Keywords: Certified defenses, convex optimization,​ provable robustness, linear relaxation, formal verification     * Keywords: Certified defenses, convex optimization,​ provable robustness, linear relaxation, formal verification
 +    * URL: https://​arxiv.org/​pdf/​1711.00851.pdf
   - **Benchmarking Neural Network Robustness to Common Corruptions and Perturbations**   - **Benchmarking Neural Network Robustness to Common Corruptions and Perturbations**
-    * Dan Hendrycks, Thomas Dietterich, ICLR 2019 (extended 2020) | Pages: 17 | Difficulty: 2/5+    * Dan Hendrycks, Thomas Dietterich, ICLR 2019 | Pages: 17 | Difficulty: 2/5
     * Abstract: Introduces ImageNet-C benchmark for evaluating robustness to natural image corruptions like noise, blur, and weather effects. Shows that adversarially trained models often fail on common corruptions despite improved adversarial robustness. Demonstrates importance of testing robustness beyond adversarial perturbations.     * Abstract: Introduces ImageNet-C benchmark for evaluating robustness to natural image corruptions like noise, blur, and weather effects. Shows that adversarially trained models often fail on common corruptions despite improved adversarial robustness. Demonstrates importance of testing robustness beyond adversarial perturbations.
     * Keywords: Robustness benchmarks, natural corruptions,​ distribution shift, model evaluation, ImageNet-C     * Keywords: Robustness benchmarks, natural corruptions,​ distribution shift, model evaluation, ImageNet-C
-  - **Smoothed Analysis of Neural Networks** +    ​URLhttps://arxiv.org/​pdf/​1903.12261.pdf
-    * Huan Zhang et al., NeurIPS 2020 | Pages12 | Difficulty5/+
-    * Abstract: Provides smoothed analysis framework for studying neural network behavior under input perturbationsDerives tighter robustness certificates by analyzing networks with random smoothingBridges gap between worst-case analysis and average-case performance. +
-    * Keywords: Smoothed analysis, robustness certification,​ theoretical analysis, neural networks, perturbation analysis+
  
 ==== C14. Interpretability & Verification for Security ==== ==== C14. Interpretability & Verification for Security ====
-  - **Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks** 
-    * Guy Katz et al., CAV 2017 (extended 2020) | Pages: 20 | Difficulty: 5/5 
-    * Abstract: Introduces formal verification of neural networks using SMT solving. Can prove or disprove safety properties about ReLU network behavior. Foundational work enabling formal guarantees about neural network decisions. Demonstrated on aircraft collision avoidance system verification. 
-    * Keywords: Formal verification,​ SMT solvers, ReLU networks, safety properties, symbolic reasoning 
   - **DeepXplore:​ Automated Whitebox Testing of Deep Learning Systems**   - **DeepXplore:​ Automated Whitebox Testing of Deep Learning Systems**
-    * Kexin Pei et al., SOSP 2017 (extended 2020) | Pages: 18 | Difficulty: 3/5+    * Kexin Pei et al., SOSP 2017 | Pages: 18 | Difficulty: 3/5
     * Abstract: Introduces neuron coverage as a metric for testing deep learning systems. Automatically generates test inputs that maximize differential behavior across multiple models. Discovers thousands of erronous behaviors in production DL systems including self-driving cars.     * Abstract: Introduces neuron coverage as a metric for testing deep learning systems. Automatically generates test inputs that maximize differential behavior across multiple models. Discovers thousands of erronous behaviors in production DL systems including self-driving cars.
     * Keywords: DNN testing, neuron coverage, differential testing, automated test generation, model testing     * Keywords: DNN testing, neuron coverage, differential testing, automated test generation, model testing
 +    * URL: https://​arxiv.org/​pdf/​1705.06640.pdf
   - **Attention is Not Always Explanation:​ Quantifying Attention Flow in Transformers**   - **Attention is Not Always Explanation:​ Quantifying Attention Flow in Transformers**
     * Samira Abnar, Willem Zuidema, EMNLP 2020 | Pages: 11 | Difficulty: 3/5     * Samira Abnar, Willem Zuidema, EMNLP 2020 | Pages: 11 | Difficulty: 3/5
     * Abstract: Analyzes whether attention weights in transformers provide faithful explanations of model behavior. Introduces attention flow to track information through layers. Shows attention weights can be manipulated without changing predictions,​ questioning their reliability as explanations in security-critical applications.     * Abstract: Analyzes whether attention weights in transformers provide faithful explanations of model behavior. Introduces attention flow to track information through layers. Shows attention weights can be manipulated without changing predictions,​ questioning their reliability as explanations in security-critical applications.
     * Keywords: Attention mechanisms, interpretability,​ transformers,​ explanation faithfulness,​ NLP analysis     * Keywords: Attention mechanisms, interpretability,​ transformers,​ explanation faithfulness,​ NLP analysis
-  - **Quantifying Uncertainty in Neural Networks for Security Applications** +    ​URLhttps://arxiv.org/​pdf/​2005.13005.pdf
-    * Lewis Smith, Yarin Gal, arXiv 2020 | Pages10 | Difficulty3/+
-    * Abstract: Uses Bayesian neural networks to quantify prediction uncertainty in security tasksShows uncertainty estimates can detect adversarial examples and out-of-distribution inputsProposes using uncertainty quantification as an additional defense layer in security-critical systems. +
-    * Keywords: Uncertainty quantification,​ Bayesian neural networks, Monte Carlo dropout, adversarial detection, OOD detection+
  
 ==== C15. AI for Offensive Security ==== ==== C15. AI for Offensive Security ====
-  - **Generating Adversarial Examples with Generative Models** +  - **Generating Adversarial Examples with Adversarial Networks** 
-    * Chaowei Xiao et al., NDSS 2019 (extended 2020) | Pages: ​15 | Difficulty: 4/5 +    * Chaowei Xiao et al., IJCAI 2018 | Pages: ​| Difficulty: 4/5 
-    * Abstract: Uses generative ​models ​(GANs, VAEs) to create adversarial examples that lie on the natural data manifold. These attacks are more realistic and harder to detect than perturbation-based attacks. Demonstrates successful attacks against defended models that detect out-of-distribution adversarial examples. +    * Abstract: Uses generative ​adversarial networks ​(GANs) to create adversarial examples that lie on the natural data manifold. These attacks are more realistic and harder to detect than perturbation-based attacks. Demonstrates successful attacks against defended models that detect out-of-distribution adversarial examples. 
-    * Keywords: GANs, VAEs, adversarial examples, generative models, natural adversarial examples +    * Keywords: GANs, adversarial examples, generative models, natural adversarial examples, attack generation 
-  - **Automating Network Exploitation with Reinforcement Learning** +    * URLhttps://arxiv.org/​pdf/​1801.02610.pdf 
-    * William Glodek et al., arXiv 2020 | Pages10 | Difficulty3/+  - **Generating Natural Language Adversarial Examples on a Large Scale with Generative Models** 
-    * Abstract: Applies deep reinforcement learning to automated network penetration testingAgents learn to exploit vulnerabilities through trial and error in simulated environmentsDemonstrates potential for AI-driven offensive security testing but also highlights limitations in complex real-world scenarios. +    * Yankun Ren et al., EMNLP-IJCNLP 2019 | Pages: ​| Difficulty: 3/5 
-    * Keywords: Reinforcement learning, penetration testing, network exploitation,​ automated security testing, DRL +    * Abstract: Uses generative models ​to create ​adversarial ​text examples at scale. Generates semantically similar ​text that fools NLP classifiers. Demonstrates vulnerabilities in sentiment analysis, textual entailment, and question answering ​systems. 
-  - **Generating Natural Language Adversarial Examples on a Large Scale** +    * Keywords: Adversarial NLP, generative models, text perturbations,​ semantic similarity, NLP attacks 
-    * Moustafa Alzantot ​et al., EMNLP 2018 (extended 2020) | Pages: ​12 | Difficulty: 3/5 +    * URLhttps://arxiv.org/​pdf/​1909.01631.pdf
-    * Abstract: Uses genetic algorithms ​to generate ​adversarial text that fools NLP classifiers ​while maintaining semantic similarity. Demonstrates vulnerabilities in sentiment analysis, textual entailment, and reading comprehension ​systems. Shows that text classifiers are brittle to synonymous substitutions+
-    * Keywords: Adversarial NLP, genetic algorithms, text perturbations,​ semantic similarity, NLP attacks +
-  - **LLM-Fuzzer:​ Fuzzing Large Language Models with Adaptive Prompts** +
-    * Jiahao Yu et al., arXiv 2023 | Pages16 | Difficulty2/+
-    * Abstract: Proposes automated fuzzing framework for discovering LLM vulnerabilities using mutation-based approachAdaptively generates test prompts that trigger jailbreaks, hallucinations,​ and other failure modesDemonstrates effectiveness in finding alignment failures across multiple LLM models. +
-    * Keywords: Fuzzing, LLMs, automated testing, jailbreaks, vulnerability discovery, prompt generation+
 
class/gradsec2026.1773593716.txt.gz · Last modified: 2026/03/15 23:55 by mhshin · [Old revisions]
Recent changes RSS feed Powered by PHP Valid XHTML 1.0 Valid CSS Driven by DokuWiki