This is an old revision of the document!

Overview

Title: Special Topics on AI Security
Provided by: Dept. of Computer Engineering, Myongji University
Lead by: Minho Shin (mhshin@mju.ac.kr, Rm5736)
Period: Spring semester, 2026
Location: 5701 at 5th Engineering Building
Time: Wed, 10am to 1pm
Type: Graduate Seminar
Goal of the class
- This class aims to familiarize students with current research topics in AI Security & Privacy area
- This class also aims to train students with their communication skills including oral presentation, discussion, writing, and collaboration
Resources for researchers from Publishing Campus of Elsevier
- Learning Center
- Setting sail on your research journey

Participants

#	Name	Dept	Advisor	Email Address
1	Hyeonjun Jo	CE	Undergraduate	mnbvjojun@gmail.com
2	Nayung Kwak	CE	Undergraduate	kny12202423@gmail.com
3	Kyungchan Kim	CS	Minho Shin	kkc8983@gmail.com

Agenda

TBD

* order: Cho --> Han --> Kwak
* # of presentations per week: 2, 2, 2, ...
* # of presentations per person:

Date	Name	Topic	Slides	Minutes
3/4	Minho	Ice-breaking	AI-Cybersecurity	Survey paper
3/11	Minho
3/11	Cho	https://www.usenix.org/system/files/sec21-schuster.pdf
3/18	Minho
3/18	Han
3/25	Minho
3/25	Kwak
4/1	Cho
4/8	No Class
4/15	Han
4/22	No Class
4/29	Kwak
5/6	Cho
5/13	Han
5/20	Kwak
5/27	Cho
6/3	Han
6/10	Kwak
6/17	Cho
6/24	Han

Class Information

Rules for the class
- We have 15 presentations in total by three students
- Each present 5 presentations throughout the semester
- One presentation per day
- The presenter announces the paper to present at least one week ahead
- The presenter prepares a powerpoint slides for 30-60min talk
- The other students submit a review article (1-2 pages) before class
- The presentation should contain:
  - (Motivation) What are the motivations for this particular problem? What is the backgrounds for understanding the problem? Why is this important?
  - (Problem) What is, on earth, the exact problem the authors aim to address, and why on earth, is the problem important?
  - (Related work) What has been done by other researchers to address the same or similar problem on the table? Why the existing work is not enough to call done?
  - (Method) What is their main methodology to address the problem? How did they actually solve the problem in detail?
  - (Evaluation) What are the evidences for their success found in the paper? What is missing in their evaluation?
  - (Contribution) What is the contribution of the paper and what is not their contributions? Are there any limitations in their result? How would you evaluate the value of the paper?
  - (Future work) What is the remaining problems that were only partially addressed or never covered by the paper? What will be a possible approach to the problem?
- A review article contains
  - The same content as described for the presenter
  - But in a succinctly written words form
  - Not exceeding two pages
  - Submit in Word/PDF by email
- Evaluation
  - As a Presenter (10 points each)
    - Slide Quality
    - Talk Quality
    - Knowledge Level
  - As a reviewer (5 points each)
    - Clarity of the review
    - Understanding level

Reading List for LLM-based Cybersecurity

C1. Adversarial Machine Learning

Adversarial Examples Are Not Bugs, They Are Features
- Andrew Ilyas et al., NeurIPS 2019 | Pages: 25 | Difficulty: 3/5
- Abstract: This influential paper argues that adversarial vulnerability arises from models relying on highly predictive but non-robust features in the data. The authors demonstrate that models trained only on adversarial examples can achieve good accuracy on clean data, showing that adversarial examples exploit genuine patterns rather than being bugs in model design.
- Keywords: Deep learning, adversarial examples, robust features, neural networks, gradient-based attacks, image classification
Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks
- Francesco Croce, Matthias Hein, ICML 2020 | Pages: 32 | Difficulty: 3/5
- Abstract: Introduces AutoAttack, an ensemble of parameter-free attacks for robust evaluation of adversarial defenses. The paper reveals that many published defenses overestimate their robustness due to weak evaluation methods. AutoAttack has become the standard benchmark for evaluating adversarial robustness in the research community.
- Keywords: Adversarial attacks, robustness evaluation, ensemble methods, PGD, gradient-based optimization, AutoAttack
On Adaptive Attacks to Adversarial Example Defenses
- Florian Tramer et al., NeurIPS 2020 | Pages: 13 | Difficulty: 4/5
- Abstract: Provides comprehensive guidelines for properly evaluating adversarial defenses against adaptive attacks. Shows that many defenses fail when attackers adapt their strategies. Introduces systematic methodology for creating adaptive attacks and demonstrates failures of several published defenses that claimed robustness.
- Keywords: Adversarial defenses, adaptive attacks, security evaluation, gradient obfuscation, defense mechanisms
Improving Adversarial Robustness via Guided Complement Entropy
- Hao-Yun Chen et al., ICCV 2021 | Pages: 10 | Difficulty: 3/5
- Abstract: Proposes a new adversarial training method using guided complement entropy that improves both standard accuracy and adversarial robustness. Addresses the trade-off between clean accuracy and robust accuracy by optimizing a novel objective function that considers prediction confidence on correct and incorrect classes.
- Keywords: Adversarial training, entropy optimization, deep learning, robustness-accuracy tradeoff, neural networks
Perceptual Adversarial Robustness: Defense Against Unseen Threat Models
- Cassidy Laidlaw, Sahil Singla, Soheil Feizi, ICLR 2021 | Pages: 23 | Difficulty: 4/5
- Abstract: Introduces perceptual adversarial training (PAT) that defends against a diverse set of adversarial attacks by optimizing against perceptually-aligned perturbations. Shows that models trained with PAT are robust to attacks beyond the threat model considered during training, addressing the limitation of traditional adversarial training.
- Keywords: Adversarial robustness, perceptual metrics, threat models, adversarial training, LPIPS distance
RobustBench: A Standardized Adversarial Robustness Benchmark
- Francesco Croce et al., NeurIPS Datasets 2021 | Pages: 22 | Difficulty: 2/5
- Abstract: Presents RobustBench, a standardized benchmark for evaluating adversarial robustness with a continuously updated leaderboard. Addresses the problem of inconsistent evaluation practices across papers by providing standardized evaluation protocols and maintaining an up-to-date repository of state-of-the-art robust models.
- Keywords: Benchmarking, adversarial robustness, standardization, AutoAttack, model evaluation, leaderboards
Adversarial Training for Free!
- Ali Shafahi et al., NeurIPS 2019 (extended 2020) | Pages: 11 | Difficulty: 3/5
- Abstract: Proposes "free" adversarial training that achieves similar robustness to standard adversarial training with almost no additional computational cost. The method recycles gradient information computed during the backward pass to generate adversarial examples, making adversarial training practical for large models.
- Keywords: Adversarial training, computational efficiency, gradient recycling, neural networks, optimization
Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples
- Sven Gowal et al., arXiv 2020 | Pages: 18 | Difficulty: 4/5
- Abstract: Investigates the fundamental limits of adversarial training for norm-bounded attacks. Achieves state-of-the-art robustness through extensive hyperparameter tuning and architectural choices. Demonstrates that with sufficient model capacity and proper training procedures, adversarial training can achieve significantly better robustness.
- Keywords: Adversarial training, WideResNet, data augmentation, model capacity, robustness limits

C2. Model Poisoning & Backdoor Attacks

Blind Backdoors in Deep Learning Models
- Eugene Bagdasaryan, Vitaly Shmatikov, USENIX Security 2021 | Pages: 18 | Difficulty: 4/5
- Abstract: Introduces blind backdoor attacks where the attacker doesn't need to control the training process. Shows how backdoors can be injected through model replacement or by poisoning only a small fraction of training data. Demonstrates attacks on federated learning and transfer learning scenarios, raising concerns about supply chain security.
- Keywords: Backdoor attacks, federated learning, transfer learning, model poisoning, supply chain security
WaNet: Imperceptible Warping-based Backdoor Attack
- Anh Nguyen et al., ICLR 2021 | Pages: 18 | Difficulty: 3/5
- Abstract: Proposes a novel backdoor attack using smooth warping transformations instead of visible patches as triggers. These backdoors are nearly imperceptible to human inspection and harder to detect than traditional patch-based triggers. Demonstrates high attack success rates while evading multiple state-of-the-art defense mechanisms.
- Keywords: Backdoor attacks, image warping, imperceptible perturbations, neural networks, trigger design
Backdoor Learning: A Survey
- Yiming Li et al., IEEE TNNLS 2022 | Pages: 45 | Difficulty: 2/5
- Abstract: Comprehensive survey of backdoor attacks and defenses in deep learning. Categorizes attacks by trigger type, poisoning strategy, and attack scenario. Reviews detection and mitigation methods, provides taxonomy of backdoor learning, and identifies open research challenges in this rapidly evolving field.
- Keywords: Survey paper, backdoor attacks, defense mechanisms, trigger patterns, neural network security
Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks
- Avi Schwarzschild et al., ICML 2021 | Pages: 21 | Difficulty: 3/5
- Abstract: Presents a unified benchmark for evaluating data poisoning and backdoor attacks across different scenarios. Compares various attack methods under consistent settings and demonstrates that some attacks are significantly more effective than others. Provides standardized evaluation framework for future research.
- Keywords: Data poisoning, backdoor attacks, benchmarking, neural networks, attack evaluation
Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective
- Yi Zeng et al., ICCV 2021 | Pages: 10 | Difficulty: 3/5
- Abstract: Analyzes backdoor triggers from a frequency perspective and discovers that existing triggers predominantly contain high-frequency components. Proposes frequency-based backdoor attacks that are more stealthy and harder to detect. Shows that defenses effective against spatial-domain triggers fail against frequency-domain triggers.
- Keywords: Backdoor attacks, frequency analysis, Fourier transform, trigger design, stealth attacks
Backdoor Attacks Against Deep Learning Systems in the Physical World
- Emily Wenger et al., CVPR 2021 | Pages: 10 | Difficulty: 3/5
- Abstract: Extends backdoor attacks to the physical world using robust physical triggers that work across different viewing conditions. Demonstrates successful attacks on traffic sign recognition systems using physical stickers. Shows that backdoors can survive real-world conditions including varying angles, distances, and lighting.
- Keywords: Physical adversarial examples, backdoor attacks, computer vision, robust perturbations, physical-world attacks
Hidden Trigger Backdoor Attacks
- Aniruddha Saha et al., AAAI 2020 | Pages: 8 | Difficulty: 3/5
- Abstract: Proposes backdoor attacks where triggers are hidden in the neural network's feature space rather than being visible patterns in the input. These attacks are harder to detect because there's no visible trigger pattern that can be identified through input inspection or trigger inversion techniques.
- Keywords: Backdoor attacks, hidden triggers, feature space, neural networks, detection evasion
Input-Aware Dynamic Backdoor Attack
- Anh Nguyen, Anh Tran, NeurIPS 2020 | Pages: 11 | Difficulty: 4/5
- Abstract: Introduces dynamic backdoor attacks where the trigger pattern adapts to the input image, making detection more difficult. Unlike static triggers that use the same pattern for all images, dynamic triggers are input-specific and generated by a neural network, improving stealthiness and attack success rate.
- Keywords: Dynamic backdoor attacks, generative models, adaptive triggers, neural networks, attack stealthiness

C3. Privacy Attacks on Machine Learning

Extracting Training Data from Large Language Models
- Nicholas Carlini et al., USENIX Security 2021 | Pages: 17 | Difficulty: 3/5
- Abstract: Demonstrates that large language models like GPT-2 memorize and can be made to emit verbatim training data including personal information, phone numbers, and copyrighted content. The paper raises serious privacy concerns for LLMs trained on web data and shows that model size correlates with memorization capability.
- Keywords: LLMs, privacy attacks, data extraction, memorization, training data leakage, GPT-2
Updated Membership Inference Attacks Against Machine Learning Models
- Bargav Jayaraman, David Evans, PETS 2022 | Pages: 20 | Difficulty: 3/5
- Abstract: Presents improved membership inference attacks that achieve higher success rates than previous methods. Shows that even well-generalized models leak membership information. Evaluates attacks under realistic scenarios including label-only access and demonstrates effectiveness across different model architectures and datasets.
- Keywords: Membership inference, privacy attacks, machine learning, differential privacy, model leakage
Gradient Inversion Attacks: Privacy Leakage in Federated Learning
- Liam Fowl et al., NeurIPS 2021 | Pages: 12 | Difficulty: 4/5
- Abstract: Demonstrates that gradient information shared in federated learning can be inverted to reconstruct private training data with high fidelity. Shows successful attacks even with gradient perturbations and multiple local training steps. Highlights fundamental privacy risks in collaborative learning scenarios.
- Keywords: Federated learning, gradient inversion, privacy attacks, data reconstruction, collaborative learning
Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks
- Fatemehsadat Mireshghallah et al., EMNLP 2022 | Pages: 16 | Difficulty: 3/5
- Abstract: Studies privacy risks in masked language models like BERT through membership inference attacks. Shows that fine-tuning on sensitive data creates privacy vulnerabilities even when the base model was pre-trained on public data. Demonstrates that different model architectures and training procedures have varying privacy risks.
- Keywords: BERT, masked language models, membership inference, NLP, privacy risks, fine-tuning
Privacy-Preserving Machine Learning: Threats and Solutions
- Zhigang Lu et al., IEEE Security & Privacy 2020 | Pages: 10 | Difficulty: 2/5
- Abstract: Survey paper covering privacy attacks and defense mechanisms in machine learning. Discusses membership inference, model inversion, and data extraction attacks. Reviews privacy-preserving techniques including differential privacy, secure multi-party computation, and federated learning. Provides comprehensive overview for practitioners.
- Keywords: Survey paper, privacy attacks, differential privacy, secure computation, privacy-preserving ML
Label-Only Membership Inference Attacks
- Christopher Choquette-Choo et al., ICML 2021 | Pages: 22 | Difficulty: 3/5
- Abstract: Proposes membership inference attacks that only require access to predicted labels, not confidence scores. Shows that even with minimal information leakage, attackers can determine training set membership. Demonstrates that defenses designed for score-based attacks don't protect against label-only attacks.
- Keywords: Membership inference, label-only attacks, privacy leakage, machine learning privacy, black-box attacks
Auditing Differentially Private Machine Learning: How Private is Private SGD?
- Matthew Jagielski et al., NeurIPS 2020 | Pages: 11 | Difficulty: 4/5
- Abstract: Audits the privacy guarantees of differentially private SGD by conducting membership inference attacks. Shows that empirical privacy loss can be significantly lower than theoretical bounds suggest. Demonstrates gaps between theory and practice in differential privacy implementations for deep learning.
- Keywords: Differential privacy, DP-SGD, privacy auditing, membership inference, privacy guarantees

C4. LLM Security & Jailbreaking

Jailbroken: How Does LLM Safety Training Fail?
- Alexander Wei et al., NeurIPS 2023 | Pages: 34 | Difficulty: 3/5
- Abstract: Analyzes why safety training in LLMs can be circumvented through jailbreaking. Identifies two fundamental failure modes: competing objectives during training and mismatched generalization between safety and capabilities. Provides theoretical framework for understanding jailbreak vulnerabilities and suggests that current alignment approaches have inherent limitations.
- Keywords: LLMs, jailbreaking, safety training, RLHF, alignment, adversarial prompts
Universal and Transferable Adversarial Attacks on Aligned Language Models
- Andy Zou et al., arXiv 2023 | Pages: 25 | Difficulty: 3/5
- Abstract: Introduces automated methods using gradient-based optimization to generate adversarial suffixes that jailbreak aligned LLMs. Shows these attacks transfer across different models including GPT-3.5, GPT-4, and Claude. Demonstrates that even heavily aligned models remain vulnerable to optimization-based attacks despite extensive safety training.
- Keywords: LLMs, adversarial attacks, jailbreaking, gradient-based optimization, transfer attacks, alignment
Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
- Kai Greshake et al., AISec 2023 | Pages: 17 | Difficulty: 2/5
- Abstract: Introduces indirect prompt injection attacks where malicious instructions are embedded in external data sources (websites, emails, documents) that LLMs process. Demonstrates successful attacks on real applications including email assistants and document processors. Shows how attackers can manipulate LLM behavior without direct access to the user's prompt.
- Keywords: Prompt injection, LLMs, indirect attacks, application security, web security, LLM agents
Poisoning Language Models During Instruction Tuning
- Alexander Wan et al., ICML 2023 | Pages: 12 | Difficulty: 3/5
- Abstract: Demonstrates backdoor attacks during the instruction tuning phase of LLMs. Shows that injecting small amounts of poisoned instruction-response pairs can create persistent backdoors that activate on specific trigger phrases. Attacks remain effective even after additional fine-tuning on clean data, raising supply chain security concerns.
- Keywords: LLMs, instruction tuning, backdoor attacks, data poisoning, model security, fine-tuning
Red Teaming Language Models with Language Models
- Ethan Perez et al., EMNLP 2022 | Pages: 23 | Difficulty: 2/5
- Abstract: Uses LLMs to automatically generate diverse test cases for red-teaming other LLMs. Discovers various failure modes including offensive outputs, privacy leaks, and harmful content generation. Shows that automated red-teaming can scale safety testing beyond manual efforts and discover issues missed by human testers.
- Keywords: Red teaming, LLMs, automated testing, safety evaluation, adversarial prompts, model evaluation
Are Aligned Neural Networks Adversarially Aligned?
- Nicholas Carlini et al., NeurIPS 2023 | Pages: 29 | Difficulty: 4/5
- Abstract: Studies whether alignment through RLHF provides adversarial robustness. Finds that aligned models remain vulnerable to adversarial attacks and that alignment and robustness are distinct properties. Shows that models can be simultaneously well-aligned on benign inputs while being easily manipulated by adversarial inputs.
- Keywords: LLMs, alignment, RLHF, adversarial robustness, model security, safety training
Do Prompt-Based Models Really Understand the Meaning of their Prompts?
- Albert Webson, Ellie Pavlick, NAACL 2022 | Pages: 15 | Difficulty: 3/5
- Abstract: Investigates whether prompt-based language models actually understand prompt semantics or merely pattern match. Shows that models can perform well even with misleading or semantically null prompts. Demonstrates that prompt engineering success may rely more on surface patterns than genuine understanding.
- Keywords: Prompt engineering, LLMs, prompt understanding, semantic analysis, NLP, model interpretability
Prompt Injection Attacks and Defenses in LLM-Integrated Applications
- Yupei Liu et al., arXiv 2023 | Pages: 14 | Difficulty: 2/5
- Abstract: Formalizes prompt injection attacks and proposes a comprehensive taxonomy covering direct and indirect injection vectors. Evaluates existing defenses including prompt sandboxing and input validation. Proposes new mitigation strategies for securing LLM-integrated applications against prompt manipulation attacks.
- Keywords: Prompt injection, LLMs, attack taxonomy, defense mechanisms, application security

C5. Federated Learning Security

Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing
- Sai Praneeth Karimireddy et al., ICLR 2022 | Pages: 22 | Difficulty: 4/5
- Abstract: Proposes a Byzantine-robust aggregation method for federated learning that works with heterogeneous data distributions. Uses bucketing to group similar client updates and applies robust aggregation within buckets. Provides theoretical guarantees on convergence even with a significant fraction of malicious clients.
- Keywords: Federated learning, Byzantine robustness, heterogeneous data, aggregation methods, distributed learning
The Limitations of Federated Learning in Sybil Settings
- Clement Fung et al., RAID 2020 | Pages: 15 | Difficulty: 3/5
- Abstract: Analyzes federated learning security under Sybil attacks where adversaries control multiple fake identities. Shows that existing Byzantine-robust aggregation methods fail when attackers can create unlimited Sybil identities. Demonstrates fundamental limitations of federated learning in permissionless settings.
- Keywords: Federated learning, Sybil attacks, Byzantine robustness, distributed systems, attack scenarios
Attack of the Tails: Yes, You Really Can Backdoor Federated Learning
- Hongyi Wang et al., NeurIPS 2020 | Pages: 12 | Difficulty: 4/5
- Abstract: Presents sophisticated edge-case backdoor attacks that target rare inputs while maintaining high model utility on common data. Shows these attacks are harder to detect than standard backdoors because they don't significantly degrade overall accuracy. Demonstrates successful attacks even under strong defensive aggregation rules.
- Keywords: Federated learning, backdoor attacks, edge cases, model poisoning, distributed learning
DBA: Distributed Backdoor Attacks against Federated Learning
- Chulin Xie et al., ICLR 2020 | Pages: 13 | Difficulty: 3/5
- Abstract: Introduces distributed backdoor attacks where multiple malicious clients collaborate to inject backdoors while evading detection. Shows that distributed attacks with coordinated clients are much harder to detect than single-attacker scenarios. Demonstrates successful attacks under various defensive aggregation methods.
- Keywords: Federated learning, distributed attacks, backdoor attacks, collaborative adversaries, model poisoning
Local Model Poisoning Attacks on Federated Learning: A Survey
- Zhao Chen et al., arXiv 2022 | Pages: 38 | Difficulty: 2/5
- Abstract: Comprehensive survey of poisoning attacks in federated learning including both untargeted and targeted (backdoor) attacks. Categorizes attacks by adversary capabilities, attack objectives, and methods. Reviews defense mechanisms and discusses open challenges in securing federated learning systems.
- Keywords: Survey paper, federated learning, poisoning attacks, backdoor attacks, defense mechanisms
Analyzing Federated Learning through an Adversarial Lens
- Arjun Nitin Bhagoji et al., ICML 2019 (extended 2020) | Pages: 18 | Difficulty: 3/5
- Abstract: Comprehensive analysis of attack vectors in federated learning including both model poisoning and backdoor attacks. Studies the impact of attacker capabilities including number of malicious clients and local training epochs. Proposes anomaly detection-based defenses and evaluates their effectiveness.
- Keywords: Federated learning, adversarial analysis, poisoning attacks, anomaly detection, distributed learning
Soteria: Provable Defense Against Privacy Leakage in Federated Learning from Representation Perspective
- Jingwei Sun et al., CVPR 2021 | Pages: 10 | Difficulty: 4/5
- Abstract: Proposes Soteria, a defense mechanism against gradient inversion attacks in federated learning. Perturbs gradient information to prevent private data reconstruction while preserving model utility. Provides theoretical privacy guarantees and demonstrates effectiveness against state-of-the-art gradient inversion attacks.
- Keywords: Federated learning, privacy defense, gradient perturbation, privacy guarantees, gradient inversion

C6. AI for Cybersecurity Defense: Software Security

Deep Learning-Based Vulnerability Detection: Are We There Yet?
- Steffen Eckhard et al., IEEE TSE 2022 | Pages: 18 | Difficulty: 3/5
- Abstract: Comprehensive empirical study evaluating deep learning approaches for vulnerability detection. Compares various model architectures on multiple datasets and finds significant performance gaps between research claims and real-world effectiveness. Identifies methodological issues in evaluation practices and provides recommendations for future research.
- Keywords: Vulnerability detection, deep learning, empirical evaluation, code analysis, software security
LineVul: A Transformer-based Line-Level Vulnerability Prediction
- Michael Fu, Chakkrit Tantithamthavorn, ICSE 2022 | Pages: 12 | Difficulty: 3/5
- Abstract: Proposes LineVul, a transformer-based model that identifies vulnerable code at line-level granularity rather than function-level. Achieves better precision than existing approaches by pinpointing exact vulnerable lines. Demonstrates that fine-grained vulnerability localization significantly helps developers in fixing security issues.
- Keywords: Transformers, CodeBERT, vulnerability detection, line-level analysis, code understanding
Vulnerability Detection with Code Language Models
- Yangruibo Ding et al., arXiv 2023 | Pages: 10 | Difficulty: 2/5
- Abstract: Investigates how far current code language models have progressed in vulnerability detection. Evaluates models including CodeBERT, GraphCodeBERT, and CodeT5 on real-world vulnerability datasets. Finds that while models show promise, significant gaps remain in detecting complex vulnerabilities.
- Keywords: Code language models, vulnerability detection, transformers, pre-trained models, code understanding
DeepDFA: Static Analysis Enhanced Deep Learning for Vulnerability Detection
- Zhen Li et al., ICSE 2021 | Pages: 12 | Difficulty: 4/5
- Abstract: Combines deep learning with traditional static analysis techniques for improved vulnerability detection. Uses data flow analysis to extract program semantics and feeds them into neural networks. Demonstrates that incorporating static analysis features significantly improves detection accuracy over pure learning approaches.
- Keywords: Static analysis, data flow analysis, deep learning, vulnerability detection, hybrid methods
(Jo) You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion
- Roei Schuster et al., USENIX Security 2021 | Pages: 17 | Difficulty: 3/5
- Abstract: Demonstrates that neural code autocompleters can be poisoned to suggest insecure code patterns. Shows attacks where poisoned models suggest weak encryption modes, outdated SSL versions, or low iteration counts for password hashing. Highlights security risks in AI-assisted software development tools.
- Keywords: Code completion, backdoor attacks, software security, neural networks, supply chain attacks
An Empirical Study of Pre-trained Models for Vulnerability Detection
- Weixiang Yan et al., ICSE 2023 | Pages: 12 | Difficulty: 3/5
- Abstract: Large-scale empirical study comparing various pre-trained models (BERT, GPT, T5) for vulnerability detection across multiple datasets. Analyzes factors affecting model performance including pre-training objectives, model size, and fine-tuning strategies. Provides practical guidance for practitioners.
- Keywords: Pre-trained models, vulnerability detection, BERT, GPT, empirical study, transfer learning
(Han) A dataset built for ai-based vulnerability detection methods using differential analysis
- Zheng et al.
- Abstract: This paper proposes D2A, a differential analysis approach that automatically labels static analysis issues by comparing code versions before and after bug-fixing commits, generating a large dataset of 1.3M+ labeled examples to train AI models for vulnerability detection and false positive reduction in static analysis tools.

C7. AI for Cybersecurity Defense: Intrusion Detection

KITSUNE: An Ensemble of Autoencoders for Online Network Intrusion Detection
- Yisroel Mirsky et al., NDSS 2018 (extended 2020) | Pages: 15 | Difficulty: 2/5
- Abstract: Proposes an unsupervised intrusion detection system using ensemble of autoencoders that learns normal network behavior. Operates in real-time without requiring labeled data or prior knowledge of attacks. Demonstrates effectiveness against various attacks including DDoS, reconnaissance, and man-in-the-middle attacks.
- Keywords: Autoencoders, intrusion detection, unsupervised learning, anomaly detection, network security
Deep Learning for Network Intrusion Detection: A Systematic Literature Review
- Saba Arif et al., IEEE Access 2021 | Pages: 25 | Difficulty: 2/5
- Abstract: Systematic review of deep learning approaches for network intrusion detection from 2010-2020. Categorizes approaches by architecture (CNN, RNN, DBN, autoencoder) and provides comparative analysis. Identifies research gaps and future directions in applying deep learning to network security.
- Keywords: Survey paper, deep learning, intrusion detection, CNN, RNN, network security
E-GraphSAGE: A Graph Neural Network Based Intrusion Detection System
- Zhongru Lo et al., arXiv 2022 | Pages: 10 | Difficulty: 3/5
- Abstract: Applies graph neural networks to intrusion detection by modeling network traffic as graphs. Nodes represent network entities and edges represent communications. Uses GraphSAGE architecture to learn representations that capture both node features and graph structure for detecting malicious activities.
- Keywords: Graph neural networks, GraphSAGE, intrusion detection, network traffic analysis, deep learning
Adversarial Attacks Against Deep Learning Based Network Intrusion Detection Systems
- Luca Demetrio et al., ESORICS 2020 | Pages: 19 | Difficulty: 3/5
- Abstract: Studies adversarial robustness of deep learning-based IDS systems. Demonstrates successful evasion attacks against various neural network architectures including MLP, CNN, and RNN. Shows that malware can evade detection through carefully crafted perturbations while preserving functionality.
- Keywords: Adversarial attacks, intrusion detection, evasion attacks, deep learning, network security
LSTM-Based Intrusion Detection for IoT Networks
- Ayush Kumar et al., IEEE IoT Journal 2021 | Pages: 12 | Difficulty: 2/5
- Abstract: Proposes LSTM-based intrusion detection specifically designed for IoT networks with resource constraints. Addresses challenges of high-dimensional data and real-time processing requirements. Demonstrates high detection rates on IoT-specific attack datasets while maintaining computational efficiency.
- Keywords: LSTM, IoT security, intrusion detection, recurrent neural networks, resource-constrained devices
Federated Learning for Intrusion Detection: Challenges and Opportunities
- Jin-Hee Cho et al., IEEE Communications Magazine 2022 | Pages: 8 | Difficulty: 3/5
- Abstract: Explores using federated learning for collaborative intrusion detection across organizations without sharing raw data. Discusses challenges including data heterogeneity, Byzantine attacks, and communication costs. Proposes research directions for making federated IDS practical and secure.
- Keywords: Federated learning, intrusion detection, privacy-preserving learning, collaborative security

C8. AI for Cybersecurity Defense: Malware Classification

MalConv: Deep Learning for Malware Classification from Raw Bytes
- Edward Raff et al., AAAI 2018 (extended 2020) | Pages: 14 | Difficulty: 3/5
- Abstract: Proposes MalConv, a CNN architecture that classifies malware directly from raw byte sequences without manual feature engineering. Demonstrates that end-to-end learning from raw bytes can achieve competitive accuracy compared to hand-crafted features. Addresses the feature engineering bottleneck in malware analysis.
- Keywords: CNNs, malware classification, end-to-end learning, raw bytes, deep learning
Adversarial Malware Binaries: Evading Deep Learning for Malware Detection
- Bojan Kolosnjaji et al., ESORICS 2018 (extended 2020) | Pages: 18 | Difficulty: 4/5
- Abstract: Demonstrates adversarial attacks against deep learning-based malware detectors. Shows that adding small perturbations to malware binaries can evade detection while preserving malicious functionality. Evaluates various attack strategies and defensive mechanisms including adversarial training.
- Keywords: Adversarial attacks, malware detection, evasion attacks, binary analysis, deep learning robustness
Transformer-Based Language Models for Malware Detection
- Muhammed Demirkıran et al., arXiv 2022 | Pages: 10 | Difficulty: 3/5
- Abstract: Applies transformer models to malware classification using API call sequences as input. Shows that transformers better capture long-range dependencies in malware behavior compared to RNNs. Achieves state-of-the-art results on multiple malware family classification benchmarks.
- Keywords: Transformers, malware detection, API sequences, BERT, sequence modeling
Deep Learning for Android Malware Detection: A Systematic Literature Review
- Pinyaphat Tasawong et al., IEEE Access 2021 | Pages: 32 | Difficulty: 2/5
- Abstract: Comprehensive survey of deep learning approaches for Android malware detection. Categorizes methods by input representation (static features, dynamic behavior, hybrid) and model architecture. Analyzes 150+ papers from 2015-2020 and identifies trends and research gaps.
- Keywords: Survey paper, Android security, malware detection, deep learning, mobile security
Explainable Malware Detection with Attention-Based Neural Networks
- Zhaoqi Zhang et al., IEEE TIFS 2020 | Pages: 14 | Difficulty: 3/5
- Abstract: Proposes attention-based neural networks for malware detection that provide explanations for classification decisions. Attention mechanism highlights which code segments or API calls contribute most to malware classification. Improves trust and debuggability of deep learning malware detectors.
- Keywords: Attention mechanisms, explainability, malware detection, interpretable ML, neural networks

C9. AI for Cybersecurity Defense: Blockchain Security

SmartGuard: An LLM-Enhanced Framework for Smart Contract Vulnerability Detection
- Yangruibo Ding et al., arXiv 2023 | Pages: 12 | Difficulty: 3/5
- Abstract: Proposes SmartGuard, a framework that combines LLMs with program analysis techniques to detect smart contract vulnerabilities. Uses chain-of-thought prompting and static analysis results to guide LLMs in identifying security issues. Achieves higher precision than traditional static analyzers on Solidity contracts.
- Keywords: LLMs, smart contracts, vulnerability detection, blockchain security, Solidity, chain-of-thought
Deep Learning for Blockchain Security: Opportunities and Challenges
- Shijie Zhang et al., IEEE Network 2021 | Pages: 8 | Difficulty: 2/5
- Abstract: Survey paper discussing applications of deep learning to blockchain security including smart contract analysis, anomaly detection, and fraud detection. Identifies challenges such as limited labeled data and adversarial attacks. Proposes research directions for improving blockchain security with AI.
- Keywords: Survey paper, blockchain security, deep learning, smart contracts, anomaly detection
Graph Neural Networks for Smart Contract Vulnerability Detection
- Zhuang Yuan et al., ICSE 2020 | Pages: 11 | Difficulty: 4/5
- Abstract: Uses graph neural networks to detect vulnerabilities in smart contracts by representing contracts as control flow and data flow graphs. Learns vulnerability patterns from graph structures. Demonstrates superior performance compared to sequence-based and tree-based models.
- Keywords: Graph neural networks, smart contracts, vulnerability detection, program graphs, GNN
Detecting Ponzi Schemes on Ethereum: Towards Healthier Blockchain Technology
- Weili Chen et al., WWW 2020 | Pages: 10 | Difficulty: 3/5
- Abstract: Proposes deep learning methods to detect Ponzi schemes deployed as smart contracts on Ethereum. Extracts features from account behaviors and contract code. Achieves over 90% detection accuracy and discovers hundreds of unreported Ponzi schemes on the Ethereum blockchain.
- Keywords: Ponzi schemes, Ethereum, fraud detection, smart contracts, deep learning

C10. AI for Cybersecurity Defense: Phishing Detection

An Improved Transformer-Based Model for Detecting Phishing Emails
- Ahmad Jamal et al., IEEE Access 2023 | Pages: 12 | Difficulty: 2/5
- Abstract: Proposes IPSDM, a fine-tuned BERT-based model for detecting phishing and spam emails. Addresses the challenge of increasingly sophisticated phishing attacks that evade traditional rule-based filters. Achieves high accuracy on real-world email datasets and provides interpretable attention-based explanations.
- Keywords: BERT, phishing detection, email security, transformers, fine-tuning, NLP
ChatSpamDetector: Leveraging LLMs for Phishing Email Detection
- Takashi Koide et al., arXiv 2023 | Pages: 10 | Difficulty: 2/5
- Abstract: Introduces a novel phishing detection system using large language models with chain-of-thought reasoning. LLM analyzes email content, headers, and links to determine phishing likelihood and provides detailed reasoning. Achieves over 95% accuracy while offering human-readable explanations.
- Keywords: LLMs, phishing detection, chain-of-thought, email security, zero-shot learning
Deep Learning Approaches for Phishing Website Detection: A Systematic Literature Review
- Rongqin Liang et al., Computers & Security 2022 | Pages: 28 | Difficulty: 2/5
- Abstract: Systematic review of deep learning methods for phishing website detection covering 2015-2021. Categorizes approaches by input features (URL, HTML, visual) and model architecture. Compares performance metrics and identifies research trends and gaps in phishing detection.
- Keywords: Survey paper, phishing detection, deep learning, website security, URL analysis
Devising and Detecting Phishing Emails Using LLMs
- Florian Heiding et al., arXiv 2023 | Pages: 14 | Difficulty: 3/5
- Abstract: Studies both offensive and defensive capabilities of LLMs for phishing. Shows GPT-4 can generate highly convincing phishing emails that evade traditional filters. Also evaluates LLMs' ability to detect phishing, finding they outperform rule-based systems but require careful prompt engineering.
- Keywords: LLMs, phishing generation, phishing detection, GPT-4, email security, adversarial use

C11. Cyber Threat Intelligence

Transformer-Based Named Entity Recognition for Cyber Threat Intelligence
- Panos Evangelatos et al., IEEE ICNC 2021 | Pages: 6 | Difficulty: 3/5
- Abstract: Applies transformer models like BERT to extract security entities (malware names, vulnerabilities, attack techniques) from cyber threat intelligence reports. Outperforms traditional NER approaches on security-specific entity types. Enables automated extraction of actionable intelligence from unstructured text.
- Keywords: NER, transformers, BERT, threat intelligence, information extraction, NLP
Automated Generation of Fake Cyber Threat Intelligence Using Transformers
- Priyanka Ranade et al., arXiv 2021 | Pages: 8 | Difficulty: 3/5
- Abstract: Demonstrates that transformer models can automatically generate realistic but fake cyber threat intelligence reports to mislead defenders. Raises concerns about adversarial use of language models in cyber operations. Shows that generated reports can fool both human analysts and automated systems.
- Keywords: GPT-2, text generation, deception, threat intelligence, adversarial AI, transformers
LLM4Sec: Fine-Tuned LLMs for Cybersecurity Log Analysis
- Even Karlsen et al., arXiv 2023 | Pages: 10 | Difficulty: 3/5
- Abstract: Proposes LLM4Sec framework for fine-tuning language models on cybersecurity logs. Benchmarks multiple model variants on log classification and anomaly detection tasks. Shows DistilRoBERTa achieves F1-score of 0.998 across diverse security log datasets while being computationally efficient.
- Keywords: LLMs, log analysis, RoBERTa, fine-tuning, anomaly detection, security logs
Deep Learning for Threat Intelligence: A Survey
- Xiaojun Xu et al., arXiv 2022 | Pages: 25 | Difficulty: 2/5
- Abstract: Comprehensive survey of deep learning applications in cyber threat intelligence including threat detection, attribution, and prediction. Reviews architectures (CNNs, RNNs, transformers, GNNs) and their applications. Discusses challenges including adversarial attacks and data scarcity.
- Keywords: Survey paper, threat intelligence, deep learning, threat detection, NLP

C12. AI Model Security & Supply Chain

Weight Poisoning Attacks on Pre-trained Models
- Keita Kurita et al., ACL 2020 | Pages: 11 | Difficulty: 3/5
- Abstract: Demonstrates that pre-trained language models in public repositories can be poisoned with backdoors that persist through fine-tuning. Attackers poison model weights such that backdoors activate on downstream tasks after users fine-tune the model. Highlights supply chain risks in the model-sharing ecosystem.
- Keywords: Weight poisoning, pre-trained models, backdoor attacks, supply chain security, BERT, transfer learning
Backdoor Attacks on Self-Supervised Learning
- Aniruddha Saha et al., CVPR 2022 | Pages: 10 | Difficulty: 3/5
- Abstract: Shows that backdoors injected during self-supervised pre-training transfer to downstream supervised tasks. Even when fine-tuning on clean data, backdoored features persist and can be activated with appropriate triggers. Demonstrates attacks on contrastive learning methods like SimCLR and MoCo.
- Keywords: Self-supervised learning, backdoor attacks, contrastive learning, transfer learning, SimCLR
Model Stealing Attacks Against Inductive Graph Neural Networks
- Asim Waheed Duddu et al., IEEE S&P 2022 | Pages: 16 | Difficulty: 4/5
- Abstract: Demonstrates model extraction attacks specifically targeting graph neural networks. Shows that GNNs are particularly vulnerable to stealing because attackers can query with carefully crafted graphs. Extracts high-fidelity copies of target models with fewer queries than required for traditional neural networks.
- Keywords: Model stealing, graph neural networks, model extraction, API attacks, intellectual property
Proof-of-Learning: Definitions and Practice
- Hengrui Jia et al., IEEE S&P 2021 | Pages: 17 | Difficulty: 4/5
- Abstract: Introduces proof-of-learning, a cryptographic protocol that allows model trainers to prove they performed the training computation honestly. Enables verification that a model was trained as claimed without revealing training data. Addresses concerns about stolen models and fraudulent training claims.
- Keywords: Proof-of-learning, cryptographic protocols, model verification, training provenance, zero-knowledge proofs
Protecting Intellectual Property of Deep Neural Networks with Watermarking
- Yusuke Uchida et al., AsiaCCS 2017 (extended 2020) | Pages: 13 | Difficulty: 3/5
- Abstract: Proposes watermarking techniques to protect ownership of neural networks. Embeds watermarks that survive fine-tuning and model extraction attempts. Watermarks can be verified to prove ownership without degrading model performance. Addresses intellectual property protection in model sharing.
- Keywords: Model watermarking, intellectual property, ownership verification, backdoor-based watermarks, model protection

C13. Robustness & Certified Defenses

Certified Adversarial Robustness via Randomized Smoothing
- Jeremy Cohen et al., ICML 2019 (extended 2020) | Pages: 17 | Difficulty: 4/5
- Abstract: Provides provable robustness certificates using randomized smoothing by adding Gaussian noise. Transforms any classifier into a certifiably robust version with theoretical guarantees. Achieves state-of-the-art certified accuracy on ImageNet and demonstrates scalability to large models and datasets.
- Keywords: Certified defenses, randomized smoothing, Gaussian noise, provable robustness, theoretical guarantees
Provable Defenses via the Convex Outer Adversarial Polytope
- Eric Wong, Zico Kolter, ICML 2018 (extended 2020) | Pages: 11 | Difficulty: 5/5
- Abstract: Uses convex optimization to train neural networks with provable robustness guarantees. Computes exact worst-case adversarial loss during training through linear relaxation. Limited to small networks due to computational complexity but provides strongest possible guarantees.
- Keywords: Certified defenses, convex optimization, provable robustness, linear relaxation, formal verification
Benchmarking Neural Network Robustness to Common Corruptions and Perturbations
- Dan Hendrycks, Thomas Dietterich, ICLR 2019 (extended 2020) | Pages: 17 | Difficulty: 2/5
- Abstract: Introduces ImageNet-C benchmark for evaluating robustness to natural image corruptions like noise, blur, and weather effects. Shows that adversarially trained models often fail on common corruptions despite improved adversarial robustness. Demonstrates importance of testing robustness beyond adversarial perturbations.
- Keywords: Robustness benchmarks, natural corruptions, distribution shift, model evaluation, ImageNet-C
Smoothed Analysis of Neural Networks
- Huan Zhang et al., NeurIPS 2020 | Pages: 12 | Difficulty: 5/5
- Abstract: Provides smoothed analysis framework for studying neural network behavior under input perturbations. Derives tighter robustness certificates by analyzing networks with random smoothing. Bridges gap between worst-case analysis and average-case performance.
- Keywords: Smoothed analysis, robustness certification, theoretical analysis, neural networks, perturbation analysis

C14. Interpretability & Verification for Security

Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks
- Guy Katz et al., CAV 2017 (extended 2020) | Pages: 20 | Difficulty: 5/5
- Abstract: Introduces formal verification of neural networks using SMT solving. Can prove or disprove safety properties about ReLU network behavior. Foundational work enabling formal guarantees about neural network decisions. Demonstrated on aircraft collision avoidance system verification.
- Keywords: Formal verification, SMT solvers, ReLU networks, safety properties, symbolic reasoning
DeepXplore: Automated Whitebox Testing of Deep Learning Systems
- Kexin Pei et al., SOSP 2017 (extended 2020) | Pages: 18 | Difficulty: 3/5
- Abstract: Introduces neuron coverage as a metric for testing deep learning systems. Automatically generates test inputs that maximize differential behavior across multiple models. Discovers thousands of erronous behaviors in production DL systems including self-driving cars.
- Keywords: DNN testing, neuron coverage, differential testing, automated test generation, model testing
Attention is Not Always Explanation: Quantifying Attention Flow in Transformers
- Samira Abnar, Willem Zuidema, EMNLP 2020 | Pages: 11 | Difficulty: 3/5
- Abstract: Analyzes whether attention weights in transformers provide faithful explanations of model behavior. Introduces attention flow to track information through layers. Shows attention weights can be manipulated without changing predictions, questioning their reliability as explanations in security-critical applications.
- Keywords: Attention mechanisms, interpretability, transformers, explanation faithfulness, NLP analysis
Quantifying Uncertainty in Neural Networks for Security Applications
- Lewis Smith, Yarin Gal, arXiv 2020 | Pages: 10 | Difficulty: 3/5
- Abstract: Uses Bayesian neural networks to quantify prediction uncertainty in security tasks. Shows uncertainty estimates can detect adversarial examples and out-of-distribution inputs. Proposes using uncertainty quantification as an additional defense layer in security-critical systems.
- Keywords: Uncertainty quantification, Bayesian neural networks, Monte Carlo dropout, adversarial detection, OOD detection

C15. AI for Offensive Security

Generating Adversarial Examples with Generative Models
- Chaowei Xiao et al., NDSS 2019 (extended 2020) | Pages: 15 | Difficulty: 4/5
- Abstract: Uses generative models (GANs, VAEs) to create adversarial examples that lie on the natural data manifold. These attacks are more realistic and harder to detect than perturbation-based attacks. Demonstrates successful attacks against defended models that detect out-of-distribution adversarial examples.
- Keywords: GANs, VAEs, adversarial examples, generative models, natural adversarial examples
Automating Network Exploitation with Reinforcement Learning
- William Glodek et al., arXiv 2020 | Pages: 10 | Difficulty: 3/5
- Abstract: Applies deep reinforcement learning to automated network penetration testing. Agents learn to exploit vulnerabilities through trial and error in simulated environments. Demonstrates potential for AI-driven offensive security testing but also highlights limitations in complex real-world scenarios.
- Keywords: Reinforcement learning, penetration testing, network exploitation, automated security testing, DRL
Generating Natural Language Adversarial Examples on a Large Scale
- Moustafa Alzantot et al., EMNLP 2018 (extended 2020) | Pages: 12 | Difficulty: 3/5
- Abstract: Uses genetic algorithms to generate adversarial text that fools NLP classifiers while maintaining semantic similarity. Demonstrates vulnerabilities in sentiment analysis, textual entailment, and reading comprehension systems. Shows that text classifiers are brittle to synonymous substitutions.
- Keywords: Adversarial NLP, genetic algorithms, text perturbations, semantic similarity, NLP attacks
LLM-Fuzzer: Fuzzing Large Language Models with Adaptive Prompts
- Jiahao Yu et al., arXiv 2023 | Pages: 16 | Difficulty: 2/5
- Abstract: Proposes automated fuzzing framework for discovering LLM vulnerabilities using mutation-based approach. Adaptively generates test prompts that trigger jailbreaks, hallucinations, and other failure modes. Demonstrates effectiveness in finding alignment failures across multiple LLM models.
- Keywords: Fuzzing, LLMs, automated testing, jailbreaks, vulnerability discovery, prompt generation

class/gradsec2026.1773594006.txt.gz · Last modified: 2026/03/16 00:00 by mhshin · [Old revisions]

Table of Contents

Overview

Participants

Agenda

Class Information

Reading List for LLM-based Cybersecurity

C1. Adversarial Machine Learning

C2. Model Poisoning & Backdoor Attacks

C3. Privacy Attacks on Machine Learning

C4. LLM Security & Jailbreaking

C5. Federated Learning Security

C6. AI for Cybersecurity Defense: Software Security

C7. AI for Cybersecurity Defense: Intrusion Detection

C8. AI for Cybersecurity Defense: Malware Classification

C9. AI for Cybersecurity Defense: Blockchain Security

C10. AI for Cybersecurity Defense: Phishing Detection

C11. Cyber Threat Intelligence

C12. AI Model Security & Supply Chain

C13. Robustness & Certified Defenses

C14. Interpretability & Verification for Security

C15. AI for Offensive Security