| 5/6 | Jo | [[https://arxiv.org/pdf/2310.12815|Formalizing and Benchmarking Prompt Injection Attacks and Defenses]] | {{ :class:formalizing_and_benchmarking_prompt_injection_attacks_and_defenses_-_복사본.pptx |Slides}} |  |
| ::: | Han | [[https://arxiv.org/pdf/2305.00944|Poisoning Language Models During Instruction Tuning]] | {{ :class:poisoning_language_models_during_instruction_tuning.pdf |poisoning_language_models_during_instruction_tuning}} |  |
| 5/13 | Han | [[https://arxiv.org/pdf/2004.04692|Rethinking the Trigger of Backdoor Attack]] | {{ :class:rethinking_the_trigger_of_backdoor_attack.pdf |rethinking_the_trigger_of_backdoor_attack}} |  |
| ::: | Kwak | [[https://arxiv.org/pdf/1910.00033|Hidden Trigger Backdoor Attacks]] | {{ :class:hidden_trigger_backdoor_attacks.pdf |}} |  |
| 5/20 | Kwak |  |  |  |
| ::: | Jo |  |  |  |
    * Keywords: Physical adversarial examples, backdoor attacks, computer vision, robust perturbations, physical-world attacks
    * URL: https://arxiv.org/pdf/2004.04692.pdf
  - <fc red>(Kwak)</fc> **Hidden Trigger Backdoor Attacks**
    * Aniruddha Saha et al., AAAI 2020 | Pages: 8 | Difficulty: 3/5
    * Abstract: Proposes backdoor attacks where triggers are hidden in the neural network's feature space rather than being visible patterns in the input. These attacks are harder to detect because there is no visible trigger pattern that can be identified through input inspection or trigger-inversion techniques (see the sketch below).
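
The crafting step behind these hidden triggers is a feature-collision optimization: find a poison that stays within a small L-infinity ball around a clean target-class image while matching the feature representation of a trigger-patched source image. Below is a minimal PyTorch-style sketch of that objective; the helper names (feature_extractor, paste_trigger) and the hyperparameters are illustrative assumptions, not the authors' released code.

<code python>
import torch

def paste_trigger(img, trigger, x=0, y=0):
    """Paste a small trigger patch onto a batch of images at pixel (x, y).
    Hypothetical helper; the paper pastes triggers at random locations."""
    out = img.clone()
    _, _, h, w = trigger.shape
    out[..., y:y + h, x:x + w] = trigger
    return out

def craft_poison(feature_extractor, target_img, source_img, trigger,
                 eps=16 / 255, lr=0.01, steps=1000):
    """Optimize a poison that looks like target_img in pixel space but
    matches the *triggered* source image in feature space."""
    with torch.no_grad():
        patched_feat = feature_extractor(paste_trigger(source_img, trigger))

    poison = target_img.clone().requires_grad_(True)
    for _ in range(steps):
        # Feature-collision loss: pull the poison toward the trigger-patched
        # source in the network's feature space.
        loss = (feature_extractor(poison) - patched_feat).pow(2).sum()
        loss.backward()
        with torch.no_grad():
            poison -= lr * poison.grad.sign()              # PGD-style step
            # ... while staying visually close to the clean target image
            poison.copy_(torch.min(torch.max(poison, target_img - eps),
                                   target_img + eps).clamp(0, 1))
        poison.grad.zero_()
    # The poison is labeled with the target class and carries no visible
    # trigger; after fine-tuning on it, pasting the trigger onto a
    # source-class input flips the prediction to the target class.
    return poison.detach()
</code>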
  
==== C3. Privacy Attacks on Machine Learning ====
  - <fc red>(Kwak)</fc> **Extracting Training Data from Large Language Models**
    * Nicholas Carlini et al., USENIX Security 2021 | Pages: 17 | Difficulty: 3/5
    * Abstract: Demonstrates that large language models like GPT-2 memorize and can be made to emit verbatim training data, including personal information, phone numbers, and copyrighted content. The paper raises serious privacy concerns for LLMs trained on web data and shows that model size correlates with memorization capability (see the sketch below).
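
The attack behind these results is a generate-then-rank loop: sample many unconditional generations from the model, then flag the ones the model is suspiciously confident about as likely-memorized training data. The toy-scale sketch below uses the public GPT-2 model via Hugging Face transformers; the tiny sample count and plain perplexity ranking are simplifying assumptions (the paper generates hundreds of thousands of samples and combines several membership metrics, such as perplexity ratios against a smaller model and zlib entropy).

<code python>
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def perplexity(text: str) -> float:
    """Perplexity of text under GPT-2; abnormally low values are
    candidates for verbatim memorization."""
    ids = tok(text, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss          # mean per-token NLL
    return torch.exp(loss).item()

@torch.no_grad()
def sample(n=20, max_len=64):
    """Draw short unconditional samples, seeded only with the BOS token."""
    bos = tok(tok.bos_token, return_tensors="pt").input_ids
    for _ in range(n):
        out = model.generate(bos, do_sample=True, top_k=40,
                             max_length=max_len,
                             pad_token_id=tok.eos_token_id)
        yield tok.decode(out[0], skip_special_tokens=True)

# Rank generations by perplexity; the lowest-perplexity outliers are the
# ones then checked (manually or via web search) for leaked training data.
ranked = sorted((t for t in sample() if t.strip()), key=perplexity)
for text in ranked[:5]:
    print(f"{perplexity(text):9.2f}  {text[:80]!r}")
</code>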
 