| ::: | Han | [[https://arxiv.org/pdf/2305.00944|Poisoning Language Models During Instruction Tuning]] | {{ :class:poisoning_language_models_during_instruction_tuning.pdf|poisoning_language_models_during_instruction_tuning}} | |
| 5/13 | Han | [[https://arxiv.org/pdf/2004.04692|Rethinking the Trigger of Backdoor Attack]] | {{ :class:rethinking_the_trigger_of_backdoor_attack.pdf|rethinking_the_trigger_of_backdoor_attack}} | |
| ::: | Kwak | [[https://arxiv.org/pdf/1910.00033|Hidden Trigger Backdoor Attacks]] | {{ :class:hidden_trigger_backdoor_attacks.pdf|}} | |
| 5/20 | Kwak | | | |
| ::: | Jo | | | |
    * Keywords: Physical adversarial examples, backdoor attacks, computer vision, robust perturbations, physical-world attacks
    * URL: https://arxiv.org/pdf/2004.04692.pdf
  - <fc red>(Kwak)</fc> **Hidden Trigger Backdoor Attacks**
    * Aniruddha Saha et al., AAAI 2020 | Pages: 8 | Difficulty: 3/5
    * Abstract: Proposes backdoor attacks where triggers are hidden in the neural network's feature space rather than being visible patterns in the input. These attacks are harder to detect because there is no visible trigger pattern that can be identified through input inspection or trigger inversion techniques; a minimal sketch of the feature-collision objective follows below.
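The core mechanism is a feature-collision optimization: each poison starts from a clean target-class image and is pushed toward a trigger-patched source image in feature space, while a small pixel-space bound keeps it visually indistinguishable from the target class. The sketch below is a minimal PyTorch rendering of that idea, assuming a feature extractor ''f'' (e.g., the victim network's penultimate layer); the function name, epsilon bound, and step count are illustrative, not the authors' exact settings.

<code python>
import torch

# Assumed inputs: f maps an image tensor to its feature vector; `target`
# is a clean target-class image and `source_patched` is a source-class
# image with the backdoor trigger pasted on (all tensors in [0, 1]).
def craft_poison(f, target, source_patched, eps=16/255, steps=100, lr=0.01):
    delta = torch.zeros_like(target, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        poison = (target + delta).clamp(0, 1)
        # Pull the poison toward the patched source in feature space, so a
        # model trained on it associates the trigger with the target class.
        loss = (f(poison) - f(source_patched)).pow(2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Project back into the eps-ball around the clean target image,
        # keeping the poison visually clean (no visible trigger).
        delta.data.clamp_(-eps, eps)
    return (target + delta).clamp(0, 1).detach()
</code>

Because the trigger only ever appears on source images at attack time, never in the poisoned training data, inspecting the training set finds nothing suspicious.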
==== C3. Privacy Attacks on Machine Learning ====
  - <fc red>(Kwak)</fc> **Extracting Training Data from Large Language Models**
    * Nicholas Carlini et al., USENIX Security 2021 | Pages: 17 | Difficulty: 3/5
    * Abstract: Demonstrates that large language models like GPT-2 memorize and can be made to emit verbatim training data, including personal information, phone numbers, and copyrighted content. The paper raises serious privacy concerns for LLMs trained on web data and shows that model size correlates with memorization capability; a sketch of the sample-and-rank extraction pipeline follows below.
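The extraction procedure is a two-stage sample-and-rank pipeline: generate many unconditioned samples from the model, then rank them by a membership signal. The sketch below assumes the Hugging Face ''transformers'' API and the public ''gpt2'' checkpoint; the sample count, generation length, and the perplexity/zlib ranking are a simplified version of the paper's setup, not its exact configuration.

<code python>
import zlib
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean per-token NLL
    return torch.exp(loss).item()

def zlib_entropy(text: str) -> int:
    return len(zlib.compress(text.encode()))

# Stage 1: sample unconditioned generations from the model.
start = torch.tensor([[tok.bos_token_id]])
outs = model.generate(start, do_sample=True, top_k=40, max_length=64,
                      num_return_sequences=20, pad_token_id=tok.eos_token_id)
samples = [tok.decode(o, skip_special_tokens=True) for o in outs]

# Stage 2: rank the samples. Text the model finds unusually easy (low
# perplexity) relative to its compressed size is a memorization candidate;
# dividing by zlib entropy filters out text that is merely repetitive.
ranked = sorted((s for s in samples if s.strip()),
                key=lambda s: perplexity(s) / zlib_entropy(s))
print(ranked[0])  # most likely to contain memorized training data
</code>

The paper combines several such membership signals (a second reference model, lowercasing, zlib) and verifies candidates by searching the web for exact matches; the ratio above is just one of them.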