CS427
AI Safety
Course (UG/PG)
Undergraduate
Offering Unit/Department
Course Description
With the advancement of systems like GPT, artificial intelligence (AI) techniques are anticipated to significantly impact many aspects of individuals' lives. While these AI techniques have demonstrated remarkable, occasionally superhuman, performance across numerous applications, there is a growing concern regarding their safety and security. It has been shown that AI systems are subject to a range of attacks, including adversarial attacks (i.e., perturbing an input slightly causes an AI system to make completely wrong predictions), backdoor attacks (i.e., backdoors can be easily embedded in neural networks), and privacy-violating attacks such as membership inference attacks (i.e., an adversary may reliably infer whether a certain sample was used during training). In addition, AI systems can inherit or amplify biases present in their training data, potentially leading to unfair or discriminatory outcomes. Furthermore, many AI models, including GPT, are complex and not easily interpretable. This makes understanding how these models make decisions highly nontrivial, even though such understanding is crucial for trust and accountability.
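The adversarial attacks mentioned above can be illustrated with a minimal sketch of the well-known fast gradient sign method (FGSM), shown here on a toy logistic-regression model; the weights, input, and epsilon value are all hypothetical choices for illustration, not drawn from any specific system:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([1.0, -2.0, 0.5])   # hypothetical model weights
b = 0.1
x = np.array([0.4, -0.3, 0.8])   # a sample the model classifies correctly
y = 1.0                          # its true label

p = sigmoid(w @ x + b)           # model confidence for class 1

# Gradient of the cross-entropy loss w.r.t. the *input* x is (p - y) * w
grad_x = (p - y) * w

# FGSM: take a small step in the direction of the sign of that gradient
eps = 0.6
x_adv = x + eps * np.sign(grad_x)

p_adv = sigmoid(w @ x_adv + b)
print(p > 0.5, p_adv > 0.5)      # the prediction flips under the perturbation
```

The same one-step idea, applied with automatic differentiation to a deep network and a much smaller epsilon, is what makes imperceptible image perturbations change an ImageNet classifier's output.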
This course aims to present a systematic view of the range of AI safety problems that have been identified, analyse their root causes, and study potential approaches to mitigate the safety and security risks. In particular, we will focus on answering two key questions. First, given an AI system, how do we systematically evaluate its safety risk? Second, given an AI system that potentially has safety issues, how do we systematically mitigate the risks? This course will feature real-life AI safety issues in popular AI systems such as ImageNet classifiers and GPT.
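As a taste of the first question (systematically evaluating safety risk), the membership inference attack mentioned earlier is often evaluated with a simple loss-threshold test: training samples tend to incur lower loss than unseen samples, and the gap between true- and false-positive rates measures leakage. The losses and threshold below are simulated for illustration, assuming members' losses are concentrated near zero:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sample losses: members (training samples) tend to have
# lower loss than non-members, which a threshold attack exploits.
member_losses = rng.exponential(scale=0.2, size=1000)
nonmember_losses = rng.exponential(scale=1.0, size=1000)

threshold = 0.5   # assumed attack threshold on the loss

# Attack rule: predict "member" whenever the loss falls below the threshold
tpr = np.mean(member_losses < threshold)      # members correctly flagged
fpr = np.mean(nonmember_losses < threshold)   # non-members wrongly flagged
advantage = tpr - fpr                          # > 0 indicates privacy leakage
print(tpr, fpr, advantage > 0)
```

An advantage near zero would mean the attacker does no better than guessing; a large positive advantage signals that the model memorises its training data.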
Course Learning Outcomes
- 1. To understand the impact of AI on system safety and social good
- 2. To gain knowledge on different safety risks of AI systems and their implications
- 3. To gain knowledge on different safety risk mitigation methods for AI systems
- 4. To be able to conduct various attacks on AI systems
- 5. To be able to evaluate the safety risk of AI systems in terms of robustness, backdoor, bias and privacy
- 6. To be able to apply mitigation methods for reducing the safety risk of AI systems
- Understand different risks of AI systems, including adversarial attacks, backdoors, bias, data leakage and so on
- Understand different ways of evaluating AI systems' safety risks, such as empirical evaluation and verification
- Understand different ways of mitigating AI systems' safety risks, and their pros and cons
- Be able to evaluate and mitigate safety risks of real-world AI systems
Discipline-Specific Competencies
Software Design, Formal Proof Construction, Research, Security Assessment and Testing, Software Testing
SMU Graduate Learning Outcomes
Disciplinary Knowledge, Critical thinking & problem solving, Collaboration and leadership, Ethics and social responsibility, Self-directed learning
Grading Basis
GRD - Graded
Course Units
1