COMP737022: Trustworthy Machine Learning

Course Info

With the growing deployment of machine learning systems in real-world applications, ensuring that these systems are trustworthy, i.e., secure, robust, interpretable, privacy-preserving, fair, and protected by intellectual property rights, has become more important than ever. Trustworthy machine learning is an emerging branch of machine learning that aims to address the trustworthiness issues of existing machine learning algorithms and models and, ultimately, to achieve responsible and trustworthy artificial intelligence (AI). This course systematically introduces the main research directions of trustworthy machine learning and the existing work in each direction. Specifically, the course first briefly reviews the basic concepts of machine learning and then covers, in turn, interpretability, general robustness, adversarial robustness (adversarial attacks and defenses), data poisoning (poisoning attacks and defenses), backdoor robustness (backdoor attacks and defenses), privacy (data leakage and model stealing), differential privacy, federated learning, fairness, data tampering and forgery, model intellectual property protection, and guest lectures on the research frontier. By taking this course, students will gain a comprehensive understanding of trustworthy machine learning, participate in an in-class attack competition, contribute to open-source projects, and independently complete a self-selected research topic.

Link to online course:
#Tencent meeting ID: 713-9625-9787
#Password: 737022

Textbook: 《人工智能:数据与模型安全》(Artificial Intelligence: Data and Model Security)

Course Objective

The main focus of this course is to introduce students to the main research directions of trustworthy machine learning, the problems that current machine learning faces in each direction, and possible solutions. The aim is to give students a comprehensive understanding of this emerging research field and hands-on practice with various attack and defense techniques. At the same time, the course aims to cultivate students' ability to identify and solve problems and their enthusiasm for creating "responsible" and "trustworthy" AI.

Course Content & Schedule

| Teaching week | Slides | Content & Expected Achievement | Assignment |
|---|---|---|---|
| 1 | week 1.pdf | The basic concepts of machine learning, including typical learning paradigms, loss functions, optimization methods, models, training methods, and representative application scenarios. This helps students establish a high-level understanding of machine learning. | |
| 2 | week 2.pdf | Interpretability and common corruption robustness: the main ideas and representative methods of machine learning interpretability, and the robustness of machine learning models to common corruptions. This helps students quickly build the ability to analyze, interpret, and evaluate models. | |
| 3 | week 3.pdf | Adversarial examples: existing attack methods for crafting adversarial examples and representative explanations for the existence of adversarial examples. | |
| 4 | week 4.pdf | Adversarial defense I: adversarial example detection methods. This helps students learn the basic principles of detecting adversarial examples and the challenges faced in practical use. | |
| 5 | N/A | National Day holiday | In-class adversarial attack competition |
| 6 | week 6.pdf | Adversarial defense II: early adversarial example defense methods and their unreliability, and adversarial training methods. This helps students understand the theoretical basis, training methods, and techniques of adversarial training, as well as the remaining challenges. | In-class adversarial attack competition |
| 7 | week 7.pdf | Data poisoning attacks and defenses: classic data poisoning attack methods and related defense methods. | In-class adversarial attack competition |
| 8 | week 8.pdf | Backdoor attacks and defenses: backdoor attack methods and their working mechanisms. This helps students understand the memorization behavior of machine learning models. | In-class adversarial attack competition |
| 9 | week 9.pdf | Data leakage and model extraction: two privacy issues caused by the memorization of deep learning models, namely data leakage and model extraction. This helps students understand common data and model extraction methods. | |
| 10 | week 10.pdf | Differential privacy: basic concepts and classic methods of differential privacy. This helps students understand membership inference attacks and defense methods. | Register team for final assignment |
| 11 | week 11.pdf | Data tampering and forgery: traditional image and video tampering algorithms, deepfakes, and detection methods. This helps students understand the necessity and challenges of detecting tampered and forged data. | Register team for final assignment |
| 12 | week 12.pdf | Federated learning: the basic concepts of federated learning, its privacy protection functionality, its convergence problem, and privacy and poisoning attacks and defenses in federated learning. | |
| 13 | week 13.pdf | Machine learning fairness: typical fairness problems and solutions in traditional machine learning and federated learning. This helps students understand several common forms of bias and the principles of unbiased learning. The lecture also briefly introduces AI ethics. | |
| 14 | week 14.pdf | Model intellectual property (IP) protection: classic methods for model IP protection, including model watermarking, model fingerprinting, and deep learning testing methods. This helps students understand classic model IP protection methods and their advantages and disadvantages. | |
| 15 | To appear | Guest lectures: 1-2 international scholars are invited to talk about their recent research. | |
| 16 | To appear | Final presentation | |
| 17 | To appear | Final presentation | |

Course Assessment & Grading

The course is assessed on attendance, assignment(s), a course paper, and other contribution(s).

| Assessment Criteria | Percentage | Assessment Standard |
|---|---|---|
| Attendance | 10% | 10 points for full attendance; 1 point deducted for each absence. |
| Participation | 0% | |
| Assignment(s) | 30% | In-class adversarial attack competition on CodaLab. |
| Course Paper | 60% | Choose one research topic/paper to reproduce, and identify the most vulnerable parts of the model/algorithm presented in the paper, such as the data on which the model performs worst or the circumstances under which it fails completely. >= 40 points: novel topic, innovative testing with academic value and practical significance, and excellent writing. >= 30 points: novel topic, well-designed experiment, clear viewpoint, and good writing. < 30 points: lack of background knowledge; topic choice, methodology, experimental design, and analyses do not meet the basic standard. |
| Open-book exam | 0% | |
| Closed-book exam | 0% | |
| Other(s) | 0% | Contributing to open-source projects (10 points maximum). |

Lecturer Profile

Dr. Xingjun Ma

Xingjun Ma is an associate professor in the School of Computer Science at Fudan University and a member of the Fudan Vision and Learning Lab. He received his PhD degree from The University of Melbourne, where he also worked as a postdoctoral research fellow. He was an assistant professor at Deakin University prior to joining Fudan. His main research area is trustworthy machine learning, aiming to design secure, robust, explainable, privacy-preserving, and fair machine learning algorithms and models for AI applications. He has published 40+ papers at top-tier conferences and journals, including ICLR, NeurIPS, ICML, and S&P. He received the Best Paper Award at SISAP'21, the Best Paper Runner-Up at SSDBM'21, and the 2022 Robert W. Cahn Best Paper Award from the Journal of Materials Science. His work on personal data protection has been reported by MIT Technology Review. He also serves as a program committee member or reviewer for a number of conferences and journals.


©Contact: OpenTAI group, Fudan Vision and Learning Lab