Resolving the Mystery of Deep Learning by Statistical Physics
- Project Scheme: General Research Fund
- Project Year:
- Project Leader: Dr YEUNG, Chi Ho (Department of Science and Environmental Studies)
The rapid development of deep learning in the past decade has led to many remarkable applications, ranging from speech recognition, which has achieved human-level performance, to Go-playing algorithms, which have beaten top professional players. Surprisingly, despite the increasing number of applications using deep learning, we have only a limited understanding of why it performs so well. In particular, many deep learning applications apply deep neural networks (DNNs) to infer the non-trivial input-output relations in labeled datasets. The internal representations in DNNs, the mechanism by which they arrive at good decisions, and how over-parametrized DNNs avoid over-fitting and achieve high generalizability are not fully understood. This incomplete understanding may pose serious, even fatal, dangers, as deep learning is now commonly applied to vital applications such as medical image analysis and self-driving cars. In the proposed project, we will apply tools from statistical physics to derive a fundamental understanding of DNNs, which can be applied to boost their performance and, most importantly, reduce risks when DNNs are used in vital applications. We remark that statistical physics tools have already been applied to analyze shallow neural networks and obtain macroscopic properties inaccessible to tools from other areas, but the understanding of DNNs via statistical physics is far from complete.
Here, we aim to (1) develop an improved fundamental understanding of DNNs in terms of their loss landscape, training dynamics, and, most importantly, their remarkable generalization performance; (2) establish theoretical frameworks to analyze DNNs by drawing an analogy with spin glasses and disordered systems; these frameworks will play a crucial role in future theoretical studies and understanding of deep learning; (3) understand and leverage the exploration-exploitation dilemma in training DNNs, and introduce new protocols to be used in combination with state-of-the-art algorithms for better training and generalization of DNNs. These tasks are challenging, but success will be highly rewarding: it will provide a fundamental understanding that helps resolve the mystery underlying DNNs and lead to important insights into a wide range of existing and future deep learning applications.