CEID Seminar and Social Hour:
“Deep Neural Networks: A Nonparametric Bayesian View with Local Competition”.
Speaker: Sergios Theodoridis, Professor Emeritus, National and Kapodistrian University of Athens (NKUA); Aalborg University, Denmark.
Date and venue: Friday, October 21, 3-5 pm, CEID (Department of Computer Engineering and Informatics), Amphitheater Γ.
Deep neural networks are currently the state-of-the-art architectures in machine learning. Human or even superhuman performance is often obtained on artificially constructed data sets, in both classification and regression. However, these advantages do not come without drawbacks. Selecting the specific architecture, e.g., the number of neurons and the number of parameters involved, is still a matter of “art” (trial and error), performed in an ad hoc manner by the user. Also, although one needs to train networks with millions of parameters in order to achieve such impressive prediction performance, it is known that, after training, only a small percentage of the parameters are actually needed. In other words, such networks are heavily overparameterized. Finally, deep neural networks are not robust to so-called adversarial attacks: one can slightly perturb the input in a specific way and the performance of a network can drop drastically to unacceptable levels. This raises serious security concerns, especially in applications such as autonomous driving, finance and medicine.
In this talk, we try to address all of the previously mentioned drawbacks by resorting to a fully probabilistic approach to the design and training of deep neural networks. The inspiration springs from neuroscientific findings, according to which the brain is known to function probabilistically, and in particular in accordance with what is known as Bayesian learning. Our prior knowledge and prior beliefs act as prior probabilities that “bias” the learning of a task towards prior knowledge. The framework of our work is that of nonparametric Bayesian learning, where the number of parameters involved is considered unknown and is not preselected by the user. Both fully connected and convolutional networks (CNNs) will be discussed. By adopting nonparametric prior distributions (prior knowledge), such as the Indian Buffet Process (IBP), the number of parameters as well as the number of neurons or the number of kernels (in CNNs) are optimally estimated via the resulting posterior distributions, that is, the probability distributions that are learned from the data, starting from the priors. The training revolves around variational Bayesian arguments.
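As a rough illustration of how a nonparametric prior leaves the number of components open, the sketch below draws a binary matrix from the Indian Buffet Process using its standard "restaurant" construction. It shows only the generative prior, not the variational training discussed in the talk; the function names, the concentration value `alpha` and the seed are assumptions chosen for the example.

```python
import math
import random

def sample_poisson(rate, rng):
    """Knuth's method for a Poisson draw; adequate for small rates."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_ibp(num_rows, alpha, rng):
    """Draw a binary matrix Z from the Indian Buffet Process.

    Row i marks which features (columns) object i uses. The number of
    columns is not fixed in advance; it is an outcome of the draw.
    """
    rows = []
    counts = []  # counts[k] = number of earlier rows that use feature k
    for i in range(1, num_rows + 1):
        # Reuse existing feature k with probability counts[k] / i.
        row = [1 if rng.random() < m / i else 0 for m in counts]
        # Introduce Poisson(alpha / i) brand-new features.
        new = sample_poisson(alpha / i, rng)
        row.extend([1] * new)
        counts = [m + z for m, z in zip(counts, row)] + [1] * new
        rows.append(row)
    width = len(counts)
    # Pad earlier rows with zeros so that Z is rectangular.
    return [row + [0] * (width - len(row)) for row in rows]

rng = random.Random(0)           # hypothetical seed, for reproducibility only
Z = sample_ibp(num_rows=8, alpha=2.0, rng=rng)
num_features = len(Z[0]) if Z else 0
```

Each column of `Z` could be read as one candidate component (for instance a connection, neuron or kernel) whose usefulness is switched on or off per row; a larger `alpha` tends to produce more columns.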
Besides the probabilistic arguments that are followed for estimating the network parameters, the nonlinearities used are neither squashing functions nor rectified linear units (ReLU), which are typically used in standard networks. Instead, inspired by neuroscientific findings, the nonlinearities comprise units of probabilistically competing linear neurons, in line with what is known as the local winner-take-all (LWTA) strategy. It is known from neuroscience that, when the brain functions, only some of the neurons become active, and this is the result of competition among the neurons. In each LWTA node, only one neuron fires to provide the output. Thus, the neurons in each node are laterally (same layer) related and only one is activated; yet, this takes place in a probabilistic context, based on an underlying distribution that relates the neurons of the respective node. It turns out that the resulting deep network architectures follow a completely different rationale compared to the more standard networks with respect to how information is propagated through the network, from the lower to the higher layers. The experiments, over a number of standard data sets, verify that highly efficient (compressed) structures are obtained in terms of the number of nodes, parameters and kernels, as well as in terms of bit precision requirements, at no sacrifice in performance compared to previously published state-of-the-art research. Besides efficient modelling, such networks turn out to exhibit much higher resilience to attacks by adversarial examples, as demonstrated by extensive experiments and substantiated by some theoretical arguments. The presentation focuses mainly on the concepts and the rationale behind the methodology and less on the mathematical details.
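The competition among linear neurons described above can be sketched in a few lines. This is a minimal illustration, not the speaker's implementation: it assumes that the winner in each node is sampled from a softmax over the linear activations, and the weight layout and all names (`wta_block`, `wta_layer`) are hypothetical.

```python
import math
import random

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def wta_block(x, weights, rng):
    """One local winner-take-all node: several linear units compete and
    exactly one of them passes its activation to the next layer.

    x       : input vector (list of floats)
    weights : one weight vector per competing linear unit
    rng     : random.Random instance used to sample the winner
    """
    # Linear activation of each competitor (no squashing, no ReLU).
    activations = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    # Assumption: winner probabilities are a softmax of the activations.
    probs = softmax(activations)
    winner = rng.choices(range(len(weights)), weights=probs, k=1)[0]
    # Only the winner's output is propagated; the others are zeroed.
    return [a if k == winner else 0.0 for k, a in enumerate(activations)]

def wta_layer(x, blocks, rng):
    """Concatenate the outputs of several independent competing nodes."""
    out = []
    for weights in blocks:
        out.extend(wta_block(x, weights, rng))
    return out

rng = random.Random(0)           # hypothetical seed, for reproducibility only
x = [0.5, -1.0, 2.0]
blocks = [
    [[1.0, 0.0, 0.5], [-0.5, 1.0, 0.0]],   # node with 2 competitors
    [[0.2, 0.2, 0.2], [0.0, -1.0, 1.0]],   # another node with 2 competitors
]
y = wta_layer(x, blocks, rng)
```

Because the winner is sampled rather than chosen by a hard maximum, repeated forward passes on the same input may activate different neurons, which is one way to read the "probabilistic context" mentioned above.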
About the speaker:
Sergios Theodoridis currently serves as a Distinguished Professor at Aalborg University, Denmark. He is also Professor Emeritus of Signal Processing and Machine Learning with the Department of Informatics and Telecommunications of the National and Kapodistrian University of Athens, Greece. He has also served as Professor with the Shenzhen Research Institute of Big Data (SRIBD), the Chinese University of Hong Kong, Shenzhen, China (2018-2020). His research interests lie in the areas of Online Learning Algorithms, Distributed and Sparsity-Aware Learning, Machine Learning and Deep Networks, Signal Processing and Learning for Brain Signals, and Audio Processing and Retrieval.
He is the author of the book “Machine Learning: A Bayesian and Optimization Perspective”, Academic Press, 2nd ed., 2020, the co-author of the best-selling book “Pattern Recognition”, Academic Press, 4th ed., 2009, the co-author of the book “Introduction to Pattern Recognition: A MATLAB Approach”, Academic Press, 2010, the co-editor of the book “Efficient Algorithms for Signal Processing and System Identification”, Prentice Hall, 1993, and the co-author of three books in Greek, two of them for the Greek Open University.
He is the co-author of seven papers that have received Best Paper Awards, including the 2014 IEEE Signal Processing Magazine Best Paper Award and the 2009 IEEE Computational Intelligence Society Transactions on Neural Networks Outstanding Paper Award. His published work has attracted more than 20,000 citations according to Google Scholar.
He is the recipient of the 2021 IEEE Signal Processing Society (SPS) Norbert Wiener Award, which is the IEEE SP Society’s highest honor, the 2017 EURASIP Athanasios Papoulis Award, the 2014 IEEE SPS Carl Friedrich Gauss Education Award and the 2014 EURASIP Meritorious Service Award. He has served as a Distinguished Lecturer for the IEEE SP as well as the Circuits and Systems Societies. He was Otto Mønsted Guest Professor, Technical University of Denmark, 2012, and holder of the Excellence Chair, Dept. of Signal Processing and Communications, University Carlos III, Madrid, Spain, 2011.
He currently serves as Chairman of the IEEE SP Society Awards Committee. He has served as Vice President of the IEEE Signal Processing Society, as President of the European Association for Signal Processing (EURASIP), as a member of the Board of Governors of the IEEE Circuits and Systems (CAS) Society, as a member of the Board of Governors (Member-at-Large) of the IEEE SP Society and as Chair of the Signal Processing Theory and Methods (SPTM) technical committee of the IEEE SPS. He has served as Editor-in-Chief of the IEEE Transactions on Signal Processing. He is Editor-in-Chief of the Signal Processing Book Series, Academic Press, and co-Editor-in-Chief of the EReference Signal Processing, Elsevier.
He is a Fellow of the IET, a Corresponding Fellow of the Royal Society of Edinburgh (RSE), a Fellow of EURASIP and a Life Fellow of the IEEE.