- Davide Bacciu, Alessio Micheli (University of Pisa), Deep Learning for Graphs
- Silvia Chiappa (DeepMind), Luca Oneto (University of Genoa), Fairness in Machine Learning
- Claudio Gallicchio (University of Pisa), Simone Scardapane (Sapienza University of Rome), Deep Randomized Neural Networks
- Věra Kůrková (Czech Academy of Sciences), Complexity of Shallow and Deep Networks
- Danilo P. Mandic, Ilia Kisil, and Giuseppe G. Calvi (Imperial College London), Tensor Decompositions and Applications. Blessing of Dimensionality
- German I. Parisi (University of Hamburg) and Vincenzo Lomonaco (University of Bologna), Continual Lifelong Learning with Neural Networks
Deep Learning for Graphs
The tutorial will introduce the emerging field of deep learning for graphs and its applications to bioinformatics, chemistry and vision. Dealing with graph data requires learning models capable of adapting to structured samples of varying size and topology, capturing the relevant structural patterns to perform predictive and explorative tasks while maintaining the efficiency and scalability necessary to process large scale networks. The tutorial will first introduce some foundational aspects of learning with structured data samples and will survey some seminal neural network models for graphs. Then it will focus on the most recent advancements in terms of deep learning for network and graph data, including learning structure embeddings, graph convolutions, attentional models and structure generation. The tutorial is targeted to both early career researchers seeking ideas for their doctoral studies as well as to more advanced stage researchers looking to enter into a lively field of deep learning and seeking both foundational knowledge as well as a perspective on current research.
Davide Bacciu is Assistant Professor at the Computer Science Department, University of Pisa. The core of his research is on Machine Learning (ML) and deep learning models for structured data processing, including sequences, trees and graphs. He is the PI of an Italian National project on ML for structured data. He has been teaching courses of Artificial Intelligence (AI) and ML at undergraduate and graduate levels since 2010. He is the Secretary of the Italian Association for Artificial Intelligence (AI*IA), member of the IEEE CIS Task Force on Deep Learning and Associate Editor of the IEEE Transactions on Neural Networks and Learning Systems.
Alessio Micheli is Associate Professor at the Department of Computer Science, University of Pisa, where he is the coordinator of the Computational Intelligence & Machine Learning Group (CIML). His research interests include Machine Learning, Neural Networks, Deep Learning, learning in structured domains (sequence, tree and graph data) and relational learning. He has co-authored more than 140 papers published in international journals and conference proceedings. According to Google Scholar, Alessio Micheli has more than 1900 citations and an H-index of 24 (as of August 2018). He is the national coordinator of the “Italian Working group on Machine Learning and Data Mining” of AI*IA. He is a member of the IEEE CIS Task Force on Deep Learning and a PC member of conferences in ML and AI.
Fairness in Machine Learning
AI systems and products are reaching society at large and many aspects of everyday life, including healthcare, criminal justice, education, and finance. This phenomenon has been accompanied by an increase in concern about the ethical issues that may arise from the adoption of these technologies. In response to this concern, a new area of machine learning has recently emerged that studies how to address disparate treatment caused by algorithmic errors and bias in the data. The central question is how to ensure that the learned model does not treat subgroups in the population ‘unfairly’. While the design of solutions to this issue requires an interdisciplinary effort, fundamental progress can only be achieved through a radical change in the machine learning paradigm.
In this tutorial, we will describe the state of the art on ML fairness as well as discuss currently unexplored areas of research. We will use the framework of graphical models to provide a clear and intuitive formalization and characterization of the subject.
Silvia Chiappa is a senior research scientist in Machine Learning at DeepMind, where she works on ML fairness and deep probabilistic temporal models. Silvia received a Diploma di Laurea in Mathematics from the University of Bologna and a PhD in Statistical Machine Learning from École Polytechnique Fédérale de Lausanne. Before joining DeepMind, Silvia worked in several Machine Learning and Statistics research groups: the Empirical Inference Department at the Max Planck Institute for Intelligent Systems, the Machine Intelligence and Perception Group at Microsoft Research Cambridge, and the Statistical Laboratory, University of Cambridge. Silvia’s research interests are based around Bayesian and causal reasoning, graphical models, approximate inference, time-series models, and ML fairness.
Luca Oneto was born in Rapallo, Italy in 1986. He received his BSc and MSc in Electronic Engineering from the University of Genoa, Italy, in 2008 and 2010, respectively. In 2014 he received his PhD from the same university, in the School of Sciences and Technologies for Knowledge and Information Retrieval, with the thesis “Learning Based On Empirical Data”. In 2017 he obtained the Italian National Scientific Qualification for the role of Associate Professor in Computer Engineering and in 2018 the one in Computer Science.
He is currently an Assistant Professor in Computer Engineering at University of Genoa with particular interests in Statistical Learning Theory and Data Science.
Deep Randomized Neural Networks
Randomized Neural Networks explore the behavior of neural systems where the majority of connections are fixed, either in a stochastic or a deterministic fashion. Typical examples of such systems consist of multi-layered neural network architectures where the connections to the hidden layer(s) are left untrained after initialization.
Limiting the training algorithms to operate on a reduced set of weights inherently characterizes the class of Randomized Neural Networks with a number of intriguing features. Among them, the extreme efficiency of the resulting learning processes is undoubtedly a striking advantage with respect to fully trained architectures. Besides, despite the involved simplifications, randomized neural systems possess remarkable properties both in practice, achieving state-of-the-art results in multiple domains, and in theory, allowing the analysis of intrinsic properties of neural architectures (e.g. before training of the hidden layers’ connections). In recent years, the study of Randomized Neural Networks has been extended towards deep architectures, opening new research directions to the design of effective yet extremely efficient deep learning models in vectorial as well as in more complex data domains.
This tutorial will cover all the major aspects regarding the design and analysis of Randomized Neural Networks, and some of the key results with respect to their approximation capabilities. In particular, the tutorial will first introduce the fundamentals of randomized neural models in the context of feed-forward networks (i.e., Random Vector Functional Link and equivalent models), convolutional filters, and recurrent systems (i.e., Reservoir Computing networks). Then, it will focus specifically on recent results in the domain of deep randomized systems, and their application to structured domains.
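As a rough illustration of the idea described above, the following minimal sketch trains only the output weights of a single-hidden-layer network whose hidden connections are drawn at random and then left fixed. It is not taken from the tutorial material: the function names, the tanh activation, the Gaussian initialization, the ridge-regression readout and the toy sine-regression task are all assumptions chosen for brevity.

```python
import numpy as np


def fit_random_network(X, y, n_hidden=200, ridge=1e-4, seed=0):
    """Fit only the readout of a network with a fixed random hidden layer.

    Hypothetical helper for illustration: W and b are sampled once and never
    trained; beta is obtained in closed form by ridge regression.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # untrained input-to-hidden weights
    b = rng.normal(size=n_hidden)                # untrained hidden biases
    H = np.tanh(X @ W + b)                       # random hidden representation
    # Ridge regression on the hidden features: (H^T H + ridge * I) beta = H^T y
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ y)
    return W, b, beta


def predict(X, W, b, beta):
    """Apply the fixed random hidden layer followed by the trained readout."""
    return np.tanh(X @ W + b) @ beta


# Toy task (an assumption for this sketch): regress sin(x) on [-3, 3].
data_rng = np.random.default_rng(1)
X_train = data_rng.uniform(-3.0, 3.0, size=(500, 1))
y_train = np.sin(X_train[:, 0])

W, b, beta = fit_random_network(X_train, y_train)

X_test = np.linspace(-3.0, 3.0, 100).reshape(-1, 1)
y_test = np.sin(X_test[:, 0])
test_mse = float(np.mean((predict(X_test, W, b, beta) - y_test) ** 2))
print("test MSE:", test_mse)
```

Because the hidden weights stay fixed, training reduces to solving one linear system, which is where the efficiency mentioned above comes from in this simplified setting.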
Claudio Gallicchio is Assistant Professor at the Department of Computer Science, University of Pisa, within the Computational Intelligence & Machine Learning Group (CIML). He is chair of the IEEE CIS Task Force on Reservoir Computing, and member of the IEEE CIS Task Force on Deep Learning. Claudio Gallicchio has co-organized several special sessions on Randomized Neural Networks methodologies in major international conferences, and since 2016 is co-organizer of the Italian Workshop on Machine Learning and Data Mining (MLDM.it). He serves as member of several program committees of conferences and workshops in Machine Learning and Artificial Intelligence. His research interests include Machine Learning, Deep Learning, Randomized Neural Networks, Reservoir Computing, Recurrent and Recursive Neural Networks, Sequence and Structured Domains Learning.
Simone Scardapane is a Post-Doctoral Fellow at the “Sapienza” University of Rome. He is active as co-organizer of special sessions and special issues on themes related to Randomized Neural Networks and Randomized Machine Learning approaches. His research interests include Machine Learning, Neural Networks, Reservoir Computing and Randomized Neural Networks, Distributed and Semi-supervised Learning, Kernel Methods, and Audio Classification. Simone Scardapane is an Honorary Research Fellow with the CogBID Laboratory, University of Stirling, Stirling, U.K. He is the co-organizer of the Rome Machine Learning & Data Science Meetup, which organizes monthly events in Rome, and a member of the advisory board for Codemotion Italy. He is also a co-founder of the Italian Association for Machine Learning, a not-for-profit organization with the aim of promoting machine learning concepts to the public. In 2017 he was certified as a Google Developer Expert for machine learning. Currently, he is the track director for the CNR-sponsored “Advanced School of AI” (https://as-ai.org/governance/).
Complexity of Shallow and Deep Networks
Věra Kůrková (Czech Academy of Sciences)
Experimental evidence motivated theoretical research aiming to characterize tasks for which deep networks are more suitable than shallow ones. This tutorial will review recent theoretical results comparing capabilities of shallow and deep networks. In particular, it will focus on complexity requirements of shallow and deep networks performing high-dimensional tasks. The contents of the tutorial will cover:
- Universality and tractability of representations of multivariable mappings by shallow networks
- Sparse representations of functions by shallow and deep networks and output-weight regularization
- Limitations of computation of highly-varying functions by shallow networks
- Probabilistic lower bounds on model complexity of shallow and deep networks
- Constructive lower bounds on model complexity of shallow networks
- Examples of functions that can be represented compactly by deep architectures but cannot be represented by a compact shallow architecture
- Connections to the No Free Lunch Theorem, pseudo-noise sequences, and the central paradox of coding theory
- Open problems concerning deep and shallow architectures
This tutorial is self-contained, and is suitable for researchers who already use multilayer neural networks as a tool and wish to understand their mathematical foundations, capabilities and limitations. The tutorial does not require a sophisticated mathematical background.
Věra Kůrková received her Ph.D. in mathematics from Charles University, Prague, and her DrSc. (Prof.) in theoretical computer science from the Czech Academy of Sciences. Since 1990 she has been affiliated as a scientist with the Institute of Computer Science, Prague; from 2002 to 2009 she was the Head of the Department of Theoretical Computer Science. Her research interests are the mathematical theory of neurocomputing and machine learning, nonlinear approximation theory, and inverse problems. She is a member of the editorial boards of the journals Neural Networks and Neural Processing Letters; in the past she also served as an associate editor of IEEE Transactions on Neural Networks and was an editor of special issues of the journals Neural Networks and Neurocomputing. She was the general chair of the conferences ICANN 2008 and ICANNGA 2001 and co-chair of ICANN 2017 and ICANN 2018. She is the president of the European Neural Network Society (ENNS).
Tensor Decompositions and Applications. Blessing of Dimensionality
The widespread use of multisensor technology and the emergence of big data sets have highlighted the limitations of standard flat-view matrix models and the necessity to move toward more versatile data analysis tools. It is therefore both timely and important to be acquainted with the most recent advances in analysis of huge multi-dimensional arrays (tensors) of data and to be equipped with the appropriate tools. To this end, the tutorial will be divided in two parts:
- The first half will be a classic lecture-type tutorial which will start with the Curse of Dimensionality in Big Data analytics, in the form of the four V’s: Volume, Variety, Velocity, Veracity. This will be followed by a comprehensive overview of tensor decompositions, supported by a variety of examples. Overall, a whole spectrum of tensor applications will be covered, from the basics of Big Data to feasible realizations through multi-linear algebra and tensor networks.
- The second half will be a hands-on demo based on our open source software, HOTTBOX, specifically developed for performing decompositions, visualisation and analysis of multidimensional data. All demonstration material, as well as exercises, will be provided in the form of Jupyter notebooks, the most popular interactive environment for exploratory data analysis. Local installation of additional software will be optional, as all examples can be seamlessly run in the Cloud.
This two-part structure will make this tutorial suitable for the multidisciplinary machine learning and data analytic communities, together with offering the attendees an enhanced experience and a hands-on insight into immediate practical aspects and impact of the tensor technology.
Danilo P. Mandic is a Professor in signal processing with Imperial College London, UK, and has been working in the area of nonlinear adaptive signal processing and bioengineering. He is a Fellow of the IEEE, member of the Board of Governors of the International Neural Networks Society (INNS), member of the Big Data Chapter within INNS, and has received several best paper awards in Brain Computer Interface. Prof Mandic runs the Smart Environments Lab at Imperial, has more than 300 publications in journals and conferences, and has received the President’s Award for excellence in postgraduate supervision at Imperial. His work related to this Tutorial includes co-authoring two recent research monographs on Tensor Networks for Dimensionality Reduction and Large Scale Optimisation (Now Publishers, 2016 and 2017).
Ilia Kisil is currently a graduate Research Assistant, pursuing his PhD at the Department of Electrical and Electronic Engineering of Imperial College London. In 2016, he received an MRes in Advanced Computing from Imperial College London, UK, after obtaining an MSc in Intelligence Systems in Robotics and a BSc in Automation Control from ITMO University, Russia. His research focuses on the analysis of inherently N-dimensional arrays (tensors) with applications to biomedical and financial data. He has been the driving force behind HOTTBOX, a toolbox for tensor decompositions, visualisation, feature extraction and non-linear classification of multi-dimensional data.
Giuseppe G. Calvi is in an advanced stage of his PhD at the Department of Electrical and Electronic Engineering of Imperial College London. He obtained his BSc degree in Telecommunication Engineering from the Polytechnic University of Turin in 2014 and an equivalent degree from the Tongji University of Shanghai in 2015. In 2016 he attained his MSc degree in Communications and Signal Processing from Imperial College London, earning the Best Thesis Award. His research focuses on tensor decompositions for signal processing applications. In particular he studies the integration of tensors and neural networks, support tensor machines for financial applications, and tensor networks and their taxonomy. He is a major contributor to the HOTTBOX initiative.
Continual Lifelong Learning with Neural Networks
Artificial agents interacting in highly dynamic environments are required to continually acquire and fine-tune their knowledge over time. In contrast to conventional deep neural networks that typically rely on a large batch of annotated training samples, lifelong learning systems must account for situations in which the number of tasks is not known a priori and the data samples become incrementally available over time. Despite recent advances in deep learning, lifelong machine learning has remained a long-standing challenge due to neural networks being prone to catastrophic forgetting, i.e., the learning of new tasks interferes with previously learned ones and leads to abrupt disruptions of performance. Recently proposed deep supervised and reinforcement learning models for addressing catastrophic forgetting suffer from flexibility, robustness, and scalability issues with respect to biological systems. In this tutorial, we will present and discuss well-established and emerging neural network approaches motivated by lifelong learning factors in biological systems such as neurosynaptic plasticity, complementary memory systems, multi-task transfer learning, and intrinsically motivated exploration.
German I. Parisi is working at Apprente, Inc. (USA). He received his BSc and MSc in Computer Science from the University of Milano-Bicocca, Italy. In 2017 he received his PhD in Computer Science from the University of Hamburg. In 2015 he was a visiting researcher at the Cognitive Neuro-Robotics Lab at the Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea. Since 2016 he has been a research associate of the international project Transregio TRR 169 on Crossmodal Learning in the Knowledge Technology Institute at the University of Hamburg. His main research interests include lifelong and transfer learning, multisensory integration, and neural network self-organization.
Vincenzo Lomonaco is a 3rd year PhD student at the University of Bologna, Italy and founder of ContinualAI.org, a non-profit research organization on Continual/Lifelong Learning for AI. He is also the PhD student representative at the Department of Computer Science and Engineering (DISI) and teaching assistant for the courses “Machine Learning” and “Computer Architectures” in the same department. He was a visiting scholar at Purdue University, USA in 2017 and at the ENSTA ParisTech Grande École, France in 2018. His main research interests include continual/lifelong learning with deep architectures, multi-task learning, knowledge distillation and transfer, and their applications to embedded systems, robotics and internet-of-things.