Hybrid Data Structures and beyond
Paolo Ferragina (University of Pisa)
The ever-growing need to efficiently store, retrieve and analyze massive datasets, originating from very different sources, is currently made more complex by the different requirements posed by users, devices and applications. Such a new level of complexity cannot be handled properly by current data structures for Big Data problems.
To meet these challenges, surprising new results have recently appeared in the literature: so-called Hybrid Data Structures, which integrate classic approaches (such as B-trees) with various kinds of learning models (such as neural networks). They achieve improved space-time trade-offs and open new research scenarios.
In this talk, I’ll survey the evolution of search data structures, point out new challenges and results, and propose the novel concept of Personalized Data Structures and its corresponding algorithmic framework, called Multicriteria Data Structures. Here, one wishes to seamlessly integrate, via a principled optimization approach, classic or compressed data structures with new, revolutionary data structures “learned” from the input data by using proper machine-learning tools. Hybrid Data Structures are just a simple instance of this framework, which we believe deserves much research attention because of its scientific challenges and significant practical impact.
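To make the idea of a data structure "learned" from the input concrete, here is a minimal Python sketch. It is not the speaker's method: the class name `LearnedIndex`, the single linear model and the error-bounded binary search are all illustrative assumptions, chosen only to show how a learned model and a classic search can be combined. It assumes a non-empty collection of numeric keys.

```python
import bisect


class LearnedIndex:
    """Hypothetical sketch: a linear model predicts a key's position in a
    sorted array, and a bounded binary search corrects the prediction."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit position ~ slope * key + intercept by ordinary least squares.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # The worst prediction error on stored keys bounds the search window.
        self.max_err = max(
            abs(self._predict(k) - i) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key in the sorted array, or None if absent."""
        n = len(self.keys)
        guess = self._predict(key)
        lo = min(n, max(0, guess - self.max_err))
        hi = max(lo, min(n, guess + self.max_err + 1))
        # Classic binary search, restricted to the window around the guess.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else None


index = LearnedIndex([3, 8, 15, 16, 23, 42, 99])
position = index.lookup(23)
missing = index.lookup(4)
```

The model replaces the upper levels of a tree with a prediction, while the bounded search keeps lookups exact; the window size, and hence the time per query, depends on how well the model fits the key distribution.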
Paolo Ferragina is Professor of Algorithms at the University of Pisa and the Director of the PhD in Computer Science, hosted by the University of Pisa and run in collaboration with the Universities of Florence and Siena. He leads the Acube Lab, whose research is devoted to designing algorithms and data structures for storing, compressing, mining and retrieving information from Big Data, with several industrial projects in collaboration with companies such as Google, Bloomberg, Yahoo!, ST Microelectronics, ENEL, Tiscali and CERVED.
He received his PhD in Computer Science from the University of Pisa (1996) and completed a post-doc at the Max-Planck-Institut für Informatik (Saarbrücken, 1998). His promotion to full professor was sponsored by Yahoo! Research. From 2010 to 2016, he was Vice-Rector for “Applied Research and Innovation” at the University of Pisa and, in the same period, President of the IT Center, which is a competence center on Cloud and HPC for Dell and Intel.
His research has resulted in three US patents (three more are pending) and several international awards, most recently three Google Research Awards (2010, 2012 and 2016) and one Bloomberg Data Science Research Grant (2017). He has been an invited speaker at many international conferences and workshops, and Area Editor of the Encyclopedia of Algorithms (Springer) and of the Encyclopedia of Big Data Technologies (Springer). He serves on the Editorial Board of the Journal of Graph Algorithms and Applications (JGAA), and has served as co-chair and editor of several international conferences and special issues in journals. He has co-authored three books, some chapters, and more than 160 publications in international refereed conferences and journals on Theoretical Computer Science and Algorithmics (his H-index on Google Scholar is 38, with about 8k citations).
Extreme Learning Machines (ELM) – When ELM and Deep Learning Synergize
Guang-Bin Huang (School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore)
One of the most curious questions in the world is how brains produce intelligence. The objectives of this talk are threefold: 1) to show that there is some convergence between machine learning and biological learning. Although there are many different types of machine learning techniques and many different types of learning mechanisms in brains, Extreme Learning Machines (ELM), as a common learning mechanism, may fill the gap between machine learning and biological learning. In fact, ELM theories have recently been validated by more and more direct biological evidence. ELM theories suggest that the secret of the brain's learning capabilities may lie in a globally ordered and structured mechanism built from locally random individual neurons, and such a learning system happens to have regression, classification, sparse coding, clustering, compression and feature learning capabilities, which are fundamental to cognition and reasoning; 2) to show that a single hidden layer of ELM unifies Support Vector Machines (SVM), Principal Component Analysis (PCA) and Non-negative Matrix Factorization (NMF); 3) to show that ELM provides some theoretical support for the universal approximation and classification capabilities of Convolutional Neural Networks (CNN). In addition to its good performance on small to medium datasets, hierarchical ELM is catching up with Deep Learning on some big benchmark datasets on which Deep Learning has traditionally performed well.
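As a rough illustration of the "locally random individual neurons" idea, the following Python sketch shows the basic single-hidden-layer ELM recipe as it is commonly described: hidden weights are drawn at random and left untrained, and only the output weights are solved for by least squares. The function names, the tanh activation, the number of hidden units and the toy regression task are assumptions made for this example, not details from the talk.

```python
import numpy as np


def elm_fit(X, y, n_hidden=50, seed=0):
    """Hypothetical helper: fit a single-hidden-layer ELM for regression."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are random and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    # Only the output weights are learned, via the pseudo-inverse of H.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta


# Toy regression task (an assumption for illustration): fit sin(x) on [-3, 3].
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y = np.sin(X).ravel()
W, b, beta = elm_fit(X, y)
mse = float(np.mean((elm_predict(X, W, b, beta) - y) ** 2))
```

Because training reduces to one linear solve, the fit is fast; how well it generalizes depends on the number of hidden units and on the random draw.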
Guang-Bin Huang is a Full Professor in the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. He is a member of Elsevier’s Research Data Management Advisory Board. He is one of three Directors of the Expert Committee of the China Big Data Industry Ecological Alliance, organized by the China Ministry of Industry and Information Technology, and a member of the International Robotic Expert Committee for China. He was a nominee for the Singapore President Science Award (2016, 2017 and 2018), was named a “Highly Cited Researcher” by Thomson Reuters (in two fields: Engineering and Computer Science), and was listed in Thomson Reuters’s “The World’s Most Influential Scientific Minds.” He received the best paper award from IEEE Transactions on Neural Networks and Learning Systems (2013). His two works on Extreme Learning Machines (ELM) were listed by Google Scholar in 2017 as Top 2 and Top 7, respectively, in its “Classic Papers: Articles That Have Stood The Test of Time” – Top 10 in Artificial Intelligence.
He serves as an Associate Editor of Neurocomputing, Cognitive Computation, Neural Networks, and IEEE Transactions on Cybernetics.
He is Principal Investigator of the BMW-NTU Joint Future Mobility Lab on Human Machine Interface and Assisted Driving, Principal Investigator (Data and Video Analytics) of the Delta – NTU Joint Lab, Principal Investigator (Scene Understanding) of the ST Engineering – NTU Corporate Lab, and Principal Investigator (Marine Data Analysis and Prediction for Autonomous Vessels) of the Rolls-Royce – NTU Corporate Lab. He has led or implemented several key industrial projects; for example, he was chief architect/designer and technical leader of the Singapore Changi Airport Cargo Terminal 5 Inventory Control System (T5 ICS) Upgrading Project.