The 3rd International Summer School on Deep Learning is over! The emotions have subsided, so we can finally share some impressions with all of you. We spent 5 magnificent days at Global Expo in Warsaw, meeting hundreds of AI and Deep Learning professionals and enthusiasts, and soaking up the knowledge shared by renowned academics and industry pioneers.

DeepLearn2019 was a research training event with a global scope, aimed at updating participants on the most recent advances in the critical and fast-developing area of deep learning. This branch of artificial intelligence covers a spectrum of exciting current machine learning research and industrial innovation, and provides more efficient algorithms for dealing with large-scale data in neurosciences, computer vision, speech recognition, language processing, human-computer interaction, drug discovery, biomedical informatics, healthcare, recommender systems, learning theory, robotics, games, etc.

Most deep learning subareas were covered, and the main challenges identified, through 3 keynote lectures and 22 four-and-a-half-hour courses, which tackled the most active and promising topics.

Surely the best keynote was given by Mihaela van der Schaar from the University of Cambridge. It was quite similar to another talk:

The number of cutting-edge papers and the brilliance of the solutions were astonishing. One of them, ‘Forecasting Individualized Disease Trajectories using Interpretable Deep Learning’, shows that we need both the probabilistic structure of HMMs and RNNs that model state dynamics in order to achieve a general and versatile deep probabilistic model capturing complex, non-stationary representations for patient-level trajectories.

Björn Schuller from Imperial College London presented examples of the best architectures for audio and multimodal tasks, as well as the best tools for efficient data collection, intelligent data annotation and feature extraction. Some of the audio solutions he presented, like ‘Snore sound classification using image-based deep spectrum features’, were quite similar to our older approaches. Incidentally, that paper states:

“To date, very little research has been undertaken exploring deep CNN feature representations for audio processing; to the best of our knowledge, they have only been used together with spectrograms for Music Information retrieval [27].”

By the way, [27] is ‘Deep Image Features in Music Information Retrieval’, and this work was done by our Lead Data Scientists 🙂

‘Snore sound classification using image-based deep spectrum features’

A great course was given by Aaron Courville from the University of Montréal, who talked about the three main approaches to generative models: normalizing flows, VAEs, and GANs. Although the lecture was really mathematical, Aaron is a great speaker and showed many cool applications like SPADE (Semantic Image Synthesis with Spatially-Adaptive Normalization):

In this way, he struck a golden mean between the required mathematical foundations and application potential.

Qiang Ji from Rensselaer Polytechnic Institute started with the fundamentals of the probabilistic approach, going through Bayesian Neural Networks and Probabilistic Models. He pointed out that probabilistic models can capture data, model and output uncertainties, which can be used in many applications like outlier detection or active learning. Still, probabilistic deep learning suffers from intractable inference, which leads to scalability problems, so it is a really good research topic. One of the probabilistic programming toolboxes he mentioned is TensorFlow Probability, about which you can read here.
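To make the idea of separating data and model uncertainty more concrete, here is a minimal sketch of our own (not taken from the lecture): a toy Bayesian linear regression with a closed-form posterior. The prior precision `ALPHA`, the noise precision `BETA` and the toy data are all assumed values chosen for illustration.

```python
# Toy Bayesian linear regression y = w0 + w1 * x (illustrative sketch only).
ALPHA = 0.01   # prior precision on the weights (assumed value)
BETA = 100.0   # noise precision, i.e. a noise std of 0.1 (assumed value)


def fit_posterior(xs, ys):
    """Return the posterior mean (m0, m1) and 2x2 covariance of the weights."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Posterior precision A = ALPHA * I + BETA * Phi^T Phi, with Phi rows [1, x].
    a00 = ALPHA + BETA * n
    a01 = BETA * sx
    a11 = ALPHA + BETA * sxx
    det = a00 * a11 - a01 * a01
    # Posterior covariance S = A^-1 (explicit 2x2 inverse).
    cov = ((a11 / det, -a01 / det), (-a01 / det, a00 / det))
    # Posterior mean m = BETA * S * Phi^T y.
    m0 = BETA * (cov[0][0] * sy + cov[0][1] * sxy)
    m1 = BETA * (cov[1][0] * sy + cov[1][1] * sxy)
    return (m0, m1), cov


def predict_with_uncertainty(x, mean, cov):
    """Return the predictive mean plus model and data variance at input x."""
    mu = mean[0] + mean[1] * x
    # Model (epistemic) variance: phi^T S phi with phi = [1, x].
    epistemic = cov[0][0] + 2 * x * cov[0][1] + x * x * cov[1][1]
    # Data (aleatoric) variance: the assumed observation noise.
    aleatoric = 1.0 / BETA
    return mu, epistemic, aleatoric


# Toy training data on [0, 1]: y = 1 + 2x plus a small alternating perturbation.
xs = [i / 10 for i in range(11)]
ys = [1.0 + 2.0 * x + (0.05 if i % 2 == 0 else -0.05) for i, x in enumerate(xs)]

mean, cov = fit_posterior(xs, ys)
mu_in, epi_in, ale_in = predict_with_uncertainty(0.5, mean, cov)    # inside the data
mu_out, epi_out, ale_out = predict_with_uncertainty(5.0, mean, cov)  # far outside

print(f"x=0.5: mean={mu_in:.3f}, model var={epi_in:.5f}, data var={ale_in:.5f}")
print(f"x=5.0: mean={mu_out:.3f}, model var={epi_out:.5f}, data var={ale_out:.5f}")
```

The model variance grows sharply for the input far from the training range, while the data variance stays constant. Thresholding the model variance is one simple way to flag outliers or to pick points for active learning. For deep networks this posterior is no longer available in closed form, which is where the intractable inference mentioned above comes from.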

Professor Fabio Roli from the University of Cagliari gave an amazing lecture about adversarial attacks, proving how important it is to train ML models with an awareness of them. A somewhat similar, though much shorter, presentation can be found here. Professor Roli introduced a whole adversarial framework, both describing the most important attacks from a mathematical point of view and giving amazing examples like:

One thing we can be certain of: Machine Learning Security is going to be a hot topic.
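As an illustration of how little it can take to fool a model, below is a minimal sketch of a gradient-sign attack (in the spirit of FGSM) on a logistic regression classifier. This is our own toy example, not material from the lecture; the weights `W`, the bias `B` and the input are hypothetical values.

```python
import math

# Hypothetical "trained" logistic regression parameters (assumed for illustration).
W = [2.0, -3.0, 1.0]
B = 0.1


def prob_positive(x):
    """Probability of class 1 under the logistic regression model."""
    z = sum(w * xi for w, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def gradient_sign_attack(x, y, eps):
    """Move each feature by eps in the direction that increases the loss."""
    p = prob_positive(x)
    # Gradient of the cross-entropy loss with respect to the input x.
    grad = [(p - y) * w for w in W]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


x = [0.5, -0.2, 0.3]   # clean input, true label 1
x_adv = gradient_sign_attack(x, y=1, eps=0.4)

print(f"clean:       p(class 1) = {prob_positive(x):.3f}")
print(f"adversarial: p(class 1) = {prob_positive(x_adv):.3f}")
```

Each feature changes by at most `eps`, yet the predicted class flips from 1 to 0. Training with such perturbed examples in the loop is one common way of building the awareness mentioned above into a model.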

Gaël Varoquaux from INRIA pointed out that data scientists struggle with problems where there is not a lot of data, and presented a bag of tricks for dealing with them. There was quite a funny example related to ‘Scalable and accurate deep learning with electronic health records’. Despite the uncontested reputation of the authors and a huge amount of data (about 200 000 examples), their Deep Learning solution gains only about 2% AUC-ROC over the baseline, which is a logistic regression 🙂
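For readers less familiar with the metric: AUC-ROC can be read as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. A minimal sketch of that definition, with toy labels and scores that are made up and unrelated to the paper:

```python
def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return correct / (len(pos) * len(neg))


# Toy data: two negatives and two positives, one positive ranked too low.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc_roc(toy_labels, toy_scores))  # 3 of 4 pairs are ordered correctly
```

The pairwise loop is quadratic in the number of examples, so it is only meant to show the definition; on a dataset of the size mentioned above a rank-based implementation would be the practical choice.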

There was a lot of material on generalization error, matrix factorization and Fisher kernels, all served with plenty of intuition. One really nice example involved Gamma-Poisson factorization, which gives positive and sparse representations of professions that are highly interpretable.

James Kwok from Hong Kong University of Science and Technology had one purpose: to show all the techniques for making Neural Networks smaller and faster! Some of the same material, though far less of it, can be found here. There are many ways of achieving this: network sparsification, quantization, low-rank approximation, distillation. The number of methods is just overwhelming. Whether the purpose is to put a Neural Network on a mobile phone or to make Distributed Learning more effective (saving bandwidth), the topic is really hot for both academia and industry.
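Two of the techniques listed above, sparsification and quantization, are easy to sketch on a plain list of weights. The snippet below is our own simplified illustration (magnitude pruning followed by symmetric 8-bit quantization), not code from the course; the weight values and the 50% sparsity level are arbitrary assumptions.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    # Indices of the k weights closest to zero.
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(by_magnitude[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric quantization: map floats to integers in [-127, 127]."""
    # Fall back to a scale of 1.0 if every weight is zero.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


weights = [0.82, -0.03, 0.45, 0.01, -0.67, 0.002, 0.25, -0.9]  # toy values

pruned = magnitude_prune(weights, sparsity=0.5)
quantized, scale = quantize_int8(pruned)
restored = dequantize(quantized, scale)

print("pruned:   ", pruned)
print("quantized:", quantized)
print("max error:", max(abs(a - b) for a, b in zip(pruned, restored)))
```

Half of the weights become exact zeros, which can be stored sparsely, and the rest fit into one byte each instead of four, at the cost of a rounding error of at most half the quantization step. In practice both steps are usually followed by fine-tuning to recover accuracy.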

Ming-Hsuan Yang from the University of California shared many Computer Vision anecdotes and presented Object Tracking problems, from the fundamentals to current Deep Learning solutions. We started with the evaluation protocols for object tracking and some classical methods like Adaptive Correlation Filters, and finished with a review of the newest solutions based on Deep Learning. The last work shown, ‘Fast and Accurate Online Video Object Segmentation via Tracking Parts’, can be summarized as track, then segment, then recognize, and then reconstruct. In short, we are going beyond bounding boxes and are on the way toward Cognitive Visual Tracking.
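The evaluation protocols themselves are not reproduced here, but a common building block of bounding-box tracking benchmarks is the overlap (intersection over union, IoU) between the predicted and the ground-truth box in each frame. A minimal sketch, assuming boxes given as `(x, y, width, height)` and a hypothetical success threshold of 0.5:

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_rate(predicted, ground_truth, threshold=0.5):
    """Fraction of frames whose overlap reaches the threshold."""
    hits = sum(iou(p, g) >= threshold for p, g in zip(predicted, ground_truth))
    return hits / len(predicted)


# Toy sequence of three frames: the tracker drifts away from a static target.
truth_boxes = [(0, 0, 10, 10)] * 3
tracked_boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (5, 5, 10, 10)]

for pred, truth in zip(tracked_boxes, truth_boxes):
    print(f"IoU = {iou(pred, truth):.3f}")
print("success rate:", success_rate(tracked_boxes, truth_boxes))
```

Sweeping the threshold from 0 to 1 and plotting the success rate gives the kind of curve often used to compare trackers. Segmentation-based methods like the one above need mask-level rather than box-level overlap, which is one reason for going beyond bounding boxes.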

See you next year! 🙂

Grzegorz Gwardys leads the Data Science / Computer Vision team at Promity, which is responsible for developing projects related to artificial intelligence and machine learning. He also develops a face recognition system and has co-created Big Data solutions for Promity.