Last month, I went to Long Beach, California to attend the 36th International Conference on Machine Learning (ICML). Having attended several academic conferences during graduate school in neuroscience (SfN) and biophysics (BPS), I was very excited to go to a comprehensive machine learning conference for the first time. Here are my key takeaways and favorite talks from ICML 2019.
Check my GitHub repo for detailed notes and my favorite papers and presentations: https://github.com/yangju2011/2019_ICML_note
Wide industry participation
Unlike other conferences I have attended in the past, ICML had surprisingly wide and active participation from industry, including Google, Facebook, Amazon, and J.P. Morgan. According to an analysis done by AndreasDoerr on Reddit, 77% of contributions came from academic affiliations and 23% from industrial affiliations, and three of the top four industry papers were from Google and its subsidiaries. Machine learning, as a fast-developing field, attracts professionals from many different research and application backgrounds: computer science, cognitive science, neuroscience, statistics, etc. The combination of theoretical study and business problems was very inspiring.
Bayesian optimization and AutoML
Bayesian optimization has become the new norm in model optimization and hyperparameter tuning. Compared to traditional grid search, Bayesian optimization uses the results of previous function evaluations to select the next, most promising inputs. See my previous post on Bayesian Optimization.
Due to its high efficiency in black-box optimization, Bayesian optimization is being widely adopted by many companies including Google, Uber, Amazon, and Facebook.
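The core loop can be sketched in a few lines. The sketch below is a deliberately crude stand-in, assuming a made-up 1-D objective and a 1-nearest-neighbor lookup in place of a real surrogate model; an actual implementation (e.g., a Gaussian process surrogate with an expected-improvement acquisition function, as in libraries like scikit-optimize) is far more principled:

```python
import random

def objective(x):
    # Hypothetical black-box function to minimize (for illustration only).
    return (x - 0.3) ** 2

def toy_sequential_search(n_iter=30, seed=0):
    # Sketch of the sequential idea behind Bayesian optimization:
    # each new evaluation point is chosen using all previous results.
    # The "surrogate" here is a crude nearest-neighbor lookup; a real
    # implementation fits a Gaussian process to the history and
    # maximizes an acquisition function over candidates.
    rng = random.Random(seed)
    history = []  # (x, f(x)) pairs evaluated so far
    for _ in range(n_iter):
        candidates = [rng.random() for _ in range(20)]
        if history:
            def surrogate(x):
                # Predict a candidate's value from the nearest
                # already-evaluated point.
                return min(history, key=lambda h: abs(h[0] - x))[1]
            x_next = min(candidates, key=surrogate)
        else:
            x_next = candidates[0]
        history.append((x_next, objective(x_next)))
    return min(history, key=lambda h: h[1])

best_x, best_y = toy_sequential_search()
```

The key contrast with grid search is that each new evaluation point depends on everything observed so far, so the search concentrates evaluations around promising regions instead of spreading them uniformly over a fixed grid.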
The 6th ICML Workshop on Automated Machine Learning (https://sites.google.com/view/automl2019icml/) and the Algorithm Configuration workshop (http://ml.informatik.uni-freiburg.de/~hutter/ICML19_AC.pdf) presented the latest research on and applications of Bayesian optimization in AutoML and algorithm design.
I particularly liked the Q&A session at the end of the AutoML workshop, where panelists discussed important issues such as whether AutoML will replace human machine learning workers, and the carbon footprint of AutoML.
Deep learning vs. non-deep learning
The majority of the posters and talks at ICML were related to deep learning with neural networks. Deep learning has been one of the hottest research frontiers and commercial keywords, and has shown great value and unique advantages in learning from unstructured data such as images, text, and audio, as well as in feature representation. On the other hand, traditional machine learning algorithms (aka non-deep learning) such as linear regression, random forest, and k-means clustering seemed less prominent at ICML. There were a few sessions on supervised and unsupervised learning, which I personally found the most relevant to my work from the perspective of immediate industry adoption.
Learn to learn
One phrase that was constantly mentioned across several talks was “learn to learn”. I also have a post with the same title, “Learn to learn: Hyperparameter Tuning and Bayesian Optimization“. The concept of “learn to learn” originates from the goal of building a general learning agent that is not only able to learn a specific task, but also able to generalize and perform multiple tasks. The Never-Ending Learning tutorial (https://sites.google.com/site/neltutorialicml19/) and the Meta-Learning tutorial (https://drive.google.com/file/d/1DuHyotdwEAEhmuHQWwRosdiVBVGm8uYx/view) presented a comprehensive overview of the field and showed exciting prospects. In addition, the invited talk “What 4 year olds can do and AI can’t (yet)” by professor Alison Gopnik revealed intriguing discoveries about how children learn differently from adults, and from AI.
In Data Shapley (https://arxiv.org/abs/1904.02868), Ghorbani and Zou proposed the data Shapley value as a metric to quantify the value of each training datum to the predictor, addressing data valuation in the context of supervised machine learning. Data Shapley values also inform what type of new data to acquire to improve the predictor.
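To make the idea concrete, here is a minimal Monte Carlo sketch of a Shapley estimate on a made-up 1-D toy dataset, with a 1-nearest-neighbor classifier standing in for the predictor (the dataset, model, and helper names are all illustrative assumptions, not the authors' implementation):

```python
import random

def accuracy(train, test):
    # 1-nearest-neighbor classifier as a stand-in predictor
    # (toy choice; any learning algorithm could be plugged in).
    if not train:
        return 0.0
    correct = 0
    for x, y in test:
        pred = min(train, key=lambda t: abs(t[0] - x))[1]
        correct += pred == y
    return correct / len(test)

def monte_carlo_shapley(train, test, n_perm=200, seed=0):
    # Monte Carlo estimate of each training point's Shapley value:
    # its average marginal contribution to test accuracy over
    # random orderings of the training set.
    rng = random.Random(seed)
    values = [0.0] * len(train)
    for _ in range(n_perm):
        order = list(range(len(train)))
        rng.shuffle(order)
        subset = []
        prev = accuracy(subset, test)
        for i in order:
            subset.append(train[i])
            curr = accuracy(subset, test)
            values[i] += curr - prev
            prev = curr
    return [v / n_perm for v in values]

# Tiny 1-D dataset: three clean points and one mislabeled point (index 3).
train = [(0.0, 0), (0.2, 0), (1.0, 1), (0.8, 0)]
test = [(0.1, 0), (0.95, 1), (0.75, 1)]
vals = monte_carlo_shapley(train, test)
```

The mislabeled point ends up with the lowest (typically negative) estimated value, which is exactly the kind of signal that flags low-quality data worth removing or relabeling.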
In Scalable Fair Clustering (https://arxiv.org/abs/1902.03519), Backurs et al. modified the fair k-clustering algorithm first proposed by Chierichetti et al. to allow for finer control over the balance of resulting clusters.
I always enjoy learning how machine learning is applied in different fields to solve real-world problems. Here are some of my favorite papers in the biomedical and healthcare field.
In Direct Uncertainty Prediction for Medical Second Opinions (https://arxiv.org/abs/1807.01771), Maithra Raghu et al. trained a model to predict an uncertainty score of doctor disagreement directly from raw patient features in a large-scale medical imaging application.
In Dynamic Measurement Scheduling for Event Forecasting using Deep RL (https://arxiv.org/abs/1901.09699), Chang et al. used deep reinforcement learning to decide what should be measured, and when, in order to forecast detrimental events.
Last but not least, I was very excited to see Tran et al. apply an autoencoder to represent the chemical space of odorants in DeepNose: Using artificial neural networks to represent the space of odorants (https://www.biorxiv.org/content/10.1101/464735v1). I used to study olfactory learning in Drosophila. Since there are numerous odor molecules but only a limited number of odorant receptors, odor representation has been a challenging problem in neuroscience. Using an autoencoder to perform dimensionality reduction and clustering provides great insight into how humans perceive a high-dimensional world with limited receptor capacity.
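As a rough illustration of the dimensionality-reduction idea, here is a toy linear autoencoder on made-up 3-D "odorant" vectors, trained with plain SGD; this is only a sketch of the principle, not the actual DeepNose architecture:

```python
# Toy linear autoencoder compressing 3-D vectors into one latent
# dimension. The data are synthetic and chosen to be perfectly
# compressible (all points lie on a single ray in 3-D space).
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_autoencoder(data, lr=0.01, epochs=200):
    # Encoder and decoder are each a single weight vector:
    # z = w_e . x (encode), x_hat = z * w_d (decode).
    w_e = [0.1, 0.1, 0.1]
    w_d = [0.1, 0.1, 0.1]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in data:
            z = dot(w_e, x)
            x_hat = [z * w for w in w_d]
            err = [xh - xi for xh, xi in zip(x_hat, x)]
            total += sum(e * e for e in err)
            # Gradient descent on the squared reconstruction error.
            dz = 2 * dot(err, w_d)
            w_d = [w - lr * 2 * e * z for w, e in zip(w_d, err)]
            w_e = [w - lr * dz * xi for w, xi in zip(w_e, x)]
        losses.append(total / len(data))
    return w_e, w_d, losses

# Points on a 1-D ray in 3-D space, so one latent dimension suffices.
data = [[t / 10, 2 * t / 10, 3 * t / 10] for t in range(1, 11)]
w_e, w_d, losses = train_autoencoder(data)
```

Because the toy data lie on a single ray, one latent dimension can capture them and the reconstruction loss falls as training proceeds; DeepNose applies the same principle with nonlinear networks to real molecular features, where the low-dimensional structure is far less obvious.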