

Why you should care about ICML
ICML is one of the top academic conferences on machine learning and, as such, one of the biggest events on the subject. The MFG Labs data team was part of the show as a gold sponsor of the conference. It was a great opportunity for us to meet fellow data geeks, talk with academics and get a feel for what’s coming next in the field.
But first, you may wonder what ICML is and why you should care about it. To answer this question, we will first look at what an academic conference is and what’s at stake for academics at such an event. Then we’ll see how these events allow companies and academics to meet and form tight relationships. Finally, we’ll briefly discuss one hot topic of this conference, a consequence of the tight relationships between companies and academia: the fear of a new AI winter.
What’s an academic conference like?
Say you are a scientist, working with your team, in your lab, on a cool research project. After some time and a bit of (actually a hell of a lot of) trial and error, you find a solution to the problem you were trying to solve. This solution is often a proof of some theorem or the experimental confirmation of a phenomenon. Now you want to share it with the research community so it can be validated (peer review) and other scientists can build upon it. This is probably the most essential rule of academia: any piece of scientific work must be validated by peers. With the growing number of scientists in the world, peer reviewing every contribution (the outcome of a successful research project) is a challenge. Handling that review is the first role of journals and conferences. The second is to facilitate open research: they give scientists a way to gather all contributions on the same topic, so they can keep up with the growing number of contributions made around the world. These two missions are absolutely essential, and they rest largely on academic journals and conferences. So how does it work?

Once a research project is successful, the next step is to write a paper that contains the context and the contribution. The context explains why the project was undertaken and why it matters. The contribution explains precisely the work produced during the project, so it can be understood and reproduced. Once this paper is written, it is submitted to a conference and/or a journal, so it can be shared, reviewed and validated by other scientists.
We’re talking about ICML so let’s focus on conferences.
Every conference runs on a fixed schedule of deadlines. It always starts with a call for papers (CFP) that details the acceptable domains and contributions for the conference. For instance, ICML is a big machine learning conference so the CFP is focused on machine learning. WWW is a conference on web data and is focused on the web rather than on machine learning. RecSys is a conference on recommender systems and is focused on, well, you guessed it, recommender systems. The CFP also defines the submission deadline. Here is, for example, the ICML 2015 CFP. Scientists have to submit their papers before this submission deadline. Once a paper is submitted, the reviewing begins. Other scientists from the same domain are randomly selected to review the paper. This review can be done with all names known (reviewers and authors), single blind (the authors’ names are visible, the reviewers are anonymous) or double blind (both authors and reviewers are anonymous). The reviewers’ job is to grade the paper and explain the grade. Papers are then ranked based on the grades and only the best papers are kept as part of the conference program. These are the accepted papers of the conference, the ones that will be presented during the talks. Here is the list of accepted papers of ICML 2015.
Now that the conference has a list of accepted papers, the only thing left is to attend it. Anyone can attend the conference (there’s often a fee) and listen to the talks, meet scientists, or, if you’re really lucky, talk to MFG Labs data scientists. This is why conferences are so vital for the academic world: they provide a systematic review of scientific contributions, they offer a place to meet fellow scientists and exchange ideas, and they gather all accepted papers in proceedings so other scientists can easily access them.
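The ranking step described above can be sketched in a few lines of Python. This is only an illustration, assuming each paper receives a few numeric grades and the program keeps a fixed number of papers with the highest mean grade. The function name `select_accepted`, the paper titles and the grades are all made up for the example, and a real program committee also weighs the explanations behind each grade.

```python
from statistics import mean


def select_accepted(grades_by_paper, n_accepted):
    """Rank papers by mean reviewer grade and keep the top n_accepted.

    Hypothetical helper: the fixed cutoff is an assumption for this sketch.
    """
    ranked = sorted(
        grades_by_paper,
        key=lambda title: mean(grades_by_paper[title]),
        reverse=True,
    )
    return ranked[:n_accepted]


# Made-up submissions, each with three reviewer grades on a 1-5 scale.
submissions = {
    "Paper A": [4, 5, 4],
    "Paper B": [2, 3, 2],
    "Paper C": [5, 4, 5],
    "Paper D": [3, 3, 4],
}

accepted = select_accepted(submissions, n_accepted=2)
print(accepted)
```

With these made-up grades, the two papers with the highest mean grade are kept and the others are rejected.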

Interactions between academia and companies
We explained why these conferences are so essential, and this is why so many scientists attend them. This makes them one of the best places for companies and academics to interact with and benefit from each other. The first benefit is that companies can sponsor conferences, which is vital to making the organisation of such conferences possible at all. Sponsors of ICML paid a fee to have a booth, so they helped fund the conference and had a nice spot in the main conference hall where they could greet fellow scientists and interact with them. We are proud sponsors of ICML and it is very important for us to be involved in academic research. In machine learning, this close relationship between companies and academics is quite important, and here is why.
There are roughly two types of contributions to machine learning: theoretical advances that focus on proofs of theorems, like generalization bounds or the convergence speed of algorithms, and experimental advances that apply known algorithms to new data and new contexts. Both benefit from each other: the former give insights into what to expect when pushing an algorithm to new data, the latter generate theoretical questions that have to be answered. Basically, companies help with the latter: they have new datasets and new challenges for the scientists. One perfect illustration is the Netflix Prize: the company provided a dataset (movie ratings), a task (recommend movies) and a financial reward for the winner. This challenge drove research on recommender systems further. Netflix benefited from the results as they could integrate the advances made by scientists into their own recommender system. Another benefit of such interactions is how well scientists from academia and the private sector complement each other: companies often hire scientists to work on projects they cannot or do not want to outsource to another company. This gives scientists an opportunity to work on different, real data and gives companies a way to involve experts in the field.
Now, that’s the brighter side of this interaction. Sadly enough, there is a darker side that was often discussed at ICML this year. There are two main impacts of the close relationship between academia and the private sector. The first one is that, in some domains (like deep neural networks), more and more scientists join private sector companies. As a result, reviewers in such domains are likely to belong to private companies. This raises a neutrality issue for the reviewing process in such domains: are these employees biased towards the results of their own company? The other impact is the so-called AI winter. AI winter is a reference to the 1970s, when funding for AI research was really low. This was a consequence of the overconfident predictions made in the 60s about the possibilities of AI. In those days, the first neural networks (actually only one neuron at the time!) were introduced, as well as many rule learning engines. Companies and scientists advertised their results with great confidence, planning to replace human intelligence in many domains… but they failed to deliver. As a result, investment in AI dropped drastically. Might history repeat itself (again)? Today, many companies base their business plans on overconfident claims about AI… We have certainly made a huge leap forward since the 60s (in hardware, software and theory) but still. The wound inflicted by the first AI winter is still fresh for many and the hype around machine learning is seen with both excitement and fear.

BYOC (bring your own coat)
Sure, a second AI winter is a risk. How do we avoid it? Our belief is that it can be avoided by focusing on the benefits of a close relationship between academia and companies while keeping a cool head about what can be done. There are many tasks that can be automated, yet we are still far from reproducing human intelligence. The programmable world is about providing the right information at the right time. This raises many difficult challenges around data analysis and processing and often generates new theoretical questions that drive research further. Academia provides strong knowledge of how machine learning behaves and helps us understand how and why a model fits a particular use. Clear communication between academia and the private sector is, for us, the warmest coat to protect ourselves from this winter.
Mickaël — @mpoussevin
You should follow us on Twitter: @mfg_labs