As more professionals clamor to enter the fields of data science and machine learning, a new generation of educational institutions has arrived on the scene. Business schools offer new executive education programs, and CS departments offer specialized programs in data science. We’re also seeing a surge of organizations offering intensive training programs that run just a few weeks, and these bootcamps are gaining popularity. Having several options is great, but it can often be confusing to decide which direction to follow. One must ask the question: “Which program is right for me?”
We feel that a sampling approach is a great way to discover which program is the best fit for each individual. This April, MLconf NYC attendees will have the opportunity to experience a demo class from METIS. Mike Galvin will present “An introduction to word2vec and working with text data”. In this session, Galvin will introduce the basics of working with textual data using various Python libraries, along with several examples of real-world applications. The class will start by introducing the basic bag-of-words representation and move on to other models, with an emphasis on word2vec. By the end of this talk, participants will be able to use Python to explore and build their own models on text data.
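To get a feel for where the class starts, the bag-of-words representation can be sketched in a few lines of plain Python. This is our own minimal illustration, not METIS course code — it simply counts word occurrences, which is the core idea behind bag-of-words:

```python
from collections import Counter

def bag_of_words(text):
    """Lowercase, tokenize on whitespace, and count word occurrences."""
    tokens = text.lower().split()
    return Counter(tokens)

doc = "The quick brown fox jumps over the lazy dog the fox"
bow = bag_of_words(doc)
print(bow["the"])  # 3
print(bow["fox"])  # 2
```

Models like word2vec go a step further by replacing these sparse counts with dense vectors learned from word context, which is where the session heads next.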
If you are attending MLconf, sign up here to reserve your spot!
MLconf will be adding a training session soon! Stay tuned for more details!
-Nik Vasiloglou, Technical Chair, MLconf
MLconf NYC – Hotel Partnership
Good news! We have partnered with St Giles to offer MLconf NYC event attendees a special discounted rate on hotel rooms this April.
Please make sure to mention “The Machine Learning Conference” in order to receive the negotiated rate. Reservations can be made by calling the in-house Accommodations Team at 1-888-406-8588 between the hours of 9 am – 6 pm Monday – Friday or 10 am – 6 pm on Saturday or Sunday.
Deadline for Reservations: 03/25/2016
Dates Available: April 14 – 16, 2016
Price: $259.00 per night for a Deluxe Studio with one king bed (wifi included). *Room rates are subject to 14.75% New York City and state taxes plus a $3.50 occupancy tax per room, per night.
You can see their rooms and find more information about their properties on their website: http://www.stgiles.com/new-york-hotels/
MLconf NYC- What to Expect This April!
This is the 3rd MLconf in NYC, and we are very excited to see the audience growing! This year, companies like Maluuba will present applications of deep learning in machine reading. On the same wavelength, Baidu will present their new open source platform for deep learning in speech recognition. Kilian Weinberger from Cornell is going to impress us with his recent “deep manifold traversal” algorithm, which turns deep learning inside out and reveals all its secrets; get ready for a very impressive demo! Google and Facebook will defend their choices to pursue deep learning with TensorFlow and Torch, while Mathias Brandewinder will show us the advantages of functional programming in machine learning development from the Microsoft stack point of view. But machine learning is not only deep learning: Samantha Kleinberg will guide us through her new book about causality, while Yael Elmatad and Damien Lefortier, from Tapad and Criteo, are here to remind us that online advertising, one of the oldest success stories of machine learning, can still provide us with useful algorithms, practices, and insights. We are very happy to see Edo Liberty back at MLconf, explaining algorithms to the audience in his own unique way. Furong Huang, the winner of the MLconf Industry Impact Student Research Award, sponsored by Google, will introduce us to the amazing world of tensors and how we can use them in practice.
Every year we try to introduce a novel feature at MLconf in an effort to make your experience better. Last year we introduced the MLQuiz, which aims to help you improve your comprehension of the content. Machine learning bootcamps and training programs are growing, and many data scientists are showing a lot of interest, so this year METIS will demo a class during the afternoon break. We hope you find it valuable and enlightening. Additionally, Welch Labs will premiere their new series of videos, which have been embraced with enthusiasm at past conferences.
Thanks for reading!
Machine Learning “Flows” at MLconf
There is always a mini session at MLconf about machine learning tools and software. This time, the plan was not to focus on platforms, but Google’s open sourcing of TensorFlow a week before MLconf SF 2015 inevitably brought the topic onto the agenda. Speakers have the flexibility to adjust their talks, and they took advantage of it. The keynote speaker, Alex Smola, presented on scalable methods in machine learning with his Parameter Server and MXNet. Comparisons and benchmarks against TensorFlow were inevitable, and one of the first at-scale experiments pitting it against its competitors, Theano and Torch, sparked discussion during the break.
But is it really about performance? Quoc Le, one of Google’s pioneers in deep learning, presented his new deep learning dialogue system, built entirely on TensorFlow. We may not know how long it took him to train it, but we do know that it didn’t take long to develop once he figured out the math!
Ufora, a NYC-based startup, took a different angle on scaling numerical computations in a more systems-agnostic way, presenting a new language paradigm that can create a supercomputer for $10 per hour!
Five months later, the conversation about the future of machine learning platforms that are not only fast but also agile and productive is heating up. With more platforms and more advances, it is time for MLconf to hold a special session on the platforms the giants are using. Join us to hear about the most recent developments in next-generation ML platforms at Facebook, Google, Samsung, and Microsoft. Mention “Flows15” during registration and save 15% on tickets to MLconf NYC this April!
MLconf Industry Impact Student Research Award Winners
This year, we started a new award program called the MLconf Industry Impact Student Research Award, which is sponsored by Google. This fall, our committee of distinguished ML professionals reviewed several nominations sent in by members of the MLconf community. Several great researchers were nominated, and the committee settled on two students whose work, they believe, has the potential to disrupt the industry in the future. The two winners, announced at MLconf SF 2015, are UC Irvine student Furong Huang and UC Berkeley student Virginia Smith. Below are summaries of their research. We’ve invited both researchers to present their work at upcoming MLconf events.
Furong Huang, Research Assistant, UC Irvine
Furong Huang’s research has focused on tensor factorization models and their application to unsupervised learning. Three key observations make her research successful. The first is the information richness of higher-order moment tensors: with matrices it was only possible to look at second-order moments, while tensors make it possible to look at higher-order moments. The second is that the elements of those decomposed tensors estimate parameters for many hard machine learning problems, such as mixtures of experts, latent variable models, and multi-layer networks, so tensors can serve as a unified learning platform. The third is that although tensor decomposition is a non-convex optimization problem, all of its local minima can be equivalent. This is a key advantage of the whole framework, as consistent results are very important in industry; it makes debugging easy. Another reason we believe the tensor learning framework has the potential to grow is its simplicity and scalability: the framework converts a complex high-dimensional learning problem into a simple tensor decomposition problem in a low-dimensional, compact space, and is therefore scalable to high-dimensional data. Finally, we should point out that mapping the problem to high-level BLAS linear algebra operations makes tensor learning extremely fast with GPUs and other optimized linear algebra HPC packages. Furong has gone the extra mile to implement her work on different platforms (GPU, Apache Spark, and Hadoop MapReduce) that are widely accepted by industry.
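To make the first observation concrete, here is a tiny NumPy sketch (our own illustration, not Huang’s code) of why moment tensors carry more structure than matrices: the empirical second-order moment of a dataset is a d×d matrix, while the third-order moment is a d×d×d tensor with many more entries to encode richer statistics:

```python
import numpy as np

# Toy data: n samples in d dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))  # n=1000, d=3

# Second-order moment: the d x d matrix E[x ⊗ x], estimated empirically.
M2 = np.einsum('ni,nj->ij', X, X) / len(X)

# Third-order moment: the d x d x d tensor E[x ⊗ x ⊗ x].
M3 = np.einsum('ni,nj,nk->ijk', X, X, X) / len(X)

print(M2.shape)  # (3, 3)
print(M3.shape)  # (3, 3, 3)
```

In tensor methods of the kind described above, decomposing such higher-order moment tensors into rank-one components is what recovers the parameters of the latent variable model.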
Virginia Smith, Researcher, UC Berkeley
Virginia Smith’s research focuses on distributed optimization for large-scale machine learning. The main challenge in many large-scale machine learning tasks is to solve an optimization objective involving data that is distributed across multiple machines. In this setting, optimization methods that work well on single machines must be re-designed to leverage parallel computation while reducing communication costs. This requires developing new distributed optimization methods with both competitive practical performance and strong theoretical convergence guarantees. Virginia’s work aims to determine policies for distributed computation that meet these requirements, in particular through the development of a novel primal-dual framework, CoCoA, which is implemented on Spark. The theoretical and practical development of CoCoA is an important step for future data scientists hoping to deploy efficient large-scale machine learning algorithms.
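The communication/computation trade-off at the heart of this line of work can be illustrated with a toy simulation. The sketch below is our own generic local-update-and-average pattern for least squares, not the CoCoA algorithm itself: each simulated machine takes several cheap gradient steps on its own data shard, and machines communicate only once per round by averaging their models:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, K = 5, 400, 4                 # features, samples, simulated machines
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true                      # noiseless least-squares problem

shards = np.array_split(np.arange(n), K)   # partition the data across machines
w = np.zeros(d)
for _ in range(50):                 # 50 communication rounds
    local_models = []
    for idx in shards:
        Xi, yi = X[idx], y[idx]
        wi = w.copy()
        for _ in range(10):         # cheap local steps, no communication
            wi -= 0.1 * Xi.T @ (Xi @ wi - yi) / len(idx)
        local_models.append(wi)
    w = np.mean(local_models, axis=0)      # one communication per round

print(np.linalg.norm(w - w_true))   # small: the averaged model recovers w_true
```

Doing more local work per communication round is exactly the lever that frameworks like CoCoA tune; CoCoA itself does so in a primal-dual formulation with convergence guarantees rather than the naive averaging shown here.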
MLconf SF, In Summary
This MLconf was our largest event yet! In an effort to keep the intimate feel of MLconf, we’ve opted to limit attendance to 500 or fewer for each event. So far, we’ve found this preserves the personality and experience we’ve been hoping for. Overall, I have to say this was the most successful MLconf so far. In terms of organization, things seemed to flow smoothly. Audio/visual worked surprisingly well, including the streaming of the event. I remember that last year, five minutes before the morning keynote, all the electronics were down! We were grateful not to have that stress this year. We did experience one unexpected glitch during lunch: the venue did not have enough waiters to cater to the crowd, and we, the organizers, had to rush into the kitchen and start serving food! I can’t complain; it was a fun experience. All hands on deck!
The morning track had a theme of deep learning and TensorFlow. Alex Smola, CEO at Marianas Labs, gave an impressive keynote with tricks, algorithms, and facts about mainstream algorithms, and of course about deep learning. The second talk, by Braxton McKee, CEO at Ufora, offered a different angle on parallelization and scaling. Taking a programming language/compiler approach, he emphasized an automatic way in which details like cache efficiency are handled under the hood by algorithms that optimize data distribution. Ufora chose MLconf to announce the open sourcing of their platform. Taking a small break from platforms, Isabelle Guyon, President at ChaLearn, emphasized the problem of inferring causal factors in everyday data science. The competition she ran revealed an interesting approach, discovered by someone who happened to be in the audience! The first session ended with the room packed; it was impossible to find even a spot to stand.
Right after the break we announced the winners of the MLconf Industry Impact Student Research Award, sponsored by Google: Furong Huang, a student at UC Irvine, and Virginia Smith, a student at UC Berkeley. You can find out more about their work in our blog post. Following brief summaries of their work, we introduced Quoc Le, Software Engineer at Google. Quoc presented on the recent advances in deep learning at Google, which raised philosophical questions about the meaning of life!
Irina Rish, Research Staff at IBM Watson, presented on the brain, schizophrenia, fMRI, and EKG while demonstrating a live, stylish EKG sensor. It was the best demonstration of sparse learning in action. Following Irina, Alison Gilmore, Data Scientist at Ayasdi, presented on the fascinating world of topological learning and showed how it can be applied to data analysis. As she put it, it’s all about finding the shape of the data. Just before lunch, Subutai Ahmad, VP of Research at Numenta, announced the open sourcing of their anomaly detection for streaming data.
During the break, a loop of videos by Welch Labs presented the fundamentals of neural networks. Welch Labs offers a very nice set of machine learning videos in a user-friendly style. It is not education, it is edutainment!
Following lunch, MLconf veteran speaker Xavier Amatriain, VP of Engineering at Quora, gave us another 10 lessons he learned from machine learning. Last year he gave us his first 10 lessons, and both of his talks turned out to be total crowd favorites! Another presentation of lessons learned followed Xavier’s, from Ben Hamner, CTO at Kaggle. Apparently this was the lessons session, as it also included Justin Basilico, Research/Engineering Manager at Netflix, who spoke about the lessons he learned from recommender systems. The session ended with a different type of recommendation: Brad Klingenberg, Director of Styling Algorithms at Stitch Fix, reminded us that the human element is very important in fashion recommendations and still more reliable than the machine.
Anima Anandkumar, from UC Irvine, presented on the application of tensors in a practical model. It is amazing what this simple model can do, and guaranteeing global optimality is a big plus. Following Anima, Alessandro Magnani, Data Scientist at Walmart Labs, presented a problem that is becoming hot these days: recommending items with a short lifecycle! The last two talks, by Narayanan Sundaram, Research Scientist at Intel, and Melanie Warrick, Deep Learning Engineer at Skymind, kept attendance very high even though it was already late. Narayanan’s talk was the only one this time about graphs; the platform they are developing at Intel seems promising and very fast. Melanie closed the conference with a favorite subject, attention models. It was a nice presentation on a subject that Quoc had mentioned briefly in the morning.
We want to thank the speakers for devoting considerable time to preparing and presenting their material. If you want to get a taste of MLconf and test how well you understood the content, feel free to take our quiz. Video footage and slides from all the talks can be found on the event page.