Anima Anandkumar, Principal Scientist, Amazon Web Services, Endowed Professor, Caltech
Anima Anandkumar is a principal scientist at Amazon Web Services and a Bren professor in the CMS department at Caltech. Her research interests are in the areas of large-scale machine learning, non-convex optimization, and high-dimensional statistics. In particular, she has been spearheading the development and analysis of tensor algorithms. She is the recipient of several awards, such as the Alfred P. Sloan Fellowship, the Microsoft Faculty Fellowship, a Google research award, ARO and AFOSR Young Investigator Awards, an NSF CAREER Award, the Early Career Excellence in Research Award at UCI, the Best Thesis Award from the ACM SIGMETRICS society, the IBM Fran Allen PhD Fellowship, and several best-paper awards. She has been featured in a number of forums, such as YourStory, a Quora ML session, and O’Reilly Media. She received her B.Tech in Electrical Engineering from IIT Madras in 2004 and her PhD from Cornell University in 2009. She was a postdoctoral researcher at MIT from 2009 to 2010, an assistant professor at U.C. Irvine between 2010 and 2016, and a visiting researcher at Microsoft Research New England in 2012 and 2014.
Large-scale Machine Learning: Deep, Distributed and Multi-Dimensional:
Modern machine learning involves deep neural network architectures that yield state-of-the-art performance in multiple domains, such as computer vision, natural language processing, and speech recognition. As data and models scale, it becomes necessary to have multiple processing units for both training and inference. Apache MXNet is an open-source framework developed for distributed deep learning. I will describe the underlying lightweight hierarchical parameter server architecture that results in high efficiency in distributed settings.
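The data-parallel pattern behind a parameter server can be sketched in a few lines. This is a hedged, illustrative toy (synchronous updates, a one-parameter objective, invented function names), not MXNet’s actual kvstore API:

```python
# Toy synchronous data-parallel SGD with a central parameter server:
# workers push gradients computed on their own data shard; the server
# averages them and updates the shared parameter.

def worker_gradient(theta, shard):
    """Gradient of the mean squared error 0.5 * (theta - x)^2 over a shard."""
    return sum(theta - x for x in shard) / len(shard)

def server_update(theta, grads, lr=0.5):
    """The server averages the pushed gradients and applies one SGD step."""
    return theta - lr * sum(grads) / len(grads)

# Two workers, each holding one shard of the data.
shards = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]
theta = 0.0
for _ in range(50):
    grads = [worker_gradient(theta, s) for s in shards]
    theta = server_update(theta, grads)

print(round(theta, 3))  # converges to the global mean, 4.0
```

A hierarchical parameter server of the kind described in the talk adds intermediate aggregation levels (e.g. per machine, then across machines) so that gradient traffic does not all converge on a single node.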
Pushing the current boundaries of deep learning requires using multiple dimensions and modalities. These can be encoded into tensors, which are natural extensions of matrices. We present new deep learning architectures that preserve the multi-dimensional information in data end-to-end. We show that tensor contractions and regression layers are an effective replacement for fully connected layers in deep learning architectures. They result in significant space savings with negligible performance degradation. These functionalities are available in the Tensorly package with MXNet backend interface for large-scale efficient learning.
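To make the space savings concrete, here is a minimal pure-Python sketch of a tensor contraction: instead of flattening an H x W activation map and applying a dense weight over all H*W entries, each mode is contracted with a small factor matrix, preserving the multi-dimensional structure. The shapes and factor matrices below are invented for illustration; TensorLy provides optimized, trainable versions of such layers:

```python
# Contract a 3 x 4 activation "tensor" down to a 2 x 2 core by
# multiplying a small factor matrix along each mode.

def mode1_contract(T, U):
    """Contract mode 1 of a 2-way tensor T (H x W) with U (R1 x H) -> R1 x W."""
    H, W = len(T), len(T[0])
    return [[sum(U[r][h] * T[h][w] for h in range(H)) for w in range(W)]
            for r in range(len(U))]

def mode2_contract(T, V):
    """Contract mode 2 of a 2-way tensor T (H x W) with V (R2 x W) -> H x R2."""
    H, W = len(T), len(T[0])
    return [[sum(V[r][w] * T[h][w] for w in range(W)) for r in range(len(V))]
            for h in range(H)]

T = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
U = [[1, 0, 0], [0, 1, 0]]          # toy factor: picks rows 0 and 1
V = [[1, 0, 0, 0], [0, 0, 0, 1]]    # toy factor: picks columns 0 and 3

core = mode2_contract(mode1_contract(T, U), V)
print(core)  # [[1, 4], [5, 8]]
# Parameter count: 2*3 + 2*4 = 14, versus (3*4)*(2*2) = 48 for a dense
# layer mapping the flattened 12 inputs to the same 4 outputs.
```

In a trained network the factor matrices are learned rather than hand-picked, which is where the reported space savings with negligible performance degradation come from.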
Jonas Schneider, Head of Robotics, OpenAI
Jonas leads technology development for OpenAI’s robotics group, developing methods to apply machine learning and AI to robots. He also helped build the infrastructure to scale OpenAI’s distributed ML systems to thousands of machines.
Machine Learning Systems at Scale:
OpenAI is a non-profit research company, discovering and enacting the path to safe artificial general intelligence. As part of our work, we regularly push the limits of scalability in cutting-edge ML algorithms. We’ve found that in many cases, designing the systems we build around the core algorithms is as important as designing the algorithms themselves. This means that many systems engineering areas, such as distributed computing, networking, and orchestration, are crucial for machine learning to succeed on large problems requiring thousands of computers. As a result, at OpenAI engineers and researchers work closely together to build these large systems as opposed to a strict researcher/engineer split. In this talk, we will go over some of the lessons we’ve learned, and how they come together in the design and internals of our system for learning-based robotics research.
Doug Eck, Research Scientist, Google
Doug leads Magenta, a Google Brain project working to generate music, video, images, and text using deep learning and reinforcement learning. A main goal of Magenta is to better understand how AI can enable artists and musicians to express themselves in innovative ways. Before Magenta, Doug led the Google Play Music search and recommendation team. From 2003 to 2010, Doug was a faculty member at the University of Montreal’s MILA machine learning lab, where he worked on expressive music performance and automatic tagging of music audio.
The Role of AI and Machine Learning in Creativity:
I’ll discuss Magenta, a Google Brain project investigating music and art generation using deep learning and reinforcement learning. I’ll describe the goals of Magenta and how it fits into the general trend of AI moving into our daily lives. One crucial question is: where do AI and machine learning fit in the creative process? I’ll argue that it’s about augmenting and extending the artist rather than just creating artifacts (songs, paintings, etc.) with machines. I’ll talk about two recent projects. In the first, we explore the use of recurrent neural networks to extend musical phrases in different ways. In the second, we look at teaching a neural network to draw with strokes. This will be a high-level overview talk with no need for knowledge of AI or machine learning.
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Laboratories
Tamara G. Kolda is a member of the Data Science and Cyber Analytics Department at Sandia National Laboratories in Livermore, CA. Her research is generally in the area of computational science and data analysis, with specialties in multilinear algebra and tensor decompositions, graph models and algorithms, data mining, optimization, nonlinear solvers, parallel computing, and the design of scientific software. She has received a Presidential Early Career Award for Scientists and Engineers (PECASE) and been named a Distinguished Scientist of the Association for Computing Machinery (ACM) and a Fellow of the Society for Industrial and Applied Mathematics (SIAM). She has won an R&D 100 award and three best paper prizes at international conferences. She is currently a member of the SIAM Board of Trustees and serves as associate editor for both the SIAM J. Scientific Computing and the SIAM J. Matrix Analysis and Applications.
Tensor Decomposition: A Mathematical Tool for Data Analysis:
Tensors are multiway arrays, and tensor decompositions are powerful tools for data analysis. In this talk, we demonstrate the wide-ranging utility of the canonical polyadic (CP) tensor decomposition with examples in neuroscience and chemical detection. The CP model is extremely useful for interpretation, as we show with an example in neuroscience. However, it can be difficult to fit to real data for a variety of reasons. We present a novel randomized method for fitting the CP decomposition to dense data that is more scalable and robust than the standard techniques. We further consider the modeling assumptions for fitting tensor decompositions to data and explain alternative strategies for different statistical scenarios, resulting in a _generalized_ CP tensor decomposition.
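For readers new to the CP model, the standard alternating least squares (ALS) fitting procedure can be sketched for the rank-1 special case in pure Python. This is an illustrative toy on an exactly rank-1 tensor, not the randomized or generalized algorithms discussed in the talk:

```python
# Rank-1 CP via ALS: fit X ≈ a ∘ b ∘ c by cycling through the factors,
# solving the least-squares update for one factor with the others fixed.

def outer3(a, b, c):
    """Build the 3-way outer product tensor a ∘ b ∘ c as nested lists."""
    return [[[ai * bj * ck for ck in c] for bj in b] for ai in a]

def als_rank1(X, iters=25):
    I, J, K = len(X), len(X[0]), len(X[0][0])
    a, b, c = [1.0] * I, [1.0] * J, [1.0] * K
    for _ in range(iters):
        a = [sum(X[i][j][k] * b[j] * c[k] for j in range(J) for k in range(K))
             / (sum(v * v for v in b) * sum(v * v for v in c)) for i in range(I)]
        b = [sum(X[i][j][k] * a[i] * c[k] for i in range(I) for k in range(K))
             / (sum(v * v for v in a) * sum(v * v for v in c)) for j in range(J)]
        c = [sum(X[i][j][k] * a[i] * b[j] for i in range(I) for j in range(J))
             / (sum(v * v for v in a) * sum(v * v for v in b)) for k in range(K)]
    return a, b, c

# Recover the factors of an exactly rank-1 tensor (up to scaling).
X = outer3([1.0, 2.0], [3.0, 4.0], [5.0, 6.0])
a, b, c = als_rank1(X)
Xhat = outer3(a, b, c)
err = max(abs(X[i][j][k] - Xhat[i][j][k])
          for i in range(2) for j in range(2) for k in range(2))
print(err < 1e-8)  # True: the rank-1 model fits this tensor exactly
```

Real CP solvers handle higher ranks, noise, and missing data; the randomized variant in the talk replaces the full least-squares solves with sketched ones to scale to dense data.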
Xavier Amatriain, Cofounder & CTO, Curai
Xavier Amatriain is currently co-founder and CTO of Curai, a stealth startup trying to radically improve healthcare for patients by using AI. Prior to this, he was VP of Engineering at Quora and Research/Engineering Director at Netflix, where he led the team building the famous Netflix recommendation algorithms. Before moving into leadership positions in industry, Xavier was a research scientist at Telefonica Research and a research director at UCSB. With over 50 publications (and 3k+ citations) across different fields, Xavier is best known for his work on machine learning in general and recommender systems in particular. He has lectured at universities in both the US and Spain and is frequently invited to speak at conferences and companies.
ML to Cure the World:
The practice of medicine involves diagnosis, treatment, and prevention of diseases. Recent technological breakthroughs have made little dent in the centuries-old system of practicing medicine: complex diagnostic decisions still depend mostly on “educated” work-ups by doctors and rely on somewhat outdated tools and incomplete data. All of this often leads to imperfect, biased, and, at times, incorrect diagnoses and treatments.
With a growing research community as well as tech companies working on AI advances in medicine, the hope for a healthcare renaissance is definitely not lost. The emphasis of this talk will be on ML-driven medicine. We will discuss recent AI advancements for aiding medical decisions, including language understanding, medical knowledge base construction, and diagnosis systems. We will discuss the importance of personalized medicine that takes into account not only the user but also the context and other metadata. We will also highlight the challenges in designing ML-based medical systems that are accurate but at the same time engaging and trustworthy for the user.
Dr. Steve Liu, Chief Scientist, Tinder
Dr. Steve Liu is chief scientist at Tinder. In his role, he leads research innovation and applies novel technologies to new product developments.
He is currently a professor and William Dawson Scholar at the McGill University School of Computer Science. He has also served as a visiting research scientist at HP Labs. Dr. Liu has published more than 280 research papers in peer-reviewed international journals and conference proceedings. He has also authored and co-authored several books. Over the course of his career, his research has focused on big data, machine learning/AI, computing systems and networking, the Internet of Things, and more. His research has been referenced in articles published in The New York Times, IDG/Computer World, The Register, Business Insider, Huffington Post, CBC, NewScientist, MIT Technology Review, McGill Daily, and others. He is a recipient of the Outstanding Young Canadian Computer Science Researcher Prize from the Canadian Association of Computer Science and of the Tomlinson Scientist Award from McGill University.
He is serving or has served on the editorial boards of ACM Transactions on Cyber-Physical Systems (TCPS), IEEE/ACM Transactions on Networking (ToN), IEEE Transactions on Parallel and Distributed Systems (TPDS), IEEE Transactions on Vehicular Technology (TVT), and IEEE Communications Surveys and Tutorials (COMST). He has also served on the organizing committees of more than 38 major international conferences and workshops.
Dr. Liu received his Ph.D. in Computer Science with multiple honors from the University of Illinois at Urbana-Champaign. He received his Master’s degree in Automation and BSc degree in Mathematics from Tsinghua University.
Personalized User Recommendations at Tinder: The TinVec Approach:
With 26 million matches per day and more than 20 billion matches made to date, Tinder is the world’s most popular app for meeting new people. Our users swipe for a variety of purposes, like dating to find love, expanding social networks and meeting locals when traveling.
Recommendation is an important behind-the-scenes service at Tinder, and a good recommendation system needs to be personalized to meet an individual user’s preferences. In this talk, we will discuss a new personalized recommendation approach being developed at Tinder, called TinVec. TinVec embeds users’ preferences into vectors by leveraging the large number of swipes made by Tinder users. We will discuss the design, implementation, and evaluation of TinVec, as well as its application to recommendations at Tinder.
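TinVec’s internals are not public, but the general idea of swipe-derived preference vectors can be illustrated with a hedged sketch: represent a user’s taste as the average of the embeddings of profiles they swiped right on, then rank unseen profiles by cosine similarity. All names, dimensions, and numbers below are invented:

```python
# Toy preference-vector recommendation from swipe history.
import math

def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical 2-d profile embeddings; a real system would learn these
# from swipe co-occurrence rather than hand-assign them.
profiles = {"p1": [1.0, 0.1], "p2": [0.9, 0.2], "p3": [0.0, 1.0], "p4": [0.95, 0.0]}
right_swipes = ["p1", "p2"]

taste = mean_vector([profiles[p] for p in right_swipes])
candidates = [p for p in profiles if p not in right_swipes]
best = max(candidates, key=lambda p: cosine(taste, profiles[p]))
print(best)  # "p4": closest in direction to the profiles the user liked
```

The interesting engineering in a production system is in learning the profile embeddings from billions of swipes and serving nearest-neighbor queries at scale; the averaging-and-ranking step above is only the simplest possible read-out.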
Ted Willke, Sr. Principal Engineer, Intel
Ted Willke leads a team that researches large-scale machine learning and data mining techniques in Intel Labs. His research interests include parallel and distributed systems, image processing, machine learning, graph analytics, and cognitive neuroscience. Ted is also a co-principal investigator in a multi-year grand challenge project on real-time brain decoding with the Princeton Neuroscience Institute. Previously, he founded an Intel venture focused on graph analytics for data science that is now an Intel-supported open source project. In 2014, he won Intel’s highest award for this effort. In 2015, he was appointed to the Science & Technology Advisory Committee of the US Department of Homeland Security. Ted holds a doctorate in electrical engineering from Columbia University, a master’s from the University of Wisconsin-Madison, and a bachelor’s from the University of Illinois.
Matineh Shaker, Artificial Intelligence Scientist, Bonsai
Matineh Shaker is an Artificial Intelligence Scientist at Bonsai in Berkeley, CA, where she builds machine learning, reinforcement learning, and deep learning tools and algorithms for general-purpose intelligent systems. She was previously a Machine Learning Researcher at Geometric Intelligence, a Data Science Fellow at Insight Data Science, and a Predoctoral Fellow at Harvard Medical School. She received her PhD from Northeastern University with a dissertation on geometry-inspired manifold learning.
Deep Reinforcement Learning with Shallow Trees:
In this talk, I present Concept Network Reinforcement Learning (CNRL), developed at Bonsai. It is an industrially applicable approach to solving complex tasks using reinforcement learning, which facilitates problem decomposition, allows component reuse, and simplifies reward functions. Inspired by Sutton’s options framework, we introduce the notion of “Concept Networks” which are tree-like structures in which leaves are “sub-concepts” (sub-tasks), representing policies on a subset of state space. The parent (non-leaf) nodes are “Selectors”, containing policies on which sub-concept to choose from the child nodes, at each time during an episode. There will be a high-level overview on the reinforcement learning fundamentals at the beginning of the talk.
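As a hedged illustration of the tree structure (not Bonsai’s implementation), a selector choosing between two hand-coded sub-concepts in a toy corridor domain might look like:

```python
# Toy concept network: two leaf sub-concepts and one parent selector.
# In CNRL the selector and sub-concepts are learned policies; here they
# are hand-coded so the control flow is easy to see.

def reach_door(state):
    """Sub-concept: step toward the door at x = 5."""
    return 1

def open_door(state):
    """Sub-concept: act once the door has been reached."""
    return "open"

def selector(state):
    """Parent node: picks which child concept to run at this step."""
    return open_door if state["x"] >= 5 else reach_door

state = {"x": 0}
actions = []
for _ in range(10):
    concept = selector(state)      # selector chooses a sub-concept
    action = concept(state)        # the sub-concept's policy acts
    actions.append(action)
    if action == "open":
        break
    state["x"] += action

print(actions)  # [1, 1, 1, 1, 1, 'open']
```

The decomposition pays off because each leaf only needs a reward signal for its own sub-task, and a trained sub-concept can be reused under a different selector.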
Ashfaq Munshi, ML7 Fellow, Pepperdata
Before joining Pepperdata, Ash was executive chairman of Marianas Labs, a deep learning startup sold in December 2015. Prior to that, he was CEO of Graphite Systems, a big data storage startup that was sold to EMC DSSD in August 2015. Munshi has also served as CTO of Yahoo and as CEO of both public and private companies, and he is on the board of several technology startups.
Josh Wills, Head of Data Engineering, Slack
Josh Wills is the head of data engineering at Slack. Prior to Slack, he built and led data science teams at Cloudera and Google. He is the founder of the Apache Crunch project, co-authored an O’Reilly book on advanced analytics with Apache Spark, and wrote a popular tweet about data scientists.
Rushin Shah, Engineering Leader, Facebook
Rushin Shah is an engineering leader at Facebook, currently working on natural language understanding and dialog. Previously, he was at Siri at Apple for 5 years, where he built and headed the natural language understanding group, which included teams dedicated to modeling, engineering and data science. He also worked at the query understanding group at Yahoo. He has worked on a broad range of problems in the NLP area including parsing, information extraction, dialog and question answering. He holds degrees in language technologies and computer science from Carnegie Mellon and IIT Kharagpur.
Franziska Bell, Data Science Manager on the Platform Team, Uber
Franziska Bell is a Data Science Manager on the Platform Team at Uber and leads Applied Machine Learning, Forecasting Platform, Anomaly Detection, Customer Support Data Science, and Communications Platform Data Science.
Before Uber, Franziska was a postdoc at Caltech, where she developed a novel, highly accurate approximate quantum molecular dynamics theory to calculate chemical reactions for large, complex systems, such as enzymes. Franziska earned her Ph.D. in theoretical chemistry from UC Berkeley, focusing on developing highly accurate yet computationally efficient approaches, which helped unravel the mechanism of non-silicon-based solar cells and the properties of organic conductors.
Uncertainty Estimation in Neural Networks for Time Series Prediction at Uber:
Reliable uncertainty estimations for forecasts are critical in many fields, including finance, manufacturing, and meteorology.
At Uber, probabilistic time series forecasting is essential for accurate hardware capacity predictions, marketing spend allocations, and real-time system outage detection across millions of metrics. Classical time series models are often used in conjunction with a probabilistic formulation for uncertainty estimation. However, such models can be hard to tune, scale, and add exogenous variables to. Motivated by the recent resurgence of Long Short-Term Memory (LSTM) networks, we propose a novel end-to-end Bayesian deep model that provides time series prediction along with uncertainty estimation at scale.
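One common building block for uncertainty estimation in deep models is Monte Carlo dropout: keep dropout active at inference time, run many stochastic forward passes, and read the spread of the predictions as model uncertainty. The tiny one-layer “network” and its weights below are invented purely to show the mechanics:

```python
# Monte Carlo dropout sketch: repeated stochastic forward passes give a
# predictive distribution instead of a single point forecast.
import random
import statistics

random.seed(0)

WEIGHTS = [0.5, -0.2, 0.8, 0.1]   # toy "trained" weights

def forward_with_dropout(x, p_drop=0.5):
    """One stochastic pass: drop each weight with prob p_drop (inverted dropout)."""
    kept = [(w / (1 - p_drop)) if random.random() > p_drop else 0.0
            for w in WEIGHTS]
    return sum(w * xi for w, xi in zip(kept, x))

x = [1.0, 2.0, 3.0, 4.0]
samples = [forward_with_dropout(x) for _ in range(200)]
mean = statistics.mean(samples)    # point forecast
std = statistics.stdev(samples)    # model-uncertainty estimate
print(f"prediction ~ {mean:.2f} +/- {std:.2f}")
```

A production forecaster would apply the same trick to a trained LSTM and typically combine the dropout variance with an estimate of inherent observation noise to get calibrated prediction intervals.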
Dr. June Andrews, Principal Data Scientist, Wise.io, from GE Digital
June Andrews is a Principal Data Scientist at Wise.io, from GE Digital, working on a machine learning and data science platform for the Industrial Internet of Things, which includes aviation, trains, and power plants. Previously, she worked at Pinterest, spearheading the Data Trustworthiness and Signals Program to create a healthy data ecosystem for machine learning. She has also led efforts at LinkedIn on growth, engagement, and social network analysis to increase economic opportunity for professionals. June holds degrees in applied mathematics, computer science, and electrical engineering from UC Berkeley and Cornell.
Counter Intuitive Machine Learning for the Industrial Internet of Things:
The Industrial Internet of Things (IIoT) is the infrastructure and data flow built around the world’s most valuable things, like airplane engines, medical scanners, nuclear power plants, and oil pipelines. These machines and systems require far greater uptime, security, governance, and regulation than the IoT landscape based around consumer activity. In the IIoT, the cost of being wrong can be the catastrophic loss of life on a massive scale. Nevertheless, given the growing scale through the digitalization of industrial assets, there is clearly a growing role for machine learning to help augment and automate human decision making. It is against this backdrop that traditional machine learning techniques must be adapted and need-based innovations created. We see industrial machine learning as distinct from consumer machine learning, and in this talk we will cover the counterintuitive changes in featurization, metrics for model performance, and human-in-the-loop design required for using machine learning in an industrial environment.
Daniel Shank, Data Scientist, Talla
Daniel Shank is a Senior Data Scientist at Talla, a company developing a platform for intelligent information discovery and delivery. His focus is on developing machine learning techniques to handle various business automation tasks, such as scheduling, polls, and expert identification, as well as work on NLP. Before joining Talla as the company’s first employee in 2015, Daniel worked with TechStars Boston and did consulting work for ThriveHive, a Boston marketing company focused on small businesses. He studied economics at the University of Chicago.
Getting Value Out of Chat Data:
Chat-based interfaces are increasingly common, whether between customers and companies or among employees within an organization. Given the large number of chat logs being captured, along with recent advances in natural language processing, there is a desire to leverage this data for both insight generation and machine learning applications. Unfortunately, chat data is user-generated data, meaning it is often noisy and difficult to normalize. It also consists mostly of short, heavily context-dependent texts, which makes it difficult to apply methods such as topic modeling and information extraction.
Despite these challenges, it is still possible to extract useful information from these data sources. In this talk, I will provide an overview of techniques and practices for working with chat-based user interaction data, with a focus on machine-augmented data annotation and unsupervised learning methods.
Suneel Marthi, Sr. Principal Engineer, Red Hat Inc.
Suneel is a Senior Principal Engineer in the Office of the CTO at Red Hat. He is a member of the Apache Software Foundation and a committer on several Apache projects, including Apache Mahout, Apache OpenNLP, Apache Flink, and Apache MXNet. He has previously presented at Flink Forward, Hadoop Summit Europe, Berlin Buzzwords, and Apache Big Data.
Deriving Actionable Insights from High Volume Media Streams:
Media analysts have to analyze high volumes of real-time news feeds and social media streams, which is often a tedious process because they need to write search profiles for entities. Python tools like NLTK do not scale to large production datasets and cannot be plugged into distributed, scalable frameworks like Apache Flink.
Apache Flink, being a streaming-first engine, is ideally suited for ingesting multiple streams of news feeds, social media, blogs, etc., and for performing streaming analytics on the various feeds. Natural language processing tools like Apache OpenNLP can be plugged into Flink streaming pipelines to perform common NLP tasks like named entity recognition (NER), chunking, and text classification.
In this talk, we’ll build a real-time media analyzer that performs named entity recognition (NER) on the individual incoming streams, calculates co-occurrences of the named entities, aggregates them across multiple streams, indexes the results into a search engine, and makes them queryable for actionable insights. We’ll also show how to handle multilingual documents when calculating co-occurrences.
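The entity co-occurrence step of such a pipeline can be sketched independently of Flink: given per-document entity lists (the output of an NER stage), count how often each pair of entities appears in the same document. In the real pipeline this would run as a windowed streaming aggregation; the documents below are invented:

```python
# Count pairwise entity co-occurrences across documents.
from collections import Counter
from itertools import combinations

docs_entities = [
    ["Berlin", "Apache Flink", "Red Hat"],
    ["Apache Flink", "Berlin"],
    ["Red Hat", "Apache OpenNLP"],
]

cooccur = Counter()
for entities in docs_entities:
    # Sort and dedupe so ("A", "B") and ("B", "A") count as the same pair.
    for pair in combinations(sorted(set(entities)), 2):
        cooccur[pair] += 1

print(cooccur.most_common(1))  # [(('Apache Flink', 'Berlin'), 2)]
```

Keyed by entity pair, the same counting logic maps directly onto a Flink `keyBy` plus windowed aggregation, with the running counts flushed to a search index for querying.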
NLP practitioners will come away from this talk with a better understanding of how the various Apache OpenNLP components can help process large streams of data feeds and can easily be plugged into a highly scalable, distributed framework like Apache Flink.