The MLconf Blog

Guest Post: The Rise of Machines Against Humans

Our friends at Pokersites recently shared a blog & Infographic with us, titled:
Poker & AI: The Raise of Machines Against Humans. It details insights and research about the evolution of poker artificial intelligence, the history, as well as where it is now. They thought our community might be interested and asked us to share with you.
See the original post on their blog, here.
Enjoy!

MLconf San Francisco 2017 Speaker Resources

Franziska Bell, Data Science Manager on the Platform Team, Uber

Gal and Ghahramani (2016), A Theoretically Grounded Application of Dropout in Recurrent Neural Networks, NIPS
Engineering Extreme Event Forecasting at Uber with Recurrent Neural Networks (Laptev, Smyl, Shanmugam): https://eng.uber.com/neural-networks/

Josh Wills, Head of Data Engineering, Slack

Jonas Schneider, Head of Engineering for Robotics, OpenAI

Robots that Learn: https://blog.openai.com/robots-that-learn/
Dota 2: https://blog.openai.com/dota-2/ & https://blog.openai.com/more-on-dota-2/
Kubernetes at OpenAI: https://blog.openai.com/infrastructure-for-deep-learning/ & https://www.youtube.com/watch?v=v4N3Krzb8Eg
Trust Region Policy Optimization (TRPO): https://arxiv.org/abs/1502.05477
Proximal Policy Optimization (PPO): https://blog.openai.com/openai-baselines-ppo/
Evolution Strategies: https://blog.openai.com/evolution-strategies/
Deep Deterministic Policy Gradients (DDPG): https://arxiv.org/abs/1509.02971
Hindsight Experience Replay (HER): https://arxiv.org/abs/1707.01495
Jenkins Continuous Integration: https://jenkins.io/

LN Renganarayana, Architect, ML Platform and Services, Workday
Madhura Dudhgaonkar, Head of Engineering, Search, Data Science and Machine Learning, Workday

https://medium.com/workday-engineering

Michael Alcorn, Sr. Software Engineer, Red Hat Inc.

Guest Blog by Michael Alcorn, Senior Software Engineer, Red Hat
Feature Engineering/Representation Learning Wiki Page
Karl Weiss, Taghi M. Khoshgoftaar, DingDing Wang, A survey of transfer learning
Goodfellow, et al; Deep Learning, Ch. 15: Representation Learning
Bengio et al, Representation Learning: A Review and New Perspectives
NVIDIA Blog: Introduction to Neural Machine Translation with GPUs (Part 2)
Mikolov et al. (2013) Efficient Estimation of Word Representations inVector Space
Deeplearning4j – “Word2vec”
Blog Post: Janelle Shane: New paint colors invented by neural network
SCATTERPLOT3D
https://access.redhat.com/solutions/25190
https://access.redhat.com/solutions/10107
Le and Mikolov (2014) Distributed Representations of Sentences and Documents
“NLP 05: From Word2vec to Doc2vec: a simple example with Gensim”
Wang and Zemel (2016) Classifying NBA Offensive Plays Using Neural Networks
Github: airalcorn2/ Deep-Semantic-Similarity-Model
Github: airalcorn2/ batter-pitcher-2vec
Github Blog: “Learning to Coach Football“

Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Labs

http://www.tensortoolbox.org/

Xavier Amatriain, Co-Founder and CTO, Curai

“Probabilistic diagnosis using a reformulation of the INTERNIST-1/QMR knowledge base” . Shwe et al. 1991.
“Computer-assisted diagnostic decision support: history, challenges, and possible paths forward” Miller. 2009.
“Mining Biomedical Ontologies and Data Using RDF Hypergraphs” Liu et al. 2013.
“Health Recommender Systems: Concepts, Requirements, Technical Basics & Challenges”, Wiesner & Pfeifer, 2014.
“A ‘Green Button’ For Using Aggregate Patient Data At The Point Of Care” Longhurst et al. 2014.
“Building the graph of medicine from millions of clinical narratives” Finlayson et al. 2014.
“Comparison of Physician and Computer Diagnostic Accuracy” Semigran et al. 2016.
“Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization”. Joshi et al. 2016.
“Clinical Tagging with Joint Probabilistic Models” . Halpern et al. 2016.
“Deep Patient: An Unsupervised Representation to Predict the Future of Patients from EHR”. Miotto et al. 2016.
“Learning a Health Knowledge Graph from Electronic Medical Records” Rotmensch et al. 2017.
“Clustering Patients with Tensor Decomposition”. Ruffini et al. 2017.
“Patient Similarity Using Population Statistics and Multiple Kernel Learning”. Conroy et al. 2017.
“Diagnostic Inferencing via Clinical Concept Extraction with Deep Reinforcement Learning”. Ling et al. 2017.
“Generating Multi-label Discrete Patient Records using Generative Adversarial Networks” Choi et al. 2017
Suresh, H., Szolovits, P., & Ghassemi, M. (2017, March 20). The Use of Autoencoders for Discovering Patient Phenotypes. arXiv.org.

Tweet @ MLconf SF to Win Free ML Books!

TWEET FOR CHANCE TO WIN FREE BOOKS!
We’ll be hosting book giveaways at the conclusion of the event, for the most unique tweets that mention @mlconf and/or #mlconfsf.
Our generous publishers are sending books again to MLconf Atlanta. Make sure to grab coupons, as they’ll be offering exclusive book discounts at MLconf! We’ll be hosting book giveaways at the conclusion of the event for the most unique tweets that mention @mlconf and/or #mlconfsf Thanks to MIT Press, Manning Publications and CRC Press for participating!

CRC Press:
*Save 20% when ordering online and enter promo code: AWR96

Manning Publications:

The MIT Press:
Use promo code: MITMLSF to save during check-out!
Additional Books on Display:

Arbesman, Samuel, 2 of Overcomplicated: Technology at the Limits of Comprehension,
Brynjolfsson, Erik, The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies
Parker, Geoffrey G., Platform Revolution: How Networked Markets Are Transforming the Economy–And How to Make Them Work for You
Ford, Martin, Rise of the Robots: Technology and the Threat of a Jobless Future
Kurzweil, Ray, How to Create a Mind: The Secret of Human Thought Revealed
Bostrom, Nick, Superintelligence: Paths, Dangers, Strategies
Domingos, Pedro, The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World
Raschka, Sebastian, Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow, 2nd Edition
Reznor, Eliot, Cryptocurrency: A Beginner’s Guide To Understanding And Winning With Fintech (Bitcoin, Blockchain, Trading, Investing, Mining, Digital Money, Smart Contracts)

Interview with Erik Schlicht

One of our Program Committee members, Reshama Shaikh, recently interviewed Erik Schlicht, PhD, the founder of the Computational Cognition Group, LLC. Prior to leaving academia to found C2-g, Dr. Schlicht’s research utilized quantitative methods to investigate human decision making under uncertainty and risk. He leveraged methods from AI, machine learning and cognitive science to understand real-world decision making.

RS) Tell us briefly about yourself and your work.

ES) I received my PhD in Cognitive and Brain Sciences (with a minor in Human Factors) from the University of Minnesota. My thesis utilized Bayesian decision-theoretic methods to predict natural human sensorimotor control. I was a postdoctoral researcher between Harvard and Caltech and conducted behavioral- and neuro-economic experiments using a simplified poker task that I developed.

After my academic training, I moved on to applied research at both MIT Lincoln Laboratory and the University of Minnesota. In these positions, I utilized multifidelity methods for both aerospace and transportation studies, respectively.

RS) For those of us who are unfamiliar, what is low and high, multi – fidelity – method?

ES) The goal of the applied research I was involved with attempted to predict operational decision-making. Operational decisions were defined to be an expert making a real-world decision in a risky and uncertain context. To predict expert behavior using machine learning, it requires an adequate amount of high-fidelity (real-world, expert) data to train and evaluate the model. However, in these operational contexts there is seldom enough high-fidelity data available to adequately train the model.

Researchers can leverage data lower on the fidelity spectrum to collect lots of inexpensive data. For example, data can be collected from novices participating in a distributed simulation that approximates the operational context of interest. The risk of utilizing low-fidelity data is that it may not be useful for predicting expert behavior, due to both simulation and participant factors.

Therefore, multifidelity methods attempt to leverage the strengths of each data source, while overcoming the limitations, allowing for low-fidelity data to be used to train models that accurately predict expert behavior. This also enables accurate models of expert behavior to be developed inexpensively.

RS) Can you give an example of how this method is used currently?
Investigate human decision making under uncertainty and risk (pilots, drivers, dating, marketing?) What are effective and popular applications?

ES) In general, I have found multifidelity approaches to be extremely useful for understanding human-technology interactions. For example, we leveraged multifidelity approaches to understand the safety associated with UAS (unmanned aerial system) Operators during self-separation scenarios, in addition to exploring the risk involved with providing drivers with in-vehicle signage.

To extend this example, assume we want to evaluate the safety associated with unmanned aircraft (say, drones) in the NAS (national airspace system). Suppose two aircraft are on a trajectory to collide. How do we simulate these instances so that operator behavior is reflected correctly, and encounters are represented accurately? Multifidelity methods can be used to infer operator utility weights from low-fidelity data, enabling accurate prediction of their decisions and producing valid safety estimates.

The benefit of using this approach for understanding human-technology interactions is that you’re able to gather low-fidelity data on different technology concepts and exclusively produce the one that is estimated to be safest and/or lead to improved performance in the real-world setting. In the absence of multifidelity approaches, technologies have to be manufactured and deployed before data are available regarding its effectiveness, which obviously leads to greater cost (time and money).

RS) What excites you most about the direction of this research? Where do you see it going in 5 or 10 years?

ES) This area is relatively new, so there hasn’t been a ton of research of which I’m aware. Back in 2011, Judea Pearl won the Turing award, so he gave a keynote speech when I first presented our multifidelity work at UAI-2012. His lecture was on the concept of metasynthesis and how he’s excited to see where this area leads. During our poster session, his student commented that he believed this multifidelity work to be a specific example of metasynthesis. I took that as a cue to keep pursuing this line of research, so I will utilize the multifidelity method as opportunities arise, and I hope many others do the same.

I see this making an impact in domains where human-technology interactions directly impact outcomes (e.g., healthcare and operational decision-making). Overall, this could lead to technology being developed because it improves system performance and safety, rather than more superficial reasons. So, I’m excited to see where this method can contribute.

RS) What tools do you use in your research? Platform (AWS?), Python, R? How large are your datasets? Algorithms?

ES) I primarily use MATLAB for rapid algorithm development and then deploy the code on whatever native language is necessary. Since I am leveraging multifidelity data, we tend to run the spectrum on data size and algorithm complexity.

RS) Are there ways for other researchers and students to participate in this research?

ES) I would recommend students read the limited literature there is in the area and see if they can improve on the methods that were developed to date (either in theory or application). I am sure most of the researchers in the area would be happy to help, so just contact them for collaboration opportunities.

About Erik Schlicht

Dr. Schlicht is the Founder of the Computational Cognition Group (C2-g), LLC. His research utilizes quantitative methods to investigate human decision making under uncertainty and risk. He leverages techniques from AI, machine learning, and computational cognitive science to understand real-world decision making. This expertise has been used to innovate across many different data-driven domains. Here’s a link to his webpage: http://schlicht.org

Guest Blog by Alan Descoins, CTO, Tryolabs

Why We Built an Open Source Computer Vision Toolkit

Introduction

Over the last few years, the Machine Learning (ML) landscape has changed dramatically. The comeback of neural networks in the form of Deep Learning has opened new creative ways to approach many classical ML and Artificial Intelligence (AI) problems.
Out of all the areas vastly improved by Deep Learning techniques, computer vision has been particularly revolutionized by major breakthroughs that have only very recently occurred.

The Deep Learning revolution in computer vision

It all started when a new model published in a paper by Krizhevsky et al. — known as AlexNet — won the ImageNet challenge in 2012. This challenge had been running for quite some number of years, and one parts consists of an image classification task. In it, algorithms are given images and must classify them into a fixed set of categories. The training set consists of 1.2 million images, each of which can be labeled into 1000 categories. For the evaluation, the algorithm must decide the categories of images which it hasn’t previously seen during training.
This new AlexNet model could reduce the error rate from 25% in 2011 to a mere 16%, greatly improving the results of prior state of the art techniques which, at the time, still relied on handcrafted features such as HOG or SIFT. Subsequent years continued the trend, and we quickly saw the error rate drop to less than 3% — often surpassing human performance in the narrow task of assigning a category to images. Deep Learning was here to stay, image classification was “solved” and this xkcd comic was obsolete.
What changed? It is interesting to note that conceptually, AlexNet was almost equivalent to a model called LeNet-5, proposed by Yann LeCun (now Director of AI Research at Facebook) almost 15 years earlier, used back then to recognise handwritten digits. However, LeNet had roughly 2 orders of magnitude less parameters (AlexNet had around 60 million), and it had a much smaller dataset on which to train (MNIST). Advances in modern GPU technology, added to the availability of larger open datasets (such as ImageNet) have really driven algorithmic advancements which ended up significantly improving accuracy in almost every area where Deep Learning has been applied.
In particular, both LeNet and AlexNet belong to a family of neural network architectures called Convolutional Neural Networks (CNNs or ConvNets). These networks have a number of properties that make them really well suited for image processing tasks. CNNs use a layered structure, in which each layer has the ability to slide different filters through the output of the previous layer. This allows the networks to learn patterns of increasing level of abstraction, until they can grasp complex concepts that would otherwise be hard to express. For example, it has been shown that lower layers learn simple features like edges and color gradients, while higher layers can learn more abstract features like eyes or faces.

On object detection in images

The power of CNNs has also been applied to object detection, dramatically improving results over previous methods. In object detection, we aim to identify a variable number of objects present in an image, and for each one, classifying it into a fixed set of categories and also drawing a bounding box that encloses it. This is a much harder problem than image classification, and so far it has not been “solved”.

This problem has plenty of applications, ranging from simple counting any type of object (like people or cars), aerial image analysis, visual search engines or health care, where it can help with analysis of medical images. Did we also mention self driving cars?
One can arguably say that with the recent explosion in Deep Learning frameworks (Google’s TensorFlow, Facebook’s PyTorch or Caffe2, Microsoft’s CNTK, Amazon’s MXNet, among many others) and the tools built around them it is fairly straightforward for most programmers to implement state of the art image classification techniques into applications. However, object detection is a tougher nut to crack.

Object detection in the industry

Over the last couple of years, we have identified some issues most companies face when incorporating these technologies into their platforms.
Intense learning curve
Even for people with tons of experience (and even with PhDs in the field!), the developments in the space have been evolving so quickly that it’s very hard to keep up with the pace of recent advancements and new techniques.
Moreover, many implementations of object detection techniques come from the academic world, where they were conceived as research prototypes. and not really intended to be used in production applications. This means that configuring them and modifying them can be challenging.
While there are some high level abstractions such as Keras, it is not always straightforward to apply them directly to the latest research findings. At the cost of simplicity, abstractions end up limiting what you can customize.
SaSS solutions are not always applicable
Many large companies have developed their own SaaS API solutions, like Google’s Vision API, Microsoft’s Computer Vision API, Amazon’s Rekognition, among others. Those are super simple to integrate into most applications and sometimes offer great solutions. However, these APIs usually only work for a predefined number of tasks/classes/object types like detecting faces, cars, dogs, cats, etc. While these are good for a broad range of applications, it is very common that companies have their own needs specific to the type of problem they are trying to solve. What if you are manufacturing company and want to identity defective pieces in your production chain? Or if you work in health care, and want to detect certain patterns in medical imagery? In these cases, you need something that can be trained with any dataset.
There are services that even let you upload samples and do the training in the cloud. However, sometimes you want to be the owner of your data model and deploy in your own servers, and not have an important part of your core business tech hosted outside your platform, using a Cloud API provider to whose model (or implementation) you have no real access to.
Besides, depending on the volume of usage you expect, SaaS might just become too expensive or the latency in transmitting images (which require high bandwidth) will be prohibitive for your application.
Creating a functional implementation from a research paper is damn hard
Even if you are an expert in any of the Deep Learning frameworks, coming up with something that actually works is pretty damn hard. Small implementation details have no room in academic papers, and sometimes they can and do make a huge difference in the results. Moreover, many things can be implemented in several ways, and these might in turn also affect your results.

Bridging the gap: birth of Luminoth

After identifying these issues and realizing that ourselves at Tryolabs ended up rewriting many of the common TensorFlow boilerplate code and models over and over for our clients, we started to think about the need for factoring out some code, and how an ideal toolkit would look like.
As a result, we built Luminoth, an open source toolkit for computer vision. Currently, we support object detection and image classification, but we are aiming for much more. It is built in Python, using TensorFlow, and Sonnet, a very useful library built by DeepMind for building complex neural networks with reusable components.

Luminoth’s main strengths

– State of the art algorithms: currently, we only support image classification and Faster R-CNN model for object detection, but we are actively working on providing more models and keeping up to date with the latest research in the field.
– Open source: Luminoth is free and open source. You can download it, customize it for your needs and integrate it into your product or cloud. You can also collaborate with it!
– Developer friendly: we’ve poured our experience of working on bridging the gap between academic Machine Learning findings and production ready software to make Luminoth accessible. We strived for an easy to use interface, beautiful code with comments, and unit tests. Of course, there is still a lot of room for improvement, and things will get better as the toolkit gets more mature.
– Made with TensorFlow and Sonnet: production ready, reliable, robust and maintained frameworks.
– Customizable: you can train Computer Vision models with your own data. You are not limited to existing datasets such as COCO or ImageNet.
– Cloud integration: we strived for a super simple Google Cloud integration, specifically ML Engine. This means training distributedly is very straightforward: there is no need to buy those GPUs.

Final thoughts

We expect Luminoth to be helpful both to developers wanting to integrate Deep Learning based computer vision algorithms into their products, as well as researchers who want to experiment with new models and techniques while skipping tedious boilerplate code.
Feel free to explore our GitHub repository. Feedback and contributions are more than welcome!

Alan is the CTO of Tryolabs. He holds a Computer Engineering degree and +7 yrs of experience developing robust backends, infrastructures and Machine Learning based algorithms. Currently, he’s the main consultant for every project developed and is an active member of the R&D team. A Python expert with deep understanding of Machine Learning related technologies.

MLconf Atlanta 2017 Speaker Resources

MLconf Atlanta 2017 Speaker Resources

Alexandra Johnson, Software Engineer, SigOpt

Mike McCourt. Building a Better Mousetrap with Multicriteria Bayesian Optimization.
Scott Clark. Using Model Tuning to Beat Vegas
Scott Clark, Ian Dewancker, and Sathish Nagappan. Deep Neural Network Optimization with SigOpt and Nervana Cloud.

Aran Khanna, Software Engineer, Amazon Web Services

Hanjun Dai, PhD Student, School of Computational Science and Engineer, Georgia Tech

structure2vec: https://arxiv.org/abs/1603.05629
sequence2vec: https://academic.oup.com/bioinformatics/article/doi/10.1093/bioinformatics/btx480/4042128/Sequence2Vec-A-novel-embedding-approach-for
deepcoevolve: https://arxiv.org/pdf/1609.03675.pdf
S2V-DQN: https://arxiv.org/abs/1704.01665

Jennifer Marsman, Principal Software Development Engineer, Microsoft

https://www.microsoft.com/en-us/research/project/farmbeats-iot-agriculture/

Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State University

Qiaoling Liu, Lead Data Scientist, Careerbuilder
Qiaoling Liu, Faizan Javed, Vachik S. Dave, and Ankita Joshi: “Supporting Employer Name Normalization at both Entity and Cluster Level”, in Proc. of the 23rd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD2017).

Poster
Video

Qiaoling Liu, Faizan Javed, and Matt Mcnair: “CompanyDepot: Employer Name Normalization in the Online Recruitment Industry”, in Proc. of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD2016).

Poster
Video

Talha Obaid, Email Security, Symantec

Tim Chartier, Chief Academic Officer, Tresata

Guest Post: The Rise of Machines Against Humans

MLconf San Francisco 2017 Speaker Resources

MLconf San Francisco 2017 Speaker Resources

Tweet @ MLconf SF to Win Free ML Books!

Interview with Erik Schlicht

RS) Tell us briefly about yourself and your work.

RS) For those of us who are unfamiliar, what is low and high, multi – fidelity – method?

RS) Can you give an example of how this method is used currently?
Investigate human decision making under uncertainty and risk (pilots, drivers, dating, marketing?) What are effective and popular applications?

RS) What excites you most about the direction of this research? Where do you see it going in 5 or 10 years?

RS) What tools do you use in your research? Platform (AWS?), Python, R? How large are your datasets? Algorithms?