The winner of the 2016 MLconf Industry Impact Student Research Award, sponsored by Google, has been announced. Our committee reviewed several nominees and found Tianqi Chen’s research on XGBoost and MXNet to be the most impactful and interesting for future developments in industry.
Tianqi Chen is the winner of the 2016 MLconf Industry Impact Student Research Award! This announcement was made on Friday, November 11th, 2016 in San Francisco. Tianqi accepted via a video acceptance speech (available here).
In 2015, there were two winners of the award: Virginia Smith (UC Berkeley), who presented at MLconf SF on November 11, 2016, and Furong Huang (UC Irvine), who presented at MLconf NYC in April 2016.
Tianqi has been invited to present his work on XGBoost in Seattle at MLconf in 2017. His advisor, Dr. Carlos Guestrin, has presented at MLconf numerous times as well.
Tianqi works at the intersection of machine learning and systems, and has built many scalable learning systems. His research focuses on scalable boosted trees, most notably the XGBoost package, which is widely used in competitive ML and in industry for supervised learning problems because it provides parallelized boosted trees that are both efficient and accurate. XGBoost runs in many distributed production environments, such as Hadoop, MPI, SGE, Flink, and Spark, and offers bindings for popular languages including Python, R, Julia, Java, and Scala. The framework constructs tree ensembles. Because it is not easy to train all the trees at once, XGBoost uses an additive strategy: it trains one tree, uses the information from it, and then adds another tree, repeating the process. Once the tree ensembles are created, the model needs to be regularized. First, model complexity is defined formally, in order to regularize the model and better understand what information is being learned. Regularization is the part most tree packages treat less carefully, or ignore entirely, because the traditional treatment of tree learning emphasized improving impurity while leaving complexity control to heuristics. Defining complexity formally makes it better understood, works well in practice, and lets one derive a structure score and a goodness-of-fit measure for the tree ensemble.
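The additive training idea above can be sketched in a few lines. This is not XGBoost itself, just a minimal illustration of gradient boosting on squared loss, using scikit-learn’s DecisionTreeRegressor as the base learner (the data, tree depth, learning rate, and round count are all illustrative assumptions): each round fits a small tree to the residuals of the current ensemble and adds it in.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

# Additive training: each new tree is fit to what the current
# ensemble still gets wrong (the residuals), then added in,
# scaled by a learning rate.
learning_rate = 0.3
prediction = np.zeros_like(y)
trees = []
for _ in range(50):
    residual = y - prediction
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    trees.append(tree)
    prediction += learning_rate * tree.predict(X)

mse = np.mean((y - prediction) ** 2)
print(f"training MSE after 50 trees: {mse:.4f}")
```

XGBoost builds on this same loop but adds the formal complexity penalty described above (on the number of leaves and the leaf weights), which is what yields its structure score for evaluating candidate trees.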
Tianqi is also well known for his contributions to MXNet. “MX” stands for mix and minimize: MXNet is built around a dynamic dependency scheduler that automatically parallelizes both declarative and imperative operations. At the heart of MXNet is NNVM, an intermediate layer analogous to LLVM; the NNVM abstraction enables several just-in-time code optimizations that significantly boost performance. MXNet is widely recognized as a competitor to TensorFlow, and Amazon has invested heavily in it.
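The core idea of a dynamic dependency scheduler can be illustrated without MXNet at all. The toy below (plain Python, not MXNet code; a thread pool stands in for MXNet’s execution engine, and the `Lazy` class is a hypothetical stand-in for its deferred operations) records a computation declaratively, then executes it on demand, running operations with no mutual dependencies in parallel:

```python
from concurrent.futures import ThreadPoolExecutor

class Lazy:
    """A deferred operation: building it records the computation
    (declarative); calling .wait() triggers execution (imperative)."""

    def __init__(self, fn, *deps):
        self.fn, self.deps = fn, deps
        self.future = None

    def schedule(self, pool):
        # Submit dependencies first; independent ones run concurrently.
        for d in self.deps:
            if isinstance(d, Lazy) and d.future is None:
                d.schedule(pool)
        args = [d.future.result() if isinstance(d, Lazy) else d
                for d in self.deps]
        self.future = pool.submit(self.fn, *args)

    def wait(self):
        with ThreadPoolExecutor() as pool:
            self.schedule(pool)
        return self.future.result()

# a and b have no dependency on each other, so the scheduler
# can execute them in parallel before the final add.
a = Lazy(lambda: sum(range(1000)))
b = Lazy(lambda: sum(range(2000)))
c = Lazy(lambda x, y: x + y, a, b)
print(c.wait())  # → 2498500
```

MXNet applies the same principle at scale: the engine tracks read/write dependencies between array operations, so users can freely mix symbolic graphs with imperative calls and still get automatic parallelism.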

Bio:
Tianqi holds a bachelor’s degree in Computer Science from Shanghai Jiao Tong University, where he was a member of the ACM Class, now part of Zhiyuan College at SJTU. He earned his master’s degree at Shanghai Jiao Tong University in the Apex Data and Knowledge Management group before joining the University of Washington as a PhD student. He has held several prestigious internships and visiting-scholar positions: at Google on the Brain Team, at GraphLab, where he authored the boosted tree and neural net toolkits, at Microsoft Research Asia in the Machine Learning Group, and at the Digital Enterprise Research Institute in Galway, Ireland. What really excites Tianqi is what processes and goals can be enabled when we bring advanced learning techniques and systems together. He pushes the envelope on deep learning, knowledge transfer, and lifelong learning. His PhD is supported by a Google PhD Fellowship.
MLconf SF 2016 Speaker Resources
We recently asked the speakers of MLconf San Francisco 2016 to share their favorite articles, books & papers with the MLconf audience. We hope you find this list interesting and educational!

Daria Sorokina, Applied Scientist, A9 (Amazon)
Amazon Search: The Joy of Ranking Products

Stephanie deWet, Software Engineer, Pinterest
Yunsong Guo. Pinnability: Machine Learning in the Pinterest Home Feed. https://engineering.pinterest.com/blog/pinnability-machine-learning-home-feed
Deepak Agarwal, Bee-Chung Chen, Rupesh Gupta, Joshua Hartman, Qi He, Anand Iyer, Sumanth Kolar, Yiming Ma, Pannagadatta Shivaswamy, Ajit Singh, and Liang Zhang. 2014. Activity ranking in LinkedIn feed. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD ’14). ACM, New York, NY, USA, 1603-1612. DOI: http://dx.doi.org/10.1145/2623330.2623362
Hao Ma, Xueqing Liu, and Zhihong Shen. 2016. User Fatigue in Online News Recommendation. In Proceedings of the 25th International Conference on World Wide Web (WWW ’16). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 1363-1372. DOI: http://dx.doi.org/10.1145/2872427.2874813
Ewa Dominowska. Generating a Billion Personal News Feeds. MLconf Seattle 2016 (live talk). https://www.youtube.com/watch?v=iXKR3HE-m8c

Virginia Smith, Researcher, UC Berkeley
CoCoA: A General Framework for Communication-Efficient Distributed Optimization. V. Smith, S. Forte, C. Ma, M. Takac, M. I. Jordan, M. Jaggi. Preprint, 2016. https://arxiv.org/abs/1611.02189.
Adding vs. Averaging in Distributed Primal-Dual Optimization. C. Ma, V. Smith, M. Jaggi, M. I. Jordan, P. Richtarik, M. Takac. International Conference on Machine Learning (ICML ’15). https://arxiv.org/abs/1502.03508.
Communication-Efficient Distributed Dual Coordinate Ascent. M. Jaggi, V. Smith, M. Takac, J. Terhorst, S. Krishnan, T. Hofmann, M. I. Jordan. Neural Information Processing Systems (NIPS ’14). https://arxiv.org/abs/1409.1458.

Guy Lebanon, Director of Machine Learning & Data Science, Netflix
Blog Post: Selecting the best artwork for videos through A/B testing

Alex Dimakis, Associate Professor, Dept. of Electrical and Computer Engineering, University of Texas at Austin
Software and datasets:
Tuebingen Benchmark:
https://webdav.tuebingen.mpg.de/cause-effect/
Tetrad project:
http://www.phil.cmu.edu/projects/tetrad/
Entropic Causality:
https://github.com/mkocaoglu/Entropic-Causality
Video Tutorials:
https://www.youtube.com/watch?v=9yEYZURoE3Y&feature=youtu.be
CCD Summer Short Course 2016
CMU Center for Causal Discovery short course on Causality and Tetrad.
https://www.youtube.com/watch?v=PpY7Slo57XQ
Tutorial: All of Causal Discovery (by Frederick Eberhardt)
Books and Papers:
P. Spirtes, C. Glymour and R. Scheines, Causation, Prediction, and Search. Bradford Books, 2001.
https://www.amazon.com/Causation-Prediction-Adaptive-Computation-Learning/dp/0262194406
Causality by J. Pearl
Cambridge University Press, 2009.
https://www.amazon.com/Causality-Reasoning-Inference-Judea-Pearl/dp/052189560X
Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction,
G. Imbens and D. Rubin
https://www.amazon.com/Causal-Inference-Statistics-Biomedical-Sciences/dp/0521885884
Jonas Peters, Peter Buehlmann and Nicolai Meinshausen (2016)
Causal inference using invariant prediction: identification and confidence intervals
Journal of the Royal Statistical Society, Series B
https://www.statslife.org.uk/files/rss-preprint-causal-inference-may-2016.pdf
Frederick Eberhardt, Clark Glymour, and Richard Scheines.
On the number of experiments sufficient and in the worst case necessary to identify all causal relations among n variables.
http://www.jmlr.org/proceedings/papers/v6/eberhardt10a/eberhardt10a.pdf
Alain Hauser and Peter Buhlmann. Two optimal strategies for active learning of causal models from interventional data.
International Journal of Approximate Reasoning, 55(4):926–939, 2014.
http://leo.ugr.es/pgm2012/submissions/pgm2012_submission_11.pdf
Learning Causal Graphs with Small Interventions
K. Shanmugam, M. Kocaoglu, A.G. Dimakis, S. Vishwanath (NIPS 2015)
https://papers.nips.cc/paper/5909-learning-causal-graphs-with-small-interventions.pdf
Nonlinear causal discovery with additive noise models,
Patrik O Hoyer, Dominik Janzing, Joris M Mooij, Jonas Peters, Bernhard Scholkopf (NIPS 2008)
http://is.tuebingen.mpg.de/fileadmin/user_upload/files/publications/NIPS2008-Hoyer-neu_5406[0].pdf

Daniel Shank, Data Scientist, Talla
Implementations:
Tensorflow: https://github.com/carpedm20/NTM-tensorflow
Go: https://github.com/fumin/ntm
Torch: https://github.com/kaishengtai/torch-ntm
Node.JS: https://github.com/gcgibson/NTM
Lasagne: https://github.com/snipsco/ntm-lasagne
Theano: https://github.com/shawntan/neural-turing-machines
Papers:
Graves et al. 2016 – Hybrid computing using a neural network with dynamic external memory
Graves et al. 2014 – Neural Turing Machines
Yu et al. 2015 – Empirical Study on Deep Learning Models for Question Answering
Rae et al. 2016 – Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes

Harm van Seijen, Research Scientist, Maluuba
Further Reading:
“Reinforcement Learning: An Introduction” by Richard S. Sutton & Andrew G. Barto
https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html
“Algorithms for Reinforcement Learning” by Csaba Szepesvari
https://sites.ualberta.ca/~szepesva/RLBook.html
“Policy Networks with Two-Stage Training for Dialogue Systems” by Mehdi Fatemi, Layla El Asri, Hannes Schulz, Jing He, Kaheer Suleman
https://arxiv.org/abs/1606.03152
Code Examples:
Simple DQN Example In Python:
https://edersantana.github.io/articles/keras_rl/
Tool For Testing/Developing RL Algorithms:
https://gym.openai.com/
Interview with Austin Marshall, Numenta
Our past Technical Chair interviewed Numenta’s Austin Marshall about HTM and Numenta’s views on neural networks and AI.
Chatty Thoughts on Chatty Bots
Since this spring, we’ve seen an influx of bot releases across all industries, from lean startups to some of the largest companies in existence. According to VentureBeat, just this week Microsoft announced that it has a relationship assistant in the works. According to the article and announcement, Microsoft will incorporate a mobile-friendly “relationship assistant” into its Dynamics CRM customer relationship management software. The assistant is designed to recommend actions for salespeople to take. “This can make customer service more efficient and improve customer satisfaction,” Nadella said. From the tone of the announcement, the tool is meant to help humans with efficiency, not take their jobs.
Microsoft isn’t the only tech giant to have recently made such an announcement. Early last week, Salesforce announced the launch of an AI platform, Einstein, that will work within Sales Cloud and Marketing Cloud. Salesforce also announced a new research division focused on deep learning, led by Richard Socher, formerly co-founder and chief executive of the AI startup MetaMind, which Salesforce acquired earlier this year. The tone of this announcement is similar to Microsoft’s: these AI enhancements are meant to help sales and marketing teams be more efficient, not to replace them. More details can be found in this VentureBeat article.
The tech behemoths aren’t the only ones releasing business-minded bots; some smaller startups are making waves with bots in the recruiting industry. On September 3rd, our co-founder, Shon Burton, posted his guest blog post “Why I Hate Chatbots.” In this post, Burton described a new recruiting AI tool, “RAI,” and how he believes sourcers and recruiters will benefit from using it. Burton states, “It won’t fool anyone into thinking it’s human, but for practical recruiting tasks it’s already quite useful. By combining the practical conversational interface of tools like Siri and Alexa with a Wolfram Alpha-inspired ‘knowledge engine’, we’ve developed a recruiting assistant with which a non-expert user can conduct a talent search conversationally.” This week, we saw a press release: https://hiringsolved.com/hello/rai
Also this week, Burton was featured in a Fast Company interview about RAI, focusing specifically on RAI’s Diversity Boost feature, which changes the relevance algorithm within a search rather than applying a simple filter, helping recruiters find a diverse pool of candidates to interview for open positions.
Other new projects worth mentioning: Mya, a job-seeker-facing chatbot that came out of private beta this summer to assist with the job application process, and MLconf sponsor Talla. When asked to participate in this blog, Talla provided a few words:
“At Talla we launched a few bots and realized that the bot-as-an-application model is limited. Therefore, we’ve taken our product up to the platform level. Now, our bot is just the UI for an intelligent layer that lives across your HR applications, and can automate many different workflows.
A very simple interface can be used to do many things, like create custom employee onboarding flows that are delivered conversationally, answer basic benefits questions, and send simple polls throughout the organization within a company’s existing chat platform. All of these workflows are based on common core technologies, like automating or scheduling messages, tapping into common data sources, like HR systems, and performing NLP on structured or unstructured text.”
Though these various bots aim to solve different problems in relationship management, whether in sales, marketing, recruiting, or onboarding, they all seem designed to assist, enhance, and raise efficiency rather than to replace. We’ll keep our eyes out for developments within these projects. Stay tuned!
MLconf Atlanta Recommended Academic Papers
Hussein Mehana, Director of Engineering, Facebook:
- https://arxiv.org/abs/1502.01710
Patrick Koch, Principal Data Scientist, and Funda Gunes, Sr. Research Statistician Developer, SAS Institute Inc:
1. Bottou, L., Curtis, F. E., Nocedal, J., Optimization Methods for Large-Scale Machine Learning,arXiv:1606.04838 [stat.ML], 2016.
2. Sutskever, I., Martens, J., Dahl, G. and Hinton, G., E. On the importance of initialization and momentum in deep learning, In Proceedings of the 30th international conference on machine learning (ICML-13), Atlanta, GA, pp. 1139–1147, June 2013.
3. Bergstra, J. and Bengio, Y., Random Search for Hyper-Parameter Optimization, J. Machine Learning Research, 13: 281–305, 2012.
4. Sparks, E. R. , Talwalkar, A., Haas, D. , Franklin, M. J., Jordan, M. I., and Kraska, T., Automating Model Search for Large Scale Machine Learning, Proceedings of the Sixth ACM Symposium on Cloud Computing, August 27-29, 2015, Kohala Coast, Hawaii.
5. Local Search Optimization, SAS/OR®
6. SAS® Viya™ Distributed Analytics Platform
Dr. Le Song, Assistant Professor, College of Computing, Georgia Institute of Technology:
- H. Dai, Y. Wang, R. Trivedi and L. Song. Recurrent Coevolutionary Feature Embedding Processes for Recommendation, RecSys Workshop on Deep Learning for Recommendation Systems, 2016. (Best Paper) http://arxiv.org/pdf/1609.03675.pdf
- H. Dai, B. Dai and L. Song. Discriminative Embeddings of Latent Variable Models for Structured Data, International Conference on Machine Learning (ICML), 2016. https://arxiv.org/pdf/1603.05629.pdf
- B. Dai, B. Xie, N. He, Y. Liang, A. Raj, M. Balcan, and L. Song. Scalable Kernel Methods via Doubly Stochastic Gradients, Neural Information Processing Systems (NIPS), 2014. https://arxiv.org/pdf/1407.5599.pdf
Great Machine Learning and Data Science Books to be on Display at MLconf Atlanta
We’re so grateful for the participating publishers that are sending books to be displayed and given away at MLconf Atlanta on Friday! We’re also displaying and giving out a collection of relevant machine learning books! Check them out!
To Win Books: participate in our event-day twitter contest. The most interesting and unique tweets will be awarded with free ML books! Make sure to mention @mlconf and #MLATL to win!
CRC Press:
*For more details, go to their virtual booth: https://www.crcpress.com/go/MLconf2016
- A First Course in Machine Learning, Second Edition
- Text Mining and Visualization: Case Studies Using Open-Source Tools
- Handbook of Big Data
- Accelerating Discovery: Mining Unstructured Information for Hypothesis Generation
- Statistical Learning with Sparsity: The Lasso and Generalizations
- Statistical Reinforcement Learning: Modern Machine Learning Approaches
- High Performance Parallel I/O
- Sparse Modeling: Theory, Algorithms, and Applications
- Computational Trust Models and Machine Learning
- Regularization, Optimization, Kernels, and Support Vector Machines
- Big Data and Social Science: A Practical Guide to Methods and Tools
Cambridge University Press:
- Agarwal/Chen, Statistical Methods for Recommender Systems
- Braun/Murdoch, A First Course in Statistical Programming with R
- Efron/Hastie, Computer Age Statistical Inference
- Flach, Machine Learning
- Fouss, Algorithms and Models for Network Data and Link Analysis
- Leskovec et al, Mining of Massive Data Sets
Springer Publishing:
Additional Machine Learning Books on Display:
- The Seven Pillars of Statistical Wisdom, Stigler, Stephen M.
- Algorithms to Live By: The Computer Science of Human Decisions, Christian, Brian
- Overcomplicated: Technology at the Limits of Comprehension, Arbesman, Samuel
- Naked Statistics: Stripping the Dread from the Data, Wheelan, Charles
- The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World, Domingos, Pedro
- Data Science from Scratch: First Principles with Python, Grus, Joel
- The Theory That Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy, McGrayne, Sharon Bertsch
- Think Bayes, Allen B. Downey
- How to Create a Mind: The Secret of Human Thought Revealed, Kurzweil, Ray
- Superforecasting: The Art and Science of Prediction, Tetlock, Philip E.
- The End of Average: How We Succeed in a World That Values Sameness, Rose, Todd