MLconf SF 2016 Speaker Resources

We recently asked the speakers of MLconf San Francisco 2016 to share their favorite articles, books & papers with the MLconf audience. We hope you find this list interesting and educational!

Daria Sorokina, Applied Scientist, A9 (Amazon)

Amazon Search: The Joy of Ranking Products

Stephanie deWet, Software Engineer, Pinterest

Yunsong Guo. Pinnability: Machine Learning in the Pinterest Home Feed. https://engineering.pinterest.com/blog/pinnability-machine-learning-home-feed

Deepak Agarwal, Bee-Chung Chen, Rupesh Gupta, Joshua Hartman, Qi He, Anand Iyer, Sumanth Kolar, Yiming Ma, Pannagadatta Shivaswamy, Ajit Singh, and Liang Zhang. 2014. Activity ranking in LinkedIn feed. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD ’14). ACM, New York, NY, USA, 1603-1612. DOI: http://dx.doi.org/10.1145/2623330.2623362

Hao Ma, Xueqing Liu, and Zhihong Shen. 2016. User Fatigue in Online News Recommendation. In Proceedings of the 25th International Conference on World Wide Web (WWW ’16). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 1363-1372. DOI: http://dx.doi.org/10.1145/2872427.2874813

Ewa Dominowska. Generating a Billion Personal News Feeds. MLconf SEA 2016. Live talk. https://www.youtube.com/watch?v=iXKR3HE-m8c

Virginia Smith, Researcher, UC Berkeley

CoCoA: A General Framework for Communication-Efficient Distributed Optimization. V. Smith, S. Forte, C. Ma, M. Takac, M. I. Jordan, M. Jaggi. Preprint, 2016. https://arxiv.org/abs/1611.02189.

Adding vs. Averaging in Distributed Primal-Dual Optimization. C. Ma, V. Smith, M. Jaggi, M. I. Jordan, P. Richtarik, M. Takac. International Conference on Machine Learning (ICML ’15). https://arxiv.org/abs/1502.03508.

Communication-Efficient Distributed Dual Coordinate Ascent. M. Jaggi, V. Smith, M. Takac, J. Terhorst, S. Krishnan, T. Hofmann, M. I. Jordan. Neural Information Processing Systems (NIPS ’14). https://arxiv.org/abs/1409.1458.

Guy Lebanon, Director of Machine Learning & Data Science, Netflix

Blog Post: Selecting the best artwork for videos through A/B testing

Alex Dimakis, Associate Professor, Dept. of Electrical and Computer Engineering, University of Texas at Austin

Software and datasets:

Tuebingen Benchmark:
https://webdav.tuebingen.mpg.de/cause-effect/

Tetrad project:
http://www.phil.cmu.edu/projects/tetrad/

Entropic Causality:
https://github.com/mkocaoglu/Entropic-Causality

Video Tutorials:

https://www.youtube.com/watch?v=9yEYZURoE3Y&feature=youtu.be
CCD Summer Short Course 2016
CMU Center for Causal Discovery short course on Causality and Tetrad.

https://www.youtube.com/watch?v=PpY7Slo57XQ
Tutorial: All of Causal Discovery (by Frederick Eberhardt)

Books and Papers:

P. Spirtes, C. Glymour and R. Scheines, Causation, Prediction, and Search. Bradford Books, 2001.
https://www.amazon.com/Causation-Prediction-Adaptive-Computation-Learning/dp/0262194406

Causality by J. Pearl
Cambridge University Press, 2009.
https://www.amazon.com/Causality-Reasoning-Inference-Judea-Pearl/dp/052189560X

Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction,
G. Imbens and D. Rubin
https://www.amazon.com/Causal-Inference-Statistics-Biomedical-Sciences/dp/0521885884

Jonas Peters, Peter Bühlmann and Nicolai Meinshausen (2016)
Causal inference using invariant prediction: identification and confidence intervals
Journal of the Royal Statistical Society, Series B
https://www.statslife.org.uk/files/rss-preprint-causal-inference-may-2016.pdf

Frederick Eberhardt, Clark Glymour, and Richard Scheines.
On the number of experiments sufficient and in the worst case necessary to identify all causal relations among n variables.
http://www.jmlr.org/proceedings/papers/v6/eberhardt10a/eberhardt10a.pdf

Alain Hauser and Peter Bühlmann. Two optimal strategies for active learning of causal models from interventional data.
International Journal of Approximate Reasoning, 55(4):926–939, 2014.
http://leo.ugr.es/pgm2012/submissions/pgm2012_submission_11.pdf

Learning Causal Graphs with Small Interventions
K. Shanmugam, M. Kocaoglu, A.G. Dimakis, S. Vishwanath (NIPS 2015)
https://papers.nips.cc/paper/5909-learning-causal-graphs-with-small-interventions.pdf

Nonlinear causal discovery with additive noise models,
Patrik O. Hoyer, Dominik Janzing, Joris M. Mooij, Jonas Peters, Bernhard Schölkopf (NIPS 2008)
http://is.tuebingen.mpg.de/fileadmin/user_upload/files/publications/NIPS2008-Hoyer-neu_5406[0].pdf

Daniel Shank, Data Scientist, Talla

Implementations:

TensorFlow: https://github.com/carpedm20/NTM-tensorflow
Go: https://github.com/fumin/ntm
Torch: https://github.com/kaishengtai/torch-ntm
Node.js: https://github.com/gcgibson/NTM
Lasagne: https://github.com/snipsco/ntm-lasagne
Theano: https://github.com/shawntan/neural-turing-machines

Papers:

Graves et al. 2016 – Hybrid computing using a neural network with dynamic external memory
Graves et al. 2014 – Neural Turing Machines
Yu et al. 2015 – Empirical Study on Deep Learning Models for Question Answering
Rae et al. 2016 – Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes

Harm van Seijen, Research Scientist, Maluuba

Further Reading:

“Reinforcement Learning: An Introduction” by Richard S. Sutton & Andrew G. Barto
https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html

“Algorithms for Reinforcement Learning” by Csaba Szepesvári
https://sites.ualberta.ca/~szepesva/RLBook.html

“Policy Networks with Two-Stage Training for Dialogue Systems” by Mehdi Fatemi, Layla El Asri, Hannes Schulz, Jing He, Kaheer Suleman
https://arxiv.org/abs/1606.03152

Code Examples:

Simple DQN Example In Python:
https://edersantana.github.io/articles/keras_rl/

Tool For Testing/Developing RL Algorithms:
https://gym.openai.com/