Large-scale Machine Learning: Deep, Distributed and Multi-Dimensional: Modern machine learning involves deep neural network architectures which yields state-of-art performance on multiple domains such as computer vision, natural language processing and speech recognition. As the data and models scale, it becomes necessary to have multiple processing units for both training and inference. Apache MXNet is an open-source framework developed for distributed deep learning. I will describe the underlying lightweight hierarchical parameter server architecture that results in high efficiency in distributed settings. Pushing the current boundaries of deep learning requires using multiple dimensions and modalities. These can be encoded into tensors, which are natural extensions of matrices. We present new deep learning architectures that preserve the multi-dimensional information in data end-to-end. We show that tensor contractions and regression layers are an effective replacement for fully connected layers in deep learning architectures. They result in significant space savings with negligible performance degradation. These functionalities are available in the Tensorly package with MXNet backend interface for large-scale efficient learning.