Spark is a new cluster computing engine that is ra

Xiangrui Meng • MLconf 2014 New York City

Session Summary


Databricks
Software Engineer

Spark is a new cluster computing engine that is rapidly gaining popularity. It is one of the most active open source projects in big data, surpassing even Hadoop MapReduce in development activity. Spark was designed both to make traditional MapReduce programming easier and to support new types of applications, with machine learning among its earliest focus areas. MLlib is a Spark subproject that provides machine learning primitives. In this talk, we’ll demonstrate how to use Spark’s high-level API to implement scalable machine learning algorithms, and how MLlib integrates with the other components of the Spark distribution (Streaming, SQL, and GraphX) to create practical machine learning pipelines. We’ll also show new features in the upcoming v1.0 release.


