With data as a valuable currency and the architecture of reliable, scalable Data Lakes and Lakehouses continuing to mature, it is crucial that machine learning training and deployment techniques keep up to realize value. Reproducibility, efficiency, and governance in training and production environments rest on the shoulders of both point in time snapshots of the data and a governing mechanism to regulate, track, and make best use of associated metadata. This talk will outline the challenges and importance of building and maintaining reproducible, efficient, and governed machine learning solutions as well as posing solutions built on open source technologies – namely Delta Lake for data versioning and MLflow for efficiency and governance.
Session Summary
Ensuring Reproducibility, Efficiency, and Governance in Machine Learning Solutions
MLconf Online 2021 – AI/ML Ops
Mary Grace Moesta
Databricks
Data Scientist
Learn more »