The day an ML application is deployed to production and begins facing the real world is both the best and the worst day in the life of the model builder. The joy of seeing accurate predictions is quickly overshadowed by a myriad of operational challenges: debugging, troubleshooting, and monitoring take over the majority of the builder's day, leaving little time for model building.

In DevOps, software operations have been elevated to an art. Sophisticated tools enable engineers to quickly identify and resolve issues, continuously improving software stability and robustness. In the ML world, operations are still largely a manual process built on Jupyter notebooks and shell scripts. One of the cornerstones of the DevOps toolchain is logging: traces and metrics are built on top of logs, enabling monitoring and feedback loops. What does logging look like in an ML system?

In this talk we will show you how to enable data logging for an AI application. We will discuss how something so simple enables testing, monitoring, and debugging of both the AI application and the upstream data pipeline. We will dive deeper into the key properties of a logging library that can handle terabytes of data, run with a constrained memory footprint, and produce statistically accurate log profiles of structured and unstructured data. Attendees will leave the talk equipped with best practices to supercharge MLOps in their team.
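To make the idea concrete, here is a minimal, hypothetical sketch of the kind of constant-memory data logging the abstract describes. It is not the actual library discussed in the talk; it simply illustrates the principle: track exact count, mean, min, and max incrementally, and approximate quantiles with a fixed-size reservoir sample, so the memory footprint stays bounded no matter how many records flow through.

```python
import random

class StreamingProfile:
    """Constant-memory statistical profile of a numeric data stream.

    Hypothetical sketch, not the library from the talk: exact
    count/mean/min/max plus a fixed-size reservoir sample for
    approximate quantiles, so memory use is independent of the
    number of records logged.
    """

    def __init__(self, reservoir_size=256, seed=0):
        self.count = 0
        self.mean = 0.0
        self.min = float("inf")
        self.max = float("-inf")
        self._reservoir = []
        self._k = reservoir_size
        self._rng = random.Random(seed)

    def log(self, value):
        self.count += 1
        # Welford-style running mean: numerically stable, O(1) memory.
        self.mean += (value - self.mean) / self.count
        self.min = min(self.min, value)
        self.max = max(self.max, value)
        # Reservoir sampling: each value survives with probability k/count.
        if len(self._reservoir) < self._k:
            self._reservoir.append(value)
        else:
            j = self._rng.randrange(self.count)
            if j < self._k:
                self._reservoir[j] = value

    def quantile(self, q):
        """Approximate q-quantile from the reservoir sample."""
        s = sorted(self._reservoir)
        return s[min(int(q * len(s)), len(s) - 1)]

# Log a million values while holding at most 256 of them in memory.
profile = StreamingProfile()
for i in range(1_000_000):
    profile.log(i % 1000)

print(profile.count)            # 1000000
print(f"{profile.mean:.1f}")    # 499.5
print(profile.min, profile.max) # 0 999
```

Profiles like this can be merged across workers and time windows, which is what makes them usable as "logs" for monitoring and feedback loops rather than raw data dumps.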
Session Summary
The critical missing component in the production ML stack
MLconf Online 2021 – AI/ML Ops
Alessya Visnjic, CEO, WhyLabs