At Stitch Fix we have a dedicated Data Science organization called Algorithms. It has over 130 Full Stack Data Scientists who build and own a variety of models. These models range from classic prediction and classification models to time-series forecasts, simulations, and optimizations. Rather than hand off models to someone else for productionization, Data Scientists own that process and are on call for it; we love for our Data Scientists to have autonomy. That said, Data Scientists aren’t without engineering support: a Data Platform team is dedicated to building tooling, services, and abstractions that increase their workflow velocity.

One data science task that we have been speeding up is getting models to production and increasing their usability and stability. This is a necessary task that can take a considerable chunk of a Data Scientist’s time, whether in development or in debugging issues. Historically, everyone largely carved their own path in this endeavor, which meant many different approaches and implementations, and little to leverage across teams.

In this talk I’ll cover how the Model Lifecycle team on Data Platform built a system dubbed the “Model Envelope” to enable “deployment for free”. That is, a data scientist does not need to write any code to deploy any Python model to production, where production means either a micro-service or a batch Python/Spark job. With our approach, data scientists no longer have to worry about Python dependencies or instrumenting model monitoring, since we take care of these for them, in addition to other MLOps concerns.

Specifically, the talk will cover:

* The API we provide to data scientists and how it decouples deployment concerns.
* How we approach automatically inferring a type-safe API for models of any shape.
* How we handle Python dependencies so Data Scientists don’t have to.
* How our relationship and approach enable us to inject and change MLOps approaches without having to coordinate much with Data Scientists.
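
To make the idea of inferring a type-safe API a little more concrete, here is a minimal sketch of one way it could be done in plain Python: read a model function's type annotations and derive a request schema from them. This is an illustration only, not the Model Envelope's actual implementation, which the abstract does not describe. The names `infer_api_schema`, `validate_request`, and `predict_size` are hypothetical, and the check only handles plain classes such as `float` and `str`.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints


def infer_api_schema(predict_fn: Callable[..., Any]) -> Dict[str, Any]:
    """Derive an input/output schema from a function's type annotations."""
    hints = get_type_hints(predict_fn)
    params = inspect.signature(predict_fn).parameters
    # Refuse to guess: every parameter must carry an annotation.
    missing = [name for name in params if name not in hints]
    if missing:
        raise TypeError(f"parameters without type annotations: {missing}")
    return {
        "inputs": {name: hints[name] for name in params},
        "output": hints.get("return", Any),
    }


def validate_request(schema: Dict[str, Any], payload: Dict[str, Any]) -> None:
    """Reject payloads whose field names or value types do not match the schema.

    Assumption: annotations are plain classes, so isinstance() is sufficient.
    """
    expected = schema["inputs"]
    if set(payload) != set(expected):
        raise ValueError(f"expected fields {sorted(expected)}, got {sorted(payload)}")
    for name, expected_type in expected.items():
        if not isinstance(payload[name], expected_type):
            raise TypeError(f"{name!r} should be {expected_type.__name__}")


# Hypothetical model function, for illustration only.
def predict_size(height_cm: float, weight_kg: float) -> str:
    return "M" if height_cm * weight_kg < 14000 else "L"


schema = infer_api_schema(predict_size)
# A well-formed request passes validation silently.
validate_request(schema, {"height_cm": 170.0, "weight_kg": 65.0})
```

In a setup like this, the data scientist only writes the annotated function; the platform side can generate the service endpoint and its request validation from the inferred schema.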
Session Summary
“Deployment for free”: removing the need to write model deployment code at Stitch Fix
MLconf Online 2021 – AI/ML Ops
Stefan Krawczyk
Stitch Fix
Manager, Data Platform