As the machine learning market matures, new tools are emerging that better match the needs of data science and machine learning teams. Vendors, both open source and proprietary, have been quick to introduce products that make it easier and faster to develop models and collaborate on them.
These new offerings range from cloud-based platforms that make it easy to build and deploy models to specialized software that can speed up the training process. As an example, ML experiment tracking is already a multi-billion-dollar market with a crowded field of vendors: Comet ML, Weights & Biases, Neptune AI, Databricks’ MLflow and more. These tools will need to evolve beyond the basic requirements of ML teams, such as experiment tracking, visualizations and model training, and address collaboration and technical requirements so these teams can get models to market faster.
The teams driving modern requirements for data science are the bet-the-farm types, for whom it’s all about finding new technology and solutions that drive efficiencies and get models to market faster so their organizations can maintain a competitive edge. Going forward, we see the following requirements emerging as ML teams evaluate tools to add to their MLOps stack.
An Open, Modular Approach
Data scientists and ML engineers use multiple tools to do their jobs, from Jupyter notebooks for model coding to DVC for experiment and data versioning, and from automation tools like Airflow to model visualization tools. Solutions will need to be open to integration with other tools rather than being the walled gardens that many ML tools, such as those offered by cloud vendors (e.g., Amazon SageMaker), are now.
Familiar Developer Experience
ML developers and engineers are familiar with standard DevOps tools and workflows like GitOps (think GitHub or GitLab) and live in CLI-oriented solutions. ML tools should integrate tightly with these existing tools instead of complicating the developer experience with a separate ecosystem. Workflow is also an important part of the developer experience. Data scientists can run hundreds of experiments using their preferred CLI-based solutions but, often, every one of those experiments is streamed and posted to the team’s existing ML tools. A better approach would be to let data scientists run experiments locally and commit only the most important ones for further discussion and collaboration, so the team isn’t overwhelmed.
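The "run locally, share selectively" workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Experiment` record, the `select_for_sharing` helper, and the sample runs are hypothetical names invented for this example and do not correspond to any particular tool’s API.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    """A locally recorded run. All fields are illustrative assumptions."""
    name: str
    params: dict
    val_accuracy: float


def select_for_sharing(experiments, top_k=3, min_accuracy=0.0):
    """Return up to `top_k` runs at or above `min_accuracy`, best first."""
    eligible = [e for e in experiments if e.val_accuracy >= min_accuracy]
    ranked = sorted(eligible, key=lambda e: e.val_accuracy, reverse=True)
    return ranked[:top_k]


# Hypothetical local runs; only the strongest ones would be shared with the team.
local_runs = [
    Experiment("run-001", {"lr": 0.1}, 0.71),
    Experiment("run-002", {"lr": 0.01}, 0.84),
    Experiment("run-003", {"lr": 0.001}, 0.79),
    Experiment("run-004", {"lr": 0.0001}, 0.62),
]

shared = select_for_sharing(local_runs, top_k=2, min_accuracy=0.7)
print([e.name for e in shared])  # ['run-002', 'run-003']
```

In practice the selection rule would be whatever the team agrees on; the point is that filtering happens on the data scientist’s machine before anything reaches the shared workspace.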
Strong Connection to Source Code
Tying ML work directly to the source code that produced it makes a stronger foundation for something like tracking ML experiments. As ML teams race to get models to market and continually improve them, being in lockstep with DevOps teams to deploy those ML models into production is critical. Consider a developer tracking five experiments with changing hyperparameters. For these experiments, they may have a link to a Git repo for a particular version of the source code. But what if changes to the source code or the data set are not committed or reported in their existing ML solution? Then the connection is lost, and the recorded metrics refer to a previous version of the source code or data that is no longer accurate. That lost connection slows down time-to-market and makes it harder to optimize models.
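One way to picture the lost connection described above is to store a fingerprint of the code and data next to each experiment’s metrics, then check later whether the files still match. The sketch below is a minimal illustration, not how any specific product works: it uses a SHA-256 content hash as a stand-in for a Git commit or data version, and `fingerprint`, `record_experiment`, and `is_stale` are hypothetical helper names.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def fingerprint(paths):
    """Hash the names and contents of the given files into one hex digest."""
    digest = hashlib.sha256()
    for path in sorted(str(p) for p in paths):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(Path(path).read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def record_experiment(name, metrics, tracked_paths):
    """Store metrics together with a fingerprint of the code and data used."""
    paths = [str(p) for p in tracked_paths]
    return {
        "name": name,
        "metrics": dict(metrics),
        "paths": paths,
        "fingerprint": fingerprint(paths),
    }


def is_stale(record):
    """True if the tracked files no longer match what the metrics were based on."""
    return fingerprint(record["paths"]) != record["fingerprint"]


if __name__ == "__main__":
    # Illustrative files created in a temporary directory.
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "train.py")
        data = os.path.join(workdir, "data.csv")
        Path(source).write_text("learning_rate = 0.01\n")
        Path(data).write_text("x,y\n1,2\n")

        run = record_experiment("run-001", {"val_accuracy": 0.84}, [source, data])
        print(is_stale(run))  # False: metrics still match code and data

        Path(data).write_text("x,y\n1,3\n")  # an unreported data change
        print(is_stale(run))  # True: the recorded metrics are out of date
```

A real setup would more likely rely on Git commits and a data versioning tool rather than ad hoc hashes, but the check is the same in spirit: metrics are only trustworthy while they can be traced to the exact code and data that produced them.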
As seen with the above examples, ML teams juggle myriad tools and workflows as they develop models for various applications and services. The requirements discussed will help these teams to collaborate better and build models faster in an increasingly heterogeneous MLOps landscape.