Data Science Automation (MLOps) Services
The platform has pre-deployed services for data science and machine-learning operations (MLOps) automation and tracking:
MLRun is Iguazio's open-source MLOps orchestration framework. It offers an integrative approach to managing machine-learning pipelines from early development through model development to full pipeline deployment in production. MLRun provides a convenient abstraction layer over a wide variety of technology stacks while empowering data engineers and data scientists to define features and models. MLRun also integrates seamlessly with other platform services, such as Kubeflow Pipelines, Nuclio, and V3IO Frames.
The MLRun server is provided as a default (pre-deployed) shared single-instance tenant-wide platform service (mlrun). It includes a graphical user interface ("the MLRun dashboard" or "the MLRun UI"), which is integrated into the platform dashboard.
The MLRun client API is available via the MLRun Python package (mlrun), which also provides a command-line interface (mlrun). You can easily install and update this package from the Jupyter Notebook service by using the pip Python package manager.
The MLRun library features a generic and simplified mechanism that helps data scientists and developers describe and run scalable ML and other data science tasks in various runtime environments while automatically tracking and recording execution code, metadata, inputs, and outputs. The ability to track and view current and historical ML experiments, along with the metadata associated with each experiment, is critical for comparing different runs, and ultimately helps determine the best model and configuration for production deployment.
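The tracking pattern described above can be sketched in plain Python: a task is an ordinary handler function, and when MLRun executes it, an injected run context records parameters and results. The handler name, parameters, and metric below are illustrative assumptions, not taken from the platform documentation.

```python
# A minimal sketch of an MLRun-style tracked task. The handler, parameter
# names, and metric here are illustrative assumptions.
def train(context=None, p1: int = 1, p2: int = 2):
    # Toy "training": compute a metric from the input parameters.
    accuracy = p1 / (p1 + p2)
    if context is not None:
        # When executed by MLRun, the injected run context records the
        # result so it appears in the MLRun UI alongside the run metadata.
        context.log_result("accuracy", accuracy)
    return accuracy

# Without MLRun, the handler is just a function call:
print(train(p1=3, p2=1))  # 0.75

# With the mlrun package installed, the same handler could be executed and
# tracked (sketch only):
#   import mlrun
#   fn = mlrun.new_function(name="trainer", kind="job")
#   run = fn.run(handler=train, params={"p1": 3, "p2": 1}, local=True)
```

Because the tracking hooks are confined to the optional run context, the same handler runs unchanged in a local IDE, a notebook, or a scaled-out job.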
MLRun is runtime and platform independent, providing a flexible and portable development experience. It allows you to develop functions for any data science task from your preferred environment, such as a local IDE or a web notebook; execute and track the execution from the code or using the MLRun CLI; and then integrate your functions into an automated workflow pipeline (such as Kubeflow Pipelines) and execute and track the same code on a larger cluster with scale-out containers or functions.
For detailed MLRun information and examples, including an API reference, see the MLRun documentation, which is also available in the Data Science and MLOps section of the platform documentation. See also the MLRun restrictions in the platform's Software Specifications and Restrictions.
You can find full MLRun end-to-end use-case demo applications, as well as getting-started and how-to tutorials, in the MLRun demos repository.
These demos and tutorials are pre-deployed in each user's running-user directory in the platform.
Google Kubeflow Pipelines is an open-source framework for building and deploying portable, scalable ML workflows based on Docker containers. For detailed information, see the Kubeflow Pipelines documentation.
Kubeflow Pipelines is provided as a default (pre-deployed) shared single-instance tenant-wide platform service (pipelines), which can be used to create and run ML pipeline experiments.
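Conceptually, a pipeline is a sequence of steps whose outputs feed the next step; with the kfp Python package, each step would be wrapped as a container component and composed under a @dsl.pipeline decorator. The step names and toy logic below are illustrative assumptions, not taken from the Kubeflow Pipelines documentation.

```python
# Illustrative two-step "pipeline" logic; step names and the toy math are
# assumptions for demonstration only.
def prep(raw):
    """Step 1: normalize raw values to the 0-1 range."""
    lo, hi = min(raw), max(raw)
    return [(x - lo) / (hi - lo) for x in raw]

def train(data):
    """Step 2: toy 'model' that reduces the prepared data to its mean."""
    return sum(data) / len(data)

# Chaining the steps locally:
print(train(prep([0, 5, 10])))  # 0.5

# With the kfp package installed, each step would become a component and the
# composition a pipeline (sketch only):
#   from kfp import dsl
#   @dsl.pipeline(name="demo-pipeline")
#   def demo_pipeline():
#       prepped = prep_op(...)
#       train_op(prepped.output)
```

Keeping the step logic in plain functions makes it easy to test locally before packaging the steps as pipeline components.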
The pipeline artifacts are stored in a dedicated pipelines directory in the platform's data store.