The MPI-Operator Horovod Service

The platform has a default (pre-deployed) shared, single-instance, tenant-wide Kubeflow MPI Operator service (mpi-operator), which facilitates the use of Uber's Horovod distributed deep-learning framework. Horovod, which is preinstalled as part of the platform's Jupyter Notebook service, is widely used for training machine-learning models in parallel across multiple GPUs or CPUs. For more information about using Horovod to run applications over GPUs, see Running Applications over GPUs.
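To illustrate what training "over multiple GPUs or CPUs" means in practice, here is a minimal conceptual sketch of the gradient averaging that data-parallel frameworks such as Horovod perform between workers. It is plain Python and does not use Horovod or the MPI Operator. The function name `allreduce_average` and the sample gradient values are hypothetical and chosen only for illustration.

```python
from typing import List


def allreduce_average(worker_grads: List[List[float]]) -> List[float]:
    """Return the element-wise mean of the per-worker gradient vectors.

    Hypothetical helper: it mimics the result of an averaging allreduce,
    not the way a real framework communicates between processes.
    """
    if not worker_grads:
        raise ValueError("need at least one worker")
    size = len(worker_grads[0])
    if any(len(g) != size for g in worker_grads):
        raise ValueError("all gradient vectors must have the same length")
    n = len(worker_grads)
    return [sum(g[i] for g in worker_grads) / n for i in range(size)]


# Made-up gradients, as if computed by three workers on different data shards.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
averaged = allreduce_average(grads)
print(averaged)
```

Each worker would then apply the same averaged gradient, which keeps the model replicas in sync. In a real job the workers are separate processes, typically one per GPU, and the averaging happens over the network rather than inside a single function call.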

