AWS Cloud Deployment Specifications

Overview

This document lists the hardware specifications for deploying version 3.5.5 of the MLOps Platform ("the platform") on the Amazon Web Services (AWS) cloud. For details about the AWS resources mentioned here, refer to the AWS documentation.

Docker Registries
The platform is deployed with a default, on-cluster Docker registry. Use this registry only for playground environments: it is volatile, and its images are lost when it restarts. Operational clusters must be connected to an external Docker registry. Create the registry in your cloud, then configure it either during installation (see Custom User Docker Registry in the installation instructions) or after installation through the platform dashboard.
Note
  • All references to the AWS cloud also apply to AWS Outposts, except where otherwise specified.
  • AWS platform deployments also require an Elastic IP address. For more information, see the AWS documentation.
  • All capacity calculations in the hardware specifications are performed using the base-10 (decimal) number system. For example, 1 TB = 1,000,000,000,000 bytes.
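
The decimal convention matters when comparing the disk sizes in this document with tools that report binary units (TiB, GiB). The following is a minimal sketch of the conversion; the function names are illustrative and are not part of the platform.

```python
# Unit sizes in bytes.
TB = 10**12   # decimal terabyte, the convention used in this document
GB = 10**9    # decimal gigabyte
TIB = 2**40   # binary tebibyte
GIB = 2**30   # binary gibibyte


def tb_to_tib(terabytes: float) -> float:
    """Convert decimal terabytes to binary tebibytes."""
    return terabytes * TB / TIB


def gb_to_gib(gigabytes: float) -> float:
    """Convert decimal gigabytes to binary gibibytes."""
    return gigabytes * GB / GIB


if __name__ == "__main__":
    # A 1.9 TB NVMe disk and a 400 GB boot disk, expressed in binary units.
    print(f"1.9 TB = {tb_to_tib(1.9):.3f} TiB")
    print(f"400 GB = {gb_to_gib(400):.1f} GiB")
```

A disk listed here as 1.9 TB therefore appears as roughly 1.73 TiB in a tool that reports binary units.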
Warning
  • Provisioning of the servers is handled automatically by the platform installer (Provazio).
    Don't attempt to provision the servers manually prior to the deployment.

  • The data-node instances include Non-Volatile Memory Express (NVMe) SSD-based instance storage, which is optimized for low latency, very high random I/O performance, and high sequential read throughput. Data on the NVMe storage does not persist when the instance is stopped.
    Don't shut down any of the data nodes, as doing so erases the data.

Hardware Configurations

Iguazio Playground

A single data-node and single application-node cluster implementation. This configuration is designed mainly for evaluation trials and doesn't include high availability (HA) or performance testing.

Note
You can deploy a Proof of Concept (POC) in the Iguazio playground configuration. Be aware that the POC cannot be used as a development environment or as a production environment.
AWS Production Cluster

AWS Data-Node Specifications

Data nodes in platform AWS cloud deployments must use one of the following EC2 instance types and fulfill the related specifications; choose the type that best fits your requirements.

  • i3.2xlarge

    Component                     Specification
    vCPUs                         8
    Memory                        61 GiB
    Data disks (local storage)    1 x 1.9 TB NVMe SSD
    OS boot disk (EBS volume)     General Purpose SSD (gp2); 400 GB (minimum)
    Usable storage capacity       Single data node (Playground): 1 TB
                                  3 nodes (Operational Cluster): 2 TB
  • i3.4xlarge

    Component                     Specification
    vCPUs                         16
    Memory                        122 GiB
    Data disks (local storage)    2 x 1.9 TB NVMe SSD
    OS boot disk (EBS volume)     General Purpose SSD (gp2); 400 GB (minimum)
    Usable storage capacity       Single data node (Playground): 2.5 TB
                                  3 nodes (Operational Cluster): 4 TB
  • i3.8xlarge

    Component                     Specification
    vCPUs                         32
    Memory                        244 GiB
    Data disks (local storage)    4 x 1.9 TB NVMe SSD
    OS boot disk (EBS volume)     General Purpose SSD (gp2); 400 GB (minimum)
    Usable storage capacity       3 nodes (Operational Cluster): 9 TB
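
To help with choosing a type, the data-node figures listed above can be captured as data and queried. The following is a minimal sketch: the names `DataNodeSpec`, `DATA_NODE_SPECS`, and `smallest_cluster_type` are hypothetical, and the selection rule (the smallest listed type whose 3-node usable capacity covers the requirement) is an illustrative assumption rather than official sizing guidance.

```python
from typing import NamedTuple, Optional


class DataNodeSpec(NamedTuple):
    instance_type: str
    vcpus: int
    memory_gib: int
    nvme_disks: int                        # number of 1.9 TB NVMe SSDs
    playground_usable_tb: Optional[float]  # None where no single-node value is listed
    cluster_usable_tb: float               # 3-node Operational Cluster


# Values copied from the data-node specifications above, smallest type first.
DATA_NODE_SPECS = [
    DataNodeSpec("i3.2xlarge", 8, 61, 1, 1.0, 2.0),
    DataNodeSpec("i3.4xlarge", 16, 122, 2, 2.5, 4.0),
    DataNodeSpec("i3.8xlarge", 32, 244, 4, None, 9.0),
]


def smallest_cluster_type(required_tb: float) -> Optional[str]:
    """Return the smallest listed type whose 3-node usable capacity
    is at least required_tb (decimal TB), or None if none qualifies."""
    for spec in DATA_NODE_SPECS:
        if spec.cluster_usable_tb >= required_tb:
            return spec.instance_type
    return None


if __name__ == "__main__":
    print(smallest_cluster_type(3))   # a 3 TB requirement
    print(smallest_cluster_type(10))  # exceeds every listed configuration
```

Capacity is only one input; vCPU and memory requirements may call for a larger type than this rule returns.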

AWS Outposts Note
For deployment on AWS Outposts, currently only the i3en.6xlarge EC2 instance type is supported.

AWS Application-Node Specifications

Application nodes in platform AWS cloud deployments are supported on Elastic Kubernetes Service (EKS) and must use one of the following instance types. Choose the type that best fits your requirements. For specification details for each type, refer to the AWS documentation.

Note
All of the supported application-node configurations also require a 250 GB (minimum) General Purpose SSD (gp2) OS boot disk (EBS volume).
CPU-Based Instances
  • m5 instance family (4xlarge or larger)
  • r5 instance family (4xlarge or larger)

GPU-Based Instances

  • p3.2xlarge (trial only)
  • p3.8xlarge
  • p3.16xlarge
  • g4dn.12xlarge
  • g4dn.16xlarge
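
The application-node lists above can be expressed as a simple pre-deployment check. The following is a minimal sketch with hypothetical names. It assumes that "m5" and "r5" mean exactly those family prefixes (variants such as m5d are not matched) and that only sizes of the form `<N>xlarge` with N of 4 or more qualify; the source does not say whether `metal` sizes are supported, so this sketch rejects them.

```python
import re

# GPU-based types listed above; p3.2xlarge is marked "trial only".
GPU_TYPES = {"p3.2xlarge", "p3.8xlarge", "p3.16xlarge",
             "g4dn.12xlarge", "g4dn.16xlarge"}
TRIAL_ONLY_TYPES = {"p3.2xlarge"}

# CPU-based families listed above, and the minimum "<N>xlarge" multiplier.
CPU_FAMILIES = {"m5", "r5"}
MIN_XLARGE_MULTIPLIER = 4


def is_supported_app_node(instance_type: str, trial: bool = False) -> bool:
    """Check an EC2 instance type against the application-node lists.

    Assumption: names such as "m5.large", "m5.xlarge", or "m5.metal"
    do not match the "<N>xlarge" pattern and are treated as unsupported.
    """
    if instance_type in GPU_TYPES:
        return trial or instance_type not in TRIAL_ONLY_TYPES
    match = re.fullmatch(r"([a-z0-9]+)\.(\d+)xlarge", instance_type)
    if match is None:
        return False
    family, multiplier = match.group(1), int(match.group(2))
    return family in CPU_FAMILIES and multiplier >= MIN_XLARGE_MULTIPLIER


if __name__ == "__main__":
    for name in ("m5.4xlarge", "r5.2xlarge", "p3.2xlarge", "g4dn.12xlarge"):
        print(name, is_supported_app_node(name))
```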

Backing up the Platform

Caution
To keep your MLOps Platform data safe, you must periodically back up your data and configuration.

Allocate one Elastic File System (EFS) storage drive in your EKS cluster for backing up your platform. For full details, see Backing Up the Platform.
