Installing the Platform on an AWS Cloud

Overview

This guide outlines the required steps for installing (deploying) an instance of the Iguazio Data Science Platform ("the platform") to an Amazon Web Services (AWS) cloud (including AWS Outposts). When you complete the procedure, you'll have a platform instance running under your AWS account. The installation is done by using the platform installer — Provazio — with your AWS credentials.

Note
  • The deployment procedure requires proficiency in Systems Operations on AWS, and is typically completed in 1–2 hours.
  • Do not use the AWS root user for any deployment operations.
Warning
  • Provisioning of the servers is handled automatically by the platform installer (Provazio).
    Don't attempt to provision the servers manually prior to the deployment.

  • The data-node instances include Non-Volatile Memory Express (NVMe) SSD-based instance storage, which is optimized for low latency, very high random I/O performance, and high sequential read throughput. The data doesn't persist on the NVMe if the instance is stopped.
    Don't attempt to shut down any of the data nodes, as it will erase the data.
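
The root-user restriction in the note above can be checked before you start. The following is a minimal sketch, assuming that the AWS CLI is installed and configured with the credentials you plan to use; `is_root_arn` is a hypothetical helper name, not part of the platform or of Provazio.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the given ARN is an AWS
# account root-user ARN, which has the form arn:aws:iam::<account-id>:root.
is_root_arn() {
    case "$1" in
        arn:aws*:iam::*:root) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (requires the AWS CLI and valid credentials):
#   arn="$(aws sts get-caller-identity --query Arn --output text)"
#   if is_root_arn "$arn"; then
#       echo "Refusing to continue: these credentials belong to the root user" >&2
#       exit 1
#   fi
```

The ARN check is kept separate from the AWS CLI call so that it can be exercised without network access.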

Prerequisites

Before you begin, ensure that you have the following:

  1. A Provazio API key and a Provazio vault URL, received from Iguazio.
  2. Administrative access to an AWS account.
  3. Confirmation from Iguazio's support team that platform Amazon Machine Images (AMIs) were configured with proper permissions for your AWS account.
  4. A machine running Docker.
  5. Access to the internet, or a preloaded Provazio Docker image (quay.io/iguazio/provazio-dashboard:stable), received from Iguazio as an image archive (provazio-latest.tar.gz).
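
For a machine without internet access, the image archive mentioned in the last prerequisite can be loaded into the local Docker image store. The sketch below assumes that the archive contains the quay.io/iguazio/provazio-dashboard:stable tag; `load_provazio_image` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: load the Provazio image from the archive supplied by Iguazio and
# confirm that the expected tag is available locally afterward.
# Assumption: the archive holds the quay.io/iguazio/provazio-dashboard:stable tag.
load_provazio_image() {
    archive="${1:-provazio-latest.tar.gz}"
    image="quay.io/iguazio/provazio-dashboard:stable"

    if [ ! -f "$archive" ]; then
        echo "Image archive not found: $archive" >&2
        return 1
    fi

    docker load -i "$archive" || return 1
    # "docker image inspect" fails when the image is not present locally.
    docker image inspect "$image" >/dev/null
}

# Usage:
#   load_provazio_image ./provazio-latest.tar.gz
```

If the image was loaded this way, the `docker pull` part of the installer command is not needed and would fail without internet access.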

Deployment Steps

To deploy an instance of the platform to an AWS cloud, execute the following steps.

Step 1: Create an IAM user | Step 2: Create an AWS instance profile | Step 3: Configure the installation environment | Step 4: Run the platform installer | Step 5: Access the installer dashboard | Step 6: Choose the AWS scenario | Step 7: Configure general parameters | Step 8: Configure cluster parameters | Step 9: Configure cloud parameters | Step 10: Review the settings | Step 11: Wait for completion

Step 1: Create an IAM User

Follow the Creating an AWS IAM User guide to create a restricted AWS IAM user with the required credentials for performing the installation. The IAM user is required only during the installation and can be deleted afterward, as explained in the post-deployment how-to.

Step 2: Create an AWS Instance Profile

Follow the Creating an AWS IAM Role and Instance Profile guide to create an AWS instance profile with a restricted IAM role that allows the platform's Amazon Elastic Compute Cloud (EC2) instances to call the AWS API.

Step 3: Configure the Installation Environment

Create a /tmp/env.yaml configuration file with the following environment information.

dashboard:
  frontend:
    cloud_provider_regions:
      aws:
      - <AWS Region>

client:
  infrastructure:
    ec2:
      access_key_id: <Access Key ID>
      secret_access_key: <Secret Access Key>
      data_cluster_instance_profile: IguazioDataScienceNode
      app_cluster_instance_profile: IguazioDataScienceNode

  vault:
    api_key: <Provazio API Key>
    url: <Provazio vault URL>

provisioning:
  whitelisted_services: ["*"]

Replace the <...> placeholders with the information for your environment:

AWS Region
A list of one or more AWS regions that you'd like to choose from (for example, "us-east-2").
Access Key ID
The AWS Access Key ID for the IAM user created in Step 1.
Secret Access Key
The AWS Secret Access Key for the IAM user created in Step 1.
Provazio API Key
A Provazio API key, received from Iguazio (see the installation prerequisites).
Provazio vault URL
A Provazio vault URL, received from Iguazio (see the installation prerequisites).
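
As one possible way to fill in the placeholders without editing the file by hand, the configuration can be rendered from environment variables. This is a sketch only: the function name `write_env_yaml` and all of the variable names are illustrative choices, and the instance-profile name is copied from the template above.

```shell
#!/bin/sh
# Sketch: render the installer configuration from environment variables.
# The variable names below are illustrative, not names required by Provazio.
write_env_yaml() {
    out="${1:-/tmp/env.yaml}"

    # Stop early if any required value is missing or empty.
    for name in AWS_REGION AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY \
                PROVAZIO_API_KEY PROVAZIO_VAULT_URL; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            return 1
        fi
    done

    cat > "$out" <<EOF
dashboard:
  frontend:
    cloud_provider_regions:
      aws:
      - ${AWS_REGION}

client:
  infrastructure:
    ec2:
      access_key_id: ${AWS_ACCESS_KEY_ID}
      secret_access_key: ${AWS_SECRET_ACCESS_KEY}
      data_cluster_instance_profile: IguazioDataScienceNode
      app_cluster_instance_profile: IguazioDataScienceNode

  vault:
    api_key: ${PROVAZIO_API_KEY}
    url: ${PROVAZIO_VAULT_URL}

provisioning:
  whitelisted_services: ["*"]
EOF
}

# Usage:
#   AWS_REGION=us-east-2 ... write_env_yaml /tmp/env.yaml
```

The resulting file contains the IAM user's secret access key and the Provazio API key, so treat it as a secret and delete it when the installation is done.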

Step 4: Run the Platform Installer

Start the platform installer, Provazio, by running the following command from a command-line shell:

docker pull quay.io/iguazio/provazio-dashboard:stable && docker run --rm --name provazio-dashboard \
    -v /tmp/env.yaml:/tmp/env.yaml \
    -e PROVAZIO_ENV_SPEC_PATH=/tmp/env.yaml \
    -p 8060:8060 \
    quay.io/iguazio/provazio-dashboard:stable
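
Before starting the container, it can help to confirm that the mounted configuration file exists and that no `<...>` placeholder was left unreplaced. The following is a minimal sketch; `check_env_yaml` is a hypothetical helper, and it assumes that no legitimate value in the file contains text in angle brackets.

```shell
#!/bin/sh
# Sketch: refuse to continue while the configuration file is missing or still
# contains unreplaced <...> placeholders.
check_env_yaml() {
    file="${1:-/tmp/env.yaml}"

    if [ ! -r "$file" ]; then
        echo "Cannot read $file" >&2
        return 1
    fi

    # Print any line that still holds a placeholder, with its line number.
    if grep -n '<[^>]*>' "$file" >&2; then
        echo "Replace the placeholders listed above in $file" >&2
        return 1
    fi
}

# Usage:
#   check_env_yaml /tmp/env.yaml && echo "Configuration looks complete"
```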

Step 5: Access the Installer Dashboard

In a web browser, browse to localhost:8060 to view the Provazio dashboard.

Installer-UI home page

Select the plus-sign icon (+) to create a new system.

Step 6: Choose the AWS Scenario

On the Installation Scenario page, check AWS, and then select Next.

Choose scenario

Step 7: Configure General Parameters

On the General page, fill in the configuration parameters, and then select Next.

General settings

System Name

A platform name (ID) of your choice (for example, "my-platform"). The installer prepends this value to the value of the System Domain parameter to create the full platform domain.

  • Valid Values: A string of 1–12 characters; can contain lowercase letters (a–z) and hyphens (-); must begin with a lowercase letter
  • Default Value: A randomly generated lowercase string
Description
A free-text string that describes the platform instance.
System Version

The platform version. This is auto-populated based on the AMIs that you have access to in the region, so make sure to set the Region parameter.

Owner Full Name
An owner-name string, containing the full name of the platform owner, for bookkeeping.
Owner Email
An owner-email string, containing the email address of the platform owner, for bookkeeping.
Username

The username of a platform user to be created by the installation. This username is used together with the configured password to log in to the platform dashboard. You can add more users after the platform is provisioned.

User Password

The password of the platform user created by the installation, to be used with the configured username to log in to the platform dashboard; see the password restrictions. You can change this password after the platform is provisioned.

Region
The region in which to install the platform.
System Domain

A custom platform domain (for example, "customer.com"). The installer prepends the value of the System Name parameter to this value to create the full platform domain.

Allocate Public IP Addresses
Check this option to allocate public IP addresses to all of the platform nodes (EC2 instances).
Termination Protection
The protection level for terminating the platform installation from the installer dashboard.
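
The valid-values rule for the System Name parameter can be checked in a script before you type the name into the installer. This sketch encodes only the rule as stated above (1 to 12 characters, lowercase letters and hyphens, beginning with a lowercase letter); `is_valid_system_name` is a hypothetical helper, and the installer's own validation remains authoritative.

```shell
#!/bin/sh
# Sketch: check a candidate system name against the documented rule.
# The body runs in a subshell so that the locale change does not leak out.
is_valid_system_name() (
    LC_ALL=C   # make [a-z] mean exactly the 26 ASCII lowercase letters
    name="$1"

    # Length must be between 1 and 12 characters.
    [ "${#name}" -ge 1 ] && [ "${#name}" -le 12 ] || exit 1

    case "$name" in
        [!a-z]*)   exit 1 ;;   # must begin with a lowercase letter
        *[!a-z-]*) exit 1 ;;   # only lowercase letters and hyphens are allowed
    esac
    exit 0
)

# Usage:
#   is_valid_system_name "iguazio-dev" && echo "name is acceptable"
```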

Step 8: Configure Cluster Parameters

On the Clusters page, fill in the configuration parameters, and then select Next. For additional information and guidelines, see the AWS resource-calculation guide.

Cluster settings

Common Parameters (Data and Application Clusters)

The following parameters are set for both the data and application clusters. Node references in the parameter descriptions apply to the platform's data nodes for the data cluster and application nodes for the application cluster.

# of Nodes
The number of nodes (EC2 instances) to allocate for the cluster.
Node Size
The EC2 instance type, which determines the size of the cluster's nodes.
Root Block Device Type

The Amazon Elastic Block Store (EBS) volume type for the control plane.

  • Default Value: EBS General Purpose SSD (gp2), which provides a good balance between performance and cost. Note that the data plane uses high-speed NVMe storage.
Root Block Device Size
The size of the EBS volume for the control plane.
Storage Encryption Kind [Tech Preview]
The type of encryption to be applied.

Application-Cluster Parameters

The following parameters are applicable only to the platform's application cluster.

EKS Application-Cluster Note
The following instructions are specific to deployment of a managed vanilla application cluster. To deploy an Amazon Elastic Kubernetes Service (Amazon EKS) application cluster, follow the Deploying an Amazon EKS Application Cluster guide instead.
Kubernetes Kind
Leave this set to New Vanilla Cluster (Iguazio Managed).

Step 9: Configure Cloud Parameters

On the Cloud page, fill in the configuration parameters, and then select Next.

VPC mode

The cloud configuration configures the platform's virtual private cloud (VPC) networking. You can select between two alternative VPC modes:

  • New — Create a new VPC and install the platform in this VPC.
  • Existing — Install the platform in an existing VPC.

The following optional parameters are applicable to both VPC modes (see the example UI screen shots for the different VPC-mode configurations later in this step):

Region Name

Overrides the value of the Region general-configuration parameter.

Access Key ID

Overrides the value of the Access Key ID environment-configuration parameter.

Note
This parameter should typically not be set.
Secret Access Key

Overrides the value of the Secret Access Key environment-configuration parameter.

Note
This parameter should typically not be set. If you find the need to set it, consult Iguazio personnel first.
Verbose Provisioning

Enables highly verbose provisioning logs.

Note
Leave this parameter unchecked unless instructed otherwise by Iguazio personnel.
Placement Kind

An AWS Placement Group.

Note
Don't change the default value of this parameter unless instructed otherwise by Iguazio personnel.
Security-Group Parameters

The following parameters are used for configuring network security groups. For more information, see the AWS network security-groups configuration guide.

Whitelisted CIDRs
A list of classless inter-domain routing (CIDR) addresses to be granted access to the platform's service port (for example, "200.40.0.1/32"). This parameter is typically relevant when the platform has public IP addresses. For a platform without public IP addresses, you can leave this parameter empty, assuming you have access to the VPC from your network.
Installer CIDR
The CIDR of the machine on which you're running the platform installer (for example, "10.0.0.1/32").
Allow Access from Iguazio Support

Check this option to allow Iguazio's support team to access the platform nodes from the Iguazio network. This parameter is applicable only when the platform has public IP addresses (see the Allocate Public IP Addresses general-configuration parameter).
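
The CIDR examples above end in /32, which denotes a single host address. As an illustration, the sketch below turns one IPv4 address into that form after a basic sanity check; `to_host_cidr` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: validate a dotted-quad IPv4 address and print it as a /32 CIDR.
# The body runs in a subshell so that the IFS change does not leak out.
to_host_cidr() (
    ip="$1"

    # Reject empty input, non-numeric characters, and misplaced dots.
    case "$ip" in
        "" | *[!0-9.]* | .* | *. | *..*) exit 1 ;;
    esac

    IFS=.
    set -f          # no filename expansion while splitting
    set -- $ip      # split the address into its octets
    [ "$#" -eq 4 ] || exit 1

    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || exit 1
    done

    printf '%s/32\n' "$ip"
)

# Usage:
#   to_host_cidr 10.0.0.1
# If the installer machine reaches the platform through a public address,
# that address could be looked up first, for example:
#   to_host_cidr "$(curl -s https://checkip.amazonaws.com)"
```

Which address to use depends on how the installer machine reaches the platform nodes; the lookup through checkip.amazonaws.com is relevant only for access over the public internet.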

In addition to the common parameters, there are parameters that are specific to the selected VPC mode:

New-VPC Configuration

The following parameters are applicable only to the New VPC mode:

CIDR
The CIDR of the VPC.
Subnet CIDRs

The CIDRs of the VPC's subnets. The number of CIDRs translates to the number of subnets.

Note
For a managed vanilla application cluster, you can currently configure only one subnet, which ensures that all platform nodes use the same availability zone. A deployment with a single availability zone allows minimal latency and doesn't incur data-transfer costs. When deploying an EKS application cluster, you need to configure two subnets (for two availability zones) to fulfill EKS requirements; however, the platform uses only the first configured subnet. To use multiple availability zones (via multiple subnets), contact Iguazio for a quote. Note that while deployment with multiple availability zones offers improved availability when an availability zone is down, it has a performance impact and entails high network-utilization costs, and therefore might not fit your requirements.
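
Each subnet CIDR must fall inside the VPC CIDR, which is easy to get wrong when typing the values by hand. The following sketch checks containment for IPv4 CIDRs using shell arithmetic; `ip_to_int` and `cidr_contains` are hypothetical helpers, and the addresses in the usage comment are arbitrary examples, not recommended values.

```shell
#!/bin/sh
# Sketch: test whether one IPv4 CIDR block lies entirely inside another.
# Assumes well-formed input with no leading zeros in the octets.

# Convert a dotted-quad address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -f
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# cidr_contains <outer-cidr> <inner-cidr>
# Succeeds when the inner block is fully contained in the outer block.
cidr_contains() (
    outer_ip="${1%/*}"; outer_len="${1#*/}"
    inner_ip="${2%/*}"; inner_len="${2#*/}"

    # The inner block must be at least as specific as the outer block.
    [ "$inner_len" -ge "$outer_len" ] || exit 1

    # Network mask of the outer block, for example /16 gives 0xFFFF0000.
    mask=$(( (0xFFFFFFFF << (32 - $outer_len)) & 0xFFFFFFFF ))

    outer_net=$(( $(ip_to_int "$outer_ip") & $mask ))
    inner_net=$(( $(ip_to_int "$inner_ip") & $mask ))
    [ "$outer_net" -eq "$inner_net" ]
)

# Usage:
#   cidr_contains 172.31.0.0/16 172.31.0.0/24 && echo "subnet fits in the VPC"
```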

New VPC

Existing-VPC Configuration

The following parameters are applicable only to the Existing VPC mode:

VPC ID
The ID of the VPC in which to install the platform.
CIDR
The CIDR block of the chosen VPC to use for the platform (some VPCs have multiple CIDRs).
Subnet IDs

The IDs of the subnets within the VPC or of a subset of these subnets.
The installation currently supports two subnets for an EKS application cluster and only a single subnet otherwise. For details, see the note for the Subnet CIDRs new-VPC configuration parameter.

Security Group Mode
Leave this set to New.
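
The values for the VPC ID, CIDR, and Subnet IDs parameters can be looked up with the AWS CLI. This is a sketch that assumes the AWS CLI is installed and that the configured credentials are allowed to describe VPCs and subnets; the function names and the VPC ID in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: list the CIDR blocks and subnets of an existing VPC.

# list_vpc_cidrs <vpc-id> <region>
# Prints every CIDR block associated with the VPC, one per line.
list_vpc_cidrs() {
    aws ec2 describe-vpcs \
        --region "$2" \
        --vpc-ids "$1" \
        --query 'Vpcs[].CidrBlockAssociationSet[].CidrBlock' \
        --output text | tr '\t' '\n'
}

# list_vpc_subnets <vpc-id> <region>
# Prints the ID, CIDR block, and availability zone of each subnet in the VPC.
list_vpc_subnets() {
    aws ec2 describe-subnets \
        --region "$2" \
        --filters "Name=vpc-id,Values=$1" \
        --query 'Subnets[].[SubnetId,CidrBlock,AvailabilityZone]' \
        --output table
}

# Usage:
#   list_vpc_cidrs   vpc-0123456789abcdef0 us-east-2
#   list_vpc_subnets vpc-0123456789abcdef0 us-east-2
```

The availability-zone column helps when choosing subnets, because all platform nodes of a managed vanilla application cluster share a single subnet and therefore a single zone.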

Existing VPC

Step 10: Review the Settings

On the Review page, review and verify your configuration; go back and make edits, as needed; and then select Create to provision a new instance of the platform.

Review

Step 11: Wait for Completion

Provisioning a new platform instance typically takes around 30–40 minutes, regardless of the cluster sizes. You can download the provisioning logs, at any stage, by selecting Download logs from the instance's action menu.

Download logs

You can also follow the installation progress by tracking the Provazio Docker container logs.
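
Assuming the installer was started with the container name provazio-dashboard, as in the command from Step 4, the container logs can be followed as sketched below; `follow_installer_logs` is a hypothetical wrapper name.

```shell
#!/bin/sh
# Sketch: stream the installer container's logs, starting from the most
# recent lines. The optional argument sets how many past lines to show first.
follow_installer_logs() {
    docker logs --follow --tail "${1:-100}" provazio-dashboard
}

# Usage (press Ctrl+C to stop following):
#   follow_installer_logs
#   follow_installer_logs 500
```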

When the installation completes, you should have a running instance of the platform in your cloud. You can use the Provazio dashboard to view the installed nodes (EC2 instances). Then, proceed to the post-deployment steps.

Post-Deployment Steps

When the deployment completes, follow the post-deployment steps.
