Creating Python Virtual Environments with Conda

Overview

A Python virtual environment is a named, isolated, working copy of Python that maintains its own files, directories, and paths so that you can work with specific versions of libraries or Python itself without affecting other Python projects. Virtual environments make it easy to cleanly separate projects and avoid problems with different dependencies and version requirements across components.

The Conda command-line interface (CLI) is the preferred interface for managing installations and virtual environments with the Anaconda Python distribution. The Jupyter service of the Iguazio Data Science Platform ("the platform") comes pre-deployed with Conda. This tutorial explains how to use Conda to create a Python virtual environment that will be available as a custom kernel in JupyterLab. For general information about using Conda to create virtual environments, see the Conda documentation.

Preconfigured Conda Environments

The platform provides several preconfigured Conda environments that are included in the Jupyter pod:

  • base: This is the default base environment that includes the Conda binaries. Do not use it for development purposes.
  • jupyter: This environment includes the JupyterLab server and all of its dependencies. Do not use it for development purposes.
  • mlrun-base: This environment includes MLRun and all its dependencies preinstalled.
  • mlrun-extended: This environment includes all the packages from mlrun-base, as well as additional packages such as TensorFlow, PyTorch, and scikit-learn, which are required for the demo notebooks.

Note that Conda installations to these preconfigured environments are not persistent: they are reset when the Jupyter service is restarted.

However, when these Conda environments are activated, the PIP_PREFIX and PYTHONPATH environment variables are automatically set to the data mount directory of the running user: /User/.pythonlibs/<environment name>. This means that PIP installations persist even after restarting the Jupyter pod.

Any new Conda environment that is created or cloned will be fully located in the data mount directory of the running user, specifically at /User/.conda/<environment name>. These environments are fully persistent.
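
The persistence rules above can be summarized in a small shell helper. This is a minimal sketch, not part of the platform: the describe_persistence function name is hypothetical, and the paths it prints are only the ones stated in this section. It relies on CONDA_DEFAULT_ENV, which Conda sets to the name of the active environment.

```shell
# Hypothetical helper: report where installations for an environment are
# expected to persist, based on the paths described in this section.
describe_persistence() {
    # Use the first argument, or fall back to the active Conda environment.
    dp_env="${1:-${CONDA_DEFAULT_ENV:-}}"
    if [ -z "$dp_env" ]; then
        echo "no environment name given and no Conda environment is active" >&2
        return 1
    fi
    case "$dp_env" in
        base|jupyter|mlrun-base|mlrun-extended)
            # Preconfigured environments: only PIP installations persist.
            echo "PIP packages: /User/.pythonlibs/$dp_env"
            echo "Conda packages: not persistent (reset on service restart)"
            ;;
        *)
            # Environments that you create or clone live on the data mount.
            echo "environment directory: /User/.conda/$dp_env"
            ;;
    esac
}
```

For example, running describe_persistence myenv prints the /User/.conda/myenv directory, while describe_persistence mlrun-base prints the PIP location followed by a reminder that Conda installations are reset.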

Setting Up a Virtual Environment Using Conda

Follow these steps from your Jupyter service to create a Python virtual environment using Conda:

  1. Create a new terminal by selecting the New Launcher option (+ icon) from the top action toolbar in the left sidebar, and then selecting Terminal from the main work area.
    The next steps should be executed from your new terminal, except where otherwise specified.

  2. Create a new Python virtual environment by running the following command. Replace <environment name> with your preferred virtual-environment name:

    conda create -n <environment name>
    

    For example, the following command creates an environment named "myenv":

    conda create -n myenv
    
  3. Activate the Conda environment by running the following command:

    conda activate <environment name>
    

    For example, the following command activates an environment named "myenv":

    conda activate myenv
    
  4. Once you activate your environment, use either PIP or Conda to install the necessary Python packages.
    In the following commands, replace <package> with the name of the package that you want to install, and optionally append ==<version> for PIP or =<version> for Conda to pin a specific version. You can specify multiple packages in the same command.

    Using PIP:

    pip install <package> [<package> ...]
    

    Alternatively, using Conda:

    conda install <package> [<package> ...]
    

    For example, the following command uses PIP to install the SciPy, pandas version 1.4.4, and TensorFlow version 2.9.3 packages for the "myenv" environment that you activated in the previous step:

    pip install scipy pandas==1.4.4 tensorflow==2.9.3
    
  5. Export your new virtual environment to an environment file in a platform data container by running the following command. Replace <container name> with the name of a platform data container, <directory path> with an optional relative container-directories path, and <environment name> with the name of the environment that you created:

    conda env export -n <environment name> > /v3io/<container name>[/<directory path>]/<environment name>.yaml
    

    It is recommended that you save the environment to a virtual-environments directory in your running-user home directory (/v3io/users/<running user>).

    For example, the following command creates a /v3io/users/<running user>/virtual_env/myenv.yaml file:

    conda env export -n myenv > /v3io/users/$V3IO_USERNAME/virtual_env/myenv.yaml
    

    To shorten this command, use the /User data mount to the running-user directory (see Platform Data Containers):

    conda env export -n myenv > /User/virtual_env/myenv.yaml
    
  6. Refresh the JupyterLab UI to apply your changes.
    After refreshing the UI, you should see your new environment in the list of available kernels in JupyterLab.
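
The terminal steps above can also be collected into a script. The following is a minimal sketch under stated assumptions: env_file_path and create_and_export_env are hypothetical helper names, and the wrapper targets the environment with the -n option of conda install instead of activating it, so packages use the Conda =<version> syntax. Defining the functions does not run any Conda command.

```shell
# Hypothetical helper: build the environment-file path used in Step 5.
# $1 = container name, $2 = optional relative directory, $3 = environment name
env_file_path() {
    if [ -n "$2" ]; then
        printf '/v3io/%s/%s/%s.yaml\n' "$1" "$2" "$3"
    else
        printf '/v3io/%s/%s.yaml\n' "$1" "$3"
    fi
}

# Hypothetical wrapper around Steps 2, 4, and 5.
# $1 = environment name, $2 = export file path, remaining args = packages
create_and_export_env() {
    cae_name="$1"
    cae_file="$2"
    shift 2
    # Step 2: create the environment (-y skips the confirmation prompt).
    conda create -y -n "$cae_name" || return 1
    # Step 4: install packages into the named environment, if any were given.
    if [ "$#" -gt 0 ]; then
        conda install -y -n "$cae_name" "$@" || return 1
    fi
    # Step 5: make sure the target directory exists, then export.
    mkdir -p "$(dirname "$cae_file")" || return 1
    conda env export -n "$cae_name" > "$cae_file"
}

# Example call (commented out so that sourcing this file has no side effects):
# create_and_export_env myenv \
#     "$(env_file_path users "$V3IO_USERNAME/virtual_env" myenv)" scipy pandas=1.4.4
```

Refreshing the JupyterLab UI (Step 6) remains a manual action after the script finishes.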

Creating a Conda Virtual Environment from a File

If, for any reason, your Conda environment is removed from JupyterLab, you can easily deploy it again by using the YAML environment file that you exported in Step 5 of the setup procedure:

  1. Open a new Jupyter terminal.

  2. Run the following command to recreate the environment from the environment file. Replace <container name>, <directory path>, and <environment name> to match the path of the environment file that you saved as part of the initial setup:

    conda env create --file /v3io/<container name>[/<directory path>]/<environment name>.yaml
    

    For example, the following command loads a /v3io/users/<running user>/virtual_env/myenv.yaml environment file.
    The command uses the /User data mount to the running-user directory in the "users" container:

    conda env create --file /User/virtual_env/myenv.yaml
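
Before recreating an environment, it can be useful to check whether it still exists. The following is a minimal sketch, assuming the usual conda env list output in which comment lines start with # and the first column holds the environment name; env_listed and restore_env are hypothetical helper names.

```shell
# Hypothetical helper: succeed if the named environment appears in
# `conda env list`-style output read from standard input.
env_listed() {
    awk -v name="$1" '
        /^#/ { next }               # skip header comment lines
        $1 == name { found = 1 }    # first column holds the environment name
        END { exit found ? 0 : 1 }
    '
}

# Hypothetical wrapper: recreate the environment only when it is missing.
# $1 = environment name, $2 = path to the exported YAML environment file
restore_env() {
    if conda env list | env_listed "$1"; then
        echo "environment '$1' already exists; nothing to do"
    elif [ -f "$2" ]; then
        conda env create --file "$2"
    else
        echo "environment file not found: $2" >&2
        return 1
    fi
}

# Example call (commented out so that sourcing this file has no side effects):
# restore_env myenv /User/virtual_env/myenv.yaml
```

Reading the listing from standard input keeps the name check independent of Conda itself, so it can be tried on saved output.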
    

Cloning an Existing Conda Virtual Environment

To clone an existing Conda environment into a new one, follow these steps:

  1. Open a new Jupyter terminal.

  2. Create a new Conda environment by cloning an existing one.
    Replace <environment name> with the name of the new environment and <source environment name> with the name of the environment to clone:

    conda create -n <environment name> --clone <source environment name>
    


    For example, the following command clones the "mlrun-base" environment into "mlrun-clone":

    conda create -n mlrun-clone --clone mlrun-base
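
One way to sanity-check a clone is to compare the package lists of the two environments. The following is a minimal sketch, assuming that each list was saved with conda list -n <environment name> --export, which writes one name=version=build line per package after a few # comment lines; diff_package_lists is a hypothetical helper name.

```shell
# Hypothetical helper: print package specs that appear in only one of two
# `conda list --export` files; prints nothing when the lists match.
# $1 = export file of the source environment, $2 = export file of the clone
diff_package_lists() {
    dpl_a=$(mktemp) || return 1
    dpl_b=$(mktemp) || return 1
    # Drop comment lines and sort, because comm requires sorted input.
    grep -v '^#' "$1" | sort > "$dpl_a"
    grep -v '^#' "$2" | sort > "$dpl_b"
    # -3 suppresses lines common to both files, leaving only the differences.
    comm -3 "$dpl_a" "$dpl_b"
    rm -f "$dpl_a" "$dpl_b"
}

# Example usage (commented out; requires Conda and both environments):
# conda list -n mlrun-base --export > /tmp/source.txt
# conda list -n mlrun-clone --export > /tmp/clone.txt
# diff_package_lists /tmp/source.txt /tmp/clone.txt
```

Empty output suggests that the clone contains the same Conda-tracked packages as the source.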
    

Setting Up a RAPIDS Conda Environment with cuDF and cuML

To use the cuDF and cuML RAPIDS libraries, you need to create a RAPIDS Conda environment. Use the following command to create a RAPIDS Conda environment named "rapids":

conda create -n rapids -c rapidsai -c conda-forge -c nvidia rapids=22.12 cudatoolkit=11.7.0