10. Containers

10.1. Overview

Application containers provide a way to package software with complex dependencies for use during workflow execution. Starting with Pegasus 4.8.0, Pegasus supports application containers in the nonsharedfs and condorio data configurations using PegasusLite. Pegasus currently supports the following container technologies:

  1. Docker

  2. Singularity/Apptainer

  3. Shifter

The worker package is not required to be pre-installed in images. If a matching worker package is not present in the image, Pegasus will try to determine which package is required and download it.

10.2. Singularity vs Apptainer

The Singularity project moved into the Linux Foundation in November 2021, with an associated name change to Apptainer. To preserve backward compatibility, all Apptainer releases include a symlink named singularity that points to the apptainer executable.

The above notwithstanding, Pegasus has been updated to prefer the apptainer executable over the singularity executable. In the case where a job runs on a node where only an old Singularity install exists, Pegasus jobs will fall back to using the singularity executable.

In the workflow APIs, you should continue to tag containers as type Singularity when using Pegasus.
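
For example, a container intended to run under Apptainer is still declared with type Container.SINGULARITY. A minimal sketch, where the container name and image URL are placeholders:

from Pegasus.api import *

# Declared as SINGULARITY even on sites where only Apptainer is installed;
# at runtime the PegasusLite job will prefer the apptainer executable if present.
analysis_container = Container(
    "analysis-container",
    Container.SINGULARITY,
    image="http://example.com/images/analysis.sif"  # placeholder URL
)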

10.3. Configuring Workflows To Use Containers

Container support is provided by the Pegasus.api.transformation_catalog Python module. Use this to generate the transformation catalog for your workflow when containers are used.

For each container used by your workflow transformations, create a Container and add it to your TransformationCatalog.

from Pegasus.api import *

# Create and add containers to the TransformationCatalog.
tc = TransformationCatalog()

# A container that will be used to execute the following transformations.
tools_container = Container(
    "tools-container",
    Container.DOCKER,
    image="docker:///ryantanaka/preprocess:latest",
    arguments="--shm-size=2g",
    bypass_staging=True
)

# Add the container to the TransformationCatalog.
tc.add_containers(tools_container)

When this container is run, anything specified in the arguments parameter is added as a command-line argument to the docker container run command that Pegasus generates. If bypass_staging is set to True, the container image will be pulled directly onto the compute node where the job runs, rather than first being placed on an intermediate staging site.

Then, when creating Transformations, pass the appropriate container to the container parameter of the constructor. When is_stageable is set to False, the transformation is installed at the given pfn inside the container. If is_stageable is set to True, the executable will be staged into the container and then executed there. Note that in this case the transformation should be hosted somewhere accessible via a protocol such as HTTP.

# Create and add our transformations to the TransformationCatalog.

# An executable that is installed inside of "tools_container" to be run on site "condorpool"
preprocess = Transformation(
               "preprocess",
               site="condorpool",
               pfn="/usr/local/bin/preprocess.sh",
               is_stageable=False,
               container=tools_container
            )

# A stageable python script that must be executed inside tools_container
# because it depends on packages that were available when we developed it
# locally but may not be installed on a compute node. Note that "pfn" is not
# a file URL but an http one, because pegasus-transfer will need to fetch
# the file into the container.
process_text_2nd_pass = Transformation(
                           "process_text_2nd_pass.py",
                           site="workflow-cloud",
                           pfn="http://www.isi.edu/~tanaka/process_text_2nd_pass.py",
                           is_stageable=True,
                           container=tools_container
                        )


# Add the newly created transformations to the TransformationCatalog
tc.add_transformations(preprocess, process_text_2nd_pass)

10.4. Containers on OSG

OSG has its own way of handling container deployments for jobs that is hidden from the user and hence from Pegasus. Users are not allowed to run an image directly by invoking docker run or singularity exec. Instead, the HTCondor job wrappers deployed on OSG do it for you, based on the classads associated with the job. As a result, for a workflow to run on OSG, you cannot specify or describe the container in the transformation catalog. Instead, you catalog the executables without a container reference, and the path to each executable is its path inside the container you want to use (see the sketch after the site catalog example below). To specify the container that needs to be set up, you specify the following Condor profiles:

  • requirements

  • +SingularityImage

For example, you can specify the following in the SiteCatalog for the OSG site:

osg = Site("OSG", arch=Arch.X86_64, os_type=OS.LINUX)\
      .add_profiles(Namespace.PEGASUS, style="condor")\
      .add_profiles(
               Namespace.CONDOR,
               universe="vanilla",
               requirements="HAS_SINGULARITY == True",
               request_cpus="1",
               request_memory="1 GB",
               request_disk="1 GB"
            )\
      .add_profiles(Namespace.CONDOR, key="+SingularityImage", value="/cvmfs/singularity.opensciencegrid.org/pegasus/osg-el7:latest")
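
Correspondingly, the executable is cataloged without any container reference, and its pfn is the path inside the container that OSG sets up. A hedged sketch, with a hypothetical transformation and in-container path:

# Hypothetical transformation: the pfn is the path inside the container
# image named in +SingularityImage; note the absence of a container argument.
wc_tr = Transformation(
    "wc",
    site="OSG",
    pfn="/usr/bin/wc",
    is_stageable=False
)

tc.add_transformations(wc_tr)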

10.5. Container Execution Model

Containerized applications in your workflow are launched as part of PegasusLite jobs on the remote worker nodes. The container execution model is as follows. The PegasusLite job:

  1. Sets up a directory to run a user job in

  2. Pulls the container image to that directory

  3. Optionally, loads the container from the container image file and sets up the user that the job will run as inside the container (only applicable to Docker containers)

  4. Mounts the job directory into the container as /scratch for Docker containers and as /srv for Apptainer/Singularity containers

  5. Runs, inside the container, a job-specific script created by PegasusLite that does the following:

    1. Determines the appropriate Pegasus worker package to use in the container, if one is not already installed

    2. Sets up the job environment, including the transfer and setup of any credentials shipped as part of the PegasusLite job

    3. Pulls in all the relevant input data and executables required by the job

    4. Launches the user application using pegasus-kickstart.

  6. Optionally, shuts down the container (only applicable for Docker containers)

  7. Ships out the output data to the staging site

  8. Cleans up the directory on the worker node

Note

Starting in Pegasus 4.9.1, the container data transfer model has been changed. Instead of data transfers for the job occurring outside the container in the PegasusLite wrapper, they now happen when the user job starts in the container.

In versions of Pegasus >= 4.9.1 the transfers are handled from within the container, and thus container recipes require some extra attention. A Dockerfile example that prepares a container for GridFTP transfers is provided below.

In this example, there are three sections:

  • Essential Packages

  • Install Globus Toolkit

  • Install CA Certs

From the Essential Packages, python and either curl or wget have to be present. Install Globus Toolkit sets up the environment for GridFTP transfers, and Install CA Certs copies the grid certificates into the container.

Note

Globus Toolkit introduced some breaking changes in August 2018 to its authentication module, and some sites haven’t upgraded their installations (e.g. NERSC). In order for GridFTP to authenticate successfully, the libglobus-gssapi-gsi4 package has to be pinned to version 13.8-1. The code snippet below contains installation directives to handle this, but they are commented out.

##########################################
#### This Container Supports GridFTP ####
##########################################

FROM ubuntu:18.04

#### Essential Packages ####
RUN apt-get update && \
    apt-get install -y software-properties-common curl wget python unzip && \
    rm -rf /var/lib/apt/lists/*

#### Install Globus Toolkit ####
RUN wget -nv http://www.globus.org/ftppub/gt6/installers/repo/globus-toolkit-repo_latest_all.deb && \
    dpkg -i globus-toolkit-repo_latest_all.deb && \
    apt-get update && \
    # apt-get install -y libglobus-gssapi-gsi4=13.8-1+gt6.bionic && \
    # apt-mark hold libglobus-gssapi-gsi4 && \
    apt-get install -y globus-data-management-client && \
    rm -f globus-toolkit-repo_latest_all.deb && \
    rm -rf /var/lib/apt/lists/*

#### Install CA Certs ####
RUN mkdir -p /etc/grid-security && \
    cd /etc/grid-security && \
    wget -nv https://download.pegasus.isi.edu/containers/certificates.tar.gz && \
    tar xzf certificates.tar.gz && \
    rm -f certificates.tar.gz

##########################################
#### Your Container Specific Commands ####
##########################################

10.6. Staging of Application Containers

Pegasus treats containers like any other file in terms of data management. A container to be used for a job is tracked as an input dependency that needs to be staged if it is not already there. Similar to executables, you specify the location of your container image in the Transformation Catalog. You can specify the source URLs for containers as follows:

  1. URL to a container hosted on a central hub repository

    An example of a Docker Hub URL is docker:///rynge/montage:latest, while a Singularity Hub URL looks like shub://pegasus-isi/fedora-montage.

  2. URL to a container image file on a file server

    • Docker - Docker supports loading a container from a tar file. Hence, container images can only be specified as tar files, and the extension of the filename is not important.

    • Singularity - Singularity supports container images in various formats and relies on the extension of the filename to determine the format. Pegasus supports the following extensions for Singularity container images:

      • .img

      • .tar

      • .tar.gz

      • .tar.bz2

      • .cpio

      • .cpio.gz

      • .sif

      Singularity will fail to run the container if you don’t specify the right extension when specifying the source URL for the image.

In both cases, Pegasus places the container image on the staging site used for the workflow, as part of the data stage-in nodes, using pegasus-transfer. When pulling an image from a container hub repository, pegasus-transfer exports the container as a tar file in the case of Docker, and as an .img file in the case of Singularity.
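
For instance, the two styles of source URL translate into Container definitions like the following sketch. The Docker Hub URL is the one from above; the file server hosting the .sif image is made up:

# A Docker container pulled from Docker Hub.
montage_container = Container(
    "montage",
    Container.DOCKER,
    image="docker:///rynge/montage:latest"
)

# A Singularity image fetched from a file server; the .sif extension
# tells Singularity what format the image file is in.
base_container = Container(
    "base",
    Container.SINGULARITY,
    image="http://example.com/images/base.sif"  # hypothetical file server URL
)

tc.add_containers(montage_container, base_container)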

10.6.1. Shifter Containers

Shifter containers are different from Docker and Singularity in that the containers cannot be exported to a container image file that can reside on a filesystem. Additionally, the containers are expected to be available locally on the compute sites in the local Shifter registry. Because of this, Pegasus does not do any transfer of Shifter containers. You can specify a Shifter container using the shifter URL scheme. For example, below is a transformation catalog for a namd transformation that is executed in a Shifter container.

tc = TransformationCatalog()

namd_image = Container(
   "namd_image",
   Container.SHIFTER,
   image="shifter:///papajim/namd_image:latest"
)

tc.add_containers(namd_image)

namd_tr = Transformation(
   "namd_tr",
   site="cori",
   pfn="/opt/NAMD_2.12_Linux-x86_64-multicore/namd2",
   is_stageable=False,
   container=namd_image
)\
.add_globus_profile(max_time=20)\
.add_pegasus_profile(exitcode_success_msg="End of program")

tc.add_transformations(namd_tr)

10.6.2. Symlinking and File Copy From Host OS

By default, Pegasus only mounts the job directory determined by PegasusLite into the application container. Symlinking of input data sets therefore works only if, in the container definition in the transformation catalog, you use the mounts keyword to mount the directories containing the input data into the container. We recommend keeping the source and destination directories the same, i.e. the host path is mounted at the same location in the container.

The above also holds for the case where your input datasets are on the shared filesystem of the compute site and you want a file copy to happen when the PegasusLite job starts the container.

For example, in the snippet below, we have input datasets accessible under /lizard on the compute nodes, and we mount that directory read-only into the container at /lizard:

centos_base = Container(
   "centos-base",
   Container.SINGULARITY,
   image="gsiftp://bamboo.isi.edu/lfs1/bamboo-tests/data/centos7.img",
   image_site="local",
   mounts=["/lizard:/lizard:ro"]
).add_env(JAVA_HOME="/opt/java/1.6")

To enable symlinking for containers, set the following properties:

props = Properties()
# Tells Pegasus to try and create symlinks for input files
props["pegasus.transfer.links"] = "true"

# Tells Pegasus to bypass the staging site (i.e. the creation of stage-in
# jobs) as data is available directly on the compute nodes
props["pegasus.transfer.bypass.input.staging"] = "true"
props.write()

If you don’t set pegasus.transfer.bypass.input.staging, you can still have symlinking if (see the sketch after this list):

  1. your staging site is the same as your compute site

  2. the scratch directory specified in the site catalog is visible to the worker nodes

  3. you mount the scratch directory in the container definition, NOT the original source directory.
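
A minimal sketch of that setup, assuming the site catalog scratch directory is a hypothetical /shared/scratch visible on the worker nodes:

# Mount the scratch directory itself into the container, so that symlinks
# created in the job directory resolve inside the container.
centos_base = Container(
    "centos-base",
    Container.SINGULARITY,
    image="gsiftp://bamboo.isi.edu/lfs1/bamboo-tests/data/centos7.img",
    image_site="local",
    mounts=["/shared/scratch:/shared/scratch"]  # hypothetical scratch path
)
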

Enabling symlinking of containers is useful when running large workflows on a single cluster. Pegasus can pull the image from the container repository once, and place it on the shared filesystem where it can then be symlinked from when the PegasusLite jobs start on the worker nodes of that cluster. In order to do this, you need to be running the nonsharedfs data configuration mode with the staging site set to be the same as the compute site.