8.4. Staging of Application Containers

Pegasus treats containers as other files in terms of data management. Container to be used for a job is tracked as an input dependency that needs to be staged if it is not already there. Similar to executables, you specify the location for your container image in the Transformation Catalog. You can specify the source URL's for containers as the following.

  1. URL to a container hosted on a central hub repository

    Example of a docker hub URL is docker:///rynge/montage:latest, while for singularity shub://pegasus-isi/fedora-montage

  2. URL to a container image file on a file server.

    • Docker - Docker supports loading of containers from a tar file, Hence, containers images can only be specified as tar files and the extension for the filename is not important.

    • Singularity - Singularity supports container images in various forms and relies on the extension in the filename to determine what format the file is in. Pegasus supports the following extensions for singularity container images

      • .img

      • .tar

      • .tar.gz

      • .tar.bz2

      • .cpio

      • .cpio.gz

      Singularity will fail to run the container if you don't specify the right extension , when specify the source URL for the image.

In both the cases, Pegasus will place the container image on the staging site used for the workflow, as part of the data stage-in nodes, using pegasus-transfer. When pulling in an image from a container hub repository, pegasus-transfer will export the container as a tar file in case of Docker, and as .img file in case of Singularity

8.4.1. Symlinking

Since, Pegasus by default only mounts the job directory determined by PegasusLite into the application container, symlinking of input data sets works only if in the container definition in the transformation catalog user defines the directories containing the input data to be mounted in the container using the mount key word. We recommend to keep the source and destination directories to be the same i.e. the host path is mounted in the same location in the container. This is because the data transfers for the job happens in PegasusLite before the container is invoked. While we do support different source and destination directories in the mount format, it will create broken symlinks in the worker node that will resolve correctly in the container.

For example in the example below, we have input datasets accessible on /lizard on the compute nodes, and mounting them as read-only into the container at /lizard

cont centos-base{
     type "singularity"

     # URL to image in a docker hub or a url to an existing
     # singularity image file
     image "gsiftp://bamboo.isi.edu/lfs1/bamboo-tests/data/centos7.img"

     # optional site attribute to tell pegasus which site tar file
     # exists. useful for handling file URL's correctly
     image_site "local"

     # mount point in the container
     mount "/lizard:/lizard:ro"
  
     # specify env profile via env option do docker run
     profile env "JAVA_HOME" "/opt/java/1.6"	    
}

To enable symlinking for containers set the following properties

# Tells Pegasus to try and create symlinks for input files
pegasus.transfer.links true

# Tells Pegasus to by the staging site ( creation of stage-in jobs) as 
# data is available directly on compute nodes
pegasus.transfer.bypass.input.staging true

f you don't set pegasus.transfer.bypass.input.staging then you still can have symlinking if

  1. your staging site is same as your compute site

  2. the scratch directory specified in the site catalog is visible to the worker nodes

  3. you mount the scratch directory in the container definition, NOT the original source directory.

Enabling symlinking of containers is useful, when running large workflows on a single cluster. Pegasus can pull the image from the container repository once, and place it on the shared filesystem where it can then be symlinked from, when the PegasusLite jobs start on the worker nodes of that cluster. In order to do this, you need to be running the nonsharedfs data configuration mode with the staging site set to be the same as the compute site.