8.2. Configuring Workflows To Use Containers

Containers can currently be specified only in the Transformation Catalog. Users can either use a different container for each executable or the same container for all executables. If you want to use a container that does not have your executable pre-installed, mark the executable as STAGEABLE and Pegasus will stage it into the container as part of executable staging.

The DAX API extensions do not support references to containers.

8.2.1. Containerized Applications in the Transformation Catalog

Users specify the container in which to run their application in the Transformation Catalog, using the multi-line text-based format described in this section. A transformation entry accepts an optional attribute named container that refers to the container to be used for the application.

tr example::keg:1.0 { 

  #specify profiles that apply for all the sites for the transformation 
  #in each site entry the profile can be overridden 

  profile env "APP_HOME" "/tmp/myscratch"
  profile env "JAVA_HOME" "/opt/java/1.6"

  site isi {
    # environment to be set when the job is run in the container
    # overrides env profiles specified in the container
    profile env "HELLo" "WORLD"
    profile env "JAVA_HOME" "/bin/java.1.6"
    
    profile condor "FOO" "bar"
    
    pfn "/path/to/keg"
    arch "x86"
    os "linux"
    osrelease "fc"
    osversion "4"
      
    # INSTALLED means pfn refers to path in the container.
    # STAGEABLE means the executable can be staged into the container
    type "INSTALLED" 

    #optional attribute to specify the container to use
    container "centos-pegasus"
  }
}

cont centos-pegasus{
     # can be either docker or singularity
     type "docker"

     # URL to image in a docker|singularity hub OR
     # URL to an existing docker image exported as a tar file or singularity image
     image "docker:///rynge/montage:latest" 

     # optional site attribute to tell pegasus on which site the tar file
     # exists. useful for handling file URLs correctly
     image_site "optional site"

     # mount information to mount host directories into container 
     # format src-dir:dest-dir[:options]
     mount "/Volumes/Work/lfs1:/shared-data/:ro"

     # environment to be set when the job is run in the container
     # only env profiles are supported
     profile env "JAVA_HOME" "/opt/java/1.6"	    
}

The container itself is defined using the cont entry. Multiple transformations can refer to the same container.
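
As a minimal sketch of that sharing, the two trimmed entries below both refer to the centos-pegasus container defined above. The second transformation name and its path are hypothetical, chosen only for illustration; the attributes themselves are the ones used in the example above.

tr example::keg:1.0 {
  site isi {
    pfn "/path/to/keg"
    arch "x86"
    os "linux"
    type "INSTALLED"

    # same container as the entry below
    container "centos-pegasus"
  }
}

# hypothetical second transformation, for illustration only
tr example::analyze:1.0 {
  site isi {
    # hypothetical path inside the container
    pfn "/path/to/analyze"
    arch "x86"
    os "linux"
    type "INSTALLED"

    # refers to the same cont entry
    container "centos-pegasus"
  }
}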

  1. cont - A container identifier.

  2. image - URL to an image in a Docker or Singularity hub, or URL to an existing Docker image exported as a tar file or to a Singularity image. An example of a Docker hub URL is docker:///rynge/montage:latest; an example of a Singularity hub URL is shub://pegasus-isi/fedora-montage.

  3. image_site - The site identifier for the site where the container image is available.

  4. mount - Information for mounting host directories into the container, in the format src-dir:dest-dir[:options]. For the supported options, consult the Docker documentation for the -v option and the Singularity documentation for the -B option.

  5. Profiles - One or many profiles can be attached to a transformation for all sites or to a transformation on a particular site. For containers, only env profiles are supported.
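
The attributes above can also be combined for a Singularity image. The following is a sketch only, assuming an executable that is not pre-installed in the image and is therefore marked STAGEABLE. The container name fedora-montage and the pfn URL are hypothetical placeholders; the image URL is the Singularity hub example given above, and the mount and env values are reused from the earlier example.

tr example::keg:1.0 {
  site isi {
    # hypothetical URL from which the executable is staged
    # (assumption: a STAGEABLE pfn points at a retrievable location)
    pfn "http://example.com/bin/keg"
    arch "x86"
    os "linux"

    # STAGEABLE means the executable can be staged into the container
    type "STAGEABLE"

    container "fedora-montage"
  }
}

# hypothetical container name, for illustration only
cont fedora-montage{
     type "singularity"

     # URL to image in a singularity hub
     image "shub://pegasus-isi/fedora-montage"

     # format src-dir:dest-dir[:options]
     mount "/Volumes/Work/lfs1:/shared-data/:ro"

     # only env profiles are supported
     profile env "JAVA_HOME" "/opt/java/1.6"
}

Note that the type attribute appears twice with different meanings: inside the site block it describes the executable (INSTALLED or STAGEABLE), while inside the cont block it selects the container technology (docker or singularity).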

Note

Containerized Applications can only be specified in the transformation catalog, not via the DAX API.