9. CWL Support

The Common Workflow Language (CWL) is a standardized way of describing workflow pipelines and tools. Users with existing CWL workflows can use the pegasus-cwl-converter to convert them into Pegasus’s native format. Once a workflow is converted, pegasus-plan can be used to plan and submit it for execution.

A Simple Example

[Figure: the diamond workflow (examples-diamond-clear.png)]

Consider the workflow depicted above. The CWL tab below illustrates what this workflow might look like when represented using CWL. The Pegasus tab illustrates the resulting translation from CWL to Pegasus’s native format.

While there are inherent similarities between the two formats, the mapping is not one-to-one. As such, some additional information must be provided to the pegasus-cwl-converter for the translation to be done correctly (see highlighted lines in the Pegasus tab).

This additional information is as follows:

  • For each transformation:
    • What site does it reside on? (e.g. local, condorpool, etc.)

    • Can this transformation be staged to another site?

cwlVersion: v1.1
class: Workflow
inputs:
    f.a: File
outputs:
    final_output:
        type: File
        outputSource: analyze/of
steps:
    preprocess:
        run:
            cwlVersion: v1.1
            class: CommandLineTool
            baseCommand: $PEGASUS_LOCAL_BIN_DIR/pegasus-keg
            arguments: ["-a", "preprocess", "-T60", "-o", "f.b1", "-o", "f.b2"]
            requirements:
                DockerRequirement:
                    dockerPull: opensciencegrid/osgvo-el7
            inputs:
                if:
                    type: File
                    inputBinding:
                        prefix: -i
                        separate: true
                        position: 0
            outputs:
                of1:
                    type: File
                    outputBinding:
                        glob: f.b1

                of2:
                    type: File
                    outputBinding:
                        glob: f.b2

        in:
            if: f.a
        out: [of1, of2]

    findrange1:
        run:
            cwlVersion: v1.1
            class: CommandLineTool
            baseCommand: $PEGASUS_LOCAL_BIN_DIR/pegasus-keg
            arguments: ["-a", "findrange", "-T60", "-o", "f.c1"]
            requirements:
                DockerRequirement:
                    dockerPull: opensciencegrid/osgvo-el7
            inputs:
                if:
                    type: File
                    inputBinding:
                        prefix: -i
                        separate: true
                        position: 0

            outputs:
                of:
                    type: File
                    outputBinding:
                        glob: f.c1
        in:
            if: preprocess/of1
        out: [of]

    findrange2:
        run:
            cwlVersion: v1.1
            class: CommandLineTool
            baseCommand: $PEGASUS_LOCAL_BIN_DIR/pegasus-keg
            arguments: ["-a", "findrange", "-T60", "-o", "f.c2"]
            requirements:
                DockerRequirement:
                    dockerPull: opensciencegrid/osgvo-el7
            inputs:
                if:
                    type: File
                    inputBinding:
                        prefix: -i
                        separate: true
                        position: 0

            outputs:
                of:
                    type: File
                    outputBinding:
                        glob: f.c2
        in:
            if: preprocess/of2
        out: [of]

    analyze:
        run:
            cwlVersion: v1.1
            class: CommandLineTool
            baseCommand: $PEGASUS_LOCAL_BIN_DIR/pegasus-keg
            arguments: ["-a", "analyze", "-T60", "-o", "f.d"]
            requirements:
                DockerRequirement:
                    dockerPull: opensciencegrid/osgvo-el7
            inputs:
                if1:
                    type: File
                    inputBinding:
                        prefix: -i
                        separate: true
                        position: 0

                if2:
                    type: File
                    inputBinding:
                        position: 1

            outputs:
                of:
                    type: File
                    outputBinding:
                        glob: f.d
        in:
            if1: findrange1/of
            if2: findrange2/of
        out: [of]

x-pegasus:
  apiLang: python
  createdBy: ryantanaka
  createdOn: 11-11-20T16:45:46Z
pegasus: '5.0'
name: cwl-converted-pegasus-workflow
replicaCatalog:
  replicas:
  - lfn: f.a
    pfns:
    - site: local
      pfn: /Users/ryantanaka/symlinks/isip/test/core/047-cwl-docker-black-diamond/cwl/f.a
transformationCatalog:
  transformations:
  - name: pegasus-keg
    sites:
    - name: local
      pfn: /Users/ryantanaka/ISI/pegasus/dist/pegasus-5.0.0dev/bin/pegasus-keg
      type: stageable
      container: opensciencegrid_osgvo-el7
  containers:
  - name: opensciencegrid_osgvo-el7
    type: docker
    image: docker://opensciencegrid/osgvo-el7
    image.site: local
jobs:
- type: job
  name: pegasus-keg
  id: analyze
  arguments:
  - -a
  - analyze
  - -T60
  - -o
  - f.d
  - -i f.c1
  - f.c2
  uses:
  - lfn: f.c1
    type: input
  - lfn: f.c2
    type: input
  - lfn: f.d
    type: output
    stageOut: true
    registerReplica: true
- type: job
  name: pegasus-keg
  id: findrange1
  arguments:
  - -a
  - findrange
  - -T60
  - -o
  - f.c1
  - -i f.b1
  uses:
  - lfn: f.c1
    type: output
    stageOut: true
    registerReplica: true
  - lfn: f.b1
    type: input
- type: job
  name: pegasus-keg
  id: findrange2
  arguments:
  - -a
  - findrange
  - -T60
  - -o
  - f.c2
  - -i f.b2
  uses:
  - lfn: f.b2
    type: input
  - lfn: f.c2
    type: output
    stageOut: true
    registerReplica: true
- type: job
  name: pegasus-keg
  id: preprocess
  arguments:
  - -a
  - preprocess
  - -T60
  - -o
  - f.b1
  - -o
  - f.b2
  - -i f.a
  uses:
  - lfn: f.b2
    type: output
    stageOut: true
    registerReplica: true
  - lfn: f.a
    type: input
  - lfn: f.b1
    type: output
    stageOut: true
    registerReplica: true
jobDependencies:
- id: findrange1
  children:
  - analyze
- id: findrange2
  children:
  - analyze
- id: preprocess
  children:
  - findrange1
  - findrange2
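
The jobDependencies entries follow from how the CWL steps are wired together: a step input of the form step/output makes that step a parent of the consuming step. The Python sketch below illustrates this derivation on a hand-written dictionary that mirrors the "in" sections of the CWL workflow. The names STEP_INPUTS and derive_dependencies are hypothetical, and this is only an illustration of the idea, not the converter's actual implementation.

```python
from collections import defaultdict

# Step inputs copied by hand from the "in" sections of the CWL workflow above.
STEP_INPUTS = {
    "preprocess": {"if": "f.a"},
    "findrange1": {"if": "preprocess/of1"},
    "findrange2": {"if": "preprocess/of2"},
    "analyze": {"if1": "findrange1/of", "if2": "findrange2/of"},
}


def derive_dependencies(step_inputs):
    """Map each parent step to the sorted list of steps that consume its outputs."""
    children = defaultdict(set)
    for step, inputs in step_inputs.items():
        for source in inputs.values():
            # "parent/output" refers to another step; a bare name such as
            # "f.a" is a workflow-level input and creates no dependency.
            if "/" in source:
                parent, _output = source.split("/", 1)
                children[parent].add(step)
    return {parent: sorted(kids) for parent, kids in sorted(children.items())}


dependencies = derive_dependencies(STEP_INPUTS)
for parent, kids in dependencies.items():
    print(parent, "->", ", ".join(kids))
```

The resulting parent-to-children mapping has the same shape as the jobDependencies section of the Pegasus workflow.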

Using pegasus-cwl-converter

Using the pegasus-cwl-converter, how can we convert the above CWL workflow into Pegasus’s native format?

First, we need to create a file, tr_specs.yml, which contains Pegasus-specific information about the transformations (executables) used in the workflow.

pegasus-keg:
    site: local
    is_stageable: True
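
To make the role of these two fields concrete, the Python sketch below shows how such an entry could plausibly map onto a site entry like the one in the transformationCatalog of the Pegasus tab. The function to_site_entry and the placeholder path are hypothetical, and the mapping of a non-stageable executable to the type "installed" is an assumption; the converter's real logic may differ.

```python
def to_site_entry(spec, pfn):
    """Build one transformationCatalog site entry from a tr_specs.yml entry."""
    if not isinstance(spec.get("site"), str):
        raise ValueError("'site' must be a string such as 'local'")
    if not isinstance(spec.get("is_stageable"), bool):
        raise ValueError("'is_stageable' must be a boolean")
    return {
        "name": spec["site"],
        "pfn": pfn,
        # Assumption: executables that cannot be staged map to "installed".
        "type": "stageable" if spec["is_stageable"] else "installed",
    }


# The tr_specs.yml content above, written as the dictionary a YAML parser
# would produce for it.
tr_specs = {"pegasus-keg": {"site": "local", "is_stageable": True}}

# "/path/to/bin/pegasus-keg" is a placeholder, not a real installation path.
entry = to_site_entry(tr_specs["pegasus-keg"], "/path/to/bin/pegasus-keg")
print(entry)
```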

Next, we need a file that specifies where the initial input files to the workflow are physically located. For CWL workflows, inputs are typically described in a separate YAML file. For the CWL workflow above, let's say that this file is called input_file_specs.yml and contains the following:

f.a:
    class: File
    path: /data/f.a
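
For workflows with many input files, an inputs file in this shape can be generated rather than written by hand. The following Python sketch assumes every input is a plain File with a path that needs no YAML quoting; render_inputs is a hypothetical helper, not part of Pegasus or CWL tooling.

```python
def render_inputs(files):
    """Render a CWL inputs file with one File entry per workflow input."""
    lines = []
    for name, path in files.items():
        lines.append(f"{name}:")
        lines.append("    class: File")
        lines.append(f"    path: {path}")
    return "\n".join(lines) + "\n"


# Reproduces the input_file_specs.yml content shown above.
text = render_inputs({"f.a": "/data/f.a"})
print(text, end="")
```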

The converter takes the following three files as input:

  1. CWL workflow file (wf.cwl)

  2. Workflow inputs file (input_file_specs.yml)

  3. Pegasus-specific information about the executables (tr_specs.yml)

The converter can be invoked as:

pegasus-cwl-converter wf.cwl input_file_specs.yml tr_specs.yml pegasus_workflow.yml
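
When the conversion is part of a larger script, the same invocation can be driven from Python. The sketch below uses the file names from the list above and the positional argument order shown in the command; build_converter_command and run_converter are hypothetical helper names, and the sketch assumes pegasus-cwl-converter is on the PATH when run_converter is called.

```python
import shutil
import subprocess


def build_converter_command(cwl_file, inputs_file, tr_specs_file, output_file):
    """Return the argument list in the positional order used above."""
    return ["pegasus-cwl-converter", cwl_file, inputs_file, tr_specs_file, output_file]


def run_converter(command):
    """Run the converter, failing early if the executable cannot be found."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} was not found on PATH")
    # check=True raises CalledProcessError if the converter exits non-zero.
    subprocess.run(command, check=True)


command = build_converter_command(
    "wf.cwl", "input_file_specs.yml", "tr_specs.yml", "pegasus_workflow.yml"
)
print(" ".join(command))
```

Calling run_converter(command) in a directory that contains the three input files would then write pegasus_workflow.yml.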

The resulting Pegasus workflow file, pegasus_workflow.yml, is shown above in the Pegasus tab.

Note

pegasus-cwl-converter works on a subset of the CWL specification. If the conversion does not work for your workflow, reach out to us and we can assist you in getting your workflow up and running.