9. CWL Support
The Common Workflow Language (CWL) is a standardized way of describing workflow pipelines and tools. Users with existing CWL workflows can use pegasus-cwl-converter to convert their workflows from CWL into Pegasus’s native format. Once converted, pegasus-plan can be used to plan and submit the workflow for execution.
A Simple Example
Consider the workflow depicted above. The CWL tab below illustrates what this workflow might look like when represented using CWL. The Pegasus tab illustrates the resulting translation from CWL to Pegasus’s native format.
While there are inherent similarities between the two formats, this is not a one-to-one mapping. As such, some information needs to be provided to pegasus-cwl-converter in order for the translation to be done correctly (see highlighted lines in the Pegasus tab).
This additional information is as follows:
- For each transformation:
  - What site does it reside on? (e.g. local, condorpool, etc.)
  - Can this transformation be staged to another site?
CWL

cwlVersion: v1.1
class: Workflow
inputs:
  f.a: File
outputs:
  final_output:
    type: File
    outputSource: analyze/of
steps:
  preprocess:
    run:
      cwlVersion: v1.1
      class: CommandLineTool
      baseCommand: $PEGASUS_LOCAL_BIN_DIR/pegasus-keg
      arguments: ["-a", "preprocess", "-T60", "-o", "f.b1", "-o", "f.b2"]
      requirements:
        DockerRequirement:
          dockerPull: opensciencegrid/osgvo-el7
      inputs:
        if:
          type: File
          inputBinding:
            prefix: -i
            separate: true
            position: 0
      outputs:
        of1:
          type: File
          outputBinding:
            glob: f.b1

        of2:
          type: File
          outputBinding:
            glob: f.b2

    in:
      if: f.a
    out: [of1, of2]

  findrange1:
    run:
      cwlVersion: v1.1
      class: CommandLineTool
      baseCommand: $PEGASUS_LOCAL_BIN_DIR/pegasus-keg
      arguments: ["-a", "findrange", "-T60", "-o", "f.c1"]
      requirements:
        DockerRequirement:
          dockerPull: opensciencegrid/osgvo-el7
      inputs:
        if:
          type: File
          inputBinding:
            prefix: -i
            separate: true
            position: 0

      outputs:
        of:
          type: File
          outputBinding:
            glob: f.c1
    in:
      if: preprocess/of1
    out: [of]

  findrange2:
    run:
      cwlVersion: v1.1
      class: CommandLineTool
      baseCommand: $PEGASUS_LOCAL_BIN_DIR/pegasus-keg
      arguments: ["-a", "findrange", "-T60", "-o", "f.c2"]
      requirements:
        DockerRequirement:
          dockerPull: opensciencegrid/osgvo-el7
      inputs:
        if:
          type: File
          inputBinding:
            prefix: -i
            separate: true
            position: 0

      outputs:
        of:
          type: File
          outputBinding:
            glob: f.c2
    in:
      if: preprocess/of2
    out: [of]

  analyze:
    run:
      cwlVersion: v1.1
      class: CommandLineTool
      baseCommand: $PEGASUS_LOCAL_BIN_DIR/pegasus-keg
      arguments: ["-a", "analyze", "-T60", "-o", "f.d"]
      requirements:
        DockerRequirement:
          dockerPull: opensciencegrid/osgvo-el7
      inputs:
        if1:
          type: File
          inputBinding:
            prefix: -i
            separate: true
            position: 0

        if2:
          type: File
          inputBinding:
            position: 1

      outputs:
        of:
          type: File
          outputBinding:
            glob: f.d
    in:
      if1: findrange1/of
      if2: findrange2/of
    out: [of]

Pegasus

x-pegasus:
  apiLang: python
  createdBy: ryantanaka
  createdOn: 11-11-20T16:45:46Z
pegasus: '5.0'
name: cwl-converted-pegasus-workflow
replicaCatalog:
  replicas:
  - lfn: f.a
    pfns:
    - site: local
      pfn: /Users/ryantanaka/symlinks/isip/test/core/047-cwl-docker-black-diamond/cwl/f.a
transformationCatalog:
  transformations:
  - name: pegasus-keg
    sites:
    - name: local
      pfn: /Users/ryantanaka/ISI/pegasus/dist/pegasus-5.0.0dev/bin/pegasus-keg
      type: stageable
      container: opensciencegrid_osgvo-el7
  containers:
  - name: opensciencegrid_osgvo-el7
    type: docker
    image: docker://opensciencegrid/osgvo-el7
    image.site: local
jobs:
- type: job
  name: pegasus-keg
  id: analyze
  arguments:
  - -a
  - analyze
  - -T60
  - -o
  - f.d
  - -i f.c1
  - f.c2
  uses:
  - lfn: f.c1
    type: input
  - lfn: f.c2
    type: input
  - lfn: f.d
    type: output
    stageOut: true
    registerReplica: true
- type: job
  name: pegasus-keg
  id: findrange1
  arguments:
  - -a
  - findrange
  - -T60
  - -o
  - f.c1
  - -i f.b1
  uses:
  - lfn: f.c1
    type: output
    stageOut: true
    registerReplica: true
  - lfn: f.b1
    type: input
- type: job
  name: pegasus-keg
  id: findrange2
  arguments:
  - -a
  - findrange
  - -T60
  - -o
  - f.c2
  - -i f.b2
  uses:
  - lfn: f.b2
    type: input
  - lfn: f.c2
    type: output
    stageOut: true
    registerReplica: true
- type: job
  name: pegasus-keg
  id: preprocess
  arguments:
  - -a
  - preprocess
  - -T60
  - -o
  - f.b1
  - -o
  - f.b2
  - -i f.a
  uses:
  - lfn: f.b2
    type: output
    stageOut: true
    registerReplica: true
  - lfn: f.a
    type: input
  - lfn: f.b1
    type: output
    stageOut: true
    registerReplica: true
jobDependencies:
- id: findrange1
  children:
  - analyze
- id: findrange2
  children:
  - analyze
- id: preprocess
  children:
  - findrange1
  - findrange2
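
The jobDependencies section mirrors the `in:` sources of the CWL steps: a source such as preprocess/of1 names the step that produces the file. The following is a minimal sketch of that relationship, not the converter's actual implementation. The function name `job_dependencies` and the `step_inputs` dictionary are hypothetical; the dictionary is written out by hand from the CWL example rather than parsed from the file.

```python
def job_dependencies(step_inputs):
    """Map each parent step to the steps that consume its outputs.

    step_inputs maps a step id to the list of its `in:` sources. A source is
    either "<step>/<output>" or the id of a workflow-level input such as "f.a".
    """
    dependencies = {}
    for child, sources in step_inputs.items():
        for source in sources:
            parent, separator, _output = source.partition("/")
            # Sources without "<step>/" are workflow inputs and create no edge.
            if not separator or parent not in step_inputs:
                continue
            children = dependencies.setdefault(parent, [])
            if child not in children:
                children.append(child)
    return dependencies


# `in:` sources copied by hand from the CWL workflow above.
step_inputs = {
    "preprocess": ["f.a"],
    "findrange1": ["preprocess/of1"],
    "findrange2": ["preprocess/of2"],
    "analyze": ["findrange1/of", "findrange2/of"],
}

dependencies = job_dependencies(step_inputs)
for parent, children in dependencies.items():
    print(parent, "->", children)
```

For the example workflow this yields the same parent and child pairs as the jobDependencies section of the Pegasus file.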
Using pegasus-cwl-converter
Using the pegasus-cwl-converter, how can we convert the above CWL workflow into Pegasus’s native format?
First, we need to create a file, tr_specs.yml, which contains Pegasus-specific information about the transformations (executables) used in the workflow.
pegasus-keg:
  site: local
  is_stageable: True
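
Before running the converter, it can help to check that every executable used by the workflow has an entry in this file. The sketch below is a hypothetical helper, not part of Pegasus. It assumes that the transformation name is the final path component of the CWL baseCommand (as in the example, where $PEGASUS_LOCAL_BIN_DIR/pegasus-keg corresponds to pegasus-keg) and that both files have already been loaded into plain Python values.

```python
import posixpath


def missing_transformations(base_commands, tr_specs):
    """Return names of executables that lack a complete tr_specs entry."""
    required_keys = {"site", "is_stageable"}
    missing = []
    for command in base_commands:
        # Assumption: the transformation name is the basename of baseCommand.
        name = posixpath.basename(command)
        entry = tr_specs.get(name)
        incomplete = entry is None or not required_keys <= set(entry)
        if incomplete and name not in missing:
            missing.append(name)
    return missing


# All four steps of the example workflow use the same executable.
base_commands = ["$PEGASUS_LOCAL_BIN_DIR/pegasus-keg"] * 4
tr_specs = {"pegasus-keg": {"site": "local", "is_stageable": True}}

print(missing_transformations(base_commands, tr_specs))
```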
Next, we need a file that specifies where the initial input files to the workflow are physically located. For CWL workflows, input files are typically described in a separate YAML file. For the CWL workflow above, let's say that this file is called input_file_specs.yml and contains the following:
f.a:
  class: File
  path: /data/f.a
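
When a workflow has many initial inputs, this file can be generated rather than written by hand. The following is a minimal sketch using plain string formatting; `render_cwl_inputs` is a hypothetical helper, and it assumes that input ids and paths contain no characters that need YAML quoting. For arbitrary paths, a YAML library would be the safer choice.

```python
def render_cwl_inputs(files):
    """Render a CWL inputs document from a {input_id: path} mapping."""
    lines = []
    for input_id, path in files.items():
        lines.append(f"{input_id}:")
        lines.append("  class: File")
        lines.append(f"  path: {path}")
    return "\n".join(lines) + "\n"


document = render_cwl_inputs({"f.a": "/data/f.a"})
print(document, end="")

# To save the result under the file name used in this section:
# with open("input_file_specs.yml", "w") as handle:
#     handle.write(document)
```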
Given the following three files:
- CWL workflow file (wf.cwl)
- Workflow inputs file (input_file_specs.yml)
- Pegasus-specific information about the executables (tr_specs.yml)

the converter can be invoked as:
pegasus-cwl-converter wf.cwl input_file_specs.yml tr_specs.yml pegasus_workflow.yml
The resulting Pegasus workflow file, pegasus_workflow.yml, is depicted above in the Pegasus tab.
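
If the conversion is part of a larger script, the same invocation can be driven from Python. This is a sketch under the assumption that pegasus-cwl-converter is on the PATH and takes its four positional arguments in the order shown above; the helper names `converter_command` and `convert` are hypothetical.

```python
import shlex
import subprocess


def converter_command(cwl_workflow, workflow_inputs, tr_specs, output_file):
    """Build the argument list in the positional order shown above."""
    return [
        "pegasus-cwl-converter",
        cwl_workflow,
        workflow_inputs,
        tr_specs,
        output_file,
    ]


def convert(cwl_workflow, workflow_inputs, tr_specs, output_file):
    """Run the converter; raises CalledProcessError on a non-zero exit."""
    command = converter_command(cwl_workflow, workflow_inputs, tr_specs, output_file)
    subprocess.run(command, check=True)
    return output_file


command = converter_command(
    "wf.cwl", "input_file_specs.yml", "tr_specs.yml", "pegasus_workflow.yml"
)
# Print the command only; call convert(...) to actually run it.
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting issues with file names that contain spaces.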
Note
pegasus-cwl-converter works on a subset of the CWL specification. If the conversion does not work for your workflow, reach out to us and we can assist you in getting your workflow up and running.