5.5. Pegasus-Plan

pegasus-plan is the main executable. It takes the abstract workflow (DAX) and generates an executable workflow (usually a Condor DAG) by querying various catalogs and performing several refinement steps. Before users can run pegasus-plan, the following needs to be done:

  1. Populate the various catalogs

    1. Replica Catalog

      The Replica Catalog needs to be populated with the locations of the input files required by the workflows. This can be done by using pegasus-rc-client (see the Replica section of Creating Workflows).

    2. Transformation Catalog

      The Transformation Catalog needs to be populated with the locations of the executables that the workflows will use. This can be done by using pegasus-tc-client (see the Transformation section of Creating Workflows).

    3. Site Catalog

      The Site Catalog needs to be populated with the layout of the various sites that the workflows can execute on. A site catalog for OSG can be generated with the pegasus-sc-client client (see the Site section of Creating Workflows).

  2. Configure Properties

    After the catalogs have been populated, the user properties file needs to be updated with the types and locations of the catalogs to use. These properties are described in the basic.properties file in the etc subdirectory (see the Properties section of the Configuration chapter).

    The basic properties that usually need to be set are listed below:

    Table 5.2. Basic Properties that need to be set

    pegasus.catalog.replica
    pegasus.catalog.replica.file | pegasus.catalog.replica.url
    pegasus.catalog.transformation
    pegasus.catalog.transformation.file
    pegasus.catalog.site.file
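
As an illustration of the properties in Table 5.2, the following minimal Python sketch renders them as key=value lines and writes them to a properties file. The property names come from the table. Everything else is an assumption made for the example: the values File and Text, all file paths, and the helper names render_properties and write_properties are hypothetical placeholders, and the sketch uses pegasus.catalog.replica.file rather than the pegasus.catalog.replica.url alternative. Check the Properties section of the Configuration chapter for the values your installation accepts.

```python
from pathlib import Path

# Property names are taken from Table 5.2. The values are hypothetical
# placeholders (assumptions); replace them with your own catalog types
# and file locations.
BASIC_PROPERTIES = {
    "pegasus.catalog.replica": "File",
    "pegasus.catalog.replica.file": "/path/to/rc.data",
    "pegasus.catalog.transformation": "Text",
    "pegasus.catalog.transformation.file": "/path/to/tc.data",
    "pegasus.catalog.site.file": "/path/to/sites.xml",
}


def render_properties(props):
    """Return the properties as key=value lines, one property per line."""
    return "".join(f"{key}={value}\n" for key, value in props.items())


def write_properties(path, props=BASIC_PROPERTIES):
    """Write the rendered properties to the given file and return its path."""
    target = Path(path)
    target.write_text(render_properties(props))
    return target
```

Calling write_properties("pegasus.properties") would create a file with one line per property in the current directory. The file name is also only an example.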

To execute pegasus-plan, the user usually needs to specify the following options:

  1. --dax the path to the DAX file that needs to be mapped.

  2. --dir the base directory in which the executable workflow is generated.

  3. --sites a comma-separated list of execution sites.

  4. --output the output site to which the materialized output files are transferred.

  5. --submit a boolean option indicating whether to submit the planned workflow for execution once planning is done.
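
The options above can be combined into a single invocation. The following Python sketch only assembles the argument list and prints it, so it does not require Pegasus to be installed. The function name build_plan_command, the file name workflow.dax, the directory dags, and the site names siteA, siteB, and local are hypothetical. The sketch also assumes that --submit is passed as a bare switch with no value.

```python
import shlex


def build_plan_command(dax, directory, sites, output_site, submit=False):
    """Assemble a pegasus-plan argument list from the options listed above."""
    cmd = [
        "pegasus-plan",
        "--dax", dax,                # path to the DAX file to be mapped
        "--dir", directory,          # base directory for the executable workflow
        "--sites", ",".join(sites),  # comma-separated list of execution sites
        "--output", output_site,     # site that receives the output files
    ]
    if submit:
        # Assumption: --submit is a switch that takes no value.
        cmd.append("--submit")
    return cmd


# All argument values below are hypothetical placeholders.
cmd = build_plan_command(
    "workflow.dax", "dags", ["siteA", "siteB"], "local", submit=True
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where pegasus-plan is on the PATH. Building the command as a list instead of a single string avoids shell quoting problems with paths that contain spaces.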