Chapter 10. Reference Manual

10.1. Properties

This is the reference guide to all properties regarding the Pegasus Workflow Planner, and their respective default values. Please refer to the user guide for a discussion when and which properties to use to configure various components. Please note that the values rely on proper capitalization, unless explicitly noted otherwise.

Some properties rely with their default on the value of other properties. As a notation, the curly braces refer to the value of the named property. For instance, ${pegasus.home} means that the value depends on the value of the pegasus.home property plus any noted additions. You can use this notation to refer to other properties, though the extent of the subsitutions are limited. Usually, you want to refer to a set of the standard system properties. Nesting is not allowed. Substitutions will only be done once.

There is a priority to the order of reading and evaluating properties. Usually one does not need to worry about the priorities. However, it is good to know the details of when which property applies, and how one property is able to overwrite another. The following is a mutually exclusive list ( highest priority first ) of property file locations.

  1. --conf option to the tools. Almost all of the clients that use properties have a --conf option to specify the property file to pick up.
  2. submit-dir/pegasus.xxxxxxx.properties file. All tools that work on the submit directory ( i.e after pegasus has planned a workflow) pick up the pegasus.xxxxx.properties file from the submit directory. The location for the pegasus.xxxxxxx.propertiesis picked up from the braindump file.
  3. The properties defined in the user property file ${user.home}/.pegasusrc have lowest priority.

Commandline properties have the highest priority. These override any property loaded from a property file. Each commandline property is introduced by a -D argument. Note that these arguments are parsed by the shell wrapper, and thus the -D arguments must be the first arguments to any command. Commandline properties are useful for debugging purposes.

From Pegasus 3.1 release onwards, support has been dropped for the following properties that were used to signify the location of the properties file

  • pegasus.properties
  • pegasus.user.properties

The following example provides a sensible set of properties to be set by the user property file. These properties use mostly non-default settings. It is an example only, and will not work for you:

pegasus.catalog.replica              File
pegasus.catalog.replica.file         ${pegasus.home}/etc/sample.rc.data
pegasus.catalog.transformation       Text
pegasus.catalog.transformation.file  ${pegasus.home}/etc/sample.tc.text
pegasus.catalog.site.file            ${pegasus.home}/etc/sample.sites.xml

If you are in doubt which properties are actually visible, pegasus during the planning of the workflow dumps all properties after reading and prioritizing in the submit directory in a file with the suffix properties.

10.1.1. pegasus.home

Systems: all
Type: directory location string
Default: "$PEGASUS_HOME"

The property pegasus.home cannot be set in the property file. This property is automatically set up by the pegasus clients internally by determining the installation directory of pegasus. Knowledge about this property is important for developers who want to invoke PEGASUS JAVA classes without the shell wrappers.

10.1.2. Local Directories

This section describes the GNU directory structure conventions. GNU distinguishes between architecture independent and thus sharable directories, and directories with data specific to a platform, and thus often local. It also distinguishes between frequently modified data and rarely changing data. These two axis form a space of four distinct directories.

10.1.2.1. pegasus.home.datadir

Systems: all
Type: directory location string
Default: ${pegasus.home}/share

The datadir directory contains broadly visiable and possilby exported configuration files that rarely change. This directory is currently unused.

10.1.2.2. pegasus.home.sysconfdir

Systems: all
Type: directory location string
Default: ${pegasus.home}/etc

The system configuration directory contains configuration files that are specific to the machine or installation, and that rarely change. This is the directory where the XML schema definition copies are stored, and where the base pool configuration file is stored.

10.1.2.3. pegasus.home.sharedstatedir

Systems: all
Type: directory location string
Default: ${pegasus.home}/com

Frequently changing files that are broadly visible are stored in the shared state directory. This is currently unused.

10.1.2.4. pegasus.home.localstatedir

Systems: all
Type: directory location string
Default: ${pegasus.home}/var

Frequently changing files that are specific to a machine and/or installation are stored in the local state directory. This directory is being used for the textual transformation catalog, and the file-based replica catalog.

10.1.2.5. pegasus.dir.submit.logs

System: Pegasus
Since: 2.4
Type: directory location string
Default: false

By default, Pegasus points the condor logs for the workflow to /tmp directory. This is done to ensure that the logs are created in a local directory even though the submit directory maybe on NFS. In the submit directory the symbolic link to the appropriate log file in the /tmp exists.

However, since /tmp is automatically purged in most cases, users may want to preserve their condor logs in a directory on the local filesystem other than /tmp

10.1.3. Site Directories

The site directory properties modify the behavior of remotely run jobs. In rare occasions, it may also pertain to locally run compute jobs.

10.1.3.1. pegasus.dir.useTimestamp

System: Pegasus
Since: 2.1
Type: Boolean
Default: false

While creating the submit directory, Pegasus employs a run numbering scheme. Users can use this property to use a timestamp based numbering scheme instead of the runxxxx scheme.

10.1.3.2. pegasus.dir.exec

System: Pegasus
Since: 2.0
Type: remote directory location string
Default: (no default)

This property modifies the remote location work directory in which all your jobs will run. If the path is relative then it is appended to the work directory (associated with the site), as specified in the site catalog. If the path is absolute then it overrides the work directory specified in the site catalog.

10.1.3.3. pegasus.dir.storage.deep

System: Pegasus
Since: 2.1
Type: Boolean
Default: false
See Also: pegasus.dir.storage
See Also: pegasus.dir.useTimestamp

This property results in the creation of a deep directory structure on the output site, while populating the results. The base directory on the remote end is determined from the site catalog and the property pegasus.dir.storage.

To this base directory, the relative submit directory structure ( $user/$vogroup/$label/runxxxx ) is appended.

$storage = $base + $relative_submit_directory

Depending on the number of files being staged to the remote site a Hashed File Structure is created that ensures that only 256 files reside in one directory.

To create this directory structure on the storage site, Pegasus relies on the directory creation feature of the Grid FTP server, which appeared in globus 4.0.x

10.1.3.4. pegasus.dir.create.strategy

System: Pegasus
Since: 2.2
Type: enumeration
Value[0]: HourGlass
Value[1]: Tentacles
Default: Tentacles

If the

--randomdir

option is given to the Planner at runtime, the Pegasus planner adds nodes that create the random directories at the remote pool sites, before any jobs are actually run. The two modes determine the placement of these nodes and their dependencies to the rest of the graph.

HourGlass
It adds a make directory node at the top level of the graph, and all these concat to a single dummy job before branching out to the root nodes of the original/ concrete dag so far. So we introduce a classic X shape at the top of the graph. Hence the name HourGlass.
Tentacles
This option places the jobs creating directories at the top of the graph. However instead of constricting it to an hour glass shape, this mode links the top node to all the relevant nodes for which the create dir job is necessary. It looks as if the node spreads its tentacleas all around. This puts more load on the DAGMan because of the added dependencies but removes the restriction of the plan progressing only when all the create directory jobs have progressed on the remote pools, as is the case in the HourGlass model.

10.1.3.5. pegasus.dir.create.impl

System: Pegasus
Since: 2.2
Type: enumeration
Value[0]: DefaultImplementation
Value[1]: S3
Default: DefaultImpelmentation

This property is used to select the executable that is used to create the working directory on the compute sites.

DefaultImplementation
The default executable that is used to create a directory is the dirmanager executable shipped with Pegasus. It is found at $PEGASUS_HOME/bin/dirmanager in the pegasus distribution. An entry for transformation pegasus::dirmanager needs to exist in the Transformation Catalog or the PEGASUS_HOME environment variable should be specified in the site catalog for the sites for this mode to work.
S3
This option is used to create buckets in S3 instead of a directory. This should be set when running workflows on Amazon EC2. This implementation relies on s3cmd command line client to create the bucket. An entry for transformation amazon::s3cmd needs to exist in the Transformation Catalog for this to work.

10.1.4. Schema File Location Properties

This section defines the location of XML schema files that are used to parse the various XML document instances in the PEGASUS. The schema backups in the installed file-system permit PEGASUS operations without being online.

10.1.4.1. pegasus.schema.dax

Systems: Pegasus
Since: 2.0
Type: XML schema file location string
Value[0]: ${pegasus.home.sysconfdir}/dax-3.2.xsd
Default: ${pegasus.home.sysconfdir}/dax-3.2.xsd

This file is a copy of the XML schema that describes abstract DAG files that are the result of the abstract planning process, and input into any concrete planning. Providing a copy of the schema enables the parser to use the local copy instead of reaching out to the internet, and obtaining the latest version from the GriPhyN website dynamically.

10.1.4.2. pegasus.schema.sc

Systems: Pegasus
Since: 2.0
Type: XML schema file location string
Value[0]: ${pegasus.home.sysconfdir}/sc-3.0.xsd
Default: ${pegasus.home.sysconfdir}/sc-3.0.xsd

This file is a copy of the XML schema that describes the xml description of the site catalog, that is generated as a result of using genpoolconfig command. Providing a copy of the schema enables the parser to use the local copy instead of reaching out to the internet, and obtaining the latest version from the GriPhyN website dynamically.

10.1.4.3. pegasus.schema.ivr

Systems: all
Type: XML schema file location string
Value[0]: ${pegasus.home.sysconfdir}/iv-2.0.xsd
Default: ${pegasus.home.sysconfdir}/iv-2.0.xsd

This file is a copy of the XML schema that describes invocation record files that are the result of the a grid launch in a remote or local site. Providing a copy of the schema enables the parser to use the local copy instead of reaching out to the internet, and obtaining the latest version from the GriPhyN website dynamically.

10.1.5. Database Drivers For All Relational Catalogs

10.1.5.1. pegasus.catalog.*.db.driver

System: Pegasus
Type: Java class name
Value[0]: Postgres
Value[1]: MySQL
Value[2]: SQLServer2000 (not yet implemented!)
Value[3]: Oracle (not yet implemented!)
Default: (no default)
See also: pegasus.catalog.provenance

The database driver class is dynamically loaded, as required by the schema. Currently, only PostGreSQL 7.3 and MySQL 4.0 are supported. Their respective JDBC3 driver is provided as part and parcel of the PEGASUS.

A user may provide their own implementation, derived from org.griphyn.vdl.dbdriver.DatabaseDriver, to talk to a database of their choice.

For each schema in PTC, a driver is instantiated separately, which has the same prefix as the schema. This may result in multiple connections to the database backend. As fallback, the schema "*" driver is attempted.

The * in the property name can be replaced by a catalog name to apply the property only for that catalog. Valid catalog names are

replica
provenance

10.1.5.2. pegasus.catalog.*.db.url

System: PTC, ...
Type: JDBC database URI string
Default: (no default)
Example: jdbc:postgresql:${user.name}

Each database has its own string to contact the database on a given host, port, and database. Although most driver URLs allow to pass arbitrary arguments, please use the pegasus.catalog.[catalog-name].db.* keys or pegasus.catalog.*.db.* to preload these arguments. THE URL IS A MANDATORY PROPERTY FOR ANY DBMS BACKEND.

Postgres : jdbc:postgresql:[//hostname[:port]/]database
MySQL    : jdbc:mysql://hostname[:port]]/database
SQLServer: jdbc:microsoft:sqlserver://hostname:port
Oracle   : jdbc:oracle:thin:[user/password]@//host[:port]/service

The * in the property name can be replaced by a catalog name to apply the property only for that catalog. Valid catalog names are

replica
provenance

10.1.5.3. pegasus.catalog.*.db.user

System: PTC, ...
Type: string
Default: (no default)
Example: ${user.name}

In order to access a database, you must provide the name of your account on the DBMS. This property is database-independent. THIS IS A MANDATORY PROPERTY FOR MANY DBMS BACKENDS.

The * in the property name can be replaced by a catalog name to apply the property only for that catalog. Valid catalog names are

replica
provenance

10.1.5.4. pegasus.catalog.*.db.password

System: PTC, ...
Type: string
Default: (no default)
Example: ${user.name}

In order to access a database, you must provide an optional password of your account on the DBMS. This property is database-independent. THIS IS A MANDATORY PROPERTY, IF YOUR DBMS BACKEND ACCOUNT REQUIRES A PASSWORD.

The * in the property name can be replaced by a catalog name to apply the property only for that catalog. Valid catalog names are

replica
provenance

10.1.5.5. pegasus.catalog.*.db.*

System: PTC, RC

Each database has a multitude of options to control in fine detail the further behaviour. You may want to check the JDBC3 documentation of the JDBC driver for your database for details. The keys will be passed as part of the connect properties by stripping the "pegasus.catalog.[catalog-name].db." prefix from them. The catalog-name can be replaced by the following values provenance for Provenance Catalog (PTC), replica for Replica Catalog (RC)

Postgres 7.3 parses the following properties:

pegasus.catalog.*.db.user
pegasus.catalog.*.db.password
pegasus.catalog.*.db.PGHOST
pegasus.catalog.*.db.PGPORT
pegasus.catalog.*.db.charSet
pegasus.catalog.*.db.compatible

MySQL 4.0 parses the following properties:

pegasus.catalog.*.db.user
pegasus.catalog.*.db.password
pegasus.catalog.*.db.databaseName
pegasus.catalog.*.db.serverName
pegasus.catalog.*.db.portNumber
pegasus.catalog.*.db.socketFactory
pegasus.catalog.*.db.strictUpdates
pegasus.catalog.*.db.ignoreNonTxTables
pegasus.catalog.*.db.secondsBeforeRetryMaster
pegasus.catalog.*.db.queriesBeforeRetryMaster
pegasus.catalog.*.db.allowLoadLocalInfile
pegasus.catalog.*.db.continueBatchOnError
pegasus.catalog.*.db.pedantic
pegasus.catalog.*.db.useStreamLengthsInPrepStmts
pegasus.catalog.*.db.useTimezone
pegasus.catalog.*.db.relaxAutoCommit
pegasus.catalog.*.db.paranoid
pegasus.catalog.*.db.autoReconnect
pegasus.catalog.*.db.capitalizeTypeNames
pegasus.catalog.*.db.ultraDevHack
pegasus.catalog.*.db.strictFloatingPoint
pegasus.catalog.*.db.useSSL
pegasus.catalog.*.db.useCompression
pegasus.catalog.*.db.socketTimeout
pegasus.catalog.*.db.maxReconnects
pegasus.catalog.*.db.initialTimeout
pegasus.catalog.*.db.maxRows
pegasus.catalog.*.db.useHostsInPrivileges
pegasus.catalog.*.db.interactiveClient
pegasus.catalog.*.db.useUnicode
pegasus.catalog.*.db.characterEncoding

MS SQL Server 2000 support the following properties (keys are case-insensitive, e.g. both "user" and "User" are valid):

pegasus.catalog.*.db.User
pegasus.catalog.*.db.Password
pegasus.catalog.*.db.DatabaseName
pegasus.catalog.*.db.ServerName
pegasus.catalog.*.db.HostProcess
pegasus.catalog.*.db.NetAddress
pegasus.catalog.*.db.PortNumber
pegasus.catalog.*.db.ProgramName
pegasus.catalog.*.db.SendStringParametersAsUnicode
pegasus.catalog.*.db.SelectMethod

The * in the property name can be replaced by a catalog name to apply the property only for that catalog. Valid catalog names are

replica
provenance

10.1.6. Catalog Properties

10.1.6.1. Replica Catalog

10.1.6.1.1. pegasus.catalog.replica

System: Pegasus
Since: 2.0
Type: enumeration
Value[0]: RLS
Value[1]: LRC
Value[2]: JDBCRC
Value[3]: File
Value[4]: Directory
Value[5]: MRC
Value[6]: Regex
Default: RLS

Pegasus queries a Replica Catalog to discover the physical filenames (PFN) for input files specified in the DAX. Pegasus can interface with various types of Replica Catalogs. This property specifies which type of Replica Catalog to use during the planning process.

RLS
RLS (Replica Location Service) is a distributed replica catalog, which ships with GT4. There is an index service called Replica Location Index (RLI) to which 1 or more Local Replica Catalog (LRC) report. Each LRC can contain all or a subset of mappings. In this mode, Pegasus queries the central RLI to discover in which LRC's the mappings for a LFN reside. It then queries the individual LRC's for the PFN's. To use RLS, the user additionally needs to set the property pegasus.catalog.replica.url to specify the URL for the RLI to query. Details about RLS can be found at http://www.globus.org/toolkit/data/rls/
LRC
If the user does not want to query the RLI, but directly a single Local Replica Catalog. To use LRC, the user additionally needs to set the property pegasus.catalog.replica.url to specify the URL for the LRC to query. Details about RLS can be found at http://www.globus.org/toolkit/data/rls/
JDBCRC
In this mode, Pegasus queries a SQL based replica catalog that is accessed via JDBC. The sql schema's for this catalog can be found at $PEGASUS_HOME/sql directory. To use JDBCRC, the user additionally needs to set the following properties
  1. pegasus.catalog.replica.db.url
  2. pegasus.catalog.replica.db.user
  3. pegasus.catalog.replica.db.password
File

In this mode, Pegasus queries a file based replica catalog. It is neither transactionally safe, nor advised to use for production purposes in any way. Multiple concurrent instances will clobber each other!. The site attribute should be specified whenever possible. The attribute key for the site attribute is "pool".

The LFN may or may not be quoted. If it contains linear whitespace, quotes, backslash or an equality sign, it must be quoted and escaped. Ditto for the PFN. The attribute key-value pairs are separated by an equality sign without any whitespaces. The value may be in quoted. The LFN sentiments about quoting apply.

LFN PFN
LFN PFN a=b [..]
LFN PFN a="b" [..]
"LFN w/LWS" "PFN w/LWS" [..]

To use File, the user additionally needs to specify pegasus.catalog.replica.file property to specify the path to the file based RC.

Regex

In this mode, Pegasus queries a file based replica catalog. It is neither transactionally safe, nor advised to use for production purposes in any way. Multiple concurrent access to the File will end up clobbering the contents of the file. The site attribute should be specified whenever possible. The attribute key for the site attribute is "pool".

The LFN may or may not be quoted. If it contains linear whitespace, quotes, backslash or an equality sign, it must be quoted and escaped. Ditto for the PFN. The attribute key-value pairs are separated by an equality sign without any whitespaces. The value may be in quoted. The LFN sentiments about quoting apply.

In addition users can specifiy regular expression based LFN's. A regular expression based entry should be qualified with an attribute named 'regex'. The attribute regex when set to true identifies the catalog entry as a regular expression based entry. Regular expressions should follow Java regular expression syntax.

For example, consider a replica catalog as shown below.

Entry 1 refers to an entry which does not use a resular expressions. This entry would only match a file named 'f.a', and nothing else. Entry 2 referes to an entry which uses a regular expression. In this entry f.a referes to files having name as f[any-character]a i.e. faa, f.a, f0a, etc.

f.a file:///Volumes/data/input/f.a pool="local"
f.a file:///Volumes/data/input/f.a pool="local" regex="true"

Regular expression based entries also support substitutions. For example, consider the regular expression based entry shown below.

Entry 3 will match files with name alpha.csv, alpha.txt, alpha.xml. In addition, values matched in the expression can be used to generate a PFN.

For the entry below if the file being looked up is alpha.csv, the PFN for the file would be generated as file:///Volumes/data/input/csv/alpha.csv. Similary if the file being lookedup was alpha.csv, the PFN for the file would be generated as file:///Volumes/data/input/xml/alpha.xml i.e. The section [0], [1] will be replaced. Section [0] refers to the entire string i.e. alpha.csv. Section [1] refers to a partial match in the input i.e. csv, or txt, or xml. Users can utilize as many sections as they wish.

alpha\.(csv|txt|xml) file:///Volumes/data/input/[1]/[0] pool="local" regex="true"

To use File, the user additionally needs to specify pegasus.catalog.replica.file property to specify the path to the file based RC.

Directory

In this mode, Pegasus does a directory listing on an input directory to create the LFN to PFN mappings. The directory listing is performed recursively, resulting in deep LFN mappings. For example, if an input directory $input is specified with the following structure

$input
$input/f.1
$input/f.2
$input/D1
$input/D1/f.3

Pegasus will create the mappings the following LFN PFN mappings internally

f.1 file://$input/f.1  pool="local"
f.2 file://$input/f.2  pool="local"
D1/f.3 file://$input/D2/f.3 pool="local"

pegasus-plan has --input-dir option that can be used to specify an input directory.

Users can optionally specify additional properties to configure the behvavior of this implementation.

pegasus.catalog.replica.directory.site to specify a site attribute other than local to associate with the mappings.

pegasus.catalog.replica.directory.url.prefix to associate a URL prefix for the PFN's constructed. If not specified, the URL defaults to file://

MRC

In this mode, Pegasus queries multiple replica catalogs to discover the file locations on the grid. To use it set

pegasus.catalog.replica MRC

Each associated replica catalog can be configured via properties as follows.

The user associates a variable name referred to as [value] for each of the catalogs, where [value] is any legal identifier (concretely [A-Za-z][_A-Za-z0-9]*) For each associated replica catalogs the user specifies the following properties.

pegasus.catalog.replica.mrc.[value]       specifies the type of replica catalog.
pegasus.catalog.replica.mrc.[value].key   specifies a property name key for a
particular catalog

For example, if a user wants to query two lrc's at the same time he/she can specify as follows

pegasus.catalog.replica.mrc.lrc1 LRC
pegasus.catalog.replica.mrc.lrc2.url rls://sukhna
pegasus.catalog.replica.mrc.lrc2 LRC
pegasus.catalog.replica.mrc.lrc2.url rls://smarty

In the above example, lrc1, lrc2 are any valid identifier names and url is the property key that needed to be specified.

10.1.6.1.2. pegasus.catalog.replica.url

System: Pegasus
Since: 2.0
Type: URI string
Default: (no default)

When using the modern RLS replica catalog, the URI to the Replica catalog must be provided to Pegasus to enable it to look up filenames. There is no default.

10.1.6.1.3. pegasus.catalog.replica.chunk.size

System: Pegasus, rc-client
Since: 2.0
Type: Integer
Default: 1000

The rc-client takes in an input file containing the mappings upon which to work. This property determines, the number of lines that are read in at a time, and worked upon at together. This allows the various operations like insert, delete happen in bulk if the underlying replica implementation supports it.

10.1.6.1.4. pegasus.catalog.replica.lrc.ignore

System: Replica Catalog - RLS
Since: 2.0
Type: comma separated list of LRC urls
Default: (no default)
See also: pegasus.catalog.replica.lrc.restrict

Certain users may like to skip some LRCs while querying for the physical locations of a file. If some LRCs need to be skipped from those found in the rli then use this property. You can define either the full URL or partial domain names that need to be skipped. E.g. If a user wants rls://smarty.isi.edu and all LRCs on usc.edu to be skipped then the property will be set as pegasus.rls.lrc.ignore=rls://smarty.isi.edu,usc.edu

10.1.6.1.5. pegasus.catalog.replica.lrc.restrict

System: Replica Catalog - RLS
Since: 1.3.9
Type: comma separated list of LRC urls
Default: (no default)
See also: pegasus.catalog.replica.lrc.ignore

This property applies a tighter restriction on the results returned from the LRCs specified. Only those PFNs are returned that have a pool attribute associated with them. The property "pegasus.rc.lrc.ignore" has a higher priority than "pegasus.rc.lrc.restrict". For example, in case a LRC is specified in both properties, the LRC would be ignored (i.e. not queried at all instead of applying a tighter restriction on the results returned).

10.1.6.1.6. pegasus.catalog.replica.lrc.site.[site-name]

System: Replica Catalog - RLS
Since: 2.3.0
Type: LRC url
Default: (no default)

This property allows for the LRC url to be associated with site handles. Usually, a pool attribute is required to be associated with the PFN for Pegasus to figure out the site on which PFN resides. However, in the case where an LRC is responsible for only a single site's mappings, Pegasus can safely associate LRC url with the site. This association can be used to determine the pool attribute for all mappings returned from the LRC, if the mapping does not have a pool attribute associated with it.

The site_name in the property should be replaced by the name of the site. For example

pegasus.catalog.replica.lrc.site.isi  rls://lrc.isi.edu

tells Pegasus that all PFNs returned from LRC rls://lrc.isi.edu are associated with site isi.

The [site_name] should be the same as the site handle specified in the site catalog.

10.1.6.1.7. pegasus.catalog.replica.cache.asrc

System: Pegasus
Since: 2.0
Type: Boolean
Value[0]: false
Value[1]: true
Default: false
See also: pegasus.catalog.replica

This property determines whether to treat the cache file specified as a supplemental replica catalog or not. User can specify on the command line to pegasus-plan a comma separated list of cache files using the --cache option. By default, the LFN->PFN mappings contained in the cache file are treated as cache, i.e if an entry is found in a cache file the replica catalog is not queried. This results in only the entry specified in the cache file to be available for replica selection.

Setting this property to true, results in the cache files to be treated as supplemental replica catalogs. This results in the mappings found in the replica catalog (as specified by pegasus.catalog.replica) to be merged with the ones found in the cache files. Thus, mappings for a particular LFN found in both the cache and the replica catalog are available for replica selection.

10.1.6.2. Site Catalog

10.1.6.2.1. pegasus.catalog.site

System: Site Catalog
Since: 2.0
Type: enumeration
Value[0]: XML4
Value[1]: XML3
Default: XML4

The site catalog file format is now automatically detected, so there should be no need to use the property anymore.

10.1.6.2.2. pegasus.catalog.site.file

System: Site Catalog
Since: 2.0
Type: file location string
Default: ${pegasus.home.sysconfdir}/sites.xml
See also: pegasus.catalog.site

Running things on the grid requires an extensive description of the capabilities of each compute cluster, commonly termed "site". This property describes the location of the file that contains such a site description. As the format is currently in flow, please refer to the userguide and Pegasus for details which format is expected.

10.1.6.3. Transformation Catalog

10.1.6.3.1. pegasus.catalog.transformation

System: Transformation Catalog
Since: 2.0
Type: enumeration
Value[0]: Text
Value[1]: File
Default: Text
See also: pegasus.catalog.transformation.file

Text

In this mode, a multiline file based format is understood. The file is read and cached in memory. Any modifications, as adding or deleting, causes an update of the memory and hence to the file underneath. All queries are done against the memory representation.

The file sample.tc.text in the etc directory contains an example

Here is a sample textual format for transfomation catalog containing one transformation on two sites

tr example::keg:1.0 {
#specify profiles that apply for all the sites for the transformation
#in each site entry the profile can be overriden
profile env "APP_HOME" "/tmp/karan"
profile env "JAVA_HOME" "/bin/app"
site isi {
profile env "me" "with"
profile condor "more" "test"
profile env "JAVA_HOME" "/bin/java.1.6"
pfn "/path/to/keg"
arch  "x86"
os    "linux"
osrelease "fc"
osversion "4"
type "INSTALLED"
site wind {
profile env "me" "with"
profile condor "more" "test"
pfn "/path/to/keg"
arch  "x86"
os    "linux"
osrelease "fc"
osversion "4"
type "STAGEABLE"

File
THIS FORMAT IS DEPRECATED. WILL BE REMOVED IN COMING VERSIONS. USE pegasus-tc-converter to convert File format to Text Format. In this mode, a file format is understood. The file is read and cached in memory. Any modifications, as adding or deleting, causes an update of the memory and hence to the file underneath. All queries are done against the memory representation. The new TC file format uses 6 columns:
  1. The resource ID is represented in the first column.
  2. The logical transformation uses the colonized format ns::name:vs.
  3. The path to the application on the system
  4. The installation type is identified by one of the following keywords - all upper case: INSTALLED, STAGEABLE. If not specified, or NULL is used, the type defaults to INSTALLED.
  5. The system is of the format ARCH::OS[:VER:GLIBC]. The following arch types are understood: "INTEL32", "INTEL64", "SPARCV7", "SPARCV9". The following os types are understood: "LINUX", "SUNOS", "AIX". If unset or NULL, defaults to INTEL32::LINUX.
  6. Profiles are written in the format NS::KEY=VALUE,KEY2=VALUE;NS2::KEY3=VALUE3 Multiple key-values for same namespace are seperated by a comma "," and multiple namespaces are seperated by a semicolon ";". If any of your profile values contains a comma you must not use the namespace abbreviator.

10.1.6.3.2. pegasus.catalog.transformation.file

Systems: Transformation Catalog
Type: file location string
Default: ${pegasus.home.sysconfdir}/tc.text | ${pegasus.home.sysconfdir}/tc.data
See also: pegasus.catalog.transformation

This property is used to set the path to the textual transformation catalogs of type File or Text. If the transformation catalog is of type Text then tc.text file is picked up from sysconfdir, else tc.data

10.1.6.4. Provenance Catalog

10.1.6.4.1. pegasus.catalog.provenance

System: Provenance Tracking Catalog (PTC)
Since: 2.0
Type: Java class name
Value[0]: InvocationSchema
Value[1]: NXDInvSchema
Default: (no default)
See also: pegasus.catalog.*.db.driver

This property denotes the schema that is being used to access a PTC. The PTC is usually not a standard installation. If you use a database backend, you most likely have a schema that supports PTCs. By default, no PTC will be used.

Currently only the InvocationSchema is available for storing the provenance tracking records. Beware, this can become a lot of data. The values are names of Java classes. If no absolute Java classname is given, "org.griphyn.vdl.dbschema." is prepended. Thus, by deriving from the DatabaseSchema API, and implementing the PTC interface, users can provide their own classes here.

Alternatively, if you use a native XML database like eXist, you can store data using the NXDInvSchema. This will avoid using any of the other database driver properties.

10.1.6.4.2. pegasus.catalog.provenance.refinement

System: PASOA Provenance Store
Since: 2.0.1
Type: Java class name
Value[0]: Pasoa
Value[1]: InMemory
Default: InMemory
See also: pegasus.catalog.*.db.driver

This property turns on the logging of the refinement process that happens inside Pegasus to the PASOA store. Not all actions are currently captured. It is still an experimental feature.

The PASOA store needs to run on localhost on port 8080 https://localhost:8080/prserv-1.0

10.1.7. Replica Selection Properties

10.1.7.1. pegasus.selector.replica

System: Replica Selection
Since: 2.0
Type: URI string
Default: default
See also: pegasus.replica.*.ignore.stagein.sites
See also: pegasus.replica.*.prefer.stagein.sites

Each job in the DAX maybe associated with input LFN's denoting the files that are required for the job to run. To determine the physical replica (PFN) for a LFN, Pegasus queries the replica catalog to get all the PFN's (replicas) associated with a LFN. Pegasus then calls out to a replica selector to select a replica amongst the various replicas returned. This property determines the replica selector to use for selecting the replicas.

Default
If a PFN that is a file URL (starting with file:///) and has a pool attribute matching to the site handle of the site where the compute is to be run is found, then that is returned. Else,a random PFN is selected amongst all the PFN's that have a pool attribute matching to the site handle of the site where a compute job is to be run. Else, a random pfn is selected amongst all the PFN's.
Restricted

This replica selector, allows the user to specify good sites and bad sites for staging in data to a particular compute site. A good site for a compute site X, is a preferred site from which replicas should be staged to site X. If there are more than one good sites having a particular replica, then a random site is selected amongst these preferred sites.

A bad site for a compute site X, is a site from which replica's should not be staged. The reason of not accessing replica from a bad site can vary from the link being down, to the user not having permissions on that site's data.

The good | bad sites are specified by the properties

pegasus.replica.*.prefer.stagein.sites
pegasus.replica.*.ignore.stagein.sites

where the * in the property name denotes the name of the compute site. A * in the property key is taken to mean all sites.

The pegasus.replica.*.prefer.stagein.sites property takes precedence over pegasus.replica.*.ignore.stagein.sites property i.e. if for a site X, a site Y is specified both in the ignored and the preferred set, then site Y is taken to mean as only a preferred site for a site X.

Regex

This replica selector allows the user allows the user to specific regex expressions that can be used to rank various PFN's returned from the Replica Catalog for a particular LFN. This replica selector selects the highest ranked PFN i.e the replica with the lowest rank value.

The regular expressions are assigned different rank, that determine the order in which the expressions are employed. The rank values for the regex can expressed in user properties using the property.

pegasus.selector.replica.regex.rank.[value]   regex-expression

The value is an integer value that denotes the rank of an expression with a rank value of 1 being the highest rank.

Please note that before applying any regular expressions on the PFN's, the file URL's that dont match the preferred site are explicitly filtered out.

Local
This replica selector prefers replicas from the local host and that start with a file: URL scheme. It is useful, when users want to stagin files to a remote site from your submit host using the Condor file transfer mechanism.

10.1.7.2. pegasus.selector.replica.*.ignore.stagein.sites

System: Replica Selection
Type: comma separated list of sites
Since: 2.0
Default: no default
See also: pegasus.selector.replica
See also: pegasus.selector.replica.*.prefer.stagein.sites

A comma separated list of storage sites from which to never stage in data to a compute site. The property can apply to all or a single compute site, depending on how the * in the property name is expanded.

The * in the property name means all compute sites unless replaced by a site name.

For e.g setting pegasus.selector.replica.*.ignore.stagein.sites to usc means that ignore all replicas from site usc for staging in to any compute site. Setting pegasus.replica.isi.ignore.stagein.sites to usc means that ignore all replicas from site usc for staging in data to site isi.

10.1.7.3. pegasus.selector.replica.*.prefer.stagein.sites

System: Replica Selection
Type: comma separated list of sites
Since: 2.0
Default: no default
See also: pegasus.selector.replica
See also: pegasus.selector.replica.*.ignore.stagein.sites

A comma separated list of preferred storage sites from which to stage in data to a compute site. The property can apply to all or a single compute site, depending on how the * in the property name is expanded.

The * in the property name means all compute sites unless replaced by a site name.

For e.g setting pegasus.selector.replica.*.prefer.stagein.sites to usc means that prefer all replicas from site usc for staging in to any compute site. Setting pegasus.replica.isi.prefer.stagein.sites to usc means that prefer all replicas from site usc for staging in data to site isi.

10.1.7.4. pegasus.selector.replica.regex.rank.[value]

System: Replica Selection
Type: Regex Expression
Since: 2.3.0
Default: no default
See also: pegasus.selector.replica

Specifies the regex expressions to be applied on the PFNs returned for a particular LFN. Refer to

http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html

on information of how to construct a regex expression.

The [value] in the property key is to be replaced by an int value that designates the rank value for the regex expression to be applied in the Regex replica selector.

The example below indicates preference for file URL's over URL's referring to gridftp server at example.isi.edu

pegasus.selector.replica.regex.rank.1 file://.*
pegasus.selector.replica.regex.rank.2 gsiftp://example\.isi\.edu.*

10.1.8. Site Selection Properties

10.1.8.1. pegasus.selector.site

System: Pegasus
Since: 2.0
Type: enumeration
Value[0]: Random
Value[1]: RoundRobin
Value[2]: NonJavaCallout
Value[3]: Group
Value[4]: Heft
Default: Random
See also: pegasus.selector.site.path
See also: pegasus.selector.site.timeout
See also: pegasus.selector.site.keep.tmp
See also: pegasus.selector.site.env.*

The site selection in Pegasus can be on basis of any of the following strategies.

Random
In this mode, the jobs will be randomly distributed among the sites that can execute them.
RoundRobin
In this mode. the jobs will be assigned in a round robin manner amongst the sites that can execute them. Since each site cannot execute everytype of job, the round robin scheduling is done per level on a sorted list. The sorting is on the basis of the number of jobs a particular site has been assigned in that level so far. If a job cannot be run on the first site in the queue (due to no matching entry in the transformation catalog for the transformation referred to by the job), it goes to the next one and so on. This implementation defaults to classic round robin in the case where all the jobs in the workflow can run on all the sites.
NonJavaCallout

In this mode, Pegasus will callout to an external site selector.In this mode a temporary file is prepared containing the job information that is passed to the site selector as an argument while invoking it. The path to the site selector is specified by setting the property pegasus.site.selector.path. The environment variables that need to be set to run the site selector can be specified using the properties with a pegasus.site.selector.env. prefix. The temporary file contains information about the job that needs to be scheduled. It contains key value pairs with each key value pair being on a new line and separated by a =.

The following pairs are currently generated for the site selector temporary file that is generated in the NonJavaCallout.

version is the version of the site selector api,currently 2.0.
transformation is the fully-qualified definition identifier for the transformation (TR) namespace::name:version.
derivation is teh fully qualified definition identifier for the derivation (DV), namespace::name:version.
job.level is the job's depth in the tree of the workflow DAG.
job.id is the job's ID, as used in the DAX file.
resource.id is a pool handle, followed by whitespace, followed by a gridftp server. Typically, each gridftp server is enumerated once, so you may have multiple occurances of the same site. There can be multiple occurances of this key.
input.lfn is an input LFN, optionally followed by a whitespace and file size. There can be multiple occurances of this key,one for each input LFN required by the job.
wf.name label of the dax, as found in the DAX's root element. wf.index is the DAX index, that is incremented for each partition in case of deferred planning.
wf.time is the mtime of the workflow.
wf.manager is the name of the workflow manager being used .e.g condor
vo.name is the name of the virtual organization that is running this workflow. It is currently set to NONE
vo.group unused at present and is set to NONE.
 

Group
In this mode, a group of jobs will be assigned to the same site that can execute them. The use of the PEGASUS profile key group in the dax, associates a job with a particular group. The jobs that do not have the profile key associated with them, will be put in the default group. The jobs in the default group are handed over to the "Random" Site Selector for scheduling.
Heft

In this mode, a version of the HEFT processor scheduling algorithm is used to schedule jobs in the workflow to multiple grid sites. The implementation assumes default data communication costs when jobs are not scheduled on to the same site. Later on this may be made more configurable.

The runtime for the jobs is specified in the transformation catalog by associating the pegasus profile key runtime with the entries.

The number of processors in a site is picked up from the attribute idle-nodes associated with the vanilla jobmanager of the site in the site catalog.

10.1.8.2. pegasus.selector.site.path

System: Site Selector
Since: 2.0
Type: String

If one calls out to an external site selector using the NonJavaCallout mode, this refers to the path where the site selector is installed. In case other strategies are used it does not need to be set.

10.1.8.3. pegasus.site.selector.env.*

System: Pegasus
Since: 1.2.3
Type: String

The environment variables that need to be set while callout to the site selector. These are the variables that the user would set if running the site selector on the command line. The name of the environment variable is got by stripping the keys of the prefix "pegasus.site.selector.env." prefix from them. The value of the environment variable is the value of the property.

e.g pegasus.site.selector.path.LD_LIBRARY_PATH /globus/lib would lead to the site selector being called with the LD_LIBRARY_PATH set to /globus/lib.

10.1.8.4. pegasus.selector.site.timeout

System: Site Selector
Since: 2.0
Type: non negative integer
Default: 60

It sets the number of seconds Pegasus waits to hear back from an external site selector using the NonJavaCallout interface before timing out.

10.1.8.5. pegasus.selector.site.keep.tmp

System: Pegasus
Since: 2.0
Type: enumeration
Value[0]: onerror
Value[1]: always
Value[2]: never
Default: onerror

It determines whether Pegasus deletes the temporary input files that are generated in the temp directory or not. These temporary input files are passed as input to the external site selectors.

A temporary input file is created for each that needs to be scheduled.

10.1.9. Data Staging Configuration

10.1.9.1. pegasus.data.configuration

System: Pegasus
Since: 3.1
Type: enumeration
Value[0]: sharedfs
Value[1]: nonsharedfs
Value[2]: condorio
Default: sharedfs

This property sets up Pegasus to run in different environments.

sharedfs
If this is set, Pegasus will be setup to execute jobs on the shared filesystem on the execution site. This assumes, that the head node of a cluster and the worker nodes share a filesystem. The staging site in this case is the same as the execution site. Pegasus adds a create dir job to the executable workflow that creates a workflow specific directory on the shared filesystem . The data transfer jobs in the executable workflow ( stage_in_ , stage_inter_ , stage_out_ ) transfer the data to this directory.The compute jobs in the executable workflow are launched in the directory on the shared filesystem. Internally, if this is set the following properties are set.
pegasus.execute.*.filesystem.local   false
condorio
If this is set, Pegasus will be setup to run jobs in a pure condor pool, with the nodes not sharing a filesystem. Data is staged to the compute nodes from the submit host using Condor File IO. The planner is automatically setup to use the submit host ( site local ) as the staging site. All the auxillary jobs added by the planner to the executable workflow ( create dir, data stagein and stage-out, cleanup ) jobs refer to the workflow specific directory on the local site. The data transfer jobs in the executable workflow ( stage_in_ , stage_inter_ , stage_out_ ) transfer the data to this directory. When the compute jobs start, the input data for each job is shipped from the workflow specific directory on the submit host to compute/worker node using Condor file IO. The output data for each job is similarly shipped back to the submit host from the compute/worker node. This setup is particularly helpful when running workflows in the cloud environment where setting up a shared filesystem across the VM's may be tricky. On loading this property, internally the following properies are set
pegasus.transfer.sls.*.impl          Condor
pegasus.execute.*.filesystem.local   true
pegasus.gridstart 		   PegasusLite
pegasus.transfer.worker.package      true
nonsharedfs
If this is set, Pegasus will be setup to execute jobs on an execution site without relying on a shared filesystem between the head node and the worker nodes. You can specify staging site ( using --staging-site option to pegasus-plan) to indicate the site to use as a central storage location for a workflow. The staging site is independant of the execution sites on which a workflow executes. All the auxillary jobs added by the planner to the executable workflow ( create dir, data stagein and stage-out, cleanup ) jobs refer to the workflow specific directory on the staging site. The data transfer jobs in the executable workflow ( stage_in_ , stage_inter_ , stage_out_ ) transfer the data to this directory. When the compute jobs start, the input data for each job is shipped from the workflow specific directory on the submit host to compute/worker node using pegasus-transfer. The output data for each job is similarly shipped back to the submit host from the compute/worker node. The protocols supported are at this time SRM, GridFTP, iRods, S3. This setup is particularly helpful when running workflows on OSG where most of the execution sites don't have enough data storage. Only a few sites have large amounts of data storage exposed that can be used to place data during a workflow run. This setup is also helpful when running workflows in the cloud environment where setting up a shared filesystem across the VM's may be tricky. On loading this property, internally the following properies are set
pegasus.execute.*.filesystem.local   true
pegasus.gridstart 		   PegasusLite
pegasus.transfer.worker.package      true

10.1.10. Transfer Configuration Properties

10.1.10.1. pegasus.transfer.*.impl

System: Pegasus
Type: enumeration
Value[0]: Transfer
Value[1]: GUC
Default: Transfer
See also: pegasus.transfer.refiner
Since: 2.0

Each compute job usually has data products that are required to be staged in to the execution site, materialized data products staged out to a final resting place, or staged to another job running at a different site. This property determines the underlying grid transfer tool that is used to manage the transfers.

The * in the property name can be replaced to achieve finer grained control to dictate what type of transfer jobs need to be managed with which grid transfer tool.

Usually,the arguments with which the client is invoked can be specified by

- the property pegasus.transfer.arguments
- associating the PEGASUS profile key transfer.arguments

The table below illustrates all the possible variations of the property.

Property Name Applies to
pegasus.transfer.stagein.impl the stage in transfer jobs
pegasus.transfer.stageout.impl the stage out transfer jobs
pegasus.transfer.inter.impl the inter pool transfer jobs
pegasus.transfer.setup.impl the setup transfer job
pegasus.transfer.*.impl apply to types of transfer jobs
 

Note: Since version 2.2.0 the worker package is staged automatically during staging of executables to the remote site. This is achieved by adding a setup transfer job to the workflow. The setup transfer job by default uses GUC to stage the data. The implementation to use can be configured by setting the property

pegasus.transfer.setup.impl 

property. However, if you have pegasus.transfer.*.impl set in your properties file, then you need to set pegasus.transfer.setup.impl to GUC

The various grid transfer tools that can be used to manage data transfers are explained below

Transfer

This results in pegasus-transfer to be used for transferring of files. It is a python based wrapper around various transfer clients like globus-url-copy, lcg-copy, wget, cp, ln . pegasus-transfer looks at source and destination url and figures out automatically which underlying client to use. pegasus-transfer is distributed with the PEGASUS and can be found at $PEGASUS_HOME/bin/pegasus-transfer.

For remote sites, Pegasus constructs the default path to pegasus-transfer on the basis of PEGASUS_HOME env profile specified in the site catalog. To specify a different path to the pegasus-transfer client , users can add an entry into the transformation catalog with fully qualified logical name as pegasus::pegasus-transfer

GUC
This refers to the new guc client that does multiple file transfers per invocation. The globus-url-copy client distributed with Globus 4.x is compatible with this mode.

10.1.10.2. pegasus.transfer.refiner

System: Pegasus
Type: enumeration
Value[0]: Basic
Value[1]: Cluster
Default: Cluster
Since: 2.0
See also: pegasus.transfer.*.impl

This property determines how the transfer nodes are added to the workflow. The various refiners differ in the how they link the various transfer jobs, and the number of transfer jobs that are created per compute jobs.

Basic
This is a basic refinement strategy that adds a stage-in job per compute job and a stage-out per compute jobs. It is not recommended to use this , especially for large workflows where lots of stage-in jobs maybe created for a workflow. This is only recommended for experimental setups.
Cluster

In this refinement strategy, clusters of stage-in and stageout jobs are created per level of the workflow. This workflow allows you to control the number of stagein and stageout jobs by associating pegasus profiles stagein.clusters and stageout.clusters with the jobs or in the site catalog for the staging sites.

10.1.10.3. pegasus.transfer.sls.*.impl

System: Pegasus
Type: enumeration
Value[0]: Transfer
Value[1]: Condor
Default: Transfer
Since: 2.2.0
See also: pegasus.data.configuration
See also: pegasus.execute.*.filesystem.local

This property specifies the transfer tool to be used for Second Level Staging (SLS) of input and output data between the head node and worker node filesystems.

Currently, the * in the property name CANNOT be replaced to achieve finer grained control to dictate what type of SLS transfers need to be managed with which grid transfer tool.

The various grid transfer tools that can be used to manage SLS data transfers are explained below

Transfer

This results in pegasus-transfer to be used for transferring of files. It is a python based wrapper around various transfer clients like globus-url-copy, lcg-copy, wget, cp, ln . pegasus-transfer looks at source and destination url and figures out automatically which underlying client to use. pegasus-transfer is distributed with the PEGASUS and can be found at $PEGASUS_HOME/bin/pegasus-transfer.

For remote sites, Pegasus constructs the default path to pegasus-transfer on the basis of PEGASUS_HOME env profile specified in the site catalog. To specify a different path to the pegasus-transfer client , users can add an entry into the transformation catalog with fully qualified logical name as pegasus::pegasus-transfer

Condor

This results in Condor file transfer mechanism to be used to transfer the input data files from the submit host directly to the worker node directories. This is used when running in pure Condor mode or in a Condor pool that does not have a shared filesystem between the nodes.

When setting the SLS transfers to Condor make sure that the following properties are also set

pegasus.gridstart		        PegasusLite
pegasus.execute.*.filesystem.local  true

Alternatively, you can set

pegasus.data.configuration           condorio

in lieu of the above 3 properties.

Also make sure that pegasus.gridstart is not set.

Please refer to the section on "Condor Pool Without a Shared Filesystem" in the chapter on Planning and Submitting.

10.1.10.4. pegasus.transfer.arguments

System: Pegasus
Since: 2.0
Type: String
Default: (no default)
See also: pegasus.transfer.sls.arguments

This determines the extra arguments with which the transfer implementation is invoked. The transfer executable that is invoked is dependant upon the transfer mode that has been selected. The property can be overloaded by associated the pegasus profile key transfer.arguments either with the site in the site catalog or the corresponding transfer executable in the transformation catalog.

10.1.10.5. pegasus.transfer.sls.arguments

System: Pegasus
Since: 2.4
Type: String
Default: (no default)
See also: pegasus.transfer.arguments
See also: pegasus.transfer.sls.*.impl

This determines the extra arguments with which the SLS transfer implementation is invoked. The transfer executable that is invoked is dependant upon the SLS transfer implementation that has been selected.

10.1.10.6. pegasus.transfer.stage.sls.file

System: Pegasus
Since: 3.0
Type: Boolean
Default: (no default)
See also: pegasus.gridstart
See also: pegasus.execute.*.filesystem.local

For executing jobs on the local filesystem, Pegasus creates sls files for each compute jobs. These sls files list the files that need to be staged to the worker node and the output files that need to be pushed out from the worker node after completion of the job. By default, pegasus will stage these SLS files to the shared filesystem on the head node as part of first level data stagein jobs. However, in the case where there is no shared filesystem between head nodes and the worker nodes, the user can set this property to false. This will result in the sls files to be transferred using the Condor File Transfer from the submit host.

10.1.10.7. pegasus.transfer.worker.package

System: Pegasus
Type: boolean
Default: false
Since: 3.0
See also: pegasus.data.configuration

By default, Pegasus relies on the worker package to be installed in a directory accessible to the worker nodes on the remote sites . Pegasus uses the value of PEGASUS_HOME environment profile in the site catalog for the remote sites, to then construct paths to pegasus auxillary executables like kickstart, pegasus-transfer, seqexec etc.

If the Pegasus worker package is not installed on the remote sites users can set this property to true to get Pegasus to deploy worker package on the nodes.

In the case of sharedfs setup, the worker package is deployed on the shared scratch directory for the workflow , that is accessible to all the compute nodes of the remote sites.

When running in nonsharefs environments, the worker package is first brought to the submit directory and then transferred to the worker node filesystem using Condor file IO.

10.1.10.8. pegasus.transfer.links

System: Pegasus
Type: boolean
Default: false
Since: 2.0
See also: pegasus.transfer
See also: pegasus.transfer.force

If this is set, and the transfer implementation is set to Transfer i.e. using the transfer executable distributed with the PEGASUS. On setting this property, if Pegasus while fetching data from the Replica Catalog sees a pool attribute associated with the PFN that matches the execution pool on which the data has to be transferred to, Pegasus instead of the URL returned by the Replica Catalog replaces it with a file based URL. This is based on the assumption that the if the pools match the filesystems are visible to the remote execution directory where input data resides. On seeing both the source and destination urls as file based URLs the transfer executable spawns a job that creates a symbolic link by calling ln -s on the remote pool.

10.1.10.9. pegasus.transfer.*.remote.sites

System: Pegasus
Type: comma separated list of sites
Default: no default
Since: 2.0

By default Pegasus looks at the source and destination URL's for to determine whether the associated transfer job runs on the submit host or the head node of a remote site, with preference set to run a transfer job to run on submit host.

Pegasus will run transfer jobs on the remote sites

-  if the file server for the compute site is a file server i.e url prefix file://
-  symlink jobs need to be added that require the symlink transfer jobs to
be run remotely.

This property can be used to change the default behaviour of Pegasus and force pegasus to run different types of transfer jobs for the sites specified on the remote site.

The table below illustrates all the possible variations of the property.

Property Name Applies to
pegasus.transfer.stagein.remote.sites the stage in transfer jobs
pegasus.transfer.stageout.remote.sites the stage out transfer jobs
pegasus.transfer.inter.remote.sites the inter pool transfer jobs
pegasus.transfer.*.remote.sites apply to types of transfer jobs
 

In addition * can be specified as a property value, to designate that it applies to all sites.

10.1.10.10. pegasus.transfer.staging.delimiter

System: Pegasus
Since: 2.0
Type: String
Default: :
See also: pegasus.transformation.selector

Pegasus supports executable staging as part of the workflow. Currently staging of statically linked executables is supported only. An executable is normally staged to the work directory for the workflow/partition on the remote site. The basename of the staged executable is derived from the namespace,name and version of the transformation in the transformation catalog. This property sets the delimiter that is used for the construction of the name of the staged executable.

10.1.10.11. pegasus.transfer.disable.chmod.sites

System: Pegasus
Since: 2.0
Type: comma separated list of sites
Default: no default

During staging of executables to remote sites, chmod jobs are added to the workflow. These jobs run on the remote sites and do a chmod on the staged executable. For some sites, this maynot be required. The permissions might be preserved, or there maybe an automatic mechanism that does it.

This property allows you to specify the list of sites, where you do not want the chmod jobs to be executed. For those sites, the chmod jobs are replaced by NoOP jobs. The NoOP jobs are executed by Condor, and instead will immediately have a terminate event written to the job log file and removed from the queue.

10.1.10.12. pegasus.transfer.setup.source.base.url

System: Pegasus
Type: URL
Default: no default
Since: 2.3

This property specifies the base URL to the directory containing the Pegasus worker package builds. During Staging of Executable, the Pegasus Worker Package is also staged to the remote site. The worker packages are by default pulled from the http server at pegasus.isi.edu. This property can be used to override the location from where the worker package are staged. This maybe required if the remote computes sites don't allows files transfers from a http server.

10.1.11. Gridstart And Exitcode Properties

10.1.11.1. pegasus.gridstart

System: Pegasus
Since: 2.0
Type: enumeration
Value[0]: Kickstart
Value[1]: None
Value[2]: PegasusLite
Default: Kickstart
See also: pegasus.execute.*.filesystem.local

Jobs that are launched on the grid maybe wrapped in a wrapper executable/script that enables information about about the execution, resource consumption, and - most importantly - the exitcode of the remote application. At present, a job scheduled on a remote site is launched with a gridstart if site catalog has the corresponding gridlaunch attribute set and the job being launched is not MPI.

Users can explicitly decide what gridstart to use for a job, by associating the pegasus profile key named gridstart with the job.

Kickstart
In this mode, all the jobs are lauched via kickstart. The kickstart executable is a light-weight program which connects the stdin,stdout and stderr filehandles for PEGASUS jobs on the remote site. Kickstart is an executable distributed with PEGASUS that can generally be found at ${pegasus.home.bin}/kickstart.
None
In this mode, all the jobs are launched directly on the remote site. Each job's stdin,stdout and stderr are connected to condor commands in a manner to ensure that they are sent back to the submit host.
PegasusLite
In this mode, the compute jobs are wrapped by PegasusLite instances. PegasusLite instance is a bash script, that is launced on the compute node. It determins at runtime the directory a job needs to execute in, pulls in data from the staging site , launches the job, pushes out the data and cleans up the directory after execution.

10.1.11.2. pegasus.gridstart.kickstart.set.xbit

System: Pegasus
Since: 2.4
Type: Boolean
Default: false
See also: pegasus.transfer.disable.chmod.sites

Kickstart has an option to set the X bit on an executable before it launches it on the remote site. In case of staging of executables, by default chmod jobs are launched that set the x bit of the user executables staged to a remote site.

On setting this property to true, kickstart gridstart module adds a -X option to kickstart arguments. The -X arguments tells kickstart to set the x bit of the executable before launching it.

User should usually disable the chmod jobs by setting the property pegasus.transfer.disable.chmod.sites , if they set this property to true.

10.1.11.3. pegasus.gridstart.kickstart.stat

System: Pegasus
Since: 2.1
Type: Boolean
Default: false
See also: pegasus.gridstart.generate.lof

Kickstart has an option to stat the input files and the output files. The stat information is collected in the XML record generated by kickstart. Since stat is an expensive operation, it is not turned on by on. Set this property to true if you want to see stat information for the input files and output files of a job in it's kickstart output.

10.1.11.4. pegasus.gridstart.generate.lof

System: Pegasus
Since: 2.1
Type: Boolean
Default: false
See also: pegasus.gridstart.kickstart.stat

For the stat option for kickstart, we generate 2 lof ( list of filenames ) files for each job. One lof file containing the input lfn's for the job, and the other containing output lfn's for the job. In some cases, it maybe beneficial to have these lof files generated but not do the actual stat. This property allows you to generate the lof files without triggering the stat in kickstart invocations.

10.1.11.5. pegasus.gridstart.invoke.always

System: Pegasus
Since: 2.0
Type: Boolean
Default: false
See also: pegasus.gridstart.invoke.length

Condor has a limit in it, that restricts the length of arguments to an executable to 4K. To get around this limit, you can trigger Kickstart to be invoked with the -I option. In this case, an arguments file is prepared per job that is transferred to the remote end via the Condor file transfer mechanism. This way the arguments to the executable are not specified in the condor submit file for the job. This property specifies whether you want to use the invoke option always for all jobs, or want it to be triggered only when the argument string is determined to be greater than a certain limit.

10.1.11.6. pegasus.gridstart.invoke.length

System: Pegasus
Since: 2.0
Type: Long
Default: 4000
See also: pegasus.gridstart.invoke.always

Gridstart is automatically invoked with the -I option, if it is determined that the length of the arguments to be specified is going to be greater than a certain limit. By default this limit is set to 4K. However, it can overriden by specifying this property.

10.1.12. Interface To Condor And Condor Dagman

The Condor DAGMan facility is usually activate using the condor_submit_dag command. However, many shapes of workflows have the ability to either overburden the submit host, or overflow remote gatekeeper hosts. While DAGMan provides throttles, unfortunately these can only be supplied on the command-line. Thus,PEGASUS provides a versatile wrapper to invoke DAGMan, called pegasus-submit-dag. It can be configured from the command-line, from user- and system properties, and by defaults.

10.1.12.1. pegasus.condor.logs.symlink

System: Condor
Type: Boolean
Default: true
Since: 3.0

By default pegasus has the Condor common log [dagname]-0.log in the submit file as a symlink to a location in /tmp . This is to ensure that condor common log does not get written to a shared filesystem. If the user knows for sure that the workflow submit directory is not on the shared filesystem, then they can opt to turn of the symlinking of condor common log file by setting this property to false.

10.1.12.2. pegasus.condor.arguments.quote

System: Condor
Type: Boolean
Default: true
Since: 2.0
Old Name: pegasus.condor.arguments.quote

This property determines whether to apply the new Condor quoting rules for quoting the argument string. The new argument quoting rules appeared in Condor 6.7.xx series. We have verified it for 6.7.19 version. If you are using an old condor at the submit host, set this property to false.

10.1.12.3. pegasus.dagman.notify

System: DAGman wrapper
Type: Case-insensitive enumeration
Value[0]: Complete
Value[1]: Error
Value[2]: Never
Default: Never
Document: http://www.cs.wisc.edu/condor/manual/v6.9/condor_submit_dag.html
Document: http://www.cs.wisc.edu/condor/manual/v6.9/condor_submit.html

The pegasus.dagman.nofity property has been deprecated in favor of the Pegasus notification framework. Please see the reference manual for details on how to get workflow notifications. pegasus.dagman.nofity will be removed in the an upcoming version of Pegasus.

10.1.12.4. pegasus.dagman.verbose

System: DAGman wrapper
Type: Boolean
Value[0]: false
Value[1]: true
Default: false
Document: http://www.cs.wisc.edu/condor/manual/v6.9/condor_submit_dag.html

The pegasus-submit-dag wrapper processes properties to set DAGMan commandline arguments. If set and true, the argument activates verbose output in case of DAGMan errors.

10.1.12.5. pegasus.dagman.[category].maxjobs

System: DAGman wrapper
Type: Integer
Since: 2.2
Default: no default
Document: http://vtcpc.isi.edu/pegasus/index.php/ChangeLog\#Support_for_DAGMan_node_categories

DAGMan now allows for the nodes in the DAG to be grouped in category. The tuning parameters like maxjobs then can be applied per category instead of being applied to the whole workflow. To use this facility users need to associate the dagman profile key named category with their jobs. The value of the key is the category to which the job belongs to.

You can then use this property to specify the value for a category. For the above example you will set pegasus.dagman.short-running.maxjobs

10.1.13. Monitoring Properties

10.1.13.1. pegasus.monitord.events

System: Pegasus-monitord
Type: Boolean
Default: true
Since: 3.0.2
See Also: pegasus.monitord.output

This property determines whether pegasus-monitord generates log events. If log events are disabled using this property, no bp file, or database will be created, even if the pegasus.monitord.output property is specified.

10.1.13.2. pegasus.monitord.output

System: Pegasus-monitord
Type: String
Since: 3.0.2
See Also: pegasus.monitord.events

This property specifies the destination for generated log events in pegasus-monitord. By default, events are stored in a sqlite database in the workflow directory, which will be created with the workflow's name, and a ".stampede.db" extension. Users can specify an alternative database by using a SQLAlchemy connection string. Details are available at:

http://www.sqlalchemy.org/docs/05/reference/dialects/index.html

It is important to note that users will need to have the appropriate db interface library installed. Which is to say, SQLAlchemy is a wrapper around the mysql interface library (for instance), it does not provide a MySQL driver itself. The Pegasus distribution includes both SQLAlchemy and the SQLite Python driver. As a final note, it is important to mention that unlike when using SQLite databases, using SQLAlchemy with other database servers, e.g. MySQL or Postgres , the target database needs to exist. Users can also specify a file name using this property in order to create a file with the log events.

Example values for the SQLAlchemy connection string for various end points are listed below

SQL Alchemy End Point Example Value
Netlogger BP File file:///submit/dir/myworkflow.bp
SQL Lite Database sqlite:///submit/dir/myworkflow.db
MySQL Database mysql://user:password@host:port/databasename
 

10.1.13.3. pegasus.dashboard.output

System: Pegasus-monitord
Type: String
Since: 4.2
See Also: pegasus.monitord.output

This property specifies the destination for the workflow dashboard database. By default, the workflow dashboard datbase defaults to a sqlite database named workflow.db in the $HOME/.pegasus directory. This is database is shared for all workflows run as a particular user. Users can specify an alternative database by using a SQLAlchemy connection string. Details are available at:

http://www.sqlalchemy.org/docs/05/reference/dialects/index.html

It is important to note that users will need to have the appropriate db interface library installed. Which is to say, SQLAlchemy is a wrapper around the mysql interface library (for instance), it does not provide a MySQL driver itself. The Pegasus distribution includes both SQLAlchemy and the SQLite Python driver. As a final note, it is important to mention that unlike when using SQLite databases, using SQLAlchemy with other database servers, e.g. MySQL or Postgres , the target database needs to exist. Users can also specify a file name using this property in order to create a file with the log events.

Example values for the SQLAlchemy connection string for various end points are listed below

SQL Alchemy End Point Example Value
SQL Lite Database sqlite:///shared/myworkflow.db
MySQL Database mysql://user:password@host:port/databasename
 

10.1.13.4. pegasus.monitord.notifications

System: Pegasus-monitord
Type: Boolean
Default: true
Since: 3.1
See Also: pegasus.monitord.notifications.max
See Also: pegasus.monitord.notifications.timeout

This property determines whether pegasus-monitord processes notifications. When notifications are enabled, pegasus-monitord will parse the .notify file generated by pegasus-plan and will invoke notification scripts whenever conditions matches one of the notifications.

10.1.13.5. pegasus.monitord.notifications.max

System: Pegasus-monitord
Type: Integer
Default: 10
Since: 3.1
See Also: pegasus.monitord.notifications
See Also: pegasus.monitord.notifications.timeout

This property determines how many notification scripts pegasus-monitord will call concurrently. Upon reaching this limit, pegasus-monitord will wait for one notification script to finish before issuing another one. This is a way to keep the number of processes under control at the submit host. Setting this property to 0 will disable notifications completely.

10.1.13.6. pegasus.monitord.notifications.timeout

System: Pegasus-monitord
Type: Integer
Default: 0
Since: 3.1
See Also: pegasus.monitord.notifications
See Also: pegasus.monitord.notifications.max

This property determines how long will pegasus-monitord let notification scripts run before terminating them. When this property is set to 0 (default), pegasus-monitord will not terminate any notification scripts, letting them run indefinitely. If some notification scripts missbehave, this has the potential problem of starving pegasus-monitord's notification slots (see the pegasus.monitord.notifications.max property), and block further notifications. In addition, users should be aware that pegasus-monitord will not exit until all notification scripts are finished.

10.1.13.7. pegasus.monitord.stdout.disable.parsing

System: Pegasus-monitord
Type: Boolean
Default: False
Since: 3.1.1

By default, pegasus-monitord parses the stdout/stderr section of the kickstart to populate the applications captured stdout and stderr in the job instance table for the stampede schema. For large workflows, this may slow down monitord especially if the application is generating a lot of output to it's stdout and stderr. This property, can be used to turn of the database population.

10.1.14. Job Clustering Properties

10.1.14.1. pegasus.clusterer.job.aggregator

System: Job Clustering
Since: 2.0
Type: String
Value[0]: seqexec
Value[1]: mpiexec
Default: seqexec

A large number of workflows executed through the Virtual Data System, are composed of several jobs that run for only a few seconds or so. The overhead of running any job on the grid is usually 60 seconds or more. Hence, it makes sense to collapse small independent jobs into a larger job. This property determines, the executable that will be used for running the larger job on the remote site.

seqexec
In this mode, the executable used to run the merged job is seqexec that runs each of the smaller jobs sequentially on the same node. The executable "seqexec" is a PEGASUS tool distributed in the PEGASUS worker package, and can be usually found at {pegasus.home}/bin/seqexec.
mpiexec
In this mode, the executable used to run the merged job is mpiexec that runs the smaller jobs via mpi on n nodes where n is the nodecount associated with the merged job. The executable "mpiexec" is a PEGASUS tool distributed in the PEGASUS worker package, and can be usually found at {pegasus.home}/bin/mpiexec.

10.1.14.2. pegasus.clusterer.job.aggregator.seqexec.log

System: Job Clustering
Type: Boolean
Default: false
Since: 2.3
See also: pegasus.clusterer.job.aggregator
See also: pegasus.clusterer.job.aggregator.seqexec.log.global

Seqexec logs the progress of the jobs that are being run by it in a progress file on the remote cluster where it is executed.

This property sets the Boolean flag, that indicates whether to turn on the logging or not.

10.1.14.3. pegasus.clusterer.job.aggregator.seqexec.log.global

System: Job Clustering
Type: Boolean
Default: true
Since: 2.3
See also: pegasus.clusterer.job.aggregator
See also: pegasus.clusterer.job.aggregator.seqexec.log
Old Name: pegasus.clusterer.job.aggregator.seqexec.hasgloballog

Seqexec logs the progress of the jobs that are being run by it in a progress file on the remote cluster where it is executed. The progress log is useful for you to track the progress of your computations and remote grid debugging. The progress log file can be shared by multiple seqexec jobs that are running on a particular cluster as part of the same workflow. Or it can be per job.

This property sets the Boolean flag, that indicates whether to have a single global log for all the seqexec jobs on a particular cluster or progress log per job.

10.1.14.4. pegasus.clusterer.job.aggregator.seqexec.firstjobfail

System: Job Clustering
Type: Boolean
Default: true
Since: 2.2
See also: pegasus.clusterer.job.aggregator

By default seqexec does not stop execution even if one of the clustered jobs it is executing fails. This is because seqexec tries to get as much work done as possible.

This property sets the Boolean flag, that indicates whether to make seqexec stop on the first job failure it detects.

10.1.14.5. pegasus.clusterer.label.key

System: Job Clustering
Type: String
Default: label
Since: 2.0
See also: pegasus.partitioner.label.key

While clustering jobs in the workflow into larger jobs, you can optionally label your graph to control which jobs are clustered and to which clustered job they belong. This done using a label based clustering scheme and is done by associating a profile/label key in the PEGASUS namespace with the jobs in the DAX. Each job that has the same value/label value for this profile key, is put in the same clustered job.

This property allows you to specify the PEGASUS profile key that you want to use for label based clustering.

10.1.15. Logging Properties

10.1.15.1. pegasus.log.manager

System: Pegasus
Since: 2.2.0
Type: Enumeration
Value[0]: Default
Value[1]: Log4j
Default: Default
See also: pegasus.log.manager.formatter

This property sets the logging implementation to use for logging.

Default
This implementation refers to the legacy Pegasus logger, that logs directly to stdout and stderr. It however, does have the concept of levels similar to log4j or syslog.
Log4j
This implementation, uses Log4j to log messages. The log4j properties can be specified in a properties file, the location of which is specified by the property
pegasus.log.manager.log4j.conf

10.1.15.2. pegasus.log.manager.formatter

System: Pegasus
Since: 2.2.0
Type: Enumeration
Value[0]: Simple
Value[1]: Netlogger
Default: Simple
See also: pegasus.log.manager.formatter

This property sets the formatter to use for formatting the log messages while logging.

Simple
This formats the messages in a simple format. The messages are logged as is with minimal formatting. Below are sample log messages in this format while ranking a dax according to performance.
event.pegasus.ranking dax.id se18-gda.dax  - STARTED
event.pegasus.parsing.dax dax.id se18-gda-nested.dax  - STARTED
event.pegasus.parsing.dax dax.id se18-gda-nested.dax  - FINISHED
job.id jobGDA
job.id jobGDA query.name getpredicted performace time 10.00
event.pegasus.ranking dax.id se18-gda.dax  - FINISHED
Netlogger

This formats the messages in the Netlogger format , that is based on key value pairs. The netlogger format is useful for loading the logs into a database to do some meaningful analysis. Below are sample log messages in this format while ranking a dax according to performance.

ts=2008-09-06T12:26:20.100502Z event=event.pegasus.ranking.start \
msgid=6bc49c1f-112e-4cdb-af54-3e0afb5d593c \
eventId=event.pegasus.ranking_8d7c0a3c-9271-4c9c-a0f2-1fb57c6394d5 \
dax.id=se18-gda.dax prog=Pegasus
ts=2008-09-06T12:26:20.100750Z event=event.pegasus.parsing.dax.start \
msgid=fed3ebdf-68e6-4711-8224-a16bb1ad2969 \
eventId=event.pegasus.parsing.dax_887134a8-39cb-40f1-b11c-b49def0c5232\
dax.id=se18-gda-nested.dax prog=Pegasus
ts=2008-09-06T12:26:20.100894Z event=event.pegasus.parsing.dax.end \
msgid=a81e92ba-27df-451f-bb2b-b60d232ed1ad \
eventId=event.pegasus.parsing.dax_887134a8-39cb-40f1-b11c-b49def0c5232
ts=2008-09-06T12:26:20.100395Z event=event.pegasus.ranking \
msgid=4dcecb68-74fe-4fd5-aa9e-ea1cee88727d \
eventId=event.pegasus.ranking_8d7c0a3c-9271-4c9c-a0f2-1fb57c6394d5 \
job.id="jobGDA"
ts=2008-09-06T12:26:20.100395Z event=event.pegasus.ranking \
msgid=4dcecb68-74fe-4fd5-aa9e-ea1cee88727d \
eventId=event.pegasus.ranking_8d7c0a3c-9271-4c9c-a0f2-1fb57c6394d5 \
job.id="jobGDA" query.name="getpredicted performace" time="10.00"
ts=2008-09-06T12:26:20.102003Z event=event.pegasus.ranking.end \
msgid=31f50f39-efe2-47fc-9f4c-07121280cd64 \
eventId=event.pegasus.ranking_8d7c0a3c-9271-4c9c-a0f2-1fb57c6394d5

10.1.15.3. pegasus.log.*

System: Pegasus
Since: 2.0
Type: String
Default: No default

This property sets the path to the file where all the logging for Pegasus can be redirected to. Both stdout and stderr are logged to the file specified.

10.1.15.4. pegasus.log.metrics

System: Pegasus
Since: 2.1.0
Type: Boolean
Default: true
See also: pegasus.log.metrics.file

This property enables the logging of certain planning and workflow metrics to a global log file. By default the file to which the metrics are logged is ${pegasus.home}/var/pegasus.log.

10.1.15.5. pegasus.log.metrics.file

System: Pegasus
Since: 2.1.0
Type: Boolean
Default: ${pegasus.home}/var/pegasus.log
See also: pegasus.log.metrics

This property determines the file to which the workflow and planning metrics are logged if enabled.

10.1.16. Miscellaneous Properties

10.1.16.1. pegasus.code.generator

System: Pegasus
Since: 3.0
Type: enumeration
Value[0]: Condor
Value[1]: Shell
Value[2]: PMC
Default: Condor

This property is used to load the appropriate Code Generator to use for writing out the executable workflow.

Condor
This is the default code generator for Pegasus . This generator generates the executable workflow as a Condor DAG file and associated job submit files. The Condor DAG file is passed as input to Condor DAGMan for job execution.
Shell
This Code Generator generates the executable workflow as a shell script that can be executed on the submit host. While using this code generator, all the jobs should be mapped to site local i.e specify --sites local to pegasus-plan.
PMC
This Code Generator generates the executable workflow as a PMC task workflow. This is useful to run on platforms where it not feasible to run Condor such as the new XSEDE machines such as Blue Waters. In this mode, Pegasus will generate the executable workflow as a PMC task workflow and a sample PBS submit script that submits this workflow.

10.1.16.2. pegasus.register

System: Pegasus
Since: 4.1.0
Type: Boolean
Default: true

Pegasus creates registration jobs to register the output files in the replica catalog. An output file is registered only if

1) a user has configured a replica catalog in the properties 2) the register flags for the output files in the DAX are set to true

This property can be used to turn off the creation of the registration jobs even though the files maybe marked to be registered in the replica catalog.

10.1.16.3. pegasus.job.priority.assign

System: Pegasus
Since: 3.0.3
Type: Boolean
Default: true

This property can be used to turn off the default level based condor priorities that are assigned to jobs in the executable workflow.

10.1.16.4. pegasus.file.cleanup.strategy

System: Pegasus
Since: 2.2
Type: enumeration
Value[0]: InPlace
Default: InPlace

This property is used to select the strategy of how the the cleanup nodes are added to the executable workflow.

InPlace
This is the only mode available .

10.1.16.5. pegasus.file.cleanup.impl

System: Pegasus
Since: 2.2
Type: enumeration
Value[0]: Cleanup
Value[1]: RM
Value[2]: S3
Default: Cleanup

This property is used to select the executable that is used to create the working directory on the compute sites.

Cleanup
The default executable that is used to delete files is the dirmanager executable shipped with Pegasus. It is found at $PEGASUS_HOME/bin/dirmanager in the pegasus distribution. An entry for transformation pegasus::dirmanager needs to exist in the Transformation Catalog or the PEGASUS_HOME environment variable should be specified in the site catalog for the sites for this mode to work.
RM
This mode results in the rm executable to be used to delete files from remote directories. The rm executable is standard on *nix systems and is usually found at /bin/rm location.
S3
This mode is used to delete files/objects from the buckets in S3 instead of a directory. This should be set when running workflows on Amazon EC2. This implementation relies on s3cmd command line client to create the bucket. An entry for transformation amazon::s3cmd needs to exist in the Transformation Catalog for this to work.

10.1.16.6. pegasus.file.cleanup.clusters.num

System: Pegasus
Since: 4.2
Type: Integer
Default: 2

In case of the InPlace strategy for adding the cleanup nodes to the workflow, this property specifies the maximum number of cleanup jobs that are added to the executable workflow on each level.

10.1.16.7. pegasus.file.cleanup.scope

System: Pegasus
Since: 2.3.0
Type: enumeration
Value[0]: fullahead
Value[1]: deferred
Default: fullahead

By default in case of deferred planning InPlace file cleanup is turned OFF. This is because the cleanup algorithm does not work across partitions. This property can be used to turn on the cleanup in case of deferred planning.

fullahead
This is the default scope. The pegasus cleanup algorithm does not work across partitions in deferred planning. Hence the cleanup is always turned OFF , when deferred planning occurs and cleanup scope is set to full ahead.
deferred
If the scope is set to deferred, then Pegasus will not disable file cleanup in case of deferred planning. This is useful for scenarios where the partitions themselves are independant ( i.e. dont share files ). Even if the scope is set to deferred, users can turn off cleanup by specifying --nocleanup option to pegasus-plan.

10.1.16.8. pegasus.catalog.transformation.mapper

System: Staging of Executables
Since: 2.0
Type: enumeration
Value[0]: All
Value[1]: Installed
Value[2]: Staged
Value[3]: Submit
Default: All
See also: pegasus.transformation.selector

Pegasus now supports transfer of statically linked executables as part of the concrete workflow. At present, there is only support for staging of executables referred to by the compute jobs specified in the DAX file. Pegasus determines the source locations of the binaries from the transformation catalog, where it searches for entries of type STATIC_BINARY for a particular architecture type. The PFN for these entries should refer to a globus-url-copy valid and accessible remote URL. For transfer of executables, Pegasus constructs a soft state map that resides on top of the transformation catalog, that helps in determining the locations from where an executable can be staged to the remote site.

This property determines, how that map is created.

All
In this mode, all sources with entries of type STATIC_BINARY for a particular transformation are considered valid sources for the transfer of executables. This the most general mode, and results in the constructing the map as a result of the cartesian product of the matches.
Installed
In this mode, only entries that are of type INSTALLED are used while constructing the soft state map. This results in Pegasus never doing any transfer of executables as part of the workflow. It always prefers the installed executables at the remote sites.
Staged
In this mode, only entries that are of type STATIC_BINARY are used while constructing the soft state map. This results in the concrete workflow referring only to the staged executables, irrespective of the fact that the executables are already installed at the remote end.
Submit
In this mode, only entries that are of type STATIC_BINARY and reside at the submit host (pool local), are used while constructing the soft state map. This is especially helpful, when the user wants to use the latest compute code for his computations on the grid and that relies on his submit host.

10.1.16.9. pegasus.selector.transformation

System: Staging of Executables
Since: 2.0
Type: enumeration
Value[0]: Random
Value[1]: Installed
Value[2]: Staged
Value[3]: Submit
Default: Random
See also: pegasus.catalog.transformation

In case of transfer of executables, Pegasus could have various transformations to select from when it schedules to run a particular compute job at a remote site. For e.g it can have the choice of staging an executable from a particular remote pool, from the local (submit host) only, use the one that is installed on the remote site only.

This property determines, how a transformation amongst the various candidate transformations is selected, and is applied after the property pegasus.tc has been applied. For e.g specifying pegasus.tc as Staged and then pegasus.transformation.selector as INSTALLED does not work, as by the time this property is applied, the soft state map only has entries of type STAGED.

Random
In this mode, a random matching candidate transformation is selected to be staged to the remote execution pool.
Installed
In this mode, only entries that are of type INSTALLED are selected. This means that the concrete workflow only refers to the transformations already pre installed on the remote pools.
Staged
In this mode, only entries that are of type STATIC_BINARY are selected, ignoring the ones that are installed at the remote site.
Submit
In this mode, only entries that are of type STATIC_BINARY and reside at the submit host (pool local), are selected as sources for staging the executables to the remote execution pools.

10.1.16.10. pegasus.execute.*.filesystem.local

System: Pegasus
Type: Boolean
Default: false
Since: 2.1.0
See also: pegasus.data.configuration

Normally, Pegasus transfers the data to and from a directory on the shared filesystem on the head node of a compute site. The directory needs to be visible to both the head node and the worker nodes for the compute jobs to execute correctly.

By setting this property to true, you can get Pegasus to execute jobs on the worker node filesystem. In this case, when the jobs are launched on the worker nodes, the jobs grab the input data from the workflow specific execution directory on the compute site and push the output data to the same directory after completion. The transfer of data to and from the worker node directory is referred to as Second Level Staging ( SLS ).

10.1.16.11. pegasus.parser.dax.preserver.linebreaks

System: Pegasus
Type: Boolean
Default: false
Since: 2.2.0

The DAX Parser normally does not preserve line breaks while parsing the CDATA section that appears in the arguments section of the job element in the DAX. On setting this to true, the DAX Parser preserves any line line breaks that appear in the CDATA section.

10.2. Profiles

The Pegasus Workflow Mapper uses the concept of profiles to encapsulate configurations for various aspects of dealing with the Grid infrastructure. Profiles provide an abstract yet uniform interface to specify configuration options for various layers from planner/mapper behavior to remote environment settings. At various stages during the mapping process, profiles may be added associated with the job.

This document describes various types of profiles, levels of priorities for intersecting profiles, and how to specify profiles in different contexts.

10.2.1. Profile Structure Heading

All profiles are triples comprised of a namespace, a name or key, and a value. The namespace is a simple identifier. The key has only meaning within its namespace, and it’s yet another identifier. There are no constraints on the contents of a value

Profiles may be represented with different syntaxes in different context. However, each syntax will describe the underlying triple.

10.2.2. Profile Namespaces

Each namespace refers to a different aspect of a job’s runtime settings. A profile’s representation in the concrete plan (e.g. the Condor submit files) depends its namespace. Pegasus supports the following Namespaces for profiles:

  • env permits remote environment variables to be set.

  • globus sets Globus RSL parameters.

  • condor sets Condor configuration parameters for the submit file.

  • dagman introduces Condor DAGMan configuration parameters.

  • pegasus configures the behaviour of various planner/mapper components.

10.2.2.1. The env Profile Namespace

The env namespace allows users to specify environment variables of remote jobs. Globus transports the environment variables, and ensure that they are set before the job starts.

The key used in conjunction with an env profile denotes the name of the environment variable. The value of the profile becomes the value of the remote environment variable.

Grid jobs usually only set a minimum of environment variables by virtue of Globus. You cannot compare the environment variables visible from an interactive login with those visible to a grid job. Thus, it often becomes necessary to set environment variables like LD_LIBRARY_PATH for remote jobs.

If you use any of the Pegasus worker package tools like transfer or the rc-client, it becomes necessary to set PEGASUS_HOME and GLOBUS_LOCATION even for jobs that run locally

Table 10.1. Table 1: Useful Environment Settings

Environment Variable Description
PEGASUS_HOME Used by auxillary jobs created by Pegasus both on remote site and local site. Should be set usually set in the Site Catalog for the sites
GLOBUS_LOCATION Used by auxillary jobs created by Pegasus both on remote site and local site. Should be set usually set in the Site Catalog for the sites
LD_LIBRARY_PATH Point this to $GLOBUS_LOCATION/lib, except you cannot use the dollar variable. You must use the full path. Applies to both, local and remote jobs that use Globus components and should be usually set in the site catalog for the sites

Even though Condor and Globus both permit environment variable settings through their profiles, all remote environment variables must be set through the means of env profiles.

10.2.2.2. The Globus Profile Namespace

The globus profile namespace encapsulates Globus resource specification language (RSL) instructions. The RSL configures settings and behavior of the remote scheduling system. Some systems require queue name to schedule jobs, a project name for accounting purposes, or a run-time estimate to schedule jobs. The Globus RSL addresses all these issues.

A key in the globus namespace denotes the command name of an RLS instruction. The profile value becomes the RSL value. Even though Globus RSL is typically shown using parentheses around the instruction, the out pair of parentheses is not necessary in globus profile specifications

Table 2 shows some commonly used RSL instructions. For an authoritative list of all possible RSL instructions refer to the Globus RSL specification.

Table 10.2. Table 2: Useful Globus RSL Instructions

Key Description
count the number of times an executable is started.
jobtype specifies how the job manager should start the remote job. While Pegasus defaults to single, use mpi when running MPI jobs.
maxcputime the max cpu time for a single execution of a job.
maxmemory the maximum memory in MB required for the job
maxtime the maximum time or walltime for a single execution of a job.
maxwalltime the maximum walltime for a single execution of a job.
minmemory the minumum amount of memory required for this job
project associates an account with a job at the remote end.
queue the remote queue in which the job should be run. Used when remote scheduler is PBS that supports queues.

Pegasus prevents the user from specifying certain RSL instructions as globus profiles, because they are either automatically generated or can be overridden through some different means. For instance, if you need to specify remote environment settings, do not use the environment key in the globus profiles. Use one or more env profiles instead.

Table 10.3. Table 3: RSL Instructions that are not permissible

Key Reason for Prohibition
arguments you specify arguments in the arguments section for a job in the DAX
directory the site catalog and properties determine which directory a job will run in.
environment use multiple env profiles instead
executable the physical executable to be used is specified in the transformation catalog and is also dependant on the gridstart module being used. If you are launching jobs via kickstart then the executable created is the path to kickstart and the application executable path appears in the arguments for kickstart
stdin you specify in the DAX for the job
stdout you specify in the DAX for the job
stderr you specify in the DAX for the job

10.2.2.3. The Condor Profile Namespace

The Condor submit file controls every detail how and where a job is run. The condor profiles permit to add or overwrite instructions in the Condor submit file.

The condor namespace directly sets commands in the Condor submit file for a job the profile applies to. Keys in the condor profile namespace denote the name of the Condor command. The profile value becomes the command's argument. All condor profiles are translated into key=value lines in the Condor submit file

Some of the common condor commands that a user may need to specify are listed below. For an authoritative list refer to the online condor documentation. Note: Pegasus Workflow Planner/Mapper by default specify a lot of condor commands in the submit files depending upon the job, and where it is being run.

Table 10.4. Table 4: Useful Condor Commands

Key Description
universe Pegasus defaults to either globus or scheduler universes. Set to standard for compute jobs that require standard universe. Set to vanilla to run natively in a condor pool, or to run on resources grabbed via condor glidein.
periodic_release is the number of times job is released back to the queue if it goes to HOLD, e.g. due to Globus errors. Pegasus defaults to 3.
periodic_remove is the number of times a job is allowed to get into HOLD state before being removed from the queue. Pegasus defaults to 3.
filesystemdomain Useful for Condor glide-ins to pin a job to a remote site.
stream_error boolean to turn on the streaming of the stderr of the remote job back to submit host.
stream_output boolean to turn on the streaming of the stdout of the remote job back to submit host.
priority integer value to assign the priority of a job. Higher value means higher priority. The priorities are only applied for vanilla / standard/ local universe jobs. Determines the order in which a users own jobs are executed.
request_cpus New in Condor 7.8.0 . Number of CPU's a job requires.
request_memory New in Condor 7.8.0 . Amount of memory a job requires.
request_disk New in Condor 7.8.0 . Amount of disk a job requires.

Other useful condor keys, that advanced users may find useful and can be set by profiles are

  1. should_transfer_files

  2. transfer_output

  3. transfer_error

  4. whentotransferoutput

  5. requirements

  6. rank

Pegasus prevents the user from specifying certain Condor commands in condor profiles, because they are automatically generated or can be overridden through some different means. Table 5 shows prohibited Condor commands.

Table 10.5. Table 5: Condor commands prohibited in condor profiles

Key Reason for Prohibition
arguments you specify arguments in the arguments section for a job in the DAX
environment use multiple env profiles instead
executable the physical executable to be used is specified in the transformation catalog and is also dependant on the gridstart module being used. If you are launching jobs via kickstart then the executable created is the path to kickstart and the application executable path appears in the arguments for kickstart

10.2.2.4. The Dagman Profile Namespace

DAGMan is Condor's workflow manager. While planners generate most of DAGMan's configuration, it is possible to tweak certain job-related characteristics using dagman profiles. A dagman profile can be used to specify a DAGMan pre- or post-script.

Pre- and post-scripts execute on the submit machine. Both inherit the environment settings from the submit host when pegasus-submit-dag or pegasus-run is invoked.

By default, kickstart launches all jobs except standard universe and MPI jobs. Kickstart tracks the execution of the job, and returns usage statistics for the job. A DAGMan post-script starts the Pegasus application exitcode to determine, if the job succeeded. DAGMan receives the success indication as exit status from exitcode.

If you need to run your own post-script, you have to take over the job success parsing. The planner is set up to pass the file name of the remote job's stdout, usually the output from kickstart, as sole argument to the post-script.

Table 6 shows the keys in the dagman profile domain that are understood by Pegasus and can be associated at a per job basis.

Table 10.6. Table 6: Useful dagman Commands that can be associated at a per job basis

Key Description
PRE is the path to the pre-script. DAGMan executes the pre-script before it runs the job.
PRE.ARGUMENTS are command-line arguments for the pre-script, if any.
POST is the postscript type/mode that a user wants to associate with a job.
  1. pegasus-exitcode - pegasus will by default associate this postscript with all jobs launched via kickstart, as long the POST.SCOPE value is not set to NONE.

  2. none -means that no postscript is generated for the jobs. This is useful for MPI jobs that are not launched via kickstart currently.

  3. any legal identifier - Any other identifier of the form ([_A-Za-z][_A-Za-z0-9]*), than one of the 2 reserved keywords above, signifies a user postscript. This allows the user to specify their own postscript for the jobs in the workflow. The path to the postscript can be specified by the dagman profile POST.PATH.[value] where [value] is this legal identifier specified. The user postscript is passed the name of the .out file of the job as the last argument on the command line.

    For e.g. if the following dagman profiles were associated with a job X

    1. POST with value user_script /bin/user_postscript

    2. POST.PATH.user_script with value /path/to/user/script

    3. POST.ARGUMENTS with value -verbose

    then the following postscript will be associated with the job X in the .dag file

    /path/to/user/script -verbose X.out where X.out contains the stdout of the job X

POST.PATH.* ( where * is replaced by the value of the POST Profile ) the path to the post script on the submit host.
POST.ARGUMENTS are the command line arguments for the post script, if any.
RETRY is the number of times DAGMan retries the full job cycle from pre-script through post-script, if failure was detected.
CATEGORY the DAGMan category the job belongs to.
PRIORITY the priority to apply to a job. DAGMan uses this to select what jobs to release when MAXJOBS is enforced for the DAG.


Table 7 shows the keys in the dagman profile domain that are understood by Pegasus and can be used to apply to the whole workflow. These are used to control DAGMan's behavior at the workflow level, and are recommended to be specified in the properties file.

Table 10.7. Table 7: Useful dagman Commands that can be specified in the properties file.

Key Description
MAXPRE sets the maximum number of PRE scripts within the DAG that may be running at one time
MAXPOST sets the maximum number of PRE scripts within the DAG that may be running at one time
MAXJOBS sets the maximum number of jobs within the DAG that will be submitted to Condor at one time.
MAXIDLE sets the maximum number of idle jobs within the DAG that will be submitted to Condor at one time.
[CATEGORY-NAME].MAXJOBS is the value of maxjobs for a particular category. Users can associate different categories to the jobs at a per job basis. However, the value of a dagman knob for a category can only be specified at a per workflow basis in the properties.
POST.SCOPE scope for the postscripts.
  1. If set to all , means each job in the workflow will have a postscript associated with it.

  2. If set to none , means no job has postscript associated with it. None mode should be used if you are running vanilla / standard/ local universe jobs, as in those cases Condor traps the remote exitcode correctly. None scope is not recommended for grid universe jobs.

  3. If set to essential, means only essential jobs have post scripts associated with them. At present the only non essential job is the replica registration job.


10.2.2.5. The Pegasus Profile Namespace

The pegasus profiles allow users to configure extra options to the Pegasus Workflow Planner that can be applied selectively to a job or a group of jobs. Site selectors may use a sub-set of pegasus profiles for their decision-making.

Table 8 shows some of the useful configuration option Pegasus understands.

Table 10.8. Table 8: Useful pegasus Profiles.

Key Description
workdir Sets the remote initial dir for a Condor-G job. Overrides the work directory algorithm that uses the site catalog and properties.
clusters.num Please refer to the Pegasus Clustering Guide for detailed description. This option determines the total number of clusters per level. Jobs are evenly spread across clusters.
clusters.size Please refer to the Pegasus Clustering Guide for detailed description. This profile determines the number of jobs in each cluster. The number of clusters depends on the total number of jobs on the level.
cores The number of cores, associated with the job. This is solely used for accounting purposes in the database while generating statistics. It corresponds to the multiplier_factor in the job_instance table described here.
runtime Please refer to the Pegasus Clustering Guide for detailed description. This profile specifies the expected runtime of a job.
clusters.maxruntime Please refer to the Pegasus Clustering Guide for detailed description. This profile specifies the maximum runtime of a job.
job.aggregator Indicates the clustering executable that is used to run the clustered job on the remote site.
gridstart Determines the executable for launching a job. Possible values are Kickstart | NoGridStart at the moment.
gridstart.path Sets the path to the gridstart . This profile is best set in the Site Catalog.
gridstart.arguments Sets the arguments with which GridStart is used to launch a job on the remote site.
stagein.clusters This key determines the maximum number of stage-in jobs that are can executed locally or remotely per compute site per workflow. This is used to configure the Bundle Transfer Refiner, which is the Default Refiner used in Pegasus. This profile is best set in the Site Catalog or in the Properties file
stagein.local.clusters This key provides finer grained control in determining the number of stage-in jobs that are executed locally and are responsible for staging data to a particular remote site. This profile is best set in the Site Catalog or in the Properties file
stagein.remote.clusters This key provides finer grained control in determining the number of stage-in jobs that are executed remotely on the remote site and are responsible for staging data to it. This profile is best set in the Site Catalog or in the Properties file
stageout.clusters This key determines the maximum number of stage-out jobs that are can executed locally or remotely per compute site per workflow. This is used to configure the Bundle Transfer Refiner, , which is the Default Refiner used in Pegasus.
stageout.local.clusters This key provides finer grained control in determining the number of stage-out jobs that are executed locally and are responsible for staging data from a particular remote site. This profile is best set in the Site Catalog or in the Properties file
stageout.remote.clusters This key provides finer grained control in determining the number of stage-out jobs that are executed remotely on the remote site and are responsible for staging data from it. This profile is best set in the Site Catalog or in the Properties file
group Tags a job with an arbitrary group identifier. The group site selector makes use of the tag.
change.dir If true, tells kickstart to change into the remote working directory. Kickstart itself is executed in whichever directory the remote scheduling system chose for the job.
create.dir If true, tells kickstart to create the the remote working directory before changing into the remote working directory. Kickstart itself is executed in whichever directory the remote scheduling system chose for the job.
transfer.proxy If true, tells Pegasus to explicitly transfer the proxy for transfer jobs to the remote site. This is useful, when you want to use a full proxy at the remote end, instead of the limited proxy that is transferred by CondorG.
transfer.arguments Allows the user to specify the arguments with which the transfer executable is invoked. However certain options are always generated for the transfer executable(base-uri se-mount-point).
style Sets the condor submit file style. If set to globus, submit file generated refers to CondorG job submissions. If set to condor, submit file generated refers to direct Condor submission to the local Condor pool. It applies for glidein, where nodes from remote grid sites are glided into the local condor pool. The default style that is applied is globus.
pmc_request_memory This key is used to set the -m option for pegasus-mpi-cluster. It specifies the amount of memory in MB that a job requires. This profile is usually set in the DAX for each job.
pmc_request_cpus This key is used to set the -c option for pegasus-mpi-cluster. It specifies the number of cpu's that a job requires. This profile is usually set in the DAX for each job.
pmc_priority This key is used to set the -p option for pegasus-mpi-cluster. It specifies the priority for a job . This profile is usually set in the DAX for each job. Negative values are allowed for priorities.
pmc_task_arguments The key is used to pass any extra arguments to the PMC task during the planning time. They are added to the very end of the argument string constructed for the task in the PMC file. Hence, allows for overriding of any argument constructed by the planner for any particular task in the PMC job.

10.2.3. Sources for Profiles

Profiles may enter the job-processing stream at various stages. Depending on the requirements and scope a profile is to apply, profiles can be associated at

  • as user property settings.

  • dax level

  • in the site catalog

  • in the transformation catalog

Unfortunately, a different syntax applies to each level and context. This section shows the different profile sources and syntaxes. However, at the foundation of each profile lies the triple of namespace, key and value.

10.2.3.1. User Profiles in Properties

Users can specify all profiles in the properties files where the property name is [namespace].key and value of the property is the value of the profile.

Namespace can be env|condor|globus|dagman|pegasus

Any profile specified as a property applies to the whole workflow unless overridden at the DAX level , Site Catalog , Transformation Catalog Level.

Some profiles that they can be set in the properties file are listed below

env.JAVA_HOME "/software/bin/java"

condor.periodic_release 5
condor.periodic_remove  my_own_expression
condor.stream_error true
condor.stream_output fa

globus.maxwalltime  1000
globus.maxtime      900
globus.maxcputime   10
globus.project      test_project
globus.queue        main_queue

dagman.post.arguments --test arguments
dagman.retry  4
dagman.post simple_exitcode
dagman.post.path.simple_exitcode  /bin/exitcode/exitcode.sh
dagman.post.scope all
dagman.maxpre  12
dagman.priority 13

dagman.bigjobs.maxjobs 1


pegasus.clusters.size 5

pegasus.stagein.clusters 3

10.2.3.2. Profiles in DAX

The user can associate profiles with logical transformations in DAX. Environment settings required by a job's application, or a maximum estimate on the run-time are examples for profiles at this stage.

<job id="ID000001" namespace="asdf" name="preprocess" version="1.0"
 level="3" dv-namespace="voeckler" dv-name="top" dv-version="1.0">
  <argument>-a top -T10  -i <filename file="voeckler.f.a"/>
 -o <filename file="voeckler.f.b1"/>
 <filename file="voeckler.f.b2"/></argument>
  <profile namespace="pegasus" key="walltime">2</profile>
  <profile namespace="pegasus" key="diskspace">1</profile>
  &mldr;
</job>

10.2.3.3. Profiles in Site Catalog

If it becomes necessary to limit the scope of a profile to a single site, these profiles should go into the site catalog. A profile in the site catalog applies to all jobs and all application run at the site. Commonly, site catalog profiles set environment settings like the LD_LIBRARY_PATH, or globus rsl parameters like queue and project names.

Currently, there is no tool to manipulate the site catalog, e.g. by adding profiles. Modifying the site catalog requires that you load it into your editor.

The XML version of the site catalog uses the following syntax:

<profile namespace="namespace" key="key">value</profile>

The XML schema requires that profiles are the first children of a pool element. If the element ordering is wrong, the XML parser will produce errors and warnings:

<pool handle="isi_condor" gridlaunch="/home/shared/pegasus/bin/kickstart">
  <profile namespace="env"
   key="GLOBUS_LOCATION">/home/shared/globus/</profile>
  <profile namespace="env"
   key="LD_LIBRARY_PATH" >/home/shared/globus/lib</profile>
  <lrc url="rls://sukhna.isi.edu" />
  &mldr;
</pool>

The multi-line textual version of the site catalog uses the following syntax:

profile namespace "key" "value"

The order within the textual pool definition is not important. Profiles can appear anywhere:

pool isi_condor {
  gridlaunch "/home/shared/pegasus/bin/kickstart"
  profile env "GLOBUS_LOCATION" "/home/shared/globus"
  profile env "LD_LIBRARY_PATH" "/home/shared/globus/lib"
  &mldr;
}

10.2.3.4. Profiles in Transformation Catalog

Some profiles require a narrower scope than the site catalog offers. Some profiles only apply to certain applications on certain sites, or change with each application and site. Transformation-specific and CPU-specific environment variables, or job clustering profiles are good candidates. Such profiles are best specified in the transformation catalog.

Profiles associate with a physical transformation and site in the transformation catalog. The Database version of the transformation catalog also permits the convenience of connecting a transformation with a profile.

The Pegasus tc-client tool is a convenient helper to associate profiles with transformation catalog entries. As benefit, the user does not have to worry about formats of profiles in the various transformation catalog instances.

tc-client -a -P -E -p /home/shared/executables/analyze -t INSTALLED -r isi_condor -e env::GLOBUS_LOCATION=&rdquor;/home/shared/globus&rdquor;

The above example adds an environment variable GLOBUS_LOCATION to the application /home/shared/executables/analyze on site isi_condor. The transformation catalog guide has more details on the usage of the tc-client.

10.2.4. Profiles Conflict Resolution

Irrespective of where the profiles are specified, eventually the profiles are associated with jobs. Multiple sources may specify the same profile for the same job. For instance, DAX may specify an environment variable X. The site catalog may also specify an environment variable X for the chosen site. The transformation catalog may specify an environment variable X for the chosen site and application. When the job is concretized, these three conflicts need to be resolved.

Pegasus defines a priority ordering of profiles. The higher priority takes precedence (overwrites) a profile of a lower priority.

  1. Transformation Catalog Profiles

  2. Site Catalog Profiles

  3. DAX Profiles

  4. Profiles in Properties

10.2.5. Details of Profile Handling

The previous sections omitted some of the finer details for the sake of clarity. To understand some of the constraints that Pegasus imposes, it is required to look at the way profiles affect jobs.

10.2.5.1. Details of env Profiles

Profiles in the env namespace are translated to a semicolon-separated list of key-value pairs. The list becomes the argument for the Condor environment command in the job's submit file.

######################################################################
# Pegasus WMS  SUBMIT FILE GENERATOR
# DAG : black-diamond, Index = 0, Count = 1
# SUBMIT FILE NAME : findrange_ID000002.sub
######################################################################
globusrsl = (jobtype=single)
environment=GLOBUS_LOCATION=/shared/globus;LD_LIBRARY_PATH=/shared/globus/lib;
executable = /shared/software/linux/pegasus/default/bin/kickstart
globusscheduler = columbus.isi.edu/jobmanager-condor
remote_initialdir = /shared/CONDOR/workdir/isi_hourglass
universe = globus
&mldr;
queue
######################################################################
# END OF SUBMIT FILE

Condor-G, in turn, will translate the environment command for any remote job into Globus RSL environment settings, and append them to any existing RSL syntax it generates. To permit proper mixing, all environment setting should solely use the env profiles, and none of the Condor nor Globus environment settings.

If kickstart starts a job, it may make use of environment variables in its executable and arguments setting.

10.2.5.2. Details of globus Profiles

Profiles in the globus Namespaces are translated into a list of paranthesis-enclosed equal-separated key-value pairs. The list becomes the value for the Condor globusrsl setting in the job's submit file:

######################################################################
# Pegasus WMS SUBMIT FILE GENERATOR
# DAG : black-diamond, Index = 0, Count = 1
# SUBMIT FILE NAME : findrange_ID000002.sub
######################################################################
globusrsl = (jobtype=single)(queue=fast)(project=nvo)
executable = /shared/software/linux/pegasus/default/bin/kickstart
globusscheduler = columbus.isi.edu/jobmanager-condor
remote_initialdir = /shared/CONDOR/workdir/isi_hourglass
universe = globus
&mldr;
queue
######################################################################
# END OF SUBMIT FILE

For this reason, Pegasus prohibits the use of the globusrsl key in the condor profile namespace.

10.3. Replica Selection

Each job in the DAX maybe associated with input LFN&rsquor;s denoting the files that are required for the job to run. To determine the physical replica (PFN) for a LFN, Pegasus queries the Replica catalog to get all the PFN&rsquor;s (replicas) associated with a LFN. The Replica Catalog may return multiple PFN's for each of the LFN's queried. Hence, Pegasus needs to select a single PFN amongst the various PFN's returned for each LFN. This process is known as replica selection in Pegasus. Users can specify the replica selector to use in the properties file.

This document describes the various Replica Selection Strategies in Pegasus.

10.3.1. Configuration

The user properties determine what replica selector Pegasus Workflow Mapper uses. The property pegasus.selector.replica is used to specify the replica selection strategy. Currently supported Replica Selection strategies are

  1. Default

  2. Restricted

  3. Regex

The values are case sensitive. For example the following property setting will throw a Factory Exception .

pegasus.selector.replica  default

The correct way to specify is

pegasus.selector.replica  Default

10.3.2. Supported Replica Selectors

The various Replica Selectors supported in Pegasus Workflow Mapper are explained below

10.3.2.1. Default

This is the default replica selector used in the Pegasus Workflow Mapper. If the property pegasus.selector.replica is not defined in properties, then Pegasus uses this selector.

This selector looks at each PFN returned for a LFN and checks to see if

  1. the PFN is a file URL (starting with file:///)

  2. the PFN has a pool attribute matching to the site handle of the site where the compute job that requires the input file is to be run.

If a PFN matching the conditions above exists then that is returned by the selector .

Else, a random PFN is selected amongst all the PFN&rsquor;s that have a pool attribute matching to the site handle of the site where a compute job is to be run.

Else, a random pfn is selected amongst all the PFN&rsquor;s

To use this replica selector set the following property

pegasus.selector.replica                  Default

10.3.2.2. Restricted

This replica selector, allows the user to specify good sites and bad sites for staging in data to a particular compute site. A good site for a compute site X, is a preferred site from which replicas should be staged to site X. If there are more than one good sites having a particular replica, then a random site is selected amongst these preferred sites.

A bad site for a compute site X, is a site from which replica&rsquor;s should not be staged. The reason of not accessing replica from a bad site can vary from the link being down, to the user not having permissions on that site&rsquor;s data.

The good | bad sites are specified by the following properties

pegasus.replica.*.prefer.stagein.sites
pegasus.replica.*.ignore.stagein.sites

where the * in the property name denotes the name of the compute site. A * in the property key is taken to mean all sites. The value to these properties is a comma separated list of sites.

For example the following settings

pegasus.selector.replica.*.prefer.stagein.sites            usc
pegasus.replica.uwm.prefer.stagein.sites                   isi,cit

means that prefer all replicas from site usc for staging in to any compute site. However, for uwm use a tighter constraint and prefer only replicas from site isi or cit. The pool attribute associated with the PFN's tells the replica selector to what site a replica/PFN is associated with.

The pegasus.replica.*.prefer.stagein.sites property takes precedence over pegasus.replica.*.ignore.stagein.sites property i.e. if for a site X, a site Y is specified both in the ignored and the preferred set, then site Y is taken to mean as only a preferred site for a site X.

To use this replica selector set the following property

pegasus.selector.replica                  Restricted

10.3.2.3. Regex

This replica selector allows the user allows the user to specific regex expressions that can be used to rank various PFN&rsquor;s returned from the Replica Catalog for a particular LFN. This replica selector selects the highest ranked PFN i.e the replica with the lowest rank value.

The regular expressions are assigned different rank, that determine the order in which the expressions are employed. The rank values for the regex can expressed in user properties using the property.

pegasus.selector.replica.regex.rank.[value]                  regex-expression

The [value] in the above property is an integer value that denotes the rank of an expression with a rank value of 1 being the highest rank.

For example, a user can specify the following regex expressions that will ask Pegasus to prefer file URL's over gsiftp url's from example.isi.edu

pegasus.selector.replica.regex.rank.1                       file://.*
pegasus.selector.replica.regex.rank.2                       gsiftp://example\.isi\.edu.*

User can specify as many regex expressions as they want.

Since Pegasus is in Java , the regex expression support is what Java supports. It is pretty close to what is supported by Perl. More details can be found at http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html

Before applying any regular expressions on the PFN&rsquor;s for a particular LFN that has to be staged to a site X, the file URL&rsquor;s that don't match the site X are explicitly filtered out.

To use this replica selector set the following property

pegasus.selector.replica                  Regex

10.3.2.4. Local

This replica selector always prefers replicas from the local host ( pool attribute set to local ) and that start with a file: URL scheme. It is useful, when users want to stagein files to a remote site from the submit host using the Condor file transfer mechanism.

To use this replica selector set the following property

pegasus.selector.replica                  Default

10.4. Job Clustering

A large number of workflows executed through the Pegasus Workflow Management System, are composed of several jobs that run for only a few seconds or so. The overhead of running any job on the grid is usually 60 seconds or more. Hence, it makes sense to cluster small independent jobs into a larger job. This is done while mapping an abstract workflow to an executable workflow. Site specific or transformation specific criteria are taken into consideration while clustering smaller jobs into a larger job in the executable workflow. The user is allowed to control the granularity of this clustering on a per transformation per site basis.

10.4.1. Overview

The abstract workflow is mapped onto the various sites by the Site Selector. This semi executable workflow is then passed to the clustering module. The clustering of the workflow can be either be

  • level based (horizontal clustering )

  • label based (label clustering)

The clustering module clusters the jobs into larger/clustered jobs, that can then be executed on the remote sites. The execution can either be sequential on a single node or on multiple nodes using MPI. To specify which clustering technique to use the user has to pass the --cluster option to pegasus-plan .

10.4.1.1. Generating Clustered Executable Workflow

The clustering of a workflow is activated by passing the --cluster|-C option to pegasus-plan. The clustering granularity of a particular logical transformation on a particular site is dependant upon the clustering techniques being used. The executable that is used for running the clustered job on a particular site is determined as explained in section 7.

#Running pegasus-plan to generate clustered workflows

$ pegasus-plan --dax example.dax --dir ./dags -p siteX --output local
               --cluster [comma separated list of clustering techniques]  -verbose

Valid clustering techniques are horizontal and label.

The naming convention of submit files of the clustered jobs is merge_NAME_IDX.sub . The NAME is derived from the logical transformation name. The IDX is an integer number between 1 and the total number of jobs in a cluster. Each of the submit files has a corresponding input file, following the naming convention merge_NAME_IDX.in . The input file contains the respective execution targets and the arguments for each of the jobs that make up the clustered job.

10.4.1.1.1. Horizontal Clustering

In case of horizontal clustering, each job in the workflow is associated with a level. The levels of the workflow are determined by doing a modified Breadth First Traversal of the workflow starting from the root nodes. The level associated with a node, is the furthest distance of it from the root node instead of it being the shortest distance as in normal BFS. For each level the jobs are grouped by the site on which they have been scheduled by the Site Selector. Only jobs of same type (txnamespace, txname, txversion) can be clustered into a larger job. To use horizontal clustering the user needs to set the --cluster option of pegasus-plan to horizontal .

10.4.1.1.1.1. Controlling Clustering Granularity

The number of jobs that have to be clustered into a single large job, is determined by the value of two parameters associated with the smaller jobs. Both these parameters are specified by the use of a PEGASUS namespace profile keys. The keys can be specified at any of the placeholders for the profiles (abstract transformation in the DAX, site in the site catalog, transformation in the transformation catalog). The normal overloading semantics apply i.e. profile in transformation catalog overrides the one in the site catalog and that in turn overrides the one in the DAX. The two parameters are described below.

  • clusters.size factor

    The clusters.size factor denotes how many jobs need to be merged into a single clustered job. It is specified via the use of a PEGASUS namespace profile key &ldquo;clusters.size&rdquor;. for e.g. if at a particular level, say 4 jobs referring to logical transformation B have been scheduled to a siteX. The clusters.size factor associated with job B for siteX is say 3. This will result in 2 clustered jobs, one composed of 3 jobs and another of 2 jobs. The clusters.size factor can be specified in the transformation catalog as follows

    #site   transformation   pfn            type               architecture  profiles
    
    siteX    B     /shared/PEGASUS/bin/jobB INSTALLED       INTEL32::LINUX  PEGASUS::clusters.size=3
    siteX    C     /shared/PEGASUS/bin/jobC INSTALLED       INTEL32::LINUX  PEGASUS::clusters.size=2
    

    Figure 10.1. Clustering by clusters.size

    Clustering by clusters.size

  • clusters.num factor

    The clusters.num factor denotes how many clustered jobs does the user want to see per level per site. It is specified via the use of a PEGASUS namespace profile key &ldquo;clusters.num&rdquor;. for e.g. if at a particular level, say 4 jobs referring to logical transformation B have been scheduled to a siteX. The &ldquo;clusters.num&rdquor; factor associated with job B for siteX is say 3. This will result in 3 clustered jobs, one composed of 2 jobs and others of a single job each. The clusters.num factor in the transformation catalog can be specified as follows

    #site  transformation      pfn           type            architecture    profiles
    
    siteX    B     /shared/PEGASUS/bin/jobB INSTALLED       INTEL32::LINUX  PEGASUS::clusters.num=3
    siteX    C     /shared/PEGASUS/bin/jobC INSTALLED       INTEL32::LINUX  PEGASUS::clusters.num=2
    

    In the case, where both the factors are associated with the job, the clusters.num value supersedes the clusters.size value.

    #site  transformation   pfn             type             architecture   profiles
    
    siteX    B     /shared/PEGASUS/bin/jobB INSTALLED       INTEL32::LINUX PEGASUS::clusters.size=3,clusters.num=3
    

    In the above case the jobs referring to logical transformation B scheduled on siteX will be clustered on the basis of &ldquo;clusters.num&rdquor; value. Hence, if there are 4 jobs referring to logical transformation B scheduled to siteX, then 3 clustered jobs will be created.

    Figure 10.2. Clustering by clusters.num

    Clustering by clusters.num

10.4.1.1.2. Runtime Clustering

Workflows often consist of jobs of same type, but have varying run times. Two or more instances of the same job, with varying inputs can differ significantly in their runtimes. A simple way to think about this is running the same program on two distinct input sets, where one input is smaller (1 MB) as compared to the other which is 10 GB in size. In such a case the two jobs will having significantly differing run times. When such jobs are clustered using horizontal clustering, the benefits of job clustering may be lost if all smaller jobs get clustered together, while the larger jobs are clustered together. In such scenarios it would be beneficial to be able to cluster jobs together such that all clustered jobs have similar runtimes.

In case of runtime clustering, jobs in the workflow are associated with a level. The levels of the workflow are determined in the same manner as in horizontal clustering. For each level the jobs are grouped by the site on which they have been scheduled by the Site Selector. Only jobs of same type (txnamespace, txname, txversion) can be clustered into a larger job. To use runtime clustering the user needs to set the --cluster option of pegasus-plan to horizontal.

Basic Algorithm of grouping jobs into clusters is as follows

// cluster.maxruntime - Is the maximum runtime for which the clustered job should run.
// j.runtime - Is the runtime of the job j.
1. Create a set of jobs of the same type (txnamespace, txname, txversion), and that run on the same site.
2. Sort the jobs in decreasing order of their runtime.
3. For each job j, repeat
  a. If j.runtime > cluster.maxruntime then 
        ignore j.
  // Sum of runtime of jobs already in the bin + j.runtime <= cluster.maxruntime
  b. If j can be added to any existing bin (clustered job) then 
        Add j to bin
     Else
        Add a new bin
        Add job j to newly added bin

The runtime of a job, and maximum runtime for which a clustered jobs should run, is determined by the value of two parameters associated with the jobs.

  • runtime

    expected runtime for a job

  • clusters.maxruntime

    maxruntime for the clustered job

Both these parameters are specified by the use of a PEGASUS namespace profile keys. The keys can be specified at any of the placeholders for the profiles (abstract transformation in the DAX, site in the site catalog, transformation in the transformation catalog). The normal overloading semantics apply i.e. profile in transformation catalog overrides the one in the site catalog and that in turn overrides the one in the DAX. The two parameters are described below.

#site  transformation   pfn             type             architecture   profiles

siteX    B     /shared/PEGASUS/bin/jobB INSTALLED       INTEL32::LINUX PEGASUS::clusters.maxruntime=250,runtime=100
siteX    C     /shared/PEGASUS/bin/jobC INSTALLED       INTEL32::LINUX PEGASUS::clusters.maxruntime=300,runtime=100

Figure 10.3. Clustering by runtime

Clustering by runtime

In the above case the jobs referring to logical transformation B scheduled on siteX will be clustered such that all clustered jobs will run approximately for the same duration specified by the clusters.maxruntime property. In the above case we assume all jobs referring to transformation B run for 100 seconds. For jobs with significantly differing runtime, the runtime property will be associated with the jobs in the DAX.

In addition to the above two profiles, we need to inform pegasus-plan to use runtime clustering. This is done by setting the following property .

 pegasus.clusterer.preference          Runtime 

10.4.1.1.3. Label Clustering

In label based clustering, the user labels the workflow. All jobs having the same label value are clustered into a single clustered job. This allows the user to create clusters or use a clustering technique that is specific to his workflows. If there is no label associated with the job, the job is not clustered and is executed as is

Figure 10.4. Label-based clustering

Label-based clustering


Since, the jobs in a cluster in this case are not independent, it is important the jobs are executed in the correct order. This is done by doing a topological sort on the jobs in each cluster. To use label based clustering the user needs to set the --cluster option of pegasus-plan to label.

10.4.1.1.3.1. Labelling the Workflow

The labels for the jobs in the workflow are specified by associated pegasus profile keys with the jobs during the DAX generation process. The user can choose which profile key to use for labeling the workflow. By default, it is assumed that the user is using the PEGASUS profile key label to associate the labels. To use another key, in the pegasus namespace the user needs to set the following property

  • pegasus.clusterer.label.key

For example if the user sets pegasus.clusterer.label.key to user_label then the job description in the DAX looks as follows

<adag >
...
  <job id="ID000004" namespace="app" name="analyze" version="1.0" level="1" >
    <argument>-a bottom -T60  -i <filename file="user.f.c1"/>  -o <filename file="user.f.d"/></argument>
    <profile namespace="pegasus" key="user_label">p1</profile>
    <uses file="user.f.c1" link="input" dontRegister="false" dontTransfer="false"/>
    <uses file="user.f.c2" link="input" dontRegister="false" dontTransfer="false"/>
    <uses file="user.f.d" link="output" dontRegister="false" dontTransfer="false"/>
  </job>
...
</adag>
  • The above states that the pegasus profiles with key as user_label are to be used for designating clusters.

  • Each job with the same value for pegasus profile key user_label appears in the same cluster.

10.4.1.1.4. Recursive Clustering

In some cases, a user may want to use a combination of clustering techniques. For e.g. a user may want some jobs in the workflow to be horizontally clustered and some to be label clustered. This can be achieved by specifying a comma separated list of clustering techniques to the --cluster option of pegasus-plan. In this case the clustering techniques are applied one after the other on the workflow in the order specified on the command line.

For example

$ pegasus-plan --dax example.dax --dir ./dags --cluster label,horizontal -s siteX --output local --verbose

Figure 10.5. Recursive clustering

Recursive clustering

10.4.1.2. Execution of the Clustered Job

The execution of the clustered job on the remote site, involves the execution of the smaller constituent jobs either

  • sequentially on a single node of the remote site

    The clustered job is executed using pegasus-cluster, a wrapper tool written in C that is distributed as part of the PEGASUS. It takes in the jobs passed to it, and ends up executing them sequentially on a single node. To use pegasus-cluster for executing any clustered job on a siteX, there needs to be an entry in the transformation catalog for an executable with the logical name seqexec and namespace as pegasus.

    #site  transformation   pfn            type                 architecture    profiles
    
    siteX    pegasus::seqexec     /usr/pegasus/bin/pegasus-cluster INSTALLED       INTEL32::LINUX NULL

    If the entry is not specified, Pegasus will attempt create a default path on the basis of the environment profile PEGASUS_HOME specified in the site catalog for the remote site.

  • On multiple nodes of the remote site using MPI based task management tool called Pegasus MPI Cluster (PMC)

    The clustered job is executed using pegasus-mpi-cluster, a wrapper MPI program written in C that is distributed as part of the PEGASUS. A PMC job consists of a single master process (this process is rank 0 in MPI parlance) and several worker processes. These processes follow the standard master-worker architecture. The master process manages the workflow and assigns workflow tasks to workers for execution. The workers execute the tasks and return the results to the master. Communication between the master and the workers is accomplished using a simple text-based protocol implemented using MPI_Send and MPI_Recv. PMC relies on a shared filesystem on the remote site to manage the individual tasks stdout and stderr and stage it back to the submit host as part of it's own stdout/stderr.

    The input format for PMC is a DAG based format similar to Condor DAGMan's. PMC follows the dependencies specified in the DAG to release the jobs in the right order and executes parallel jobs via the workers when possible. The input file for PMC is automatically generated by the Pegasus Planner when generating the executable workflow. PMC allows for a finer grained control on how each task is executed. This can be enabled by associating the following pegasus profiles with the jobs in the DAX

    Table 10.9. Table : Pegasus Profiles that can be associated with jobs in the DAX for PMC

    Key Description
    pmc_request_memory This key is used to set the -m option for pegasus-mpi-cluster. It specifies the amount of memory in MB that a job requires. This profile is usually set in the DAX for each job.
    pmc_request_cpus This key is used to set the -c option for pegasus-mpi-cluster. It specifies the number of cpu's that a job requires. This profile is usually set in the DAX for each job.
    pmc_priority This key is used to set the -p option for pegasus-mpi-cluster. It specifies the priority for a job . This profile is usually set in the DAX for each job. Negative values are allowed for priorities.
    pmc_task_arguments The key is used to pass any extra arguments to the PMC task during the planning time. They are added to the very end of the argument string constructed for the task in the PMC file. Hence, allows for overriding of any argument constructed by the planner for any particular task in the PMC job.

    Refer to the pegasus-mpi-cluster man page in the command line tools chapter to know more about PMC and how it schedules individual tasks.

    It is recommended to have a pegasus::mpiexec entry in the transformation catalog to specify the path to PMC on the remote and specify the relevant globus profiles such as xcount, host_xcount and maxwalltime to control size of the MPI job.

    #site  transformation   pfn            type                 architecture    profiles
    
    siteX    pegasus::mpiexec     /usr/pegasus/bin/pegasus-mpi-cluster INSTALLED       INTEL32::LINUX globus::xcount=32;globus::host_xcount=1

    If the entry is not specified, Pegasus will attempt create a default path on the basis of the environment profile PEGASUS_HOME specified in the site catalog for the remote site.

    Tip

    Users are encouraged to use label based clustering in conjunction with PMC

10.4.1.2.1. Specification of Method of Execution for Clustered Jobs

The method execution of the clustered job(whether to launch via mpiexec or seqexec) can be specified

  1. globally in the properties file

    The user can set a property in the properties file that results in all the clustered jobs of the workflow being executed by the same type of executable.

    #PEGASUS PROPERTIES FILE
    pegasus.clusterer.job.aggregator seqexec|mpiexec

    In the above example, all the clustered jobs on the remote sites are going to be launched via the property value, as long as the property value is not overridden in the site catalog.

  2. associating profile key job.aggregator with the site in the site catalog

    <site handle="siteX" gridlaunch = "/shared/PEGASUS/bin/kickstart">
        <profile namespace="env" key="GLOBUS_LOCATION" >/home/shared/globus</profile>
        <profile namespace="env" key="LD_LIBRARY_PATH">/home/shared/globus/lib</profile>
        <profile namespace="pegasus" key="job.aggregator" >seqexec</profile>
        <lrc url="rls://siteX.edu" />
        <gridftp  url="gsiftp://siteX.edu/" storage="/home/shared/work" major="2" minor="4" patch="0" />
        <jobmanager universe="transfer" url="siteX.edu/jobmanager-fork" major="2" minor="4" patch="0" />
        <jobmanager universe="vanilla" url="siteX.edu/jobmanager-condor" major="2" minor="4" patch="0" />
        <workdirectory >/home/shared/storage</workdirectory>
      </site>

    In the above example, all the clustered jobs on a siteX are going to be executed via seqexec, as long as the value is not overridden in the transformation catalog.

  3. associating profile key job.aggregator with the transformation that is being clustered, in the transformation catalog

    #site  transformation   pfn            type                architecture profiles
    
    siteX    B     /shared/PEGASUS/bin/jobB INSTALLED       INTEL32::LINUX pegasus::clusters.size=3,job.aggregator=mpiexec

    In the above example, all the clustered jobs that consist of transformation B on siteX will be executed via mpiexec.

    Note

    The clustering of jobs on a site only happens only if

    • there exists an entry in the transformation catalog for the clustering executable that has been determined by the above 3 rules

    • the number of jobs being clustered on the site are more than 1

10.4.1.3. Outstanding Issues

  1. Label Clustering

    More rigorous checks are required to ensure that the labeling scheme applied by the user is valid.

10.5. Data Transfers

As part of the Workflow Mapping Process, Pegasus does data management for the executable workflow . It queries a Replica Catalog to discover the locations of the input datasets and adds data movement and registration nodes in the workflow to

  1. stage-in input data to the staging sites ( a site associated with the compute job to be used for staging. In the shared filesystem setup, staging site is the same as the execution sites where the jobs in the workflow are executed )

  2. stage-out output data generated by the workflow to the final storage site.

  3. stage-in intermediate data between compute sites if required.

  4. data registration nodes to catalog the locations of the output data on the final storage site into the replica catalog.

The separate data movement jobs that are added to the executable workflow are responsible for staging data to a workflow specific directory accessible to the staging server on a staging site associated with the compute sites. Depending on the data staging configuration, the staging site for a compute site is the compute site itself. In the default case, the staging server is usually on the headnode of the compute site and has access to the shared filesystem between the worker nodes and the head node. Pegasus adds a directory creation job in the executable workflow that creates the workflow specific directory on the staging server.

In addition to data, Pegasus does transfer user executables to the compute sites if the executables are not installed on the remote sites before hand. This chapter gives an overview of how transfers of data and executables is managed in Pegasus.

10.5.1. Data Staging Configuration

Pegasus can be broadly setup to run workflows in the following configurations

  • Shared File System

    This setup applies to where the head node and the worker nodes of a cluster share a filesystem. Compute jobs in the workflow run in a directory on the shared filesystem.

  • NonShared FileSystem

    This setup applies to where the head node and the worker nodes of a cluster don't share a filesystem. Compute jobs in the workflow run in a local directory on the worker node

  • Condor Pool Without a shared filesystem

    This setup applies to a condor pool where the worker nodes making up a condor pool don't share a filesystem. All data IO is achieved using Condor File IO. This is a special case of the non shared filesystem setup, where instead of using pegasus-transfer to transfer input and output data, Condor File IO is used.

For the purposes of data configuration various sites, and directories are defined below.

  1. Submit Host

    The host from where the workflows are submitted . This is where Pegasus and Condor DAGMan are installed. This is referred to as the "local" site in the site catalog .

  2. Compute Site

    The site where the jobs mentioned in the DAX are executed. There needs to be an entry in the Site Catalog for every compute site. The compute site is passed to pegasus-plan using --sites option

  3. Staging Site

    A site to which the separate transfer jobs in the executable workflow ( jobs with stage_in , stage_out and stage_inter prefixes that Pegasus adds using the transfer refiners) stage the input data to and the output data from to transfer to the final output site. Currently, the staging site is always the compute site where the jobs execute.

  4. Output Site

    The output site is the final storage site where the users want the output data from jobs to go to. The output site is passed to pegasus-plan using the --output option. The stageout jobs in the workflow stage the data from the staging site to the final storage site.

  5. Input Site

    The site where the input data is stored. The locations of the input data are catalogued in the Replica Catalog, and the pool attribute of the locations gives us the site handle for the input site.

  6. Workflow Execution Directory

    This is the directory created by the create dir jobs in the executable workflow on the Staging Site. This is a directory per workflow per staging site. Currently, the Staging site is always the Compute Site.

  7. Worker Node Directory

    This is the directory created on the worker nodes per job usually by the job wrapper that launches the job.

10.5.1.1. Shared File System

By default Pegasus is setup to run workflows in the shared file system setup, where the worker nodes and the head node of a cluster share a filesystem.

Figure 10.6. Shared File System Setup

Shared File System Setup

The data flow is as follows in this case

  1. Stagein Job executes ( either on Submit Host or Head Node ) to stage in input data from Input Sites ( 1---n) to a workflow specific execution directory on the shared filesystem.

  2. Compute Job starts on a worker node in the workflow execution directory. Accesses the input data using Posix IO

  3. Compute Job executes on the worker node and writes out output data to workflow execution directory using Posix IO

  4. Stageout Job executes ( either on Submit Host or Head Node ) to stage out output data from the workflow specific execution directory to a directory on the final output site.

Tip

Set pegasus.data.configuration to sharedfs to run in this configuration.

10.5.1.2. Non Shared Filesystem

In this setup , Pegasus runs workflows on local file-systems of worker nodes with the the worker nodes not sharing a filesystem. The data transfers happen between the worker node and a staging / data coordination site. The staging site server can be a file server on the head node of a cluster or can be on a separate machine.

Setup

  • compute and staging site are the different

  • head node and worker nodes of compute site don't share a filesystem

  • Input Data is staged from remote sites.

  • Remote Output Site i.e site other than compute site. Can be submit host.

Figure 10.7. Non Shared Filesystem Setup

Non Shared Filesystem Setup

The data flow is as follows in this case

  1. Stagein Job executes ( either on Submit Host or on staging site ) to stage in input data from Input Sites ( 1---n) to a workflow specific execution directory on the staging site.

  2. Compute Job starts on a worker node in a local execution directory. Accesses the input data using pegasus transfer to transfer the data from the staging site to a local directory on the worker node

  3. The compute job executes in the worker node, and executes on the worker node.

  4. The compute Job writes out output data to the local directory on the worker node using Posix IO

  5. Output Data is pushed out to the staging site from the worker node using pegasus-transfer.

  6. Stageout Job executes ( either on Submit Host or staging site ) to stage out output data from the workflow specific execution directory to a directory on the final output site.

In this case, the compute jobs are wrapped as PegasusLite instances.

This mode is especially useful for running in the cloud environments where you don't want to setup a shared filesystem between the worker nodes. Running in that mode is explained in detail here.

Tip

Set pegasus.data.configuration to nonsharedfs to run in this configuration. The staging site can be specified using the --staging-site option to pegasus-plan.

10.5.1.3. Condor Pool Without a Shared Filesystem

This setup applies to a condor pool where the worker nodes making up a condor pool don't share a filesystem. All data IO is achieved using Condor File IO. This is a special case of the non shared filesystem setup, where instead of using pegasus-transfer to transfer input and output data, Condor File IO is used.

Setup

  • Submit Host and staging site are same

  • head node and worker nodes of compute site don't share a filesystem

  • Input Data is staged from remote sites.

  • Remote Output Site i.e site other than compute site. Can be submit host.

Figure 10.8. Condor Pool Without a Shared Filesystem

Condor Pool Without a Shared Filesystem

The data flow is as follows in this case

  1. Stagein Job executeson the submit host to stage in input data from Input Sites ( 1---n) to a workflow specific execution directory on the submit host

  2. Compute Job starts on a worker node in a local execution directory. Before the compute job starts, Condor transfers the input data for the job from the workflow execution directory on thesubmit host to the local execution directory on the worker node.

  3. The compute job executes in the worker node, and executes on the worker node.

  4. The compute Job writes out output data to the local directory on the worker node using Posix IO

  5. When the compute job finishes, Condor transfers the output data for the job from the local execution directory on the worker node to the workflow execution directory on the submit host.

  6. Stageout Job executes ( either on Submit Host or staging site ) to stage out output data from the workflow specific execution directory to a directory on the final output site.

In this case, the compute jobs are wrapped as PegasusLite instances.

This mode is especially useful for running in the cloud environments where you don't want to setup a shared filesystem between the worker nodes. Running in that mode is explained in detail here.

Tip

Set pegasus.data.configuration to condorio to run in this configuration. In this mode, the staging site is automatically set to site local

10.5.2. Local versus Remote Transfers

As far as possible, Pegasus will ensure that the transfer jobs added to the executable workflow are executed on the submit host. By default, Pegasus will schedule a transfer to be executed on the remote staging site only if there is no way to execute it on the submit host. For e.g if the file server specified for the staging site/compute site is a file server, then Pegasus will schedule all the stage in data movement jobs on the compute site to stage-in the input data for the workflow. Another case would be if a user has symlinking turned on. In that case, the transfer jobs that symlink against the input data on the compute site, will be executed remotely ( on the compute site ).

Users can specify the property pegasus.transfer.*.remote.sites to change the default behaviour of Pegasus and force pegasus to run different types of transfer jobs for the sites specified on the remote site. The value of the property is a comma separated list of compute sites for which you want the transfer jobs to run remotely.

The table below illustrates all the possible variations of the property.

Table 10.10. Property Variations for pegasus.transfer.*.remote.sites

Property Name Applies to
pegasus.transfer.stagein.remote.sites the stage in transfer jobs
pegasus.transfer.stageout.remote.sites the stage out transfer jobs
pegasus.transfer.inter.remote.sites the inter site transfer jobs
pegasus.transfer.*.remote.sites all types of transfer jobs

The prefix for the transfer job name indicates whether the transfer job is to be executed locallly ( on the submit host ) or remotely ( on the compute site ). For example stage_in_local_ in a transfer job name stage_in_local_isi_viz_0 indicates that the transfer job is a stage in transfer job that is executed locally and is used to transfer input data to compute site isi_viz. The prefix naming scheme for the transfer jobs is [stage_in|stage_out|inter]_[local|remote]_ .

10.5.3. Symlinking Against Input Data

If input data for a job already exists on a compute site, then it is possible for Pegasus to symlink against that data. In this case, the remote stage in transfer jobs that Pegasus adds to the executable workflow will symlink instead of doing a copy of the data.

Pegasus determines whether a file is on the same site as the compute site, by inspecting the pool attribute associated with the URL in the Replica Catalog. If the pool attribute of an input file location matches the compute site where the job is scheduled, then that particular input file is a candidate for symlinking.

For Pegasus to symlink against existing input data on a compute site, following must be true

  1. Property pegasus.transfer.links is set to true

  2. The input file location in the Replica Catalog has the pool attribute matching the compute site.

Tip

To confirm if a particular input file is symlinked instead of being copied, look for the destination URL for that file in stage_in_remote*.in file. The destination URL will start with symlink:// .

In the symlinking case, Pegasus strips out URL prefix from a URL and replaces it with a file URL.

For example if a user has the following URL catalogued in the Replica Catalog for an input file f.input

f.input   gsiftp://server.isi.edu/shared/storage/input/data/f.input pool="isi"

and the compute job that requires this file executes on a compute site named isi , then if symlinking is turned on the data stage in job (stage_in_remote_viz_0 ) will have the following source and destination specified for the file

#viz viz
file:///shared/storage/input/data/f.input  symlink://shared-scratch/workflow-exec-dir/f.input

10.5.4. Addition of Separate Data Movement Nodes to Executable Workflow

Pegasus relies on a Transfer Refiner that comes up with the strategy on how many data movement nodes are added to the executable workflow. All the compute jobs scheduled to a site share the same workflow specific directory. The transfer refiners ensure that only one copy of the input data is transferred to the workflow execution directory. This is to prevent data clobbering . Data clobbering can occur when compute jobs of a workflow share some input files, and have different stage in transfer jobs associated with them that are staging the shared files to the same destination workflow execution directory.

The default Transfer Refiner used in Pegasus is the Bundle Refiner that allows the user to specify how many local|remote stagein|stageout jobs are created per execution site.

The behavior of the refiner is controlled by specifying certain pegasus profiles

  1. either with the execution sites in the site catalog

  2. OR globally in the properties file

Table 10.11. Pegasus Profile Keys For the Cluster Transfer Refiner

Profile Key Description
stagein.clusters This key determines the maximum number of stage-in jobs that are can executed locally or remotely per compute site per workflow.
stagein.local.clusters This key provides finer grained control in determining the number of stage-in jobs that are executed locally and are responsible for staging data to a particular remote site.
stagein.remote.clusters This key provides finer grained control in determining the number of stage-in jobs that are executed remotely on the remote site and are responsible for staging data to it.
stageout.clusters This key determines the maximum number of stage-out jobs that are can executed locally or remotely per compute site per workflow.
stageout.local.clusters This key provides finer grained control in determining the number of stage-out jobs that are executed locally and are responsible for staging data from a particular remote site.
stageout.remote.clusters This key provides finer grained control in determining the number of stage-out jobs that are executed remotely on the remote site and are responsible for staging data from it.

Figure 10.9. Default Transfer Case : Input Data To Workflow Specific Directory on Shared File System

Default Transfer Case : Input Data To Workflow Specific Directory on Shared File System

10.5.5. Executable Used for Transfer Jobs

Pegasus refers to a python script called pegasus-transfer as the executable in the transfer jobs to transfer the data. pegasus-transfer is a python based wrapper around various transfer clients . pegasus-transfer looks at source and destination url and figures out automatically which underlying client to use. pegasus-transfer is distributed with the PEGASUS and can be found at $PEGASUS_HOME/bin/pegasus-transfer.

Currently, pegasus-transfer interfaces with the following transfer clients

Table 10.12. Transfer Clients interfaced to by pegasus-transfer

Transfer Client Used For
globus-url-copy staging files to and from a gridftp server.
lcg-copy staging files to and from a SRM server.
wget staging files from a HTTP server.
cp copying files from a POSIX filesystem .
ln symlinking against input files.
pegasus-s3/s3cmd staging files to and from s3 bucket in the amazon cloud
scp staging files using scp
iget staging files to and from a irods server.

For remote sites, Pegasus constructs the default path to pegasus-transfer on the basis of PEGASUS_HOME env profile specified in the site catalog. To specify a different path to the pegasus-transfer client , users can add an entry into the transformation catalog with fully qualified logical name as pegasus::pegasus-transfer

10.5.6. Executables used for Directory Creation and Cleanup Jobs

Starting 4.0, Pegasus has changed the way how the scratch directories are created on the staging site. The planner now prefers to schedule the directory creation and cleanup jobs locally. The jobs refer to python based tools, that call out to protocol specific clients to determine what client is picked up. For protocols, where specific remote cleanup and directory creation clients don't exist ( for example gridftp ), the python tools rely on the corresponding transfer tool to create a directory by initiating a transfer of an empty file. The python clients used to create directories and remove files are called

  • pegasus-create-dir

  • pegasus-cleanup

Both these clients inspect the URL's to to determine what underlying client to pick up.

Table 10.13. Clients interfaced to by pegasus-create-dir

Client Used For
globus-url-copy to create directories against a gridftp/ftp server
srm-mkdir to create directories against a SRM server.
mkdir to create a directory on the local filesystem
pegasus-s3 to create a s3 bucket in the amazon cloud
scp staging files using scp
imkdir to create a directory against an IRODS server

Table 10.14. Clients interfaced to by pegasus-cleanup

Client Used For
globus-url-copy to remove a file against a gridftp/ftp server. In this case a zero byte file is created
srm-rm to remove files against a SRM server.
rm to remove a file on the local filesystem
pegasus-s3 to remove a file from the s3 bucket.
scp to remove a file against a scp server. In this case a zero byte file is created.
irm to remove a file against an IRODS server

The only case, where the create dir and cleanup jobs are scheduled to run remotely is when for the staging site, a file server is specified.

10.5.7. Credentials Staging

Pegasus tries to do data staging from localhost by default, but some data scenarios makes some remote jobs do data staging. An example of such a case is when running in nonsharedfs mode. Depending on the transfer protocols used, the job may have to carry credentials to enable these datat transfers. To specify where which credential to use and where Pegasus can find it, use environment variable profiles in your site catalog. The supported credential types are X.509 grid proxies, Amazon AWS S3 keys, iRods password and SSH keys.

10.5.7.1. X.509 Grid Proxies

If the grid proxy is required by transfer jobs, and the proxy is in the standard location, Pegasus will pick the proxy up automatically. For non-standard proxy locations, you can use the X509_USER_PROXY environment variable. Site catalog example:

<profile namespace="env" key="X509_USER_PROXY" >/some/location/x509up</profile>

10.5.7.2. Amazon AWS S3

If a workflow is using s3 URLs, Pegasus has to be told where to find the .s3cfg file. This format of the file is described in the pegaus-s3 command line client's man page. For the file to be picked up by the workflow, set the S3CFG environment profile to the location of the file. Site catalog example:

<profile namespace="env" key="S3CFG" >/home/user/.s3cfg</profile>

10.5.7.3. iRods Password

If a workflow is using irods URLs, Pegasus has to be given an irodsEnv file. It is a standard file, with the addtion of an password attribute. Example:

# iRODS personal configuration file.
#
# iRODS server host name:
irodsHost 'iren.renci.org'
# iRODS server port number:
irodsPort 1259

# Default storage resource name:
irodsDefResource 'renResc'
# Home directory in iRODS:
irodsHome '/tip-renci/home/mats'
# Current directory in iRODS:
irodsCwd '/tip-renci/home/mats'
# Account name:
irodsUserName 'mats'
# Zone:
irodsZone 'tip-renci' 

# this is used with Pegasus
irodsPassword 'somesecretpassword'

The location of the file can be given to the workflow using the irodsEnvFile environment profile. Site catalog example:

<profile namespace="env" key="irodsEnvFile" >/home/user/.irods/.irodsEnv</profile>

10.5.7.4. SSH Keys

New in Pegasus 4.0 is the support for data staging with scp using ssh public/private key authentication. In this mode, Pegasus transports a private key with the jobs. The storage machines will have to have the public part of the key listed in ~/.ssh/authorized_keys.

Warning

SSH keys should be handled in a secure manner. In order to keep your personal ssh keys secure, It is recommended that a special set of keys are created for use with the workflow. Note that Pegasus will not pick up ssh keys automatically. The user will have to specify which key to use with SSH_PRIVATE_KEY.

The location of the ssh private key can be specified with the SSH_PRIVATE_KEY environment profile. Site catalog example:

<profile namespace="env" key="SSH_PRIVATE_KEY" >/home/user/wf/wfsshkey</profile>

10.5.8. Staging of Executables

Users can get Pegasus to stage the user executables ( executables that the jobs in the DAX refer to ) as part of the transfer jobs to the workflow specific execution directory on the compute site. The URL locations of the executables need to be specified in the transformation catalog as the PFN and the type of executable needs to be set to STAGEABLE .

The location of a transformation can be specified either in

  • DAX in the executables section. More details here .

  • Transformation Catalog. More details here .

A particular transformation catalog entry of type STAGEABLE is compatible with a compute site only if all the System Information attributes associated with the entry match with the System Information attributes for the compute site in the Site Catalog. The following attributes make up the System Information attributes

  1. arch

  2. os

  3. osrelease

  4. osversion

10.5.8.1. Transformation Mappers

Pegasus has a notion of transformation mappers that determines what type of executables are picked up when a job is executed on a remote compute site. For transfer of executables, Pegasus constructs a soft state map that resides on top of the transformation catalog, that helps in determining the locations from where an executable can be staged to the remote site.

Users can specify the following property to pick up a specific transformation mapper

pegasus.catalog.transformation.mapper 

Currently, the following transformation mappers are supported.

Table 10.15. Transformation Mappers Supported in Pegasus

Transformation Mapper Description
Installed This mapper only relies on transformation catalog entries that are of type INSTALLED to construct the soft state map. This results in Pegasus never doing any transfer of executables as part of the workflow. It always prefers the installed executables at the remote sites
Staged This mapper only relies on matching transformation catalog entries that are of type STAGEABLE to construct the soft state map. This results in the executable workflow referring only to the staged executables, irrespective of the fact that the executables are already installed at the remote end
All This mapper relies on all matching transformation catalog entries of type STAGEABLE or INSTALLED for a particular transformation as valid sources for the transfer of executables. This the most general mode, and results in the constructing the map as a result of the cartesian product of the matches.
Submit This mapper only on matching transformation catalog entries that are of type STAGEABLE and reside at the submit host (pool local), are used while constructing the soft state map. This is especially helpful, when the user wants to use the latest compute code for his computations on the grid and that relies on his submit host.

10.5.9. Staging of Pegasus Worker Package

Pegasus can optionally stage the pegasus worker package as part of the executable workflow to remote workflow specific execution directory. The pegasus worker package contains the pegasus auxillary executables that are required on the remote site. If the worker package is not staged as part of the executable workflow, then Pegasus relies on the installed version of the worker package on the remote site. To determine the location of the installed version of the worker package on a remote site, Pegasus looks for an environment profile PEGASUS_HOME for the site in the Site Catalog.

Users can set the following property to true to turn on worker package staging

pegasus.transfer.worker.package          true 

By default, when worker package staging is turned on pegasus pulls the compatible worker package from the Pegasus Website. To specify a different worker package location, users can specify the transformation pegasus::worker in the transformation catalog with

  • type set to STAGEABLE

  • System Information attributes of the transformation catalog entry match the System Information attributes of the compute site.

  • the PFN specified should be a remote URL that can be pulled to the compute site.

10.5.9.1. Worker Package Staging in Non Shared Filesystem setup

Worker package staging is automatically set to true , when workflows are setup to run in a non shared filesystem setup i.e. pegasus.data.configuration is set to nonsharedfs or condorio . In these configurations, a stage_worker job is created that brings in the worker package to the submit directory of the workflow. For each job, the worker package is then transferred with the job using Condor File Transfers ( transfer_input_files ) . This transfer always happens unless, PEGASUS_HOME is specified in the site catalog for the site on which the job is scheduled to run.

Users can explicitly set the following property to false, to turn off worker package staging by the Planner. This is applicable , when running in the cloud and virtual machines / worker nodes already have the pegasus worker tools installed.

pegasus.transfer.worker.package          false 

10.5.10. Using Amazon S3 as a Staging Site

Pegasus can be configured to use Amazon S3 as a staging site. In this mode, Pegasus transfers workflow inputs from the input site to S3. When a job runs, the inputs for that job are fetched from S3 to the worker node, the job is executed, then the output files are transferred from the worker node back to S3. When the jobs are complete, Pegasus transfers the output data from S3 to the output site.

In order to use S3, it is necessary to create a config file for the S3 transfer client, pegasus-s3. See the man page for details on how to create the config file. You also need to specify S3 as a staging site.

Next, you need to modify your site catalog to tell the location of your s3cfg file. See the section on credential staging.

The following site catalog shows how to specify the location of the s3cfg file on the local site and how to specify an Amazon S3 staging site:

<sitecatalog xmlns="http://pegasus.isi.edu/schema/sitecatalog"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://pegasus.isi.edu/schema/sitecatalog
             http://pegasus.isi.edu/schema/sc-3.0.xsd" version="3.0">
    <site handle="local" arch="x86_64" os="LINUX">
        <head-fs>
            <scratch>
                <shared>
                    <file-server protocol="file" url="file://" mount-point="/tmp/wf/work"/>
                    <internal-mount-point mount-point="/tmp/wf/work"/>
                </shared>
            </scratch>
            <storage>
                <shared>
                    <file-server protocol="file" url="file://" mount-point="/tmp/wf/storage"/>
                    <internal-mount-point mount-point="/tmp/wf/storage"/>
                </shared>
            </storage>
        </head-fs>
        <profile namespace="env" key="S3CFG">/home/username/.s3cfg</profile>
    </site>
    <site handle="s3" arch="x86_64" os="LINUX">
        <head-fs>
            <scratch>
                <shared>
                    <!-- wf-scratch is the name of the S3 bucket that will be used -->
                    <file-server protocol="s3" url="s3://user@amazon" mount-point="/wf-scratch"/>
                    <internal-mount-point mount-point="/wf-scratch"/>
                </shared>
            </scratch>
        </head-fs>
    </site>
    <site handle="condorpool" arch="x86_64" os="LINUX">
        <head-fs>
            <scratch/>
            <storage/>
        </head-fs>
        <profile namespace="pegasus" key="style">condor</profile>
        <profile namespace="condor" key="universe">vanilla</profile>
        <profile namespace="condor" key="requirements">(Target.Arch == "X86_64")</profile>
    </site>
</sitecatalog>

10.6. Hierarchical Workflows

10.6.1. Introduction

The Abstract Workflow in addition to containing compute jobs, can also contain jobs that refer to other workflows. This is useful for running large workflows or ensembles of workflows.

Users can embed two types of workflow jobs in the DAX

  1. daxjob - refers to a sub workflow represented as a DAX. During the planning of a workflow, the DAX jobs are mapped to condor dagman jobs that have pegasus plan invocation on the dax ( referred to in the DAX job ) as the prescript.

    Figure 10.10. Planning of a DAX Job

    Planning of a DAX Job

  2. dagjob - refers to a sub workflow represented as a DAG. During the planning of a workflow, the DAG jobs are mapped to condor dagman and refer to the DAG file mentioned in the DAG job.

    Figure 10.11. Planning of a DAG Job

    Planning of a DAG Job

10.6.2. Specifying a DAX Job in the DAX

Specifying a DAXJob in a DAX is pretty similar to how normal compute jobs are specified. There are minor differences in terms of the xml element name ( dax vs job ) and the attributes specified. DAXJob XML specification is described in detail in the chapter on DAX API . An example DAX Job in a DAX is shown below

  <dax id="ID000002" name="black.dax" node-label="bar" >
    <profile namespace="dagman" key="maxjobs">10</profile>
    <argument>-Xmx1024 -Xms512 -Dpegasus.dir.storage=storagedir  -Dpegasus.dir.exec=execdir -o local -vvvvv --force -s dax_site </argument>
  </dax>

10.6.2.1. DAX File Locations

The name attribute in the dax element refers to the LFN ( Logical File Name ) of the dax file. The location of the DAX file can be catalogued either in the

  1. Replica Catalog

  2. Replica Catalog Section in the DAX .

    Note

    Currently, only file url's on the local site ( submit host ) can be specified as DAX file locations.

10.6.2.2. Arguments for a DAX Job

Users can specify specific arguments to the DAX Jobs. The arguments specified for the DAX Jobs are passed to the pegasus-plan invocation in the prescript for the corresponding condor dagman job in the executable workflow.

The following options for pegasus-plan are inherited from the pegasus-plan invocation of the parent workflow. If an option is specified in the arguments section for the DAX Job then that overrides what is inherited.

Table 10.16. Options inherited from parent workflow

Option Name Description
--sites list of execution sites.

It is highly recommended that users dont specify directory related options in the arguments section for the DAX Jobs. Pegasus assigns values to these options for the sub workflows automatically.

  1. --relative-dir

  2. --dir

  3. --relative-submit-dir

10.6.2.3. Profiles for DAX Job

Users can choose to specify dagman profiles with the DAX Job to control the behavior of the corresponding condor dagman instance in the executable workflow. In the example above maxjobs is set to 10 for the sub workflow.

10.6.2.4. Execution of the PRE script and Condor DAGMan instance

The pegasus plan that is invoked as part of the prescript to the condor dagman job is executed on the submit host. The log from the output of pegasus plan is redirected to a file ( ending with suffix pre.log ) in the submit directory of the workflow that contains the DAX Job. The path to pegasus-plan is automatically determined.

The DAX Job maps to a Condor DAGMan job. The path to condor dagman binary is determined according to the following rules -

  1. entry in the transformation catalog for condor::dagman for site local, else

  2. pick up the value of CONDOR_HOME from the environment if specified and set path to condor dagman as $CONDOR_HOME/bin/condor_dagman , else

  3. pick up the value of CONDOR_LOCATION from the environment if specified and set path to condor dagman as $CONDOR_LOCATION/bin/condor_dagman , else

  4. pick up the path to condor dagman from what is defined in the user's PATH

Tip

It is recommended that user dagman.maxpre in their properties file to control the maximum number of pegasus plan instances launched by each running dagman instance.

10.6.3. Specifying a DAG Job in the DAX

Specifying a DAGJob in a DAX is pretty similar to how normal compute jobs are specified. There are minor differences in terms of the xml element name ( dag vs job ) and the attributes specified. For DAGJob XML details,see the API Reference chapter . An example DAG Job in a DAX is shown below

  <dag id="ID000003" name="black.dag" node-label="foo" >
    <profile namespace="dagman" key="maxjobs">10</profile>
    <profile namespace="dagman" key="DIR">/dag-dir/test</profile>
  </dag>

10.6.3.1. DAG File Locations

The name attribute in the dag element refers to the LFN ( Logical File Name ) of the dax file. The location of the DAX file can be catalogued either in the

  1. Replica Catalog

  2. Replica Catalog Section in the DAX.

    Note

    Currently, only file url's on the local site ( submit host ) can be specified as DAG file locations.

10.6.3.2. Profiles for DAG Job

Users can choose to specify dagman profiles with the DAX Job to control the behavior of the corresponding condor dagman instance in the executable workflow. In the example above, maxjobs is set to 10 for the sub workflow.

The dagman profile DIR allows users to specify the directory in which they want the condor dagman instance to execute. In the example above black.dag is set to be executed in directory /dag-dir/test . The /dag-dir/test should be created beforehand.

10.6.4. File Dependencies Across DAX Jobs

In hierarchal workflows , if a sub workflow generates some output files required by another sub workflow then there should be an edge connecting the two dax jobs. Pegasus will ensure that the prescript for the child sub-workflow, has the path to the cache file generated during the planning of the parent sub workflow. The cache file in the submit directory for a workflow is a textual replica catalog that lists the locations of all the output files created in the remote workflow execution directory when the workflow executes.

This automatic passing of the cache file to a child sub-workflow ensures that the datasets from the same workflow run are used. However, the passing of the locations in a cache file also ensures that Pegasus will prefer them over all other locations in the Replica Catalog. If you need the Replica Selection to consider locations in the Replica Catalog also, then set the following property.

pegasus.catalog.replica.cache.asrc  true

The above is useful in the case, where you are staging out the output files to a storage site, and you want the child sub workflow to stage these files from the storage output site instead of the workflow execution directory where the files were originally created.

10.6.5. Recursion in Hierarchal Workflows

It is possible for a user to add a dax jobs to a dax that already contain dax jobs in them. Pegasus does not place a limit on how many levels of recursion a user can have in their workflows. From Pegasus perspective recursion in hierarchal workflows ends when a DAX with only compute jobs is encountered . However, the levels of recursion are limited by the system resources consumed by the DAGMan processes that are running (each level of nesting produces another DAGMan process) .

The figure below illustrates an example with recursion 2 levels deep.

Figure 10.12. Recursion in Hierarchal Workflows

Recursion in Hierarchal Workflows

The execution time-line of the various jobs in the above figure is illustrated below.

Figure 10.13. Execution Time-line for Hierarchal Workflows

Execution Time-line for Hierarchal Workflows

10.6.6. Example

The Galactic Plane workflow is a Hierarchical workflow of many Montage workflows. For details, see Workflow of Workflows.

10.7. Notifications

The Pegasus Workflow Mapper now supports job and workflow level notifications. You can specify in the DAX with the job or the workflow

  • the event when the notification needs to be sent

  • the executable that needs to be invoked.

The notifications are issued from the submit host by the pegasus-monitord daemon that monitors the Condor logs for the workflow. When a notification is issued, pegasus-monitord while invoking the notifying executable sets certain environment variables that contain information about the job and workflow state.

The Pegasus release comes with default notification clients that send notifications via email or jabber.

10.7.1. Specifying Notifications in the DAX

Currently, you can specify notifications for the jobs and the workflow by the use of invoke elements.

Invoke elements can be sub elements for the following elements in the DAX schema.

  • job - to associate notifications with a compute job in the DAX.

  • dax - to associate notifications with a dax job in the DAX.

  • dag - to associate notifications with a dag job in the DAX.

  • executable - to associate notifications with a job that uses a particular notification

The invoke element can be specified at the root element level of the DAX to indicate workflow level notifications.

The invoke element may be specified multiple times, as needed. It has a mandatory when attribute with the following value set

Table 10.17. Table 1. Invoke Element attributes and meaning.

Enumeration of Values for when attribute Meaning
never (default). Never notify of anything. This is useful to temporarily disable an existing notifications.
start create a notification when the job is submitted.
on_error after a job finishes with failure (exitcode != 0).
on_success after a job finishes with success (exitcode == 0).
at_end after a job finishes, regardless of exitcode.
all like start and at_end combined.

You can specify multiple invoke elements corresponding to same when attribute value in the DAX. This will allow you to have multiple notifications for the same event.

Here is an example that illustrates that.

<job id="ID000001" namespace="example" name="mDiffFit" version="1.0" 
       node-label="preprocess" >
    <argument>-a top -T 6  -i <file name="f.a"/>  -o <file name="f.b1"/></argument>

    <!-- profiles are optional -->
    <profile namespace="execution" key="site">isi_viz</profile>
    <profile namespace="condor" key="getenv">true</profile>

    <uses name="f.a" link="input"  register="false" transfer="true" type="data" />
    <uses name="f.b" link="output" register="false" transfer="true" type="data" />
    
    <!-- 'WHEN' enumeration: never, start, on_error, on_success, on_end, all -->
    <invoke when="start">/path/to/notify1 arg1 arg2</invoke>
    <invoke when="start">/path/to/notify1 arg3 arg4</invoke>
    <invoke when="on_success">/path/to/notify2 arg3 arg4</invoke>
  </job>

In the above example the executable notify1 will be invoked twice when a job is submitted ( when="start" ), once with arguments arg1 and arg2 and second time with arguments arg3 and arg4.

The DAX Generator API chapter has information about how to add notifications to the DAX using the DAX api's.

10.7.2. Notify File created by Pegasus in the submit directory

Pegasus while planning a workflow writes out a notify file in the submit directory that contains all the notifications that need to be sent for the workflow. pegasus-monitord picks up this notifications file to determine what notifications need to be sent and when.

  1. ENTITY_TYPE ID NOTIFICATION_CONDITION ACTION

    • ENTITY_TYPE can be either of the following keywords

      • WORKFLOW - indicates workflow level notification

      • JOB - indicates notifications for a job in the executable workflow

      • DAXJOB - indicates notifications for a DAX Job in the executable workflow

      • DAGJOB - indicates notifications for a DAG Job in the executable workflow

    • ID indicates the identifier for the entity. It has different meaning depending on the entity type - -

      • workflow - ID is wf_uuid

      • JOB|DAXJOB|DAGJOB - ID is the job identifier in the executable workflow ( DAG ).

    • NOTIFICATION_CONDITION is the condition when the notification needs to be sent. The notification conditions are enumerated in Table 1

    • ACTION is what needs to happen when condition is satisfied. It is executable + arguments

  2. INVOCATION JOB_IDENTIFIER INV.ID NOTIFICATION_CONDITION ACTION

    The INVOCATION lines are only generated for clustered jobs, to specifiy the finer grained notifications for each constitutent job/invocation .

    • JOB IDENTIFIER is the job identifier in the executable workflow ( DAG ).

    • INV.ID indicates the index of the task in the clustered job for which the notification needs to be sent.

    • NOTIFICATION_CONDITION is the condition when the notification needs to be sent. The notification conditions are enumerated in Table 1

    • ACTION is what needs to happen when condition is satisfied. It is executable + arguments

A sample notifications file generated is listed below.

WORKFLOW d2c4f79c-8d5b-4577-8c46-5031f4d704e8 on_error /bin/date1

INVOCATION merge_vahi-preprocess-1.0_PID1_ID1 1 on_success /bin/date_executable
INVOCATION merge_vahi-preprocess-1.0_PID1_ID1 1 on_success /bin/date_executable
INVOCATION merge_vahi-preprocess-1.0_PID1_ID1 1 on_error /bin/date_executable

INVOCATION merge_vahi-preprocess-1.0_PID1_ID1 2 on_success /bin/date_executable
INVOCATION merge_vahi-preprocess-1.0_PID1_ID1 2 on_error /bin/date_executable

DAXJOB subdax_black_ID000003 on_error /bin/date13
JOB    analyze_ID00004    on_success /bin/date

10.7.3. Configuring pegasus-monitord for notifications

Whenever pegasus-monitord enters a workflow (or sub-workflow) directory, it will read the notifications file generated by Pegasus. Pegasus-monitord will match events in the running workflow against the notifications specified in the notifications file and will initiate the script specified in a notification when that notification matches an event in the workflow. It is important to note that there will be a delay between a certain event happening in the workflow, and pegasus-monitord processing the log file and executing the corresponding notification script.

The following command line options (and properties) can change how pegasus-monitord handles notifications:

  • --no-notifications (pegasus.monitord.notifications=False): Will disable notifications completely.

  • --notifications-max=nn (pegasus.monitord.notifications.max=nn): Will limit the number of concurrent notification scripts to nn. Once pegasus-monitord reaches this number, it will wait until one notification script finishes before starting a new one. Notifications happening during this time will be queued by the system. The default number of concurrent notification scripts for pegasus-monitord is 10.

  • --notifications-timeout=nn (pegasus.monitord.notifications.timeout=nn): This setting is used to change how long will pegasus-monitord wait for a notification script to finish. By default pegasus-monitord will wait for as long as it takes (possibly indefinitely) until a notification script ends. With this option, pegasus-monitord will wait for at most nn seconds before killing the notification script.

It is also important to understand that pegasus-monitord will not issue any notifications when it is executed in replay mode.

10.7.3.1. Environment set for the notification scripts

Whenever a notification in the notifications file matches an event in the running workflow, pegasus-monitord will run the corresponding script specified in the ACTION field of the notifications file. Pegasus-monitord will set the following environment variables for each notification script is starts:

  • PEGASUS_EVENT: The NOTIFICATION_CONDITION that caused the notification. In the case of the "all" condition, pegasus-monitord will substitute it for the actual event that caused the match (e.g. "start" or "at_end").

  • PEGASUS_EVENT_TIMESTAMP: Timestamp in EPOCH format for the event (better for automated processing).

  • PEGASUS_EVENT_TIMESTAMP_ISO: Same as above, but in ISO format (better for human readability).

  • PEGASUS_SUBMIT_DIR: The submit directory for the workflow (usually the value from "submit_dir" in the braindump.txt file)

  • PEGASUS_STDOUT: For workflow notifications, this will correspond to the dagman.out file for that workflow. For job and invocation notifications, this field will contain the output file (stdout) for that particular job instance.

  • PEGASUS_STDERR: For job and invocation notifications, this field will contain the error file (stderr) for the particular executable job instance. This field does not exist in case of workflow notifications.

  • PEGASUS_WFID: Contains the workflow id for this notification in the form of DAX_LABEL + DAX_INDEX (from the braindump.txt file).

  • PEGASUS_JOBID: For workflow notifications, this contains the worfkflow wf_uuid (from the braindump.txt file). For job and invocation notifications, this field contains the job identifier in the executable workflow ( DAG ) for the particular notification.

  • PEGASUS_INVID: Contains the index of the task in the clustered job for the notification.

  • PEGASUS_STATUS: For workflow notifications, this contains DAGMan's exit code. For job and invocation notifications, this field contains the exit code for the particular job/task. Please note that this field is not present for 'start' notification events.

10.7.4. Default Notification Scripts

Pegasus ships with two reference notification scripts. These can be used as starting point when creating your own notification scripts, or if the default one is all you need, you can use them directly in your workflows. The scripts are:

  • libexec/notification/email - sends email, including the output from pegasus-status (default) or pegasus-analyzer.

    $ ./libexec/notification/email --help
    Usage: email [options]
    
    Options:
      -h, --help            show this help message and exit
      -t TO_ADDRESS, --to=TO_ADDRESS
                            The To: email address. Defines the recipient for the
                            notification.
      -f FROM_ADDRESS, --from=FROM_ADDRESS
                            The From: email address. Defaults to the required To:
                            address.
      -r REPORT, --report=REPORT
                            Include workflow report. Valid values are: none
                            pegasus-analyzer pegasus-status (default)
    
  • libexec/notification/jabber - sends simple notifications to Jabber/GTalk. This can be useful for job failures.

    $ ./libexec/notification/jabber --help
    Usage: jabber [options]
    
    Options:
      -h, --help            show this help message and exit
      -i JABBER_ID, --jabberid=JABBER_ID
                            Your jabber id. Example: user@jabberhost.com
      -p PASSWORD, --password=PASSWORD
                            Your jabber password
      -s HOST, --host=HOST  Jabber host, if different from the host in your jabber
                            id. For Google talk, set this to talk.google.com
      -r RECIPIENT, --recipient=RECIPIENT
                            Jabber id of the recipient. Not necessary if you want
                            to send to your own jabber id
    

For example, if the DAX generator is written in Python and you want notifications on 'at_end' events (successful or failed):

# job level notifications - in this case for at_end events
job.invoke('at_end', pegasus_home + "/libexec/notifications/email --to me@somewhere.edu")

Please see the notifications example to see a full workflow using notifications.

10.8. Monitoring

Pegasus launches a monitoring daemon called pegasus-monitord per workflow ( a single daemon is launched if a user submits a hierarchal workflow ) . pegasus-monitord parses the workflow and job logs in the submit directory and populates to a database. This chapter gives an overview of the pegasus-monitord and describes the schema of the runtime database.

10.8.1. pegasus-monitord

Pegasus-monitord is used to follow workflows, parsing the output of DAGMan's dagman.out file. In addition to generating the jobstate.log file, which contains the various states that a job goes through during the workflow execution, pegasus-monitord can also be used to mine information from jobs' submit and output files, and either populate a database, or write a file with NetLogger events containing this information. Pegasus-monitord can also send notifications to users in real-time as it parses the workflow execution logs.

Pegasus-monitord is automatically invoked by pegasus-run, and tracks workflows in real-time. By default, it produces the jobstate.log file, and a SQLite database, which contains all the information listed in the Stampede schema. When a workflow fails, and is re-submitted with a rescue DAG, pegasus-monitord will automatically pick up from where it left previously and continue to write the jobstate.log file and populate the database.

If, after the workflow has already finished, users need to re-create the jobstate.log file, or re-populate the database from scratch, pegasus-monitord's --replay option should be used when running it manually.

10.8.1.1. Populating to different backend databases

In addition to SQLite, pegasus-monitord supports other types of databases, such as MySQL and Postgres. Users will need to install the low-level database drivers, and can use the --dest command-line option, or the pegasus.monitord.output property to select where the logs should go.

As an example, the command:

$ pegasus-monitord -r diamond-0.dag.dagman.out

will launch pegasus-monitord in replay mode. In this case, if a jobstate.log file already exists, it will be rotated and a new file will be created. It will also create/use a SQLite database in the workflow's run directory, with the name of diamond-0.stampede.db. If the database already exists, it will make sure to remove any references to the current workflow before it populates the database. In this case, pegasus-monitord will process the workflow information from start to finish, including any restarts that may have happened.

Users can specify an alternative database for the events, as illustrated by the following examples:

$ pegasus-monitord -r -d mysql://username:userpass@hostname/database_name diamond-0.dag.dagman.out
$ pegasus-monitord -r -d sqlite:////tmp/diamond-0.db diamond-0.dag.dagman.out

In the first example, pegasus-monitord will send the data to the database_name database located at server hostname, using the username and userpass provided. In the second example, pegasus-monitord will store the data in the /tmp/diamond-0.db SQLite database.

Note

For absolute paths four slashes are required when specifying an alternative database path in SQLite.

Users should also be aware that in all cases, with the exception of SQLite, the database should exist before pegasus-monitord is run (as it creates all needed tables but does not create the database itself).

Finally, the following example:

$ pegasus-monitord -r --dest diamond-0.bp diamond-0.dag.dagman.out

sends events to the diamond-0.bp file. (please note that in replay mode, any data on the file will be overwritten).

One important detail is that while processing a workflow, pegasus-monitord will automatically detect if/when sub-workflows are initiated, and will automatically track those sub-workflows as well. In this case, although pegasus-monitord will create a separate jobstate.log file in each workflow directory, the database at the top-level workflow will contain the information from not only the main workflow, but also from all sub-workflows.

10.8.1.2. Monitoring related files in the workflow directory

Pegasus-monitord generates a number of files in each workflow directory:

  • jobstate.log: contains a summary of workflow and job execution.

  • monitord.log: contains any log messages generated by pegasus-monitord. It is not overwritten when it restarts. This file is not generated in replay mode, as all log messages from pegasus-monitord are output to the console. Also, when sub-workflows are involved, only the top-level workflow will have this log file. Starting with release 4.0 and 3.1.1, monitord.log file is rotated if it exists already.

  • monitord.started: contains a timestamp indicating when pegasus-monitord was started. This file get overwritten every time pegasus-monitord starts.

  • monitord.done: contains a timestamp indicating when pegasus-monitord finished. This file is overwritten every time pegasus-monitord starts.

  • monitord.info: contains pegasus-monitord state information, which allows it to resume processing if a workflow does not finish properly and a rescue dag is submitted. This file is erased when pegasus-monitord is executed in replay mode.

  • monitord.recover: contains pegasus-monitord state information that allows it to detect that a previous instance of pegasus-monitord failed (or was killed) midway through parsing a workflow's execution logs. This file is only present while pegasus-monitord is running, as it is deleted when it ends and the monitord.info file is generated.

  • monitord.subwf.db: contains information that aids pegasus-monitord to track when sub-workflows fail and are re-planned/re-tried. It is overwritten when pegasus-monitord is started in replay mode.

  • monitord-notifications.log: contains the log file for notification-related messages. Normally, this file only includes logs for failed notifications, but can be populated with all notification information when pegasus-monitord is run in verbose mode via the -v command-line option.

10.8.2. Overview of the Stampede Database Schema.

Pegasus takes in a DAX which is composed of tasks. Pegasus plans it into a Condor DAG / Executable workflow that consists of Jobs. In case of Clustering, multiple tasks in the DAX can be captured into a single job in the Executable workflow. When DAGMan executes a job, a job instance is populated . Job instances capture information as seen by DAGMan. In case DAGMan retires a job on detecting a failure , a new job instance is populated. When DAGMan finds a job instance has finished , an invocation is associated with job instance. In case of clustered job, multiple invocations will be associated with a single job instance. If a Pre script or Post Script is associated with a job instance, then invocations are populated in the database for the corresponding job instance.

The current schema version is 4.0 that is stored in the schema_info table.

Figure 10.14. Stampede Database Schema

Stampede Database Schema

10.8.2.1. Stampede Schema Upgrade Tool

Starting Pegasus 4.x the monitoring and statistics database schema has changed. If you want to use the pegasus-statistics, pegasus-analyzer and pegasus-plots against a 3.x database you will need to upgrade the schema first using the schema upgrade tool /usr/share/pegasus/sql/schema_tool.py or /path/to/pegasus-4.x/share/pegasus/sql/schema_tool.py

Upgrading the schema is required for people using the MySQL database for storing their monitoring information if it was setup with 3.x monitoring tools.

If your setup uses the default SQLite database then the new databases run with Pegasus 4.x are automatically created with the correct schema. In this case you only need to upgrade the SQLite database from older runs if you wish to query them with the newer clients.

To upgrade the database

For SQLite Database

cd /to/the/workflow/directory/with/3.x.monitord.db

Check the db version

/usr/share/pegasus/sql/schema_tool.py -c connString=sqlite:////to/the/workflow/directory/with/workflow.stampede.db
2012-02-29T01:29:43.330476Z INFO   netlogger.analysis.schema.schema_check.SchemaCheck.init | 
2012-02-29T01:29:43.330708Z INFO   netlogger.analysis.schema.schema_check.SchemaCheck.check_schema.start | 
2012-02-29T01:29:43.348995Z INFO   netlogger.analysis.schema.schema_check.SchemaCheck.check_schema 
                                   | Current version set to: 3.1. 
2012-02-29T01:29:43.349133Z ERROR  netlogger.analysis.schema.schema_check.SchemaCheck.check_schema 
                                   | Schema version 3.1 found - expecting 4.0 - database admin will need to run upgrade tool.


Convert the Database to be version 4.x compliant

/usr/share/pegasus/sql/schema_tool.py -u connString=sqlite:////to/the/workflow/directory/with/workflow.stampede.db
2012-02-29T01:35:35.046317Z INFO   netlogger.analysis.schema.schema_check.SchemaCheck.init | 
2012-02-29T01:35:35.046554Z INFO   netlogger.analysis.schema.schema_check.SchemaCheck.check_schema.start | 
2012-02-29T01:35:35.064762Z INFO   netlogger.analysis.schema.schema_check.SchemaCheck.check_schema 
                                  | Current version set to: 3.1. 
2012-02-29T01:35:35.064902Z ERROR  netlogger.analysis.schema.schema_check.SchemaCheck.check_schema 
                                  | Schema version 3.1 found - expecting 4.0 - database admin will need to run upgrade tool. 
2012-02-29T01:35:35.065001Z INFO   netlogger.analysis.schema.schema_check.SchemaCheck.upgrade_to_4_0 
                                  | Upgrading to schema version 4.0.

Verify if the database has been converted to Version 4.x

/usr/share/pegasus/sql/schema_tool.py -c connString=sqlite:////to/the/workflow/directory/with/workflow.stampede.db
2012-02-29T01:39:17.218902Z INFO   netlogger.analysis.schema.schema_check.SchemaCheck.init | 
2012-02-29T01:39:17.219141Z INFO   netlogger.analysis.schema.schema_check.SchemaCheck.check_schema.start | 
2012-02-29T01:39:17.237492Z INFO   netlogger.analysis.schema.schema_check.SchemaCheck.check_schema | Current version set to: 4.0. 
2012-02-29T01:39:17.237624Z INFO   netlogger.analysis.schema.schema_check.SchemaCheck.check_schema | Schema up to date. 

For upgrading a MySQL database the steps remain the same. The only thing that changes is the connection String to the database
E.g.

/usr/share/pegasus/sql/schema_tool.py -u connString=mysql://username:password@server:port/dbname

After the database has been upgraded you can use either 3.x or 4.x clients to query the database with pegasus-statistics, as well as pegasus-plots and pegasus-analyzer.

10.8.2.2. Storing of Exitcode in the database

Kickstart records capture raw status in addition to the exitcode . The exitcode is derived from the raw status. Starting with Pegasus 4.0 release, all exitcode columns ( i.e invocation and job instance table columns ) are stored with the raw status by pegasus-monitord. If an exitcode is encountered while parsing the dagman log files , the value is converted to the corresponding raw status before it is stored. All user tools, pegasus-analyzer and pegasus-statistics then convert the raw status to exitcode when retrieving from the database.

10.8.2.3. Multiplier Factor

Starting with the 4.0 release, there is a multiplier factor associated with the jobs in the job_instance table. It defaults to one, unless the user associates a Pegasus profile key named cores with the job in the DAX. The factor can be used for getting more accurate statistics for jobs that run on multiple processors/cores or mpi jobs.

The multiplier factor is used for computing the following metrics by pegasus statistics.

  • In the summary, the workflow cumulative job walltime

  • In the summary, the cumulative job walltime as seen from the submit side

  • In the jobs file, the multiplier factor is listed along-with the multiplied kickstart time.

  • In the breakdown file, where statistics are listed per transformation the mean, min , max and average values take into account the multiplier factor.

10.9. API Reference

10.9.1. DAX XML Schema

The DAX format is described by the XML schema instance document dax-3.3.xsd. A local copy of the schema definition is provided in the etc directory. The documentation of the XML schema and its elements can be found in dax-3.3.html as well as locally in doc/schemas/dax-3.3/dax-3.3.html in your Pegasus distribution.

10.9.1.1. DAX XML Schema In Detail

The DAX file format has four major sections, with the second section divided into more sub-sections. The DAX format works on the abstract or logical level, letting you focus on the shape of the workflows, what to do and what to work upon.

  1. Workflow-level Notifications

    Very simple workflow-level notifications. These are defined in the Notification section.

  2. Catalogs

    The first section deals with included catalogs. While we do recommend to use external replica- and transformation catalogs, it is possible to include some replicas and transformations into the DAX file itself. Any DAX-included entry takes precedence over regular replica catalog (RC) and transformation catalog (TC) entries.

    The first section (and any of its sub-sections) is completely optional.

    1. The first sub-section deals with included replica descriptions.

    2. The second sub-section deals with included transformation descriptions.

    3. The third sub-section declares multi-item executables.

  3. Job List

    The jobs section defines the job- or task descriptions. For each task to conduct, a three-part logical name declares the task and aides identifying it in the transformation catalog or one of the executable section above. During planning, the logical name is translated into the physical executable location on the chosen target site. By declaring jobs abstractly, physical layout consideration of the target sites do not matter. The job's id uniquley identifies the job within this workflow.

    The arguments declare what command-line arguments to pass to the job. If you are passing filenames, you should refer to the logical filename using the file element in the argument list.

    Important for properly planning the task is the list of files consumed by the task, its input files, and the files produced by the task, its output files. Each file is described with a uses element inside the task.

    Elements exist to link a logical file to any of the stdio file descriptors. The profile element is Pegasus's way to abstract site-specific data.

    Jobs are nodes in the workflow graph. Other nodes include unplanned workflows (DAX), which are planned and then run when the node runs, and planned workflows (DAG), which are simply executed.

  4. Control-flow Dependencies

    The third section lists the dependencies between the tasks. The relationships are defined as child parent relationships, and thus impacts the order in which tasks are run. No cyclic dependencies are permitted.

    Dependencies are directed edges in the workflow graph.

10.9.1.1.1. XML Intro

If you have seen the DAX schema before, not a lot of new items in the root element. However, we did retire the (old) attributes ending in Count.

<?xml version="1.0" encoding="UTF-8"?>
<!-- generated: 2011-07-28T18:29:57Z -->
<adag xmlns="http://pegasus.isi.edu/schema/DAX" 
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://pegasus.isi.edu/schema/DAX http://pegasus.isi.edu/schema/dax-3.3.xsd" 
      version="3.3" 
      name="diamond" 
      index="0" 
      count="1">

The following attributes are supported for the root element adag.

Table 10.18. 

attribute optional? type meaning
version required VersionPattern Version number of DAX instance document. Must be 3.3.
name required string name of this DAX (or set of DAXes).
count optional positiveInteger size of list of DAXes with this name. Defaults to 1.
index optional nonNegativeInteger current index of DAX with same name. Defaults to 0.
fileCount removed nonNegativeInteger Old 2.1 attribute, removed, do not use.
jobCount removed positiveInteger Old 2.1 attribute, removed, do not use.
childCount removed nonNegativeInteger Old 2.1 attribute, removed, do not use.

The version attribute is restricted to the regular expression \d+(\.\d+(\.\d+)?)?.This expression represents the VersionPattern type that is used in other places, too. It is a more restrictive expression than before, but allows us to compute comparable version number using the following formula:

version1: a.b.c version2: d.e.f
n = a * 1,000,000 + b * 1,000 + c m = d * 1,000,000 + e * 1,000 + f
version1 > version2 if n > m
10.9.1.1.2. Workflow-level Notifications

(something to be said here.)

  <!-- part 1.1: invocations -->
  <invoke when="at_end">/bin/date -Ins &gt;&gt; my.log</invoke>

The above snippet will append the current time to a log file in the current directory. This is with regards to the monitord instance acting on the notification.

10.9.1.1.3. The Catalogs Section

The initial section features three sub-sections:

  1. a catalog of files used,

  2. a catalog of transformations used, and

  3. compound transformation declarations.

10.9.1.1.3.1. The Replica Catalog Section

The file section acts as in in-file replica catalog (RC). Any files declared in this section take precedence over files in external replica catalogs during planning.

  <!-- part 1.2: included replica catalog -->
  <file name="example.a" >
    <!-- profiles are optional -->
    <!-- The "stat" namespace is ONLY AN EXAMPLE -->
    <profile namespace="stat" key="size">/* integer to be defined */</profile>
    <profile namespace="stat" key="md5sum">/* 32 char hex string */</profile>
    <profile namespace="stat" key="mtime">/* ISO-8601 timestamp */</profile>

    <!-- metadata is currently NOT SUPPORTED -->
    <metadata key="timestamp" type="int">/* ISO-8601 *or* 20100417134523:int */</metadata>
    <metadata key="origin" type="string">ocean</metadata>
    
    <!-- PFN to by-pass replica catalog -->
    <!-- The "site attribute is optional -->
    <pfn url="file:///tmp/example.a" site="local">
      <profile namespace="stat" key="owner">voeckler</profile>
    </pfn>
    <pfn url="file:///storage/funky.a" site="local"/>    
  </file>

  <!-- a more typical example from the black diamond -->
  <file name="f.a">
    <pfn url="file:///Users/voeckler/f.a" site="local"/>
  </file>

The first file entry above is an example of a data file with two replicas. The file element requires a logical file name. Each logical filename may have additional information associated with it, enumerated by profile elements. Each file entry may have 0 or more metadata associated with it. Each piece of metadata has a key string and type attribute describing the element's value.

Warning

The metadata element is not support as of this writing! Details may change in the future.

The file element can provide 0 or more pfn locations, taking precedence over the replica catalog. A file element that does not name any pfn children-elements will still require look-ups in external replica catalogs. Each pfn element names a concrete location of a file. Multiple locations constitute replicas of the same file, and are assumed to be usable interchangably. The url attribute is mandatory, and typically would use a file schema URL. The site attribute is optional, and defaults to value local if missing. A pfn element may have profile children-elements, which refer to attributes of the physical file. The file-level profiles refer to attributes of the logical file.

Note

The stat profile namespace is ony an example, and details about stat are not yet implemented. The proper namespaces pegasus, condor, dagman, env, hints, globus and selector enjoy full support.

The second file entry above shows a usage example from the black-diamond example workflow that you are more likely to encouter or write.

The presence of an in-file replica catalog lets you declare a couple of interesting advanced features. The DAG and DAX file declarations are just files for all practical purposes. For deferred planning, the location of the site catalog (SC) can be captured in a file, too, that is passed to the job dealing with the deferred planning as logical filename.

  <file name="black.dax" >
    <!-- specify the location of the DAX file -->
    <pfn url="file:///Users/vahi/Pegasus/work/dax-3.0/blackdiamond_dax.xml" site="local"/>
  </file>

  <file name="black.dag" >
    <!-- specify the location of the DAG file -->
    <pfn url="file:///Users/vahi/Pegasus/work/dax-3.0/blackdiamond.dag" site="local"/>
  </file>
  
  <file name="sites.xml" >
    <!-- specify the location of a site catalog to use for deferred planning -->
    <pfn url="file:///Users/vahi/Pegasus/work/dax-3.0/conf/sites.xml" site="local"/>
  </file>
10.9.1.1.3.. The Transformation Catalog Section

The executable section acts as an in-file transformation catalog (TC). Any transformations declared in this section take precedence over the external transformation catalog during planning.

  <!-- part 1.3: included transformation catalog -->
  <executable namespace="example" name="mDiffFit" version="1.0" 
              arch="x86_64" os="linux" installed="true" >
    <!-- profiles are optional -->
    <!-- The "stat" namespace is ONLY AN EXAMPLE! -->
    <profile namespace="stat" key="size">5000</profile>
    <profile namespace="stat" key="md5sum">AB454DSSDA4646DS</profile>
    <profile namespace="stat" key="mtime">2010-11-22T10:05:55.470606000-0800</profile>

    <!-- metadata is currently NOT SUPPORTED! -->
    <metadata key="timestamp" type="int">/* see above */</metadata>
    <metadata key="origin" type="string">ocean</metadata>
 
    <!-- PFN to by-pass transformation catalog -->
    <!-- The "site" attribute is optional -->
    <pfn url="file:///tmp/mDiffFit"          site="local"/>     
    <pfn url="file:///tmp/storage/mDiffFit"  site="local"/>     
  </executable>

  <!-- to be used in compound transformation later -->
  <executable namespace="example" name="mDiff" version="1.0" 
              arch="x86_64" os="linux" installed="true" >
    <pfn url="file:///tmp/mDiff" site="local"/>        
  </executable>

  <!-- to be used in compound transformation later -->
  <executable namespace="example" name="mFitplane" version="1.0"
              arch="x86_64" os="linux" installed="true" >
    <pfn url="file:///tmp/mDiffFitplane"  site="local">
      <profile namespace="stat" key="md5sum">0a9c38b919c7809cb645fc09011588a6</profile>
    </pfn>
    <invoke when="at_end">/path/to/my_send_email some args</invoke>
  </executable>

  <!-- a more likely example from the black diamond -->
  <executable namespace="diamond" name="preprocess" version="2.0" 
              arch="x86_64"
              os="linux" 
              osversion="2.6.18">
    <pfn url="file:///opt/pegasus/default/bin/keg" site="local" />
  </executable>

Logical filenames pertaining to a single executables in the transformation catalog use the executable element. Any executable element features the optional namespace attribute, a mandatory name attribute, and an optional version attribute. The version attribute defaults to "1.0" when absent. An executable typically needs additional attributes to describe it properly, like the architecture, OS release and other flags typically seen with transformations, or found in the transformation catalog.

Table 10.19. 

attribute optional? type meaning
name required string logical transformation name
namespace optional string namespace of logical transformation, default to null value.
version optional VersionPattern version of logical transformation, defaults to "1.0".
installed optional boolean whether to stage the file (false), or not (true, default).
arch optional Architecture restricted set of tokens, see schema definition file.
os optional OSType restricted set of tokens, see schema definition file.
osversion optional VersionPattern kernel version as beginning of `uname -r`.
glibc optional VersionPattern version of libc.

The rationale for giving these flags in the executable element header is that PFNs are just identical replicas or instances of a given LFN. If you need a different 32/64 bit-ed-ness or OS release, the underlying PFN would be different, and thus the LFN for it should be different, too.

Note

We are still discussing some details and implications of this decision.

The initial examples come with the same caveats as for the included replica catalog.

Warning

The metadata element is not support as of this writing! Details may change in the future.

Similar to the replica catalog, each executable element may have 0 or more profile elements abstracting away site-specific details, zero or more metadata elements, and zero or more pfn elements. If there are no pfn elements, the transformation must still be searched for in the external transformation catalog. As before, the pfn element may have profile children-elements, referring to attributes of the physical filename itself.

Each executable element may also feature invoke elements. These enable notifications at the appropriate point when every job that uses this executable reaches the point of notification. Please refer to the notification section for details and caveats.

The last example above comes from the black diamond example workflow, and presents the kind and extend of attributes you are most likely to see and use in your own workflows.

10.9.1.1.3.3. The Compound Transformation Section

The compound transformation section declares a transformation that comprises multiple plain transformation. You can think of a compound transformation like a script interpreter and the script itself. In order to properly run the application, you must start both, the script interpreter and the script passed to it. The compound transformation helps Pegasus to properly deal with this case, especially when it needs to stage executables.

  <transformation namespace="example" version="1.0" name="mDiffFit" >
    <uses name="mDiffFit" />
    <uses name="mDiff" namespace="example" version="2.0" />
    <uses name="mFitPlane" />
    <uses name="mDiffFit.config" executable="false" />
  </transformation>

A transformation element declares a set of purely logical entities, executables and config (data) files, that are all required together for the same job. Being purely logical entities, the lookup happens only when the transformation element is referenced (or instantiated) by a job element later on.

The namespace and version attributes of the transformation element are optional, and provide the defaults for the inner uses elements. They are also essential for matching the transformation with a job.

The transformation is made up of 1 or more uses element. Each uses has a boolean attribute executable, true by default, or false to indicate a data file. The name is a mandatory attribute, refering to an LFN declared previously in the File Catalog (executable is false), Executable Catalog (executable is true), or to be looked up as necessary at instantiation time. The lookup catalog is determined by the executable attribute.

After uses elements, any number of invoke elements may occur to add a notification each whenever this transformation is instantiated.

The namespace and version attributes' default values inside uses elements are inherited from the transformation attributes of the same name. There is no such inheritance for uses elements with executable attribute of false.

10.9.1.1.4. Graph Nodes

The nodes in the DAX comprise regular job nodes, already instantiated sub-workflows as dag nodes, and still to be instantiated dax nodes. Each of the graph nodes can has a mandatory id attribute. The id attribute is currently a restriction of type NodeIdentifierPattern type, which is a restriction of the xs:NMTOKEN type to letters, digits, hyphen and underscore.

The level attribute is deprecated, as the planner will trust its own re-computation more than user input. Please do not use nor produce any level attribute.

The node-label attribute is optional. It applies to the use-case when every transformation has the same name, but its arguments determine what it really does. In the presence of a node-label value, a workflow grapher could use the label value to show graph nodes to the user. It may also come in handy while debugging.

Any job-like graph node has the following set of children elements, as defined in the AbstractJobType declaration in the schema definition:

  • 0 or 1 argument element to declare the command-line of the job's invocation.

  • 0 or more profile elements to abstract away site-specific or job-specific details.

  • 0 or 1 stdin element to link a logical file the the job's standard input.

  • 0 or 1 stdout element to link a logical file to the job's standard output.

  • 0 or 1 stderr element to link a logical file to the job's standard error.

  • 0 or more uses elements to declare consumed data files and produced data files.

  • 0 or more invoke elements to solicit notifications whence a job reaches a certain state in its life-cycle.

10.9.1.1.4.1. Job Nodes

A job element has a number of attributes. In addition to the id and node-label described in (Graph Nodes)above, the optional namespace, mandatory name and optional version identify the transformation, and provide the look-up handle: first in the DAX's transformation elements, then in the executable elements, and finally in an external transformation catalog.

  <!-- part 2: definition of all jobs (at least one) -->
  <job id="ID000001" namespace="example" name="mDiffFit" version="1.0" 
       node-label="preprocess" >
    <argument>-a top -T 6  -i <file name="f.a"/>  -o <file name="f.b1"/></argument>

    <!-- profiles are optional -->
    <profile namespace="execution" key="site">isi_viz</profile>
    <profile namespace="condor" key="getenv">true</profile>

    <uses name="f.a" link="input"  register="false" transfer="true" type="data" />
    <uses name="f.b" link="output" register="false" transfer="true" type="data" />
    
    <!-- 'WHEN' enumeration: never, start, on_error, on_success, on_end, all -->
    <!-- PEGASUS_* env-vars: event, status, submit dir, wf/job id, stdout, stderr -->
    <invoke when="start">/path/to arg arg</invoke>
    <invoke when="on_success"><![CDATA[/path/to arg arg]]></invoke>
    <invoke when="on_end"><![CDATA[/path/to arg arg]]></invoke>
  </job>

The argument element contains the complete command-line that is needed to invoke the executable. The only variable components are logical filenames, as included file elements.

The profile argument lets you encapsulate site-specific knowledge .

The stdin, stdout and stderr element permits you to connect a stdio file descriptor to a logical filename. Note that you will still have to declare these files in the uses section below.

The uses element enumerates all the files that the task consumes or produces. While it is not necessary nor required to have all files appear on the command-line, it is imperative that you declare even hidden files that your task requires in this section, so that the proper ancilliary staging- and clean-up tasks can be generated during planning.

The invoke element may be specified multiple times, as needed. It has a mandatory when attribute with the following value set:

Table 10.20. 

keyword job life-cycle state meaning
never never (default). Never notify of anything. This is useful to temporarily disable an existing notifications.
start submit create a notification when the job is submitted.
on_error end after a job finishes with failure (exitcode != 0).
on_success end after a job finishes with success (exitcode == 0).
at_end end after a job finishes, regardless of exitcode.
all always like start and at_end combined.

Warning

In clustered jobs, a notification can only be sent at the start or end of the clustered job, not for each member.

Each invoke is a simple local invocation of an executable or script with the specified arguments. The executable inside the invoke body will see the following environment variables:

Table 10.21. 

variable job life-cycle state meaning
PEGASUS_EVENT always The value of the when attribute
PEGASUS_STATUS end The exit status of the graph node. Only available for end notifications.
PEGASUS_SUBMIT_DIR always In which directory to find the job (or workflow).
PEGASUS_JOBID always The job (or workflow) identifier. This is potentially more than merely the value of the id attribute.
PEGASUS_STDOUT always The filename where stdout goes. Empty and possibly non-existent at submit time (though we still have the filename). The kickstart record for job nodes.
PEGASUS_STDERR always The filename where stderr goes. Empty and possibly non-existent at submit time (though we still have the filename).

Generators should use CDATA encapsulated values to the invoke element to minimize interference. Unfortunately, CDATA cannot be nested, so if the user invocation contains a CDATA section, we suggest that they use careful XML-entity escaped strings. The notifications section describes these in further detail.

10.9.1.1.4.2. DAG Nodes

A workflow that has already been concretized, either by an earlier run of Pegasus, or otherwise constructed for DAGMan execution, can be included into the current workflow using the dag element.

  <dag id="ID000003" name="black.dag" node-label="foo" >
    <profile namespace="dagman" key="DIR">/dag-dir/test</profile>
    <invoke> <!-- optional, should be possible --> </invoke>
    <uses file="sites.xml" link="input" register="false" transfer="true" type="data"/>     
  </dag>

The id and node-label attributes were described previously. The name attribute refers to a file from the File Catalog that provides the actual DAGMan DAG as data content. The dag element features optional profile elements. These would most likely pertain to the dagman and env profile namespaces. It should be possible to have the optional notify element in the same manner as for jobs.

A graph node that is a dag instead of a job would just use a different submit file generator to create a DAGMan invocation. There can be an argument element to modify the command-line passed to DAGMan.

10.9.1.1.4.3. DAX Nodes

A still to be planned workflow incurs an invocation of the Pegasus planner as part of the workflow. This still abstract sub-workflow uses the dax element.

  <dax id="ID000002" name="black.dax" node-label="bar" >
    <profile namespace="env" key="foo">bar</profile>
    <argument>-Xmx1024 -Xms512 -Dpegasus.dir.storage=storagedir  -Dpegasus.dir.exec=execdir -o local --dir ./datafind -vvvvv --force -s dax_site </argument>
    <invoke> <!-- optional, may not be possible here --> </invoke>
    <uses file="sites.xml" link="input" register="false" transfer="true" type="data" />
  </dax>

In addition to the id and node-label attributes, See Graph Nodes. The name attribute refers to a file from the File Catalog that provides the to be planned DAX as external file data content. The dax element features optional profile elements. These would most likely pertain to the pegasus, dagman and env profile namespaces. It may be possible to have the optional notify element in the same manner as for jobs.

A graph node that is a dax instead of a job would just use yet another submit file and pre-script generator to create a DAGMan invocation. The argument string pertains to the command line of the to-be-generated DAGMan invocation.

10.9.1.1.4.4. Inner ADAG Nodes

While completeness would argue to have a recursive nesting of adag elements, such recursive nestings are currently not supported, not even in the schema. If you need to nest workflows, please use the dax or dag element to achieve the same goal.

10.9.1.1.5. The Dependency Section

This section describes the dependencies between the jobs.

  <!-- part 3: list of control-flow dependencies -->
  <child ref="ID000002">
    <parent ref="ID000001" edge-label="edge1" />
  </child>
  <child ref="ID000003">
    <parent ref="ID000001" edge-label="edge2" />
  </child>
  <child ref="ID000004">
    <parent ref="ID000002" edge-label="edge3" />
    <parent ref="ID000003" edge-label="edge4" />
  </child>

Each child element contains one or more parent element. Either element refers to a job, dag or dax element id attribute using the ref attribute. In this version, we relaxed the xs:IDREF constraint in favor of a restriction on the xs:NMTOKEN type to permit a larger set of identifiers.

The parent element has an optional edge-label attribute.

Warning

The edge-label attribute is currently unused.

Its goal is to annotate edges when drawing workflow graphs.

10.9.1.1.6. Closing

As any XML element, the root element needs to be closed.

</adag>

10.9.1.2. DAX XML Schema Example

The following code example shows the XML instance document representing the diamond workflow.

<?xml version="1.0" encoding="UTF-8"?>
<adag xmlns="http://pegasus.isi.edu/schema/DAX"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://pegasus.isi.edu/schema/DAX http://pegasus.isi.edu/schema/dax-3.3.xsd"
 version="3.3" name="diamond" index="0" count="1">
  <!-- part 1.1: invocations -->
  <invoke when="on_error">/bin/mailx -s &apos;diamond failed&apos; use@some.domain</invoke>

  <!-- part 1.2: included replica catalog -->
  <file name="f.a">
    <pfn url="file:///lfs/voeckler/src/svn/pegasus/trunk/examples/grid-blackdiamond-perl/f.a" site="local" />
  </file>

  <!-- part 1.3: included transformation catalog -->
  <executable namespace="diamond" name="preprocess" version="2.0" arch="x86_64" os="linux" installed="false">
    <profile namespace="globus" key="maxtime">2</profile>
    <profile namespace="dagman" key="RETRY">3</profile>
    <pfn url="file:///opt/pegasus/latest/bin/keg" site="local" />
  </executable>
  <executable namespace="diamond" name="analyze" version="2.0" arch="x86_64" os="linux" installed="false">
    <profile namespace="globus" key="maxtime">2</profile>
    <profile namespace="dagman" key="RETRY">3</profile>
    <pfn url="file:///opt/pegasus/latest/bin/keg" site="local" />
  </executable>
  <executable namespace="diamond" name="findrange" version="2.0" arch="x86_64" os="linux" installed="false">
    <profile namespace="globus" key="maxtime">2</profile>
    <profile namespace="dagman" key="RETRY">3</profile>
    <pfn url="file:///opt/pegasus/latest/bin/keg" site="local" />
  </executable>

  <!-- part 2: definition of all jobs (at least one) -->
  <job namespace="diamond" name="preprocess" version="2.0" id="ID000001">
    <argument>-a preprocess -T60 -i <file name="f.a" /> -o <file name="f.b1" /> <file name="f.b2" /></argument>
    <uses name="f.b2" link="output" register="false" transfer="true" />
    <uses name="f.b1" link="output" register="false" transfer="true" />
    <uses name="f.a" link="input" />
  </job>
  <job namespace="diamond" name="findrange" version="2.0" id="ID000002">
    <argument>-a findrange -T60 -i <file name="f.b1" /> -o <file name="f.c1" /></argument>
    <uses name="f.b1" link="input" register="false" transfer="true" />
    <uses name="f.c1" link="output" register="false" transfer="true" />
  </job>
  <job namespace="diamond" name="findrange" version="2.0" id="ID000003">
    <argument>-a findrange -T60 -i <file name="f.b2" /> -o <file name="f.c2" /></argument>
    <uses name="f.b2" link="input" register="false" transfer="true" />
    <uses name="f.c2" link="output" register="false" transfer="true" />
  </job>
  <job namespace="diamond" name="analyze" version="2.0" id="ID000004">
    <argument>-a analyze -T60 -i <file name="f.c1" /> <file name="f.c2" /> -o <file name="f.d" /></argument>
    <uses name="f.c2" link="input" register="false" transfer="true" />
    <uses name="f.d" link="output" register="false" transfer="true" />
    <uses name="f.c1" link="input" register="false" transfer="true" />
  </job>

  <!-- part 3: list of control-flow dependencies -->
  <child ref="ID000002">
    <parent ref="ID000001" />
  </child>
  <child ref="ID000003">
    <parent ref="ID000001" />
  </child>
  <child ref="ID000004">
    <parent ref="ID000002" />
    <parent ref="ID000003" />
  </child>
</adag>

The above workflow defines the black diamond from the abstract workflow section of the Introduction chapter. It will require minimal configuration, because the catalog sections include all necessary declarations.

The file element defines the location of the required input file in terms of the local machine. Please note that

  • The file element declares the required input file "f.a" in terms of the local machine. Please note that if you plan the workflow for a remote site, the has to be some way for the file to be staged from the local site to the remote site. While Pegasus will augment the workflow with such ancillary jobs, the site catalog as well as local and remote site have to be set up properlyl. For a locally run workflow you don't need to do anything.

  • The executable elements declare the same executable keg that is to be run for each the logical transformation in terms of the remote site futuregrid. To declare it for a local site, you would have to adjust the site attribute's value to local. This section also shows that the same executable may come in different guises as transformation.

  • The job elements define the workflow's logical constituents, the way to invoke the keg command, where to put filenames on the commandline, and what files are consumed or produced. In addition to the direction of files, further attributes determine whether to register the file with a replica catalog and whether to transfer it to the output site in case of a product. We are only interested in the final data product "f.d" in this workflow, and not any intermediary files. Typically, you would also want to register the data products in the replica catalog, especially in larger scenarios.

  • The child elements define the control flow between the jobs.

10.9.2. DAX Generator API

The DAX generating APIs support Java, Perl and Python. This section will show in each language the necessary code, using Pegasus-provided libraries, to generate the diamond DAX example above. There may be minor differences in details, e.g. to show-case certain features, but effectively all generate the same basic diamond.

10.9.2.1. The Java DAX Generator API

The Java DAX API provided with the Pegasus distribution allows easy creation of complex and huge workflows. This API is used by several applications to generate their abstract DAX. SCEC, which is Southern California Earthquake Center, uses this API in their CyberShake workflow generator to generate huge DAX containing 10&rsquor;s of thousands of tasks with 100&rsquor;s of thousands of input and output files. The Java API is well documented using Javadoc for ADAGs .

The steps involved in creating a DAX using the API are

  1. Create a new ADAG object

  2. Add any Workflow notification elements

  3. Create File objects as necessary. You can augment the files with physical information, if you want to include them into your DAX. Otherwise, the physical information is determined from the replica catalog.

  4. (Optional) Create Executable objects, if you want to include your transformation catalog into your DAX. Otherwise, the translation of a job/task into executable location happens with the transformation catalog.

  5. Create a new Job object.

  6. Add arguments, files, profiles, notifications and other information to the Job object

  7. Add the job object to the ADAG object

  8. Repeat step 4-6 as necessary.

  9. Add all dependencies to the ADAG object.

  10. Call the writeToFile() method on the ADAG object to render the XML DAX file.

An example Java code that generates the diamond dax show above is listed below. This same code can be found in the Pegasus distribution in the examples/grid-blackdiamond-java directory as BlackDiamonDAX.java:

/**
 *  Copyright 2007-2008 University Of Southern California
 *
 *  Licensed under the Apache License, Version 2.0 (the "License");
 *  you may not use this file except in compliance with the License.
 *  You may obtain a copy of the License at
 *
 *  http://www.apache.org/licenses/LICENSE-2.0
 *
 *  Unless required by applicable law or agreed to in writing,
 *  software distributed under the License is distributed on an "AS IS" BASIS,
 *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 *  See the License for the specific language governing permissions and
 *  limitations under the License.
 */

import edu.isi.pegasus.planner.dax.*;

public class BlackDiamondDAX {

    /**
     * Create an example DIAMOND DAX
     * @param args
     */
    public static void main(String[] args) {
        if (args.length != 3) {
            System.out.println("Usage: java ADAG <site_handle> <pegasus_location> <filename.dax>");
            System.exit(1);
        }

        try {
            Diamond(args[0], args[1]).writeToFile(args[2]);
        }
        catch (Exception e) {
            e.printStackTrace();
        }

    }

    private static ADAG Diamond(String site_handle, String pegasus_location) throws Exception {

        java.io.File cwdFile = new java.io.File (".");
        String cwd = cwdFile.getCanonicalPath(); 

        ADAG dax = new ADAG("blackdiamond");
        dax.addNotification(When.start,"/pegasus/libexec/notification/email -t notify@example.com");
        dax.addNotification(When.at_end,"/pegasus/libexec/notification/email -t notify@example.com");
        File fa = new File("f.a");
        fa.addPhysicalFile("file://" + cwd + "/f.a", "local");
        dax.addFile(fa);

        File fb1 = new File("f.b1");
        File fb2 = new File("f.b2");
        File fc1 = new File("f.c1");
        File fc2 = new File("f.c2");
        File fd = new File("f.d");
        fd.setRegister(true);

        Executable preprocess = new Executable("pegasus", "preprocess", "4.0");
        preprocess.setArchitecture(Executable.ARCH.X86).setOS(Executable.OS.LINUX);
        preprocess.setInstalled(true);
        preprocess.addPhysicalFile("file://" + pegasus_location + "/bin/keg", site_handle);

        Executable findrange = new Executable("pegasus", "findrange", "4.0");
        findrange.setArchitecture(Executable.ARCH.X86).setOS(Executable.OS.LINUX);
        findrange.setInstalled(true);
        findrange.addPhysicalFile("file://" + pegasus_location + "/bin/keg", site_handle);

        Executable analyze = new Executable("pegasus", "analyze", "4.0");
        analyze.setArchitecture(Executable.ARCH.X86).setOS(Executable.OS.LINUX);
        analyze.setInstalled(true);
        analyze.addPhysicalFile("file://" + pegasus_location + "/bin/keg", site_handle);

        dax.addExecutable(preprocess).addExecutable(findrange).addExecutable(analyze);

        // Add a preprocess job
        Job j1 = new Job("j1", "pegasus", "preprocess", "4.0");
        j1.addArgument("-a preprocess -T 60 -i ").addArgument(fa);
        j1.addArgument("-o ").addArgument(fb1);
        j1.addArgument(" ").addArgument(fb2);
        j1.uses(fa, File.LINK.INPUT);
        j1.uses(fb1, File.LINK.OUTPUT);
        j1.uses(fb2, File.LINK.OUTPUT);
        j1.addNotification(When.start,"/pegasus/libexec/notification/email -t notify@example.com");
        j1.addNotification(When.at_end,"/pegasus/libexec/notification/email -t notify@example.com");
        dax.addJob(j1);

        // Add left Findrange job
        Job j2 = new Job("j2", "pegasus", "findrange", "4.0");
        j2.addArgument("-a findrange -T 60 -i ").addArgument(fb1);
        j2.addArgument("-o ").addArgument(fc1);
        j2.uses(fb1, File.LINK.INPUT);
        j2.uses(fc1, File.LINK.OUTPUT);
        j2.addNotification(When.start,"/pegasus/libexec/notification/email -t notify@example.com");
        j2.addNotification(When.at_end,"/pegasus/libexec/notification/email -t notify@example.com");
        dax.addJob(j2);

        // Add right Findrange job
        Job j3 = new Job("j3", "pegasus", "findrange", "4.0");
        j3.addArgument("-a findrange -T 60 -i ").addArgument(fb2);
        j3.addArgument("-o ").addArgument(fc2);
        j3.uses(fb2, File.LINK.INPUT);
        j3.uses(fc2, File.LINK.OUTPUT);
        j3.addNotification(When.start,"/pegasus/libexec/notification/email -t notify@example.com");
        j3.addNotification(When.at_end,"/pegasus/libexec/notification/email -t notify@example.com");
        dax.addJob(j3);

        // Add analyze job
        Job j4 = new Job("j4", "pegasus", "analyze", "4.0");
        j4.addArgument("-a analyze -T 60 -i ").addArgument(fc1);
        j4.addArgument(" ").addArgument(fc2);
        j4.addArgument("-o ").addArgument(fd);
        j4.uses(fc1, File.LINK.INPUT);
        j4.uses(fc2, File.LINK.INPUT);
        j4.uses(fd, File.LINK.OUTPUT);
        j4.addNotification(When.start,"/pegasus/libexec/notification/email -t notify@example.com");
        j4.addNotification(When.at_end,"/pegasus/libexec/notification/email -t notify@example.com");
        dax.addJob(j4);

        dax.addDependency("j1", "j2");
        dax.addDependency("j1", "j3");
        dax.addDependency("j2", "j4");
        dax.addDependency("j3", "j4");
        return dax;
    }
}

Of course, you will have to set up some catalogs and properties to run this example. The details are catpured in the examples directory examples/grid-blackdiamond-java.

10.9.2.2. The Python DAX Generator API

Refer to the auto-generated python documentation explaining this API.

#!/usr/bin/env python

from Pegasus.DAX3 import *
import sys
import os

if len(sys.argv) != 2:
        print "Usage: %s PEGASUS_HOME" % (sys.argv[0])
        sys.exit(1)

# Create a abstract dag
diamond = ADAG("diamond")

# Add input file to the DAX-level replica catalog
a = File("f.a")
a.addPFN(PFN("file://" + os.getcwd() + "/f.a", "local"))
diamond.addFile(a)
        
# Add executables to the DAX-level replica catalog
# In this case the binary is keg, which is shipped with Pegasus, so we use
# the remote PEGASUS_HOME to build the path.
e_preprocess = Executable(namespace="diamond", name="preprocess", version="4.0", os="linux", arch="x86_64")
e_preprocess.addPFN(PFN("file://" + sys.argv[1] + "/bin/keg", "TestCluster"))
diamond.addExecutable(e_preprocess)
        
e_findrange = Executable(namespace="diamond", name="findrange", version="4.0", os="linux", arch="x86_64")
e_findrange.addPFN(PFN("file://" + sys.argv[1] + "/bin/keg", "TestCluster"))
diamond.addExecutable(e_findrange)
        
e_analyze = Executable(namespace="diamond", name="analyze", version="4.0", os="linux", arch="x86_64")
e_analyze.addPFN(PFN("file://" + sys.argv[1] + "/bin/keg", "TestCluster"))
diamond.addExecutable(e_analyze)

# Add a preprocess job
preprocess = Job(namespace="diamond", name="preprocess", version="4.0")
b1 = File("f.b1")
b2 = File("f.b2")
preprocess.addArguments("-a preprocess","-T60","-i",a,"-o",b1,b2)
preprocess.uses(a, link=Link.INPUT)
preprocess.uses(b1, link=Link.OUTPUT)
preprocess.uses(b2, link=Link.OUTPUT)
diamond.addJob(preprocess)

# Add left Findrange job
frl = Job(namespace="diamond", name="findrange", version="4.0")
c1 = File("f.c1")
frl.addArguments("-a findrange","-T60","-i",b1,"-o",c1)
frl.uses(b1, link=Link.INPUT)
frl.uses(c1, link=Link.OUTPUT)
diamond.addJob(frl)

# Add right Findrange job
frr = Job(namespace="diamond", name="findrange", version="4.0")
c2 = File("f.c2")
frr.addArguments("-a findrange","-T60","-i",b2,"-o",c2)
frr.uses(b2, link=Link.INPUT)
frr.uses(c2, link=Link.OUTPUT)
diamond.addJob(frr)

# Add Analyze job
analyze = Job(namespace="diamond", name="analyze", version="4.0")
d = File("f.d")
analyze.addArguments("-a analyze","-T60","-i",c1,c2,"-o",d)
analyze.uses(c1, link=Link.INPUT)
analyze.uses(c2, link=Link.INPUT)
analyze.uses(d, link=Link.OUTPUT, register=True)
diamond.addJob(analyze)

# Add control-flow dependencies
diamond.depends(parent=preprocess, child=frl)
diamond.depends(parent=preprocess, child=frr)
diamond.depends(parent=frl, child=analyze)
diamond.depends(parent=frr, child=analyze)

# Add notification for analyze job
analyze.invoke(When.ON_ERROR, '/home/user/bin/email -s "Analyze job failed" user@example.com')

# Add notification for workflow
diamond.invoke(When.AT_END, '/home/user/bin/email -s "Workflow finished" user@example.com')
diamond.invoke(When.ON_SUCCESS, '/home/user/bin/publish_workflow_result')

# Write the DAX to stdout
diamond.writeXML(sys.stdout)

10.9.2.3. The Perl DAX Generator

The Perl API example below can be found in file blackdiamond.pl in directory examples/grid-blackdiamond-perl. It requires that you set the environment variable PEGASUS_HOME to the installation directory of Pegasus, and include into PERL5LIB the path to the directory lib/perl of the Pegasus installation. The actual code is longer, and will not require these settings, only the example below does. The Perl API is documented using perldoc. For each of the modules you can invoke perldoc, if your PERL5LIB variable is set.

The steps to generate a DAX from Perl are similar to the Java steps. However, since most methods to the classes are deeply within the Perl class modules, the convenience module Perl::DAX::Factory makes most constructors accessible without you needing to type your fingers raw:

  1. Create a new ADAG object.

  2. Create Job objects as necessary.

  3. As example, the required input file "f.a" is declared as File object and linked to the ADAG object.

  4. The first job arguments and files are filled into the job, and the job is added to the ADAG object.

  5. Repeat step 4 for the remaining jobs.

  6. Add dependencies for all jobs. You have the option of assigning label text to edges, though these are not used (yet).

  7. To generate the DAX file, invoke the toXML() method on the ADAG object. The first argument is an opened file handle or IO::Handle descriptor scalar to write to, the second the default indentation for the root element, and the third the XML namespace to use for elements and attributes. The latter is typically unused unless you want to include your output into another XML document.

#!/usr/bin/env perl
#
use 5.006;
use strict;
use IO::Handle;
use Cwd;
use File::Spec;
use File::Basename;
use Sys::Hostname;
use POSIX ();

BEGIN { $ENV{'PEGASUS_HOME'} ||= `pegasus-config --nocrlf --home` }
use lib File::Spec->catdir( $ENV{'PEGASUS_HOME'}, 'lib', 'perl' );

use Pegasus::DAX::Factory qw(:all);
use constant NS => 'diamond';

my $adag = newADAG( name => NS );
my $job1 = newJob( namespace => NS, name => 'preprocess', version => '2.0' );
my $job2 = newJob( namespace => NS, name => 'findrange', version => '2.0' );
my $job3 = newJob( namespace => NS, name => 'findrange', version => '2.0' );
my $job4 = newJob( namespace => NS, name => 'analyze', version => '2.0' );

# create "f.a" locally
my $fn = "f.a";
open( F, ">$fn" ) || die "FATAL: Unable to open $fn: $!\n";
my @now = gmtime();
printf F "%04u-%02u-%02u %02u:%02u:%02uZ\n",
        $now[5]+1900, $now[4]+1, @now[3,2,1,0];
close F;

my $file = newFile( name => 'f.a' );
$file->addPFN( newPFN( url => 'file://' . Cwd::abs_path($fn),
                       site => 'local' ) );
$adag->addFile($file);

# follow this path, if the PEGASUS_HOME was determined
if ( exists $ENV{'PEGASUS_HOME'} ) {
    my $keg = File::Spec->catfile( $ENV{'PEGASUS_HOME'}, 'bin', 'keg' );
    my @os = POSIX::uname();
    # $os[2] =~ s/^(\d+(\.\d+(\.\d+)?)?).*/$1/;  ## create a proper osversion
    $os[4] =~ s/i.86/x86/;

    # add Executable instances to DAX-included TC. This will only work,
    # if we know how to access the keg executable. HOWEVER, for a grid
    # workflow, these entries are not used, and you need to
    # [1] install the work tools remotely
    # [2] create a TC with the proper entries
    if ( -x $keg ) {
        for my $j ( $job1, $job2, $job4 ) {
            my $app = newExecutable( namespace => $j->namespace,
                                     name => $j->name,
                                     version => $j->version,
                                     installed => 'false',
                                     arch => $os[4],
                                     os => lc($^O) );
            $app->addProfile( 'globus', 'maxtime', '2' );
            $app->addProfile( 'dagman', 'RETRY', '3' );
            $app->addPFN( newPFN( url => "file://$keg", site => 'local' ) );
            $adag->addExecutable($app);
        }
    }
}

my %hash = ( link => LINK_OUT, register => 'false', transfer => 'true' );
my $fna = newFilename( name => $file->name, link => LINK_IN );
my $fnb1 = newFilename( name => 'f.b1', %hash );
my $fnb2 = newFilename( name => 'f.b2', %hash );
$job1->addArgument( '-a', $job1->name, '-T60', '-i', $fna,
                    '-o', $fnb1, $fnb2 );
$adag->addJob($job1);

my $fnc1 = newFilename( name => 'f.c1', %hash );
$fnb1->link( LINK_IN );
$job2->addArgument( '-a', $job2->name, '-T60', '-i', $fnb1,
                    '-o', $fnc1 );
$adag->addJob($job2);

my $fnc2 = newFilename( name => 'f.c2', %hash );
$fnb2->link( LINK_IN );
$job3->addArgument( '-a', $job3->name, '-T60', '-i', $fnb2,
                    '-o', $fnc2 );
$adag->addJob($job3);
# a convenience function -- you can specify multiple dependents
$adag->addDependency( $job1, $job2, $job3 );

my $fnd = newFilename( name => 'f.d', %hash );
$fnc1->link( LINK_IN );
$fnc2->link( LINK_IN );
$job4->separator('');                # just to show the difference wrt default
$job4->addArgument( '-a ', $job4->name, ' -T60 -i ', $fnc1, ' ', $fnc2,
                    ' -o ', $fnd );
$adag->addJob($job4);
# this is a convenience function adding parents to a child.
# it is clearer than overloading addDependency
$adag->addInverse( $job4, $job2, $job3 );

# workflow level notification in case of failure
# refer to Pegasus::DAX::Invoke for details
my $user = $ENV{USER} || $ENV{LOGNAME} || scalar getpwuid($>);
$adag->invoke( INVOKE_ON_ERROR,
               "/bin/mailx -s 'blackdiamond failed' $user" );

my $xmlns = shift;
$adag->toXML( \*STDOUT, '', $xmlns );

10.9.3. DAX Generator without a Pegasus DAX API

If you are using some other scripting or programming environment, you can directly write out the DAX format using the provided schema using any language. For instance, LIGO, the Laser Interferometer Gravitational Wave Observatory, generate their DAX files as XML using their own Python code, not using our provided API.

If you write your own XML, you must ensure that the generated XML is well formed and valid with respect to the DAX schema. You can use the pegasus-dax-validator to verify the validity of your generated file. Typically, you generate a smallish test file to, validate that your generator creates valid XML using the validator, and then ramp it up to produce the full workflow(s) you want to run. At this point the pegasus-dax-validator is a very simple program that will only take exactly one argument, the name of the file to check.The following snippet checks a black-diamond file that uses an improper osversion attribute in its executable element:

$ pegasus-dax-validator blackdiamond.dax
ERROR: cvc-pattern-valid: Value '2.6.18-194.26.1.el5' is not facet-valid
 with respect to pattern '[0-9]+(\.[0-9]+(\.[0-9]+)?)?' for type 'VersionPattern'.
ERROR: cvc-attribute.3: The value '2.6.18-194.26.1.el5' of attribute 'osversion'
 on element 'executable' is not valid with respect to its type, 'VersionPattern'.

0 warnings, 2 errors, and 0 fatal errors detected.

We are working on improving this program, e.g. provide output with regards to the line number where the issue occurred. However, it will return with a non-zero exit code whenever errors were detected.

10.10. Command Line Tools

pegasus-analyzer — debugs a workflow.
pegasus-cleanup — Removes files during Pegasus workflows enactment.
pegasus-cluster — run a list of applications
pegasus-config — The authority for where parts of the Pegasus system exists on the filesystem. pegasus-config can be used to find libraries such as the DAX generators.
pegasus-create-dir — Creates work directories in Pegasus workflows.
pegasus-dagman — Wrapper around *condor_dagman*. Not to be run by user.
pegasus-dax-validator — determines if a given DAX file is valid.
pegasus-exitcode — Checks the stdout/stderr files of a workflow job for any indication that an error occurred in the job. This script is intended to be invoked automatically by DAGMan as the POST script of a job.
pegasus-gridftp — Perform file and directory operations on remote GridFTP servers
pegasus-invoke — invokes a command from a file
pegasus-keg — kanonical executable for grids
pegasus-kickstart — run an executable in a more universal environment.
pegasus-monitord — tracks a workflow progress, mining information
pegasus-mpi-cluster — a tool for running computational workflows expressed as DAGs (Directed Acyclic Graphs) on computational clusters using MPI.
pegasus-plan — runs Pegasus to generate the executable workflow
pegasus-plots — A tool to generate graphs and charts to visualize workflow run.
pegasus-rc-client — shell client for replica implementations
pegasus-remove — removes a workflow that has been planned and submitted using pegasus-plan and pegasus-run
pegasus-run — executes a workflow that has been planned using *pegasus-plan*.
pegasus-s3 — Upload, download, delete objects in Amazon S3
pegasus-sc-client — generates a site catalog by querying sources.
pegasus-sc-converter — A client to convert site catalog from one format to another format.
pegasus-statistics — A tool to generate statistics about the workflow run.
pegasus-status — Pegasus workflow- and run-time status
pegasus-submit-dag — Wrapper around *condor_submit_dag*. Not to be run by user.
pegasus-tc-client — A full featured generic client to handle adds, deletes and queries to the Transformation Catalog (TC).
pegasus-tc-converter — A client to convert transformation catalog from one format to another format.
pegasus-transfer — Handles data transfers in Pegasus workflows.
pegasus-version — print or match the version of the toolkit.