18.4. Migrating From Pegasus 2.X to Pegasus 3.X

Pegasus 3.0 simplifies configuration. This chapter is for existing users who run their workflows with Pegasus 2.x, and it walks through the steps needed to move to Pegasus 3.0.

18.4.1. PEGASUS_HOME and Setup Scripts

Earlier versions of Pegasus required users to set the environment variable PEGASUS_HOME and to source a setup file ($PEGASUS_HOME/setup.sh or $PEGASUS_HOME/setup.csh) before running Pegasus, in order to set up CLASSPATH and other variables.

Starting with Pegasus 3.0 this is no longer required. The above paths are automatically determined by the Pegasus tools when they are invoked.

Users only need to set the PATH variable so that it picks up the Pegasus executables from the bin directory.

$ export PATH=/some/install/pegasus-3.0.0/bin:$PATH
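
Once PATH is set, it can be useful to confirm that the Pegasus clients actually resolve. The following is a minimal sketch in Python, not part of Pegasus itself; the helper name missing_clients is hypothetical, and the client names passed to it are only examples taken from the client table later in this chapter.

```python
import shutil


def missing_clients(names, path=None):
    """Return the client names that cannot be found on the search path.

    `path` is a PATH-style string; when it is None, the PATH of the
    current environment is searched.
    """
    return [name for name in names if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    # Example client names; adjust to the tools your workflows call.
    missing = missing_clients(["pegasus-rc-client", "pegasus-transfer"])
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
```

An empty result means every listed client was found in some directory on PATH.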

18.4.2. Changes to Schemas and Catalog Formats

18.4.2.1. DAX Schema

By default, Pegasus 3.0 parses DAX documents conforming to DAX Schema 3.2, which is available here and is explained in detail in the chapter on API references.

Starting with Pegasus 3.0, DAX generation APIs are provided in Java, Python, and Perl for use in DAX generators. Use of these APIs is highly encouraged. Support for the old DAX schemas has been deprecated and will be removed in a future version.

Users who still want to run with the old DAX formats (i.e., 3.0 or earlier) can, for the time being, set the following property in the properties file and point it to the dax-3.0.xsd of the installation.

pegasus.schema.dax  /some/install/pegasus-3.0/etc/dax-3.0.xsd

18.4.2.2. Site Catalog Format

By default, Pegasus 3.0 parses site catalogs conforming to SC schema 3.0 (XML3), which is available here and is explained in detail in the chapter on Catalogs.

Pegasus 3.0 comes with a pegasus-sc-converter that converts a user's old site catalog (XML) to the XML3 format. Sample usage is given below.

$ pegasus-sc-converter -i sample.sites.xml -I XML -o sample.sites.xml3 -O XML3

2010.11.22 12:55:14.169 PST:   Written out the converted file to sample.sites.xml3

To use the converted site catalog, do the following in the properties file:

  1. unset pegasus.catalog.site or set pegasus.catalog.site to XML3

  2. point pegasus.catalog.site.file to the converted site catalog
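
The conversion and the two property changes can be scripted. The sketch below is a hedged illustration in Python: the function names are hypothetical, the converter flags are the ones shown in the sample usage above, and the key = value form assumes the properties file uses standard Java properties syntax.

```python
def sc_converter_command(old_catalog, new_catalog):
    """Build the pegasus-sc-converter invocation shown in the sample usage."""
    return [
        "pegasus-sc-converter",
        "-i", old_catalog, "-I", "XML",
        "-o", new_catalog, "-O", "XML3",
    ]


def site_catalog_properties(new_catalog):
    """Return the property lines that select the converted site catalog."""
    return [
        "pegasus.catalog.site = XML3",
        "pegasus.catalog.site.file = " + new_catalog,
    ]


if __name__ == "__main__":
    print(" ".join(sc_converter_command("sample.sites.xml", "sample.sites.xml3")))
    print("\n".join(site_catalog_properties("sample.sites.xml3")))
```

The command list can be passed to subprocess.run on a machine where the Pegasus bin directory is on PATH; the property lines are meant to be added to the properties file by hand or by a script.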

18.4.2.3. Transformation Catalog Format

By default, Pegasus 3.0 parses a file-based, multiline textual format of the Transformation Catalog. The new Text format is explained in detail in the chapter on Catalogs.

Pegasus 3.0 comes with a pegasus-tc-converter that converts a user's old transformation catalog (File) to the Text format. Sample usage is given below.

$ pegasus-tc-converter -i sample.tc.data -I File -o sample.tc.text -O Text

2010.11.22 12:53:16.661 PST:   Successfully converted Transformation Catalog from File to Text
2010.11.22 12:53:16.666 PST:   The output transfomation catalog is in file  /lfs1/software/install/pegasus/pegasus-3.0.0cvs/etc/sample.tc.text

To use the converted transformation catalog, do the following in the properties file:

  1. unset pegasus.catalog.transformation or set pegasus.catalog.transformation to Text

  2. point pegasus.catalog.transformation.file to the converted transformation catalog
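
If the properties file already contains the old transformation catalog settings, the two keys need to be updated rather than appended. The following Python sketch shows one way to do that on the text of a properties file. The helper set_property is hypothetical, and it assumes one property per line with the key separated from the value by =, :, or whitespace.

```python
import re


def set_property(text, key, value):
    """Set `key` to `value` in properties text, replacing any existing entry."""
    # The key must be followed by a separator (or end of line), so that
    # "a.b" does not match a line that defines "a.b.file".
    pattern = re.compile(r"^\s*" + re.escape(key) + r"\s*([=:\s]|$)")
    out, replaced = [], False
    for line in text.splitlines():
        if pattern.match(line):
            if not replaced:
                out.append(key + " = " + value)
                replaced = True
            # Later duplicates of the same key are dropped.
        else:
            out.append(line)
    if not replaced:
        out.append(key + " = " + value)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    props = "pegasus.catalog.transformation = File\n"
    props = set_property(props, "pegasus.catalog.transformation", "Text")
    props = set_property(props, "pegasus.catalog.transformation.file", "sample.tc.text")
    print(props, end="")
```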

18.4.3. Properties and Profiles Simplification

Starting with Pegasus 3.0, all profiles can be specified in the properties file. Profiles specified in the properties file have the lowest priority. Profiles are explained in detail in the configuration chapter. As a result, many existing Pegasus properties were replaced by profiles. The table below lists the removed properties and their profile-based replacements.

Table 18.1. Property Keys removed and their Profile based replacement

Old Property Key                          Replacement Profile Key
pegasus.local.env                         No replacement; specify env profiles for the local site in the site catalog
pegasus.condor.release                    condor.periodic_release
pegasus.condor.remove                     condor.periodic_remove
pegasus.job.priority                      condor.priority
pegasus.condor.output.stream              condor.stream_output
pegasus.condor.error.stream               condor.stream_error
pegasus.dagman.retry                      dagman.retry
pegasus.exitcode.impl                     dagman.post
pegasus.exitcode.scope                    dagman.post.scope
pegasus.exitcode.arguments                dagman.post.arguments
pegasus.exitcode.path.*                   dagman.post.path.*
pegasus.dagman.maxpre                     dagman.maxpre
pegasus.dagman.maxpost                    dagman.maxpost
pegasus.dagman.maxidle                    dagman.maxidle
pegasus.dagman.maxjobs                    dagman.maxjobs
pegasus.remote.scheduler.min.maxwalltime  globus.maxwalltime
pegasus.remote.scheduler.min.maxtime      globus.maxtime
pegasus.remote.scheduler.min.maxcputime   globus.maxcputime
pegasus.remote.scheduler.queues           globus.queue
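
Renaming these keys in an existing properties file is mechanical. The sketch below, in Python, rewrites the key at the start of each line and leaves comments and unknown keys untouched. It is an illustration under assumptions: the function names are hypothetical, only a subset of the table above is included in the mapping, and each property is assumed to sit on a single line.

```python
import re

# A subset of the removed properties and their profile replacements,
# copied from the table above. Extend as needed.
PROFILE_FOR_PROPERTY = {
    "pegasus.condor.release": "condor.periodic_release",
    "pegasus.condor.remove": "condor.periodic_remove",
    "pegasus.job.priority": "condor.priority",
    "pegasus.dagman.retry": "dagman.retry",
    "pegasus.dagman.maxpre": "dagman.maxpre",
    "pegasus.dagman.maxjobs": "dagman.maxjobs",
    "pegasus.remote.scheduler.queues": "globus.queue",
}

# Leading whitespace, then a key that ends at the first separator.
# Lines that start with a comment character (# or !) do not match.
_LINE = re.compile(r"^(\s*)([^\s=:#!]+)(.*)$")


def migrate_line(line):
    """Rename the key on one properties line if the mapping knows it."""
    match = _LINE.match(line)
    if match is None:
        return line
    indent, key, rest = match.groups()
    return indent + PROFILE_FOR_PROPERTY.get(key, key) + rest


def migrate_properties(text):
    """Apply migrate_line to every line of a properties file."""
    return "\n".join(migrate_line(line) for line in text.splitlines()) + "\n"
```

For example, a line pegasus.dagman.retry = 3 becomes dagman.retry = 3, while a line for a key that was not removed, such as pegasus.catalog.site, is returned unchanged.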

18.4.3.1. Profile Keys for Clustering

The Pegasus profile keys for job clustering were renamed. The following table lists the old and new names of the profile keys.

Table 18.2. Old and New Names For Job Clustering Profile Keys

Old Pegasus Profile Key  New Pegasus Profile Key
collapse                 clusters.size
bundle                   clusters.num
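
Clustering keys usually appear as profile elements in XML documents such as the DAX or the site catalog. The Python sketch below renames them on a parsed XML tree. It assumes profiles are written as profile elements with namespace and key attributes; the function name is hypothetical, and the element is matched by its local tag name so that documents with an XML namespace are handled as well.

```python
import xml.etree.ElementTree as ET

# Old and new clustering keys, copied from the table above.
RENAMED_KEYS = {"collapse": "clusters.size", "bundle": "clusters.num"}


def rename_clustering_keys(root):
    """Rename old clustering keys on pegasus profile elements in place.

    Returns the number of profile elements that were changed.
    """
    changed = 0
    for elem in root.iter():
        if not isinstance(elem.tag, str):
            continue  # skip comments and processing instructions
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name != "profile":
            continue
        if elem.get("namespace", "").lower() != "pegasus":
            continue
        new_key = RENAMED_KEYS.get(elem.get("key"))
        if new_key is not None:
            elem.set("key", new_key)
            changed += 1
    return changed
```

Profiles in other namespaces are left alone, so a condor profile that happens to use the key bundle is not touched.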

18.4.4. Transfers Simplification

Pegasus 3.0 has a new transfer client, pegasus-transfer, that is invoked by default for first-level and second-level staging. The pegasus-transfer client is a Python-based wrapper around various transfer clients such as globus-url-copy, lcg-copy, wget, cp, and ln. It looks at the source and destination URLs and automatically determines which underlying client to use. pegasus-transfer is distributed with Pegasus and can be found in the bin subdirectory.
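
The idea of choosing a client from the URLs can be illustrated with a short Python sketch. This is not the logic pegasus-transfer actually implements: the function name and the scheme-to-client mapping are assumptions made for illustration, and only a few of the clients named above are covered.

```python
from urllib.parse import urlsplit


def pick_client(source_url, dest_url):
    """Pick a transfer client from the URL schemes (illustrative mapping only)."""
    # A plain path without a scheme is treated as a local file.
    schemes = {
        urlsplit(source_url).scheme or "file",
        urlsplit(dest_url).scheme or "file",
    }
    if "gsiftp" in schemes:
        return "globus-url-copy"
    if schemes & {"http", "https", "ftp"}:
        return "wget"
    if schemes == {"file"}:
        return "cp"
    raise ValueError("no client known for schemes: " + ", ".join(sorted(schemes)))
```

Dispatching on the scheme keeps the caller free of client-specific details, which is the benefit the wrapper approach provides.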

The Bundle Transfer refiner is also the default in Pegasus 3.0, so most users no longer need to set any transfer-related properties. The names of the profile keys that control bundled transfers have changed. The following table lists the old and new names of the Pegasus profile keys; the keys are explained in detail in the Profiles chapter.

Table 18.3. Old and New Names For Transfer Bundling Profile Keys

Old Pegasus Profile Key  New Pegasus Profile Keys
bundle.stagein           stagein.clusters | stagein.local.clusters | stagein.remote.clusters
bundle.stageout          stageout.clusters | stageout.local.clusters | stageout.remote.clusters

18.4.4.1. Worker Package Staging

Starting with Pegasus 3.0, there is a separate boolean property, pegasus.transfer.worker.package, to enable worker package staging to the remote compute sites. Earlier, worker package staging was bundled with user executable staging, i.e., it was enabled when the pegasus.catalog.transformation.mapper property was set to Staged.

18.4.5. Clients in bin directory

Starting with Pegasus 3.0, the Pegasus clients in the bin directory have a pegasus- prefix. The table below lists the old client names and the names of the clients that replaced them.

Table 18.4. Old Client Names and their New Names

Old Client                       New Client
rc-client                        pegasus-rc-client
tc-client                        pegasus-tc-client
pegasus-get-sites                pegasus-sc-client
sc-client                        pegasus-sc-converter
tailstatd                        pegasus-monitord
genstats and genstats-breakdown  pegasus-statistics
show-job                         pegasus-plots
dirmanager                       pegasus-dirmanager
exitcode                         pegasus-exitcode
rank-dax                         pegasus-rank-dax
transfer                         pegasus-transfer
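
Scripts written for Pegasus 2.x may still call the old client names. The Python sketch below scans script text and reports each old name together with its replacement from the table above. The function name is hypothetical, and the matching is a heuristic: common words such as transfer or exitcode are also reported when they appear as standalone words, so the hits need a manual review.

```python
import re

# Old-to-new client names, copied from the table above.
RENAMED_CLIENTS = {
    "rc-client": "pegasus-rc-client",
    "tc-client": "pegasus-tc-client",
    "pegasus-get-sites": "pegasus-sc-client",
    "sc-client": "pegasus-sc-converter",
    "tailstatd": "pegasus-monitord",
    "genstats": "pegasus-statistics",
    "genstats-breakdown": "pegasus-statistics",
    "show-job": "pegasus-plots",
    "dirmanager": "pegasus-dirmanager",
    "exitcode": "pegasus-exitcode",
    "rank-dax": "pegasus-rank-dax",
    "transfer": "pegasus-transfer",
}

# Longest names first; a name must not be glued to a word character,
# a dot, or a hyphen, so "pegasus-rc-client" does not match "rc-client".
_NAMES = sorted(RENAMED_CLIENTS, key=len, reverse=True)
_OLD_CLIENT = re.compile(
    r"(?<![\w.-])(" + "|".join(re.escape(name) for name in _NAMES) + r")(?![\w.-])"
)


def find_old_clients(text):
    """Return (line number, old name, new name) for each old client found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in _OLD_CLIENT.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMED_CLIENTS[old]))
    return hits
```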