Pegasus 4.0.0 Released


This is a major release of Pegasus that introduces new advanced data handling capabilities and improved support for running workflows in non-shared filesystem scenarios such as clouds and Condor pools. Pegasus now optionally separates the data staging site from the workflow execution site, for more flexible data management. A new feature is PegasusLite, an autonomous lightweight execution environment that manages jobs on the compute nodes and handles data movement to and from those jobs against the workflow staging site. The RPM and Debian packages now conform to the Filesystem Hierarchy Standard (FHS).

New features:
  1. PegasusLite
    • Pegasus 4.0 has improved support for running workflows in a non-shared filesystem setup. This is useful for running in cloud environments and allows for more dynamic placement of jobs. Pegasus 4.0 introduces the concept of a staging site for a workflow, which can be different from the execution site. The planner places the data on the staging site for the workflow. When the jobs start on the remote compute nodes, they are launched by a lightweight component called PegasusLite, which stages in the data from the staging site to a local directory on the worker node filesystem. The output data generated by the compute job is similarly pushed back to the staging site when the job completes.
      Users can now set up Pegasus for different environments by setting the property
      pegasus.data.configuration
      More details can be found in the data staging configuration chapter.
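      As a minimal illustration, the property goes into the workflow's pegasus.properties file. The values shown are the ones documented for this release (sharedfs, nonsharedfs, condorio); consult the chapter for the authoritative list:

          # select how data is staged; condorio and nonsharedfs are the
          # setups in which PegasusLite manages data on the worker nodes
          pegasus.data.configuration = condorio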
  2. Move to FHS layout
    • Pegasus 4.0 is the first release of Pegasus that is Filesystem Hierarchy Standard (FHS) compliant. The native packages no longer install under /opt. Instead, the pegasus-* binaries are in /usr/bin/ and example workflows can be found under /usr/share/pegasus/examples/.
      To find Pegasus system components, a pegasus-config tool is provided. pegasus-config supports setting up the environment for:
      • Python
      • Perl
      • Java
      • Shell
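      For example, a user might initialize their environment along these lines (a sketch; check pegasus-config --help for the exact option names):

          # put the Pegasus Python and Perl libraries on the search paths
          export PYTHONPATH=$(pegasus-config --python):$PYTHONPATH
          export PERL5LIB=$(pegasus-config --perl):$PERL5LIB
          # Java users can obtain the jar locations the same way
          export CLASSPATH=$(pegasus-config --classpath):$CLASSPATH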
  3. Improved Credential Handling
    • Pegasus 4.0 has improved credential handling. While planning the workflow, the planner automatically associates the jobs with the credentials they may require. This is done by inspecting the URLs of the files a job requires.
      More details on how the credentials are set can be found in the transfers chapter.
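      For instance, a job that stages files from gsiftp:// URLs gets associated with the submitter's grid proxy. A sketch, assuming the proxy is exposed through the standard environment variable:

          # make the grid proxy discoverable before planning the workflow
          export X509_USER_PROXY=/tmp/x509up_u1234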
  4. New clients for directory creation and file cleanup
    • Starting with 4.0, Pegasus has changed the way scratch directories are created on the staging site. The planner now prefers to schedule the directory creation and cleanup jobs locally. These jobs refer to Python-based tools that call out to protocol-specific clients; the protocol of the site URL determines which client is picked up. For protocols where specific remote cleanup and directory creation clients don't exist (for example, GridFTP), the Python tools rely on the corresponding transfer tool to create a directory by initiating the transfer of an empty file. The Python clients used to create directories and remove files are called
          – pegasus-create-dir
          – pegasus-cleanup
         More details about the clients can be found in the transfers chapter.
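         An illustrative invocation of the two clients is shown below; the option names and URLs here are assumptions for illustration, so consult each client's --help and the transfers chapter:

           # create the workflow scratch directory on the staging site
           pegasus-create-dir --url gsiftp://staging.example.edu/scratch/run0001
           # remove the files listed (one URL per line) in the given file
           pegasus-cleanup --file cleanup.list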
  5. Runtime based clustering
    • Users can now do horizontal clustering based on job runtimes. If the jobs in the DAX are annotated with job runtimes (using the Pegasus profile key job.runtime), then Pegasus can horizontally cluster the jobs in such a way that a clustered job will not run for more than a maximum runtime (specified using the profile key clusters.maxruntime).
          More details can be found in the clustering chapter of the online guide.
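           A minimal DAX fragment using the Python DAX3 API is sketched below; the transformation name and runtime values are made up for illustration:

             from Pegasus.DAX3 import ADAG, Job, Profile, Namespace

             dax = ADAG("runtime-clustering-example")
             for i in range(10):
                 job = Job(name="analyze")  # hypothetical transformation
                 # expected runtime of this job, in seconds
                 job.addProfile(Profile(Namespace.PEGASUS, "job.runtime", "600"))
                 dax.addJob(job)
             # with clusters.maxruntime set to 1800 on the transformation, the
             # planner (invoked with --cluster horizontal) packs at most three
             # of these 600-second jobs into one clustered job
             dax.writeXML(open("clustering.dax", "w"))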
  6. pegasus-analyzer works with stampede database
    • Starting with the 4.0 release, pegasus-analyzer by default queries the database generated in the workflow submit directory (unless using MySQL) to debug the workflow. If you want it to use the files in the submit directory instead, use the --files option.
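      For example (how the submit directory is passed is an assumption here; see pegasus-analyzer --help):

          # default: query the stampede database in the submit directory
          pegasus-analyzer /path/to/submit-dir
          # force parsing of the files in the submit directory instead
          pegasus-analyzer --files /path/to/submit-dir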
  7. CSV formatted output files for pegasus-statistics
    • pegasus-statistics now generates its output in a CSV formatted file as well, in addition to the txt files it creates. This is useful for importing statistics into tools like Microsoft Excel.
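      A typical invocation (the -s all statistics level shown is an assumption; see pegasus-statistics --help):

          # writes the .csv files alongside the .txt summaries in the
          # statistics output directory
          pegasus-statistics -s all /path/to/submit-dir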
  8. Tracking of job exitcode in the stampede schema
    • The stampede database schema was updated to associate a job exitcode field with the job_instance table. This makes it easier for users and mining tools to determine whether a job succeeded or failed at the Condor level.
          Earlier, that was handled at query time by looking up the last state in the jobstate table.
          The updated stampede schema can be found here
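          A minimal sketch of what the new column enables, assuming a SQLite stampede database in the submit directory (the file name here is hypothetical):

            import sqlite3

            db = sqlite3.connect("/path/to/submit-dir/workflow.stampede.db")
            # failed job instances can now be found directly in job_instance,
            # without replaying the jobstate table
            for job_instance_id, exitcode in db.execute(
                    "SELECT job_instance_id, exitcode FROM job_instance "
                    "WHERE exitcode IS NOT NULL AND exitcode != 0"):
                print(job_instance_id, exitcode)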
  9. Change to how exitcode is stored in the stampede database
    • Kickstart records capture the raw status in addition to the exitcode. The exitcode is derived from the raw status. Starting with the Pegasus 4.0 release, all exitcode columns (i.e., the invocation and job_instance table columns) are stored with the raw status by pegasus-monitord. If an exitcode is encountered while parsing the DAGMan log files, the value is converted to the corresponding raw status before it is stored. The user tools pegasus-analyzer and pegasus-statistics then convert the raw status back to an exitcode when retrieving it from the database.
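      A sketch of the two conversions, assuming the raw status follows POSIX wait(2) semantics, which is what kickstart reports:

          import os

          def raw_to_exitcode(raw):
              # what the user tools do when reading from the database
              if os.WIFSIGNALED(raw):
                  return -os.WTERMSIG(raw)  # job was killed by a signal
              return os.WEXITSTATUS(raw)    # normal termination

          def exitcode_to_raw(exitcode):
              # what pegasus-monitord does for values parsed from DAGMan logs
              return exitcode << 8          # the exit status lives in bits 8-15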
  10. Accounting for MPI jobs in the stampede database
    • Starting with the 4.0 release, there is a multiplier factor associated with the jobs in the job_instance table. It defaults to one, unless the user associates a Pegasus profile key named cores with the job in the DAX. The factor can be used to get more accurate accounting statistics for jobs that run on multiple processors/cores, such as MPI jobs.
          Full details in the pegasus-monitord chapter.
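          A DAX fragment illustrating the profile; the job name and core count are made up:

            from Pegasus.DAX3 import Job, Profile, Namespace

            mpi_job = Job(name="mpi-solver")  # hypothetical MPI transformation
            # record that this job occupies 64 cores, so that its runtime is
            # multiplied by 64 in the accounting statistics
            mpi_job.addProfile(Profile(Namespace.PEGASUS, "cores", "64"))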
  11. Stampede database upgrade tool
    • Starting with the 4.0 release, users can use the stampede database upgrade tool to upgrade a 3.1.x database to the latest version.
          All statistics and debugging tools will complain if they determine at runtime that the database is from an old version.
          More details can be found here

Bugs Fixed

  1.  pegasus-plots invocation breakdown
    • In certain cases, it was found that the invocation breakdown pie chart generated by pegasus-plots was broken. This is now fixed.
       More details at
  2. UUID devices confuse pegasus-keg
    • It is possible with Linux to describe devices in /etc/fstab with their UUID instead of the device name, i.e.
          UUID=xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx / ext3 defaults 1 1
          instead of
          /dev/sda1 / ext3 defaults 1 1
          However, some logic in keg relied on the device starting with a slash to recognize a true-file device. This is now fixed.
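          A minimal sketch of the corrected check; this is an illustration in Python, not keg's actual code, which is written in C:

            def is_device_spec(spec):
                # accept /dev/sda1 style names as well as UUID= and LABEL= specs
                return spec.startswith("/") or spec.split("=", 1)[0] in ("UUID", "LABEL")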