Pegasus 3.1.1 has been released. Binaries and source can be found on the Downloads page.
This is a minor release that fixes some bugs and adds minor enhancements.
New features:
- The jobs file created by pegasus-statistics includes hostname and exitcode. The jobs file now records the hostname of the host on which each job ran and the exitcode with which the job exited. (https://jira.isi.edu/browse/PM-487)
- pegasus-monitord truncates stdout/stderr if they exceed the maximum size allowed by the database. The boolean property to disable stdout/stderr parsing is pegasus.monitord.stdout.disable.parsing
- Logging improvements for pegasus-monitord. monitord adds logging information when exiting upon a signal. The netlogger DB backend code is now more verbose when errors happen. pegasus-monitord also prints the start and end of the DB flushing function in its logs. pegasus-monitord prints its pid whenever it outputs the starting/ending messages. This can be used to track if multiple pegasus-monitord instances are running at the same time.
- Switched precedence for env loading for local site. On certain Linux distributions such as Debian, Java overrides LD_LIBRARY_PATH. Now a value specified for the local site in the site catalog is preferred over the one in the environment. This ensures that a user can set LD_LIBRARY_PATH in the site catalog and have it override the one provided by Java. Details at https://jira.isi.edu/browse/PM-471
- seqexec now fails on first job failure by default. The default behaviour for clustered jobs has been changed so that seqexec fails on the first job that fails. To run all the jobs in a cluster regardless of failures, users need to set the following property to false: pegasus.clusterer.job.aggregator.seqexec.firstjobfail
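The precedence switch for the local site described above can be pictured with a small sketch. This is not Pegasus code: the function name, the dictionaries, and the paths are hypothetical. It only illustrates the rule that a value from the site catalog wins over a value with the same name in the process environment.

```python
def resolve_local_env(process_env, site_catalog_env):
    """Merge two environment mappings (hypothetical helper, not a Pegasus API).

    Entries from the site catalog override entries of the same name
    inherited from the process environment.
    """
    merged = dict(process_env)        # start from the inherited environment
    merged.update(site_catalog_env)   # site catalog values take precedence
    return merged


# Hypothetical values: the first mapping stands in for what Java exposes,
# the second for what a user set for the local site in the site catalog.
process_env = {"LD_LIBRARY_PATH": "/usr/lib/jvm/lib", "HOME": "/home/user"}
site_catalog_env = {"LD_LIBRARY_PATH": "/opt/app/lib"}

resolved = resolve_local_env(process_env, site_catalog_env)
print(resolved["LD_LIBRARY_PATH"])
```

Variables that appear only in the process environment, such as HOME in this sketch, pass through unchanged.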
Bugs fixed:
- Kickstart with stdout redirection. The stdout redirection feature of kickstart failed on some systems, as newer versions of libc do not allow the realpath() function to resolve a non-existent filename. This prevented kickstart from redirecting the stdout of a job to a file. More details at https://jira.isi.edu/browse/PM-469
- seqexec incorrectly reported the summary section in its records. When launched through condor, seqexec reported that jobs had succeeded even though jobs had failed. More details at https://jira.isi.edu/browse/PM-541
- Fixed a bug in pegasus-monitord that was preventing the database population of certain pre and post script events.
- Fixed some python 2.4 compatibility issues in pegasus-plots. More details at https://jira.isi.edu/browse/PM-534
- SubDAG jobs referred to in a DAX were launched by pegasus-dagman. A DAX can contain references to Condor DAGs that are not planned through Pegasus. These were launched by pegasus-dagman instead of being launched directly by condor_dagman. More details at https://jira.isi.edu/browse/PM-553
- Invoke option for kickstart -I flag triggered incorrectly for clustered jobs. There was a bug in how Pegasus handled the invoke option, which is used to get around long arguments specified for jobs in the DAX. In that case, a .arg file is created and transferred with the job. This was not happening for clustered jobs. More details at https://jira.isi.edu/browse/PM-526