## Pegasus 4.5.x Series

### Pegasus 4.5.4

**Release Date:** January 27, 2016

We are happy to announce the release of Pegasus 4.5.4. Pegasus 4.5.4 is a minor release, which contains minor enhancements and bug fixes. This will most likely be the last release in the 4.5 series, and unless you have specific reasons to stay with the 4.5.x series, we recommend upgrading to 4.6.0.

#### New Features

1) [PM-1003] - planner should report information about what options were used in the planner [\#1120](https://github.com/pegasus-isi/pegasus/issues/1120)

   The planner now reports additional metrics to the metrics server, such as the command line options used, whether PMC was used, and the number of deleted tasks.

2) [PM-1007] - "undelete" or attach/detach for pegasus-submitdir [\#1124](https://github.com/pegasus-isi/pegasus/issues/1124)

   pegasus-submitdir has two new commands: attach, which adds the workflow to the dashboard (or corrects the path), and detach, which removes the workflow from the dashboard.

3) [PM-1030] - pegasus-monitord should parse the new dagman output that reports timestamps from the condor user log [\#1144](https://github.com/pegasus-isi/pegasus/issues/1144)

   Starting with HTCondor 8.5.2, DAGMan records the condor job log timestamps at the end of the ULOG event messages. monitord was updated to prefer these timestamps for the job events if they are present in the DAGMan logs.

#### Improvements

1) [PM-896] - Document events that monitord publishes [\#1014](https://github.com/pegasus-isi/pegasus/issues/1014)

   The netlogger messages generated by monitord that are used to populate the workflow database and the master database are now documented at https://pegasus.isi.edu/wms/docs/4.5.4cvs/stampede_wf_events.php

2) [PM-995] - changes to Pegasus tutorial [\#1112](https://github.com/pegasus-isi/pegasus/issues/1112)

   The Pegasus tutorial was reorganized and simplified to focus more on pegasus-dashboard and debugging exercises.

3) [PM-1033] - update monitord to handle updated log messages in dagman.out file [\#1147](https://github.com/pegasus-isi/pegasus/issues/1147)

   Starting with the 8.5.x series, some of the DAGMan log messages in the dagman.out file were updated to say HTCondor instead of Condor. This broke monitord's parsing regexes, and hence monitord was not able to parse information from the dagman.out file. This is now fixed.

4) [PM-1034] - Make it more difficult for users to break pegasus-submitdir archive [\#1148](https://github.com/pegasus-isi/pegasus/issues/1148)

   An internal locking mechanism was added to make pegasus-submitdir more robust when a user accidentally kills an archive operation.

5) [PM-1040] - pegasus-analyzer should be able to handle cases where the workflow failed to start [\#1154](https://github.com/pegasus-isi/pegasus/issues/1154)

   pegasus-analyzer now detects if a workflow failed to start because of the DAGMan fail-on-NFS-error setting, and also displays any errors in *.dag.lib.err files.

#### Bugs Fixed

1) [PM-921] - Specified env is not provided to monitord [\#1039](https://github.com/pegasus-isi/pegasus/issues/1039)

   The environment for pegasus-monitord is now set in the dagman.sub file. The following order is used: pick up the system environment, override it with the env profiles from the properties, and then with the env profiles from the local site entry in the site catalog.
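   The precedence can be illustrated with a minimal Python sketch. This is not Pegasus code; the two profile arguments stand in for Pegasus' own property and site catalog lookups:

   ```python
   # Hypothetical sketch of the precedence described above -- not actual
   # Pegasus code. The two arguments stand in for Pegasus' own lookups of
   # env profiles from the properties and from the local site entry.
   import os

   def monitord_environment(props_env_profiles, local_site_env_profiles):
       env = dict(os.environ)                 # 1. start with the system environment
       env.update(props_env_profiles)         # 2. override with env profiles from properties
       env.update(local_site_env_profiles)    # 3. override with the local site catalog entry
       return env

   # Later-applied sources win: a key set in the local site entry shadows the
   # same key from the properties or the system environment.
   env = monitord_environment({"PATH": "/opt/pegasus/bin"}, {"PATH": "/usr/local/bin"})
   print(env["PATH"])  # -> /usr/local/bin
   ```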
2) [PM-999] - pegasus-transfer taking too long to finish in case of retries [\#1116](https://github.com/pegasus-isi/pegasus/issues/1116)

   pegasus-transfer has moved to an exponential back-off: `min(5 ** (attempt_current + 1) + random.randint(1, 20), 300)`. This means that failures for short running transfers will still take time, but it is necessary to ensure scalability of real world workflows.

3) [PM-1008] - Dashboard file browser file list breaks with sub-directories [\#1125](https://github.com/pegasus-isi/pegasus/issues/1125)

   The dashboard file browser broke when there were sub-directories in the submit directory. This is now fixed.

4) [PM-1009] - File browser just says "Error" if submit_dir in workflow db is incorrect [\#1126](https://github.com/pegasus-isi/pegasus/issues/1126)

   The file browser now gives a more informative message when the submit directory recorded in the database does not actually exist.

5) [PM-1011] - OSX installer no longer works on El Capitan [\#1128](https://github.com/pegasus-isi/pegasus/issues/1128)

   El Capitan has a new "feature" that disables root from modifying files in /usr, with some exceptions (e.g. /usr/local). Since the earlier installer installed Pegasus in /usr, it no longer worked. The installer was updated to install Pegasus in /usr/local instead.

6) [PM-1012] - pegasus-gridftp fails with "no key" error [\#1129](https://github.com/pegasus-isi/pegasus/issues/1129)

   The SSL proxies jar was updated. The error was triggered by the following JGlobus issue: https://github.com/jglobus/JGlobus/issues/146

7) [PM-1017] - pegasus-s3 fails with [SSL: CERTIFICATE_VERIFY_FAILED] [\#1132](https://github.com/pegasus-isi/pegasus/issues/1132)

   s3.amazonaws.com has a cert that was issued by a CA that is not in the cacerts.txt file bundled with boto 2.5.2. The boto bundled with Pegasus was updated to 2.38.0.

8) [PM-1021] - kickstart stat for jobs in the workflow does not work for clustered jobs [\#1136](https://github.com/pegasus-isi/pegasus/issues/1136)

   kickstart stat did not work for clustered jobs. This is now fixed.

9) [PM-1022] - dynamic hierarchy tests failed randomly [\#1137](https://github.com/pegasus-isi/pegasus/issues/1137)

   The DAX jobs were not considered for cleanup. Because of this, if a compute job generated the DAX that a subdax job required, the cleanup of the DAX file could sometimes happen before the subdax job finished. This is now fixed.

10) [PM-1039] - pegasus-analyzer fails with: TypeError: unsupported operand type(s) for -: 'int' and 'NoneType' [\#1153](https://github.com/pegasus-isi/pegasus/issues/1153)

   pegasus-analyzer threw a stack trace when a workflow did not start because of the DAGMan NFS settings. This is now fixed.

11) [PM-1041] - pegasus-db-admin 4.5.4 gives a stack trace when run on pegasus 4.6 workflow submit dir [\#1155](https://github.com/pegasus-isi/pegasus/issues/1155)

   A clean error is now displayed if pegasus-db-admin from 4.5.4 is run against a workflow submit directory from a higher Pegasus version.

### Pegasus 4.5.3

**Release Date:** November 4, 2015

We are happy to announce the release of Pegasus 4.5.3. Pegasus 4.5.3 is a minor release, which contains minor enhancements and fixes bugs in the Pegasus 4.5.2 release. The following issues were addressed; more information can be found in the Pegasus JIRA (https://jira.isi.edu/).

#### Bug Fixes

1. [PM-980] - pegasus-plots fails with "-p all" [\#1097](https://github.com/pegasus-isi/pegasus/issues/1097)
2. [PM-982] - MRC replica catalog backend does not work [\#1099](https://github.com/pegasus-isi/pegasus/issues/1099)
3. [PM-987] - noop jobs created by Pegasus don't use DAGMan NOOP keyword [\#1104](https://github.com/pegasus-isi/pegasus/issues/1104)
4. [PM-996] - Pegasus Statistics transformation stats columns getting larger and larger with more sub workflows [\#1113](https://github.com/pegasus-isi/pegasus/issues/1113)
5. [PM-997] - pyOpenSSL v0.13 does not work with the new version of openssl (1.0.2d) and El Capitan [\#1114](https://github.com/pegasus-isi/pegasus/issues/1114)

#### Improvements

1. [PM-976] - ignore register and transfer flags for input files [\#1093](https://github.com/pegasus-isi/pegasus/issues/1093)
2. [PM-981] - register only based names for output files with deep LFN's [\#1098](https://github.com/pegasus-isi/pegasus/issues/1098)
3. [PM-983] - data reuse algorithm should consider file locations while cascading deletion upwards [\#1100](https://github.com/pegasus-isi/pegasus/issues/1100)
4. [PM-984] - condor_rm on a pegasus-kickstart wrapped job does not return stdout back [\#1101](https://github.com/pegasus-isi/pegasus/issues/1101)
5. [PM-988] - pegasus-transfer should handle file://localhost/ URL's [\#1105](https://github.com/pegasus-isi/pegasus/issues/1105)
6. [PM-989] - pegasus-analyzer debug job option should have a hard check for output files [\#1106](https://github.com/pegasus-isi/pegasus/issues/1106)
7. [PM-993] - Show dax/dag planning jobs in failed/successful/running/failing tabs in dashboard [\#1110](https://github.com/pegasus-isi/pegasus/issues/1110)
8. [PM-1000] - turn off concurrency limits by default [\#1117](https://github.com/pegasus-isi/pegasus/issues/1117)

#### New Features

1. [PM-985] - separate input and output replica catalog [\#1102](https://github.com/pegasus-isi/pegasus/issues/1102)
2. [PM-986] - input-dir option to pegasus-plan should be a comma separated list [\#1103](https://github.com/pegasus-isi/pegasus/issues/1103)

### Pegasus 4.5.2

**Release Date:** October 15, 2015

We are happy to announce the release of Pegasus 4.5.2. Pegasus 4.5.2 is a minor release, which contains minor enhancements and fixes bugs in the Pegasus 4.5.1 release. The release addresses a critical fix for systems running HTCondor 8.2.9, whereby all dagman jobs for Pegasus workflows fail on startup.

#### Enhancements

1) File locations in the DAX treated as a Replica Catalog

   By default, file locations listed in the DAX override entries listed in the Replica Catalog. Users can now set the boolean property pegasus.catalog.replica.dax.asrc to treat the DAX locations along with the entries listed in the Replica Catalog for replica selection.

   Associated JIRA item PM-973 [\#1090](https://github.com/pegasus-isi/pegasus/issues/1090)

2) Pegasus auxiliary tools now have support for iRODS 4.x

#### Bugs Fixed

1) pegasus-dagman setpgid fails under HTCondor 8.2.9

   Starting with version 8.2.9, HTCondor already sets up the process group to match the pid, and hence the setpgid call fails in the pegasus-dagman wrapper around condor_dagman. Because of this, all Pegasus workflows fail to start on submit nodes with HTCondor 8.2.9. If you cannot upgrade to Pegasus 4.5.2 and are running HTCondor 8.2.9, you can turn off HTCondor's setsid'ing by setting the following in your condor configuration:

       USE_PROCESS_GROUPS = false

   The pegasus-dagman wrapper now does not fatally fail if setpgid fails.

   More details at PM-972 [\#1089](https://github.com/pegasus-isi/pegasus/issues/1089)
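   The tolerant behavior can be sketched in a few lines of Python. This is not the actual pegasus-dagman source, just an illustration of treating a failed setpgid as non-fatal:

   ```python
   # Minimal sketch of the tolerant setpgid handling described above --
   # not the actual pegasus-dagman source.
   import os
   import sys

   try:
       os.setpgid(0, 0)  # make this process its own process-group leader
   except OSError as e:
       # Under HTCondor 8.2.9+ the process group is already set up to match
       # the pid, so this call can fail; log and continue instead of dying.
       print("setpgid failed (%s); continuing anyway" % e, file=sys.stderr)
   ```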
2) nonsharedfs execution does not work for GLite if auxiliary jobs are planned to run remotely

   For nonsharedfs execution on a local PBS|SGE cluster using the GLite interface, Pegasus-generated auxiliary jobs had incorrect paths to pegasus-kickstart in the submit files if a job was mapped to run on the remote (non local) site. This is now fixed.

   PM-971 [\#1088](https://github.com/pegasus-isi/pegasus/issues/1088)

### Pegasus 4.5.1

**Release Date:** August 10, 2015

We are happy to announce the release of Pegasus 4.5.1. Pegasus 4.5.1 is a minor release, which contains minor enhancements and fixes bugs in the Pegasus 4.5.0 release.

#### Enhancements

1) pegasus-statistics reports workflow badput

   pegasus-statistics now reports the workflow badput time, which is the sum of the runtimes of all failed kickstart jobs. More details at https://pegasus.isi.edu/wms/docs/4.5.1/plotting_statistics.php

   Associated JIRA item PM-941 [\#1058](https://github.com/pegasus-isi/pegasus/issues/1058)

2) fast start option for pegasus-monitord

   By default, when monitord starts tracking a live dagman.out file, it sleeps intermittently, waiting for new lines to be logged in the dagman.out file. This behavior, however, causes monitord to lag considerably when starting up for large workflows, when monitord gets restarted by pegasus-dagman after a failure, or when a rescue dag is submitted. Users can now set the pegasus.monitord.fast_start property to enable fast start. In a future release, it will be the default behavior.

   Associated JIRA item PM-947 [\#1064](https://github.com/pegasus-isi/pegasus/issues/1064)

3) Support for throttling jobs across workflows using HTCondor concurrency limits

   Users can now throttle jobs across workflows using HTCondor concurrency limits. However, this only applies to vanilla universe jobs. Documentation at https://pegasus.isi.edu/wms/docs/4.5.1/job_throttling.php#job_throttling_across_workflows

   Associated JIRA item PM-933 [\#1050](https://github.com/pegasus-isi/pegasus/issues/1050)

4) Support for submissions to local SGE cluster using the GLite interfaces

   Preliminary support for SGE clusters has been added to Pegasus. To use this, you need to copy sge_local_submit_attributes.sh from the Pegasus share directory and place it in your condor installation. The list of supported keys can be found at https://pegasus.isi.edu/wms/docs/4.5.1/glite.php

   Associated JIRA item PM-955 [\#1072](https://github.com/pegasus-isi/pegasus/issues/1072)

5) PEGASUS_SCRATCH_DIR set in the job environment for sharedfs deployments

   For sharedfs deployments, Pegasus now sets an environment variable for the job that indicates the Pegasus scratch directory the job is executed in. This is the directory that is created by the create dir job on the execution site for the workflow.

   Associated JIRA item PM-961 [\#1078](https://github.com/pegasus-isi/pegasus/issues/1078)

6) New properties to control read timeout while setting up connections to the database

   Users can now set pegasus.catalog.*.timeout to set the timeout value in seconds. This should be set only if you encounter database locked errors for your installation.

   Associated JIRA item PM-943 [\#1060](https://github.com/pegasus-isi/pegasus/issues/1060)

7) Ability to prepend to the system PATH before launching an application executable

   Users can now associate an env profile named KICKSTART_PREPEND_PATH with their jobs to specify the PATH where application specific modules are installed. kickstart will take this value and prepend it to the system PATH before launching the executable (setting the profile from the Python DAX API is sketched below).

   Associated JIRA item PM-957 [\#1074](https://github.com/pegasus-isi/pegasus/issues/1074)
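   A minimal sketch of attaching this profile with the Python DAX API (DAX3); the job name and path below are made up for illustration:

   ```python
   # Sketch of attaching the KICKSTART_PREPEND_PATH env profile with the
   # Python DAX API (DAX3). Job name and path are illustrative only.
   from Pegasus.DAX3 import ADAG, Job, Profile, Namespace

   dax = ADAG("prepend-path-example")

   job = Job(name="myapp")
   # kickstart prepends this value to the system PATH before launching myapp
   job.addProfile(Profile(Namespace.ENV, "KICKSTART_PREPEND_PATH",
                          "/opt/myapp/modules/bin"))
   dax.addJob(job)

   with open("prepend-path-example.dax", "w") as f:
       dax.writeXML(f)
   ```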
8) environment variables in condor submit files are specified using the newer condor syntax

   For GLite jobs the environment is specified using the key +remote_environment. For all other jobs, the environment is specified using the environment key, but the value is in the newer format (i.e., key=value pairs separated by whitespace).

   Associated JIRA item PM-934 [\#1051](https://github.com/pegasus-isi/pegasus/issues/1051)

9) pass options to pegasus-monitord via properties

   Users can now specify pegasus.monitord.arguments to pass extra options with which pegasus-monitord is launched for the workflow at runtime.

   Associated JIRA item PM-948 [\#1065](https://github.com/pegasus-isi/pegasus/issues/1065)

10) pegasus-transfer supports OSG stashcp

   pegasus-transfer has support for the latest version of stashcp.

   Associated JIRA item PM-948 [\#1065](https://github.com/pegasus-isi/pegasus/issues/1065)

11) pegasus-dashboard improvements

   pegasus-dashboard now loads the directory listing via AJAX calls, which makes the loading of the workflow details page much faster for large workflows. It shows the working directory for a job instance and invocation on the job details and invocation details pages. It displays an appropriate error message if a pegasus-db-admin update of a database fails, and an HTML error page was added for DB migration errors. Logging is configured so that Flask log messages show up in the Apache logs.

   Associated JIRA item PM-940 [\#1057](https://github.com/pegasus-isi/pegasus/issues/1057)

12) PEGASUS_SITE environment variable is set in the job's environment

   PM-907 [\#1025](https://github.com/pegasus-isi/pegasus/issues/1025)

#### Bugs Fixed

1) InPlace cleanup failed if an intermediate file used as input had its transfer flag set to false

   If an intermediate file (an output file generated by a parent job) was used as an input file to a child job with the transfer flag set to false, then the associated cleanup job did not have a dependency on the child job. As a result, the cleanup job could run before the child job that required the file as input. This is now fixed.

   PM-969 [\#1086](https://github.com/pegasus-isi/pegasus/issues/1086)

2) Incorrect (malformed) rescue dag submitted in case the planner dies because of memory related issues

   For hierarchical workflows, if a sub workflow fails, a rescue dag for the sub workflow gets submitted on the job retry. The .dag file for the sub workflow is generated by the planner. If the planner failed during code generation, an incomplete .dag file could be submitted. This is now fixed: the planner now writes the dag to a tmp file, and renames it to the .dag extension only when code generation is complete (the write-then-rename pattern is sketched below).

   PM-966 [\#1083](https://github.com/pegasus-isi/pegasus/issues/1083)
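   The idea behind the fix is the classic atomic write pattern. This is not the planner's actual (Java) code, just a Python sketch of the technique under the assumption of a POSIX filesystem, where rename within a directory is atomic:

   ```python
   # Sketch of the write-then-rename pattern described above. Writing to a
   # temp file in the same directory and renaming it means readers (here,
   # DAGMan) only ever see a complete .dag file.
   import os
   import tempfile

   def write_dag_atomically(dag_path, contents):
       dir_name = os.path.dirname(os.path.abspath(dag_path))
       fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
       try:
           with os.fdopen(fd, "w") as f:
               f.write(contents)          # an interrupted write leaves only the .tmp file
           os.rename(tmp_path, dag_path)  # atomic within a single filesystem
       except BaseException:
           os.unlink(tmp_path)            # never leave a half-written .dag behind
           raise

   write_dag_atomically("workflow.dag", "JOB A A.sub\n")
   ```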
3) Mismatched memory units in kickstart records

   kickstart now reports all memory values in KB. Earlier, the procs element in the machine entry was reporting the value in bytes, while the maxrss etc. values in the usage elements were in KB. This is now fixed.

   PM-959 [\#1076](https://github.com/pegasus-isi/pegasus/issues/1076)

4) pegasus-analyzer did not work for sub workflows

   There was a bug in the 4.5.0 release where pegasus-analyzer did not pick up the stampede database for the sub workflows correctly. This is now fixed.

   PM-956 [\#1073](https://github.com/pegasus-isi/pegasus/issues/1073)

5) Rescue DAGs not submitted correctly for dag jobs

   There was a bug in the 4.5.0 release in the way the .dag.condor.sub file was generated. As a result of that, the force option was propagated for the dag jobs in the DAX (dag jobs are sub workflows that are not planned by Pegasus).

   PM-949 [\#1066](https://github.com/pegasus-isi/pegasus/issues/1066)

6) nonsharedfs configuration did not work with GLite style submissions

   In the case of nonsharedfs, transfer_executable is set to true to transfer the PegasusLite script. However, in the GLite case, that was explicitly disabled, which was preventing the workflows from running successfully.

   PM-950 [\#1067](https://github.com/pegasus-isi/pegasus/issues/1067)

7) pegasus-analyzer catches the error for a wrong directory instead of listing the traceback

   PM-946 [\#1063](https://github.com/pegasus-isi/pegasus/issues/1063)

8) pegasus-gridftp fails with: Invalid keyword "POSTALCODE"

   pegasus-gridftp was failing against the XSEDE site stampede because of a change in certificates at TACC. This was fixed by updating to the latest jglobus jars.

   PM-945 [\#1062](https://github.com/pegasus-isi/pegasus/issues/1062)

9) pegasus-statistics deletes directories even if the -o option is specified

   By default, pegasus-statistics deletes the statistics directory in which the statistics files are generated. However, this had the side effect of deleting user specified directories set by the -o option. That is no longer the case.

   PM-932 [\#1049](https://github.com/pegasus-isi/pegasus/issues/1049)

10) pegasus-exitcode ignores errors when it gets "-r 0"

   When the -r option is specified, pegasus-exitcode now ignores only the invocation record exitcodes, but still performs all the other specified checks.

   PM-927 [\#1044](https://github.com/pegasus-isi/pegasus/issues/1044)

11) pegasus-statistics displays a workflow not found error instead of throwing an SQLAlchemy error

   This also happens if pegasus-db-admin creates an empty workflow database for a new workflow and nothing is populated (because events population is turned off).

   Associated JIRA item PM-942 [\#1059](https://github.com/pegasus-isi/pegasus/issues/1059)

12) InPlace cleanup did not work correctly with multisite runs

   InPlace cleanup did not work correctly with inter-site transfer jobs. This is now fixed.

   PM-936 [\#1053](https://github.com/pegasus-isi/pegasus/issues/1053)

### Pegasus 4.5.0

**Release Date:** May 5, 2015

We are happy to announce the release of Pegasus 4.5.0. Pegasus 4.5.0 is a major release of Pegasus and includes all the bug fixes and improvements in the minor releases 4.4.1 and 4.4.2.

New features and improvements in 4.5.0 are

- ensemble manager for managing collections of workflows
- support for job checkpoint files
- support for Google storage
- improvements to pegasus-dashboard
- data management improvements
- new tools pegasus-db-admin, pegasus-submitdir, pegasus-halt and pegasus-graphviz

A migration guide is available at http://pegasus.isi.edu/wms/docs/4.5.0/useful_tips.php#migrating_from_leq44

#### NEW FEATURES

1) Ensemble manager for managing collections of workflows

   The ensemble manager is a service that manages collections of workflows called ensembles. The ensemble manager is useful when you have a set of workflows you need to run over a long period of time. It can throttle the number of concurrent planning and running workflows, and plan and run workflows in priority order. A typical use-case is a user with 100 workflows to run, who needs no more than one to be planned at a time, and no more than two to be running concurrently. The ensemble manager also allows workflows to be submitted and monitored programmatically through its RESTful interface. Details about the ensemble manager can be found at https://pegasus.isi.edu/wms/docs/4.5.0/service.php
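   As a purely illustrative sketch of programmatic submission over that RESTful interface: the port, endpoint path and payload fields below are hypothetical, not the documented API; consult the service documentation linked above for the actual endpoints.

   ```python
   # Hypothetical sketch of driving the ensemble manager's RESTful
   # interface from Python. BASE, the /ensembles path, and the payload
   # fields are illustrative assumptions, not the documented API.
   import json
   import urllib.request

   BASE = "http://localhost:5000"  # assumed pegasus-service address

   def post(path, payload):
       req = urllib.request.Request(
           BASE + path,
           data=json.dumps(payload).encode(),
           headers={"Content-Type": "application/json"},
       )
       with urllib.request.urlopen(req) as resp:
           return resp.read()

   # Create an ensemble throttled to 1 concurrent planning and 2 running
   # workflows, matching the use-case described above.
   post("/ensembles", {"name": "myruns", "max_planning": 1, "max_running": 2})
   ```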
2) Support for Google Storage

   Pegasus now supports running workflows in the Google cloud. When running workflows in the Google cloud, users can specify Google storage to act as the staging site. More details on how to configure Pegasus to use Google storage can be found at pegasus.isi.edu/wms/docs/4.5.0/cloud.php#google_cloud. All the Pegasus auxiliary clients (pegasus-transfer, pegasus-create-dir and pegasus-cleanup) were updated to handle Google storage URLs (starting with gs://). The tools call out to the Google command line tool gsutil.

3) Support for job checkpoint files

   Pegasus now supports checkpoint files created by jobs. This allows users to run long running jobs (where the runtime of a job exceeds the maxwalltime supported on a compute site) to completion, provided the jobs generate a checkpoint file periodically. To use this, checkpoint files need to be specified for the jobs in the DAX with their link set to checkpoint. Additionally, the jobs need to specify the pegasus profile checkpoint.time, which indicates the number of minutes after which pegasus-kickstart sends a TERM signal to the job, signalling it to start the generation of the checkpoint file (a sketch using the Python DAX API follows below). Details can be found in the userguide at https://pegasus.isi.edu/wms/docs/4.5.0/transfer.php#staging_job_checkpoint_files
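   A minimal sketch with the Python DAX API (DAX3), assuming the checkpoint link type is exposed as Link.CHECKPOINT; the job and file names are made up for illustration:

   ```python
   # Sketch of declaring a checkpoint file with the Python DAX API (DAX3).
   # Link.CHECKPOINT is assumed to be the constant for the checkpoint link
   # type described above; names are illustrative only.
   from Pegasus.DAX3 import ADAG, File, Job, Link, Profile, Namespace

   dax = ADAG("checkpoint-example")

   job = Job(name="simulate")
   chk = File("simulate.chk")
   job.uses(chk, link=Link.CHECKPOINT)  # declare simulate.chk as the job's checkpoint file

   # After 30 minutes, pegasus-kickstart sends the job a TERM signal,
   # telling it to start writing its checkpoint file.
   job.addProfile(Profile(Namespace.PEGASUS, "checkpoint.time", "30"))
   dax.addJob(job)

   with open("checkpoint-example.dax", "w") as f:
       dax.writeXML(f)
   ```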
4) Pegasus Dashboard Improvements

   The Pegasus dashboard can now be deployed in multiuser mode. It is now started by the pegasus-service command. Instructions for starting the pegasus service can be found at https://pegasus.isi.edu/wms/docs/4.5.0/service.php#idp2043968

   The look and feel of the dashboard has changed. Users can now track all job instances (retries) of a job through the dashboard; earlier it was only the latest job retry. There is a new tab called failing jobs on the workflows page, which lists jobs that have failed at least once and are currently being retried. The submit host is displayed on the workflow's main page. The job details page now shows information about the host where the job ran, and all the states that the job has gone through. The dashboard also has a file browser that allows users to view files in the workflow submit directory directly from the dashboard.

5) Data configuration is now supported per site

   Starting with the 4.5.0 release, users can associate the pegasus profile key data.configuration per site in the site catalog to specify the data configuration mode (sharedfs, nonsharedfs or condorio) to use for jobs executed on that site. Earlier this was a global configuration that applied to the whole workflow and had to be specified in the properties file.

   More details at PM-810 [\#928](https://github.com/pegasus-isi/pegasus/issues/928)

6) Support for sqlite JDBCRC

   Users can now specify a sqlite backend for their JDBCRC replica catalog. To create the database for the sqlite based replica catalog, use the pegasus-db-admin command:

       pegasus-db-admin create jdbc:sqlite:/shared/jdbcrc.db

   To set up Pegasus to use the sqlite JDBCRC, set the following properties:

       pegasus.catalog.replica JDBCRC
       pegasus.catalog.replica.db.driver sqlite
       pegasus.catalog.replica.db.url jdbc:sqlite:/shared/jdbcrc.db

   Users can use the tool pegasus-rc-client to insert, query and delete entries from the catalog.

7) New database management tool called pegasus-db-admin

   Depending on configuration, Pegasus can refer to three different types of databases during the various stages of workflow planning and execution:

   - master - usually a sqlite database located at $HOME/.pegasus/workflow.db. This is always populated by pegasus-monitord and is used by pegasus-dashboard to track a user's top level workflows.
   - workflow - usually a sqlite database created by pegasus-monitord in the workflow submit directory. This contains detailed information about the workflow execution.
   - jdbcrc - present if a user has configured a JDBCRC replica catalog.

   The tool is automatically invoked by the planner to check for compatibility, and updates the master database if required. The jdbcrc is checked if a user has it configured at planning time, or when using the pegasus-rc-client command line tool. Users should use this tool when setting up new database catalogs, or to check for compatibility. For more details refer to the migration guide at https://pegasus.isi.edu/wms/docs/4.5.0cvs/useful_tips.php#migrating_from_leq44

8) pegasus-kickstart allows for system call interposition

   pegasus-kickstart has new options -z and -Z that are enabled on Linux platforms. When enabled, pegasus-kickstart captures information about the files opened and the I/O performed by user applications, and includes it in the proc section of its output. The -z flag causes kickstart to use ptrace() to intercept system calls and report a list of files accessed and I/O performed. The -Z flag causes kickstart to use LD_PRELOAD to intercept library calls and report a list of files accessed and I/O performed.

9) pegasus-kickstart now captures condor job ids and LRMS job ids

   pegasus-kickstart now captures both the condor job id and the local LRMS job id (the LRMS being the system through which the job is executed) in the invocation record for the job.

   PM-866 [\#984](https://github.com/pegasus-isi/pegasus/issues/984) PM-867 [\#985](https://github.com/pegasus-isi/pegasus/issues/985)

10) pegasus-transfer has support for SSHFTP

   pegasus-transfer now has support for GridFTP over SSH. More details at https://pegasus.isi.edu/wms/docs/4.5.0/transfer.php#idp17066608

11) pegasus-s3 has support for bulk deletes

   pegasus-s3 now supports batched deletion of keys from an S3 bucket. This improves the performance of deleting keys from a large bucket.

   PM-791 [\#909](https://github.com/pegasus-isi/pegasus/issues/909)

12) DAGMan metrics reporting enabled

   Pegasus workflows now have DAGMan metrics reporting turned on. Details on the Pegasus usage tracking policy can be found at https://pegasus.isi.edu/wms/docs/4.5.0/usage_statistics.php. As part of this effort, the planner now invokes condor_submit_dag at planning time to generate the DAGMan submit file, which is then modified to enable metrics reporting.

   More details at PM-797 [\#915](https://github.com/pegasus-isi/pegasus/issues/915)

13) Planner reports file distribution counts in metrics report

   The planner now reports file distribution counts (the number of input, intermediate and output files) in its metrics report.
14) Notion of scope for data reuse

   Users can now enable partial data reuse, where only the output files of certain jobs are checked for existence in the replica catalog to trigger data reuse. Three scopes are supported:

   - full - full data reuse, as implemented in 4.4
   - none - no data reuse, i.e. the same as the --force option to the planner
   - partial - only certain jobs (those that have the pegasus profile key enable_for_data_reuse set to true) are checked for presence of their output files in the replica catalog

15) New tool called pegasus-submitdir

   There is a new tool called pegasus-submitdir that allows users to archive, extract, move and delete a workflow submit directory. The tool ensures that the master database (usually in $HOME/.pegasus/workflow.db) is updated accordingly.

16) New tool called pegasus-halt

   There is a new tool called pegasus-halt that allows users to gracefully halt running workflows. The tool places DAGMan .halt files (http://research.cs.wisc.edu/htcondor/manual/v8.2/2_10DAGMan_Applications...) for all dags in a workflow.

   More details at PM-702 [\#820](https://github.com/pegasus-isi/pegasus/issues/820)

17) New tool called pegasus-graphviz

   Pegasus now has a tool called pegasus-graphviz that allows you to visualize DAX and DAG files. It creates a dot file as output.

18) New canonical executable pegasus-mpi-keg

   There is a new executable called pegasus-mpi-keg that can be compiled from source. It is useful for creating synthetic workflows containing MPI jobs. It is similar to pegasus-keg and accepts the same command line arguments; the only difference is that it is MPI code.

19) Change in default values

   - pegasus-transfer now launches a maximum of 8 threads to manage the transfers of multiple files.
   - The default number of job retries in case of failure is now 1 instead of 3.
   - The time after which a job in the HELD state is removed has been reduced from 1 hour to 30 minutes.

20) Support for DAGMan ABORT-DAG-ON feature

   Pegasus now supports a dagman profile key named ABORT-DAG-ON that can be associated with a job. Such a job can then cause the whole workflow to be aborted if it fails or exits with a specific value.

   More details at PM-819 [\#937](https://github.com/pegasus-isi/pegasus/issues/937)

21) Deprecated pool attribute in replica catalog

   Users can now associate a site attribute in their file based replica catalogs to indicate the site where a file resides. The old attribute pool has been deprecated.

   More details at PM-813 [\#931](https://github.com/pegasus-isi/pegasus/issues/931)

22) Support for pegasus profile glite.arguments

   Users can now specify a pegasus profile key glite.arguments that gets added to the corresponding PBS qsub file generated by the GLite layer in HTCondor. For example, you can set the value to "-N testjob -l walltime=01:23:45 -l nodes=2". This will get translated to the following in the PBS file:

       #PBS -N testjob -l walltime=01:23:45 -l nodes=2

   The values specified for this profile override any other conflicting directives that are created on the basis of the globus profiles associated with the jobs (setting the profile from the Python DAX API is sketched below).

   More details at PM-880 [\#998](https://github.com/pegasus-isi/pegasus/issues/998)
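   A minimal sketch of attaching this profile with the Python DAX API (DAX3), reusing the example value from the text; the job name is made up for illustration:

   ```python
   # Sketch of attaching the glite.arguments pegasus profile with the
   # Python DAX API (DAX3). Job name is illustrative only.
   from Pegasus.DAX3 import ADAG, Job, Profile, Namespace

   dax = ADAG("glite-arguments-example")

   job = Job(name="testjob")
   # Rendered by the GLite layer as:
   #   #PBS -N testjob -l walltime=01:23:45 -l nodes=2
   job.addProfile(Profile(Namespace.PEGASUS, "glite.arguments",
                          "-N testjob -l walltime=01:23:45 -l nodes=2"))
   dax.addJob(job)

   with open("glite-arguments-example.dax", "w") as f:
       dax.writeXML(f)
   ```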
23) Reorganized documentation

   The userguide has been reorganized to make it easier for users to identify the right chapter to navigate to. The configuration documentation has been streamlined and put into a single chapter, rather than having separate chapters for profiles and properties.

24) Support for hints namespace

   Users can now specify the following hints profile keys to control the behavior of the planner:

   - execution.site - the execution site where a job should execute
   - pfn - the path to the remote executable that is picked up
   - grid.jobtype - the job type to be used while selecting the grid gateway / jobmanager for the job

   More details at PM-828 [\#946](https://github.com/pegasus-isi/pegasus/issues/946)

25) Added support for HubZero Distribute job wrapper

   Added support for the HubZero specific job launcher Distribute, which submits jobs to a remote PBS cluster. The compute jobs are set up by Pegasus to run in the local universe, and are wrapped with the Distribute job wrapper, which takes care of the submission and monitoring of the job.

   More details at PM-796 [\#914](https://github.com/pegasus-isi/pegasus/issues/914)

26) New classad populated for dagman jobs

   Pegasus now populates a +pegasus_execution_sites classad in the dagman submit file. The value is the list of execution sites for which the workflow was planned.

   More details at PM-846 [\#964](https://github.com/pegasus-isi/pegasus/issues/964)

27) Python DAX API now bins files by link type when rendering the workflow

   The Python DAX API now groups a job's files by their link type before rendering them to XML. This improves the readability of the generated DAX.

   More details at PM-874 [\#992](https://github.com/pegasus-isi/pegasus/issues/992)

28) Better demarcation of various stages in PegasusLite logs

   In the PegasusLite modes, the job's .err file captures the logs from the PegasusLite wrapper that launches user jobs on remote nodes. This log is now clearly demarcated to identify the various stages of job execution performed by PegasusLite.

29) Dropped support for Globus RLS replica catalog backends

30) pegasus-plots is deprecated and will be removed in 4.6

#### Bugs Fixed

1) Fixed kickstart handling of environment variables with quotes

   If an environment variable had quotes, invalid XML output was produced by pegasus-kickstart. This is now fixed.

   More details at PM-807 [\#925](https://github.com/pegasus-isi/pegasus/issues/925)

2) Leaking file descriptors for two stage transfers

   pegasus-transfer opens a temp file for each two stage transfer it has to execute. It was not closing them explicitly.

3) Disabling of chmod jobs triggered an exception

   Disabling the chmod jobs results in the creation of noop jobs instead of the chmod jobs. However, that resulted in planner exceptions when adding create dir and leaf cleanup nodes. This is now fixed.

   More details at PM-845 [\#963](https://github.com/pegasus-isi/pegasus/issues/963)

4) Incorrect binning of file transfers amongst transfer jobs

   By default, the planner only considered the destination URL of a transfer pair to determine whether the associated transfer job has to run locally on the submit host or on the remote staging site. However, this logic broke when a user had input files catalogued in the replica catalog with file URLs for files on both the submit host and remote execution sites. The logic has now been updated to also take the source URLs into account.

   More details at PM-829 [\#947](https://github.com/pegasus-isi/pegasus/issues/947)

5) Pegasus auxiliary jobs are never launched with the pegasus-kickstart invoke capability

   For compute jobs with long command line arguments, the planner triggers the pegasus-kickstart invoke capability in addition to the -w option. However, this cannot be applied to Pegasus auxiliary jobs, as it interferes with the credential handling.

   More details at PM-851 [\#969](https://github.com/pegasus-isi/pegasus/issues/969)
6) Everything in the remote job directory gets staged back in condorio mode if a job has no output files

   If a job had no output files associated with it in the DAX, then in the condorio data configuration mode the planner added an empty value for the classad key transfer_output_files in the job submit file. This resulted in Condor staging all the inputs (all the contents of the remote job directory) back to the submit host. This is now fixed: the planner now adds a special key +TransferOutput="" that prevents Condor from staging everything back.

   More details at PM-820 [\#938](https://github.com/pegasus-isi/pegasus/issues/938)

7) Setting multiple strings for exitcode.successmsg and exitcode.failuremsg

   Users can now specify multiple pegasus profiles with the key exitcode.successmsg or exitcode.failuremsg. Each value gets translated to a corresponding -s or -f argument to the pegasus-exitcode invocation for the job (setting these profiles from the Python DAX API is sketched after this list).

   More details at PM-826 [\#944](https://github.com/pegasus-isi/pegasus/issues/944)

8) pegasus-monitord failed when the submission of a job fails

   The events SUBMIT_FAILED, GRID_SUBMIT_FAILED and GLOBUS_SUBMIT_FAILED were not handled correctly by pegasus-monitord. As a result, subsequent event insertions for the job resulted in integrity errors. This is now fixed.

   More details at PM-877 [\#995](https://github.com/pegasus-isi/pegasus/issues/995)
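Referring back to item 7 above, a minimal sketch of declaring several success and failure messages with the Python DAX API (DAX3), assuming the DAX API accepts repeated profiles with the same key, which is what this fix enables; the job name and message strings are made up for illustration:

```python
# Sketch of attaching multiple exitcode.successmsg / exitcode.failuremsg
# pegasus profiles with the Python DAX API (DAX3). Assumes repeated
# profiles with the same key are allowed, per item 7 above; each value
# becomes a -s or -f argument to pegasus-exitcode. Names are illustrative.
from Pegasus.DAX3 import ADAG, Job, Profile, Namespace

dax = ADAG("exitcode-example")

job = Job(name="analysis")
job.addProfile(Profile(Namespace.PEGASUS, "exitcode.successmsg", "RESULTS WRITTEN"))
job.addProfile(Profile(Namespace.PEGASUS, "exitcode.successmsg", "RUN COMPLETE"))
job.addProfile(Profile(Namespace.PEGASUS, "exitcode.failuremsg", "SEGFAULT"))
dax.addJob(job)

with open("exitcode-example.dax", "w") as f:
    dax.writeXML(f)
```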