Pegasus 4.8.0 Released

We are pleased to announce the release of Pegasus 4.8.0.

Pegasus 4.8.0 is a major release of Pegasus and includes improvements and bug fixes to the 4.7.4 release.

Pegasus 4.8.0 adds support for:

  • Application Containers – Pegasus now supports containers for user applications. Both Docker and Singularity are supported. More details can be found in the documentation at https://pegasus.isi.edu/docs/4.8.0/containers.php
  • Jupyter Support – Pegasus now provides a Python API to declare and manage workflows via Jupyter, which allows workflow creation, execution, and monitoring. The API also provides mechanisms to create Pegasus catalogs (sites, replica, and transformation). More details can be found in the documentation at https://pegasus.isi.edu/docs/4.8.0/jupyter.php
  • Tuning of Transfer and Cleanup jobs – Pegasus now computes the number of transfer and cleanup jobs to add at each level of a workflow according to the number of jobs on that level. More details can be found in the documentation at https://pegasus.isi.edu/docs/4.8.0/data_transfers.php

The release can be downloaded from
http://pegasus.isi.edu/downloads

An exhaustive list of features, improvements, and bug fixes follows.

New Features and Improvements

  1. [PM-1159] – Support for containers #1273
  2. [PM-1177] – API for running Pegasus workflows via Jupyter #1291
  3. [PM-1191] – If available, use GFAL over globus-url-copy #1305

    JGlobus is no longer actively supported and is not in compliance with RFC 2818 (https://docs.globus.org/security-bulletins/2015-12-strict-mode). As a result, cleanup jobs using the pegasus-gridftp client would fail against servers that enforce strict mode. We have removed the pegasus-gridftp client and now use gfal clients, because globus-url-copy does not support removes. If gfal is not available, globus-url-copy is used for cleanup by writing out zero-byte files instead of removing them.

  4. [PM-1212] – new defaults for number of transfer and inplace jobs created #1326
  5. [PM-1134] – Capture the execution site information in pegasus lite #1248
  6. [PM-1109] – dashboard to display errors if a job is killed instead of exiting with non zero exitcode #1223
  7. [PM-1146] – There doesn’t seem to be a way to get a persistent URL for a workflow in dashboard #1260
  8. [PM-1155] – remote cleanup jobs should have file url’s if possible #1269
  9. [PM-1158] – Make DAX3 API compatible with Python 2.6+ and Python3+ #1272
  10. [PM-1161] – Update documentation of large databases for handling mysqldump: Error 2013 #1275
  11. [PM-1187] – make scheduler type case insensitive for grid gateway in site catalog #1301
  12. [PM-1165] – Update Transformation Catalog format to support containers #1279
  13. [PM-1166] – pegasus-transfer to support transfers from docker hub #1280
  14. [PM-1180] – update monitord to populate checksums #1294
  15. [PM-1183] – monitord plumbing for putting hooks for integrity checking #1297
  16. [PM-1188] – Add tool to integrity check transferred files #1302
  17. [PM-1190] – planner changes to enable integrity checking #1304
  18. [PM-1194] – update planner pegasus lite mode to support for docker container wrapper #1308
  19. [PM-1195] – update site selection to handle containers #1309
  20. [PM-1197] – handle symlinks for input files when launching job via container #1311
  21. [PM-1200] – update pegasus lite mode to support singularity #1314
  22. [PM-1201] – Transformation Catalog API should support the container keywords #1315
  23. [PM-1202] – Move catalog APIs into Pegasus.catalogs and develop standalone test cases independent from Jupyter #1316
  24. [PM-1210] – update pegasus-transfer to support transfers from singularity hub #1324
  25. [PM-1214] – Specifying environment for sites and containers #1328
  26. [PM-1215] – Document support for containers for 4.8 #1329
  27. [PM-1178] – kickstart to checksum output files #1292
  28. [PM-1220] – default app name for metrics server based on dax label #1334
  29. The documentation also describes how to set up pyglidein (a resource provisioner framework) from IceCube to provision resources for your workflow. More details can be found at https://pegasus.isi.edu/docs/4.8.0/pyglidein.php
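
The cleanup fallback described under PM-1191 above can be illustrated with a short sketch. This is not the code Pegasus ships; the function name cleanup_command and the injectable which parameter are hypothetical and exist only for illustration. It assumes the gfal client is installed as gfal-rm and that a zero-byte file is written by copying /dev/null over the remote URL.

```python
import shutil


def cleanup_command(url, which=shutil.which):
    """Return the argv a cleanup job could run for one remote file.

    Hypothetical helper: prefers a gfal client, falls back to
    globus-url-copy, and fails if neither is on the PATH.
    """
    if which("gfal-rm"):
        # gfal clients can delete the remote file directly.
        return ["gfal-rm", url]
    if which("globus-url-copy"):
        # globus-url-copy cannot remove files, so overwrite the
        # remote file with zero bytes instead.
        return ["globus-url-copy", "file:///dev/null", url]
    raise RuntimeError("no supported cleanup client found on PATH")
```

Passing a custom which function makes it possible to exercise both branches without either client being installed.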

Bugs Fixed

  1. [PM-1032] – Handle proper error message for non-standard python usage #1146
  2. [PM-1162] – Running pegasus-monitord replay created an unreadable database #1276
  3. [PM-1171] – Monitord regularly produces empty stderr and stdout files #1285
  4. [PM-1172] – pegasus-rc-client deletes all entries for a lfn #1286
  5. [PM-1173] – cleanup jobs failing against Titan gridftp server due to RFC 2818 compliance #1287
  6. [PM-1174] – monitord should maintain the permissions on ~/.pegasus/workflow.db #1288
  7. [PM-1176] – the job notifications on failure and success should have exitcode from kickstart file #1290
  8. [PM-1181] – monitord fails to exit if database is locked #1295
  9. [PM-1185] – destination in remote file transfers for inter site jobs points to directory #1299
  10. [PM-1189] – Making X86_64 the default arch in the site catalog #1303
  11. [PM-1193] – “pegasus-rc-client list” modifies rc.txt #1307
  12. [PM-1196] – pegasus-statistics is not generating jobs.txt for some large workflows #1310
  13. [PM-1207] – Investigate error message: Normalizing ‘4.8.0dev’ to ‘4.8.0.dev0’ #1321
  14. [PM-1208] – Improve database is locked error message #1322
  15. [PM-1209] – Analyzer gets confused about retry number in hierarchical workflows #1323
  16. [PM-1211] – DAX API should tell which lfn was a dup #1325
  17. [PM-1213] – pegasus creates duplicate source URLs for staged executables #1327
  18. [PM-1217] – monitord exits prematurely, when in dagman recovery mode #1331
  19. [PM-1218] – monitord replay against mysql with registration jobs #1332
