Pegasus 4.9.0 Released

We are pleased to announce the release of Pegasus 4.9.0.

Pegasus 4.9.0 is a major release. Highlights of the new features:

  • Integrity Checking – Pegasus now performs integrity checking on files in a workflow for non-shared filesystem deployments. More details can be found in the documentation. This work is the result of the Scientific Workflow Integration with Pegasus (SWIP) project (NSF awards 1642070, 1642053, and 1642090). More information on SWIP is available at https://cacr.iu.edu/projects/swip/
  • AWS Batch – Pegasus now provides a way to execute horizontally clustered jobs on the Amazon AWS Batch service using a new command line tool, pegasus-aws-batch. In other words, you can get Pegasus to cluster each level of your workflow into a bag of tasks and run those clustered jobs on the Amazon cloud using AWS Batch. More details can be found in the documentation at https://pegasus.isi.edu/docs/4.9.0/aws-batch.php .
  • Pegasus Tutorial With Containers – We have a version of the tutorial for prospective new users to try out at https://pegasus.isi.edu/tutorial/isi/ . This tutorial walks users through how to bundle their application code in containers and use them for execution with Pegasus.
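
To make "cluster each level of your workflow into a bag of tasks" concrete, here is a minimal Python sketch of the idea. It is not the Pegasus planner's code: the function names, the `cluster_size` parameter, and the sample workflow are all hypothetical, and the sketch considers only a job's level in the DAG, ignoring everything else the planner takes into account.

```python
from collections import defaultdict


def job_levels(children):
    """Map each job to its level: 0 for roots, else 1 + its deepest parent."""
    jobs = set(children) | {c for cs in children.values() for c in cs}
    parents = {job: set() for job in jobs}
    for job, cs in children.items():
        for child in cs:
            parents[child].add(job)

    level = {}

    def depth(job):
        # Memoized recursion over parents; assumes the graph is acyclic.
        if job not in level:
            level[job] = 1 + max((depth(p) for p in parents[job]), default=-1)
        return level[job]

    for job in jobs:
        depth(job)
    return level


def cluster_by_level(children, cluster_size):
    """Group the jobs on each level into bags of at most cluster_size tasks."""
    by_level = defaultdict(list)
    for job, lvl in job_levels(children).items():
        by_level[lvl].append(job)

    clusters = {}
    for lvl in sorted(by_level):
        jobs = sorted(by_level[lvl])
        clusters[lvl] = [
            jobs[i:i + cluster_size] for i in range(0, len(jobs), cluster_size)
        ]
    return clusters


# Hypothetical workflow: one split job, three workers, one merge job.
workflow = {
    "split": ["work1", "work2", "work3"],
    "work1": ["merge"],
    "work2": ["merge"],
    "work3": ["merge"],
    "merge": [],
}
clusters = cluster_by_level(workflow, cluster_size=2)
```

In this sketch, each inner list stands for one clustered job that would be handed to a batch service as a unit.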

The release can be downloaded from
https://pegasus.isi.edu/downloads

Please note that the Perl DAX API is deprecated as of the 4.9.0 release and will be removed in the 5.0 release.

An exhaustive list of features, improvements, and bug fixes can be found below.

New Features

  1. [PM-1179] – Integrity checking in Pegasus #1293
  2. [PM-1228] – Integrate Pegasus with AWS Batch #1342
  3. [PM-1167] – Dashboard should explain why condor submission failed #1281
  4. [PM-1233] – update pyopen ssl to 0.14 or higher #1347
  5. [PM-1265] – monitord to send events from job stdout #1379
  6. [PM-1277] – pegasus workflows on OSG should appear in OSG gratia #1391
  7. [PM-1280] – incorporate container based example in pegasus-init #1394
  8. [PM-1283] – tutorial vm should be updated to be able to run the containers tutorial #1397
  9. [PM-1289] – ability to filter events sent to AMQP endpoint. #1403
  10. [PM-1298] – symlinking when running with containers #1412
  11. [PM-1310] – integrity checking documentation #1424
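
As a rough illustration of what integrity checking (PM-1179 above) involves, the sketch below records a checksum for a file and later verifies the file against it. This is a generic example, not the code behind pegasus-integrity-check: the helper names are hypothetical, and SHA-256 is assumed here as the checksum algorithm.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """True if the file's current checksum matches the recorded one."""
    return sha256_of(path) == expected


# Demo on a temporary file (hypothetical data, not a real workflow file).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "input.dat")
    with open(path, "wb") as f:
        f.write(b"workflow data")

    recorded = sha256_of(path)          # checksum recorded when the file is created
    ok_before = verify(path, recorded)  # file unchanged, so the check passes

    with open(path, "ab") as f:         # simulate corruption during transfer
        f.write(b"!")
    ok_after = verify(path, recorded)   # checksum no longer matches
```

A job that consumes the file would be expected to fail when a check like `verify` returns false, rather than run on corrupted input.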

Improvements

  1. [PM-1288] – pegasus.project profile key is not set for SLURM submissions #1402
  2. [PM-898] – Modify monitord so that it can send AMQP messages AND populate a DB. #1016
  3. [PM-1243] – Clusters of size 1 should be allowed when using AWS Batch and Horizontal Clustering #1357
  4. [PM-1244] – analyzer is not showing location of submit file #1358
  5. [PM-1248] – Planning fails in shared-fs mode with cryptic message. #1362
  6. [PM-1258] – Update AMQP support for Panorama #1372
  7. [PM-1270] – transformation selectors should prefer the site assigned by site selector #1384
  8. [PM-1285] – Exporting environmental variables in containers with the same technique. #1399
  9. [PM-1294] – update flask dependency #1408
  10. [PM-1299] – Consider capturing which API was used to generate the DAX in metrics. #1413
  11. [PM-1308] – changes to database connection code #1422
  12. [PM-1313] – allow for direct use of singularity containers on CVMFS on compute site #1427

Tasks

  1. [PM-1284] – pegasus-mpi-cluster’s test PM848 fails on Debian 10 #1398
  2. [PM-901] – site for OSG #1019
  3. [PM-1200] – update pegasus lite mode to support singularity #1314
  4. [PM-1231] – support pegasus-aws-batch as clustering executable #1345
  5. [PM-1246] – add manpage for pegasus-aws-batch #1360
  6. [PM-1250] – update handling for raw input files for integrity checking #1364
  7. [PM-1251] – pegasus-transfer to checksum files #1365
  8. [PM-1252] – enforce integrity checking for stage-out jobs #1366
  9. [PM-1254] – handling of checkpoint files for integrity checking #1368
  10. [PM-1257] – propagate checksums in hierarchal workflows #1371
  11. [PM-1260] – Integrity data in Stampede / statistics #1374
  12. [PM-1271] – update database schema and pegasus-db-admin to include integrity meta table #1385
  13. [PM-1272] – pegasus-integrity-check to log integrity data in monitord parseable event #1386
  14. [PM-1273] – pegasus-integrity-check to check for multiple files #1387
  15. [PM-1274] – update monitord to populate integrity events #1388
  16. [PM-1279] – Show integrity metrics in dashboard statistics page #1393
  17. [PM-1295] – pegasus should be able to report integrity errors #1409
  18. [PM-1296] – update database schema with tag table #1410
  19. [PM-1302] – pegasus-transfer should expose option for verify source existent for symlinks #1416
  20. [PM-1303] – update jupiter tc api for container mount keyword #1417
  21. [PM-1304] – pegasus-transfer should support docker->singularity pulls #1418
  22. [PM-1305] – Integrity checking breaks for symlinks in containers with different mount points #1419
  23. [PM-1306] – dials for integrity checking #1420
  24. [PM-1311] – Update DAX generators to generate workflow metadata key dax.api #1425
  25. [PM-1312] – track the dax_api key in the normalized database schema for metrics server #1426

Bugs Fixed

  1. [PM-1219] – Fix Foreign key constraint in workflow_files table #1333
  2. [PM-1221] – source tar balls have .git files #1335
  3. [PM-1222] – condor dagman does not allow . in job names #1336
  4. [PM-1223] – Python DAX API available via PIP #1337
  5. [PM-1224] – sub workflow planning pegasus lite prescript does not associate credentials #1338
  6. [PM-1225] – hierarchal workflows planning in sharedfs fails with worker package staging set #1339
  7. [PM-1226] – hierarchal workflow with worker package staging fails #1340
  8. [PM-1234] – bucket creation for S3 as staging site fails #1348
  9. [PM-1245] – Blank space in remote_environment variable generates blank export command #1359
  10. [PM-1249] – Build fails against newer PostgreSQL #1363
  11. [PM-1253] – Planner should complain for same file designated as input and output #1367
  12. [PM-1255] – Singularity 2.4.(2?) pull cli has changed #1369
  13. [PM-1256] – rc-client does not strip quotes from PFN while populating #1370
  14. [PM-1261] – PMC .in files are not generated into the 00/00 pattern folder. #1375
  15. [PM-1263] – Invalid raise statement in Python #1377
  16. [PM-1266] – Jupyter API does not only plans the workflow without submitting it #1380
  17. [PM-1275] – kickstart build fails on arm64 #1389
  18. [PM-1276] – unbounded recursion in database loader for dashboard in monitord #1390
  19. [PM-1281] – pegasus-analyzer does not show task stdout/stderr for held jobs #1395
  20. [PM-1282] – pegasus.runtime doesn’t affect walltime #1396
  21. [PM-1287] – python3 added as a rpm dep, even though it is not a true dep #1401
  22. [PM-1290] – Queue is not handled correctly in Glite #1404
  23. [PM-1291] – DB integrity errors in 4.9.0dev #1405
  24. [PM-1292] – incomplete kickstart output #1406
  25. [PM-1293] – stage-in jobs always have generate_checksum set to true if checksums are not present in RC #1407
  26. [PM-1297] – monitord does not prescript log in case of planning failure for sub workflows #1411
  27. [PM-1300] – Planning error when image is of type singularity and source is docker hub #1414
  28. [PM-1301] – Python exception raised when using MySQL for stampede DB #1415
  29. [PM-1307] – some of the newly generated events are missing timestamp #1421
  30. [PM-1314] – incorrect way of computing suffix for singularity images #1428
  31. [PM-1315] – Exception when data reuse removes all jobs. #1429
  32. [PM-1316] – Transformation that uses a regular file, but defines it in RC raises an exception, unlike in older versions #1430
  33. [PM-1317] – Pegasus plan fails on container task clustering #1431
