Pegasus 5.0 Released

We are happy to announce the release of Pegasus 5.0. Pegasus 5.0 is a major release of Pegasus and builds upon the beta version released a couple of months back. It also includes all features and bug fixes from the 4.9 branch. We invite our users to give it a try.

 

The release can be downloaded from:
https://pegasus.isi.edu/downloads

If you are an existing user, please carefully follow these instructions to upgrade.

Highlights of the Release

  1. Reworked Python API:

    This new API has been developed from the ground up so that, in addition to generating the abstract workflow and all the catalogs, it now allows you to plan, submit, monitor, analyze and generate statistics of your workflow. To use this new Python API, refer to the "Moving From DAX3 to Pegasus.api" guide at https://pegasus.isi.edu/docs/5.0.0/user-guide/migration.html#moving-from-dax3

  2. Adoption of YAML formats:

    With Pegasus 5.0, we are adopting YAML for the representation of all major catalogs. We have provided catalog converters so that you can convert your existing catalogs to the new formats. In 5.0, the following are now represented in YAML:

    • Abstract Workflow

    • Replica Catalog

    • Transformation Catalog

    • Site Catalog

    • Kickstart Provenance Records

  3. Python3 Support:

    • All Pegasus tools are Python 3 compliant

    • The 5.0 release requires Python 3 on the workflow submit node

    • Python pip packages for workflow composition and monitoring

  4. Default data configuration

    In Pegasus 5.0, the default data configuration has been changed to condorio. Up to the 4.9.x releases, the default configuration was sharedfs.
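
    If an existing workflow relied on the old default, one option is to pin the data configuration explicitly. The sketch below only assembles the text of a properties entry in plain Python. The property key pegasus.data.configuration is the one named in ticket PM-1485 in the list further down; the helper name render_properties and the set of accepted values are assumptions made for illustration, not part of any Pegasus tool.

```python
# Sketch: build a properties line that pins the data configuration.
# Assumption: render_properties is a hypothetical helper, not a Pegasus API.

VALID_DATA_CONFIGS = {"sharedfs", "condorio", "nonsharedfs"}


def render_properties(data_configuration: str) -> str:
    """Return properties-file text that sets pegasus.data.configuration."""
    if data_configuration not in VALID_DATA_CONFIGS:
        raise ValueError(f"unknown data configuration: {data_configuration!r}")
    return f"pegasus.data.configuration = {data_configuration}\n"


if __name__ == "__main__":
    # Print the line that would restore the pre-5.0 default.
    print(render_properties("sharedfs"), end="")
```

    The resulting line would go into the properties file that the planner reads for the workflow.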

  5. Zero configuration required to submit to local HTCondor pool

  6. Data Management Improvements

    • New output replica catalog that registers outputs including file metadata such as size and checksums

    • Ability to bypass staging of files at a per-file, executable and container level

    • Improved support for hierarchical workflows allows you to create data dependencies between sub workflow jobs and compute jobs

    • Support for staging of generated outputs to multiple output sites

    • Support for integrity checking of user executables and application containers in addition to data

    • Support for webdav transfers

    • Easier enabling of data reuse by specifying previous workflow submit directories using the --reuse option to pegasus-plan.

    • Stage-in transfer jobs are assigned priorities based on the number of child compute jobs. Details can be found in JIRA ticket PM-1385.
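
    To make the last bullet concrete, here is a minimal sketch of ranking stage-in jobs by the number of compute jobs that depend on them. The exact formula the planner uses is not described in these notes, so using the raw child count as the priority is an assumption, and the function and job names are hypothetical.

```python
# Sketch: prioritize stage-in jobs by their number of child compute jobs.
# Assumption: priority == child count; the real planner formula may differ.
from typing import Dict, List


def stagein_priorities(children: Dict[str, List[str]]) -> Dict[str, int]:
    """Map each stage-in job to the number of distinct child compute jobs."""
    return {job: len(set(kids)) for job, kids in children.items()}


def submission_order(children: Dict[str, List[str]]) -> List[str]:
    """Order stage-in jobs from highest to lowest priority (ties by name)."""
    prios = stagein_priorities(children)
    return sorted(prios, key=lambda job: (-prios[job], job))
```

    With this ordering, a transfer that unblocks many compute jobs is scheduled ahead of one that unblocks only a single job.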

  7. New Jupyter Notebook Based Tutorial

    With this release, we are pleased to announce a brand new tutorial based on a Docker container running interactive Jupyter notebooks. You can access the tutorial at PM-1385 #1499

  8. Support for CWL (Common Workflow Language)

    The pegasus-cwl-converter command line tool has been developed to convert a subset of the Common Workflow Language (CWL) to Pegasus’s native YAML format. Given the following three files: a CWL workflow file, a workflow inputs specification file, and a transformation (executable) specification file, pegasus-cwl-converter will do a best-effort translation. This will also work with CWL workflow specifications that refer to Docker containers. The entire CWL language specification is not yet covered by this converter, and as such we can provide additional support in converting your workflows into the Pegasus YAML format.

  9. Support for triggers in Ensemble Manager

    The Pegasus Ensemble Manager is a service that manages collections of workflows. In this latest release of Pegasus, workflow triggering functionality has been added to this service. With the ensemble manager service up and running, the pegasus-em command can now be used to start workflow triggers. Two triggers are currently supported:

    • a cron based trigger: The cron based trigger will, at a given interval, submit a new workflow to your ensemble.

    • a cron based, file pattern trigger: The cron based, file pattern trigger, much like the cron based trigger, will submit a new workflow to your ensemble at a given time interval, with the addition of any new files that are detected based on a given file pattern. This is useful for automatically processing data as it arrives.
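
    The idea behind the file pattern trigger can be sketched in plain Python as a polling loop that remembers which files it has already seen. This is not the ensemble manager's implementation: new_files, run_trigger and the submit callback are hypothetical names, and submitting only when new files appear is an assumption.

```python
# Sketch: interval-based polling for new files matching a glob pattern.
# Assumption: all names here are illustrative, not pegasus-em internals.
import glob
import time
from typing import Callable, List, Set


def new_files(pattern: str, seen: Set[str]) -> List[str]:
    """Return files matching pattern that were not seen on earlier polls."""
    fresh = [path for path in sorted(glob.glob(pattern)) if path not in seen]
    seen.update(fresh)  # remember them so they are reported only once
    return fresh


def run_trigger(
    pattern: str,
    interval_seconds: float,
    submit: Callable[[List[str]], None],
    max_polls: int,
) -> None:
    """Poll max_polls times, calling submit with each batch of new files."""
    seen: Set[str] = set()
    for _ in range(max_polls):
        fresh = new_files(pattern, seen)
        if fresh:
            submit(fresh)  # stand-in for submitting a workflow to the ensemble
        time.sleep(interval_seconds)
```

    Tracking the seen set is what keeps a file from being processed again on the next interval.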

  10. Improved events for each job reported to AMQP

    Historically, the events reported to AMQP endpoints have been normalized events corresponding to the stampede database, which makes correlation hard. Pegasus now also reports a new job composite event (stampede.job_inst.composite) to AMQP endpoints that contains complete information about a job execution.
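
    To illustrate why a composite event is easier to consume, the sketch below folds several per-job events into a single record, which is the correlation work a consumer would otherwise have to do. Only the event name stampede.job_inst.composite comes from these notes; the input event names, the job_inst_id key and the other fields are hypothetical.

```python
# Sketch: merge normalized per-job events into one composite record per job.
# Assumption: field names and input event names are invented for illustration.
from collections import defaultdict
from typing import Dict, Iterable, List


def compose_job_events(events: Iterable[dict]) -> List[dict]:
    """Fold events sharing a job_inst_id into one composite event each."""
    merged: Dict[str, dict] = defaultdict(dict)
    for event in events:
        job_id = event["job_inst_id"]  # hypothetical correlation key
        fields = {k: v for k, v in event.items() if k != "event"}
        merged[job_id].update(fields)
    return [
        {"event": "stampede.job_inst.composite", **fields}
        for _, fields in sorted(merged.items())
    ]
```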

  11. Revamped Documentation

    Documentation has been overhauled and broken down into a user guide and a reference guide. In addition, we have moved to Read the Docs style documentation using reStructuredText. The documentation can be found at https://pegasus.isi.edu/docs/5.0.0/index.html.

  12. pegasus-statistics reports memory usage and average CPU utilization

    pegasus-statistics now reports memory usage and average CPU utilization for your jobs in the transformation statistics file (breakdown.txt).

  13. PegasusLite Improvements

    • Users can now specify environment setup scripts in the site catalog that need to be sourced to set up the environment before the job is launched by PegasusLite. More details at https://pegasus.isi.edu/docs/5.0.0/reference-guide/pegasus-lite.html#setting-the-environment-in-pegasuslite-for-your-job

    • Users can get the transfers in PegasusLite to run on DTN nodes while the jobs run on the compute nodes. More details at https://pegasus.isi.edu/docs/5.0.0/reference-guide/pegasus-lite.html#specify-compute-job-in-pegasuslite-to-run-on-different-node

  14. Credentials existence is checked upfront by planner

  15. Performance improvements for pegasus-rc-client

    pegasus-rc-client now does bulk inserts when inserting entries into a database backed Replica Catalog.
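
    The benefit of bulk inserts can be sketched with Python's built-in sqlite3 module: all rows go through one executemany call inside a single transaction instead of one statement and commit per entry. The database engine, the replicas table and its columns are stand-ins chosen for illustration; these notes do not describe the actual Replica Catalog schema.

```python
# Sketch: batch-insert replica entries in one transaction.
# Assumption: sqlite3 and this table layout are illustrative stand-ins.
import sqlite3
from typing import Iterable, Tuple


def make_catalog() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical replicas table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE replicas (lfn TEXT, pfn TEXT, site TEXT)")
    return conn


def bulk_insert(
    conn: sqlite3.Connection, entries: Iterable[Tuple[str, str, str]]
) -> int:
    """Insert many (lfn, pfn, site) rows at once and return the row count."""
    with conn:  # commits once for the whole batch, rolls back on error
        cur = conn.executemany(
            "INSERT INTO replicas (lfn, pfn, site) VALUES (?, ?, ?)", entries
        )
    return cur.rowcount
```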

  16. Pegasus ensures a consistent UTF8 environment across the full workflow. Details can be found in JIRA ticket PM-1592.

2.10.2. New Features

  1. [PM-603] – Enable workflows to reference local input files directly instead of symlinking them #721

  2. [PM-1133] – Kickstart should send a heartbeat so that condor can kill stuck jobs #1247

  3. [PM-1156] – PegasusLite to tar up the contents of the cwd in case of job failure #1270

  4. [PM-1278] – stats should include cpu and memory utilization #1392

  5. [PM-1309] – develop a pip package that only contains the DAX API and the catalogs API #1423

  6. [PM-1335] – YAML based transformation catalog #1449

  7. [PM-1339] – construct a default entry for local site if not present in site catalog #1453

  8. [PM-1345] – Support for Shifter at Nersc #1459

  9. [PM-1351] – YAML based kickstart records #1465

  10. [PM-1352] – Build failure on Debian 10 due to mariadb/MySQL-Python incompatibility #1466

  11. [PM-1354] – pegasus-init to support titan tutorial #1468

  12. [PM-1355] – composite records when sending events to AMQP #1469

  13. [PM-1357] – In lite jobs, chirp durations for stage in, stage out of data #1471

  14. [PM-1367] – Support for retrieval from HPSS tape store using commands htar and hsi #1481

  15. [PM-1390] – ensure all machine parseable information is one file associated with job #1504

  16. [PM-1396] – kickstart yaml parser fails because of : unacceptable character #x001b: special characters are not allowed #1510

  17. [PM-1398] – include machine information in job_instance.composite event #1512

  18. [PM-1402] – pegasus-init to support summit as execution env for tutorial #1516

  19. [PM-1411] – create the schema for a YAML based DAX #1525

  20. [PM-1438] – YAML Based Site Catalog #1552

  21. [PM-1461] – Ability to specify a wrapper/launcher for compute jobs in PegasusLite #1575

  22. [PM-1470] – pegasus-graphviz needs a yaml handler #1584

  23. [PM-1493] – YAML Based Replica Catalog #1607

  24. [PM-1501] – Parse YAML DAX files #1615

  25. [PM-1516] – Planner should create a default condorpool compute site if a user does not have it specified #1630

  26. [PM-1528] – update DAX R API to emit new workflow format in yaml #1642

  27. [PM-1529] – set default data configuration to condorio #1643

  28. [PM-1551] – Update JAVA DAX API to generate yaml formatted DAX #1665

  29. [PM-1552] – move to using pegasus lite for cases, where we transfer pegasus-transfer #1666

  30. [PM-1608] – data dependencies between dax jobs and compute jobs in a workflow #1722

  31. [PM-1620] – enable integrity checking for containers #1734

  32. [PM-1681] – Enable easy data reuse from previous runs #1795

  33. [PM-1685] – Python package to clone github repos for pegasus-init in 5.0 #1799

  34. [PM-1286] – Deprecate Perl DAX API #1400

  35. [PM-1376] – Add LSF local attributes #1490

  36. [PM-1378] – Handle (copy) HPSS credentials when an environment variable is set #1492

  37. [PM-1382] – Add ppc64le to the known architectures #1496

  38. [PM-1383] – Switching to AMQP 0.9.1 in Pegasus Monitord #1497

  39. [PM-1400] – Remove MacOS .pkg builder #1514

  40. [PM-1401] – Deprecate pegasus-plots #1515

  41. [PM-1412] – Upgrade documentation #1526

  42. [PM-1413] – Upgrade Pegasus Databases to be Unicode Compatible #1527

  43. [PM-1416] – add data collection setup instructions to docs under section 6.7.1.1. Monitord, RabbitMQ, ElasticSearch Example #1530

  44. [PM-1446] – create tests for python client #1560

  45. [PM-1467] – Update jupyter notebook code to use new 5.0 api #1581

  46. [PM-1484] – create a default site catalog #1598

  47. [PM-1485] – change default pegasus.data.configuration from sharedfs to condorio #1599

  48. [PM-1486] – update default filenames for catalogs and workflow #1600

  49. [PM-1495] – checksum.value and checksum.type need to be added as optional rc entry fields #1609

  50. [PM-1510] – stageOut and registerReplica fields in Uses to be omitted for Uses of type input #1624

  51. [PM-1513] – Merge DECAF branch to master to support decaf integration #1627

  52. [PM-1524] – Use entrypoint in docker containers #1638

  53. [PM-1533] – Database Schema Cleanup #1647

  54. [PM-1537] – update existing workflow tests to use yaml #1651

  55. [PM-1547] – for hierarchical workflows, sc and tc cannot be inlined into the workflow file #1661

  56. [PM-1602] – Decaf development for the tess_dense example #1716

  57. [PM-1663] – remove old grid types from schema and add slurm to scheduler type #1777

  58. [PM-1679] – pegasus-s3 mkdir should cleanly exit if bucket already exists and is owned by user #1793

  59. [PM-1689] – pegasus-rc-client delete semantics #1803

  60. [PM-1692] – Add any missing options from pegasus-plan cli to Workflow.plan() and Client.plan() #1806

  61. [PM-1697] – Add major and minor number options to pegasus-version #1811

  62. [PM-643] – Better support for stdout of clustered jobs #761

  63. [PM-1049] – Jobs should not be retried immediately, but rather delayed for some time #1163

  64. [PM-1170] – statistics should include memory details #1284

  65. [PM-1232] – Allow for multiple output sites #1346

  66. [PM-1235] – Python 2/3 compatible code #1349

  67. [PM-1247] – Revisit release-tools/get-system-python #1361

  68. [PM-1321] – Move transfer staging into the container rather than the host OS #1435

  69. [PM-1323] – pegasus transfer should not try and transfer a file that does not exist #1437

  70. [PM-1324] – pegasus plan should be able to create site scratch directories with unique names #1438

  71. [PM-1329] – pegasus integrity causes LIGO workflows to fail #1443

  72. [PM-1338] – Add support for TACC wrangler to pegasus-init #1452

  73. [PM-1341] – Transition from vendored configobj to release #1455

  74. [PM-1344] – update pam usage by pamela #1458

  75. [PM-1349] – Improve error message when jobs fail due to deep dir. structure with depth of > 20. #1463

  76. [PM-1356] – Replace Google CDN with a different CDN as China block it #1470

  77. [PM-1359] – Don’t support Chinese character in the file path #1473

  78. [PM-1363] – Condor Configuration MOUNT_UNDER_SCRATCH causes pegasus auxiliary jobs to fail #1477

  79. [PM-1368] – Implement Catch and Release for integrity errors #1482

  80. [PM-1373] – Debian Buster no longer provides openjdk-8-jdk #1487

  81. [PM-1374] – make monitord resilient to dagman logging the debug level in dagman.out #1488

  82. [PM-1375] – Do not run integrity checks on symlinked files #1489

  83. [PM-1385] – Prioritize transfers based on dependencies #1499

  84. [PM-1386] – bypass should be a per-file option #1500

  85. [PM-1391] – Allow properties to be set via environment variables. #1505

  86. [PM-1392] – YAML based braindump file #1506

  87. [PM-1403] – Support POWER9 nodes in PMC #1517

  88. [PM-1417] – add type field in job, dax, and dag #1531

  89. [PM-1418] – No code to handle postgresql backup. #1532

  90. [PM-1427] – remove STAT profile namespace #1541

  91. [PM-1429] – Introduce PEGASUS_ENV variable to define mode of workflow i.e. development, production, etc #1543

  92. [PM-1431] – Remove -0 suffixes from generated code files. #1545

  93. [PM-1432] – Remove deprecated pegasus-plan CLI args #1546

  94. [PM-1459] – Upgrade Java Unit Test Setup #1573

  95. [PM-1462] – Remove pegasus-submit-dag #1576

  96. [PM-1463] – improve insert performance of pegasus-rc-client #1577

  97. [PM-1472] – Update pegasus-s3 regions #1586

  98. [PM-1474] – Extend command line tools with --json option. #1588

  99. [PM-1476] – Explore deprecation/replacement of Perl codebase #1590

  100. [PM-1481] – add status progress bar to python client code #1595

  101. [PM-1489] – Implement a generic data access credential handler #1603

  102. [PM-1490] – Support Webdav for data transfers #1604

  103. [PM-1508] – in the python api, the method signature to manually add dependencies needs to be updated #1622

  104. [PM-1512] – Removed unused code from planner codebase #1626

  105. [PM-1514] – Move to boto3 and port p-s3 #1628

  106. [PM-1527] – p-status shows failure when state is unknown #1641

  107. [PM-1532] – python client “plan” should accept --sites as a list, needs a --staging-site flag #1646

  108. [PM-1539] – Support PANDA GAHP – allow condorio for glite style #1653

  109. [PM-1544] – pegasus-transfer logs skips container verification when retrieving from singularity hub #1658

  110. [PM-1545] – handling metadata in 5.0 #1659

  111. [PM-1550] – resolve relative input-dir and output-dir options for dax jobs in hierarchical workflows #1664

  112. [PM-1580] – build packages for ubuntu 20 for 5.0 #1694

  113. [PM-1581] – pegasus-integrity callout from kickstart #1695

  114. [PM-1584] – 5.0 Python Api Improvements #1698

  115. [PM-1592] – Consistent UTF8 environment across full workflow #1706

  116. [PM-1594] – Short circuit p-transfer in the case of excessive failure in large transfers #1708

  117. [PM-1621] – Basic support for gpus in Docker and Singularity containers #1735

  118. [PM-1622] – /bin/sh compatibility #1736

  119. [PM-1626] – Allow users to specify arbitrary cli arguments for containers #1740

  120. [PM-1647] – Add more profile keys in the add_condor_profile #1761

  121. [PM-1650] – replace --dax with a positional argument #1764

  122. [PM-1651] – Add more profile keys in the add_pegasus_profile #1765

  123. [PM-1654] – pick up workflow api from vendor extensions encoded in the workflow #1768

  124. [PM-1657] – a pre-flight check needs to be done for the python package attrs #1771

  125. [PM-1674] – Deprecate hints profile namespace and use selector namespace #1788

  126. [PM-1687] – disable warnings when validating yaml files #1801

  127. [PM-1691] – Add pre script hook to hub repos #1805

  128. [PM-1236] – Compatibility Module #1350

  129. [PM-1237] – Python 3: Pegasus Dashboard #1351

  130. [PM-1238] – Python 3: Pegasus Monitord #1352

  131. [PM-1239] – Python 2/3 Compatible: Pegasus Transfer #1353

  132. [PM-1240] – Python 3: Pegasus Statistics #1354

  133. [PM-1241] – Python 3: Pegasus DB Admin #1355

  134. [PM-1242] – Python 3: Pegasus Metadata #1356

  135. [PM-1328] – support sharedfs on the compute site as staging site #1442

  136. [PM-1340] – make planner os releases consistent with builds #1454

  137. [PM-1353] – update monitord to parse both xml and yaml based ks records #1467

  138. [PM-1361] – create the schema for a YAML based Site Catalog #1475

  139. [PM-1365] – remove __ from event keys wherever possible #1479

  140. [PM-1371] – Python 3 Compatible: DAX #1485

  141. [PM-1372] – Python 3 Compatible: pegasus-exitcode #1486

  142. [PM-1381] – Associated planner changes to handle LSF sites #1495

  143. [PM-1387] – make the netlogger events consistent with the documentation #1501

  144. [PM-1404] – pegasus tutorial for summit from Kubernetes #1518

  145. [PM-1407] – Python 3: Netlogger Monitord Code #1521

  146. [PM-1408] – Python 3: Pegasus Analyzer #1522

  147. [PM-1410] – create the schema for a YAML based Replica Catalog #1524

  148. [PM-1419] – Java #1533

  149. [PM-1420] – add checksum field for Containers in the TransformationCatalog schema #1534

  150. [PM-1422] – integrate the client code into the dax api #1536

  151. [PM-1423] – refactor and improve tests #1537

  152. [PM-1428] – change dax and dag to subworkflow #1542

  153. [PM-1430] – ensure that api can write out unicode characters in utf-8 #1544

  154. [PM-1433] – Remove old SC RC TC catalog versions #1547

  155. [PM-1435] – create a “moving from dax3 to new api” doc #1549

  156. [PM-1436] – implement method chaining using decorators #1550

  157. [PM-1437] – add type annotations #1551

  158. [PM-1439] – Planner support for YAML based Site Catalog #1553

  159. [PM-1441] – Python #1555

  160. [PM-1442] – Integrate Check in CI #1556

  161. [PM-1443] – implement api for pegasus.conf #1557

  162. [PM-1445] – update monitord to parse yaml based brain dump file #1559

  163. [PM-1447] – update pegasus-sc-converter to convert old format catalog to new yaml based one #1561

  164. [PM-1448] – SiteFactory should auto detect version and load the correct implementation #1562

  165. [PM-1450] – Module to load/dump YAML files #1564

  166. [PM-1453] – Module to load/dump Workflow(DAX) files #1567

  167. [PM-1454] – Module to load/dump Replica Catalog files #1568

  168. [PM-1455] – Module to load/dump Transformation Catalog files #1569

  169. [PM-1456] – Module to load/dump Site Catalog files #1570

  170. [PM-1457] – Module to load/dump Properties files #1571

  171. [PM-1460] – add docs to user guide #1574

  172. [PM-1464] – disallow variable expansion while converting site catalog from one format to another #1578

  173. [PM-1465] – Module to load/dump JSON files #1579

  174. [PM-1468] – remove catalog #1582

  175. [PM-1469] – update Pegasus-DAX3-Tutorial.ipynb #1583

  176. [PM-1471] – update jupyter docs #1585

  177. [PM-1473] – fix logging bug #1587

  178. [PM-1475] – Pegasus Plan #1589

  179. [PM-1477] – pegasus-config #1591

  180. [PM-1478] – pegasus-remove #1592

  181. [PM-1479] – pegasus-run #1593

  182. [PM-1480] – pegasus-status #1594

  183. [PM-1482] – Clean up javadoc warnings #1596

  184. [PM-1483] – rename ProfileMixin.add_() to ProfileMixin.add_profile_() #1597

  185. [PM-1487] – update python api to write new default filenames #1601

  186. [PM-1491] – Update pegasus-tc-converter to convert from old format to new YAML format #1605

  187. [PM-1492] – support docker/singularity container usage in cwl #1606

  188. [PM-1494] – Parse YAML Based RC files #1608

  189. [PM-1496] – update rc schema so that rc entries can have checksum fields #1610

  190. [PM-1497] – update rc api to support checksum fields #1611

  191. [PM-1498] – update RC docs to mention checksum values #1612

  192. [PM-1499] – Implement YAML Based RC Backend #1613

  193. [PM-1500] – create workflow test #1614

  194. [PM-1502] – add infer_dependencies=True to wf.write #1616

  195. [PM-1503] – in the python api, write out yml in order #1617

  196. [PM-1504] – flatten out the uses section. remove the extra file property aggregation #1618

  197. [PM-1505] – for job arguments, files should be added as a strings #1619

  198. [PM-1507] – update schema for stdin, stdout, stderr in jobs to s.t only lfn is used #1621

  199. [PM-1509] – update “type” field options in the AbstractJob schema to be “job”, “pegasusWorkflow”, and “condorWorkflow” #1623

  200. [PM-1511] – Update DAXParser Factory to load the right parser based on content of input dax file #1625

  201. [PM-1515] – prefer catalog entries for SC, RC and TC in the DAX over everything else #1629

  202. [PM-1517] – update default --sites option in python client planner wrapper code #1631

  203. [PM-1518] – Update Replica Factory to auto load RC backend based on type of file #1632

  204. [PM-1519] – update schema for job args s.t. types can be strings and scalars #1633

  205. [PM-1520] – Update Transformation Factory to auto load Transformation backend based on type of file #1634

  206. [PM-1523] – work on replica catalog converter #1637

  207. [PM-1525] – support in planner for compound transformations #1639

  208. [PM-1526] – update python api to write out tr requirements in the format “namespace::name:version” #1640

  209. [PM-1535] – default paths picked up should be logged in the properties file in the submit directory #1649

  210. [PM-1538] – convert test 023-sc4-ssh-http to use yaml #1652

  211. [PM-1540] – allow mount point regex to parse shell variable names #1654

  212. [PM-1541] – change the pegasus-plan invocation via pegasus lite in dagman prescripts to handle new worker package organization #1655

  213. [PM-1542] – convert test 024-sc4-gridftp-http to use yaml #1656

  214. [PM-1543] – convert test 025-sc4-file-http to use yaml #1657

  215. [PM-1546] – add metadata to entry in replica catalog #1660

  216. [PM-1548] – preserve case for property keys when properties python api is used #1662

  217. [PM-1549] – registration of outputs in Pegasus 5.0 #1663

  218. [PM-1555] – Chapter 2. Tutorial – Update and review #1669

  219. [PM-1556] – Chapter 3. Installation – Update and review #1670

  220. [PM-1557] – Chapter 4. Creating Workflows – Update and review #1671

  221. [PM-1558] – Chapter 5. Running Workflows – Update and review #1672

  222. [PM-1559] – Chapter 6. Monitoring, Debugging and Statistics – Update and review #1673

  223. [PM-1560] – Chapter 7. Execution Environments – Update and review #1674

  224. [PM-1561] – Chapter 8. Containers – Update and review #1675

  225. [PM-1562] – Chapter 9. Example Workflows – Update and review #1676

  226. [PM-1563] – Chapter 10. Data Management – Update and review #1677

  227. [PM-1564] – Chapter 11. Optimizing Workflows for Efficiency and Scalability – Update and review #1678

  228. [PM-1565] – Chapter 12. Pegasus Service – Update and review #1679

  229. [PM-1566] – Chapter 13. Configuration – Update and review #1680

  230. [PM-1567] – Chapter 14. Submit Directory Details – Update and review #1681

  231. [PM-1568] – Chapter 15. Jupyter Notebooks – Update and review #1682

  232. [PM-1569] – Chapter 16. API Reference – Update and review #1683

  233. [PM-1570] – Chapter Command Line Tools man pages – Update and review #1684

  234. [PM-1571] – rewrite Introduction #1685

  235. [PM-1572] – Chapter “Packages” – Update and review #1686

  236. [PM-1573] – Chapter 17. Useful Tips – Update and review #1687

  237. [PM-1574] – Chapter 18. Funding, citing, and anonymous usage statistics – Update and review #1688

  238. [PM-1575] – Chapter 19. Glossary – Update and review #1689

  239. [PM-1576] – Chapter 20. Tutorial VM – Update and review #1690

  240. [PM-1577] – convert test 045-hierarchy-sharedfs #1691

  241. [PM-1578] – Migration Guide to 5.0 #1692

  242. [PM-1579] – convert test 045-hierarchy-sharedfs-b to use yaml #1693

  243. [PM-1582] – ensure checksum and file metadata also appears in the output replica catalog #1696

  244. [PM-1583] – Parse meta files as a RC backend in the planner #1697

  245. [PM-1585] – remove glibc from schema and api as it is not used anymore #1699

  246. [PM-1586] – add built-in support for pathlib.Path objects where ever paths are used #1700

  247. [PM-1587] – _DirectoryType enums should have underscores in name #1701

  248. [PM-1588] – in add_pegasus_profile(), add data_configuration as a kwarg #1702

  249. [PM-1590] – in the Workflow object set infer_dependencies to be True by default #1704

  250. [PM-1591] – Only require site and pfn in Transformation constructor when automatically creating a TransformationSite #1705

  251. [PM-1596] – pegasus-db-admin should use PEGASUS_HOME to discover pegasus-version etc #1710

  252. [PM-1597] – Pegasus Run #1711

  253. [PM-1598] – Output originating from pegasus-tools should be output by workflow object as is (without any log category) #1712

  254. [PM-1599] – Improve exception handling for failed execution of pegasus client commands #1713

  255. [PM-1600] – Update workflow and client python apis to support multiple output sites #1714

  256. [PM-1601] – Client plan input_dir must take in a list of str #1715

  257. [PM-1604] – Fix deprecation warnings #1718

  258. [PM-1605] – Get ensemble manager running on master (Python3) #1719

  259. [PM-1606] – Merge in/factor in code implemented in add-ensemble-triggers branch #1720

  260. [PM-1607] – Add time interval based triggering on a given directory/file pattern #1721

  261. [PM-1609] – register_replica should be set to True by default #1723

  262. [PM-1610] – ensure pegasus-db-admin downgrade works #1724

  263. [PM-1611] – add pegasus-graphviz functionality into the api #1725

  264. [PM-1612] – update monitord to record avg cpu utilization and maxrss #1726

  265. [PM-1613] – update database schema to track maxrss and avg_cpu #1727

  266. [PM-1614] – update planner for revised 5.0 replica catalog format #1728

  267. [PM-1615] – unique constraint failed error when using multiple output sites #1729

  268. [PM-1616] – add checksums for executables #1730

  269. [PM-1617] – add missing integrity check for executables #1731

  270. [PM-1618] – update 039-black-metadata to use python 5.0 api #1732

  271. [PM-1623] – fix 5.0 python auto generated python api documentation #1737

  272. [PM-1624] – convert 032-black-checkpoint #1738

  273. [PM-1625] – Pegasus specific profile for requesting GPU resources #1739

  274. [PM-1627] – workflow uuid, submit dir, submit hostname, root wf id should be accessible from the workflow object #1741

  275. [PM-1628] – files generated by the api should have comments specifying that they have been auto generated by the api #1742

  276. [PM-1629] – Add metadata to container #1743

  277. [PM-1631] – create 032-kickstart-chkpoint-signal-condorio #1745

  278. [PM-1632] – create 032-kickstart-chkpoint-signal-nonsharedfs #1746

  279. [PM-1633] – CLI: manpage pegasus-plan #1747

  280. [PM-1634] – CLI: manpage pegasus-db-admin #1748

  281. [PM-1635] – CLI: manpage pegasus-status #1749

  282. [PM-1636] – CLI: manpage pegasus-remove #1750

  283. [PM-1637] – CLI: manpage pegasus-analyzer #1751

  284. [PM-1638] – CLI: manpage pegasus-statistics #1752

  285. [PM-1639] – CLI: manpage pegasus-run #1753

  286. [PM-1640] – CLI: manpage pegasus-transfer #1754

  287. [PM-1641] – CLI: manpage pegasus-s3 #1755

  288. [PM-1642] – CLI: manpage pegasus-integrity #1756

  289. [PM-1643] – CLI: manpage pegasus-rc-converter #1757

  290. [PM-1644] – CLI: manpage pegasus-sc-converter #1758

  291. [PM-1645] – CLI: manpage pegasus-tc-converter #1759

  292. [PM-1646] – update database overview, and update schema picture #1760

  293. [PM-1648] – add unit tests that put/pull to/from aws s3 and ceph #1762

  294. [PM-1649] – remove old config parameters if they are not used #1763

  295. [PM-1652] – update pegasus client api to pass the workflow to be planned at the end #1766

  296. [PM-1655] – the metrics server should pick up key wf_api , and default back to dax_api if not present #1769

  297. [PM-1656] – Add unit tests to ensure YAML is being generated correctly #1770

  298. [PM-1659] – in the python api, add pegasus profile key container.arguments #1773

  299. [PM-1660] – add boolean bypass flag for input files #1774

  300. [PM-1661] – expose bypass parameter for Containers and TransformationSites #1775

  301. [PM-1662] – document behavior in user guide about bypassing file staging #1776

  302. [PM-1665] – expose --randomdir/--randomdir= in client code in the python api #1779

  303. [PM-1666] – get live output from pegasus cli tools when they are called with the client code #1780

  304. [PM-1667] – when client code is called, it should be more obvious what pegasus- is being called #1781

  305. [PM-1668] – fb-nlp workflow generator creates multiple jobs that create same output file #1782

  306. [PM-1669] – remove “pegasus: ” from tc when inlined in a workflow #1783

  307. [PM-1670] – expose a schema validation function that can be used in unit tests #1784

  308. [PM-1682] – add --reuse option to Workflow.plan in python api #1796

  309. [PM-1683] – update section 7.3 supported transfer protocols in data management guide #1797

  310. [PM-1684] – reorg table of contents #1798

  311. [PM-1688] – update pegasus-db-admin as a new ‘trigger’ table has been added to the schema #1802

  312. [PM-1693] – convert test 010-runtime-clustering-CondorIO to use python api #1807

  313. [PM-1694] – convert test 010-runtime-clustering-Non-SharedFS to use python api #1808

  314. [PM-1695] – convert 010-runtime-clustering-SharedFS to use python api #1809

  315. [PM-1696] – convert 010-runtime-clustering-SharedFS Staging and No Kickstart to use python api #1810

  316. [PM-1698] – Release notes for 5.0 release #1812

  317. [PM-1699] – add logo to the documentation pages #1813

  318. [PM-1700] – add SIGINT handler to Client.wait() so that it can exit cleanly #1814

  319. [PM-1701] – planner should get chmod jobs to run locally if compute site has auxiliary.local set #1815

2.10.3. Bugs Fixed

  1. [PM-1150] – Pegasus should verify that required credentials exists before starting a workflow #1264

  2. [PM-1192] – User supplied env setup script for lite #1306

  3. [PM-1199] – Notification naming / meta notifications #1313

  4. [PM-1326] – singularity suffix computed incorrectly #1440

  5. [PM-1327] – bypass input file staging broken for container execution #1441

  6. [PM-1330] – .meta files created even when integrity checking is disabled. #1444

  7. [PM-1332] – monitord is failing on a dagman.out file #1446

  8. [PM-1333] – amqp endpoint errors should not disable database population for multiplexed sinks #1447

  9. [PM-1334] – pegasus dagman is not exiting cleanly #1448

  10. [PM-1336] – pegasus-submitdir is broken #1450

  11. [PM-1346] – Pegasus job checkpointing is incompatible with condorio #1460

  12. [PM-1347] – pegasus will always try and transfer output when a code has checkpointed #1461

  13. [PM-1350] – pegasus is ignoring when_to_transfer_output #1464

  14. [PM-1358] – HTCondor 8.8.0/8.8.1 remaps /tmp, and can break access to x509 credentials #1472

  15. [PM-1360] – planner drops transfer_(in|out)put_files if NoGridStart is used #1474

  16. [PM-1362] – Chinese characters in the file path #1476

  17. [PM-1366] – Pegasus Cluster Label – Job Env Not Picked Up in Containers #1480

  18. [PM-1377] – A + in a tc name breaks pegasus-plan #1491

  19. [PM-1379] – Stage out job fails – wrong src location #1493

  20. [PM-1380] – Support for Singularity Library #1494

  21. [PM-1389] – pegasus.cores causes issues on Summit #1503

  22. [PM-1395] – GLite LSF scripts don’t work as intended on OLCF’s DTNs #1509

  23. [PM-1405] – Is Pegasus supposed to build on 32-bit x86 (Debian i386 Stretch)? #1519

  24. [PM-1409] – Python virtual environments not considered first before system wide installation #1523

  25. [PM-1421] – p-lite generated sh files fail, when used with Docker containers, that make use of USER argument #1535

  26. [PM-1466] – Upgrade Python Package Versions #1580

  27. [PM-1488] – Get condorpool worker machine ipaddr error!!! #1602

  28. [PM-1506] – pegasus-python-wrapper locates executable from PEGASUS_HOME instead of dirname of the exec. #1620

  29. [PM-1521] – Decaf Jobs does not generate valid JSON for Decaf anymore #1635

  30. [PM-1522] – output not being captured in pegasus client code #1636

  31. [PM-1530] – site selector does map job correctly when using stageable or all mapper with a containerized job #1644

  32. [PM-1531] – pegasus-db-admin fails on macosx on a clean db #1645

  33. [PM-1534] – integrity checking with bypass input staging if checksums are specified in replica catalog #1648

  34. [PM-1595] – hostname is not generated in case of integrity failures #1709

  35. [PM-1619] – pegasus-analyzer output doesn’t match pegasus-status #1733

  36. [PM-1630] – PATH variable in Docker containers is not preserved in some cases #1744

  37. [PM-1671] – ConfigParser still referenced in pegasus-worker CLI scripts #1785

  38. [PM-1675] – planner throws error when using “pegasus.dir.storage.mapper.replica.file” without specifying “pegasus.dir.storage.mapper.replica” #1789

  39. [PM-1676] – hierarchical workflow failing only when integrity checking is on #1790

  40. [PM-1680] – kickstart needs to quote argument vector when outputting it to yaml #1794

  41. [PM-1702] – registration jobs fail out of a 5.0 binary install #1816

  42. [PM-1705] – priorities are not propagated correctly #1819
