Pegasus 5.0 Released

We are happy to announce the release of Pegasus 5.0. Pegasus 5.0 is a major release of Pegasus and builds upon the beta version released a couple of months back. It also includes all features and bug fixes from the 4.9 branch. We invite our users to give it a try.


The release can be downloaded from:

If you are an existing user, please carefully follow these instructions to upgrade.

Highlights of the Release

  1. Reworked Python API: This new API has been developed from the ground up so that, in addition to generating the abstract workflow and all the catalogs, it now allows you to plan, submit, monitor, analyze and generate statistics of your workflow. To use this new Python API, refer to the Moving From DAX3 to Pegasus.api guide.
  2. Adoption of YAML formats: With Pegasus 5.0, we are adopting YAML for the representation of all major catalogs. We provide catalog converters so that you can convert your existing catalogs to the new formats. In 5.0, the following are now represented in YAML:
    • Abstract Workflow
    • Replica Catalog
    • Transformation Catalog
    • Site Catalog
    • Kickstart Provenance Records
  3. Python3 Support
    • All Pegasus tools are Python 3 compliant
    • The 5.0 release requires Python 3 on the workflow submit node
    • Python PIP packages for workflow composition and monitoring
  4. Default data configuration
    • In Pegasus 5.0, the default data configuration has been changed to condorio. Up to the 4.9.x releases, the default configuration was sharedfs.
  5. Zero configuration required to submit to local HTCondor pool
  6. Data Management Improvements
    • New output replica catalog that registers outputs including file metadata such as size and checksums
    • Ability to bypass staging of files at a per-file, executable and container level
    • Improved support for hierarchical workflows allows you to create data dependencies between sub workflow jobs and compute jobs
    • Support for staging of generated outputs to multiple output sites
    • Support for integrity checking of user executables and application containers in addition to data
    • Support for webdav transfers
    • Easier enabling of data reuse by specifying previous workflow submit directories using the --reuse option to pegasus-plan.
    • Stagein transfer jobs are assigned priorities based on the number of child compute jobs. Details can be found in JIRA ticket 1385
  7. New Jupyter Notebook Based Tutorial
    • With this release, we are pleased to announce a brand new tutorial based on a Docker container running interactive Jupyter notebooks. Access the tutorial here.
  8. Support for CWL (Common Workflow Language)
    • The pegasus-cwl-converter command line tool has been developed to convert a subset of the Common Workflow Language (CWL) to Pegasus’s native YAML format. Given the following three files: a CWL workflow file, a workflow inputs specification file, and a transformation (executable) specification file, pegasus-cwl-converter will do a best-effort translation. This will also work with CWL workflow specifications that refer to Docker containers. The entire CWL language specification is not yet covered by this converter, and as such we can provide additional support in converting your workflows into the Pegasus YAML format. More details are in the CWL Chapter.
  9. Support for triggers in Ensemble Manager
    • The Pegasus Ensemble Manager is a service that manages collections of workflows. In this latest release of Pegasus, workflow triggering functionality has been added to this service. With the ensemble manager service up and running, the pegasus-em command can now be used to start workflow triggers. Two triggers are currently supported:
      • a cron based trigger: The cron based trigger will, at a given interval, submit a new workflow to your ensemble.
      • a cron based, file pattern trigger: The cron based, file pattern trigger, much like the cron based trigger, will submit a new workflow to your ensemble at a given time interval, with the addition of any new files that are detected based on a given file pattern. This is useful for automatically processing data as it arrives.
  10. Improved events for each job reported to AMQP
    • Historically, the events reported to AMQP endpoints are normalized events corresponding to the stampede database, which makes correlation hard. Pegasus now also reports a new job composite event (stampede.job_inst.composite) to AMQP endpoints that has complete information about a job execution.
  11. Revamped Documentation
    • Documentation has been overhauled and broken down into a user guide and a reference guide. In addition, we have moved to readthedocs style documentation using reStructuredText. The documentation can be found here.
  12. pegasus-statistics reports memory usage and avg cpu utilization
    • pegasus-statistics now reports memory usage and average CPU utilization for your jobs in the transformation statistics file (breakdown.txt).
  13. PegasusLite Improvements
    • Users can now specify environment setup scripts in the site catalog that need to be sourced to set up the environment before the job is launched by PegasusLite. More details here.
    • Users can get the transfers in PegasusLite to run on DTN nodes while the jobs run on the compute nodes. More details here.
  14. Credentials existence is checked upfront by planner
  15. Performance improvements for pegasus-rc-client
    • pegasus-rc-client now does bulk inserts when inserting entries into a database backed Replica Catalog.
  16. Pegasus ensures a consistent UTF8 environment across full workflow
    • Details can be found in JIRA ticket 1592.
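
The cron based, file pattern trigger described in highlight 9 can be pictured with a small sketch. This is not the Ensemble Manager's implementation, and it does not use pegasus-em. It is a standard-library-only illustration of the idea of checking a file pattern once per interval and handing only the newly arrived files to the next workflow. The names find_new_files and demo, the "*.csv" pattern and the temporary directory are all hypothetical, and whether an empty batch should still trigger a submission is left open here.

```python
import glob
import os
import tempfile


def find_new_files(pattern, seen):
    """Return paths matching `pattern` that are not yet in `seen`.

    `seen` is updated in place, so the next call reports only later arrivals.
    (Hypothetical helper; not part of Pegasus.)
    """
    matches = sorted(glob.glob(pattern))
    new_files = [path for path in matches if path not in seen]
    seen.update(new_files)
    return new_files


def demo():
    """Simulate three trigger intervals against a temporary directory."""
    seen = set()
    batches = []
    with tempfile.TemporaryDirectory() as inbox:
        pattern = os.path.join(inbox, "*.csv")

        def touch(name):
            # Create an empty file to stand in for arriving data.
            with open(os.path.join(inbox, name), "w"):
                pass

        def tick():
            # A real trigger would submit a workflow with these inputs here.
            batch = find_new_files(pattern, seen)
            batches.append([os.path.basename(p) for p in batch])

        touch("a.csv")
        touch("notes.txt")  # does not match the pattern, so it is ignored
        tick()              # interval 1: a.csv is new
        tick()              # interval 2: nothing new has arrived
        touch("b.csv")
        tick()              # interval 3: only b.csv is new
    return batches


if __name__ == "__main__":
    print(demo())
```

Keeping the set of already-seen paths between intervals is what stops the same file from being processed twice. A long-running service would need to persist that state, which this sketch does not attempt.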

JIRA items

An exhaustive list of features, improvements and bug fixes can be found below.

New Features

  1. [PM-603] – Enable workflows to reference local input files directly instead of symlinking them
  2. [PM-1133] – Kickstart should send a heartbeat so that condor can kill stuck jobs
  3. [PM-1156] – PegasusLite to tar up the contents of the cwd in case of job failure
  4. [PM-1278] – stats should include cpu and memory utilization
  5. [PM-1309] – develop a pip package that only contains the DAX API and the catalogs API
  6. [PM-1335] – YAML based transformation catalog
  7. [PM-1339] – construct a default entry for local site if not present in site catalog
  8. [PM-1345] – Support for Shifter at Nersc
  9. [PM-1351] – YAML based kickstart records
  10. [PM-1352] – Build failure on Debian 10 due to mariadb/MySQL-Python incompatibility
  11. [PM-1354] – pegasus-init to support titan tutorial
  12. [PM-1355] – composite records when sending events to AMQP
  13. [PM-1357] – In lite jobs, chirp durations for stage in, stage out of data
  14. [PM-1367] – Support for retrieval from HPSS tape store using commands htar and hsi
  15. [PM-1390] – ensure all machine parseable information is one file associated with job
  16. [PM-1396] – kickstart yaml parser fails because of : unacceptable character #x001b: special characters are not allowed
  17. [PM-1398] – include machine information in job_instance.composite event
  18. [PM-1402] – pegasus-init to support summit as execution env for tutorial
  19. [PM-1411] – create the schema for a YAML based DAX
  20. [PM-1438] – YAML Based Site Catalog
  21. [PM-1461] – Ability to specify a wrapper/launcher for compute jobs in PegasusLite
  22. [PM-1470] – pegasus-graphviz needs a yaml handler
  23. [PM-1493] – YAML Based Replica Catalog
  24. [PM-1501] – Parse YAML DAX files
  25. [PM-1516] – Planner should create a default condorpool compute site if a user does not have it specified
  26. [PM-1528] – update DAX R API to emit new workflow format in yaml
  27. [PM-1529] – set default data configuration to condorio
  28. [PM-1551] – Update JAVA DAX API to generate yaml formatted DAX
  29. [PM-1552] – move to using pegasus lite for cases, where we transfer pegasus-transfer
  30. [PM-1608] – data dependencies between dax jobs and compute jobs in a workflow
  31. [PM-1620] – enable integrity checking for containers
  32. [PM-1681] – Enable easy data reuse from previous runs
  33. [PM-1685] – Python package to clone github repos for pegasus-init in 5.0
  34. [PM-1286] – Deprecate Perl DAX API
  35. [PM-1376] – Add LSF local attributes
  36. [PM-1378] – Handle (copy) HPSS credentials when an environment variable is set
  37. [PM-1382] – Add ppc64le to the known architectures
  38. [PM-1383] – Switching to AMQP 0.9.1 in Pegasus Monitord
  39. [PM-1400] – Remove MacOS .pkg builder
  40. [PM-1401] – Deprecate pegasus-plots
  41. [PM-1412] – Upgrade documentation
  42. [PM-1413] – Upgrade Pegasus Databases to be Unicode Compatible
  43. [PM-1416] – add data collection setup instructions to docs under section Monitord, RabbitMQ, ElasticSearch Example
  44. [PM-1446] – create tests for python client
  45. [PM-1467] – Update jupyter notebook code to use new 5.0 api
  46. [PM-1484] – create a default site catalog
  47. [PM-1485] – change default from sharedfs to condorio
  48. [PM-1486] – update default filenames for catalogs and workflow
  49. [PM-1495] – checksum.value and checksum.type need to be added as optional rc entry fields
  50. [PM-1510] – stageOut and registerReplica fields in Uses to be omitted for Uses of type input
  51. [PM-1513] – Merge DECAF branch to master to support decaf integration
  52. [PM-1524] – Use entrypoint in docker containers
  53. [PM-1533] – Database Schema Cleanup
  54. [PM-1537] – update existing workflow tests to use yaml
  55. [PM-1547] – for hierarchical workflows, sc and tc cannot be inlined into the workflow file
  56. [PM-1602] – Decaf development for the tess_dense example
  57. [PM-1663] – remove old grid types from schema and add slurm to scheduler type
  58. [PM-1679] – pegasus-s3 mkdir should cleanly exit if bucket already exists and is owned by user
  59. [PM-1689] – pegasus-rc-client delete semantics
  60. [PM-1692] – Add any missing options from pegasus-plan cli to Workflow.plan() and Client.plan()
  61. [PM-1697] – Add major and minor number options to pegasus-version
  62. [PM-643] – Better support for stdout of clustered jobs
  63. [PM-1049] – Jobs should not be retried immediately, but rather delayed for some time
  64. [PM-1170] – statistics should include memory details
  65. [PM-1232] – Allow for multiple output sites
  66. [PM-1235] – Python 2/3 compatible code
  67. [PM-1247] – Revisit release-tools/get-system-python
  68. [PM-1321] – Move transfer staging into the container rather than the host OS
  69. [PM-1323] – pegasus transfer should not try and transfer a file that does not exist
  70. [PM-1324] – pegasus plan should be able to create site scratch directories with unique names
  71. [PM-1329] – pegasus integrity causes LIGO workflows to fail
  72. [PM-1338] – Add support for TACC wrangler to pegasus-init
  73. [PM-1341] – Transition from vendored `configobj` to release
  74. [PM-1344] – update pam usage by pamela
  75. [PM-1349] – Improve error message when jobs fail due to deep dir. structure with depth of > 20.
  76. [PM-1356] – Replace Google CDN with a different CDN as China block it
  77. [PM-1359] – Don’t support Chinese character in the file path
  78. [PM-1363] – Condor Configuration MOUNT_UNDER_SCRATCH causes pegasus auxiliary jobs to fail
  79. [PM-1368] – Implement Catch and Release for integrity errors
  80. [PM-1373] – Debian Buster no longer provides openjdk-8-jdk
  81. [PM-1374] – make monitord resilient to dagman logging the debug level in dagman.out
  82. [PM-1375] – Do not run integrity checks on symlinked files
  83. [PM-1385] – Prioritize transfers bases on dependencies
  84. [PM-1386] – bypass should be a per-file option
  85. [PM-1391] – Allow properties to be set via environment variables.
  86. [PM-1392] – YAML based braindump file
  87. [PM-1403] – Support POWER9 nodes in PMC
  88. [PM-1417] – add type field in job, dax, and dag
  89. [PM-1418] – No code to handle postgresql backup.
  90. [PM-1427] – remove STAT profile namespace
  91. [PM-1429] – Introduce PEGASUS_ENV variable to define mode of workflow i.e. development, production, etc
  92. [PM-1431] – Remove -0 suffixes from generated code files.
  93. [PM-1432] – Remove deprecated pegasus-plan CLI args
  94. [PM-1459] – Upgrade Java Unit Test Setup
  95. [PM-1462] – Remove pegasus-submit-dag
  96. [PM-1463] – improve insert performance of pegasus-rc-client
  97. [PM-1472] – Update pegasus-s3 regions
  98. [PM-1474] – Extend command line tools with --json option.
  99. [PM-1476] – Explore deprecation/replacement of Perl codebase
  100. [PM-1481] – add status progress bar to python client code
  101. [PM-1489] – Implement a generic data access credential handler
  102. [PM-1490] – Support Webdav for data transfers
  103. [PM-1508] – in the python api, the method signature to manually add dependencies needs to be updated
  104. [PM-1512] – Removed unused code from planner codebase
  105. [PM-1514] – Move to boto3 and port p-s3
  106. [PM-1527] – p-status shows failure when state is unknown
  107. [PM-1532] – python client “plan” should accept --sites as a list needs a --staging-site flag
  108. [PM-1539] – Support PANDA GAHP – allow condorio for glite style
  109. [PM-1544] – pegasus-transfer logs skips container verification when retrieving from singularity hub
  110. [PM-1545] – handling metadata in 5.0
  111. [PM-1550] – resolve relative input-dir and output-dir options for dax jobs in hierarchical workflows
  112. [PM-1580] – build packages for ubuntu 20 for 5.0
  113. [PM-1581] – pegasus-integrity callout from kickstart
  114. [PM-1584] – 5.0 Python Api Improvements
  115. [PM-1592] – Consistent UTF8 environment across full workflow
  116. [PM-1594] – Short circuit p-transfer in the case of excessive failure in large transfers
  117. [PM-1621] – Basic support for gpus in Docker and Singularity containers
  118. [PM-1622] – /bin/sh compatibility
  119. [PM-1626] – Allow users to specify arbitrary cli arguments for containers
  120. [PM-1647] – Add more profile keys in the add_condor_profile
  121. [PM-1650] – replace --dax with a positional argument
  122. [PM-1651] – Add more profile keys in the add_pegasus_profile
  123. [PM-1654] – pick up workflow api from vendor extensions encoded in the workflow
  124. [PM-1657] – a pre-flight check needs to be done for the python package attrs
  125. [PM-1674] – Deprecate hints profile namespace and use selector namespace
  126. [PM-1687] – disable warnings when validating yaml files
  127. [PM-1691] – Add pre script hook to hub repos
  128. [PM-1236] – Compatibility Module
  129. [PM-1237] – Python 3: Pegasus Dashboard
  130. [PM-1238] – Python 3: Pegasus Monitord
  131. [PM-1239] – Python 2/3 Compatible: Pegasus Transfer
  132. [PM-1240] – Python 3: Pegasus Statistics
  133. [PM-1241] – Python 3: Pegasus DB Admin
  134. [PM-1242] – Python 3: Pegasus Metadata
  135. [PM-1328] – support sharedfs on the compute site as staging site
  136. [PM-1340] – make planner os releases consistent with builds
  137. [PM-1353] – update monitord to parse both xml and yaml based ks records
  138. [PM-1361] – create the schema for a YAML based Site Catalog
  139. [PM-1365] – remove __ from event keys wherever possible
  140. [PM-1371] – Python 3 Compatible: DAX
  141. [PM-1372] – Python 3 Compatible: pegasus-exitcode
  142. [PM-1381] – Associated planner changes to handle LSF sites
  143. [PM-1387] – make the netlogger events consistent with the documentation
  144. [PM-1404] – pegasus tutorial for summit from Kubernetes
  145. [PM-1407] – Python 3: Netlogger Monitord Code
  146. [PM-1408] – Python 3: Pegasus Analyzer
  147. [PM-1410] – create the schema for a YAML based Replica Catalog
  148. [PM-1419] – Java
  149. [PM-1420] – add checksum field for Containers in the TransformationCatalog schema
  150. [PM-1422] – integrate the client code into the dax api
  151. [PM-1423] – refactor and improve tests
  152. [PM-1428] – change dax and dag to subworkflow
  153. [PM-1430] – ensure that api can write out unicode characters in utf-8
  154. [PM-1433] – Remove old SC RC TC catalog versions
  155. [PM-1435] – create a “moving from dax3 to new api” doc
  156. [PM-1436] – implement method chaining using decorators
  157. [PM-1437] – add type annotations
  158. [PM-1439] – Planner support for YAML based Site Catalog
  159. [PM-1441] – Python
  160. [PM-1442] – Integrate Check in CI
  161. [PM-1443] – implement api for pegasus.conf
  162. [PM-1445] – update monitord to parse yaml based brain dump file
  163. [PM-1447] – update pegasus-sc-converter to convert old format catalog to new yaml based one
  164. [PM-1448] – SiteFactory should auto detect version and load the correct implementation
  165. [PM-1450] – Module to load/dump YAML files
  166. [PM-1453] – Module to load/dump Workflow(DAX) files
  167. [PM-1454] – Module to load/dump Replica Catalog files
  168. [PM-1455] – Module to load/dump Transformation Catalog files
  169. [PM-1456] – Module to load/dump Site Catalog files
  170. [PM-1457] – Module to load/dump Properties files
  171. [PM-1460] – add docs to user guide
  172. [PM-1464] – disallow variable expansion while converting site catalog from one format to another
  173. [PM-1465] – Module to load/dump JSON files
  174. [PM-1468] – remove catalog
  175. [PM-1469] – update Pegasus-DAX3-Tutorial.ipynb
  176. [PM-1471] – update jupyter docs
  177. [PM-1473] – fix logging bug
  178. [PM-1475] – Pegasus Plan
  179. [PM-1477] – pegasus-config
  180. [PM-1478] – pegasus-remove
  181. [PM-1479] – pegasus-run
  182. [PM-1480] – pegasus-status
  183. [PM-1482] – Clean up javadoc warnings
  184. [PM-1483] – rename ProfileMixin.add_<ns>() to ProfileMixin.add_profile_<ns>()
  185. [PM-1487] – update python api to write new default filenames
  186. [PM-1491] – Update pegasus-tc-converter to convert from old format to new YAML format
  187. [PM-1492] – support docker/singularity container usage in cwl
  188. [PM-1494] – Parse YAML Based RC files
  189. [PM-1496] – update rc schema so that rc entries can have checksum fields
  190. [PM-1497] – update rc api to support checksum fields
  191. [PM-1498] – update RC docs to mention checksum values
  192. [PM-1499] – Implement YAML Based RC Backend
  193. [PM-1500] – create workflow test
  194. [PM-1502] – add infer_dependencies=True to wf.write
  195. [PM-1503] – in the python api, write out yml in order
  196. [PM-1504] – flatten out the uses section. remove the extra file property aggregation
  197. [PM-1505] – for job arguments, files should be added as a strings
  198. [PM-1507] – update schema for stdin, stdout, stderr in jobs to s.t only lfn is used
  199. [PM-1509] – update “type” field options in the AbstractJob schema to be “job”, “pegasusWorkflow”, and “condorWorkflow”
  200. [PM-1511] – Update DAXParser Factory to load the right parser based on content of input dax file
  201. [PM-1515] – prefer catalog entries for SC, RC and TC in the DAX over everything else
  202. [PM-1517] – update default --sites option in python client planner wrapper code
  203. [PM-1518] – Update Replica Factory to auto load RC backend based on type of file
  204. [PM-1519] – update schema for job args s.t. types can be strings and scalars
  205. [PM-1520] – Update Transformation Factory to auto load Transformation backend based on type of file
  206. [PM-1523] – work on replica catalog converter
  207. [PM-1525] – support in planner for compound transformations
  208. [PM-1526] – update python api to write out tr requirements in the format “namespace::name:version”
  209. [PM-1535] – default paths picked up should be logged in the properties file in the submit directory
  210. [PM-1538] – convert test 023-sc4-ssh-http to use yaml
  211. [PM-1540] – allow mount point regex to parse shell variable names
  212. [PM-1541] – change the pegasus-plan invocation via pegasus lite in dagman prescripts to handle new worker package organization
  213. [PM-1542] – convert test 024-sc4-gridftp-http to use yaml
  214. [PM-1543] – convert test 025-sc4-file-http to use yaml
  215. [PM-1546] – add metadata to entry in replica catalog
  216. [PM-1548] – preserve case for property keys when properties python api is used
  217. [PM-1549] – registration of outputs in Pegasus 5.0
  218. [PM-1555] – Chapter 2. Tutorial – Update and review
  219. [PM-1556] – Chapter 3. Installation – Update and review
  220. [PM-1557] – Chapter 4. Creating Workflows – Update and review
  221. [PM-1558] – Chapter 5. Running Workflows – Update and review
  222. [PM-1559] – Chapter 6. Monitoring, Debugging and Statistics – Update and review
  223. [PM-1560] – Chapter 7. Execution Environments – Update and review
  224. [PM-1561] – Chapter 8. Containers – Update and review
  225. [PM-1562] – Chapter 9. Example Workflows – Update and review
  226. [PM-1563] – Chapter 10. Data Management – Update and review
  227. [PM-1564] – Chapter 11. Optimizing Workflows for Efficiency and Scalability – Update and review
  228. [PM-1565] – Chapter 12. Pegasus Service – Update and review
  229. [PM-1566] – Chapter 13. Configuration – Update and review
  230. [PM-1567] – Chapter 14. Submit Directory Details – Update and review
  231. [PM-1568] – Chapter 15. Jupyter Notebooks – Update and review
  232. [PM-1569] – Chapter 16. API Reference – Update and review
  233. [PM-1570] – Chapter Command Line Tools man pages – Update and review
  234. [PM-1571] – rewrite Introduction
  235. [PM-1572] – Chapter “Packages” – Update and review
  236. [PM-1573] – Chapter 17. Useful Tips – Update and review
  237. [PM-1574] – Chapter 18. Funding, citing, and anonymous usage statistics – Update and review
  238. [PM-1575] – Chapter 19. Glossary – Update and review
  239. [PM-1576] – Chapter 20. Tutorial VM – Update and review
  240. [PM-1577] – convert test 045-hierarchy-sharedfs
  241. [PM-1578] – Migration Guide to 5.0
  242. [PM-1579] – convert test 045-hierarchy-sharedfs-b to use yaml
  243. [PM-1582] – ensure checksum and file metadata also appears in the output replica catalog
  244. [PM-1583] – Parse meta files as a RC backend in the planner
  245. [PM-1585] – remove glibc from schema and api as it is not used anymore
  246. [PM-1586] – add built-in support for pathlib.Path objects where ever paths are used
  247. [PM-1587] – _DirectoryType enums should have underscores in name
  248. [PM-1588] – in add_pegasus_profile(), add data_configuration as a kwarg
  249. [PM-1590] – in the Workflow object set infer_dependencies to be True by default
  250. [PM-1591] – Only require site and pfn in Transformation constructor when automatically creating a TransformationSite
  251. [PM-1596] – pegasus-db-admin should use PEGASUS_HOME to discover pegasus-version etc
  252. [PM-1597] – Pegasus Run
  253. [PM-1598] – Output originating from pegasus-tools should be output by workflow object as is (without any log category)
  254. [PM-1599] – Improve exception handling for failed execution of pegasus client commands
  255. [PM-1600] – Update workflow and client python apis to support multiple output sites
  256. [PM-1601] – Client plan input_dir must take in a list of str
  257. [PM-1604] – Fix deprecation warnings
  258. [PM-1605] – Get ensemble manager running on master (Python3)
  259. [PM-1606] – Merge in/factor in code implemented in add-ensemble-triggers branch
  260. [PM-1607] – Add time interval based triggering on a given directory/file pattern
  261. [PM-1609] – register_replica should be set to True by default
  262. [PM-1610] – ensure pegasus-db-admin downgrade works
  263. [PM-1611] – add pegasus-graphviz functionality into the api
  264. [PM-1612] – update monitord to record avg cpu utilization and maxrss
  265. [PM-1613] – update database schema to track maxrss and avg_cpu
  266. [PM-1614] – update planner for revised 5.0 replica catalog format
  267. [PM-1615] – unique constraint failed error when using multiple output sites
  268. [PM-1616] – add checksums for executables
  269. [PM-1617] – add missing integrity check for executables
  270. [PM-1618] – update 039-black-metadata to use python 5.0 api
  271. [PM-1623] – fix 5.0 python auto generated python api documentation
  272. [PM-1624] – convert 032-black-checkpoint
  273. [PM-1625] – Pegasus specific profile for requesting GPU resources
  274. [PM-1627] – workflow uuid, submit dir, submit hostname, root wf id should be accessible from the workflow object
  275. [PM-1628] – files generated by the api should have comments specifying that they have been auto generated by the api
  276. [PM-1629] – Add metadata to container
  277. [PM-1631] – create 032-kickstart-chkpoint-signal-condorio
  278. [PM-1632] – create 032-kickstart-chkpoint-signal-nonsharedfs
  279. [PM-1633] – CLI: manpage pegasus-plan
  280. [PM-1634] – CLI: manpage pegasus-db-admin
  281. [PM-1635] – CLI: manpage pegasus-status
  282. [PM-1636] – CLI: manpage pegasus-remove
  283. [PM-1637] – CLI: manpage pegasus-analyzer
  284. [PM-1638] – CLI: manpage pegasus-statistics
  285. [PM-1639] – CLI: manpage pegasus-run
  286. [PM-1640] – CLI: manpage pegasus-transfer
  287. [PM-1641] – CLI: manpage pegasus-s3
  288. [PM-1642] – CLI: manpage pegasus-integrity
  289. [PM-1643] – CLI: manpage pegasus-rc-converter
  290. [PM-1644] – CLI: manpage pegasus-sc-converter
  291. [PM-1645] – CLI: manpage pegasus-tc-converter
  292. [PM-1646] – update database overview, and update schema picture
  293. [PM-1648] – add unit tests that put/pull to/from aws s3 and ceph
  294. [PM-1649] – remove old config parameters if they are not used
  295. [PM-1652] – update pegasus client api to pass the workflow to be planned at the end
  296. [PM-1655] – the metrics server should pick up key wf_api , and default back to dax_api if not present
  297. [PM-1656] – Add unit tests to ensure YAML is being generated correctly
  298. [PM-1659] – in the python api, add pegasus profile key container.arguments
  299. [PM-1660] – add boolean bypass flag for input files
  300. [PM-1661] – expose bypass parameter for Containers and TransformationSites
  301. [PM-1662] – document behavior in user guide about bypassing file staging
  302. [PM-1665] – expose --randomdir/--randomdir=<path> in client code in the python api
  303. [PM-1666] – get live output from pegasus cli tools when they are called with the client code
  304. [PM-1667] – when client code is called, it should be more obvious what pegasus-<tool> is being called
  305. [PM-1668] – fb-nlp workflow generator creates multiple jobs that create same output file
  306. [PM-1669] – remove “pegasus: <version>” from tc when inlined in a workflow
  307. [PM-1670] – expose a schema validation function that can be used in unit tests
  308. [PM-1682] – add --reuse option to Workflow.plan in python api
  309. [PM-1683] – update section 7.3 supported transfer protocols in data management guide
  310. [PM-1684] – reorg table of contents
  311. [PM-1688] – update pegasus-db-admin as a new ‘trigger’ table has been added to the schema
  312. [PM-1693] – convert test 010-runtime-clustering-CondorIO to use python api
  313. [PM-1694] – convert test 010-runtime-clustering-Non-SharedFS to use python api
  314. [PM-1695] – convert 010-runtime-clustering-SharedFS to use python api
  315. [PM-1696] – convert 010-runtime-clustering-SharedFS Staging and No Kickstart to use python api
  316. [PM-1698] – Release notes for 5.0 release
  317. [PM-1699] – add logo to the documentation pages
  318. [PM-1700] – add SIGINT handler to Client.wait() so that it can exit cleanly
  319. [PM-1701] – planner should get chmod jobs to run locally if compute site has auxiliary.local set

Bugs Fixed

  1. [PM-1150] – Pegasus should verify that required credentials exists before starting a workflow
  2. [PM-1192] – User supplied env setup script for lite
  3. [PM-1199] – Notification naming / meta notifications
  4. [PM-1326] – singularity suffix computed incorrectly
  5. [PM-1327] – bypass input file staging broken for container execution
  6. [PM-1330] – .meta files created even when integrity checking is disabled.
  7. [PM-1332] – monitord is failing on a dagman.out file
  8. [PM-1333] – amqp endpoint errors should not disable database population for multiplexed sinks
  9. [PM-1334] – pegasus dagman is not exiting cleanly
  10. [PM-1336] – pegasus-submitdir is broken
  11. [PM-1346] – Pegasus job checkpointing is incompatible with condorio
  12. [PM-1347] – pegasus will always try and transfer output when a code has checkpointed
  13. [PM-1350] – pegasus is ignoring when_to_transfer_output
  14. [PM-1358] – HTCondor 8.8.0/8.8.1 remaps /tmp, and can break access to x509 credentials
  15. [PM-1360] – planner drops transfer_(in|out)put_files if NoGridStart is used
  16. [PM-1362] – Chinese characters in the file path
  17. [PM-1366] – Pegasus Cluster Label – Job Env Not Picked Up in Containers
  18. [PM-1377] – A + in a tc name breaks pegasus-plan
  19. [PM-1379] – Stage out job fails – wrong src location
  20. [PM-1380] – Support for Singularity Library
  21. [PM-1389] – pegasus.cores causes issues on Summit
  22. [PM-1395] – GLite LSF scripts don’t work as intended on OLCF’s DTNs
  23. [PM-1405] – Is Pegasus supposed to build on 32-bit x86 (Debian i386 Stretch)?
  24. [PM-1409] – Python virtual environments not considered first before system wide installation
  25. [PM-1421] – p-lite generated sh files fail, when used with Docker containers, that make use of USER argument
  26. [PM-1466] – Upgrade Python Package Versions
  27. [PM-1488] – Get condorpool worker machine ipaddr error!!!
  28. [PM-1506] – pegasus-python-wrapper locates executable from PEGASUS_HOME instead of dirname of the exec.
  29. [PM-1521] – Decaf Jobs does not generate valid JSON for Decaf anymore
  30. [PM-1522] – output not being captured in pegasus client code
  31. [PM-1530] – site selector does map job correctly when using stageable or all mapper with a containerized job
  32. [PM-1531] – pegasus-db-admin fails on macosx on a clean db
  33. [PM-1534] – integrity checking with bypass input staging if checksums are specified in replica catalog
  34. [PM-1595] – hostname is not generated in case of integrity failures
  35. [PM-1619] – pegasus-analyzer output doesn’t match pegasus-status
  36. [PM-1630] – PATH variable in Docker containers is not preserved in some cases
  37. [PM-1671] – ConfigParser still referenced in pegasus-worker CLI scripts
  38. [PM-1675] – planner throws error when using “” without specifying “”
  39. [PM-1676] – hierarchical workflow failing only when integrity checking is on
  40. [PM-1680] – kickstart needs to quote argument vector when outputting it to yaml
  41. [PM-1702] – registration jobs fail out of a 5.0 binary install
  42. [PM-1705] – priorities are not propagated correctly