2. Pegasus 5.0.x Series
2.1. Pegasus 5.0.9
Release Date: December 17, 2024
We are happy to announce the release of Pegasus 5.0.9, which is a minor bug fix release for the Pegasus 5.0 branch. The release includes a variety of improvements and bug fixes to Pegasus support for AWS Batch.
The release can be downloaded from https://pegasus.isi.edu/downloads
2.1.1. New Features and Improvements
[PM-1960] – Document debugging job submissions to HPC cluster #2072
[PM-1982] – pegasus-aws-batch setup/delete options should not fail at first error #2085
[PM-1983] – AWS Batch Error: ClientException: Maximum number of jobs supported is 100 #2086
[PM-1991] – pegasus-aws-batch fails while interacting with a bucket in us-east-1 #2087
[PM-1992] – AWS Batch Compute Environments and job definitions are not deleted reliably #2088
2.1.2. Bugs Fixed
[PM-1959] – SGE memory parameter needs a unit suffix to be added to remote_ce_requirements #2071
[PM-1976] – pegasus-aws-batch --log-file option #2082
[PM-1978] – amazon region is not being picked up correctly when creating a s3 bucket #2084
[PM-1997] – site selection for containerized executables should not consider container image url (whether file or not) #2089
[PM-1998] – pegasus-statistics lists incomplete jobs in a successful aws batch wf run #2090
2.2. Pegasus 5.0.8
Release Date: August 16, 2024
We are happy to announce the release of Pegasus 5.0.8, which is a minor bug fix release for the Pegasus 5.0 branch.
The release can be downloaded from https://pegasus.isi.edu/downloads
2.2.1. New Features and Improvements
[PM-1939] – Planner complains on http transfers if the user does not have a credentials file setup #2052
[PM-1940] – NPE for planning a workflow with sub workflows, where sub workflow job requires data from http endpoint #2053
[PM-1944] – Ability to specify a wrapper/launcher for containerized jobs in PegasusLite #2057
[PM-1946] – Improve Auth Token Acquisition For Globus Transfers #2059
[PM-1948] – Document deployment options on HPC centers #2061
2.2.2. Bugs Fixed
[PM-1941] – illegal state exception while using inplace cleanup #2054
[PM-1943] – mixed binary/conda installs broken #2056
[PM-1945] – transformation and container with same name causes error at runtime #2058
[PM-1947] – workflow restart fails because of error reading a tc file in /tmp #2060
[PM-1949] – NPE while planning a sub workflow #2062
[PM-1951] – For CLIs, factor out Pegasus Python module from system dir #2064
[PM-1953] – Monitord Failing with ModuleNotFoundError. #2066
[PM-1954] – Importing six.moves raises ModuleNotFoundError on Python 3.12 #2067
[PM-1958] – pegasus lite jobs fail at CIT if there is a lost+found dir in the condor scratch dir #2070
2.3. Pegasus 5.0.7
Release Date: February 21, 2024
We are happy to announce the release of Pegasus 5.0.7, which is a minor bug fix release for the Pegasus 5.0 branch.
The release can be downloaded from https://pegasus.isi.edu/downloads
This release includes improvements such as:
the ability for users to specify an input directory with executables to avoid creating a transformation catalog
support for SGE clusters in pegasus-init
support for private tokens while retrieving data from http endpoints
preference of apptainer executables over singularity
2.3.1. New Features and Improvements
[PM-1926] – pegasus should allow users to specify an input directory with executables to avoid creating a TC #2039
[PM-1888] – Apptainer support #2001
[PM-1929] – convenient way to associate profiles for a site in properties file #2042
[PM-1931] – support for local SGE cluster in pegasus-init #2044
[PM-1933] – support for private-token to curl invocations #2046
[PM-1936] – CLONE – support for local SGE cluster in pegasus-init #2049
[PM-1928] – update python api to add the -t|--transformations-dir to the planner #2041
[PM-1930] – add a convenience function to python api for adding site profiles #2043
[PM-1935] – update the planner to optionally use the credentials file for http transfers #2048
[PM-1937] – support for mixed binary/venv/conda installs #2050
2.3.2. Bugs Fixed
2.4. Pegasus 5.0.6
Release Date: June 30, 2023
We are happy to announce the release of Pegasus 5.0.6, which is a minor bug fix release for the Pegasus 5.0 branch.
The release can be downloaded from: https://pegasus.isi.edu/downloads
2.4.1. New Features and Improvements
2.4.2. Bugs Fixed
[PM-1905] File dependencies between sub workflow and compute jobs broken #2018
[PM-1906] Planner container mount point parsing breaks on . in the dir name #2019
[PM-1909] request_disk is incorrectly set to MBs instead of KBs #2022
[PM-1913] +DAGNodeRetry for attrib=value assignment breaks on HTCondor 10.0.x when direct submission is disabled #2026
[PM-1916] Data management between parent compute job and a sub workflow job broken #2029
[PM-1918] Inplace cleanup broken when a sub workflow job and a parent compute job has a data dependency #2031
2.5. Pegasus 5.0.5
Release Date: February 17, 2023
We are happy to announce the release of Pegasus 5.0.5, which is a minor bug fix release for the Pegasus 5.0 branch. This release corrects a build/packaging problem in 5.0.4 that resulted in the planner not finding all classes.
We invite our users to give it a try.
The release can be downloaded from: https://pegasus.isi.edu/downloads
2.5.1. Bugs Fixed
[PM-1904] Incomplete clean between ant targets #2017
2.6. Pegasus 5.0.4
Release Date: February 9, 2023
We are happy to announce the release of Pegasus 5.0.4, which is a minor bug fix release for the Pegasus 5.0 branch. This release has some important updates, namely:
Support for HTCondor 10.2 series
Improved sub workflow file handling
We invite our users to give it a try.
The release can be downloaded from: https://pegasus.isi.edu/downloads
2.6.1. New Features and Improvements
[PM-1890] pegasus-analyzer should show failing jobs #2003
[PM-1891] pegasus-analyzer should traverse all sub workflows #2004
[PM-1898] File dependencies for sub workflow jobs - differentiate inputs for planner use and those for sub workflow #2011
[PM-1899] update python api and json schema to expose forPlanning boolean attribute with files in uses section #2012
[PM-1900] update java wf api to support forPlanner attribute for files #2013
2.6.2. Bugs Fixed
[PM-1895] handle condor_submit updated way of specifying environment in the .dag.condor.sub file #2008
[PM-1893] need to explicitly mount sharedfilesystem dir into container when using shared filesystem as staging site for nonsharedfs #2006
[PM-1894] worker package transfer into application containers #2007
[PM-1896] In pegasus lite scripts worker package strict check is turned off #2009
[PM-1897] update pegasus-configure-glite to use BLAHPD_LOCATION #2010
2.7. Pegasus 5.0.3
Release Date: October 25, 2022
We are happy to announce the release of Pegasus 5.0.3, which is a minor bug fix release for the Pegasus 5.0 branch. This release has some important updates, namely:
Support for deep LFNs in CondorIO mode. If you bypass staging for input files, support for deep LFNs depends on the associated HTCondor ticket 1325, which will be fixed in HTCondor release 10.1.0.
Per Job Symlinking
New Containers exercise in the Pegasus Tutorial
We invite our users to give it a try.
The release can be downloaded from: https://pegasus.isi.edu/downloads
2.7.1. New Features and Improvements
[PM-1873] – Add a containers focussed exercise to the tutorial #1986
[PM-1875] – Support deep LFN’s in CondorIO mode #1988
[PM-1806] – Fix dest filenames for transformations #1920
[PM-1815] – Prevent pegasus-lite failure when user passes -w to docker #1928
[PM-1871] – Remove the “version” parameter from worker package transformation in TC documentation #1984
[PM-1879] – Per job symlinking #1992
[PM-1885] – Allow bypass of input files in CondorIO mode to be similar to behavior for nonsharedfs #1998
[PM-1876] – implement moveto support in pegasus-transfer #1989
[PM-1877] – mimic transfer_output_remaps in pegasus-lite-local.sh for local universe jobs #1990
2.7.2. Bugs Fixed
[PM-1809] – Job fails when a container has a pre-existing group with the same gid as the one being created with groupadd #1922
[PM-1864] – request_memory and request_disk does not get applied for local universe jobs #1977
[PM-1868] – pegasus-init cli tool not working #1981
[PM-1872] – Unable to locate executable when pegasus::worker transform is overridden #1985
[PM-1878] – hierarchical workflows broken on recent condor install #1991
[PM-1883] – users cannot decrease planner logging if pegasus.mode is set to debug #1996
[PM-1884] – pegasus-remove does not remove a running workflow #1997
[PM-1887] – update broken bamboo tests #2000
2.8. Pegasus 5.0.2
Release Date: April 22, 2022
We are happy to announce the release of Pegasus 5.0.2, which is a minor bug fix release for the Pegasus 5.0 branch. This release has some important updates, namely:
Updated Pegasus Log4J support to 2.17.
Globus Online Transfers now have support for consent options on endpoints
The release also has important bug fixes related to correctly detecting job failures for grid universe jobs.
We invite our users to give it a try.
The release can be downloaded from: https://pegasus.isi.edu/downloads
2.8.1. New Features and Improvements
[PM-1828] – Doc. mismatch #1941
[PM-1817] – pegasus-run with shell code generator #1930
[PM-1824] – decaf jobs should be associated with pegasus-exitcode postscript #1937
[PM-1825] – update glossary #1938
[PM-1826] – way to pass on additional arguments to clustered jobs #1939
[PM-1830] – Globus Online transfers required GO consent #1943
[PM-1835] – Upcoming changes to DAGMan output logging #1948
[PM-1836] – Update Pegasus Log4J support to 2.16 #1949
[PM-1839] – allow easy clustering of the whole workflow without associating labels for the jobs #1952
[PM-1673] – passing properties as str to Properties() can be error prone, add some preliminary checks before writing #1787
[PM-1759] – user facing class/function args shouldn’t be prefixed with _ such as _id #1873
[PM-1816] – make it easier to add entries to the replica catalog by inferring site, lfn, pfn from file Path or URL #1929
[PM-1822] – improve parsing of value in Mixins._to_mb(value) #1935
[PM-1827] – type check pfn in ReplicaCatalog.add_replica() #1940
[PM-1831] – planner by default should pick up credentials.conf when pegasus-s3 is used #1944
[PM-1834] – ensure that yaml is serialized in a deterministic manner #1947
[PM-1858] – pegasus-s3 should pick up PEGASUS_CREDENTIAL environment variable #1971
[PM-1859] – Document decaf as a clustering tool in the clustering guide #1972
2.8.2. Bugs Fixed
[PM-1763] – validate all strings that will then be used as filenames or used in sub files #1877
[PM-1821] – job failures not detected for grid universe jobs #1934
[PM-1823] – Serialization of pegasus.memory results in a floating point no. #1936
[PM-1832] – extraneous whitespace in arguments for sub workflow job with java generator #1945
[PM-1837] – planner throws null pointer exception when invalid staging site is given #1950
[PM-1838] – Intermediate outputs in a clustered job get sent back to staging site when they are not used by subsequent jobs outside of the cluster and when stage_out has been set to false for those files #1951
[PM-1840] – Workflow class methods will always be None #1953
[PM-1851] – PegasusLite submissions to local cluster (Slurm/PBS/etc) unable to source pegasus-lite-common.sh #1964
2.9. Pegasus 5.0.1
Release Date: October 7, 2021
We are happy to announce the release of Pegasus 5.0.1. Pegasus 5.0.1 is a minor bug fix release after Pegasus 5.0. We invite our users to give it a try.
The release features improvements to the Pegasus Python API, including the ability to statically visualize the abstract and generated executable workflows. It also has improved support for DECAF, including the ability to execute clustered jobs in a workflow using DECAF.
This release improves data access in PegasusLite jobs when the data resides on the local site and the job runs on a site where the “auxiliary.local” profile is set to true. Users can now use a new Submit Mapper called Named, which lets you specify the subdirectory in which a job’s submit files are placed.
The release also features updated support for submitting jobs to HPC clusters using HubZero Distribute, and a new pegasus.mode called “debug” that enables verbose logging throughout the Pegasus stack.
The release can be downloaded from: https://pegasus.isi.edu/downloads
2.9.1. New Features and Improvements
[PM-1726] – Update support for HubZero Distribute #1840
[PM-1751] – Named Submit Directory Mapper #1865
[PM-1798] – instead of the workflow having explicit data flow jobs, get pegasus to automatically cluster jobs to a decaf representation #1912
[PM-1753] – add Workflow.get_status() #1867
[PM-1767] – remove the default arguments, output_sites and cleanup in SubWorkflow.add_planner_args() #1881
[PM-1786] – update usage of threading.Thread.isAlive() to be is_alive() in python scripts #1900
[PM-1788] – Add configuration documentation for hierarchical workflows #1902
[PM-1429] – Introduce PEGASUS_ENV variable to define mode of workflow i.e. development, production, etc #1543
[PM-1651] – Add more profile keys in the add_pegasus_profile #1765
[PM-1672] – override add_args for SubWorkflow so that args refer to planner args #1786
[PM-1706] – sphinx has hardcoded versions #1820
[PM-1730] – 5.0.1 Python Api improvements #1844
[PM-1733] – expand on checkpointing documentation #1847
[PM-1739] – expose panda job submissions similar to how we support BOSCO #1853
[PM-1742] – allow a tc to be empty without the planner failing #1856
[PM-1743] – allow catalogs to be embedded into workflow when workflow contains sub workflows #1857
[PM-1747] – 031-montage-condor-io-jdbcrc failing #1861
[PM-1768] – replace GRAM workflow tests with bosco #1882
[PM-1769] – update tests since /nfs/ccg3 is gone now #1883
[PM-1771] – pegasus-db-admin upgrade #1885
[PM-1780] – Refactor Transfer Engine Code #1894
[PM-1787] – auxiliary.local is not considered when triggering symlink in PegasusLite in nonsharedfs mode #1901
[PM-1792] – decaf jobs over bosco #1906
[PM-1794] – put in support for additional keys required by decaf #1908
[PM-1796] – passing properties to be set for sub workflow jobs #1910
[PM-1800] – enable inplace cleanup for hierarchical workflows #1914
[PM-1802] – Add support for Debian 11 #1916
[PM-1803] – use force option when doing a docker rm of the container image #1917
[PM-1810] – Extend debug capabilities for pegasus.mode #1923
[PM-1811] – add pegasus-keg to worker package #1924
[PM-1818] – new pegasus.mode debug #1931
[PM-1723] – add_…_profile() should be plural #1837
[PM-1731] – functions that take in File objects as input parameters should also accept strings for convenience #1845
[PM-1744] – progress bar from wf.wait() should include “UNRDY” as shown in status output #1858
[PM-1755] – catalog write location should be stored upon call to catalog.write() #1869
[PM-1757] – add pegasus profile relative.submit.dir #1871
[PM-1784] – Refactor Stagein Generator code out of Transfer Engine #1898
[PM-1790] – extend site catalog schema to indicate shared file system access for a directory #1904
[PM-1791] – update planner to parse sharedFileSystem attribute from site catalog #1905
[PM-1797] – use logging over print statements #1911
[PM-1804] – add verbose options for development mode #1918
2.9.2. Bugs Fixed
[PM-1709] – the yaml handler in pegasus-graphviz needs to handle ‘checkpoint’ link type #1823
[PM-1722] – Job node_label attribute is not identified by the planner #1836
[PM-1725] – nodeLabel for a job needs to be parsed in yaml handler if it is given #1839
[PM-1736] – Pegasus pollutes the job env when getenv=true #1850
[PM-1737] – monitord fails on divide by 0 error while computing avg cpu utilization #1851
[PM-1745] – time.txt in stats is misformatted #1859
[PM-1746] – jobs aborted by dagman, but with kickstart exitcode as 0 are not marked as failed job #1860
[PM-1748] – planner fails with NPE on empty workflow #1862
[PM-1750] – ensemble mgr workflow priorities need to be reversed #1864
[PM-1752] – fix checkpoint.time in add_pegasus_profile #1866
[PM-1754] – pegasus-db-admin fails to upgrade database #1868
[PM-1761] – pegasus-analyzer showing “failed to send files” error when root cause is exec format error #1875
[PM-1762] – pegasus-analyzer showing no error at all when workflow failed based on status output #1876
[PM-1764] – fix pegasus-analyzer output typo #1878
[PM-1765] – for SubWorkflow jobs, the planner argument, --output-sites, isn’t being set #1879
[PM-1766] – for SubWorkflow jobs, the planner argument, --force, isn’t being set #1880
[PM-1770] – 041-jdbcrc-performance failing #1884
[PM-1772] – db upgrade leaves transient tables #1886
[PM-1777] – pegasus-graphviz producing incorrect dot file when redundant edges removed #1891
[PM-1779] – Stage out job executed on local instead of remote site (donut) #1893
[PM-1783] – bypass input staging in nonsharedfs mode does not work for file URL and auxiliary.local set #1897
[PM-1785] – hostnames missing from elasticsearch job data #1899
[PM-1789] – Scratch dir GET/PUT operations get overridden #1903
[PM-1795] – Output Mapper in conjunction with data dependencies between sub workflow jobs #1909
[PM-1799] – json schema validation fails for selector profiles #1913
[PM-1820] – Deserializing a YAML transformation files always sets the os.type to linux #1933
2.10. Pegasus 5.0.0
Release Date: November 12, 2020
We are happy to announce the release of Pegasus 5.0. Pegasus 5.0 is a major release of Pegasus and builds upon the beta version released a couple of months ago. It also includes all features and bug fixes from the 4.9 branch. We invite our users to give it a try.
The release can be downloaded from: https://pegasus.isi.edu/downloads
If you are an existing user, please carefully follow these instructions to upgrade at https://pegasus.isi.edu/docs/5.0.0/user-guide/migration.html#migrating-from-pegasus-4-9-x-to-pegasus-5-0
2.10.1. Highlights of the Release
Reworked Python API:
This new API has been developed from the ground up so that, in addition to generating the abstract workflow and all the catalogs, it now allows you to plan, submit, monitor, analyze and generate statistics of your workflow. To use this new Python API, refer to the Moving From DAX3 to Pegasus.api guide at https://pegasus.isi.edu/docs/5.0.0/user-guide/migration.html#moving-from-dax3
Adoption of YAML formats:
With Pegasus 5.0, we are moving to adoption of YAML for representation of all major catalogs. We have provided catalog converters for you to convert your existing catalogs to the new formats. In 5.0, the following are now represented in YAML:
Abstract Workflow
Replica Catalog
Transformation Catalog
Site Catalog
Kickstart Provenance Records
Python3 Support:
All Pegasus tools are Python 3 compliant.
5.0 release will require Python 3 on workflow submit node
Python PIP packages for workflow composition and monitoring
Default data configuration
In Pegasus 5.0, the default data configuration has been changed to condorio. Up to 4.9.x releases, the default configuration was sharedfs.
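Workflows that relied on the earlier default can set the data configuration explicitly. The following is a minimal sketch, not Pegasus code: it assumes the property is named pegasus.data.configuration (the name that appears in PM-1485) and that properties are written as key = value lines. The helper render_properties is hypothetical.

```python
def render_properties(props):
    """Render a dict as 'key = value' lines, one property per line, sorted by key."""
    return "".join(f"{key} = {value}\n" for key, value in sorted(props.items()))


# Assumption: pin the pre-5.0 default ("sharedfs"); 5.0 defaults to "condorio".
legacy = render_properties({"pegasus.data.configuration": "sharedfs"})
print(legacy, end="")
```

The rendered text can then be saved into whatever properties file the planner is pointed at.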
Zero configuration required to submit to local HTCondor pool
Data Management Improvements
New output replica catalog that registers outputs including file metadata such as size and checksums
Ability to do bypass staging of files at a per file, executable and container level
Improved support for hierarchical workflows allows you to create data dependencies between sub workflow jobs and compute jobs
Support for staging of generated outputs to multiple output sites
Support for integrity checking of user executables and application containers in addition to data
Support for webdav transfers
Easier enabling of data reuse by specifying previous workflow submit directories using --reuse option to pegasus-plan.
Stagein transfer jobs are assigned priorities based on the number of child compute jobs. Details can be found in JIRA ticket 1385
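The stage-in prioritization in the last item above can be illustrated with a small sketch. It is only an illustration under a stated assumption: the priority of a stage-in job is taken to be the count of its direct child compute jobs. The planner's actual formula is not given here, and the function and job names are hypothetical.

```python
from collections import defaultdict


def stagein_priorities(edges, stagein_jobs, compute_jobs):
    """Map each stage-in job to the number of its direct child compute jobs.

    edges is an iterable of (parent, child) job-name pairs.
    """
    children = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)
    compute = set(compute_jobs)
    return {job: len(children[job] & compute) for job in stagein_jobs}


edges = [
    ("stage_in_a", "compute_1"),
    ("stage_in_a", "compute_2"),
    ("stage_in_b", "compute_3"),
    ("stage_in_b", "cleanup_1"),  # not a compute job, so it is not counted
]
priorities = stagein_priorities(
    edges, ["stage_in_a", "stage_in_b"], ["compute_1", "compute_2", "compute_3"]
)
print(priorities)
```

Under this assumption, a transfer that unblocks more compute jobs receives a higher number and would be scheduled first.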
New Jupyter Notebook Based Tutorial
With this release, we are pleased to announce a brand new tutorial based on a Docker container running interactive Jupyter notebooks. You can access the tutorial at PM-1385 #1499
Support for CWL (Common Workflow Language)
The pegasus-cwl-converter command line tool has been developed to convert a subset of the Common Workflow Language (CWL) to Pegasus’s native YAML format. Given the following three files: a CWL workflow file, a workflow inputs specification file, and a transformation (executable) specification file, pegasus-cwl-converter will do a best-effort translation. This will also work with CWL workflow specifications that refer to Docker containers. The entire CWL language specification is not yet covered by this converter, and as such we can provide additional support in converting your workflows into the Pegasus YAML format.
Support for triggers in Ensemble Manager
The Pegasus Ensemble Manager is a service that manages collections of workflows. In this latest release of Pegasus, workflow triggering functionality has been added to this service. With the ensemble manager service up and running, the pegasus-em command can now be used to start workflow triggers. Two triggers are currently supported:
a cron based trigger: The cron based trigger will, at a given interval, submit a new workflow to your ensemble.
a cron based, file pattern trigger: The cron based, file pattern trigger, much like the cron based trigger, will submit a new workflow to your ensemble at a given time interval, with the addition of any new files that are detected based on a given file pattern. This is useful for automatically processing data as it arrives.
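The behaviour of the file pattern trigger described above can be sketched in a few lines. This is not the ensemble manager's implementation: the functions new_matching_files and run_trigger, and the callables passed to them, are hypothetical stand-ins that show the idea of polling at an interval and handing only newly detected files to each submitted workflow.

```python
import fnmatch


def new_matching_files(listing, pattern, seen):
    """Return files in listing that match pattern and were not seen before.

    seen is a set that is updated in place so later calls skip these files.
    """
    fresh = sorted(
        name for name in listing
        if fnmatch.fnmatchcase(name, pattern) and name not in seen
    )
    seen.update(fresh)
    return fresh


def run_trigger(list_files, submit, pattern, rounds, wait):
    """Poll `rounds` times; submit one workflow per round with the new files."""
    seen = set()
    for _ in range(rounds):
        submit(new_matching_files(list_files(), pattern, seen))
        wait()  # in a real trigger this would sleep for the configured interval


# Two simulated directory listings, one per polling round.
snapshots = iter([["a.csv", "notes.txt"], ["a.csv", "b.csv"]])
submitted = []
run_trigger(lambda: next(snapshots), submitted.append, "*.csv", rounds=2, wait=lambda: None)
print(submitted)
```

Passing the listing, submit, and wait steps in as callables keeps the sketch testable without touching a real file system or scheduler.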
Improved events for each job reported to AMQP
Historically, the events reported to AMQP endpoints have been normalized events corresponding to the stampede database, which makes correlation hard. Pegasus now also reports a new job composite event (stampede.job_inst.composite) to AMQP endpoints that contains complete information about a job execution.
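To illustrate why a composite event is easier to consume, the sketch below merges several normalized events for the same job instance into one record. Only the event name stampede.job_inst.composite comes from these notes; the input event names, the field names (job_inst_id, submit_time, exitcode) and the function composite_events are assumptions made for the example, not the actual monitord schema.

```python
def composite_events(events, key="job_inst_id"):
    """Merge normalized events that share `key` into one record per job instance."""
    merged = {}
    for event in events:
        record = merged.setdefault(event[key], {"event": "stampede.job_inst.composite"})
        # Copy every field except the per-event name into the combined record.
        record.update({k: v for k, v in event.items() if k != "event"})
    return list(merged.values())


# Hypothetical normalized events; a consumer would otherwise have to join these itself.
normalized = [
    {"event": "submit", "job_inst_id": 1, "submit_time": 100},
    {"event": "end", "job_inst_id": 1, "exitcode": 0},
    {"event": "submit", "job_inst_id": 2, "submit_time": 105},
]
combined = composite_events(normalized)
print(combined)
```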
Revamped Documentation
Documentation has been overhauled and broken down into a user guide and a reference guide. In addition, we have moved to readthedocs style documentation using restructured text. The documentation can be found at https://pegasus.isi.edu/docs/5.0.0/index.html.
pegasus-statistics reports memory usage and avg cpu utilization
pegasus-statistics now reports memory usage and average cpu utilization for your jobs in the transformation statistics file (breakdown.txt).
PegasusLite Improvements
Users can now specify environment setup scripts in the site catalog that need to be sourced to setup the environment before the job is launched by PegasusLite. More details at https://pegasus.isi.edu/docs/5.0.0/reference-guide/pegasus-lite.html#setting-the-environment-in-pegasuslite-for-your-job
Users can get the transfers in PegasusLite to run on DTN nodes while the jobs run on the compute nodes. More details at https://pegasus.isi.edu/docs/5.0.0/reference-guide/pegasus-lite.html#specify-compute-job-in-pegasuslite-to-run-on-different-node
Credentials existence is checked upfront by planner
Performance improvements for pegasus-rc-client
pegasus-rc-client now does bulk inserts when inserting entries into a database backed Replica Catalog.
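The general technique of bulk insertion can be shown with Python's standard sqlite3 module. This is a generic sketch, not the pegasus-rc-client code: the table name rc, its columns (lfn, pfn, site) and the helper functions are assumptions chosen for the example.

```python
import sqlite3


def open_catalog():
    """Create an in-memory database with a hypothetical replica table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rc (lfn TEXT, pfn TEXT, site TEXT)")
    return conn


def bulk_insert(conn, entries):
    """Insert all entries with one executemany call inside a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO rc (lfn, pfn, site) VALUES (?, ?, ?)", entries)


conn = open_catalog()
bulk_insert(conn, [(f"f{i}.txt", f"file:///data/f{i}.txt", "local") for i in range(1000)])
count = conn.execute("SELECT COUNT(*) FROM rc").fetchone()[0]
print(count)
```

Batching rows into one statement and one transaction avoids paying a commit per entry, which is typically where row-at-a-time inserts lose time.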
Pegasus ensures a consistent UTF8 environment across full workflow
Details can be found in JIRA ticket 1592.
2.10.2. New Features
[PM-603] – Enable workflows to reference local input files directly instead of symlinking them #721
[PM-1133] – Kickstart should send a heartbeat so that condor can kill stuck jobs #1247
[PM-1156] – PegasusLite to tar up the contents of the cwd in case of job failure #1270
[PM-1278] – stats should include cpu and memory utilization #1392
[PM-1309] – develop a pip package that only contains the DAX API and the catalogs API #1423
[PM-1335] – YAML based transformation catalog #1449
[PM-1339] – construct a default entry for local site if not present in site catalog #1453
[PM-1345] – Support for Shifter at Nersc #1459
[PM-1351] – YAML based kickstart records #1465
[PM-1352] – Build failure on Debian 10 due to mariadb/MySQL-Python incompatibility #1466
[PM-1354] – pegasus-init to support titan tutorial #1468
[PM-1355] – composite records when sending events to AMQP #1469
[PM-1357] – In lite jobs, chirp durations for stage in, stage out of data #1471
[PM-1367] – Support for retrieval from HPSS tape store using commands htar and hsi #1481
[PM-1390] – ensure all machine parseable information is one file associated with job #1504
[PM-1396] – kickstart yaml parser fails because of : unacceptable character #x001b: special characters are not allowed #1510
[PM-1398] – include machine information in job_instance.composite event #1512
[PM-1402] – pegasus-init to support summit as execution env for tutorial #1516
[PM-1411] – create the schema for a YAML based DAX #1525
[PM-1438] – YAML Based Site Catalog #1552
[PM-1461] – Ability to specify a wrapper/launcher for compute jobs in PegasusLite #1575
[PM-1470] – pegasus-graphviz needs a yaml handler #1584
[PM-1493] – YAML Based Replica Catalog #1607
[PM-1501] – Parse YAML DAX files #1615
[PM-1516] – Planner should create a default condorpool compute site if a user does not have it specified #1630
[PM-1528] – update DAX R API to emit new workflow format in yaml #1642
[PM-1529] – set default data configuration to condorio #1643
[PM-1551] – Update JAVA DAX API to generate yaml formatted DAX #1665
[PM-1552] – move to using pegasus lite for cases, where we transfer pegasus-transfer #1666
[PM-1608] – data dependencies between dax jobs and compute jobs in a workflow #1722
[PM-1620] – enable integrity checking for containers #1734
[PM-1681] – Enable easy data reuse from previous runs #1795
[PM-1685] – Python package to clone github repos for pegasus-init in 5.0 #1799
[PM-1286] – Deprecate Perl DAX API #1400
[PM-1376] – Add LSF local attributes #1490
[PM-1378] – Handle (copy) HPSS credentials when an environment variable is set #1492
[PM-1382] – Add ppc64le to the known architectures #1496
[PM-1383] – Switching to AMQP 0.9.1 in Pegasus Monitord #1497
[PM-1400] – Remove MacOS .pkg builder #1514
[PM-1401] – Deprecate pegasus-plots #1515
[PM-1412] – Upgrade documentation #1526
[PM-1413] – Upgrade Pegasus Databases to be Unicode Compatible #1527
[PM-1416] – add data collection setup instructions to docs under section 6.7.1.1. Monitord, RabbitMQ, ElasticSearch Example #1530
[PM-1446] – create tests for python client #1560
[PM-1467] – Update jupyter notebook code to use new 5.0 api #1581
[PM-1484] – create a default site catalog #1598
[PM-1485] – change default pegasus.data.configuration from sharedfs to condorio #1599
[PM-1486] – update default filenames for catalogs and workflow #1600
[PM-1495] – checksum.value and checksum.type need to be added as optional rc entry fields #1609
[PM-1510] – stageOut and registerReplica fields in Uses to be omitted for Uses of type input #1624
[PM-1513] – Merge DECAF branch to master to support decaf integration #1627
[PM-1524] – Use entrypoint in docker containers #1638
[PM-1533] – Database Schema Cleanup #1647
[PM-1537] – update existing workflow tests to use yaml #1651
[PM-1547] – for hierarchical workflows, sc and tc cannot be inlined into the workflow file #1661
[PM-1602] – Decaf development for the tess_dense example #1716
[PM-1663] – remove old grid types from schema and add slurm to scheduler type #1777
[PM-1679] – pegasus-s3 mkdir should cleanly exit if bucket already exists and is owned by user #1793
[PM-1689] – pegasus-rc-client delete semantics #1803
[PM-1692] – Add any missing options from pegasus-plan cli to Workflow.plan() and Client.plan() #1806
[PM-1697] – Add major and minor number options to pegasus-version #1811
[PM-643] – Better support for stdout of clustered jobs #761
[PM-1049] – Jobs should not be retried immediately, but rather delayed for some time #1163
[PM-1170] – statistics should include memory details #1284
[PM-1232] – Allow for multiple output sites #1346
[PM-1235] – Python 2/3 compatible code #1349
[PM-1247] – Revisit release-tools/get-system-python #1361
[PM-1321] – Move transfer staging into the container rather than the host OS #1435
[PM-1323] – pegasus transfer should not try and transfer a file that does not exist #1437
[PM-1324] – pegasus plan should be able to create site scratch directories with unique names #1438
[PM-1329] – pegasus integrity causes LIGO workflows to fail #1443
[PM-1338] – Add support for TACC wrangler to pegasus-init #1452
[PM-1341] – Transition from vendored configobj to release #1455
[PM-1344] – update pam usage by pamela #1458
[PM-1349] – Improve error message when jobs fail due to deep dir. structure with depth of > 20. #1463
[PM-1356] – Replace Google CDN with a different CDN as China block it #1470
[PM-1359] – Don’t support Chinese character in the file path #1473
[PM-1363] – Condor Configuration MOUNT_UNDER_SCRATCH causes pegasus auxiliary jobs to fail #1477
[PM-1368] – Implement Catch and Release for integrity errors #1482
[PM-1373] – Debian Buster no longer provides openjdk-8-jdk #1487
[PM-1374] – make monitord resilient to dagman logging the debug level in dagman.out #1488
[PM-1375] – Do not run integrity checks on symlinked files #1489
[PM-1385] – Prioritize transfers based on dependencies #1499
[PM-1386] – bypass should be a per-file option #1500
[PM-1391] – Allow properties to be set via environment variables. #1505
[PM-1392] – YAML based braindump file #1506
[PM-1403] – Support POWER9 nodes in PMC #1517
[PM-1417] – add type field in job, dax, and dag #1531
[PM-1418] – No code to handle postgresql backup. #1532
[PM-1427] – remove STAT profile namespace #1541
[PM-1429] – Introduce PEGASUS_ENV variable to define mode of workflow i.e. development, production, etc #1543
[PM-1431] – Remove -0 suffixes from generated code files. #1545
[PM-1432] – Remove deprecated pegasus-plan CLI args #1546
[PM-1459] – Upgrade Java Unit Test Setup #1573
[PM-1462] – Remove pegasus-submit-dag #1576
[PM-1463] – improve insert performance of pegasus-rc-client #1577
[PM-1472] – Update pegasus-s3 regions #1586
[PM-1474] – Extend command line tools with --json option. #1588
[PM-1476] – Explore deprecation/replacement of Perl codebase #1590
[PM-1481] – add status progress bar to python client code #1595
[PM-1489] – Implement a generic data access credential handler #1603
[PM-1490] – Support Webdav for data transfers #1604
[PM-1508] – in the python api, the method signature to manually add dependencies needs to be updated #1622
[PM-1512] – Removed unused code from planner codebase #1626
[PM-1514] – Move to boto3 and port p-s3 #1628
[PM-1527] – p-status shows failure when state is unknown #1641
[PM-1532] – python client “plan” should accept --sites as a list and needs a --staging-site flag #1646
[PM-1539] – Support PANDA GAHP – allow condorio for glite style #1653
[PM-1544] – pegasus-transfer logs skips container verification when retrieving from singularity hub #1658
[PM-1545] – handling metadata in 5.0 #1659
[PM-1550] – resolve relative input-dir and output-dir options for dax jobs in hierarchical workflows #1664
[PM-1580] – build packages for ubuntu 20 for 5.0 #1694
[PM-1581] – pegasus-integrity callout from kickstart #1695
[PM-1584] – 5.0 Python Api Improvements #1698
[PM-1592] – Consistent UTF8 environment across full workflow #1706
[PM-1594] – Short circuit p-transfer in the case of excessive failure in large transfers #1708
[PM-1621] – Basic support for gpus in Docker and Singularity containers #1735
[PM-1622] – /bin/sh compatibility #1736
[PM-1626] – Allow users to specify arbitrary cli arguments for containers #1740
[PM-1647] – Add more profile keys in the add_condor_profile #1761
[PM-1650] – replace --dax with a positional argument #1764
[PM-1651] – Add more profile keys in the add_pegasus_profile #1765
[PM-1654] – pick up workflow api from vendor extensions encoded in the workflow #1768
[PM-1657] – a pre-flight check needs to be done for the python package attrs #1771
[PM-1674] – Deprecate hints profile namespace and use selector namespace #1788
[PM-1687] – disable warnings when validating yaml files #1801
[PM-1691] – Add pre script hook to hub repos #1805
[PM-1236] – Compatibility Module #1350
[PM-1237] – Python 3: Pegasus Dashboard #1351
[PM-1238] – Python 3: Pegasus Monitord #1352
[PM-1239] – Python 2/3 Compatible: Pegasus Transfer #1353
[PM-1240] – Python 3: Pegasus Statistics #1354
[PM-1241] – Python 3: Pegasus DB Admin #1355
[PM-1242] – Python 3: Pegasus Metadata #1356
[PM-1328] – support sharedfs on the compute site as staging site #1442
[PM-1340] – make planner os releases consistent with builds #1454
[PM-1353] – update monitord to parse both xml and yaml based ks records #1467
[PM-1361] – create the schema for a YAML based Site Catalog #1475
[PM-1365] – remove __ from event keys wherever possible #1479
[PM-1371] – Python 3 Compatible: DAX #1485
[PM-1372] – Python 3 Compatible: pegasus-exitcode #1486
[PM-1381] – Associated planner changes to handle LSF sites #1495
[PM-1387] – make the netlogger events consistent with the documentation #1501
[PM-1404] – pegasus tutorial for summit from Kubernetes #1518
[PM-1407] – Python 3: Netlogger Monitord Code #1521
[PM-1408] – Python 3: Pegasus Analyzer #1522
[PM-1410] – create the schema for a YAML based Replica Catalog #1524
[PM-1419] – Java #1533
[PM-1420] – add checksum field for Containers in the TransformationCatalog schema #1534
[PM-1422] – integrate the client code into the dax api #1536
[PM-1423] – refactor and improve tests #1537
[PM-1428] – change dax and dag to subworkflow #1542
[PM-1430] – ensure that api can write out unicode characters in utf-8 #1544
[PM-1433] – Remove old SC RC TC catalog versions #1547
[PM-1435] – create a “moving from dax3 to new api” doc #1549
[PM-1436] – implement method chaining using decorators #1550
[PM-1437] – add type annotations #1551
[PM-1439] – Planner support for YAML based Site Catalog #1553
[PM-1441] – Python #1555
[PM-1442] – Integrate Check in CI #1556
[PM-1443] – implement api for pegasus.conf #1557
[PM-1445] – update monitord to parse yaml based brain dump file #1559
[PM-1447] – update pegasus-sc-converter to convert old format catalog to new yaml based one #1561
[PM-1448] – SiteFactory should auto detect version and load the correct implementation #1562
[PM-1450] – Module to load/dump YAML files #1564
[PM-1453] – Module to load/dump Workflow(DAX) files #1567
[PM-1454] – Module to load/dump Replica Catalog files #1568
[PM-1455] – Module to load/dump Transformation Catalog files #1569
[PM-1456] – Module to load/dump Site Catalog files #1570
[PM-1457] – Module to load/dump Properties files #1571
[PM-1460] – add docs to user guide #1574
[PM-1464] – disallow variable expansion while converting site catalog from one format to another #1578
[PM-1465] – Module to load/dump JSON files #1579
[PM-1468] – remove catalog #1582
[PM-1469] – update Pegasus-DAX3-Tutorial.ipynb #1583
[PM-1471] – update jupyter docs #1585
[PM-1473] – fix logging bug #1587
[PM-1475] – Pegasus Plan #1589
[PM-1477] – pegasus-config #1591
[PM-1478] – pegasus-remove #1592
[PM-1479] – pegasus-run #1593
[PM-1480] – pegasus-status #1594
[PM-1482] – Clean up javadoc warnings #1596
[PM-1483] – rename ProfileMixin.add_() to ProfileMixin.add_profile_() #1597
[PM-1487] – update python api to write new default filenames #1601
[PM-1491] – Update pegasus-tc-converter to convert from old format to new YAML format #1605
[PM-1492] – support docker/singularity container usage in cwl #1606
[PM-1494] – Parse YAML Based RC files #1608
[PM-1496] – update rc schema so that rc entries can have checksum fields #1610
[PM-1497] – update rc api to support checksum fields #1611
[PM-1498] – update RC docs to mention checksum values #1612
[PM-1499] – Implement YAML Based RC Backend #1613
[PM-1500] – create workflow test #1614
[PM-1502] – add infer_dependencies=True to wf.write #1616
[PM-1503] – in the python api, write out yml in order #1617
[PM-1504] – flatten out the uses section. remove the extra file property aggregation #1618
[PM-1505] – for job arguments, files should be added as strings #1619
[PM-1507] – update schema for stdin, stdout, stderr in jobs s.t. only lfn is used #1621
[PM-1509] – update “type” field options in the AbstractJob schema to be “job”, “pegasusWorkflow”, and “condorWorkflow” #1623
[PM-1511] – Update DAXParser Factory to load the right parser based on content of input dax file #1625
[PM-1515] – prefer catalog entries for SC, RC and TC in the DAX over everything else #1629
[PM-1517] – update default --sites option in python client planner wrapper code #1631
[PM-1518] – Update Replica Factory to auto load RC backend based on type of file #1632
[PM-1519] – update schema for job args s.t. types can be strings and scalars #1633
[PM-1520] – Update Transformation Factory to auto load Transformation backend based on type of file #1634
[PM-1523] – work on replica catalog converter #1637
[PM-1525] – support in planner for compound transformations #1639
[PM-1526] – update python api to write out tr requirements in the format “namespace::name:version” #1640
[PM-1535] – default paths picked up should be logged in the properties file in the submit directory #1649
[PM-1538] – convert test 023-sc4-ssh-http to use yaml #1652
[PM-1540] – allow mount point regex to parse shell variable names #1654
[PM-1541] – change the pegasus-plan invocation via pegasus lite in dagman prescripts to handle new worker package organization #1655
[PM-1542] – convert test 024-sc4-gridftp-http to use yaml #1656
[PM-1543] – convert test 025-sc4-file-http to use yaml #1657
[PM-1546] – add metadata to entry in replica catalog #1660
[PM-1548] – preserve case for property keys when properties python api is used #1662
[PM-1549] – registration of outputs in Pegasus 5.0 #1663
[PM-1555] – Chapter 2. Tutorial – Update and review #1669
[PM-1556] – Chapter 3. Installation – Update and review #1670
[PM-1557] – Chapter 4. Creating Workflows – Update and review #1671
[PM-1558] – Chapter 5. Running Workflows – Update and review #1672
[PM-1559] – Chapter 6. Monitoring, Debugging and Statistics – Update and review #1673
[PM-1560] – Chapter 7. Execution Environments – Update and review #1674
[PM-1561] – Chapter 8. Containers – Update and review #1675
[PM-1562] – Chapter 9. Example Workflows – Update and review #1676
[PM-1563] – Chapter 10. Data Management – Update and review #1677
[PM-1564] – Chapter 11. Optimizing Workflows for Efficiency and Scalability – Update and review #1678
[PM-1565] – Chapter 12. Pegasus Service – Update and review #1679
[PM-1566] – Chapter 13. Configuration – Update and review #1680
[PM-1567] – Chapter 14. Submit Directory Details – Update and review #1681
[PM-1568] – Chapter 15. Jupyter Notebooks – Update and review #1682
[PM-1569] – Chapter 16. API Reference – Update and review #1683
[PM-1570] – Chapter Command Line Tools man pages – Update and review #1684
[PM-1571] – rewrite Introduction #1685
[PM-1572] – Chapter “Packages” – Update and review #1686
[PM-1573] – Chapter 17. Useful Tips – Update and review #1687
[PM-1574] – Chapter 18. Funding, citing, and anonymous usage statistics – Update and review #1688
[PM-1575] – Chapter 19. Glossary – Update and review #1689
[PM-1576] – Chapter 20. Tutorial VM – Update and review #1690
[PM-1577] – convert test 045-hierarchy-sharedfs #1691
[PM-1578] – Migration Guide to 5.0 #1692
[PM-1579] – convert test 045-hierarchy-sharedfs-b to use yaml #1693
[PM-1582] – ensure checksum and file metadata also appears in the output replica catalog #1696
[PM-1583] – Parse meta files as a RC backend in the planner #1697
[PM-1585] – remove glibc from schema and api as it is not used anymore #1699
[PM-1586] – add built-in support for pathlib.Path objects wherever paths are used #1700
[PM-1587] – _DirectoryType enums should have underscores in name #1701
[PM-1588] – in add_pegasus_profile(), add data_configuration as a kwarg #1702
[PM-1590] – in the Workflow object set infer_dependencies to be True by default #1704
[PM-1591] – Only require site and pfn in Transformation constructor when automatically creating a TransformationSite #1705
[PM-1596] – pegasus-db-admin should use PEGASUS_HOME to discover pegasus-version etc #1710
[PM-1597] – Pegasus Run #1711
[PM-1598] – Output originating from pegasus-tools should be output by workflow object as is (without any log category) #1712
[PM-1599] – Improve exception handling for failed execution of pegasus client commands #1713
[PM-1600] – Update workflow and client python apis to support multiple output sites #1714
[PM-1601] – Client plan input_dir must take in a list of str #1715
[PM-1604] – Fix deprecation warnings #1718
[PM-1605] – Get ensemble manager running on master (Python3) #1719
[PM-1606] – Merge in/factor in code implemented in add-ensemble-triggers branch #1720
[PM-1607] – Add time interval based triggering on a given directory/file pattern #1721
[PM-1609] – register_replica should be set to True by default #1723
[PM-1610] – ensure pegasus-db-admin downgrade works #1724
[PM-1611] – add pegasus-graphviz functionality into the api #1725
[PM-1612] – update monitord to record avg cpu utilization and maxrss #1726
[PM-1613] – update database schema to track maxrss and avg_cpu #1727
[PM-1614] – update planner for revised 5.0 replica catalog format #1728
[PM-1615] – unique constraint failed error when using multiple output sites #1729
[PM-1616] – add checksums for executables #1730
[PM-1617] – add missing integrity check for executables #1731
[PM-1618] – update 039-black-metadata to use python 5.0 api #1732
[PM-1623] – fix 5.0 python auto generated python api documentation #1737
[PM-1624] – convert 032-black-checkpoint #1738
[PM-1625] – Pegasus specific profile for requesting GPU resources #1739
[PM-1627] – workflow uuid, submit dir, submit hostname, root wf id should be accessible from the workflow object #1741
[PM-1628] – files generated by the api should have comments specifying that they have been auto generated by the api #1742
[PM-1629] – Add metadata to container #1743
[PM-1631] – create 032-kickstart-chkpoint-signal-condorio #1745
[PM-1632] – create 032-kickstart-chkpoint-signal-nonsharedfs #1746
[PM-1633] – CLI: manpage pegasus-plan #1747
[PM-1634] – CLI: manpage pegasus-db-admin #1748
[PM-1635] – CLI: manpage pegasus-status #1749
[PM-1636] – CLI: manpage pegasus-remove #1750
[PM-1637] – CLI: manpage pegasus-analyzer #1751
[PM-1638] – CLI: manpage pegasus-statistics #1752
[PM-1639] – CLI: manpage pegasus-run #1753
[PM-1640] – CLI: manpage pegasus-transfer #1754
[PM-1641] – CLI: manpage pegasus-s3 #1755
[PM-1642] – CLI: manpage pegasus-integrity #1756
[PM-1643] – CLI: manpage pegasus-rc-converter #1757
[PM-1644] – CLI: manpage pegasus-sc-converter #1758
[PM-1645] – CLI: manpage pegasus-tc-converter #1759
[PM-1646] – update database overview, and update schema picture #1760
[PM-1648] – add unit tests that put/pull to/from aws s3 and ceph #1762
[PM-1649] – remove old config parameters if they are not used #1763
[PM-1652] – update pegasus client api to pass the workflow to be planned at the end #1766
[PM-1655] – the metrics server should pick up key wf_api, and default back to dax_api if not present #1769
[PM-1656] – Add unit tests to ensure YAML is being generated correctly #1770
[PM-1659] – in the python api, add pegasus profile key container.arguments #1773
[PM-1660] – add boolean bypass flag for input files #1774
[PM-1661] – expose bypass parameter for Containers and TransformationSites #1775
[PM-1662] – document behavior in user guide about bypassing file staging #1776
[PM-1665] – expose --randomdir/--randomdir= in client code in the python api #1779
[PM-1666] – get live output from pegasus cli tools when they are called with the client code #1780
[PM-1667] – when client code is called, it should be more obvious what pegasus- is being called #1781
[PM-1668] – fb-nlp workflow generator creates multiple jobs that create same output file #1782
[PM-1669] – remove “pegasus: ” from tc when inlined in a workflow #1783
[PM-1670] – expose a schema validation function that can be used in unit tests #1784
[PM-1682] – add --reuse option to Workflow.plan in python api #1796
[PM-1683] – update section 7.3 supported transfer protocols in data management guide #1797
[PM-1684] – reorg table of contents #1798
[PM-1688] – update pegasus-db-admin as a new ‘trigger’ table has been added to the schema #1802
[PM-1693] – convert test 010-runtime-clustering-CondorIO to use python api #1807
[PM-1694] – convert test 010-runtime-clustering-Non-SharedFS to use python api #1808
[PM-1695] – convert 010-runtime-clustering-SharedFS to use python api #1809
[PM-1696] – convert 010-runtime-clustering-SharedFS Staging and No Kickstart to use python api #1810
[PM-1698] – Release notes for 5.0 release #1812
[PM-1699] – add logo to the documentation pages #1813
[PM-1700] – add SIGINT handler to Client.wait() so that it can exit cleanly #1814
[PM-1701] – planner should get chmod jobs to run locally if compute site has auxiliary.local set #1815
2.10.3. Bugs Fixed
[PM-1150] – Pegasus should verify that required credentials exists before starting a workflow #1264
[PM-1192] – User supplied env setup script for lite #1306
[PM-1199] – Notification naming / meta notifications #1313
[PM-1326] – singularity suffix computed incorrectly #1440
[PM-1327] – bypass input file staging broken for container execution #1441
[PM-1330] – .meta files created even when integrity checking is disabled. #1444
[PM-1332] – monitord is failing on a dagman.out file #1446
[PM-1333] – amqp endpoint errors should not disable database population for multiplexed sinks #1447
[PM-1334] – pegasus dagman is not exiting cleanly #1448
[PM-1336] – pegasus-submitdir is broken #1450
[PM-1346] – Pegasus job checkpointing is incompatible with condorio #1460
[PM-1347] – pegasus will always try and transfer output when a code has checkpointed #1461
[PM-1350] – pegasus is ignoring when_to_transfer_output #1464
[PM-1358] – HTCondor 8.8.0/8.8.1 remaps /tmp, and can break access to x509 credentials #1472
[PM-1360] – planner drops transfer_(in|out)put_files if NoGridStart is used #1474
[PM-1362] – Chinese characters in the file path #1476
[PM-1366] – Pegasus Cluster Label – Job Env Not Picked Up in Containers #1480
[PM-1377] – A + in a tc name breaks pegasus-plan #1491
[PM-1379] – Stage out job fails – wrong src location #1493
[PM-1380] – Support for Singularity Library #1494
[PM-1389] – pegasus.cores causes issues on Summit #1503
[PM-1395] – GLite LSF scripts don’t work as intended on OLCF’s DTNs #1509
[PM-1405] – Is Pegasus supposed to build on 32-bit x86 (Debian i386 Stretch)? #1519
[PM-1409] – Python virtual environments not considered first before system wide installation #1523
[PM-1421] – p-lite generated sh files fail, when used with Docker containers, that make use of USER argument #1535
[PM-1466] – Upgrade Python Package Versions #1580
[PM-1488] – Get condorpool worker machine ipaddr error!!! #1602
[PM-1506] – pegasus-python-wrapper locates executable from PEGASUS_HOME instead of dirname of the exec. #1620
[PM-1521] – Decaf Jobs does not generate valid JSON for Decaf anymore #1635
[PM-1522] – output not being captured in pegasus client code #1636
[PM-1530] – site selector does not map job correctly when using stageable or all mapper with a containerized job #1644
[PM-1531] – pegasus-db-admin fails on macosx on a clean db #1645
[PM-1534] – integrity checking with bypass input staging if checksums are specified in replica catalog #1648
[PM-1595] – hostname is not generated in case of integrity failures #1709
[PM-1619] – pegasus-analyzer output doesn’t match pegasus-status #1733
[PM-1630] – PATH variable in Docker containers is not preserved in some cases #1744
[PM-1671] – ConfigParser still referenced in pegasus-worker CLI scripts #1785
[PM-1675] – planner throws error when using “pegasus.dir.storage.mapper.replica.file” without specifying “pegasus.dir.storage.mapper.replica” #1789
[PM-1676] – hierarchical workflow failing only when integrity checking is on #1790
[PM-1680] – kickstart needs to quote argument vector when outputting it to yaml #1794
[PM-1702] – registration jobs fail out of a 5.0 binary install #1816
[PM-1705] – priorities are not propagated correctly #1819
2.11. Pegasus 5.0.0beta1
Release Date: July 27, 2020
We are happy to announce the beta1 release of Pegasus 5.0. Pegasus 5.0 will be a major release of Pegasus, and this beta version has most of the features and bug fixes that will be in 5.0.
We invite our users to give it a try. Since this is a beta release, it has to be downloaded manually from the website at https://download.pegasus.isi.edu/pegasus/5.0.0beta1/
There you can find the various RPM/DEB packages and the binary tarballs.
If you are an existing user, please carefully follow the upgrade instructions at https://pegasus.isi.edu/docs/5.0.0dev/migration.html#migrating-from-pegasus-4-9-x-to-pegasus-5-0
2.11.1. Highlights of the new Release
Reworked Python API:
The new API has been developed from the ground up. In addition to generating the abstract workflow and all the catalogs, it allows you to plan, submit, monitor, analyze, and generate statistics for your workflow. To use the new Python API, refer to Moving From DAX3 to Pegasus.api: https://pegasus.isi.edu/docs/5.0.0dev/migration.html#moving-from-dax3
Adoption of YAML formats:
With Pegasus 5.0, we are adopting YAML for the representation of all major catalogs. In 5.0, the following are now represented in YAML:
Abstract Workflow
Replica Catalog
Transformation Catalog
Site Catalog
Kickstart Provenance Records
Python3 Support
All Pegasus tools are Python 3 compliant.
The 5.0 release will require Python 3 on the workflow submit node
Python PIP packages for workflow composition and monitoring
Default data configuration
In Pegasus 5.0, the default data configuration has been changed to condorio. Up to and including the 4.9.x releases, the default configuration was sharedfs.
Zero configuration required to submit to local HTCondor pool
Data Management Improvements
New output replica catalog that registers outputs including file metadata such as size and checksums
Ability to bypass staging of files at a per-file, per-executable, and per-container level.
Improved support for hierarchical workflows allows you to create data dependencies between sub workflow jobs and compute jobs
Support for integrity checking of user executables and application containers in addition to data
Revamped Documentation
The documentation has been moved to Read the Docs style documentation written in reStructuredText. It can be found at https://pegasus.isi.edu/docs/5.0.0dev/index.html
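As a minimal sketch of the default data configuration change listed in the highlights above: a workflow that relied on the 4.9.x default can, in principle, pin the earlier behavior by setting the pegasus.data.configuration property explicitly. The helper name render_properties and the output file name pegasus.properties are illustrative assumptions, and the example uses only the Python standard library rather than the Pegasus API. Check the migration guide before relying on it.

```python
from pathlib import Path

# Values assumed to be accepted for the pegasus.data.configuration property.
DATA_CONFIGURATIONS = ("sharedfs", "nonsharedfs", "condorio")


def render_properties(data_configuration: str = "sharedfs") -> str:
    """Return properties-file text that pins the data configuration.

    render_properties is a hypothetical helper written for this sketch.
    """
    if data_configuration not in DATA_CONFIGURATIONS:
        raise ValueError(f"unknown data configuration: {data_configuration!r}")
    return f"pegasus.data.configuration = {data_configuration}\n"


if __name__ == "__main__":
    # Hypothetical file name and location; adjust to wherever the planner
    # is configured to read its properties from.
    Path("pegasus.properties").write_text(render_properties("sharedfs"))
```

Leaving the property unset would give the new 5.0 default (condorio), so a file like this is only needed when the shared-filesystem behavior must be kept.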