10.3. Credentials Management

Pegasus tries to do data staging from localhost by default, but some data scenarios makes some remote jobs do data staging. An example of such a case is when running in nonsharedfs mode. Depending on the transfer protocols used, the job may have to carry credentials to enable these data transfers. To specify where which credential to use and where Pegasus can find it, use environment variable profiles in your site catalog. The supported credential types are X.509 grid proxies, Amazon AWS S3 keys, Google Cloud Platform OAuth token (.boto file), iRods password and SSH keys.

Credentials are usually associated per site in the site catalog. Users can associate the credentials either as a Pegasus profile or an environment profile with the site.

  1. A pegasus profile with the value pointing to the path to the credential on the local site or the submit host. If a pegasus credential profile associated with the site, then Pegasus automatically transfers it along with the remote jobs.

  2. A env profile with the value pointing to the path to the credential on the remote site. If an env profile is specified, then no credential is transferred along with the job. Instead the job's environment is set to ensure that the job picks up the path to the credential on the remote site.


Specifying credentials as Pegasus profiles was introduced in 4.4.0 release.

In case of data transfer jobs, it is possible to associate different credentials for a single file transfer ( one for the source server and the other for the destination server) . For example, when leveraging GridFTP transfers between two sides that accept different grid credentials such as XSEDE Stampede site and NCSA Bluewaters. In that case, Pegasus picks up the associated credentials from the site catalog entries for the source and the destination sites associated with the transfer.

10.3.1. X.509 Grid Proxies

If the grid proxy is required by transfer jobs, and the proxy is in the standard location, Pegasus will pick the proxy up automatically. For non-standard proxy locations, you can use the X509_USER_PROXY environment variable. Site catalog example:

<profile namespace="pegasus" key="X509_USER_PROXY" >/some/location/x509up</profile>

10.3.2. Amazon AWS S3

If a workflow is using s3 URLs, Pegasus has to be told where to find the .s3cfg file. This format of the file is described in the pegaus-s3 command line client's man page. For the file to be picked up by the workflow, set the S3CFG profile to the location of the file. Site catalog example:

<profile namespace="pegasus" pegasus="S3CFG" >/home/user/.s3cfg</profile>

10.3.3. Google Storage

If a workflow is using gs:// URLs, Pegasus needs access to a Google Storage service account. First generate the credential by following the instructions at:


Download the credential in PKCS12 format, and then use "gsutil config -e" to generate a .boto file. For example:

$ gsutil config -e
This command will create a boto config file at /home/username/.boto
containing your credentials, based on your responses to the following
What is your service account email address? some-identifier@developer.gserviceaccount.com
What is the full path to your private key file? /home/username/my-cred.p12
What is the password for your service key file [if you haven't set one
explicitly, leave this line blank]? 

Please navigate your browser to https://cloud.google.com/console#/project,
then find the project you will use, and copy the Project ID string from the
second column. Older projects do not have Project ID strings. For such projects,
click the project and then copy the Project Number listed under that project.

What is your project-id? your-project-id

Boto config file "/home/username/.boto" created. If you need to use a
proxy to access the Internet please see the instructions in that file.

Pegasus has to be told where to find both the .boto file as well as the PKCS12 file. For the files to be picked up by the workflow, set the BOTO_CONFIG and GOOGLE_PKCS12 profiles for the storage site. Site catalog example:

<profile namespace="pegasus" key="BOTO_CONFIG" >/home/user/.boto</profile>
<profile namespace="pegasus" key="GOOGLE_PKCS12" >/home/user/.google-service-account.p12</profile>

10.3.4. iRods Password

If a workflow is using iRods URLs, Pegasus has to be given an irodsEnv file. It is a standard file, with the addtion of an password attribute. Example when using iRods 3.X:

# iRODS personal configuration file.
# iRODS server host name:
irodsHost 'some.host.edu'
# iRODS server port number:
irodsPort 1259

# Account name:
irodsUserName 'someuser'
# Zone:
irodsZone 'somezone' 

# this is used with Pegasus
irodsPassword 'somesecretpassword'

iRods 4.0 switched to a JSON based configuration file. Pegasus can handle either config file. JSON Example:

    "irods_host": "some.host.edu",
    "irods_port": 1247,
    "irods_user_name": "someuser",
    "irods_zone_name": "somezone",
    "irodsPassword" : "somesecretpassword"

The location of the file can be given to the workflow using the irodsEnvFile environment profile. Site catalog example:

<profile namespace="pegasus" key="irodsEnvFile" >/home/user/.irods/.irodsEnv</profile>

10.3.5. SSH Keys

New in Pegasus 4.0 is the support for data staging with scp using ssh public/private key authentication. In this mode, Pegasus transports a private key with the jobs. The storage machines will have to have the public part of the key listed in ~/.ssh/authorized_keys.


SSH keys should be handled in a secure manner. In order to keep your personal ssh keys secure, It is recommended that a special set of keys are created for use with the workflow. Note that Pegasus will not pick up ssh keys automatically. The user will have to specify which key to use with SSH_PRIVATE_KEY.

The location of the ssh private key can be specified with the SSH_PRIVATE_KEY environment profile. Site catalog example:

<profile namespace="pegasus" key="SSH_PRIVATE_KEY" >/home/user/wf/wfsshkey</profile>