This section describes the configuration required for Pegasus to generate an executable workflow that uses glite to submit to a Slurm, PBS, or SGE batch system on a local cluster. We refer to this environment as the local campus cluster, because the workflow submit node (Pegasus + HTCondor) needs to be installed on a login node of the cluster (or on a node where the local batch scheduler commands can be executed).
Glite is the old name for BLAH (also written BLAHP). The BLAH binaries are distributed with HTCondor as the "batch_gahp". For historical reasons we often use the term "glite", and you will see both "glite" and "batch_gahp" references in HTCondor; they all refer to the same component, which is now called BLAH.
This guide covers Slurm, PBS, Moab, and SGE, but glite also works with other PBS-like batch systems, including LSF, Cobalt and others. If you need help configuring Pegasus and HTCondor to work with one of these systems, please contact email@example.com. For the sake of brevity, the text below will say "PBS", but you should read that as "PBS or PBS-like system such as SGE, Moab, LSF, and others".
This setup works because the glite layer communicates with the batch system running on the cluster using qsub or equivalent commands. If you can submit jobs to the local scheduler from the workflow submit host, then the local HTCondor can be used to submit jobs via glite (with some modifications described below). If you need to SSH to a different cluster head node in order to submit jobs to the scheduler, then you can use BOSCO, which is documented in another section of this guide.
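To check which case you are in, you can verify from the workflow submit host that the scheduler's submit command is available. This is an illustrative check, not part of the Pegasus tooling; use the line matching your batch system:

    # If the scheduler's submit command works from the workflow submit
    # host, the glite route described in this section applies.
    which sbatch && sbatch --version   # Slurm
    which qsub                         # PBS or SGE
    which msub                         # Moab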
There is also a way to do remote job submission via glite even if you cannot SSH to the head node. This might be the case, for example, if the head node requires two-factor authentication (e.g. RSA tokens). This approach is called the "Reverse GAHP", and you can find more information on its GitHub page. All it requires is SSH access from the cluster head node back to the workflow submit host.
In either case, you need to modify the HTCondor glite installation that will be used to submit jobs to the local scheduler. To do this, run the pegasus-configure-glite command. This command installs all the scripts required to map Pegasus profiles to batch-system specific job attributes, and adds support for Moab. You may need to run it as root, depending on how you installed HTCondor.
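For example (a sketch; the path shown is illustrative, and condor_config_val reports the actual glite location on your system):

    # Find the directory where HTCondor keeps its glite/BLAH scripts
    condor_config_val GLITE_LOCATION
    # e.g. /usr/libexec/condor/glite

    # Install the Pegasus helper scripts into that directory; use sudo
    # if HTCondor was installed from root-owned system packages
    sudo pegasus-configure-glite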
HTCondor has an issue with the Slurm configuration when running on Ubuntu systems. Since on Ubuntu sh does not link to bash, the Slurm script will fail when trying to run the source command. A quick fix for this issue is to force the script to run under bash: in the bls_set_up_local_and_extra_args function of the blah_common_submit_functions.sh script, which is located in the same folder as the installation above, add bash in front of the $bls_tmp_file 2> /dev/null command line.
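A sketch of the edit (the exact line may differ slightly between HTCondor versions):

    # In bls_set_up_local_and_extra_args of blah_common_submit_functions.sh:
    # before:
    #   $bls_tmp_file 2> /dev/null
    # after (forces the generated submit script to run under bash):
    #   bash $bls_tmp_file 2> /dev/null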
In order to configure a workflow to use glite, you need to create an entry in your site catalog for the cluster and set the following profiles:
the pegasus profile style, with the value set to glite.
the condor profile grid_resource, with the value set to batch slurm, batch pbs, batch sge, or batch moab.
An example site catalog entry for a local glite Slurm site looks like this:

    <sitecatalog xmlns="http://pegasus.isi.edu/schema/sitecatalog"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="http://pegasus.isi.edu/schema/sitecatalog
                                     http://pegasus.isi.edu/schema/sc-4.0.xsd"
                 version="4.0">

        <site handle="local" arch="x86" os="LINUX">
            <directory type="shared-scratch" path="/lfs/local-scratch/glite-sharedfs-example/work">
                <file-server operation="all" url="file:///lfs/local-scratch/glite-sharedfs-example/work"/>
            </directory>
            <directory type="local-storage" path="/lfs/local-scratch/glite-sharedfs-example/outputs">
                <file-server operation="all" url="file:///lfs/local-scratch/glite-sharedfs-example/outputs"/>
            </directory>
        </site>

        <site handle="local-slurm" arch="x86" os="LINUX">
            <!-- this directory is shared amongst all the nodes in the cluster -->
            <directory type="shared-scratch" path="/lfs/glite-sharedfs-example/local-slurm/shared-scratch">
                <file-server operation="all" url="file:///lfs/glite-sharedfs-example/local-slurm/shared-scratch"/>
            </directory>
            <profile namespace="env" key="PEGASUS_HOME">/lfs/software/pegasus</profile>
            <profile namespace="pegasus" key="style">glite</profile>
            <profile namespace="condor" key="grid_resource">batch slurm</profile>
            <profile namespace="pegasus" key="queue">normal</profile>
            <profile namespace="pegasus" key="runtime">30000</profile>
        </site>
    </sitecatalog>
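With this catalog in place, the workflow can be planned against the glite site. A minimal sketch, assuming a 4.x-style DAX file and a properties file named as shown (adjust both to your setup):

    pegasus-plan --conf pegasus.properties \
                 --sites local-slurm \
                 --output-site local \
                 --dir work \
                 --dax workflow.dax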
Starting with Pegasus 4.2.1, the examples directory contains a glite shared filesystem example that you can use to test out this configuration.
You probably don't need to know this, but Pegasus generates a +remote_cerequirements expression for an HTCondor glite job based on the Pegasus profiles associated with the job. This expression is passed to glite and used by the *_local_submit_attributes.sh scripts installed by pegasus-configure-glite to generate the correct batch submit script. An example classad expression in the HTCondor submit file looks like this:
    +remote_cerequirements = JOBNAME=="preprocess_j1" && PASSENV==1 && WALLTIME=="01:00:00" && \
        EXTRA_ARGUMENTS=="-N testjob -l walltime=01:23:45 -l nodes=2" && \
        MYENV=="CONDOR_JOBID=$(cluster).$(process),PEGASUS_DAG_JOB_ID=preprocess_j1,PEGASUS_HOME=/usr,PEGASUS_WF_UUID=aae14bc4-b2d1-4189-89ca-ccd99e30464f"
The job name and environment variables are automatically passed through to the remote job.
The following sections document the mapping of Pegasus profiles to batch system job requirements as implemented by Pegasus, HTCondor, and glite.
The job requirements are constructed based on the following profiles:
Table 7.1. Mapping of Pegasus Profiles to Job Requirements
| Profile Key | Key in +remote_cerequirements | Slurm Parameter | PBS Parameter | SGE Parameter | Moab Parameter | Cobalt Parameter | Description |
|---|---|---|---|---|---|---|---|
| pegasus.cores | CORES | --ntasks cores | n/a | -pe ompi cores | n/a | --proccount cores | Pegasus uses cores to calculate either nodes or ppn. If cores and ppn are specified, then nodes is computed. If cores and nodes are specified, then ppn is computed. If both nodes and ppn are specified, then cores is ignored. The resulting values for nodes and ppn are used to set the job requirements for PBS and Moab. If neither nodes nor ppn is specified, then no requirements are set in the PBS or Moab submit script. For SGE, how the processes are distributed over nodes depends on how the parallel environment has been configured; it is set to 'ompi' by default. |
| pegasus.nodes | NODES | --nodes nodes | -l nodes | n/a | -l nodes | -n nodes | This specifies the number of nodes that the job should use. It is not used for SGE. |
| pegasus.ppn | PROCS | n/a | -l ppn | n/a | -l ppn | --mode c[ppn] | This specifies the number of processors per node that the job should use. It is not used for SGE. |
| pegasus.runtime | WALLTIME | --time walltime | -l walltime | -l h_rt | -l walltime | -t walltime | This specifies the maximum runtime for the job in seconds. It should be an integer value. Pegasus converts it to the "hh:mm:ss" format required by the batch system, rounding up to the next whole minute. |
| pegasus.memory | PER_PROCESS_MEMORY | --mem-per-cpu memory | -l pmem | -l h_vmem | -l pmem | n/a | This specifies the maximum amount of physical memory used by any process in the job. For example, if the job runs four processes and each requires up to 2 GB (gigabytes) of memory, then this value should be set to "2gb" for PBS and Moab, and "2G" for SGE. The corresponding PBS directive would be "#PBS -l pmem=2gb". |
| pegasus.project | PROJECT | --account project_name | -A project_name | n/a | -A project_name | -A project_name | Causes the job time to be charged to, or associated with, a particular project/account. This is not used for SGE. |
| pegasus.queue | QUEUE | --partition queue | -q queue | -q queue | -q queue | -q queue | This specifies the queue for the job. |
| globus.totalmemory | TOTAL_MEMORY | --mem memory | -l mem | n/a | -l mem | n/a | The total memory that your job requires. It is usually better to just specify the pegasus.memory profile. This is not mapped for SGE. |
| pegasus.glite.arguments | EXTRA_ARGUMENTS | prefixed by "#SBATCH" | prefixed by "#PBS" | prefixed by "#$" | prefixed by "#MSUB" | n/a | This specifies extra arguments that must appear in the generated submit script for a job. The value of this profile is added to the submit script, prefixed by the batch-system-specific directive marker. These requirements override any requirements specified using other profiles. This is useful when you want to pass special options through to the underlying batch system. For example, on the USC cluster we use resource properties to specify the network type: to use the Myrinet network you would specify something like "-l nodes=8:ppn=2:myri", and for Infiniband something like "-l nodes=8:ppn=2:IB". In that case, both the nodes and ppn profiles would be effectively ignored. |
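As an illustration of this mapping, a job on the example local-slurm site above (queue normal, runtime 30000), together with a hypothetical pegasus.glite.arguments value of "-C haswell", would yield a Slurm submit script containing directives along these lines. This is a sketch, not verbatim output; the exact script is produced by the *_local_submit_attributes.sh helpers and varies with the HTCondor version:

    #!/bin/bash
    # Sketch of the generated Slurm directives
    #SBATCH --partition=normal     # from the pegasus "queue" profile
    #SBATCH --time=08:20:00        # 30000 seconds converted to hh:mm:ss
    #SBATCH -C haswell             # from pegasus.glite.arguments, prefixed by #SBATCH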
gLite/blahp does not honor the remote_initialdir or initialdir classad directives. Therefore, jobs that have the glite style applied do not have a remote directory specified in the submit script. Instead, Pegasus uses Kickstart to change to the working directory when the job is launched on the remote node.
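For reference, a sketch of what this looks like at launch time. The path and application are illustrative; -w is the pegasus-kickstart option that selects the working directory:

    # Kickstart changes into the job's scratch directory before exec'ing
    # the application, so no initialdir directive is needed in the batch
    # submit script.
    pegasus-kickstart -w /lfs/glite-sharedfs-example/local-slurm/shared-scratch \
        /path/to/application arg1 arg2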