7.9. SDSC Comet with BOSCO glideins

BOSCO documentation: https://twiki.opensciencegrid.org/bin/view/CampusGrids/BoSCO

BOSCO is a part of the HTCondor system that lets you set up a personal pool of resources brought in from a remote cluster. This section describes how to use BOSCO to run glideins (pilot jobs) dynamically on the SDSC Comet cluster. Glideins are submitted according to the demand from the user jobs in the pool.

As your regular user, on the host you want to use as a workflow submit host, download the latest version of HTCondor from the HTCondor Download page. At the time of writing, the latest version was 8.5.2, and we downloaded condor-8.5.2-x86_64_RedHat6-stripped.tar.gz. Untar it and run the installer:

$ tar xzf condor-8.5.2-x86_64_RedHat6-stripped.tar.gz
$ cd condor-8.5.2-x86_64_RedHat6-stripped
$ ./bosco_install
Created a script you can source to setup your Condor environment
variables. This command must be run each time you log in or may
be placed in your login scripts:
   source /home/$USER/bosco/bosco_setenv

Source the setup file as instructed, run bosco_start, and then test that condor_q and condor_status work:

$ source /home/$USER/bosco/bosco_setenv
$ bosco_start
$ condor_q

-- Schedd: workflow.iu.xsede.org :

0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
$ condor_status
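
As a quick sanity check after sourcing bosco_setenv, the sketch below confirms that a set of commands can be found on your PATH. The function name require_tools is a hypothetical helper introduced here for illustration; it is not part of BOSCO or HTCondor.

```shell
# Hypothetical helper (not part of BOSCO or HTCondor): reports every named
# command that cannot be found on PATH and returns non-zero if any is missing.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, running require_tools bosco_start bosco_cluster condor_q condor_status should succeed once the environment has been set up.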

Let's tell BOSCO about our SDSC Comet account:

$ bosco_cluster -a YOUR_SDSC_USERNAME@comet-ln2.sdsc.edu pbs

BOSCO needs a little more information to be able to submit glideins to Comet. Log in to your Comet account via ssh (important: this step has to take place on Comet) and create the file ~/bosco/glite/bin/pbs_local_submit_attributes.sh with the following content. Replace YOUR_COMET_ALLOCATION with your allocation, which you can find by running show_accounts and looking at the project column.

echo "#PBS -q compute"
echo "#PBS -l nodes=1:ppn=24"
echo "#PBS -l walltime=24:00:00"
echo "#PBS -A YOUR_COMET_ALLOCATION"

Also make the file executable:

$ chmod 755 ~/bosco/glite/bin/pbs_local_submit_attributes.sh
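
The file creation and chmod steps above can also be scripted. The following is a minimal sketch meant to be run on Comet. The function name write_submit_attributes and its arguments are hypothetical, and the optional allocation argument assumes that the standard PBS -A account directive is how the allocation should be passed.

```shell
# Hypothetical helper: writes pbs_local_submit_attributes.sh under the given
# BOSCO directory (default: ~/bosco) and makes it executable.
# Usage: write_submit_attributes [BOSCO_DIR] [ALLOCATION]
write_submit_attributes() {
    bosco_dir="${1:-$HOME/bosco}"
    allocation="${2:-}"
    target="$bosco_dir/glite/bin/pbs_local_submit_attributes.sh"
    mkdir -p "$(dirname "$target")" || return 1
    # Quoted heredoc delimiter: the lines are written verbatim.
    cat > "$target" <<'EOF'
echo "#PBS -q compute"
echo "#PBS -l nodes=1:ppn=24"
echo "#PBS -l walltime=24:00:00"
EOF
    # Assumption: the allocation is passed with the PBS -A directive.
    if [ -n "$allocation" ]; then
        printf 'echo "#PBS -A %s"\n' "$allocation" >> "$target"
    fi
    chmod 755 "$target"
}
```

Calling write_submit_attributes with no arguments writes the three directives shown above into ~/bosco/glite/bin.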

Log out of Comet and return to the host and user account under which BOSCO was installed. A few files on that host also need to be edited. In some versions of HTCondor, ~/bosco/libexec/campus_factory/share/glidein_jobs/glidein_wrapper.sh has a bug. Open the file and make sure the eval line near the beginning comes after the unset/export HOME section. If it does not, edit the file so that it looks like this:


starting_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

# BLAHP does weird things with home directory
unset HOME
export HOME

eval campus_factory_dir=$_campusfactory_CAMPUSFACTORY_LOCATION
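
Whether a given glidein_wrapper.sh already has the statements in this order can be checked without opening the file. The sketch below is one way to do it; check_wrapper_order is a hypothetical helper name, and the patterns assume the two statements start their lines as in the listing above.

```shell
# Hypothetical helper: succeeds when the first "unset HOME" line comes before
# the first "eval campus_factory_dir=" line in the given file.
# Returns 2 if either statement is missing.
check_wrapper_order() {
    wrapper="$1"
    unset_line=$(grep -n '^[[:space:]]*unset HOME' "$wrapper" | head -n 1 | cut -d: -f1)
    eval_line=$(grep -n '^[[:space:]]*eval campus_factory_dir=' "$wrapper" | head -n 1 | cut -d: -f1)
    # Both statements must be present for the comparison to make sense.
    if [ -z "$unset_line" ] || [ -z "$eval_line" ]; then
        return 2
    fi
    [ "$unset_line" -lt "$eval_line" ]
}
```

A non-zero status from check_wrapper_order ~/bosco/libexec/campus_factory/share/glidein_jobs/glidein_wrapper.sh means the file needs the manual edit.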

If the HOME and eval statements appear in the reverse order in your file, swap them so that they match the above. Then, at the end of ~/bosco/libexec/campus_factory/share/glidein_jobs/glidein_condor_config, add:

# dynamic slots
SLOT_TYPE_1 = cpus=100%,disk=100%,swap=100%
SLOT_TYPE_1_PARTITIONABLE = TRUE
NUM_SLOTS_TYPE_1 = 1

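In the file ~/bosco/libexec/campus_factory/share/glidein_jobs/job.submit.template, find the line reading:

         _condor_NUM_CPUS=1; \

Change the value to match the number of cores requested per node (ppn=24 above), so that the line reads:

         _condor_NUM_CPUS=24; \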
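
Editing that line can be scripted. This sketch assumes the goal is for each glidein to advertise the 24 cores requested per node (ppn=24 above). The function name set_glidein_cpus is hypothetical, and the sed call keeps a .bak copy of the template.

```shell
# Hypothetical helper: rewrites _condor_NUM_CPUS=<n> in the given template to
# the requested core count, keeping the original as <template>.bak.
# Usage: set_glidein_cpus TEMPLATE CPUS
set_glidein_cpus() {
    template="$1"
    cpus="$2"
    sed -i.bak "s/_condor_NUM_CPUS=[0-9][0-9]*/_condor_NUM_CPUS=$cpus/" "$template"
}
```

For the setup in this section, the call would be set_glidein_cpus ~/bosco/libexec/campus_factory/share/glidein_jobs/job.submit.template 24.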

You should now have a functioning BOSCO setup. Submit a Pegasus workflow.