Chapter 5. Advanced Pegasus Tutorial Using A Virtual Machine

The goal of this chapter is to start a complete virtual cluster inside FutureGrid, run a more challenging Pegasus workflow in this cluster, and tear the cluster down again. This showcases more of the functionality of Pegasus. At the time of this writing, the simplest method to start the virtual cluster is the Nimbus context broker, so the advanced tutorial is only available on the Nimbus platform.

5.1. Start Your 1-Button Virtual Cluster

Unlike the previous chapters, this chapter uses a fully-configured Condor cluster that runs entirely within FutureGrid. We are using a special feature of the Nimbus cloud client to bring up the cluster for us. Hence, this chapter does not work with Eucalyptus!

The file below shows the setup for the FutureGrid Xen-based Nimbus systems. Eligible systems include hotel, sierra and foxtrot. Alamo uses KVM, and we tend to name our images accordingly, so it needs a slightly different file. If you need it, a KVM version with slightly modified image names is available on alamo.

<?xml version="1.0" encoding="UTF-8"?>
<cluster xmlns="http://www.globus.org/2008/06/workspace/metadata/logistics">
 <workspace>
   <name>Pegasus Submit Host</name>
   <image>pegasus-tutorial.x64.gz</image>
   <quantity>1</quantity>
   <nic wantlogin="true">public</nic>
   <ctx>
     <provides>
        <identity />
        <role>condormaster</role>
     </provides>
     <requires>
        <identity />
        <role name="condorworker" hostname="true" pubkey="false" />
     </requires>
   </ctx>
 </workspace>

 <workspace>
   <name>Pegasus Worker Node</name>
   <image>pegasus-worker.x64.gz</image>
   <quantity>2</quantity>
   <nic wantlogin="true">public</nic>
   <ctx>
     <provides>
        <identity />
        <role>condorworker</role>
     </provides>
     <requires>
        <identity />
        <role name="condormaster" hostname="true" pubkey="false" />
     </requires>
   </ctx>
 </workspace>
</cluster>

The XML snippet above is the file pegasus-tutorial.xml, which you can copy into your samples directory. It describes two different kinds of VMs, each in its own workspace element. The first workspace describes the Pegasus tutorial VM that we have worked with up to this point. We start only a single instance, because we need only one Condor master and submit host. However, since we are going to extend its Condor pool, we add a bit of the contextualization that Nimbus offers.

The second workspace describes the Condor pool worker nodes. These are slightly simpler VMs, still with a full Pegasus and Condor installation. The quantity element determines how many of these we start. Please do not go overboard, because the number of Nimbus hosts is limited. The contextualization configures the worker VMs to start only the subset of Condor daemons needed to run jobs, and to report back to the submit host VM.
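Before submitting a cluster description, it can help to parse it and list what it asks for. The following Python sketch is not part of the Nimbus tooling. The function name summarize_cluster and the shortened sample document are assumptions made for illustration. It reads each workspace element, totals the requested VMs, and reports any role that a workspace requires but no workspace provides.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the <cluster> element of pegasus-tutorial.xml.
NS = {"c": "http://www.globus.org/2008/06/workspace/metadata/logistics"}


def summarize_cluster(xml_text):
    """Return (workspaces, total_vms, missing_roles) for a cluster description."""
    root = ET.fromstring(xml_text.strip())
    workspaces = []
    provided, required = set(), set()
    for ws in root.findall("c:workspace", NS):
        name = ws.findtext("c:name", namespaces=NS)
        image = ws.findtext("c:image", namespaces=NS)
        quantity = int(ws.findtext("c:quantity", namespaces=NS))
        workspaces.append((name, image, quantity))
        # <provides><role>NAME</role></provides> announces a role.
        for role in ws.findall("c:ctx/c:provides/c:role", NS):
            provided.add(role.text.strip())
        # <requires><role name="NAME" .../></requires> asks for a role.
        for role in ws.findall("c:ctx/c:requires/c:role", NS):
            required.add(role.get("name"))
    total_vms = sum(quantity for _, _, quantity in workspaces)
    return workspaces, total_vms, sorted(required - provided)


# Shortened copy of the file above (nic and identity elements omitted).
SAMPLE_CLUSTER = """
<cluster xmlns="http://www.globus.org/2008/06/workspace/metadata/logistics">
 <workspace>
   <name>Pegasus Submit Host</name>
   <image>pegasus-tutorial.x64.gz</image>
   <quantity>1</quantity>
   <ctx>
     <provides><role>condormaster</role></provides>
     <requires><role name="condorworker" hostname="true" pubkey="false" /></requires>
   </ctx>
 </workspace>
 <workspace>
   <name>Pegasus Worker Node</name>
   <image>pegasus-worker.x64.gz</image>
   <quantity>2</quantity>
   <ctx>
     <provides><role>condorworker</role></provides>
     <requires><role name="condormaster" hostname="true" pubkey="false" /></requires>
   </ctx>
 </workspace>
</cluster>
"""

workspaces, total_vms, missing_roles = summarize_cluster(SAMPLE_CLUSTER)
for name, image, quantity in workspaces:
    print(f"{name}: {quantity} x {image}")
print("total VMs:", total_vms, "unmatched roles:", missing_roles)
```

For the real file, read its contents and pass the text to summarize_cluster. A non-empty list of unmatched roles suggests a typo in one of the role names.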

Additionally, we need to tell the cloud client that we want to run a cluster of VMs instead of a single VM. To do this, we keep --run and pass the --cluster option with the cluster description file instead of selecting a single image:

$ bin/cloud-client.sh --conf conf/sierra.conf --run --hours 8 --cluster pegasus-tutorial.xml
...

SSH known_hosts contained tilde:
  - '~/.ssh/known_hosts' --> '/your/home/.ssh/known_hosts'

Requesting cluster.
  - Pegasus Submit Host: image 'pegasus-tutorial.x64.gz', 1 instance
  - Pegasus Worker Node: image 'pegasus-worker.x64.gz', 2 instances 

Context Broker:
    https://s83r.idp.sdsc.futuregrid.org:8443/wsrf/services/NimbusContextBroker

Created new context with broker.

Workspace Factory Service:
    https://s83r.idp.sdsc.futuregrid.org:8443/wsrf/services/WorkspaceFactoryService                                                                             

Creating workspace "Pegasus Submit Host"... done.
  - 198.202.120.101 [ vm-7.sdsc.futuregrid.org ] 

Creating group "Pegasus Worker Node"... done.
  - 198.202.120.102 [ vm-8.sdsc.futuregrid.org ]
  - 198.202.120.103 [ vm-9.sdsc.futuregrid.org ]

Launching cluster-055... done.

Waiting for launch updates.
  - cluster-055: all members are Running
  - wrote reports to '/path/to/Nimbus/history/cluster-055/reports-vm'                                                                          

Waiting for context broker updates.
  - cluster-055: contextualized    
  - wrote ctx summary to '/path/to/Nimbus/history/cluster-055/reports-ctx/CTX-OK.txt'                                                          
  - wrote reports to '/path/to/Nimbus/history/cluster-055/reports-ctx'                                                                         

SSH trusts new key for vm-7.sdsc.futuregrid.org  [[ Pegasus Submit Host ]]

SSH trusts new key for vm-9.sdsc.futuregrid.org  [[ Pegasus Worker Node #0 ]]

SSH trusts new key for vm-8.sdsc.futuregrid.org  [[ Pegasus Worker Node #1 ]]

The execution will take a few minutes. Once the cloud-client returns, you can log into the head-node of your virtual cluster. This is the "Pegasus Submit Host" in the example above. While it is possible to log into the worker nodes, there is no need to do so, except for debugging purposes.
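If you save the cloud-client output, the host name of the submit host can be picked out of the "SSH trusts new key" lines. The sketch below assumes the output has exactly the shape shown above. The function name hosts_by_label is hypothetical and not part of the cloud client.

```python
import re

# Matches lines such as:
#   SSH trusts new key for vm-7.sdsc.futuregrid.org  [[ Pegasus Submit Host ]]
TRUST_LINE = re.compile(r"^SSH trusts new key for (\S+)\s+\[\[ (.+?) \]\]\s*$")


def hosts_by_label(launch_output):
    """Map each workspace label in the launch output to its host name."""
    hosts = {}
    for line in launch_output.splitlines():
        match = TRUST_LINE.match(line)
        if match:
            host, label = match.groups()
            hosts[label] = host
    return hosts


SAMPLE_LAUNCH = """\
SSH trusts new key for vm-7.sdsc.futuregrid.org  [[ Pegasus Submit Host ]]

SSH trusts new key for vm-9.sdsc.futuregrid.org  [[ Pegasus Worker Node #0 ]]

SSH trusts new key for vm-8.sdsc.futuregrid.org  [[ Pegasus Worker Node #1 ]]
"""

print(hosts_by_label(SAMPLE_LAUNCH)["Pegasus Submit Host"])
```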

A future version of this tutorial will attempt to use private address worker nodes, because public IPv4 addresses are a precious resource.

Once logged into the tutorial VM, you should check that the worker nodes indeed report to the Condor master by checking the Condor resources available:

$ ssh tutorial@vm-7.sdsc.futuregrid.org
/usr/bin/xauth:  creating new authority file /home/tutorial/.Xauthority

tutorial@vm-7.sdsc.futuregrid.org:~ $ condor_status

Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@vm-7.sdsc.fu LINUX      X86_64 Unclaimed Idle     0.000  2048  0+00:34:42
slot2@vm-7.sdsc.fu LINUX      X86_64 Unclaimed Idle     0.000  2048  0+00:35:05
slot1@vm-8.sdsc.fu LINUX      X86_64 Unclaimed Idle     0.000  2048  0+00:24:45
slot2@vm-8.sdsc.fu LINUX      X86_64 Unclaimed Idle     0.000  2048  0+00:26:06
slot1@vm-9.sdsc.fu LINUX      X86_64 Unclaimed Idle     0.000  2048  0+00:28:38
slot2@vm-9.sdsc.fu LINUX      X86_64 Unclaimed Idle     0.000  2048  0+00:28:58
                     Total Owner Claimed Unclaimed Matched Preempting Backfill

        X86_64/LINUX     6     0       0         6       0          0        0

               Total     6     0       0         6       0          0        0

If you do not see the other resources reporting to the master, that is, you see errors or only the two slots of the submit host, something is wrong. Wait a few more minutes, and if they still do not report, please contact your tutorial instructor for help.
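The same check can be scripted. The following sketch counts slot lines per host in saved condor_status output and compares the result with what this example expects: three hosts with two slots each. It assumes the default listing format shown above, including the truncated host names. The helper names are hypothetical.

```python
import re
from collections import Counter

# Matches the start of a slot line such as "slot1@vm-7.sdsc.fu LINUX ...".
SLOT_LINE = re.compile(r"^slot\d+@(\S+)\s")


def slots_per_host(status_text):
    """Count slot lines per (possibly truncated) host name."""
    counts = Counter()
    for line in status_text.splitlines():
        match = SLOT_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def pool_is_complete(status_text, expected_hosts, slots_each):
    """True if exactly expected_hosts hosts report slots_each slots each."""
    counts = slots_per_host(status_text)
    return len(counts) == expected_hosts and all(
        n == slots_each for n in counts.values()
    )


SAMPLE_STATUS = """\
Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@vm-7.sdsc.fu LINUX      X86_64 Unclaimed Idle     0.000  2048  0+00:34:42
slot2@vm-7.sdsc.fu LINUX      X86_64 Unclaimed Idle     0.000  2048  0+00:35:05
slot1@vm-8.sdsc.fu LINUX      X86_64 Unclaimed Idle     0.000  2048  0+00:24:45
slot2@vm-8.sdsc.fu LINUX      X86_64 Unclaimed Idle     0.000  2048  0+00:26:06
slot1@vm-9.sdsc.fu LINUX      X86_64 Unclaimed Idle     0.000  2048  0+00:28:38
slot2@vm-9.sdsc.fu LINUX      X86_64 Unclaimed Idle     0.000  2048  0+00:28:58
"""

print(dict(slots_per_host(SAMPLE_STATUS)))
print(pool_is_complete(SAMPLE_STATUS, expected_hosts=3, slots_each=2))
```

If you change the quantity of worker nodes in the cluster description, adjust expected_hosts accordingly.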

5.2. Planning and Executing Workflow against a Remote Resource

In this exercise we are going to run pegasus-plan to generate an executable workflow from the abstract workflow in the file montage.dax. The executable workflow contains Condor submit files, which are geared towards remote execution. In this tutorial, we are tailoring the remote execution to worker nodes that reside in the cloud. These worker nodes report back to the submit host, and use Condor-I/O to stage data.

The instructors have provided a DAX file montage.dax in the $HOME/pegasus-wms/dax/ directory. You will need to write some things yourself, by following the instructions.

Let us run pegasus-plan on the montage DAX for the futuregrid site. If multiple sites are available, you could provide additional sites as a comma-separated list. However, in the cloud mode, even different clouds with different infrastructures may all report to the same submit host, and can all be handled by a single entry in the site catalog.

$ pegasus-version
4.0.0
$ cd $HOME/pegasus-wms
$ pegasus-plan -Dpegasus.data.configuration=condorio --dir $PWD/dags \
               --sites futuregrid --output local --force \
               --nocleanup --dax $PWD/dax/montage.dax -v

The above command says that we need to plan the montage DAX on the futuregrid site. The futuregrid site consists of all the worker node VMs that are reporting back to our submit host VM. The jobs for this workflow will be submitted to the local Condor on the submit host, which will in turn spread them across idle cloud resources.

The output data needs to be transferred back to the local host. The Condor submit files are to be generated in a directory structure whose base is dags. We are also requesting that no cleanup jobs be added, as we require the intermediate data on the remote host. Here is the output of pegasus-plan:

2012.03.13 16:44:49.234 EDT: [INFO] event.pegasus.parse.dax dax.id /home/tutorial/pegasus-wms/dax/montage.dax  - STARTED                                                                      
2012.03.13 16:44:49.417 EDT: [INFO]  Generating Stampede Events for Abstract Workflow                                                                                                         
2012.03.13 16:44:49.472 EDT: [INFO]  Generating Stampede Events for Abstract Workflow -DONE                                                                                                   
2012.03.13 16:44:49.474 EDT: [INFO] event.pegasus.refinement dax.id montage_0  - STARTED                                                                                                      
2012.03.13 16:44:49.494 EDT: [INFO] event.pegasus.siteselection dax.id montage_0  - STARTED
2012.03.13 16:44:49.533 EDT: [INFO] event.pegasus.siteselection dax.id montage_0  - FINISHED
2012.03.13 16:44:49.611 EDT: [INFO]  Grafting transfer nodes in the workflow
2012.03.13 16:44:49.611 EDT: [INFO] event.pegasus.generate.transfer-nodes dax.id montage_0  - STARTED
2012.03.13 16:44:49.699 EDT: [INFO] event.pegasus.generate.transfer-nodes dax.id montage_0  - FINISHED
2012.03.13 16:44:49.703 EDT: [INFO] event.pegasus.generate.workdir-nodes dax.id montage_0  - STARTED
2012.03.13 16:44:49.707 EDT: [INFO] event.pegasus.generate.workdir-nodes dax.id montage_0  - FINISHED
2012.03.13 16:44:49.708 EDT: [INFO] event.pegasus.refinement dax.id montage_0  - FINISHED
2012.03.13 16:44:49.748 EDT: [INFO]  Generating codes for the concrete workflow
2012.03.13 16:44:49.989 EDT: [INFO]  Generating codes for the concrete workflow -DONE
2012.03.13 16:44:49.990 EDT:


I have concretized your abstract workflow. The workflow has been entered
into the workflow database with a state of "planned". The next step is
to start or execute your workflow. The invocation required is


pegasus-run  /home/tutorial/pegasus-wms/dags/tutorial/pegasus/montage/run0001


2012.03.13 16:44:49.990 EDT:   Time taken to execute is 1.233 seconds
2012.03.13 16:44:49.990 EDT: [INFO] event.pegasus.parse.dax dax.id /home/tutorial/pegasus-wms/dax/montage.dax  - FINISHED

If you get any errors above while running pegasus-plan you can add -vvvvv to enable maximum verbosity. That should help you pin-point the error, or at least provide us with better information when you file a bug report.
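When scripting the planning step, the run directory can be extracted from the planner output instead of being copied by hand. This is a minimal sketch that assumes the planner prints the pegasus-run invocation on a line of its own, as in the output above. The function name run_directory is hypothetical.

```python
import re

# A line that consists only of "pegasus-run" followed by one path.
RUN_LINE = re.compile(r"^[ \t]*pegasus-run[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def run_directory(planner_output):
    """Return the directory passed to pegasus-run, or None if not found."""
    match = RUN_LINE.search(planner_output)
    return match.group(1) if match else None


SAMPLE_PLAN = """\
to start or execute your workflow. The invocation required is


pegasus-run  /home/tutorial/pegasus-wms/dags/tutorial/pegasus/montage/run0001


2012.03.13 16:44:49.990 EDT:   Time taken to execute is 1.233 seconds
"""

print(run_directory(SAMPLE_PLAN))
```

A return value of None indicates that planning did not reach the final message, which is a cue to rerun with higher verbosity.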

The next step is to run the workflow, as shown in the planner's output.

$ pegasus-run  /home/tutorial/pegasus-wms/dags/tutorial/pegasus/montage/run0001

-----------------------------------------------------------------------
File for submitting this DAG to Condor           : montage-0.dag.condor.sub
Log of DAGMan debugging messages                 : montage-0.dag.dagman.out
Log of Condor library output                     : montage-0.dag.lib.out
Log of Condor library error messages             : montage-0.dag.lib.err
Log of the life of condor_dagman itself          : montage-0.dag.dagman.log

Submitting job(s).
1 job(s) submitted to cluster 1.
-----------------------------------------------------------------------

Your Workflow has been started and runs in base directory given below

cd /home/tutorial/pegasus-wms/dags/tutorial/pegasus/montage/run0001

*** To monitor the workflow you can run ***

pegasus-status -l /home/tutorial/pegasus-wms/dags/tutorial/pegasus/montage/run0001

*** To remove your workflow run ***
pegasus-remove /home/tutorial/pegasus-wms/dags/tutorial/pegasus/montage/run0001

The above command submits the workflow to Condor DAGMan, which releases the jobs to the local Condor pool on the submit host. The workflow submission also starts a monitoring daemon, pegasus-monitord, that parses the Condor log files to update the status of the jobs and push it into a workflow database.

Monitor the workflow using the commands provided in the output above, and as explained earlier.

The workflow generates a JPEG output file that resides in the directory /home/tutorial/local-storage/storage/ if it runs successfully. Look for a file with suffix *.jpg.
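A small helper can confirm that the output file has arrived. The sketch below lists the *.jpg files in a directory. The function name find_jpegs is hypothetical, and the path in the usage line is the storage directory named above.

```python
from pathlib import Path


def find_jpegs(storage_dir):
    """Return the sorted names of all *.jpg files directly in storage_dir."""
    return sorted(path.name for path in Path(storage_dir).glob("*.jpg"))


# On the submit host this prints an empty list until the workflow has finished.
print(find_jpegs("/home/tutorial/local-storage/storage"))
```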

The workflow will take some time to execute on the virtual cluster. On the instructor's FutureGrid setup with two worker-node VMs that have two CPU slots each, it took about 5 minutes to run.

If you point your web browser to the public IP address of your submit host, and look into its storage sub-directory, you will find a single JPEG there. That is the output of this workflow. You can find the name of your submit host in your prompt. For instance, if my prompt shows that I am user tutorial on vm-7.sdsc.futuregrid.org, I point my browser to http://vm-7.sdsc.futuregrid.org/storage/ to look for the image.

Figure 5.1. Output from the Montage workflow (M17, j-band).

The figure above is an example output for the workflow.

5.3. Optimizing A Workflow By Clustering Small Jobs (offline)

Sometimes a workflow contains many jobs that each run for only a few seconds. In such instances the overhead of scheduling each job on a grid is too large compared to the useful work. Even for cloud computing, the communication overhead will still take the majority of the run-time. The runtime of the entire workflow can be reduced by using Pegasus clustering techniques. One such technique is to cluster jobs horizontally, that is, to combine jobs on the same level into one or more sequential jobs.

$ cd $HOME/pegasus-wms
$ pegasus-plan -Dpegasus.data.configuration=condorio --dir $PWD/dags \
               --sites futuregrid --output local --nocleanup --force \
               --cluster horizontal --dax $PWD/dax/montage.dax -v

After clustering, the executable workflow will contain 26 jobs, compared to 44 in the non-clustered mode.
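To illustrate the idea behind horizontal clustering, here is a simplified Python sketch. It is not the Pegasus implementation. It assumes that jobs are grouped when they share a level and a transformation name, and that each group is cut into batches of a fixed size. The job list and the cluster size are made up for the example.

```python
from collections import defaultdict


def cluster_horizontally(jobs, cluster_size):
    """Group jobs that share a level and a transformation into batches.

    jobs is a list of (job_id, level, transformation) tuples. Each returned
    batch is a list of job ids that would run sequentially as one job.
    """
    groups = defaultdict(list)
    for job_id, level, transformation in jobs:
        groups[(level, transformation)].append(job_id)
    batches = []
    for key in sorted(groups):
        members = groups[key]
        # Cut each group into consecutive chunks of at most cluster_size jobs.
        for start in range(0, len(members), cluster_size):
            batches.append(members[start:start + cluster_size])
    return batches


# Illustrative workflow: eight short jobs on three levels.
SAMPLE_JOBS = [
    ("p1", 1, "mProject"), ("p2", 1, "mProject"),
    ("p3", 1, "mProject"), ("p4", 1, "mProject"),
    ("d1", 2, "mDiff"), ("d2", 2, "mDiff"), ("d3", 2, "mDiff"),
    ("a1", 3, "mAdd"),
]

for batch in cluster_horizontally(SAMPLE_JOBS, cluster_size=2):
    print(batch)
```

With a cluster size of 2, the eight jobs collapse into five batches, so the scheduler has fewer jobs to place while the same work is done.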

5.4. Terminate Your 1-Button Virtual Cluster

Since you created the virtual cluster with a single command, it is fairly simple to bring down everything with a single command, too. As usual, if you are still logged into the remote resource, you should rescue all your generated data to a permanent disk outside the VM, and log out from any VMs in the cluster you are about to terminate:

tutorial@vm-7.sdsc.futuregrid.org:~ $ logout
Connection to 198.202.120.101 closed.

This will simply close the secure-shell (ssh) connection to the remote VM instance. At this point, you are back on the system from which you logged into the remote VM.

To terminate the cluster, you use your installed cloud-client. You must use the same cloud client on the same machine that you used to start the cluster, because the client keeps some local state on that machine. To shut down the cluster, you need its handle. Thus, you will often need to query Nimbus for this handle:

$ bin/cloud-client.sh --conf conf/sierra.conf --status
...
Querying for ALL instances.

[*] - Workspace #84. 198.202.120.101 [ vm-7.sdsc.futuregrid.org ]
      State: Running                                             
      Duration: 480 minutes.                                     
      Start time: Tue Mar 13 10:45:50 PDT 2012
      Shutdown time: Tue Mar 13 18:45:50 PDT 2012
      Termination time: Tue Mar 13 18:47:50 PDT 2012
      *Handle: cluster-055
       Images: pegasus-worker.x64.gz, pegasus-tutorial.x64.gz

[*] - Workspace #85. 198.202.120.102 [ vm-8.sdsc.futuregrid.org ]
      State: Running
      Duration: 480 minutes.
      Start time: Tue Mar 13 10:45:50 PDT 2012
      Shutdown time: Tue Mar 13 18:45:50 PDT 2012
      Termination time: Tue Mar 13 18:47:50 PDT 2012
      *Handle: cluster-055
       Images: pegasus-worker.x64.gz, pegasus-tutorial.x64.gz

[*] - Workspace #86. 198.202.120.103 [ vm-9.sdsc.futuregrid.org ]
      State: Running
      Duration: 480 minutes.
      Start time: Tue Mar 13 10:45:50 PDT 2012
      Shutdown time: Tue Mar 13 18:45:50 PDT 2012
      Termination time: Tue Mar 13 18:47:50 PDT 2012
      *Handle: cluster-055
       Images: pegasus-worker.x64.gz, pegasus-tutorial.x64.gz

Unlike previous cases, the output will show every VM you have running on the remote site. Additionally, you will notice that the handle starts with the prefix cluster instead of vm. All VMs that belong to the same cluster have the same handle. If you use the handle above, cluster-055 in this example, it will terminate all three VMs.
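To see at a glance which VMs a handle covers, the status listing can be grouped by handle. The sketch below assumes the --status output has the layout shown above, with a "Workspace #..." line followed later by a "*Handle:" line. The function name hosts_by_handle is hypothetical.

```python
import re
from collections import defaultdict

# "[*] - Workspace #84. 198.202.120.101 [ vm-7.sdsc.futuregrid.org ]"
WORKSPACE = re.compile(r"Workspace #(\d+)\. (\S+) \[ (\S+) \]")
# "      *Handle: cluster-055"
HANDLE = re.compile(r"\*Handle: (\S+)")


def hosts_by_handle(status_output):
    """Map each handle to the host names of the workspaces that carry it."""
    result = defaultdict(list)
    current_host = None
    for line in status_output.splitlines():
        workspace = WORKSPACE.search(line)
        if workspace:
            current_host = workspace.group(3)
            continue
        handle = HANDLE.search(line)
        if handle and current_host is not None:
            result[handle.group(1)].append(current_host)
            current_host = None
    return dict(result)


SAMPLE_STATUS_OUTPUT = """\
[*] - Workspace #84. 198.202.120.101 [ vm-7.sdsc.futuregrid.org ]
      State: Running
      *Handle: cluster-055

[*] - Workspace #85. 198.202.120.102 [ vm-8.sdsc.futuregrid.org ]
      State: Running
      *Handle: cluster-055

[*] - Workspace #86. 198.202.120.103 [ vm-9.sdsc.futuregrid.org ]
      State: Running
      *Handle: cluster-055
"""

for handle, hosts in hosts_by_handle(SAMPLE_STATUS_OUTPUT).items():
    print(handle, "->", ", ".join(hosts))
```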

$ bin/cloud-client.sh --conf conf/sierra.conf --terminate --handle cluster-055
...

Terminating cluster.
  - Cluster handle (EPR): '/path/to/Nimbus/history/cluster-055/cluster-epr.xml'

Destroying cluster-055... destroyed.

And you are done.