One of the Pegasus users, Renette, recently contacted us during Pegasus Office Hours, seeking help with running workflows on AWS Batch. We have supported running workflows on AWS Batch since the Pegasus 4.9.0 release in 2018.
It was impressive to see what Renette had built just by going through the Pegasus documentation and the AWS Batch test example that we run nightly.
This blog post highlights the workflow infrastructure that Renette developed for producing Titan2D hazard maps, which display the probability of a volcanic flow depth reaching a critical height following a premonitory volcanic eruption event.
TITAN2D is a free software application developed by the Geophysical Mass Flow Group at the State University of New York (SUNY) at Buffalo. It simulates granular flows (primarily geological mass flows such as debris avalanches and landslides) over digital elevation models (DEMs) of natural terrain. The code is designed to help scientists and civil protection authorities assess the risk of hazards due to dry debris flows and avalanches, and to mitigate them. TITAN2D combines numerical simulations of a flow with digital elevation data of natural terrain, supported through a Geographical Information System (GIS) interface such as GRASS.
The workflow executed as part of this setup is a port of the Vhub Titan2D Hazard Map Emulator Workflow Tool. The Vhub workflow tool was based on Keith Dalbey’s PhD thesis, “Predictive Simulation and Model Based Hazard Maps of Geophysical Mass Flows” (Dalbey, K., 2009, Department of Mechanical and Aerospace Engineering, University at Buffalo).
Figure 1: Titan2D Hazard Map Emulator Workflow
To make the setup seamless for users, Renette developed two Docker images:
- A submit host image: This Docker image packages Pegasus and HTCondor, is preconfigured with the user’s Amazon credentials, and includes a Jupyter notebook interface that lets users select the parameters used to tune the hazard map calculations.
- A remote host image: This Docker image is pushed to the Amazon Elastic Container Registry (ECR) and contains the software to run Titan2D and the emulator’s uncertainty quantification (UQ) software.
The Jupyter interface allows users to select a volcano, set the material model and pile parameters for the selected volcano’s eruption, set parameters for the emulator’s UQ software, select a VPC subnet and security group, and run a script that further configures the JSON-formatted information catalogs that Pegasus requires.
Figure 2: Selection of parameters for Volcano Simulation
The interface also lets users control the size of the workflow and the number of emulations to run, and monitor the progress of their workflows, all from within Jupyter.
Figure 3: Launching the workflow through the Jupyter Interface
When a user runs a workflow, all the AWS Batch entities required for processing, such as a job queue, a job definition, and a compute environment, are created automatically and then deleted once the workflow completes. On completion, the workflow generates a PDF file with images of hazard maps similar to the ones shown in Figure 4.
The upcoming Pegasus 5.0.9 and 5.1.0 releases include improvements to the AWS Batch support: improved polling of job status to work around AWS Batch API limits, an update of pegasus-s3 to the new Boto implementation, and support for the us-east-1 region. This work also brought to light slight idiosyncrasies in the AWS APIs that depend on whether workloads run in us-east-1 or in another region.
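To give a feel for what polling job status within API limits can look like, here is a minimal Python sketch. It is not the Pegasus implementation, which this post does not describe; it only illustrates one common approach: ask for many jobs per request (the AWS Batch DescribeJobs call accepts up to 100 job IDs) and back off when nothing has changed. The names `poll_until_done`, `chunked`, and `boto3_describe`, as well as the interval values, are made up for this example.

```python
import time
from typing import Callable, Dict, Iterable, List

MAX_IDS_PER_CALL = 100          # DescribeJobs accepts at most 100 job IDs per request
TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def chunked(items: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def poll_until_done(
    job_ids: List[str],
    describe: Callable[[List[str]], Dict[str, str]],
    interval: float = 30.0,
    max_interval: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Poll until every job is in a terminal state and return the final statuses.

    `describe` takes a list of job IDs and returns {job_id: status}. It is
    injected so the loop can be exercised without talking to AWS.
    """
    statuses = dict.fromkeys(job_ids, "SUBMITTED")
    delay = interval
    while True:
        # Only ask about jobs that have not finished yet.
        pending = [j for j, s in statuses.items() if s not in TERMINAL_STATES]
        changed = False
        for batch in chunked(pending, MAX_IDS_PER_CALL):
            for job_id, status in describe(batch).items():
                changed = changed or statuses[job_id] != status
                statuses[job_id] = status
        if all(s in TERMINAL_STATES for s in statuses.values()):
            return statuses
        # Back off while nothing changes; reset once there is activity.
        delay = interval if changed else min(delay * 2, max_interval)
        sleep(delay)


def boto3_describe(client) -> Callable[[List[str]], Dict[str, str]]:
    """Adapt a boto3 Batch client (assumed to be created by the caller)."""
    def describe(batch: List[str]) -> Dict[str, str]:
        response = client.describe_jobs(jobs=batch)
        return {job["jobId"]: job["status"] for job in response["jobs"]}
    return describe
```

With a real client, this could be called as `poll_until_done(job_ids, boto3_describe(boto3.client("batch")))`. Grouping the IDs means a workflow with 250 running jobs needs three requests per polling round rather than 250.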
Hazard Report
Date/Time UTC: 2024-11-14 18:26:36
Volcano Name: COLIMA VOLC COMPLEX
Latitude: 19.512727 [degrees north -90 to 90]
Longitude: -103.617241 [degrees east -180 to 180]
Figure 4: Hazard map outputs generated by the workflow