PUG 2021 - February 23rd and February 25th 2021 – Virtual Event
The 5.0 release brought major improvements to Pegasus in 2020. To keep the momentum up in 2021 and ensure Pegasus is heading in the most beneficial direction for our users, we held our first-ever Pegasus Users Group Meeting (PUG 2021). Because of the pandemic, this event was a virtual workshop spread over two days.
PUG 2021 aimed to give Pegasus users and collaborators an opportunity to interact with Pegasus developers, share ideas, and provide feedback. The meeting combined user experience talks with technical Pegasus talks, tutorials, and office hours.
If you have any questions or issues, please contact us at pegasus@isi.edu.
We use Slack as our main channel for communication and community discussions.
Agenda
Day 1: February 23rd, 2021
Time | Title of Talk | Presenter |
---|---|---|
9:00-9:10am PST / 12:00-12:10pm EST | Introduction and Meeting Logistics | Wendy Whitcup (USC/ISI) |
9:10-9:20am PST / 12:10-12:20pm EST | Welcome to the Workshop | Ewa Deelman (USC/ISI) |
9:10-9:20am PST / 12:10-12:20pm EST | Pegasus in 5 Minutes | Ryan Tanaka (USC/ISI) |
9:20-9:35am PST / 12:20-12:35pm EST | Pegasus 5.0<br>Pegasus 5.0, released in November 2020, is the latest stable release of Pegasus. A key highlight of this release is a brand-new Python 3 based Pegasus API that allows users to compose workflows and control their execution programmatically. This talk will give an overview of the new API and highlight key improvements that address system usability (including comprehensive yet easy-to-navigate documentation and training), the development of core functionality for improving the management and processing of large, distributed data sets, and the management of experiment campaigns defined as ensembles. | Karan Vahi (USC/ISI) |
9:35-9:50am PST / 12:35-12:50pm EST | Ensemble Manager<br>Data processing and analytics pipelines often comprise multiple workflows that are executed over an extended period of time. As such, it is necessary to be able to run and monitor groups of similar workflows, also known as ensembles. The Pegasus Ensemble Manager is a tool provided by Pegasus that enables users to create ensembles, dynamically add and stop workflows, and monitor workflow runs. Furthermore, the Pegasus Ensemble Manager can trigger workflow runs based on the arrival of new data. In this talk, we dive into the features provided by the Pegasus Ensemble Manager, discuss potential use cases, and preview new features that are soon to be released. | Ryan Tanaka (USC/ISI) |
9:50-10:10am PST / 12:50-1:10pm EST | Building an Integrated Assessment Model with Pegasus<br>Integrated assessment models (IAMs) have become important tools to explore the interactions between different modeled components of socio-environmental systems (SES). One common application is projecting the impacts of potential climate change mitigation policies. Vermont EPSCoR has built an IAM framework using Pegasus for the Lake Champlain basin. The current iteration of the Vermont EPSCoR IAM includes a regional climate model, a land use land change agent-based model, two available hydrology models, and a coupled lake hydrodynamic and water quality model. The Vermont EPSCoR IAM framework is designed to allow bidirectional asynchronous feedbacks between the models and is modular, so that a component model can be exchanged for another from the same domain. This modular design also allows for smoother upgrades between different versions of the same component model. In this session, the design decisions and tradeoffs made along the way to implement these features will be discussed. | Patrick J Clemins (University of Vermont) |
10:10-10:30am PST / 1:10-1:30pm EST | Weather Feature Extraction on the Academic Cloud for Drone Route Planning<br>The concept of Urban Air Mobility (UAM) is rapidly advancing. The skies will soon be filled with small aircraft performing delivery missions, video surveillance, atmospheric sensing, and even providing ad-hoc communications networks for disaster relief efforts. However, these vehicles are sensitive to winds, precipitation, and temperature, and their abilities to tolerate these conditions can change as fast as the weather itself. For this reason, we have developed a system for monitoring observed and predicted meteorological conditions, extracting areas of risk tailored to the specific and dynamic vehicle parameters of particular flights, and generating suggested routes to efficiently avoid areas of risk as conditions evolve. This is a compute-intensive process, with many types of observations and forecast products contributing to the information base. We use the ExoGENI academic cloud as the basis for a scalable infrastructure, and the Pegasus WMS to orchestrate the various product generation and extraction routines which ultimately inform the flight pathing algorithm. Here we describe the overall system and the role of Pegasus in managing the load, and propose interfaces that may be useful going forward. | Eric Lyons (University of Massachusetts Amherst) |
10:30-11:00am PST / 1:30-2:00pm EST | Break | |
11:00-11:20am PST / 2:00-2:20pm EST | Leveraging Pegasus to find colliding black holes in the data from the LIGO and Virgo gravitational-wave observatories<br>In the last five years, the Advanced LIGO and Advanced Virgo observatories have opened the field of gravitational-wave astronomy by observing over 50 colliding black holes and neutron stars. These collisions are among the most extreme and energetic events in the Universe, and understanding the properties of such collisions, and the environments in which they happen, offers us a new window to understand the Universe's formation and evolution. Analysing the data taken by these observatories to find compact binary mergers relies on a technique called matched-filtering, which is embarrassingly parallel and well suited for high-throughput computing. The PyCBC codebase is one of the primary analysis toolkits used inside (and outside) the LIGO and Virgo collaborations to observe these objects. In this talk we will discuss how Pegasus is used within PyCBC to create the workflows that are used to detect compact binary mergers, and briefly explain the science that this has enabled. | Ian Harry (LIGO, University of Portsmouth) |
11:20-11:40am PST / 2:20-2:40pm EST | Connecting Observation and Theory: Black Hole Modeling with Large Scale Synthetic Data Generation<br>The Event Horizon Telescope (EHT) is a Very-Long-Baseline Interferometry (VLBI) experiment that links together multiple radio telescopes around the globe and captured the first event horizon scale resolution images of a black hole in 2019. Given the sparsity of the array, EHT image reconstruction is not unique, which poses a significant challenge in interpreting EHT data. In this talk, I will present how the EHT uses Pegasus to deploy a large scale synthetic data generation pipeline on the Open Science Grid. These synthetic data sets play an important role in EHT science. First, they allow us to verify and validate our data processing and image reconstruction algorithms. Second, they help us to understand how the nature of black hole accretion flows can bias our measurements. Third, they provide a forward modeling pathway for us to compare simulations with observations. I will conclude the talk with a plan on how to use Pegasus for some other EHT workflows. | Chi-kwan Chan, Michael Janssen (Event Horizon Telescope, University of Arizona) |
11:40am-12:00pm PST / 2:40-3:00pm EST | Running Rubin LSST Science pipelines on AWS<br>The Legacy Survey of Space and Time (LSST, www.lsst.org), operated by the Vera C. Rubin Observatory, is a 10-year astronomical survey due to start operations in 2022 that will image half the sky every three nights. LSST will produce ~20 TB of raw data per night, which will be calibrated and analyzed in almost real time. Rubin estimates that the total amount of data collected during operations will be about 60 petabytes (PB), from which a 20 PB catalog will be produced. At these data volumes, reprocessing even a relatively small subset of data requires significant infrastructure. We describe how, by using HTCondor and Pegasus WMS and integrating Amazon Web Services functionality into Rubin's Middleware code, we were able to execute Rubin LSST Science Pipelines at scale. We discuss the challenges and benefits of such a system and describe the performance, scalability, and cost of deploying it in the cloud. | Dino Bektešević (LSST, University of Washington) |
12:00-12:20pm PST / 3:00-3:20pm EST | Searching for Dark Matter with XENONnT and Pegasus<br>XENONnT is a ton-scale liquid xenon time projection chamber that will very soon search for dark matter deep under the Gran Sasso mountains in central Italy. Despite the ultra-low background level we expect to reach in XENONnT, the low-energy threshold required to search for dark matter results in a data volume of O(PB) per year, much too large to store at a single site. We thus use Rucio for data management across multiple storage sites around the world and Pegasus for the distributed processing workflow on the Open Science Grid, which produces the high-level data used for final analyses. At the same time, Pegasus is also used for the Monte Carlo data production, a critical step in the analysis of XENONnT data. In this talk, I will summarize the science goals of XENONnT and how Pegasus is used to help us reach those goals. | Evan Shockley (University of California, San Diego) |
12:20-12:35pm PST / 3:20-3:35pm EST | Coffee Break | |
12:35-1:30pm PST / 3:35-4:30pm EST | | Mats Rynge |
Day 2: February 25th, 2021
Code of Conduct
This workshop is dedicated to providing a welcoming and supportive environment for all people, regardless of background or identity. By participating in this workshop, participants agree to abide by this Code of Conduct and accept the procedures by which any Code of Conduct incidents are resolved. We do not tolerate behavior that is disrespectful or that excludes, intimidates, or causes discomfort to others. We do not tolerate discrimination or harassment based on characteristics that include, but are not limited to, gender identity and expression, sexual orientation, disability, physical appearance, body size, citizenship, nationality, ethnic or social origin, pregnancy, familial status, veteran status, genetic information, religion or belief (or lack thereof), membership of a national minority, property, age, education, socio-economic status, technical choices, and experience level.
Everyone who participates in workshop activities is required to conform to this Code of Conduct. It applies to all spaces managed by or affiliated with the workshop. Workshop hosts are expected to assist with the enforcement of the Code of Conduct. By participating, participants indicate their acceptance of the procedures by which the workshop resolves any Code of Conduct incidents.
Expected behavior
All participants in PUG 2021 events and communications are expected to show respect and courtesy to others. All interactions should be professional regardless of platform, whether online or in person. In order to foster a positive and professional learning environment, we encourage the following kinds of behaviors in all workshop events and platforms:
- Use welcoming and inclusive language
- Be respectful of different viewpoints and experiences
- Gracefully accept constructive criticism
- Focus on what is best for the community
- Show courtesy and respect towards other community members
Unacceptable behavior
Examples of unacceptable behavior by participants at any workshop event/platform include:
- written or verbal comments which have the effect of excluding people on the basis of membership of any specific group
- causing someone to fear for their safety, such as through stalking, following, or intimidation
- violent threats or language directed against another person
- the display of sexual or violent images
- unwelcome sexual attention
- nonconsensual or unwelcome physical contact
- sustained disruption of talks, events, or communications
- insults or put-downs
- sexist, racist, homophobic, transphobic, ableist, or exclusionary jokes
- excessive swearing
- incitement to violence, suicide, or self-harm
- continuing to initiate interaction (including photography or recording) with someone after being asked to stop
- publication of private communication without consent
Consequences of unacceptable behavior
If you believe someone is violating the Code of Conduct, we ask that you report it to any of the workshop organizers. Participants who are asked to stop any inappropriate behavior are expected to comply immediately. If a participant engages in behavior that violates this code of conduct, the organizers may warn the offender, ask them to leave the event or platform, or investigate the Code of Conduct violation and impose appropriate sanctions.
This Code of Conduct is adapted from the FABRIC Community Workshop and the excellent code of conduct articulated by Software Carpentry.