Pegasus and AtlanticWave-SDX Help Orchestrate Science Applications

Many new technologies and paradigms are emerging in the research community to support science workflows, including Data Transfer Nodes (DTNs), distributed compute and data infrastructure, research testbeds, and new inter-domain federated orchestrators. Workflow management systems (WMSs), such as Pegasus, are evolving to be more resource aware so that they can make more intelligent decisions about dispatching and coordinating computation and data transfer jobs.

For global-scale scientific applications, Open Exchange Points (OXPs) [1] are the critical communication infrastructure that connects compute resources with data-intensive instruments, which are often located on different continents [2]. Over nearly a decade, Software Defined Networking (SDN) [3] has been introduced into OXP domains to create Software Defined eXchanges (SDXs) [4], which can dramatically improve network utilization, programmability, and the efficiency of network resource management. This lets resource orchestrators and application software make the best use of networking capabilities in real time.

The collaboration between the Pegasus and AtlanticWave-SDX (AW-SDX) projects demonstrates our vision of an end-to-end resource orchestration framework that enables intelligent interactions between scientific workflows and the network substrate.

The AtlanticWave-SDX (AW-SDX) project (NSF OAC-1451024 and 2029278, 2015-2025) changed how domain scientists and network engineers can use Open Exchange Points (OXPs) to support science workflows. The AW-SDX team developed an inter-domain SDX platform that links a large number of data sources and scientific users to advanced distributed OXPs across Miami, Atlanta, Latin America, and South Africa, working toward an open, innovative platform while continuously enhancing their production capacity and service capabilities.

AtlanticWave-SDX

Challenges in Wide-Area Networks (WANs) are being addressed by applying network virtualization and network programmability solutions built on SDN and Network Functions Virtualization (NFV) [5] technologies. Network virtualization and programmable networks are two key enablers of agile, fast, and more economical network infrastructures, as well as of service development, deployment, and provisioning. Typically, SDNs that cross multiple domains are constructed manually, which requires significant coordination and effort from network operators.

OXPs serve as intersections that connect Research and Education (R&E) domain networks and facilitate the exchange of data between them. They are critical cyberinfrastructure for moving data over long geographic distances, switching data flows from one R&E network to the next until they reach their destination. Making OXPs perform efficiently and effectively calls for a dedicated effort in networking science. Operationalizing the transit of data flows across OXPs is increasingly important for minimizing the impact of events such as hardware failures and soft failures on network services. End-to-end network paths for these data flows are not under the control of any individual organization: multiple R&E network operators must collaborate and coordinate to establish geographically distributed end-to-end paths, an effort that can take days to weeks.

AW-SDX serves as a distributed experimental SDX to support research, experimental deployments, prototyping, and interoperability testing on national and international scales. It is a response to the growing demand for end-to-end services that span multiple SDN domains. AtlanticWave-SDX comprises two components: (1) a network infrastructure development component to bridge 100G of network capacity between R&E backbone networks in the U.S. and S. America, as shown in Figure 1; and (2) an innovation component to build a distributed intercontinental experimental SDX between the U.S. and S. America by leveraging open exchange point resources at SoX (Atlanta), AMPATH (Miami), and Southern Light (São Paulo, Brazil), as shown in Figure 2.

Figure 1. AW-SDX as the bridge between U.S. and S. America

Figure 2. AtlanticWave-SDX 2.0 Architecture
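
To make the multi-domain coordination problem concrete, the following is a minimal sketch of finding an end-to-end path across exchange points and listing the segments that would each need provisioning. It is not the AW-SDX controller logic. The adjacency between the three OXPs is a hypothetical assumption chosen for illustration, and the function names `find_path` and `segments` are invented here.

```python
from collections import deque

# Hypothetical adjacency between the three OXPs named in the text.
# The real link layout and capacities are not given in the article.
LINKS = {
    "SoX": ["AMPATH"],
    "AMPATH": ["SoX", "SouthernLight"],
    "SouthernLight": ["AMPATH"],
}


def find_path(links, src, dst):
    """Breadth-first search returning a fewest-hop path, or None if unreachable."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def segments(path):
    """Split a path into the hop-by-hop segments that each need provisioning."""
    return list(zip(path, path[1:]))


if __name__ == "__main__":
    path = find_path(LINKS, "SoX", "SouthernLight")
    for a, b in segments(path):
        print(f"provision segment {a} -> {b}")
```

In a manual process, each segment printed here would correspond to a request handled by a different operator, which is where the days-to-weeks delay comes from; an SDX aims to automate that step.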

Pegasus Orchestrating Science Workflows on AW-SDX

The Pegasus Workflow Management System (Pegasus WMS) bridges the scientific domain and the execution environment (compute, networking, and storage infrastructures) by automatically mapping high-level workflow descriptions onto distributed resources. AW-SDX leverages Pegasus’ capabilities to orchestrate science application execution on the underlying SDX infrastructures. As a demonstration, two science workflows are used to show that high-performance, data-intensive science workflows can run seamlessly with Pegasus WMS on AW-SDX infrastructures.

1000Genome

The 1000 Genomes Project [6] provides a reference for human variation, having reconstructed the genomes of 2,504 individuals across 26 different populations. The test case used in this work identifies mutational overlaps using data from the 1000 Genomes Project in order to provide a null distribution for rigorous statistical evaluation of potential disease-related mutations.

This 1000Genome test workflow is composed of five different tasks: (1) individuals: fetches and parses the Phase 3 data from the 1000 Genomes Project per chromosome; (2) populations: fetches and parses five super populations (African, Mixed American, East Asian, European, and South Asian) and a set of all individuals; (3) sifting: computes the SIFT scores of all SNP (single nucleotide polymorphism) variants, as computed by the Variant Effect Predictor; (4) pair overlap mutations: measures the overlap in mutations (SNPs) among pairs of individuals; and (5) frequency overlap mutations: calculates the frequency of overlapping mutations across subsamples of certain individuals.

Figure 3. 1000Genome Workflow
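
The five tasks above can be sketched as a small dependency graph to show how a workflow system decides which jobs may run concurrently. This is an illustration using only the Python standard library, not the Pegasus API. The dependency edges are an assumption made here (the text lists the tasks but not how they depend on one another), and `execution_waves` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Task -> set of tasks it depends on. These edges are an assumption for
# illustration: the two analysis tasks are taken to consume the outputs of
# the three fetch/parse tasks.
DEPENDENCIES = {
    "individuals": set(),
    "populations": set(),
    "sifting": set(),
    "pair_overlap_mutations": {"individuals", "populations", "sifting"},
    "frequency_overlap_mutations": {"individuals", "populations", "sifting"},
}


def execution_waves(deps):
    """Group tasks into waves; tasks in the same wave have no mutual dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for i, wave in enumerate(execution_waves(DEPENDENCIES), start=1):
        print(f"wave {i}: {', '.join(wave)}")
```

Under these assumed edges, the three data preparation tasks form one wave and the two overlap analyses form the next, which is the kind of parallelism a WMS can exploit when dispatching jobs to distributed resources.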

CASA Nowcast Workflow

The Center for Collaborative Adaptive Sensing of the Atmosphere (CASA) [7] was formed to study the lower atmosphere with networks of high-resolution Doppler weather radars, with the goal of improving severe weather awareness. CASA Nowcast workflows produce short-term advection forecasts: they take observed reflectivity data from multiple radars, composite the data over a certain number of minutes, and project it into the future by estimating the derivatives of motion and intensity with respect to time. The Pegasus workflow for Nowcasts was developed under the NSF DyNamo project [8]. In its current version, the workflow processes data every minute and produces a series of gridded reflectivity data, in which each grid corresponds to the projected conditions for one minute of the next 30 minutes. An example of the calculated reflectivity is presented in Figure 4.

Figure 4. CASA Nowcasting Workflow and Radar Image
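
The idea of projecting a reflectivity grid forward in time can be illustrated with a toy advection sketch. This is not the CASA nowcasting algorithm: the real workflow estimates motion and intensity change from composited radar observations, whereas the sketch below assumes a constant motion vector (in grid cells per minute) and a constant intensity trend supplied by the caller. The function names `advect` and `nowcast` and all parameters are hypothetical.

```python
def advect(grid, dx, dy, fill=0.0):
    """Shift a 2-D grid by (dx, dy) whole cells; cells with no source get `fill`."""
    rows, cols = len(grid), len(grid[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            src_r, src_c = r - dy, c - dx
            if 0 <= src_r < rows and 0 <= src_c < cols:
                out[r][c] = grid[src_r][src_c]
    return out


def nowcast(grid, dx_per_min, dy_per_min, trend_per_min=0.0, minutes=30):
    """Return one projected grid per minute for the next `minutes` minutes.

    Assumes constant motion and a constant intensity trend, applied only to
    cells that contain an echo (value > 0), clamped at zero.
    """
    frames = []
    for t in range(1, minutes + 1):
        shifted = advect(grid, round(dx_per_min * t), round(dy_per_min * t))
        frames.append([
            [max(0.0, v + trend_per_min * t) if v > 0 else 0.0 for v in row]
            for row in shifted
        ])
    return frames


if __name__ == "__main__":
    observed = [
        [0.0, 0.0, 0.0, 0.0],
        [40.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
    ]
    frames = nowcast(observed, dx_per_min=1, dy_per_min=0, trend_per_min=-5.0, minutes=3)
    for minute, frame in enumerate(frames, start=1):
        print(f"t+{minute} min: {frame[1]}")
```

Each returned frame plays the role of one of the per-minute grids described above; with the default `minutes=30` the function yields the 30-frame series.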

Figure 5. The AW-SDX Data Transfer Dashboard

In the demonstration, both the 1000Genome and CASA Nowcast workflows run seamlessly on networking and compute resources that AW-SDX provisions in real time. Pegasus plays a key role in the orchestration and execution of both workflows. With Pegasus' container support, the science workflows can be deployed effectively in the inter-domain application environment interconnected by the SDX. Pegasus also provides a strong online monitoring system that collects real-time performance data from low-level system profiling tools and transfer services. The data can be displayed on the interactive dashboard shown in Figure 5.

AtlanticWave-SDX Team: Julio E. Ibarra, Heidi L. Morgan, Jeronimo A. Bezerra, Vasilka Chergarova, Yufeng Xin, Mert Cevik, Cong Wang, Michael Stealey, Lisandro Granville, Marcos Schwarz

DyNamo Team: Anirban Mandal, Ewa Deelman, Michael Zink, Ivan Rodero, J. J. Villalobos, Paul Ruth, Eric Lyons, George Papadimitriou, Cong Wang, Komal Thareja

Acknowledgements: AW-SDX is supported by the National Science Foundation, award numbers #1451024 and #2029278.
DyNamo is supported by the National Science Foundation, award number #1826997.

Authors: Cong Wang (Renaissance Computing Institute), Yufeng Xin (Renaissance Computing Institute), Heidi Morgan (USC – Information Sciences Institute), Julio Ibarra (Florida International University)

References

[1] P. L. Ventre et al., “GEANT SDX – SDN based Open eXchange Point,” 2016 IEEE NetSoft Conference and Workshops (NetSoft), Seoul, Korea (South), 2016, pp. 345-346

[2] Vera C. Rubin Observatory, https://www.lsst.org/.

[3] Keith Kirkpatrick. 2013. Software-defined networking. Commun. ACM 56, 9 (September 2013), 16–19.

[4] Arpit Gupta, Laurent Vanbever, Muhammad Shahbaz, Sean P. Donovan, Brandon Schlinker, Nick Feamster, Jennifer Rexford, Scott Shenker, Russ Clark, and Ethan Katz-Bassett. 2014. SDX: a software defined internet exchange. SIGCOMM Comput. Commun. Rev. 44, 4 (October 2014), 551–562.

[5] R. Mijumbi, J. Serrat, J. Gorricho, S. Latre, M. Charalambides and D. Lopez, “Management and orchestration challenges in network functions virtualization,” in IEEE Communications Magazine, vol. 54, no. 1, pp. 98-105, January 2016.

[6] Siva, N. 1000 Genomes project. Nat Biotechnol 26, 256 (2008).

[7] B. Philips, D. Pepyne, D. Westbrook, E. Bass, J. Brotzge, W. Diaz, K. Kloesel, J. Kurose, D. McLaughlin, H. Rodriguez, and M. Zink. “Integrating End User Needs into System Design and Operation: The Center for Collaborative Adaptive Sensing of the Atmosphere (CASA).” In Proceedings of Applied Climatol., American Meteorological Society Annual Meeting, San Antonio, TX, USA, 2007.

[8] Lyons, Eric J. and Papadimitriou, George and Wang, Cong and Thareja, Komal and Ruth, Paul and Villalobos, J. and Rodero, Ivan and Deelman, Ewa and Zink, Michael and Mandal, Anirban. “Toward a Dynamic Network-Centric Distributed Cloud Platform for Scientific Workflows: A Case Study for Adaptive Weather Sensing,” 2019 15th International Conference on eScience (eScience), 2019. doi:10.1109/eScience.2019.00015