Chapter 15. Jupyter Notebooks

15.1. Introduction

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text. Its flexible and portable format has led to rapid adoption by the research community for sharing and interacting with experiments. Jupyter Notebooks have strong potential to narrow the gap between researchers and the complex knowledge required to run large-scale scientific workflows, by offering a high-level programmatic interface for accessing and managing workflow capabilities.

The Pegasus-Jupyter integration aims to make Pegasus easier to use from Jupyter notebooks. In addition to ease of use, notebooks foster reproducibility (all the information needed to run an experiment is in a single place) and reuse (notebooks are portable when run in equivalent environments). Since Pegasus 4.8.0, a Python API for declaring and managing Pegasus workflows from Jupyter has been provided. The user can create a notebook, declare a workflow using the Pegasus DAX API, and then create an instance of the workflow for execution. This API encapsulates most Pegasus commands (e.g., plan, run, and statistics) and supports workflow creation, execution, and monitoring. It also provides mechanisms to define the Pegasus catalogs (site, replica, and transformation) and to generate tutorial example workflows.
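To make the idea of "encapsulating Pegasus commands" concrete, here is a minimal, self-contained sketch of how a notebook-side object could map the plan, run, and statistics steps onto the corresponding command-line tools. It is an illustration only and is not the Pegasus-Jupyter API: the class name WorkflowInstanceSketch, its fields, its method names, and the file and directory names are all hypothetical, and the sketch only builds the command lines rather than executing them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkflowInstanceSketch:
    """Hypothetical stand-in for a notebook-side workflow instance.

    This is NOT the real Pegasus-Jupyter API; every name here is an
    assumption made for illustration.
    """

    dax_file: str         # workflow description written from the notebook
    base_dir: str         # directory under which the workflow is planned
    submit_dir: str       # submit directory reported by the planning step
    site: str = "local"   # placeholder execution site name

    def plan_command(self) -> List[str]:
        # Planning maps the abstract workflow onto the chosen site.
        return ["pegasus-plan", "--dax", self.dax_file,
                "--dir", self.base_dir, "--sites", self.site]

    def run_command(self) -> List[str]:
        # Running submits the already planned workflow.
        return ["pegasus-run", self.submit_dir]

    def statistics_command(self) -> List[str]:
        # Statistics are gathered from the same submit directory.
        return ["pegasus-statistics", self.submit_dir]


# Placeholder paths; none of these files need to exist for the sketch.
instance = WorkflowInstanceSketch(
    dax_file="workflow.dax",
    base_dir="submit",
    submit_dir="submit/run0001",
)

for command in (instance.plan_command(),
                instance.run_command(),
                instance.statistics_command()):
    print(" ".join(command))
```

In a notebook, each step would typically live in its own cell, so the plan, run, and statistics calls can be re-executed independently. A real wrapper would also execute the commands and capture their output; that part is deliberately left out of this sketch.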