The Accelerated Climate Modeling for Energy (ACME) project uses coupled models of the ocean, land, atmosphere, and ice to study the complex interaction between climate change and societal energy requirements. One of the flagship workflows of this effort is the fully coupled climate model running at high resolution. The ACME climate modeling effort involves many different software and hardware components distributed across computing resources at several DOE laboratories.

As part of the ACME project, many workflow activities that were previously done manually are being automated. The goal is an automated, end-to-end workflow that provides full data provenance. One important step toward that goal is to automate the small portion of the full workflow that involves running the ACME climate model. The Panorama project is developing a workflow that automates the manual effort of monitoring the model code and resubmitting it after failures, and that provides periodic reporting for validation of science outputs.

The workflow, illustrated here, divides a large climate simulation into several stages. Each stage completes a portion of the total target simulation time. For example, a 40-year simulation may be divided into eight 5-year stages. This allows each stage to complete within the maximum walltime permitted for batch jobs on the target DOE leadership-class computing system. Restart files generated at the end of each stage are used as input to the next stage to continue the simulation. Each stage also produces history files, which the workflow uses to automatically compute summary data called climatologies. Project scientists can review the climatologies periodically to ensure that the simulation is progressing as expected, so that problems can be identified and corrected before computing resources are wasted. Both the history files and the climatologies are transferred to HPSS and CADES (open infrastructure) for long-term storage and future analysis.
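
The staging scheme described above can be sketched in a few lines of Python. This is only an illustration, not the Panorama or ACME implementation: the names `Stage`, `plan_stages`, `run_workflow`, and the `run_stage` callback are hypothetical, and the retry limit is an assumed parameter. The sketch splits a target simulation length into fixed-size stages, records which earlier stage supplies the restart files, and resubmits a failed stage a limited number of times before giving up.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Stage:
    """One segment of the simulation (hypothetical structure for illustration)."""
    index: int
    start_year: int                # simulated year at which this stage begins
    end_year: int                  # simulated year at which this stage ends
    restart_from: Optional[int]    # index of the stage whose restart files seed this one


def plan_stages(total_years: int, years_per_stage: int) -> List[Stage]:
    """Divide total_years into consecutive stages of at most years_per_stage."""
    if total_years <= 0 or years_per_stage <= 0:
        raise ValueError("total_years and years_per_stage must be positive")
    stages: List[Stage] = []
    start = 0
    while start < total_years:
        end = min(start + years_per_stage, total_years)
        index = len(stages)
        # The first stage starts from initial conditions; later stages
        # continue from the restart files of the previous stage.
        restart_from = index - 1 if index > 0 else None
        stages.append(Stage(index, start, end, restart_from))
        start = end
    return stages


def run_workflow(
    stages: List[Stage],
    run_stage: Callable[[Stage], bool],
    max_attempts: int = 3,
) -> List[int]:
    """Run stages in order, resubmitting a failed stage up to max_attempts times.

    run_stage is a caller-supplied function (an assumption of this sketch) that
    submits one stage and returns True on success. Returns the number of
    attempts each stage needed.
    """
    attempts_used: List[int] = []
    for stage in stages:
        for attempt in range(1, max_attempts + 1):
            if run_stage(stage):
                attempts_used.append(attempt)
                break
        else:
            raise RuntimeError(
                f"stage {stage.index} failed after {max_attempts} attempts"
            )
    return attempts_used
```

With the figures from the text, `plan_stages(40, 5)` yields eight stages, and each stage after the first points back to its predecessor for restart files. In a real deployment the `run_stage` callback would wrap batch-job submission and monitoring, and climatology generation would hang off each completed stage.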

Scientists: The ACME Collaboration