ECMWF is the only forecasting centre with full membership of the ETP4HPC,
which it has held since 2014 with the aim of generating a
concerted European approach to sustainable excellence in HPC for weather
and climate prediction. ESCAPE was the first project to address
specific aspects of the ETP4HPC research agenda and its realisation in multi-scale
numerical modelling of the atmosphere. ESCAPE-2 extends this effort to a much
wider community, to the full Earth system and to a wider set of representative
models, with substantial technical enhancements towards fulfilling
the ETP4HPC SRA's multi-dimensional HPC vision.
Technical Research Priorities (and milestones):
- HPC system architecture and components: ESCAPE-2
prepares for diverse options to use specialized compute units but does
not contribute directly to hardware development. Based on existing and
pre-release technology to be made available by BULL and HPC centres
associated with the ESCAPE-2 consortium partners, the performance portability
and programmability of ESCAPE-2 benchmarks will be tested, and
performance (with focus on energy efficiency and time-to-solution) will be
assessed based on defined metrics. Hence, ESCAPE-2 will directly impact
the application side of co-design. Moreover, compilers are an essential
part of HPC installations. By providing weather & climate dwarfs and the
HPCW to compiler developers, ESCAPE-2 enables systematic and routine testing of
vendor-provided compilers against archetypical domain algorithms, which will aid
and accelerate compiler development cycles and improve robustness and customer
satisfaction.
Relevant SRA-2 milestones: M-ARCH-1, M-ARCH-7.
- Programming environment: ESCAPE-2
contributes directly to the programming environment by exploiting the
effectiveness of domain-specific languages for enhancing productivity,
accelerating development cycles and achieving performance portability in terms of
computing and energy efficiency of key algorithmic components. The design
and implementation of a weather and climate domain-specific language
(DSL) concept based on the tools introduced in ESCAPE will
bridge the gap between highly complex heritage codes and software layers with
substantial hardware-specific design elements. This development is considered
crucial for enabling the mathematical and algorithmic developments to be (a)
useable across different models and (b) applicable to and portable between
existing and future hardware technologies. The collaboration
with BULL as a partner introduces an interface to compiler design in
support of hardware abstraction towards a flexible management of data
locality and concurrency. ESCAPE-2's weather and climate DSL design will
dramatically transform the implementation and adaptation efficiency of weather
and climate prediction applications throughout the FET programme's
co-design phase.
Relevant SRA-2 milestones: M-PROG-API-1,
M-PROG-API-2, M-PROG-API-5.
- Energy and resilience: Enhancing energy
efficiency in weather and climate prediction is essential when
approaching global kilometre-scale simulations and under stringent operational
time constraints. ESCAPE combined a paradigm change for the relevant algorithms
with a concept for employing specialized hardware in a heterogeneous
environment for dedicated tasks dealing with the resolved flow (model dynamics)
and unresolved processes (physical parameterizations). Developing new approaches
that improve time-to-solution and energy-to-solution (e.g. energy
per forecast) at the same time is a key objective of ESCAPE-2. ESCAPE-2 proposes the
development of novel numerical techniques that combine highly effective
large-time-step advection with highly scalable, flexible-order spatial
discretization, thus minimizing communication and enhancing data locality
without compromising time-to-solution. The definition of metrics and the employment
of generic VVUQ tools will achieve community-wide applicability by providing
a detailed quantification of performance portability achieved through domain-specific
language implementations.
ESCAPE-2 directly addresses
resilience with hierarchical concepts for fault-tolerant solvers that
support application resilience during large-scale parallel
simulations under strict weather and climate work schedule constraints. The
solvers will be tested by implementing a fault detection scheme and iterative
data recovery schemes preserving the numerical performance of the solver.
Relevant SRA-2 milestones: M-ENR-MS-2, M-ENR-FT-6, M-ENR-AR-7,
M-ENR-AR-8.
- Mathematics and algorithms for extreme-scale
HPC systems: ESCAPE-2 aims
to deliver a breakthrough in time-to-solution effectiveness of highly scalable,
flexible-order spatial discretizations, introducing fault-tolerant algorithms
supported by hierarchical multigrid tools and a controlled sensitivity to
numerical precision, as well as introducing surrogate neural network models by
moving training periods outside the critical path and by
transforming low-flop operations typical in physical parameterizations to efficient
matrix-multiply operations. Connecting and combining these techniques, ESCAPE-2
will directly address the software gap between complex hardware and
complex applications through its focus on advancing energy-efficient
algorithmic building blocks optimized for data flow, data locality and
communication patterns across processors. Weather and climate dwarfs pioneered
in ESCAPE are emerging as an accepted development template for the entire
weather and climate prediction community. Performance portability for emerging
hardware is a second cornerstone that will ensure sustainable
productivity of software development cycles for complex weather & climate codes. The
developments will impact the European science community by advancing
productivity and showcasing performance portability with world-leading and
highly complex forecasting models. These models are at the core of operational
service providers and the ESCAPE-2 developments will affect (a)
science implementation roadmaps and (b) future HPC procurements throughout
the community as its workloads approach the exascale era.
Relevant SRA-2 milestones: M-ALG-1, M-ALG-2, M-ALG-8, M-ALG-9.
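
The fault detection and iterative data recovery idea named under the resilience and algorithms priorities can be illustrated with a minimal sketch. It is not the ESCAPE-2 solver design: it swaps the hierarchical multigrid concept for a plain Jacobi iteration with checkpoint rollback, and the test matrix tridiag(-1, 4, -1), the function names, the `fault_at` injection parameter and the checkpoint interval are all assumptions made for illustration. For this matrix the residual 2-norm shrinks at every fault-free sweep, so a residual that grows or becomes non-finite is treated as evidence of a corrupted state.

```python
import math


def _neighbours(x, i):
    """Left and right neighbours of x[i], with zero boundary values."""
    left = x[i - 1] if i > 0 else 0.0
    right = x[i + 1] if i < len(x) - 1 else 0.0
    return left, right


def residual_norm(x, b):
    """2-norm of b - A x for the assumed test matrix A = tridiag(-1, 4, -1)."""
    total = 0.0
    for i in range(len(x)):
        left, right = _neighbours(x, i)
        r = b[i] - (4.0 * x[i] - left - right)
        total += r * r
    return math.sqrt(total)


def jacobi_sweep(x, b):
    """One Jacobi sweep for A x = b with A = tridiag(-1, 4, -1)."""
    new = []
    for i in range(len(x)):
        left, right = _neighbours(x, i)
        new.append((b[i] + left + right) / 4.0)
    return new


def solve(b, tol=1e-10, max_iter=500, checkpoint_every=5, fault_at=None):
    """Jacobi iteration with residual-based fault detection and rollback.

    fault_at is a hypothetical test hook: at that iteration one entry of
    the state is overwritten to mimic a soft error.
    Returns (solution, iterations used, number of recoveries).
    """
    x = [0.0] * len(b)
    checkpoint, checkpoint_res = list(x), residual_norm(x, b)
    prev_res = checkpoint_res
    recoveries = 0
    for it in range(1, max_iter + 1):
        x = jacobi_sweep(x, b)
        if it == fault_at:
            x[len(x) // 2] = 1.0e30  # injected corruption (illustration only)
        res = residual_norm(x, b)
        # Detection: a fault-free sweep never increases the residual here.
        if not math.isfinite(res) or res > prev_res:
            # Recovery: restore the last verified state and keep iterating.
            x, prev_res = list(checkpoint), checkpoint_res
            recoveries += 1
            continue
        prev_res = res
        if res < tol:
            return x, it, recoveries
        if it % checkpoint_every == 0:
            checkpoint, checkpoint_res = list(x), res
    return x, max_iter, recoveries


if __name__ == "__main__":
    rhs = [1.0] * 16
    _, n_clean, _ = solve(rhs)
    _, n_fault, n_rec = solve(rhs, fault_at=12)
    print("iterations without fault:", n_clean)
    print("iterations with fault:", n_fault, "recoveries:", n_rec)
```

In this sketch the recovered run reaches the same converged solution as the fault-free run and pays only for the sweeps lost since the last checkpoint. A residual check of this kind would miss corruptions too small to raise the residual, so it should be read as a simplified stand-in for a real detection scheme.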
Extreme-scale Demonstrators:
- ESCAPE-2 will play an important role in defining
a key European weather and climate prediction application benchmark (HPCW) for
Extreme-scale Demonstrators. ESCAPE-2 will further develop the dwarf concept pioneered in ESCAPE and the
Kronos workload simulator to generate ready-to-use applications for co-design
projects (e.g. EuroEXA, NextGenIO) and Extreme-scale Demonstrators. The
inclusion of the ICON and NEMO models and the establishment of a
weather & climate specific DSL concept --- to allow the implementation
of novel mathematical concepts and algorithms across models and hardware
platforms --- prepares the weather and climate applications for
deployment on the Extreme-scale Demonstrators. The combined outcomes of
ESCAPE, NextGenIO, EuroEXA and ESCAPE-2 will be readily available in phase B of
the demonstrators.
Ecosystem at large - stakeholders and initiatives:
- European Extreme Data and Computing
Initiative: Weather and climate prediction represents a dedicated
application area within the current EXDCI project (its work package 3), co-led
by an ESCAPE-2 partner institute (CMCC). ESCAPE-2 will impact the definition of
the science case for the weather and climate community and act as a focal point
for the transition of community models to the exascale with the centre of
excellence (ESiWACE) as a dissemination hub. Note that ESCAPE-2 will be the
only core development project supporting this transition within FET.
- Centres of Excellence in Computing
Applications: ESCAPE-2 develops user-driven application components
that provide scalable benchmarks for weather and climate prediction. These will
be used to support the definition of the use cases that represent the grand
science challenges addressed by the weather and climate prediction centre of
excellence ESiWACE (and its potential successor). ESCAPE-2 partners comprise
the ESiWACE co-leading institutes (DKRZ and ECMWF), and key partners (MPIM,
CMCC, BSC, BULL). ESCAPE-2 will be instrumental in defining the scope of
community models supported by future centres of excellence acting on behalf of
the weather and climate prediction community.
ETP4HPC SRA, Completing the value chain:
- ECMWF
combines advanced research and operational applications, which benefits both the
application and service layers spanned by ECMWF (including Copernicus services),
its member states and ESCAPE-2 project partners as they represent a significant
portion of the European weather and climate forecasting community. The
push-through of the envisaged ESCAPE-2 developments follows the same impact
route. While the ETP4HPC SRA focuses its recommendations on the industrial
impact, a similar value-chain template applies to environmental application and
service provision.