Successful workshop on fault-tolerant algorithms and resiliency approaches

28 Jan 2019
ESCAPE-2 organised a workshop on fault-tolerant algorithms and resiliency approaches on the 23rd and 24th of January 2019 in Milan, Italy. The workshop consisted of a first day of seminars by experts in systems resilience and fault-tolerant numerical algorithms, followed by a second day of scientific discussions between those experts and project participants. The presentations gave a detailed picture of the state of the art in the field and established connections with operational workflows and numerical algorithms used in atmospheric applications.

During the discussion sessions, participants explored in more detail how to complement existing numerical weather and climate prediction models with resilience and fault-tolerance techniques. Specific recommendations included benchmarking NWP data volume and operational requirements, pairing fault-tolerant algorithms with system resilience in consistent workflows, coordinating with vendors to provide detailed hardware fault information, and embedding fault tolerance in domain-specific language programming paradigms.

The conclusions of the workshop will feature in a white paper to be submitted as an ESCAPE-2 project deliverable, and will inform the investigation of hardware and software resiliency tools within existing and future ESCAPE-2 project dwarfs. Presentations can be found on the workshop page.