Fault Tolerance for High-Performance Applications. Edition No. 1
VDM Publishing House, April 2009, Pages: 240
In recent years, parallel computing has increasingly exploited high-level models of structured parallelism, such as algorithmic skeletons. This trend has been motivated by the properties of these models, which can be used to derive several optimizations at the implementation level. In this thesis we study the properties of structured parallel models that are useful for providing fault tolerance support oriented towards high-performance applications. Unlike existing approaches, we take a step towards a more abstract and general viewpoint, highlighting the properties of structured parallel models that are relevant to fault tolerance. We introduce a modeling tool for structured constructs and apply it to two notable examples of parallel constructs, deriving abstract properties. We show how the derived properties can be used to introduce an optimized fault tolerance support based on checkpointing and rollback-recovery techniques. Exploiting structured parallel constructs allows us to derive performance models of computation that describe the costs of fault tolerance.