Publications
A resiliency model for high performance infrastructure based on logical encapsulation
Abstract
An emerging trend in distributed systems is the creation of dynamically provisioned, heterogeneous high-performance platforms that include the co-allocation of both virtualized computing and network-attached storage volumes offering NAS- and SAN-level data services. These high-performance computing environments support parallel applications performing traditional file system operations. As with any parallel platform, the ability to continue computation in the face of component failures is an important characteristic. Achieving resiliency in heterogeneous environments presents unique challenges and opportunities not found in homogeneous aggregations of computing resources. We present a logical encapsulation model for heterogeneous high-performance infrastructure, which enables a reactive resiliency approach for federations of virtual machines and externally hosted physical storage volumes …
- Date: June 18, 2012
- Authors: James J Moore, Carl Kesselman
- Book: Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing
- Pages: 283-294