MOANA: Modeling and Analyzing I/O Variability in Parallel System Experimental Design
Thomas Lux, Layne T. Watson
Abstract
Exponential increases in complexity and scale make variability a growing threat to sustaining HPC performance at exascale. Performance variability in HPC I/O is common, acute, and formidable. We take the first step towards comprehensively studying linear and nonlinear approaches to modeling HPC I/O system variability in an effort to demonstrate that variability is often a predictable artifact of system design. Using over 8 months of data collection on 6 identical systems, we propose and validate a modeling and analysis approach (MOANA) that predicts HPC I/O variability for thousands of software and hardware configurations on highly parallel shared-memory systems. Our findings indicate nonlinear approaches to I/O variability prediction are an order of magnitude more accurate than linear regression techniques. We demonstrate the use of MOANA to accurately predict the confidence intervals of unmeasured I/O system configurations for a given number of repeat runs - enabling users to quantitatively balance experiment duration with statistical confidence.
Publication Details
- Date of publication: January 31, 2019
- Journal: IEEE Transactions on Parallel and Distributed Systems
- Page number(s): 1843-1856
- Volume: 30
- Issue Number: 8
- Publication note: Kirk W. Cameron, Ali Anwar, Yue Cheng, Li Xu, Bo Li, Uday Ananth, Jon Bernard, Chandler Jearls, Thomas Lux, Yili Hong, Layne T. Watson, Ali Raza Butt: MOANA: Modeling and Analyzing I/O Variability in Parallel System Experimental Design. IEEE Trans. Parallel Distributed Syst. 30(8): 1843-1856 (2019)