"Ask not what your compiler can do for you, ask what you can do for your compiler."
--John Levesque, Director of Cray’s Supercomputing Centers of Excellence
The next decade of computationally intensive computing lies with more powerful multi/manycore nodes where processors share a large memory space. These nodes will be the building blocks for systems that range from a single-node workstation up to systems approaching the exaflop regime. The node itself will consist of tens to hundreds of MIMD (multiple instruction, multiple data) processing units with SIMD (single instruction, multiple data) parallel instructions. Since a standard, affordable memory architecture will not be able to supply the bandwidth required by these cores, new memory organizations will be introduced. These new node architectures will represent a significant challenge to application developers.
Programming for Hybrid Multi/Manycore MPP Systems attempts to briefly describe the current state of the art in programming these systems, and proposes an approach for developing a single performance-portable application that can effectively utilize all of them. The book starts with a strategy for optimizing an application for multi/manycore architectures. It then looks at the three typical architectures, covering their advantages and disadvantages.
The next section of the book explores the other important component of the target: the compiler. The compiler ultimately converts the input language to executable code on the target, and the book explores how to make the compiler do what we want. The book then discusses gathering runtime statistics by running the application on the important problem sets previously discussed.
How best to utilize available memory bandwidth and vectorization is covered next, along with hybridization of a program. The last part of the book presents several major applications, and examines future hardware advancements and how the application developer may prepare for them.
Table of Contents
CHAPTER 1 Introduction
CHAPTER 2 Determining an Exaflop Strategy
CHAPTER 3 Target Hybrid Multi/Manycore System
CHAPTER 4 How Compilers Optimize Programs
CHAPTER 5 Gathering Runtime Statistics for Optimizing
CHAPTER 6 Utilization of Available Memory Bandwidth
CHAPTER 7 Vectorization
CHAPTER 8 Hybridization of an Application
CHAPTER 9 Porting Entire Applications
CHAPTER 10 Future Hardware Advancements
APPENDIX A Supercomputer Cache Architectures
APPENDIX B The Translation Look-Aside Buffer
APPENDIX C Command Line Options and Compiler Directives
APPENDIX D Previously Used Optimizations
APPENDIX E I/O Optimization
APPENDIX F Terminology
APPENDIX G 12-Step Process