A key concern of the Exascale Computing Project is the high cost of developing and maintaining performance-portable applications for diverse exascale architectures, including many-core CPUs and GPUs. To address this issue, we are developing an approach that separates a high-level C/C++/Fortran implementation from architecture-specific implementation, optimization, and tuning.
This approach will enable exascale application developers to express and maintain a single, portable implementation of their computation – valid code that can be compiled and run with standard tools. An autotuning compiler and search framework will then transform this baseline code into a collection of highly optimized, architecture-specific implementations. Autotuning will thus reduce the need for extensive manual tuning.
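To make the separation concrete, the following is a minimal sketch of the general idea, not the project's compiler or search framework. It assumes a matrix-multiply kernel as the portable baseline, treats loop tiling as the only transformation, and uses a plain exhaustive search over candidate tile sizes. The names `matmul_baseline`, `matmul_tiled`, and `autotune_tile` are hypothetical and chosen for illustration only. In a real autotuning compiler the tiled variant would be generated from the baseline rather than written by hand.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <limits>
#include <vector>

// Portable baseline: C = A * B for n x n row-major matrices.
// This is the single implementation the developer would maintain.
void matmul_baseline(const std::vector<double>& A, const std::vector<double>& B,
                     std::vector<double>& C, std::size_t n) {
    std::fill(C.begin(), C.end(), 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
}

// Hand-written stand-in for a generated variant: the same computation,
// loop-tiled with a tunable tile size (assumed to be positive).
void matmul_tiled(const std::vector<double>& A, const std::vector<double>& B,
                  std::vector<double>& C, std::size_t n, std::size_t tile) {
    assert(tile > 0);
    std::fill(C.begin(), C.end(), 0.0);
    for (std::size_t ii = 0; ii < n; ii += tile)
        for (std::size_t kk = 0; kk < n; kk += tile)
            for (std::size_t jj = 0; jj < n; jj += tile)
                for (std::size_t i = ii; i < std::min(ii + tile, n); ++i)
                    for (std::size_t k = kk; k < std::min(kk + tile, n); ++k)
                        for (std::size_t j = jj; j < std::min(jj + tile, n); ++j)
                            C[i * n + j] += A[i * n + k] * B[k * n + j];
}

// Exhaustive search: time each candidate tile size once on this machine
// and return the fastest. Returns 0 if no positive candidate is given.
std::size_t autotune_tile(std::size_t n, const std::vector<std::size_t>& candidates) {
    std::vector<double> A(n * n, 1.0), B(n * n, 2.0), C(n * n, 0.0);
    std::size_t best_tile = 0;
    double best_seconds = std::numeric_limits<double>::infinity();
    for (std::size_t tile : candidates) {
        if (tile == 0) continue;  // skip invalid tile sizes
        auto start = std::chrono::steady_clock::now();
        matmul_tiled(A, B, C, n, tile);
        auto stop = std::chrono::steady_clock::now();
        double seconds = std::chrono::duration<double>(stop - start).count();
        if (seconds < best_seconds) {
            best_seconds = seconds;
            best_tile = tile;
        }
    }
    return best_tile;
}
```

Calling `autotune_tile(512, {8, 16, 32, 64})` on a given machine would return whichever tile size ran fastest there, and the result may differ between architectures. A single timing per candidate is noisy, so a practical search would repeat measurements and explore a much larger space of transformations.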