A methodology correlating code optimizations with data memory accesses, execution time and energy consumption

Vasilios Kelefouras; Karim Djemame

doi:10.1007/s11227-019-02880-z


2019

Abstract

The proliferation of data and electronic devices has put software with low execution time and low energy consumption in the spotlight. The key to optimizing software is the correct choice, order and parameters of optimization transformations, a problem that has remained open in compilation research for decades for several reasons. First, most transformations are interdependent, so addressing them separately is not effective. Second, it is very hard to couple the transformation parameters to the processor architecture (e.g., cache size) and algorithm characteristics (e.g., data reuse); therefore, compiler designers and researchers either do not take them into account at all or do so only partly. Third, the exploration space, i.e., the set of all optimization configurations that have to be explored, is huge, so searching it is impractical. In this paper, the above problems are addressed for data-dominant affine loop kernels. A novel methodology is presented that reduces the exploration space of six code optimizations by many orders of magnitude. The objective can be execution time (ET), energy consumption (E) or the number of L1, L2 and main memory accesses. The exploration space is reduced in two phases: first, by applying a novel register blocking algorithm and a novel loop tiling algorithm and, second, by computing the maximum and minimum ET/E values for each optimization set. The proposed methodology has been evaluated on both embedded and general-purpose CPUs and on seven well-known algorithms, achieving high memory access, speedup and energy consumption gain values (from 1.17 up to 40) over the gcc compiler, hand-written optimized code and Polly. The exploration space from which the near-optimum parameters are selected is reduced by 17 up to 30 orders of magnitude.
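To make the terms in the abstract concrete, the following is a minimal sketch of generic loop tiling on a matrix multiplication kernel, together with a naive count of tile-size configurations. It is not the paper's register blocking or loop tiling algorithm: the function names (`matmul_naive`, `matmul_tiled`, `count_tile_configs`), the single shared `tile` parameter and the choice of kernel are assumptions made only for illustration.

```python
def matmul_naive(a, b, n):
    """Reference n x n matrix multiplication with the plain i, j, k loop nest."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, n, tile):
    """Same computation with all three loops tiled by `tile` (hypothetical parameter).

    Tiling changes only the order in which iterations are visited, so that a
    small block of each matrix is reused before moving on. Choosing `tile`
    to fit a given cache level is the kind of parameter the abstract refers to.
    """
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                # min() handles the partial tiles when tile does not divide n.
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        acc = c[i][j]  # keep the running sum in a local variable
                        for k in range(kk, min(kk + tile, n)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c


def count_tile_configs(n, loops=3):
    """Naive count of tile-size choices if each loop may use any size in 1..n.

    This is an illustrative upper bound for one transformation only, not the
    exploration-space figure reported in the paper.
    """
    return n ** loops
```

Even under this simplified model, the number of candidate tile sizes grows as a power of the loop bound, and combining it with further transformations and their orderings multiplies the count again, which is why exhaustive search is impractical.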

Reference Key	kelefouras2019thea
Authors	Vasilios Kelefouras; Karim Djemame
Journal	The Journal of Supercomputing
Year	2019
DOI	doi:10.1007/s11227-019-02880-z
URL	https://doi.org/10.1007%2Fs11227-019-02880-z
Keywords	computer science, general; programming languages, compilers, interpreters; processor architectures
