Experimental Results
The experimental results summarized below are described in detail here.
Preliminary Remark
ATF has been used successfully in the literature to auto-tune applications from several important domains, summarized in the following table:
Auto-Tuning Efficiency
ATF compared to state-of-the-art approaches, on NVIDIA GPU and Intel CPU, for application case studies:
CONV: Convolution, GEMM: Matrix Multiplication, CCSD(T): Coupled Cluster, PRL: Probabilistic Record Linkage
Generating & Storing & Exploring Constrained Search Spaces
ATF analyzed for each individual phase of the auto-tuning process.
Generating Constrained Search Spaces
Search space generation time (lower is better) of ATF compared to a Constraint Solver (CS) and CLTune.
Here, we use the following abbreviations: s for seconds; h for hours; m for months; c for centuries.
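To give a feel for why generation time can differ by orders of magnitude between approaches, here is a minimal sketch contrasting two ways of building a constrained search space. It is an illustration under stated assumptions, not ATF's, CLTune's, or any constraint solver's actual implementation: the two tuning parameters (`tile`, `local`), their divisibility constraints, and all function names are hypothetical and chosen only for this example.

```python
from itertools import product


def divisors(n):
    """Return all positive divisors of n in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def generate_then_filter(n):
    """Enumerate the full Cartesian product of both parameter ranges,
    then keep only configurations that satisfy the constraints.

    Hypothetical constraints: tile divides n, and local divides tile.
    Returns the valid configurations and the number of candidates examined.
    """
    examined = 0
    valid = []
    for tile, local in product(range(1, n + 1), repeat=2):
        examined += 1
        if n % tile == 0 and tile % local == 0:
            valid.append((tile, local))
    return valid, examined


def generate_constrained(n):
    """Build the same space parameter by parameter, so that each
    parameter's range already respects the constraints on earlier ones.

    Only valid configurations are ever constructed.
    """
    examined = 0
    valid = []
    for tile in divisors(n):            # tile must divide n
        for local in divisors(tile):    # local must divide tile
            examined += 1
            valid.append((tile, local))
    return valid, examined


if __name__ == "__main__":
    full, n_full = generate_then_filter(64)
    cons, n_cons = generate_constrained(64)
    print(f"generate-then-filter examined {n_full} candidates")
    print(f"constrained generation examined {n_cons} candidates")
    print(f"same search space: {sorted(full) == sorted(cons)}")
```

For `n = 64`, both functions yield the same 28 valid configurations, but the first examines all 4096 candidate pairs while the second constructs only the 28 valid ones. With more parameters and larger ranges, the unconstrained product grows multiplicatively, which is one plausible reason generation times in comparisons like the one above can range from seconds to far longer.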
Storing Constrained Search Spaces
Search space memory footprint (lower is better) of ATF compared to CLTune.
Exploring Constrained Search Spaces
Search space exploration efficiency (lower is better) of ATF compared to CLTune.
ATF for CLTune’s Target Application Class
ATF vs. CLTune on CLTune’s own running example, 2D Convolution (described in detail here).
ATF for OpenTuner’s Target Application Class
ATF vs. OpenTuner on OpenTuner’s own running example, GCC Flags (described in detail here).