Experimental Results

The experimental results presented below are described in detail here.

Preliminary Remark

ATF has been used successfully in the literature to auto-tune applications from several important domains, summarized in the following table:

Applications Auto-Tuned via ATF

Auto-Tuning Efficiency

ATF is compared to state-of-the-art approaches, on NVIDIA GPU and Intel CPU, for the following application case studies:

  1. CONV: Convolution,
  2. GEMM: Matrix Multiplication,
  3. CCSD(T): Coupled Cluster,
  4. PRL: Probabilistic Record Linkage


Generating, Storing, and Exploring Constrained Search Spaces

ATF is analyzed for each individual phase of the auto-tuning process: generating, storing, and exploring constrained search spaces.

Generating Constrained Search Spaces

Search space generation time (lower is better) of ATF compared to a Constraint Solver (CS) and CLTune. Here, we use the following abbreviations: s for seconds; h for hours; m for months; c for centuries.
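To illustrate why the generation strategy can matter so much, the following is a minimal sketch and not the actual code of ATF, the Constraint Solver, or CLTune. It contrasts enumerating the full Cartesian product and filtering afterwards with checking each constraint as soon as its parameter is chosen. The parameters `tile` and `local`, the input size `n`, and the divisibility constraints are hypothetical and chosen only for illustration.

```python
from itertools import product


def generate_then_filter(n):
    """Enumerate every (tile, local) pair, then discard the invalid ones."""
    visited = 0
    valid = []
    for tile, local in product(range(1, n + 1), repeat=2):
        visited += 1
        # Both constraints are checked only after the full pair is built.
        if n % tile == 0 and tile % local == 0:
            valid.append((tile, local))
    return valid, visited


def generate_with_constraints(n):
    """Check each constraint as soon as the parameter it restricts is chosen."""
    visited = 0
    valid = []
    for tile in range(1, n + 1):
        visited += 1
        if n % tile != 0:
            continue  # prune: no value of `local` is tried for this tile
        for local in range(1, n + 1):
            visited += 1
            if tile % local == 0:
                valid.append((tile, local))
    return valid, visited


if __name__ == "__main__":
    for name, fn in [("filter", generate_then_filter),
                     ("constrained", generate_with_constraints)]:
        configs, visited = fn(16)
        print(f"{name}: {len(configs)} valid configurations, {visited} candidates visited")
```

Both functions return the same valid configurations. They differ only in how many candidates they visit, and that gap widens as parameters and constraints are added.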


Storing Constrained Search Spaces

Search space memory footprint (lower is better) of ATF compared to CLTune.
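As a rough illustration of how the storage layout can affect the footprint, the sketch below compares a flat list of configurations with a prefix tree in which configurations that share leading parameter values share nodes. It is an assumption-laden toy example, not the data structure used by ATF or CLTune. The parameters `tile`, `local`, and `unroll` are hypothetical, and the count of stored values is only a proxy for memory use.

```python
def build_prefix_tree(configs):
    """Store configurations as nested dicts so that shared prefixes are stored once."""
    root = {}
    for config in configs:
        node = root
        for value in config:
            node = node.setdefault(value, {})
    return root


def count_nodes(tree):
    """Number of parameter values stored in the prefix tree."""
    return sum(1 + count_nodes(child) for child in tree.values())


def flat_value_count(configs):
    """Number of parameter values stored in a flat list of tuples."""
    return sum(len(config) for config in configs)


# Hypothetical constrained space: `local` must divide `tile`.
configs = [
    (tile, local, unroll)
    for tile in (1, 2, 4, 8, 16)
    for local in range(1, tile + 1) if tile % local == 0
    for unroll in (1, 2)
]

if __name__ == "__main__":
    tree = build_prefix_tree(configs)
    print("flat list stores", flat_value_count(configs), "values")
    print("prefix tree stores", count_nodes(tree), "values")
```

The saving comes entirely from shared prefixes, so it depends on how many configurations agree on their leading parameter values.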


Exploring Constrained Search Spaces

Search space exploration efficiency (lower is better) of ATF compared to CLTune.


ATF for CLTune’s Target Application Class

ATF vs. CLTune for CLTune’s own running example, 2D Convolution (described in detail here).

ATF vs CLTune

ATF for OpenTuner’s Target Application Class

ATF vs. OpenTuner for OpenTuner’s own running example, GCC Flags (described in detail here).

ATF vs OpenTuner