Cost Functions

For maximum flexibility, pyATF accepts any user-implemented cost function: any Python function that takes a configuration of tuning parameters as input and returns a value of type pyatf.tuning_data.Cost (currently defined as float). pyATF interprets the cost function’s return value as the cost to be minimized during the auto-tuning process.

A cost function can raise pyatf.tuning_data.CostFunctionError if the configuration to measure is invalid (e.g., because running it crashes due to excessive memory usage); the search technique then penalizes the configuration (e.g., with a penalty cost). If any other error is raised, pyATF aborts the tuning run.
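The contract described above can be illustrated with a minimal sketch of a self-implemented cost function. The parameter name TILE, the limit MAX_TILE_ELEMENTS, and the toy workload are hypothetical; the sketch also assumes that a configuration can be read like a dictionary keyed by tuning-parameter name. A stand-in error class is defined so that the sketch runs even without pyATF installed.

```python
import time

try:
    from pyatf.tuning_data import CostFunctionError
except ImportError:
    # Stand-in so that the sketch runs without pyATF installed.
    class CostFunctionError(Exception):
        pass

MAX_TILE_ELEMENTS = 4096  # hypothetical resource limit


def tiled_sum_cost(configuration):
    """Return the runtime (in seconds) of a toy tiled summation."""
    # Assumption: the configuration is dict-like, keyed by parameter name.
    tile = configuration["TILE"]
    if tile <= 0 or tile > MAX_TILE_ELEMENTS:
        # Signal an invalid configuration instead of aborting the tuning run.
        raise CostFunctionError(f"invalid tile size: {tile}")

    data = list(range(100_000))
    start = time.perf_counter()
    total = 0
    for i in range(0, len(data), tile):
        total += sum(data[i:i + tile])
    # The returned float is the cost that is minimized.
    return time.perf_counter() - start
```

Any exception other than CostFunctionError raised inside such a function would, per the rule above, abort the tuning run, so only genuinely invalid configurations should be mapped to CostFunctionError.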

Pre-Implemented Cost Functions

pyATF provides the following pre-implemented cost functions:

class pyatf.cost_functions.generic.CostFunction
CostFunction(run_command: str)
Parameters:

run_command – Run command (executed via subprocess.run).

compile_command(compile_command: str)
Parameters:

compile_command – Compile command (executed via subprocess.run).

cost_file(costfile: str)
Parameters:

costfile – Path to cost file containing cost as string (must be convertible to pyatf.tuning_data.Cost).
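To make the interplay of run command and cost file concrete, the following sketch mimics the described behavior using only the standard library. It is not pyATF's implementation: the helper make_file_cost_function is hypothetical, passing tuning parameters to the program as environment variables is an assumption, and the demo assumes a POSIX shell.

```python
import os
import shlex
import subprocess
import sys
import tempfile


def make_file_cost_function(run_command, cost_file):
    """Hypothetical helper: build a cost function around a program and a cost file."""
    def cost_function(configuration):
        # Assumption: tuning parameters reach the program as environment variables.
        env = dict(os.environ)
        env.update({name: str(value) for name, value in configuration.items()})
        # check=True raises CalledProcessError if the program exits with an error.
        subprocess.run(run_command, shell=True, check=True, env=env)
        # The cost file contains the cost as a string convertible to float.
        with open(cost_file) as f:
            return float(f.read().strip())
    return cost_function


# Demo: a tiny program that writes a made-up cost derived from TILE.
workdir = tempfile.mkdtemp()
script = os.path.join(workdir, "program.py")
cost_path = os.path.join(workdir, "cost.txt")
with open(script, "w") as f:
    f.write(
        "import os, sys\n"
        "tile = int(os.environ['TILE'])\n"
        "with open(sys.argv[1], 'w') as out:\n"
        "    out.write(str(100.0 / tile))\n"
    )

run_command = " ".join(shlex.quote(p) for p in (sys.executable, script, cost_path))
cost_function = make_file_cost_function(run_command, cost_path)
demo_cost = cost_function({"TILE": 4})
```

Writing the cost to a file keeps the measured program independent of Python: it can be written in any language, as long as it stores a float-convertible string at the agreed path.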

class pyatf.cost_functions.opencl.CostFunction
CostFunction(kernel: pyatf.cost_functions.opencl.Kernel)

Initializes cost function with OpenCL kernel to tune.

platform_id(platform_id: int)

Target OpenCL platform id.

device_id(device_id: int)

Target OpenCL device id.

kernel_args(*kernel_args: numpy.ndarray | numpy.generic)

Kernel’s arguments (specified as instances of numpy.ndarray and numpy.generic).

global_size(gs_0: int | Callable[..., int], gs_1: int | Callable[..., int] = 1, gs_2: int | Callable[..., int] = 1)

Kernel’s 3-dimensional OpenCL global size as arithmetic expressions that may contain tuning parameters.

local_size(ls_0: int | Callable[..., int], ls_1: int | Callable[..., int] = 1, ls_2: int | Callable[..., int] = 1)

Kernel’s 3-dimensional OpenCL local size as arithmetic expressions that may contain tuning parameters.
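The signatures above accept either plain integers or callables returning integers. One plausible reading, sketched below, is that a callable names the tuning parameters it depends on as its arguments and is evaluated once per configuration. The helper resolve_size, the parameter names WPT and LS, and the input size N are hypothetical; how pyATF binds parameters internally is an assumption here.

```python
import inspect


def resolve_size(expr, configuration):
    """Evaluate one size component given as an int or as a callable (hypothetical helper)."""
    if callable(expr):
        # Look up each argument name of the callable in the configuration.
        names = inspect.signature(expr).parameters
        return int(expr(**{name: configuration[name] for name in names}))
    return int(expr)


N = 1024  # hypothetical input size

# Size components as arithmetic expressions over tuning parameters.
global_size = (lambda WPT: N // WPT, 1, 1)
local_size = (lambda LS: LS, 1, 1)

configuration = {"WPT": 4, "LS": 32}
gs = tuple(resolve_size(e, configuration) for e in global_size)
ls = tuple(resolve_size(e, configuration) for e in local_size)
```

With this configuration, each of the N // WPT work-items in dimension 0 would process WPT elements, and the unspecified dimensions keep their default of 1.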

check_result(index: int, gold_data_or_callable: numpy.ndarray | numpy.generic | Callable, comparator=equality)

Check result for scalar/buffer at position index against gold_data_or_callable.

Parameters:
  • gold_data_or_callable – one of: i) numpy.ndarray, ii) numpy.generic, or iii) a callable that computes a gold scalar/buffer from the kernel’s scalar/buffer arguments (of type numpy.generic/numpy.ndarray).

  • comparator – callable used for comparing kernel values against gold values; it takes two values as input (kernel value and gold value) and returns True iff the values are considered the same.
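As an illustration of the two callables, the sketch below shows a gold function for a saxpy-like kernel and a tolerance-based comparator for floating-point results. Both names are hypothetical. The sketch assumes that the gold callable receives the kernel arguments in their declared order and that the comparator is invoked on pairs of scalar values.

```python
import math


def saxpy_gold(N, a, x, y):
    """Compute the expected result a * x + y from the kernel's arguments."""
    # Elementwise for NumPy arrays, plain arithmetic for scalars.
    return a * x + y


def approx_equal(kernel_value, gold_value, rel_tol=1e-5, abs_tol=1e-8):
    """Return True iff both values agree within the given tolerances."""
    return math.isclose(kernel_value, gold_value, rel_tol=rel_tol, abs_tol=abs_tol)
```

A tolerance-based comparator is often preferable to strict equality for floating-point kernels, since different tuning-parameter values can change the order of operations and therefore the rounding.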

warmups(warmups: int)

Number of warmups for each kernel run.

evaluations(evaluations: int)

Number of evaluations for each kernel run.

silent(silent: bool)

Silences log messages.

class pyatf.cost_functions.cuda.CostFunction
CostFunction(kernel: pyatf.cost_functions.cuda.Kernel)

Initializes cost function with CUDA kernel to tune.

device_id(device_id: int)

Target CUDA device id.

kernel_args(*kernel_args: numpy.ndarray | numpy.generic)

Kernel’s arguments (specified as instances of numpy.ndarray and numpy.generic).

grid_dim(x: int | Callable[..., int], y: int | Callable[..., int] = 1, z: int | Callable[..., int] = 1)

Kernel’s 3-dimensional CUDA grid dimension as arithmetic expressions that may contain tuning parameters.

block_dim(x: int | Callable[..., int], y: int | Callable[..., int] = 1, z: int | Callable[..., int] = 1)

Kernel’s 3-dimensional CUDA block dimension as arithmetic expressions that may contain tuning parameters.
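In CUDA, the grid dimension counts thread blocks rather than individual threads, so an expression for grid_dim typically divides the problem size by the work done per block. The arithmetic is sketched below for hypothetical tuning parameters WPT (elements per thread) and LS (threads per block) and a hypothetical input size N; that such callables are bound to tuning parameters by argument name is an assumption.

```python
N = 1000  # hypothetical input size


def block_dim_x(LS):
    """Threads per block in dimension x."""
    return LS


def grid_dim_x(WPT, LS):
    """Blocks in dimension x, rounded up so that all N elements are covered."""
    # Ceiling division without floating-point arithmetic.
    return -(-N // (WPT * LS))


blocks = grid_dim_x(WPT=4, LS=32)
threads = block_dim_x(LS=32)
covered = blocks * threads * 4  # elements covered with WPT = 4
```

Because the division is rounded up, the launched threads may cover more than N elements, in which case the kernel itself needs a bounds check.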

check_result(index: int, gold_data_or_callable: numpy.ndarray | numpy.generic | Callable, comparator=equality)

Check result for scalar/buffer at position index against gold_data_or_callable.

Parameters:
  • gold_data_or_callable – one of: i) numpy.ndarray, ii) numpy.generic, or iii) a callable that computes a gold scalar/buffer from the kernel’s scalar/buffer arguments (of type numpy.generic/numpy.ndarray).

  • comparator – callable used for comparing kernel values against gold values; it takes two values as input (kernel value and gold value) and returns True iff the values are considered the same.

warmups(warmups: int)

Number of warmups for each kernel run.

evaluations(evaluations: int)

Number of evaluations for each kernel run.

silent(silent: bool)

Silences log messages.

Misc

class pyatf.cost_functions.opencl.Kernel
Kernel(source: str, name: str = 'func', flags: Iterable[str] = None)

OpenCL kernel wrapper.

Parameters:
  • source – OpenCL source code as string; the function pyatf.cost_functions.opencl.path(path: str) can be used to read the source code from a file

  • name – kernel name

  • flags – kernel flags

class pyatf.cost_functions.cuda.Kernel
Kernel(source: str, name: str = 'func', flags: Iterable[str] = None)

CUDA kernel wrapper.

Parameters:
  • source – CUDA source code as string; the function pyatf.cost_functions.cuda.path(path: str) can be used to read the source code from a file

  • name – kernel name

  • flags – kernel flags