Cost Functions
For flexibility, pyATF accepts any user-implemented cost function. A cost function is any Python function that takes a configuration of tuning parameters as input and returns a value of type pyatf.tuning_data.Cost (currently defined as float). pyATF interprets the return value as the cost to be minimized during the auto-tuning process.
A cost function can raise the error pyatf.tuning_data.CostFunctionError if the configuration to measure is invalid (e.g., because running it crashes due to excessive memory usage). The search technique then penalizes that configuration (e.g., with a penalty cost). If any other error is raised, pyATF aborts the tuning run.
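As an illustration of the contract described above, the following is a minimal sketch of a self-implemented cost function. It assumes that the configuration is passed as a mapping from tuning parameter names to values; the parameter name TILE, the memory budget, and the cost formula are hypothetical. The fallback class exists only so that the sketch also runs where pyATF is not installed.

```python
try:
    # Import path as named in the text above.
    from pyatf.tuning_data import CostFunctionError
except ImportError:
    # Stand-in so the sketch runs without pyATF installed.
    class CostFunctionError(Exception):
        pass

MAX_BYTES = 1 << 20  # hypothetical memory budget (1 MiB)


def tile_cost(configuration):
    """Hypothetical cost model for a square tile of 4-byte elements."""
    # Assumption: the configuration maps parameter names to values.
    tile = configuration["TILE"]
    if tile * tile * 4 > MAX_BYTES:
        # Invalid configuration: signal it so it can be penalized.
        raise CostFunctionError(f"TILE={tile} exceeds the memory budget")
    # Made-up cost: number of tiles plus a per-tile overhead term.
    return float(1024 // tile + tile / 64)


cost = tile_cost({"TILE": 64})
```

Any other exception type raised inside tile_cost (for example a KeyError from a misspelled parameter name) would fall under the "any other error" case and abort the tuning run.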
Pre-Implemented Cost Functions
pyATF provides the following pre-implemented cost functions:
- class pyatf.cost_functions.generic.CostFunction
- CostFunction(run_command: str)
- Parameters:
run_command – Run command (executed via subprocess.run).
- compile_command(compile_command: str)
- Parameters:
compile_command – Compile command (executed via subprocess.run).
- cost_file(costfile: str)
- Parameters:
costfile – Path to the cost file, which contains the cost as a string (must be convertible to pyatf.tuning_data.Cost).
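The run-command and cost-file mechanism described above can be pictured with the following sketch. It is not pyATF's implementation: the helper measure is hypothetical, and how pyATF passes the configuration to the program is not covered here. The sketch only shows a command being executed via subprocess.run and the cost then being read from a file and converted to float.

```python
import shlex
import subprocess
import sys
import tempfile
from pathlib import Path


def measure(run_command: str, cost_file: str) -> float:
    """Hypothetical helper: run a program, then read the cost it wrote."""
    # check=True raises CalledProcessError if the program exits with an error.
    subprocess.run(shlex.split(run_command), check=True)
    # The file content must be convertible to float.
    return float(Path(cost_file).read_text().strip())


with tempfile.TemporaryDirectory() as tmp:
    cost_path = Path(tmp) / "cost.txt"
    # Hypothetical program under test: it only writes a fixed cost.
    script = f"open({str(cost_path)!r}, 'w').write('12.5')"
    command = f"{shlex.quote(sys.executable)} -c {shlex.quote(script)}"
    cost = measure(command, str(cost_path))
```

In a real setup the program would measure something meaningful (such as its own runtime) and write that number to the cost file.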
- class pyatf.cost_functions.opencl.CostFunction
- CostFunction(kernel: pyatf.cost_functions.opencl.Kernel)
Initializes cost function with OpenCL kernel to tune.
- platform_id(platform_id: int)
Target OpenCL platform id.
- device_id(device_id: int)
Target OpenCL device id.
- kernel_args(*kernel_args: numpy.ndarray | numpy.generic)
Kernel’s arguments (specified as instances of numpy.ndarray and numpy.generic).
- global_size(gs_0: int | Callable[..., int], gs_1: int | Callable[..., int] = 1, gs_2: int | Callable[..., int] = 1)
Kernel’s 3-dimensional OpenCL global size as arithmetic expressions that may contain tuning parameters.
- local_size(ls_0: int | Callable[..., int], ls_1: int | Callable[..., int] = 1, ls_2: int | Callable[..., int] = 1)
Kernel’s 3-dimensional OpenCL local size as arithmetic expressions that may contain tuning parameters.
- check_result(index: int, gold_data_or_callable: numpy.ndarray | numpy.generic | Callable, comparator=equality)
Checks the result for the scalar/buffer at position index against gold_data_or_callable.
- Parameters:
gold_data_or_callable – one of: i) numpy.ndarray, ii) numpy.generic, or iii) a callable that uses the kernel’s scalar/buffer arguments (of type numpy.generic/numpy.ndarray) to compute a gold scalar/buffer.
comparator – used for comparing kernel values against gold values; a callable that takes two values as input (kernel value and gold value) and returns True if and only if the values are considered the same.
- warmups(warmups: int)
Number of warmups for each kernel run.
- evaluations(evaluations: int)
Number of evaluations for each kernel run.
- silent(silent: bool)
Silences log messages.
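The int | Callable[..., int] type of the global_size and local_size arguments above suggests that a size is either a constant or a function of tuning parameters. The sketch below shows one way such an expression could be evaluated for a given configuration. It assumes that a callable's argument names match tuning parameter names; the helper resolve, the input size N, and the parameters WPT and LS are hypothetical and do not describe pyATF's internals.

```python
import inspect
from typing import Callable, Mapping, Union

SizeExpr = Union[int, Callable[..., int]]


def resolve(expr: SizeExpr, configuration: Mapping[str, int]) -> int:
    """Hypothetical helper: evaluate a size expression for one configuration."""
    if callable(expr):
        # Assumption: argument names are tuning parameter names.
        names = inspect.signature(expr).parameters
        return int(expr(**{name: configuration[name] for name in names}))
    return int(expr)  # plain constants are used as-is


N = 1024                       # hypothetical input size
config = {"WPT": 4, "LS": 32}  # hypothetical tuning parameters

global_size = resolve(lambda WPT: N // WPT, config)  # work-items in dimension 0
local_size = resolve(lambda LS: LS, config)          # work-group size in dimension 0
second_dim = resolve(1, config)                      # constant size
```

With this configuration, each of the 256 work-items would process WPT = 4 elements, and the work-items would be grouped into work-groups of 32.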
- class pyatf.cost_functions.cuda.CostFunction
- CostFunction(kernel: pyatf.cost_functions.cuda.Kernel)
Initializes cost function with CUDA kernel to tune.
- device_id(device_id: int)
Target CUDA device id.
- kernel_args(*kernel_args: numpy.ndarray | numpy.generic)
Kernel’s arguments (specified as instances of numpy.ndarray and numpy.generic).
- grid_dim(x: int | Callable[..., int], y: int | Callable[..., int] = 1, z: int | Callable[..., int] = 1)
Kernel’s 3-dimensional CUDA grid dimension as arithmetic expressions that may contain tuning parameters.
- block_dim(x: int | Callable[..., int], y: int | Callable[..., int] = 1, z: int | Callable[..., int] = 1)
Kernel’s 3-dimensional CUDA block dimension as arithmetic expressions that may contain tuning parameters.
- check_result(index: int, gold_data_or_callable: numpy.ndarray | numpy.generic | Callable, comparator=equality)
Checks the result for the scalar/buffer at position index against gold_data_or_callable.
- Parameters:
gold_data_or_callable – one of: i) numpy.ndarray, ii) numpy.generic, or iii) a callable that uses the kernel’s scalar/buffer arguments (of type numpy.generic/numpy.ndarray) to compute a gold scalar/buffer.
comparator – used for comparing kernel values against gold values; a callable that takes two values as input (kernel value and gold value) and returns True if and only if the values are considered the same.
- warmups(warmups: int)
Number of warmups for each kernel run.
- evaluations(evaluations: int)
Number of evaluations for each kernel run.
- silent(silent: bool)
Silences log messages.
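Exact equality is often too strict for floating-point kernel results, which is where the comparator argument of check_result becomes useful. The following sketch builds a tolerance-based comparator with the documented shape (two values in, a boolean out). The factory make_tolerance_comparator and its tolerances are hypothetical, and the sketch assumes the comparator receives scalar values; whether pyATF calls it per element or on whole buffers is not stated above.

```python
import math


def make_tolerance_comparator(rel_tol: float = 1e-6, abs_tol: float = 1e-9):
    """Hypothetical factory for a comparator with relative/absolute tolerance."""
    def comparator(kernel_value, gold_value) -> bool:
        # True if and only if both values agree within the tolerances.
        return math.isclose(kernel_value, gold_value,
                            rel_tol=rel_tol, abs_tol=abs_tol)
    return comparator


approx_equal = make_tolerance_comparator(rel_tol=1e-5)

same = approx_equal(0.1 + 0.2, 0.3)  # differ only by rounding error
different = approx_equal(1.0, 1.1)   # differ by far more than the tolerance
```

A comparator like approx_equal could then be passed as the comparator argument instead of the default equality.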
Misc
- class pyatf.cost_functions.opencl.Kernel
- Kernel(source: str, name: str = 'func', flags: Iterable[str] = None)
OpenCL kernel wrapper.
- Parameters:
source – OpenCL source code as string; the function pyatf.cost_functions.opencl.path(path: str) can be used to extract source code from a file
name – kernel name
flags – kernel flags
- class pyatf.cost_functions.cuda.Kernel
- Kernel(source: str, name: str = 'func', flags: Iterable[str] = None)
CUDA kernel wrapper.
- Parameters:
source – CUDA source code as string; the function pyatf.cost_functions.cuda.path(path: str) can be used to extract source code from a file
name – kernel name
flags – kernel flags