fupermod: Functional Performance Models of heterogeneous processors
Classes
struct fupermod_precision
struct fupermod_point
struct fupermod_benchmark
Defines
#define FUPERMOD_MAX_TIME 1e9
Typedefs
typedef void(* fupermod_benchmark_free)(fupermod_benchmark *benchmark)
Functions
fupermod_precision fupermod_precision_defaults()
double fupermod_ci(double cl, int reps, double *t)
int fupermod_point_read(FILE *stream, fupermod_point *point)
int fupermod_point_write(FILE *stream, fupermod_point *point, fupermod_complexity complexity)
int fupermod_point_node(fupermod_point *point, fupermod_point *point_node, MPI_Comm comm_intra, int root_intra)
fupermod_benchmark * fupermod_benchmark_basic_alloc(fupermod_kernel *kernel, fupermod_process_conf conf)
void fupermod_benchmark_basic_free(fupermod_benchmark *benchmark)
fupermod_benchmark * fupermod_benchmark_emulate_alloc(fupermod_kernel *kernel, struct fupermod_model *model)
void fupermod_benchmark_emulate_free(fupermod_benchmark *benchmark)
Variables
int fupermod_verbose
Detailed Description
This module provides an API for benchmarking. The key data structure is fupermod_benchmark: it encapsulates the kernel to benchmark and provides the measurement method. Two methods are implemented:
- basic parallel benchmark (use fupermod_benchmark_basic_alloc and fupermod_benchmark_basic_free)
- emulation via provided FPMs (use fupermod_benchmark_emulate_alloc and fupermod_benchmark_emulate_free)
Define Documentation
#define FUPERMOD_MAX_TIME 1e9
Maximum time in seconds (1e9 s, roughly 30 years)
Typedef Documentation
typedef void(* fupermod_benchmark_free)(fupermod_benchmark *benchmark)
Deallocates the benchmark
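A function-pointer typedef of this shape is commonly used so that each benchmark method can supply its own deallocator. The following is a minimal sketch of that pattern, not the library's implementation: the struct demo_benchmark, its fields, the counter demo_live and all demo_* functions are hypothetical names invented for illustration, and the real layout of fupermod_benchmark is not shown in this page.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not part of fupermod. */
typedef struct demo_benchmark demo_benchmark;
typedef void (*demo_benchmark_free)(demo_benchmark *benchmark);

struct demo_benchmark {
    void *state;                  /* method-specific data */
    demo_benchmark_free free_fn;  /* deallocator matching the allocator */
};

int demo_live = 0; /* number of live benchmarks, for illustration only */

/* Deallocator paired with demo_basic_alloc. */
void demo_basic_free(demo_benchmark *benchmark)
{
    free(benchmark->state);
    free(benchmark);
    demo_live--;
}

/* Allocates a benchmark and records which deallocator must release it. */
demo_benchmark *demo_basic_alloc(int reps)
{
    demo_benchmark *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    int *state = malloc(sizeof *state);
    if (state == NULL) {
        free(b);
        return NULL;
    }
    *state = reps;
    b->state = state;
    b->free_fn = demo_basic_free;
    demo_live++;
    return b;
}

/* Releases any benchmark without knowing which method created it. */
void demo_release(demo_benchmark *benchmark)
{
    if (benchmark != NULL)
        benchmark->free_fn(benchmark);
}
```

With this layout, calling code can hold benchmarks created by different allocators and release each one through demo_release, so an allocator is never paired with the wrong deallocator.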
Function Documentation
fupermod_precision fupermod_precision_defaults()
Returns default precision settings
double fupermod_ci(double cl, int reps, double *t)
Returns a confidence interval that contains the average execution time with probability cl.
- Note:
- We assume that the execution times form an independent sample from a normally distributed population, and use the t distribution to estimate the confidence interval.
- Parameters:
  - cl: confidence level
  - reps: number of measurements (should be > 1)
  - t: array of reps measurement results
- Returns:
- confidence interval
int fupermod_point_read(FILE *stream, fupermod_point *point)
Reads data point from file
int fupermod_point_write(FILE *stream, fupermod_point *point, fupermod_complexity complexity)
Writes data point to file
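The read and write functions form a pair that moves a data point through a FILE stream. A minimal round-trip sketch of that idea follows. Everything in it is an assumption made for illustration: the struct demo_point and its fields, the tab-separated text format, and the convention of returning 0 on success and -1 on failure are invented here, and the actual fupermod_point layout, file format and return codes are not documented on this page.

```c
#include <stdio.h>

/* Hypothetical data point; field names are invented for this sketch. */
typedef struct {
    long d;     /* problem size */
    double t;   /* measured time */
    int reps;   /* number of repetitions */
    double ci;  /* confidence interval */
} demo_point;

/* Writes one point as a tab-separated text line.
 * Returns 0 on success, -1 on failure (assumed convention). */
int demo_point_write(FILE *stream, const demo_point *point)
{
    /* %.17g keeps enough digits for a double to survive the round trip. */
    int n = fprintf(stream, "%ld\t%.17g\t%d\t%.17g\n",
                    point->d, point->t, point->reps, point->ci);
    return n < 0 ? -1 : 0;
}

/* Reads one point written by demo_point_write.
 * Returns 0 on success, -1 on end of file or malformed input. */
int demo_point_read(FILE *stream, demo_point *point)
{
    int n = fscanf(stream, "%ld %lf %d %lf",
                   &point->d, &point->t, &point->reps, &point->ci);
    return n == 4 ? 0 : -1;
}
```

A caller would typically loop on the read function until it reports failure, which in this sketch also covers reaching the end of the stream.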
int fupermod_point_node(fupermod_point *point, fupermod_point *point_node, MPI_Comm comm_intra, int root_intra)
Calculates the data point of a node from the data points of the processes running on the node
- Parameters:
  - point: data point of a process
  - point_node: data point of a multiprocessor/multicore (significant only at root of intra communicator)
  - comm_intra: intra communicator
  - root_intra: root of intra communicator
fupermod_benchmark* fupermod_benchmark_basic_alloc(fupermod_kernel *kernel, fupermod_process_conf conf)
Allocates the benchmark that executes the kernel at all processes. Kernels are synchronised and executed the same number of times. The result, the data point, is returned separately for each process.
- Parameters:
  - kernel: kernel to benchmark
  - conf: describes the type of device the benchmark is executed on (CPU or GPU)
void fupermod_benchmark_basic_free(fupermod_benchmark *benchmark)
Deallocates the basic benchmark
fupermod_benchmark* fupermod_benchmark_emulate_alloc(fupermod_kernel *kernel, struct fupermod_model *model)
Allocates the benchmark that looks up the existing models at all processes
void fupermod_benchmark_emulate_free(fupermod_benchmark *benchmark)
Deallocates the emulation benchmark
Variable Documentation
int fupermod_verbose
Verbosity level (default: 0, none)