StarPU Handbook
The behavior of the StarPU library and tools may be tuned thanks to the following configure options.
Enable checking that spinlocks are taken and released properly.
Increase the verbosity of the debugging messages. This can be disabled at runtime by setting the environment variable STARPU_SILENT to any value.
$ STARPU_SILENT=1 ./vector_scal
Specify that tests and examples should be run on a smaller data set, i.e., allowing a faster execution time.
Enable some exhaustive checks which take a really long time.
Specify that hwloc should be used by StarPU. hwloc should be found by means of the pkg-config tool.
prefix Specify that hwloc should be used by StarPU. hwloc should be found in the directory specified by prefix.
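Before running configure, it can be useful to check whether pkg-config is able to locate hwloc at all. The following is a minimal sketch; hwloc_status is a hypothetical helper name, not something StarPU provides, and it assumes the pkg-config module is named hwloc.

```shell
#!/bin/sh
# hwloc_status is a hypothetical helper, not part of StarPU.
# It prints "found" when pkg-config is installed and knows a module
# named hwloc, and "missing" in every other case.
hwloc_status() {
    if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists hwloc; then
        echo "found"
    else
        echo "missing"
    fi
}

echo "hwloc via pkg-config: $(hwloc_status)"
```

If the helper reports "missing" although hwloc is installed, passing the installation directory explicitly as the prefix is the alternative described above.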
doxygen and latex (plus the packages latex-xcolor and texlive-latex-extra). Additionally, the script configure recognizes many variables, which can be listed by typing ./configure --help. For example, ./configure NVCCFLAGS="-arch sm_13" adds a flag for the compilation of CUDA kernels.
count Use at most count CPU cores. This information is then available as the macro STARPU_MAXCPUS.
Disable the use of CPUs of the machine. Only GPUs etc. will be used.
count Use at most count CUDA devices. This information is then available as the macro STARPU_MAXCUDADEVS.
Disable the use of CUDA, even if a valid CUDA installation was detected.
prefix Search for CUDA under prefix, which should notably contain the file include/cuda.h.
dir Search for CUDA headers under dir, which should notably contain the file cuda.h. This defaults to /include appended to the value given to --with-cuda-dir.
dir Search for CUDA libraries under dir, which should notably contain the CUDA shared libraries—e.g., libcuda.so. This defaults to /lib appended to the value given to --with-cuda-dir.
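The defaulting rule for the CUDA header and library directories can be sketched as follows. This is only an illustration of the behavior described above, not the logic configure actually runs: the helper names are hypothetical, and /usr/local/cuda is an assumed example prefix.

```shell
#!/bin/sh
# All three helpers are hypothetical, not part of StarPU.

# Default header directory: /include appended to the CUDA prefix.
cuda_include_dir() {
    echo "${1%/}/include"    # ${1%/} drops one trailing slash, if any
}

# Default library directory: /lib appended to the CUDA prefix.
cuda_lib_dir() {
    echo "${1%/}/lib"
}

# Succeeds when include/cuda.h exists under the given prefix.
has_cuda_header() {
    [ -f "$(cuda_include_dir "$1")/cuda.h" ]
}

prefix=/usr/local/cuda       # assumed example location
echo "headers:   $(cuda_include_dir "$prefix")"
echo "libraries: $(cuda_lib_dir "$prefix")"
if has_cuda_header "$prefix"; then
    echo "cuda.h found under $prefix"
else
    echo "cuda.h not found under $prefix"
fi
```

When the headers or libraries live elsewhere, for instance in a lib64 directory, the two dir options described above are the place to override these defaults.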
count Use at most count OpenCL devices. This information is then available as the macro STARPU_MAXOPENCLDEVS.
prefix Search for an OpenCL implementation under prefix, which should notably contain include/CL/cl.h (or include/OpenCL/cl.h on Mac OS).
dir Search for OpenCL headers under dir, which should notably contain CL/cl.h (or OpenCL/cl.h on Mac OS). This defaults to /include appended to the value given to --with-opencl-dir.
dir Search for an OpenCL library under dir, which should notably contain the OpenCL shared libraries—e.g. libOpenCL.so. This defaults to /lib appended to the value given to --with-opencl-dir.
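Since the OpenCL header may sit in either CL/cl.h or OpenCL/cl.h depending on the platform, a quick check of a candidate prefix could look like the sketch below. find_opencl_header is a hypothetical helper, and /usr is only an assumed example prefix.

```shell
#!/bin/sh
# find_opencl_header is a hypothetical helper, not part of StarPU.
# It prints the path of cl.h under the given prefix, trying the usual
# layout first and the Mac OS layout second. It returns non-zero when
# neither file exists.
find_opencl_header() {
    prefix=${1%/}            # drop one trailing slash, if any
    for candidate in include/CL/cl.h include/OpenCL/cl.h; do
        if [ -f "${prefix}/${candidate}" ]; then
            echo "${prefix}/${candidate}"
            return 0
        fi
    done
    return 1
}

find_opencl_header /usr || echo "no OpenCL header under /usr"
```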
Enable considering the provided OpenCL implementation as a simulator, i.e. use the kernel duration returned by OpenCL profiling information as wallclock time instead of the actual measured real time. This requires simgrid support.
count Allow for at most count codelet implementations for the same target device. This information is then available as the macro STARPU_MAXIMPLEMENTATIONS.
count Allow for at most count scheduling contexts. This information is then available as the macro STARPU_NMAX_SCHED_CTXS.
Disable asynchronous copies between CPU and GPU devices. The AMD implementation of OpenCL is known to fail when copying data asynchronously. When using this implementation, it is therefore necessary to disable asynchronous data transfers.
Disable asynchronous copies between CPU and OpenCL devices. The AMD implementation of OpenCL is known to fail when copying data asynchronously. When using this implementation, it is therefore necessary to disable asynchronous data transfers.
Disable the SOCL extension (SOCL OpenCL Extensions). By default, it is enabled when an OpenCL implementation is found.
Disable the StarPU-Top interface (StarPU-Top Interface). By default, it is enabled when the required dependencies are found.
Disable the GCC plug-in (C Extensions). By default, it is enabled when the GCC compiler provides plug-in support.
path Use the compiler mpicc at path for StarPU-MPI (MPI Support).
(see ../../src/datawizard/datastats.c) Enable gathering of various data statistics (Data Statistics).
Define the maximum number of buffers that tasks will be able to take as parameters, then available as the macro STARPU_NMAXBUFS.
Enable the use of a data allocation cache to avoid the cost of data allocation with CUDA. This is still experimental.
Enable the use of OpenGL for the rendering of some examples.