gpu_macros¶
Note
These macros were provided by Marcin Sokolowski from the PaCER BLINK astroio package. See also Sokolowski et al. 2024 for an application of these macros.
API documentation for gpu_macros.h.
A bunch of macros to choose either CUDA or HIP for GPU operations. Needs the compiler to define __NVCC__ or __HIPCC__ to choose the correct functionality. Also includes a macro to check for errors in GPU operations, and a macro to run a GPU kernel and check for errors.
- Author
Marcin Sokolowski and Cristian Di Pietrantonio, edited by Jack Line
Defines
-
gpuErrorCheckKernel(message, kernel, grid, threads, ...)¶
Takes a GPU kernel, runs it with given arguments, and passes results onto
ErrorCheckto check for errors.All arguements need to run
kernelmust be included after the listed arguments here.kernelis then run viawherekernel <<< grid , threads, 0 >>>(__VA_ARGS__)
__VA_ARGS__passes on the arguments located at...gpuErrorCheckKernelthen passes the stringmessageon toGPUErrorCheck, along with the file name and line number via__FILE__and__LINE__, and checks the errors from bothgpuGetLastError()andgpuDeviceSynchronize()after running the kernel.For example, if
fancy_kerneltakes the argumentsarg1andarg2, to run it with 10 grids of 64 threads, run the following:dim3 grid, threads; grid.x = 10 threads.x = 64 gpuErrorCheckKernel("Call to fancy_kernel", fancy_kernel, grid, threads, arg1, arg2);
- Parameters:
message – [in] Message to report when an error occurs
kernel – [in] Name of kernel to be run
grid – [in] A
dim3containing grid specifications to runkernelwiththreads – [in] A
dim3containing thread specifications to runkernelwith... – [in] All arguments to be passed into
kernel
Functions
-
inline void docGPUErrorCheck(const char *message, gpuError_t code, const char *file, int line, bool abort = EXITERROR)¶
NOTE the actual function is
GPUErrorCheck, but for some goddam reason Doxygen refuses to document it when inside a conditional so I’ve made a copy here. Take a GPU error message (code), and checks whether an error occurred.If an error happened, uses
messageto give more information to the user, along with the decoded CUDA error message. Usesfileandlineto report where the error happened. Optional boolabortmeans you can switch off exiting if an error is found (default true)- Parameters:
message – [in] User supplied error message
code – [in] Error message out of CUDA call (e.g. cudaMalloc)
file – [in] Name of file call was made in
line – [in] Line of file call was made in
abort – [in] If true, exit the CUDA code when an error is found. Defaults to True.
Macros¶
See below for a table of the macros employed in gpucomplex.h. Note that many
of the GPU functions are wrapped in the GPUErrorCheck function (documented
above as docGPUErrorCheck), which checks for errors and exits with a message
including the line number and file name if an error is detected. If there is error
checking in the macro it is listed below. Again, setting -D__NVCC__ or
-D__HIPCC__ at compilations determines whether CUDA or HIP functions
are used.
Macro |
|
|
Error wrapped |
|---|---|---|---|
gpuMalloc |
cudaMalloc |
hipMalloc |
Yes |
gpuHostAlloc |
cudaHostAlloc |
hipHostMalloc |
Yes |
gpuHostAllocDefault |
cudaHostAllocDefault |
0 |
No |
gpuMemcpy |
cudaMemcpy |
hipMemcpy |
Yes |
gpuMemcpyAsync |
cudaMemcpyAsync |
hipMemcpyAsync |
Yes |
gpuMemset |
cudaMemset |
hipMemset |
Yes |
gpuDeviceSynchronize |
cudaDeviceSynchronize |
hipDeviceSynchronize |
No |
gpuMemcpyDeviceToHost |
cudaMemcpyDeviceToHost |
hipMemcpyDeviceToHost |
No |
gpuMemcpyHostToDevice |
cudaMemcpyHostToDevice |
hipMemcpyHostToDevice |
No |
gpuMemcpyDeviceToDevice |
cudaMemcpyDeviceToDevice |
hipMemcpyDeviceToDevice |
No |
gpuFree |
cudaFree |
hipFree |
Yes |
gpuHostFree |
cudaFreeHost |
hipHostFree |
Yes |
gpuStream_t |
cudaStream_t |
hipStream_t |
No |
gpuStreamCreate |
cudaStreamCreate |
hipStreamCreate |
Yes |
gpuStreamDestroy |
cudaStreamDestroy |
hipStreamDestroy |
Yes |
gpuEventCreate |
cudaEventCreate |
hipEventCreate |
Yes |
gpuGetDeviceCount |
cudaGetDeviceCount |
hipGetDeviceCount |
Yes |
gpuGetLastError |
cudaGetLastError |
hipGetLastError |
No |
gpuMemGetInfo |
cudaMemGetInfo |
hipMemGetInfo |
Yes |
gpuMallocHost |
cudaMallocHost |
hipHostMalloc |
Yes |
gpuFreeHost |
cudaFreeHost |
hipFreeHost |
Yes |
gpuGetDeviceProperties |
cudaGetDeviceProperties |
hipGetDeviceProperties |
Yes |
gpuDeviceProp |
cudaDeviceProp |
hipDeviceProp_t |
No |
gpuPeekAtLastError |
cudaPeekAtLastError |
hipPeekAtLastError |
No |
Macro |
|
|
|---|---|---|
gpuCreal |
cuCreal |
hipCreal |
gpuCrealf |
cuCrealf |
hipCrealf |
gpuCimag |
cuCimag |
hipCimag |
gpuCimagf |
cuCimagf |
hipCimagf |
gpuCadd |
cuCadd |
hipCadd |
gpuCmul |
cuCmul |
hipCmul |
gpuCdiv |
cuCdiv |
hipCdiv |
gpuConj |
cuConj |
hipConj |
gpuCsub |
cuCsub |
hipCsub |
gpuCabs |
cuCabs |
hipCabs |
gpuCaddf |
cuCaddf |
hipCaddf |
gpuCsubf |
cuCsubf |
hipCsubf |
gpuCmulf |
cuCmulf |
hipCmulf |
gpuCdivf |
cuCdivf |
hipCdivf |
gpuDoubleComplex |
cuDoubleComplex |
hipDoubleComplex |
gpuFloatComplex |
cuFloatComplex |
hipFloatComplex |
make_gpuDoubleComplex |
make_cuDoubleComplex |
make_hipDoubleComplex |
make_gpuFloatComplex |
make_cuFloatComplex |
make_hipFloatComplex |