gpu_macros

Note

These macros were provided by Marcin Sokolowski from the PaCER BLINK astroio package. See also Sokolowski et al. 2024 for an application of these macros.

API documentation for gpu_macros.h.

A bunch of macros to choose either CUDA or HIP for GPU operations. Needs the compiler to define __NVCC__ or __HIPCC__ to choose the correct functionality. Also includes a macro to check for errors in GPU operations, and a macro to run a GPU kernel and check for errors.

Author

Marcin Sokolowski and Cristian Di Pietrantonio, edited by Jack Line

Defines

gpuErrorCheckKernel(message, kernel, grid, threads, ...)

Takes a GPU kernel, runs it with given arguments, and passes results onto ErrorCheck to check for errors.

All arguements need to run kernel must be included after the listed arguments here. kernel is then run via

    kernel <<< grid , threads, 0 >>>(__VA_ARGS__)
where __VA_ARGS__ passes on the arguments located at ...

gpuErrorCheckKernel then passes the string message on to GPUErrorCheck, along with the file name and line number via __FILE__ and __LINE__, and checks the errors from both gpuGetLastError() and gpuDeviceSynchronize() after running the kernel.

For example, if fancy_kernel takes the arguments arg1 and arg2, to run it with 10 grids of 64 threads, run the following:

      dim3 grid, threads;
      grid.x = 10
      threads.x = 64
      gpuErrorCheckKernel("Call to fancy_kernel",
                          fancy_kernel, grid, threads,
                          arg1, arg2);

Parameters:
  • message[in] Message to report when an error occurs

  • kernel[in] Name of kernel to be run

  • grid[in] A dim3 containing grid specifications to run kernel with

  • threads[in] A dim3 containing thread specifications to run kernel with

  • ...[in] All arguments to be passed into kernel

Functions

inline void docGPUErrorCheck(const char *message, gpuError_t code, const char *file, int line, bool abort = EXITERROR)

NOTE the actual function is GPUErrorCheck, but for some goddam reason Doxygen refuses to document it when inside a conditional so I’ve made a copy here. Take a GPU error message (code), and checks whether an error occurred.

If an error happened, uses message to give more information to the user, along with the decoded CUDA error message. Uses file and line to report where the error happened. Optional bool abort means you can switch off exiting if an error is found (default true)

Parameters:
  • message[in] User supplied error message

  • code[in] Error message out of CUDA call (e.g. cudaMalloc)

  • file[in] Name of file call was made in

  • line[in] Line of file call was made in

  • abort[in] If true, exit the CUDA code when an error is found. Defaults to True.

Macros

See below for a table of the macros employed in gpucomplex.h. Note that many of the GPU functions are wrapped in the GPUErrorCheck function (documented above as docGPUErrorCheck), which checks for errors and exits with a message including the line number and file name if an error is detected. If there is error checking in the macro it is listed below. Again, setting -D__NVCC__ or -D__HIPCC__ at compilations determines whether CUDA or HIP functions are used.

GPU function macros

Macro

__NVCC__

__HIPCC__

Error wrapped

gpuMalloc

cudaMalloc

hipMalloc

Yes

gpuHostAlloc

cudaHostAlloc

hipHostMalloc

Yes

gpuHostAllocDefault

cudaHostAllocDefault

0

No

gpuMemcpy

cudaMemcpy

hipMemcpy

Yes

gpuMemcpyAsync

cudaMemcpyAsync

hipMemcpyAsync

Yes

gpuMemset

cudaMemset

hipMemset

Yes

gpuDeviceSynchronize

cudaDeviceSynchronize

hipDeviceSynchronize

No

gpuMemcpyDeviceToHost

cudaMemcpyDeviceToHost

hipMemcpyDeviceToHost

No

gpuMemcpyHostToDevice

cudaMemcpyHostToDevice

hipMemcpyHostToDevice

No

gpuMemcpyDeviceToDevice

cudaMemcpyDeviceToDevice

hipMemcpyDeviceToDevice

No

gpuFree

cudaFree

hipFree

Yes

gpuHostFree

cudaFreeHost

hipHostFree

Yes

gpuStream_t

cudaStream_t

hipStream_t

No

gpuStreamCreate

cudaStreamCreate

hipStreamCreate

Yes

gpuStreamDestroy

cudaStreamDestroy

hipStreamDestroy

Yes

gpuEventCreate

cudaEventCreate

hipEventCreate

Yes

gpuGetDeviceCount

cudaGetDeviceCount

hipGetDeviceCount

Yes

gpuGetLastError

cudaGetLastError

hipGetLastError

No

gpuMemGetInfo

cudaMemGetInfo

hipMemGetInfo

Yes

gpuMallocHost

cudaMallocHost

hipHostMalloc

Yes

gpuFreeHost

cudaFreeHost

hipFreeHost

Yes

gpuGetDeviceProperties

cudaGetDeviceProperties

hipGetDeviceProperties

Yes

gpuDeviceProp

cudaDeviceProp

hipDeviceProp_t

No

gpuPeekAtLastError

cudaPeekAtLastError

hipPeekAtLastError

No

Complex number operation

Macro

__NVCC__

__HIPCC__

gpuCreal

cuCreal

hipCreal

gpuCrealf

cuCrealf

hipCrealf

gpuCimag

cuCimag

hipCimag

gpuCimagf

cuCimagf

hipCimagf

gpuCadd

cuCadd

hipCadd

gpuCmul

cuCmul

hipCmul

gpuCdiv

cuCdiv

hipCdiv

gpuConj

cuConj

hipConj

gpuCsub

cuCsub

hipCsub

gpuCabs

cuCabs

hipCabs

gpuCaddf

cuCaddf

hipCaddf

gpuCsubf

cuCsubf

hipCsubf

gpuCmulf

cuCmulf

hipCmulf

gpuCdivf

cuCdivf

hipCdivf

gpuDoubleComplex

cuDoubleComplex

hipDoubleComplex

gpuFloatComplex

cuFloatComplex

hipFloatComplex

make_gpuDoubleComplex

make_cuDoubleComplex

make_hipDoubleComplex

make_gpuFloatComplex

make_cuFloatComplex

make_hipFloatComplex