`FEE_primary_beam_cuda`¶

The MWA Fully Embedded Element primary beam model (Sokolowski et al. 2017) is at the time of writing the most accurate and advanced MWA primary beam model. The model is stored in spherical harmonic coefficients, at a frequency resolution of 1.28MHz, in an hdf5 file called mwa_full_embedded_element_pattern.h5. I have done my best to document the RTS code I have used. The RTS code makes use of the Legendre Polynomial code available here by John Burkardt, and is included in the WODEN distribution as legendre_polynomial.c.

Device methods to calculate the MWA FEE primary beam response

Author: J.L.B. Line

Functions

void copy_FEE_primary_beam_to_GPU(RTS_MWA_FEE_beam_t *FEE_beam)¶

Copy the spherical harmonic MWA FEE coeffs from host to device.

Copies FEE_beam->M, FEE_beam->N, FEE_beam->Q1, FEE_beam->Q2 into FEE_beam->d_M, FEE_beam->d_N, FEE_beam->d_Q1, FEE_beam->d_Q2, reshaping from 2D arrays into 1D arrays.

Parameters: *FEE_beam – [inout] RTS_MWA_FEE_beam_t which has been initialised with FEE_primary_beam.RTS_MWAFEEInit

__device__ void RTS_CUDA_pm_polynomial_value_singlef(int n, int m, float x, float *values)¶

RTS GPU implemtation of John Burkardt’s legendre_polynomial.pm_polynomial_value. Evaluates all legendre polynomials up to the specified orders, for a single evaluation point x. Original documentation from pm_polynomial_value included below.

Note the original function evaluated for mulitple x vales, so needed a number of evalutation points mm, which is set to 1.0 in this RTS GPU code.

Purpose:

PM_POLYNOMIAL_VALUE evaluates the Legendre polynomials Pm(n,m,x).

Differential equation:

(1-X*X) * Y’’ - 2 * X * Y + ( N (N+1) - (M*M/(1-X*X)) * Y = 0

First terms:

M = 0 ( = Legendre polynomials of first kind P(N,X) )

Pm(0,0,x) = 1 Pm(1,0,x) = 1 X Pm(2,0,x) = ( 3 X^2 - 1)/2 Pm(3,0,x) = ( 5 X^3 - 3 X)/2 Pm(4,0,x) = ( 35 X^4 - 30 X^2 + 3)/8 Pm(5,0,x) = ( 63 X^5 - 70 X^3 + 15 X)/8 Pm(6,0,x) = (231 X^6 - 315 X^4 + 105 X^2 - 5)/16 Pm(7,0,x) = (429 X^7 - 693 X^5 + 315 X^3 - 35 X)/16

M = 1

Pm(0,1,x) = 0 Pm(1,1,x) = 1 * SQRT(1-X^2) Pm(2,1,x) = 3 * SQRT(1-X^2) * X Pm(3,1,x) = 1.5 * SQRT(1-X^2) * (5*X^2-1) Pm(4,1,x) = 2.5 * SQRT(1-X^2) * (7*X^3-3*X)

M = 2

Pm(0,2,x) = 0 Pm(1,2,x) = 0 Pm(2,2,x) = 3 * (1-X^2) Pm(3,2,x) = 15 * (1-X^2) * X Pm(4,2,x) = 7.5 * (1-X^2) * (7*X^2-1)

M = 3

Pm(0,3,x) = 0 Pm(1,3,x) = 0 Pm(2,3,x) = 0 Pm(3,3,x) = 15 * (1-X^2)^1.5 Pm(4,3,x) = 105 * (1-X^2)^1.5 * X

M = 4

Pm(0,4,x) = 0 Pm(1,4,x) = 0 Pm(2,4,x) = 0 Pm(3,4,x) = 0 Pm(4,4,x) = 105 * (1-X^2)^2

Recursion:

if N < M: Pm(N,M,x) = 0 if N = M: Pm(N,M,x) = (2*M-1)!! * (1-X*X)^(M/2) where N!! means the product of all the odd integers less than or equal to N. if N = M+1: Pm(N,M,x) = X*(2*M+1)*Pm(M,M,x) if M+1 < N: Pm(N,M,x) = ( X*(2*N-1)*Pm(N-1,M,x) - (N+M-1)*Pm(N-2,M,x) )/(N-M)

Licensing:

This code is distributed under the GNU LGPL license.

Modified:

08 August 2013

Author:

John Burkardt

Reference:

Milton Abramowitz, Irene Stegun, Handbook of Mathematical Functions, National Bureau of Standards, 1964, ISBN: 0-486-61272-4, LC: QA47.A34.

Parameters:

Input, int MM, the number of evaluation points.

Input, int N, the maximum first index of the Legendre function, which must be at least 0.

Input, int M, the second index of the Legendre function, which must be at least 0, and no greater than N.

Input, double X[MM], the point at which the function is to be evaluated.

Output, double PM_POLYNOMIAL_VALUE[MM*(N+1)], the function values.

Parameters

n – [in] The maximum first index of the Legendre function, which must be at least 0
m – [in] The second index of the Legendre function,which must be at least 0, and no greater than N
x – [in] The point at which the function is to be evaluated
values – [inout] Array to store calculated function values within

__global__ void RTS_P1SINfKernel(float *d_theta, cuFloatComplex *rts_P_sin, cuFloatComplex *rts_p1, int nmax, int num_coords)¶

This grabs a legendre polynomial value for a theta value in d_theta and a given order, and inserts the output into rts_p1, and the output / sin(theta) into rts_P_sin.

The outputs are scaled correctly for the FEE model resolution, and dumped into rts_p1, rts_P_sin in order of theta value first, order of legendre polynomial second. rts_p1 and rts_P_sin should be num_coords*(nmax*nmax + 2*nmax) long.

When called with dim3 grid, threads, kernel should be called with both grid.x and grid.y defined, where:

grid.x * threads.x >= num_coords
grid.y * threads.y >= nmax*nmax + 2*nmax

Parameters

d_theta – [in] Array of theta values (radians)
rts_P_sin – [inout] Output array to store l. poly. value / sin(theta) in
rts_p1 – [inout] Output array to store l. poly. value in
nmax – [in] Maximum order of legendre polynomial to calculate to
num_coords – [in] Number of theta coords

__global__ void RTS_getTileGainsKernel(float *d_phi, float *d_theta, int nMN, int num_coords, float *pb_M, float *pb_N, cuFloatComplex *pb_Q1, cuFloatComplex *pb_Q2, cuFloatComplex *rts_P_sin, cuFloatComplex *rts_P1, cuFloatComplex *emn_T, cuFloatComplex *emn_P)¶

This kernel grabs all the FEE beam coeffs and generated legendre polynomial values, combines them to get the spherical harmonic values for all available orders, and converts them into complex field values.

pb_M, pb_N, pb_Q1, pb_Q2 should have been calculated using FEE_primary_beam.RTS_MWAFEEInit and transferred to the device using FEE_primary_beam_cuda.copy_FEE_primary_beam_to_GPU. rts_P_sin, rts_p1 and should have been calculated using FEE_primary_beam_cuda.RTS_P1SINfKernel. The output arrays emn_T,emn_P contain the theta and phi polarisation outputs for both the north-south and east-west dipoles, and contain separate values for all spherical harmonic values, and so contain 2*num_coords*nMN values.

When called with dim3 grid, threads, kernel should be called with all grid.x, grid.y, and grid.z defined, where:

grid.x * threads.x >= num_coords
grid.y * threads.y >= nMN
grid.z * threads.z >= 2 (number of polarisations)

Parameters

d_phi – [in] Array of phi values (radians) (these are azimuth values)
d_theta – [in] Array of theta values (radians) (these are zenith angles)
nMN – [in] Total number of legendre polynomial values to be used (if nmax is the maximum order to use, nMN = nmax*nmax + 2*nmax)
num_coords – [in] Number of theta/phi coords
pb_M – [in] Precalculated FEE beam params
pb_N – [in] Precalculated FEE beam params
pb_Q1 – [in] Precalculated FEE beam params
pb_Q2 – [in] Precalculated FEE beam params
rts_P_sin – [in] Calculated l. poly. value / sin(theta)
rts_P1 – [in] Calculated l. poly. value in
emn_T – [inout] Complex theta polarisation output for all sky coords, still separated by order of spherical harmonic
emn_P – [inout] Complex phi polarisation output for all sky coords, still separated by order of spherical harmonic

__global__ void kern_sum_emn_PT_by_M(cuFloatComplex *emn_T, cuFloatComplex *emn_P, float *d_emn_T_sum_real, float *d_emn_T_sum_imag, float *d_emn_P_sum_real, float *d_emn_P_sum_imag, float *d_m_range, float *d_M, int nMN, int nmax, int num_coords)¶

This kernel takes comlex polarisation outputs of RTS_getTileGainsKernel, which are separated by spherical harmonic order, and sums over the spherical harmonics, to return a single complex beam gain direction on sky.

The summation happens over the m index of the spherical harmonic, and so we require an array containing the range of possible m values (d_m_range) and an array that contains the m value of all outputs in emn_T, emn_P (d_M), to ensure we are summing the correct values and not double counting anything.

When called with dim3 grid, threads, kernel should be called with all grid.x and grid.y defined, where:

grid.x * threads.x >= num_coords
grid.y * threads.y >= 2*nmax + 1

Parameters

emn_T – [in] theta polarisation outputs of RTS_getTileGainsKernel
emn_P – [in] phi polarisation outputs of RTS_getTileGainsKernel
d_emn_T_sum_real – [inout] Real part of summed theta polarisation
d_emn_T_sum_imag – [inout] Imaginary part of summed theta polarisation
d_emn_P_sum_real – [inout] Real part of summed phi polarisation
d_emn_P_sum_imag – [inout] Imaginary part of summed phi polarisation
d_m_range – [in] All possible m values (from -nmax to +nmax)
d_M – [in] The m value corresponding to values in emn_T, emn_P
nMN – [in] Total number of spherical harmonics, nMN = nmax*nmax + 2*nmax
nmax – [in] Maximum order of spherical harmonic
num_coords – [in] Number of theta/phi coords

__global__ void kern_map_emn(cuFloatComplex *TileGainMatrices, float *d_emn_T_sum_real, float *d_emn_T_sum_imag, float *d_emn_P_sum_real, float *d_emn_P_sum_imag, int nmax, int num_coords)¶

This basically just maps the outputs of kern_sum_emn_PT_by_M into a single complex array, ordered by polarisation and dipole orientation, and then sky coordinate.

The summation happens over the m index of the spherical harmonic, and so we require an array containing the range of possible m values (d_m_range) and an array that contains the m value of all outputs in emn_T, emn_P (d_M), to ensure we are summing the correct values and not double counting anything.

Output ordering in TileGainMatrices is ordered by J_theta_x,J_phi_x,J_theta_y,J_phi_y (where theta,phi is polarisation, x,y are dipole orientation), and these four values are then repeated for each direction on the sky.

When called with dim3 grid, threads, kernel should be called with grid.x and grid.y defined, where:

grid.x * threads.x >= num_coords
grid.y * threads.y >= 2

Parameters

TileGainMatrices – [in] A single 1D array that contains all complex beam gains for all polaristations, dipole orientations, and sky coords
d_emn_T_sum_real – [inout] Real part of summed theta polarisation
d_emn_T_sum_imag – [inout] Imaginary part of summed theta polarisation
d_emn_P_sum_real – [inout] Real part of summed phi polarisation
d_emn_P_sum_imag – [inout] Imaginary part of summed phi polarisation
nmax – [in] Maximum order of spherical harmonic
num_coords – [in] Number of sky direction coords

__global__ void kern_apply_FEE_norm(cuFloatComplex *TileGainMatrices, cuFloatComplex *d_norm_fac, int num_coords)¶

Normalises the complex beam gains in TileGainMatrices to the zenith reponse of a zenith beam pointing, d_norm_fac. The latter array should have been calcuated using FEE_primary_beam_cuda.get_HDFBeam_normalisation.

This normalisation should be applied while the polarisations values in TileGainMatrices are still tied to the phi,theta coord system (i.e.) before any parallactic angle rotation). d_norm_fac should be an array of 4 values, appropriate to each polarisation in TileGainMatrices. The values in d_norm_fac should be the absolute value of the zenith gain, but input as a float _Complex type (so just have imaginary set to zero).

When called with dim3 grid, threads, kernel should be called with grid.x defined, where:

grid.x * threads.x >= num_coords

Parameters

TileGainMatrices – [in] A single 1D array that contains all complex beam gains for all polaristations, dipole orientations, and sky coords
d_norm_fac – [in] Normalisation factor for each polarisation
num_coords – [in] Number of sky direction coords

__global__ void kern_rotate_FEE_beam(cuFloatComplex *d_FEE_beam_gain_matrices, float *d_sin_para_angs, float *d_cos_para_angs, int num_coords)¶

Takes the beam gains in d_FEE_beam_gain_matrices, which are in theta, phi polarisations, and rotates them by the parallactic angle, to align them into north-south and east-west gain and leakage terms, which can be used to create XX,XY,YX,YY polarisations.

d_FEE_beam_gain_matrices should be as output by FEE_primary_beam_cuda.kern_map_emn, ordered by J_theta_x,J_phi_x,J_theta_y,J_phi_y (where theta,phi is polarisation, x,y are dipole orientation). d_sin_para_angs and d_cos_para_angs should be the sine and cosine of the parallactic angle that corresponds to each sky direction in d_FEE_beam_gain_matrices, repectively. Once the rotation has been applied, the complex beam values are ordered as g_x, D_x, D_y, g_y where g means gain, D means leakage, x means dipole aligned north-south, y aligned east-west.

The original FEE beam code has a different east-west / north-south convention, and the sense of azimuth rotation, and so a reordering / sign flip is applied here to match up with Stokes parameters.

When called with dim3 grid, threads, kernel should be called with grid.x defined, where:

grid.x * threads.x >= num_coords

Parameters

d_FEE_beam_gain_matrices – [inout] A single 1D array that contains all complex beam gains for all polaristations, dipole orientations, and sky coords
d_sin_para_angs – [in] Sine of the parallactic angle for all sky directions
d_cos_para_angs – [in] Cosine of the parallactic angle for all sky directions
num_coords – [in] Number of sky directions in d_FEE_beam_gain_matrices

void RTS_CUDA_get_TileGains(float *phi, float *theta, float *sin_para_angs, float *cos_para_angs, int num_time_steps, int num_components, float rotation, RTS_MWA_FEE_beam_t *primary_beam, float _Complex *TileGainMatrices, int scaling)¶

For the given phi (azimuth) and theta (zenith angle) sky coords, calculate the MWA FEE beam model response for all polarisation and dipole orientations, given the initialised beam model primary_beam. Optionally normalise the beam to zenith (scaling=1) and rotate into a frame compatible with Stokes parameters (rotation=1).

primary_beam should have been intialised using FEE_primary_beam.RTS_MWAFEEInit and FEE_primary_beam_cuda.copy_FEE_primary_beam_to_GPU.

If scaling == 1, normalises the beam to zenith. For this, FEE_primary_beam_cuda.get_HDFBeam_normalisation should have already be run to setup the correct attributes in primary_beam.

If rotation == 1, rotate the beam gains from a instrument sky-locked theta, phi polarisation system into one aligned with Stokes parameters (for creating XX, XY, YX, YY beams that can be combined with Stokes I,Q,U,V to generate instrumental visibilities). For this, sin_para_angs,cos_para_angs must be allocated.

Calls the following kernels from FEE_primary_beam_cuda. See their documentation for more detail.

RTS_P1SINfKernel
RTS_getTileGainsKernel
kern_sum_emn_PT_by_M
kern_map_emn
kern_apply_FEE_norm (optional)
kern_rotate_FEE_beam (optional)

Parameters

phi – [in] Array of phi values (radians) (these are azimuth values)
theta – [in] Array of theta values (radians) (these are zenith angles)
sin_para_angs – [in] Sine of the parallactic angle for all sky directions
cos_para_angs – [in] Cosine of the parallactic angle for all sky directions
num_time_steps – [in] Number of time steps in simulation
num_components – [in] Number of COMPONENTs being used here
rotation – [in] 0=False, 1=True, rotate results by parallactic angle
primary_beam – [in] An initialised RTS_MWA_FEE_beam_t containing MWA FEE spherical harmonic coeffs for this pointing
TileGainMatrices – [inout] A single 1D array that contains all complex beam gains for all polaristations, dipole orientations, and sky coords
scaling – [in] 0=False, 1=True, Normlise results to zenith

void calc_CUDA_FEE_beam(float *azs, float *zas, float *sin_para_angs, float *cos_para_angs, int num_components, int num_time_steps, RTS_MWA_FEE_beam_t *FEE_beam, int rotation, int scaling)¶

This function is basically a wrapper to FEE_primary_beam_cuda.RTS_CUDA_get_TileGains, by cudaMalloc-ing an appropriate array to store the outputs in. If you want to call the MWA FEE beam, and only use the outputs in device memory, this is the function you want.

This function cudaMallocs FEE_beam->d_FEE_beam_gain_matrices and sets all array entries to zero, before passing into RTS_CUDA_get_TileGains. See RTS_CUDA_get_TileGains for more detail.

FEE_beam should have been intialised using FEE_primary_beam.RTS_MWAFEEInit and FEE_primary_beam_cuda.copy_FEE_primary_beam_to_GPU.

If scaling == 1, normalises the beam to zenith. For this, FEE_primary_beam_cuda.get_HDFBeam_normalisation should have already be run to setup the correct attributes in primary_beam.

If rotation == 1, rotate the beam gains from a instrument sky-locked theta, phi polarisation system into one aligned with Stokes parameters (for creating XX, XY, YX, YY beams that can be combined with Stokes I,Q,U,V to generate instrumental visibilities). For this, sin_para_angs,cos_para_angs must be allocated.

Parameters

azs – [in] Array of azimuth values (radians)
zas – [in] Array of zenith angle values (radians)
sin_para_angs – [in] Sine of the parallactic angle for all az,za
cos_para_angs – [in] Cosine of the parallactic angle for all az,za
num_components – [in] Number of COMPONENTs being used here
num_time_steps – [in] Number of time steps in simulation
FEE_beam – [inout] An initialised RTS_MWA_FEE_beam_t containing MWA FEE spherical harmonic coeffs for this pointing
rotation – [in] 0=False, 1=True, rotate results by parallactic angle
scaling – [in] 0=False, 1=True, Normlise results to zenith

__global__ void kern_make_norm_abs(cuFloatComplex *d_norm_fac)¶

We want to normalise by the absolute value of the FEE primary beam, so take the absolute of the cuFloatComplex values in d_norm_fac

Beam values to be normalised are stil cuFloatComplex, so easy to just keep d_norm_fac as a complex, and put the absolute value in the real and set imaginary to zero. There should be 16 values to normalise, as there are four az/za directions and four polarisations for each

When called with dim3 grid, threads, kernel should be called with grid.x set, where:

grid.x * threads.x >= 16

Parameters: d_norm_fac – [inout] Complex values to convert to absolute

void get_HDFBeam_normalisation(RTS_MWA_FEE_beam_t *FEE_beam_zenith, RTS_MWA_FEE_beam_t *FEE_beam)¶

Using FEE_beam_zenith, calculate the zenith normlisation values for both dipoles and both polarisations in the native FEE beam coodinate system.

The FEE beam model is stored in theta/phi (instrument locked) polarisations. To therefore calculate the zenith normalisation values, actually need to calculate the beam at zenith with multiple azimuth values, and then from those directions, select the correct east-west/north south dipole and theta/phi polarisation output.

This function copies the zenith beam to device, calculates normalistaion values by calling

copy_FEE_primary_beam_to_GPU
calc_CUDA_FEE_beam
kern_make_norm_abs

and then inserts the correct normalisation values into FEE_beam->norm_fac, which can be used later on by calc_CUDA_FEE_beam with scaling=1.

Both FEE_beam_zenith and FEE_beam should have been initialised using FEE_primary_beam.RTS_MWAFEEInit, with the former having all delays set to zero.

Parameters

FEE_beam_zenith – [in] An initialised RTS_MWA_FEE_beam_t with zero delays
FEE_beam – [in] An initialised RTS_MWA_FEE_beam_t of the same frequency in which to store the normalisation outputs for later use

__global__ void kern_map_FEE_beam_gains(cuFloatComplex *d_FEE_beam_gain_matrices, cuFloatComplex *d_primay_beam_J00, cuFloatComplex *d_primay_beam_J01, cuFloatComplex *d_primay_beam_J10, cuFloatComplex *d_primay_beam_J11, int num_freqs, int num_components, int num_visis, int num_baselines, int num_times)¶

Map the RTS ordered FEE gain array in four separate arrays, each corresponding to a different element in the beam Jones matrix.

The visibility functions in source_components.cu expect the primary beam Jones matrix in four 1D cuFloatComplex arrays. This mapping should be done after rotation by parallatic angle, to be consistent with other primary beam functions in WODEN. The arrays map to the physical dipoles as

d_primay_beam_J00: north-south gain
d_primay_beam_J01: north-south leakage
d_primay_beam_J10: east-west leakage
d_primay_beam_J11: east-west gain

Within each array, the outputs are stored by time index (slowest changing), frequency index, then COMPONENT index (fastest changing). This is important to ensure visibility functions in source_components.cu grab the correct values later on.

When called with dim3 grid, threads, kernel should be called with both grid.x and grid.y set, where:

grid.x * threads.x >= num_visis
grid.y * threads.y >= num_components

Parameters

d_FEE_beam_gain_matrices – [in] A single 1D array that contains all complex beam gains for all polaristations, dipole orientations, and sky coords, as output by calc_CUDA_FEE_beam
d_primay_beam_J00 – [inout] Array to store north-south gain on device memory
d_primay_beam_J01 – [inout] Array to store north-south leakage on device memory
d_primay_beam_J10 – [inout] Array to store east-west leakage on device memory
d_primay_beam_J11 – [inout] Array to store east-west gain on device memory
num_freqs – [in] Number of frequencies
num_components – [in] Number of COMPONENTs
num_visis – [in] Total number of visibilities (num_freqs*num_baselines*num_times)
num_baselines – [in] Number of baselines in the array
num_times – [in] Number of times steps

void free_FEE_primary_beam_from_GPU(RTS_MWA_FEE_beam_t *primary_beam)¶

Frees device memory from primary_beam

Explicitly, cudaFrees:

primary_beam->d_M
primary_beam->d_N
primary_beam->d_Q1
primary_beam->d_Q2

Parameters: primary_beam – [in] A RTS_MWA_FEE_beam_t which has already been used on the device to calculate MWA FEE beam values

void test_RTS_CUDA_FEE_beam(int num_components, float *azs, float *zas, float latitude, RTS_MWA_FEE_beam_t *FEE_beam_zenith, RTS_MWA_FEE_beam_t *FEE_beam, int rotation, int scaling, float _Complex *FEE_beam_gains)¶

This function is basically a host wrapper to FEE_primary_beam_cuda.calc_CUDA_FEE_beam, by copying the resultant FEE_beam->d_FEE_beam_gain_matrices into host memory. It also calculates the normlisation factors. If you want to run the MWA FEE beam code on the GPU and copy outputs onto the CPU, this is the function you want.

Both FEE_beam_zenith and FEE_beam should have been initialised using FEE_primary_beam.RTS_MWAFEEInit, with the former having all delays set to zero.

If scaling == 1, normalises the beam to zenith. For this, FEE_primary_beam_cuda.get_HDFBeam_normalisation should have already be run to setup the correct attributes in primary_beam.

If rotation == 1, rotate the beam gains from a instrument sky-locked theta, phi polarisation system into one aligned with Stokes parameters (for creating XX, XY, YX, YY beams that can be combined with Stokes I,Q,U,V to generate instrumental visibilities). For this, sin_para_angs,cos_para_angs must be allocated.

Parameters

num_components – [in] Number of azimuth and zenith angles
azs – [in] Array of azimuth values (radians)
zas – [in] Array of zenith angle values (radians)
latitude – [in] Latitude of the instrument (radians)
FEE_beam_zenith – [inout] An initialised RTS_MWA_FEE_beam_t containing MWA FEE spherical harmonic coeffs for a zenith pointing
FEE_beam – [inout] An initialised RTS_MWA_FEE_beam_t containing MWA FEE spherical harmonic coeffs for desired pointing
rotation – [in] 0=False, 1=True, rotate results by parallactic angle
scaling – [in] 0=False, 1=True, Normlise results to zenith
FEE_beam_gains – [in] Complex array to store outputs in (ordered by XX,XY,YX,YY, and then by az/za)

`FEE_primary_beam_cuda`¶

Previous topic

Next topic

This Page

FEE_primary_beam_cuda¶

`FEE_primary_beam_cuda`¶