Broadcasting terms to kernels

Zhenyang · 1 July 2024 18:42

Hi,

I want to add a forcing term (line tripping, as in equation 2.2 of the paper "Turbulent boundary layers at moderate Reynolds numbers: inflow length and tripping effects", Journal of Fluid Mechanics) such as

sum( [sin(j * ploc + phs) for phs in Nphase])

where Nphase stores N random phases that should be updated every few hundred time steps. I looked at the turbulence plugin, where a random number generator and other functions are available, and tried to use the _set_external function to pass the random phases and update them accordingly. However, Nphase is just a vector with a few hundred numbers. According to the earlier post "PyFR boundary conditions for turbomachinery - #16 by fdw", broadcast-col should only be used with a matrix of shape (..., neles). For this situation, could you give some examples of broadcast? And what is the difference between broadcast, broadcast-col, broadcast-row, and 'view'?
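For reference, the arithmetic of the forcing described above can be sketched in plain Python as follows. This is only an illustration, not PyFR code: the mode number `j`, the refresh interval, the sizes and the helper names are all assumptions, and a real implementation would evaluate the sum inside the kernel.

```python
import math
import random

def new_phases(rng, n):
    # Draw n random phases from the interval [0, 2*pi]
    return [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]

def forcing(ploc, phases, j=1.0):
    # sum(sin(j*ploc + phs)) over every stored phase
    return sum(math.sin(j * ploc + phs) for phs in phases)

rng = random.Random(42)       # fixed seed so runs are repeatable
nphase, refresh = 200, 300    # hypothetical sizes
phases = new_phases(rng, nphase)
nrefresh = 0

for step in range(1, 1001):
    if step % refresh == 0:
        # Regenerate the phases every `refresh` steps
        phases = new_phases(rng, nphase)
        nrefresh += 1
    f = forcing(0.25, phases)
```

The point of the sketch is that every evaluation of `forcing` needs the whole `phases` vector, regardless of which point or element it is evaluated at.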

Also, when I store a matrix on the backend with intg.backend.matrix(shape, tags={'align'}), can I store a vector?

Thanks in advance.

Regards,
Zhenyang

fdw · 2 July 2024 03:09

In your case you want a regular broadcast, which gives each instance of the kernel access to the entire array.

Regards, Freddie.

Zhenyang · 2 July 2024 08:08

I tried with broadcast but it gave me an error:

raise ValueError('Broadcasts must have two dimensions')

from regenerator.py. My ncdim is 1, which triggers the error. What should dim be in the broadcast expression:

eles._set_external('hphs', f'in broadcast fpdtype_t[{dim}]', value=hphs)

Regards,
Zhenyang

fdw · 3 July 2024 03:53

It should be two dimensional. This makes sense as matrix objects are always two dimensional. You can simply make the first dimension 1. The dim should be the number of columns in the matrix.
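As a rough sketch of what this means for a phase vector of length N: wrap it in a single row so that it has shape (1, N), and give both dimensions in the argument spec. The snippet below is plain Python with example values; the exact spec string `fpdtype_t[1][N]` is an assumption based on the advice above rather than something confirmed in this thread, so it is worth checking against the turbulence plugin.

```python
hphs = [0.1, 0.7, 1.3, 2.9]   # flat vector of phases (example values)

# Matrices are two dimensional, so store the vector as a single row
hphs_2d = [hphs]
nrows, ncols = len(hphs_2d), len(hphs_2d[0])

# Assumed spec: first dimension 1, second the number of columns
spec = f'in broadcast fpdtype_t[{nrows}][{ncols}]'
print(spec)   # in broadcast fpdtype_t[1][4]
```

The resulting string would take the place of the spec in the `eles._set_external('hphs', ..., value=hphs)` call from the previous post, with `value` reshaped to (1, N) to match.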

Regards, Freddie.

Zhenyang · 4 July 2024 09:43

Thanks, I think it is now working as expected. Another thing: in my code I restrict the tripping line to the wall of a pipe. So essentially I need to calculate a spatial Gaussian function to limit the forcing to the near-wall region, which requires the radius r and angle theta. For now I calculate it inside the Mako kernel, which I feel wastes computational power since it is recalculated every time step. Ideally it would be calculated only once and then offloaded to the GPU. Do you have a better idea for doing this, not by passing a new variable to the kernel but by using an external variable? Is this still possible?
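To make the question concrete, the per-point work amounts to something like the following. This is a pure-Python sketch: the Gaussian form, the pipe radius `R`, the width `sigma` and the function name are illustrative assumptions, not the contents of the actual kernel.

```python
import math

def wall_mask(x, y, R=1.0, sigma=0.05):
    # Cylindrical coordinates of the point in the pipe cross-section
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    # Gaussian centred on the wall radius R
    g = math.exp(-((r - R) / sigma) ** 2)
    return r, theta, g

r, theta, g_wall = wall_mask(1.0, 0.0)   # point on the wall
_, _, g_axis = wall_mask(0.0, 0.0)       # point on the pipe axis
```

The mask is one at the wall and decays to essentially zero towards the axis, and it depends only on the coordinates, not on time.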

Regards,
Zhenyang

fdw · 4 July 2024 09:57

Calculating each time is probably the most efficient approach on a GPU. The benefits of pre-calculating are likely marginal (and would require an extra external variable).

Regards, Freddie.

Zhenyang · 4 July 2024 11:05

But the radius and angle depend only on the coordinates, so they do not change during time integration. If the case is large, with a lot of grid points, I would expect this calculation to be significant. As I understand it, reading an external variable that is never updated costs nothing, no?

Also, to help me understand broadcasting better: if my external variable has shape (nvar, npts, neles) (similar to plocs), which broadcasting method should be used? broadcast-col?

Regards,
Zhenyang

fdw · 5 July 2024 04:00

The cost of reading in the physical locations (three values per point) is likely a rounding error in the grand scheme of things.

The main cost will likely be in terms of worse scaling. If all the elements with the additional term end up on the same rank then that rank will slow down (as those points have more work to do) and this, in turn, will slow down every rank in the simulation. This is where the real waste is: ranks which do not have any points sitting around doing nothing while they wait for the ranks which do have points to finish.
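The scaling argument can be put into numbers with a toy example. The costs below are made up purely for illustration; the only thing the sketch relies on is that every rank has to wait for the slowest one at each step.

```python
# Hypothetical per-step cost on four ranks, in arbitrary units
base = [1.0, 1.0, 1.0, 1.0]
extra = [0.0, 0.0, 0.3, 0.0]   # all forced points land on rank 2

cost = [b + e for b, e in zip(base, extra)]

# Every rank waits for the slowest one at each step
step_time = max(cost)
idle = [step_time - c for c in cost]

# Fraction of the available rank-time spent doing useful work
efficiency = sum(cost) / (len(cost) * step_time)
```

Here three of the four ranks sit idle for roughly a quarter of each step, even though only one rank carries any extra work.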

For broadcasting, broadcast-col is when a value is broadcast down a column so all npts see the same value(s). Similarly, broadcast-row is when a value is broadcast along a row so all neles see the same value(s). If you want everyone to see the exact same value(s) then you use broadcast on its own without a suffix.
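A toy model of the three cases, in plain Python, may help. It is a conceptual sketch only and not how PyFR generates code: rows stand for points, columns for elements, and the array names and values are made up.

```python
npts, neles = 3, 2   # rows are points, columns are elements

# One value per element: every point in a column sees the same entry
per_ele = [10.0, 20.0]
# One value per point: every element in a row sees the same entry
per_pt = [1.0, 2.0, 3.0]
# A small table that every kernel instance may read in full
shared = [[0.5, 0.25]]

def seen_by(i, j):
    # What the instance at point i of element j can read
    return {
        'broadcast-col': per_ele[j],
        'broadcast-row': per_pt[i],
        'broadcast': shared,
    }

views = {(i, j): seen_by(i, j) for i in range(npts) for j in range(neles)}
```

In this model the `broadcast-col` value changes only with the element index, the `broadcast-row` value changes only with the point index, and the plain `broadcast` array is identical everywhere.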

Regards, Freddie.

Frankx9 · 4 March 2025 15:55

Hi @fdw,

One additional question regarding broadcasting: I have a view matrix of shape (1, ninterfpts) and I would like to broadcast it to (nvars, ninterfpts). Is in broadcast-row fpdtype_t[${str(nvars)}] the correct syntax, or is there another one?

Best

fdw · 4 March 2025 17:03

Views cannot be broadcast.

Regards, Freddie.

Frankx9 · 4 March 2025 17:05

And constant matrices?

fdw · 4 March 2025 22:12

Constant matrices support all three types of broadcasting.

Regards, Freddie.

Frankx9 · 5 March 2025 06:11

So would broadcast-row, starting from a (1, ninterfpts) matrix, broadcast it to (nvars, ninterfpts) with in broadcast-row fpdtype_t[${str(nvars)}]?

Best

fdw · 5 March 2025 12:48

This is not something that should require broadcasting. If your kernel runs once for each of the ninterfpts points then you don’t need to broadcast; rather, just ensure that you always access element [0] of the array.

Broadcasting is when you want different instances/loop iterations of a kernel to share a value from an array. Here, all you need to do is just ensure that you only access a single element.
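In plain Python the idea looks roughly like this. It is an illustrative sketch with made-up names and values: the kernel body loops over the variables itself and always reads row 0 of the (1, ninterfpts) array.

```python
nvars, ninterfpts = 3, 4

coeff = [[2.0, 3.0, 4.0, 5.0]]                    # shape (1, ninterfpts)
u = [[1.0] * ninterfpts for _ in range(nvars)]    # shape (nvars, ninterfpts)
out = [[0.0] * ninterfpts for _ in range(nvars)]

for p in range(ninterfpts):      # one kernel instance per interface point
    for v in range(nvars):       # loop over variables inside the kernel
        # Always read row 0; no copy to nvars rows is needed
        out[v][p] = coeff[0][p] * u[v][p]
```

Each point still gets its own coefficient, and all variables at that point reuse it without the array ever being expanded to (nvars, ninterfpts).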

Regards, Freddie.
