where Nphase stores N random phases that should be updated every few hundred time steps. So I referred to the turbulence plugin, where a random number generator and other functions are available. I tried to use the _set_external function to pass the random phases and update them accordingly. However, Nphase is just a vector with a few hundred numbers. According to the post PyFR boundary conditions for turbomachinery - #16 by fdw discussed before, broadcast-col should only be used with a matrix of shape (..., neles). For this situation, could you give some examples of broadcast? And what is the difference between broadcast, broadcast-col, broadcast-row, and ‘view’?
Also, when I store a matrix in the backend with intg.backend.matrix(shape, tags={'align'}), can I store a vector?
It should be two dimensional. This makes sense as matrix objects are always two dimensional. You can simply make the first dimension 1. The dim should be the number of columns in the matrix.
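As a minimal sketch of the advice above, assuming the phases are first generated on the host as a flat list: a length-N vector can be presented as a two-dimensional (1, N) array by wrapping it in a single row. The names nphase, as_row_matrix and shape are illustrative only and are not part of PyFR.

```python
import math
import random


def as_row_matrix(vec):
    """Wrap a flat vector as a 2D matrix with a single row, shape (1, N)."""
    return [list(vec)]


def shape(mat):
    """Return (nrows, ncols) of a rectangular list-of-lists matrix."""
    return (len(mat), len(mat[0]))


# Hypothetical number of random phases (an assumption for illustration)
nphase = 200
rng = random.Random(0)
phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(nphase)]

# First dimension is 1; second dimension is the number of columns
phase_mat = as_row_matrix(phases)
print(shape(phase_mat))
```

With a NumPy array the equivalent step would be phases.reshape(1, -1), and the resulting two-dimensional shape is what would be used as the matrix shape.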
Thanks, I think it is now working as expected. Another thing is that in the code I am restricting the tripping line to the wall inside a pipe. So essentially I need to calculate a spatial Gaussian function to limit it to the wall region, which requires the radius r and angle theta. For now I calculate it inside the mako kernel, which I feel wastes computational power since it is recalculated every time step. Ideally we should only calculate it once and offload it to the GPU. Do you have a better idea for doing this, using an external variable rather than passing a new variable to the kernel? Is this still possible?
Calculating each time is probably the most efficient approach on a GPU. The benefits of pre-calculating are likely marginal (and would require an extra external variable).
But the variables radius and angle depend only on the coordinates, so they won’t change during time integration. If the case is big, with a lot of grid points, I expect this calculation could be significant. As I understand it, reading an external variable without updating it won’t cost anything, no?
Also, to help me better understand broadcasting: in this case, assuming my external variable has a shape of (nvar, npts, neles) (similar to plocs), which broadcasting method should be used, broadcast-col?
The cost of reading in the physical locations (three values per point) is likely a rounding error in the grand scheme of things.
The main cost will likely be in terms of worse scaling. If all the elements with the additional term end up on the same rank then that rank will slow down (as those points have more work to do) and this, in turn, will slow down every rank in the simulation. This is where the real waste is: ranks which do not have any points sitting around doing nothing while they wait for the ranks which do have points to finish.
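The scaling argument above can be illustrated with a small worked example. This is only a sketch with made-up numbers: base and extra are hypothetical per-step costs, and the model simply assumes every rank waits for the slowest one at the end of each step.

```python
def step_time(work_per_rank):
    """Wall time of one synchronised step: all ranks wait for the slowest."""
    return max(work_per_rank)


def idle_fraction(work_per_rank):
    """Fraction of total rank-time spent waiting for the slowest rank."""
    t = step_time(work_per_rank)
    busy = sum(work_per_rank)
    return 1.0 - busy / (t * len(work_per_rank))


base = 1.0   # hypothetical cost of one step on one rank
extra = 0.2  # hypothetical cost of the additional term

# Extra work spread evenly over four ranks
balanced = [base + extra / 4] * 4
# Extra work concentrated on rank 0; the others have none of these points
imbalanced = [base + extra] + [base] * 3

print(step_time(balanced), step_time(imbalanced))
print(idle_fraction(balanced), idle_fraction(imbalanced))
```

The total work is the same in both cases, but in the imbalanced case the step takes as long as the loaded rank needs, and the other three ranks spend part of every step idle.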
For broadcasting, broadcast-col is when a value is broadcast down a column so all npts see the same value(s). Similarly, broadcast-row is when a value is broadcast along a row so all neles see the same value(s). If you want everyone to see the exact same value(s) then you use broadcast on its own without a suffix.
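To make the three cases concrete, here is a conceptual sketch in plain Python. It is not PyFR code and does not reflect how the backends implement broadcasting; it only mimics which value each kernel instance would see, assuming rows index the npts points and columns index the neles elements. The array names and values are invented for illustration.

```python
npts, neles = 3, 4

# One value per element, shape (1, neles): shared by every point in a column
per_element = [10, 20, 30, 40]
# One value per point, shape (npts, 1): shared by every element in a row
per_point = [1, 2, 3]
# A single value shared by every kernel instance
shared = 7


def seen_by(i, j, mode):
    """Value seen by the kernel instance at point i of element j."""
    if mode == 'broadcast-col':
        return per_element[j]  # broadcast down column j
    if mode == 'broadcast-row':
        return per_point[i]    # broadcast along row i
    if mode == 'broadcast':
        return shared          # identical everywhere
    raise ValueError(mode)


for mode in ('broadcast-col', 'broadcast-row', 'broadcast'):
    grid = [[seen_by(i, j, mode) for j in range(neles)] for i in range(npts)]
    print(mode, grid)
```

In the broadcast-col grid every row is identical (all points of an element agree), in the broadcast-row grid every column is identical (all elements agree at a given point), and in the plain broadcast grid every entry is the same.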
One additional question regarding broadcasting: I have a view matrix with shape (1, ninterfpts) and I would like to broadcast it to (nvars, ninterfpts). Is in broadcast-row fpdtype_t[${str(nvars)}] the correct syntax? Or is there another one?
Best
This is not something that should require broadcasting. If your kernel runs at each ninterfpts then you don’t need to broadcast; rather just ensure that you always access element [0] of the array.
Broadcasting is when you want different instances/loop iterations of a kernel to share a value from an array. Here, all you need to do is just ensure that you only access a single element.
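A rough sketch of this in plain Python, assuming a kernel body that runs once per interface point: the (1, ninterfpts) array hands each instance a one-entry array, and the body reads entry [0] for every variable. The names scale, u and kernel_body are hypothetical and only stand in for the real kernel arguments.

```python
nvars, ninterfpts = 3, 5

# Hypothetical array of shape (1, ninterfpts): one entry per interface point
scale = [[0.5, 1.0, 1.5, 2.0, 2.5]]
# Hypothetical solution array of shape (nvars, ninterfpts)
u = [[float(v + 1)] * ninterfpts for v in range(nvars)]


def kernel_body(u_pt, scale_pt):
    """Work done at one interface point.

    u_pt has nvars entries; scale_pt has a single entry, always read as [0].
    """
    return [u_pt[v] * scale_pt[0] for v in range(nvars)]


out = [[0.0] * ninterfpts for _ in range(nvars)]
for j in range(ninterfpts):
    # Each instance receives its own column of u and its own scale entry
    res = kernel_body([u[v][j] for v in range(nvars)], [scale[0][j]])
    for v in range(nvars):
        out[v][j] = res[v]

print(out)
```

No value is shared between instances here, so nothing needs to be broadcast: the same [0] entry is simply reused for each of the nvars variables at that point.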