Smats for linearised elements

For navstokes with backend::cuda, I have a question about smats for the kernels {gradcoru, gradcorulin} and {tflux, tfluxlin}.

For the curved kernels {gradcoru, tflux}, precalculated constant smats data gets passed into the kernels at runtime.

But for the linear kernels {gradcorulin, tfluxlin}, it looks like the constant geometric data (verts, upts) is passed into the kernels, and smats gets recalculated each time the kernels are called.

Is there a reason for not precalculating smats for the linear/linearised elements, or am I missing something?

The idea with the linear element kernels is that, if an element is linear, then you don’t need to read in all the smats terms. Instead, you can save bandwidth by reading in only the vertices and then reconstructing the smats from them. Given that we are bandwidth bound, this leads to a win.

The win will depend on the element type as, for example, a hex has more corner points than a tet.
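For a quad in 2D, a minimal NumPy sketch of the reconstruction could look like the following. The corner ordering, the [-1, 1] reference interval, and the smats = det(J)·inv(J) convention are illustrative assumptions here, not necessarily PyFR's exact conventions:

```python
import numpy as np

# Sketch: reconstruct the 2x2 smats and djac at each solution point of a
# *linear* (bilinear) quad from its four vertices.
def quad_smats_from_verts(verts, upts):
    # verts: (4, 2) physical corner coordinates, ordered to match the
    #        reference corners (-1,-1), (1,-1), (-1,1), (1,1)
    # upts:  (n, 2) reference-space solution points, shared by every quad
    out = np.empty((len(upts), 5))

    for i, (xi, eta) in enumerate(upts):
        # Derivatives of the four bilinear shape functions at (xi, eta)
        dn_dxi  = 0.25*np.array([-(1 - eta), (1 - eta), -(1 + eta), (1 + eta)])
        dn_deta = 0.25*np.array([-(1 - xi), -(1 + xi), (1 - xi), (1 + xi)])

        # Jacobian J[i, j] = dx_i/dxi_j of the reference-to-physical mapping
        jac = np.stack([dn_dxi @ verts, dn_deta @ verts], axis=1)

        djac = np.linalg.det(jac)

        # Scaled metric terms: det(J)*inv(J), i.e. the cofactor matrix in 2D
        smats = np.array([[ jac[1, 1], -jac[0, 1]],
                          [-jac[1, 0],  jac[0, 0]]])

        out[i] = [*smats.ravel(), djac]

    return out
```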


Thanks @WillT. This came up when I was looking at the kernels for quads. Since the end result of the runtime recalculation of the smats is just 5 floats (i.e. smats[2][2] and djac), it seems like in this case it would use less bandwidth to pass 5 pre-calculated floats rather than all the point and vertex data?

I’ll take a closer look and try to weigh both options.
Nigel

Consider a p = 4 linear quad. We have 25 solution points, so the smats requires loading 25*4 = 100 items. Now, instead consider loading four vertices (8 items) plus 25 solution points (50 items) and using them to compute the smats. Here we only need to load a total of 58 items per element. However, as every quad has the same set of solution points, this cost is amortised out, and so for all intents and purposes we are down to 8 items per element rather than 100.
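To put the same counting in code (a rough sketch of item counts rather than measured bandwidth; the hex figures assume a tensor-product solution point set):

```python
# Illustrative per-element item counts: loading precomputed smats vs.
# reconstructing them from vertices plus the shared solution points.
def items(nupts, ndims, nverts):
    precalc   = nupts*ndims*ndims            # smats entries per element
    recompute = nverts*ndims + nupts*ndims   # vertices + solution points
    amortised = nverts*ndims                 # upts shared by all elements
    return precalc, recompute, amortised

print(items(nupts=25, ndims=2, nverts=4))   # p = 4 quad: (100, 58, 8)
print(items(nupts=125, ndims=3, nverts=8))  # p = 4 hex:  (1125, 399, 24)
```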

Regards, Freddie.


Thanks @fdw, I failed to notice that all the solution points (upts) for an element could be sitting in L1. Very neat!