The actual amount of computation is still very small: for each entry you only perform a handful of FMAs, and this shows up as a low arithmetic intensity. (If you were computing several sines or cosines per point it might be a different matter.) On the memory side of things, however, you have to read in quite a few values and write out quite a lot.
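To put some rough numbers on this, here is a back-of-the-envelope sketch; the per-point figures are made up for illustration (not taken from an actual PyFR kernel), and the peaks are nominal V100 numbers:

```python
# Back-of-the-envelope roofline check for a simple per-point FMA kernel.
# The per-point figures are illustrative, not from a real PyFR kernel.
bytes_per_point = 8 * (2 + 1)    # say: read two doubles, write one
flops_per_point = 2              # one FMA counts as 2 FLOPs

ai = flops_per_point / bytes_per_point   # arithmetic intensity [FLOP/byte]

# Nominal V100 peaks: ~7.8 TFLOP/s FP64 and ~900 GB/s HBM2
machine_balance = 7.8e12 / 900e9

print(f"arithmetic intensity ~= {ai:.2f} FLOP/byte")
print(f"machine balance      ~= {machine_balance:.1f} FLOP/byte")
# ai is far below the machine balance, so such a kernel sits well inside
# the bandwidth-bound region of the roofline.
```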
We ensure that the memory accesses are coalesced to get the most out of each read, but the reads are still slow and there is not enough work to occupy the threads between them. If you look at the stall reports, you'll see that threads are probably stalled a lot of the time waiting on global memory.
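To give a feel for why coalescing matters, here is a toy sector count. The 32-byte sector size is the usual figure for recent NVIDIA GPUs, but the access patterns are purely illustrative, not those of a PyFR kernel:

```python
# How many 32-byte memory sectors does one warp's load touch?
threads, elem_size = 32, 8   # a warp of 32 threads reading 8-byte doubles

coalesced = {(t * elem_size) // 32 for t in range(threads)}        # consecutive elements
strided   = {(t * 128 * elem_size) // 32 for t in range(threads)}  # stride of 128 elements

print(len(coalesced))  # 8 sectors: every byte of the 256 fetched is used
print(len(strided))    # 32 sectors: 1024 bytes fetched to deliver the same 256 bytes
```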
Maybe have a look at this paper for more info; you can do quite a lot of computation while remaining bandwidth bound. Also see this insightful paper on the V100: https://arxiv.org/pdf/1804.06826.pdf
As a rule, though, kernels in PyFR are generally limited by bandwidth rather than FLOPs; @fdw might like to add more nuance to this.