What would be a good explanation of PyFR having a good utilization of GPU acceleration?

Hello all,

I am trying to find a proper justification for the statement that “PyFR makes good use of GPU acceleration technology”, or a paper I could cite for it. I understand that memory transfer between the CPU and GPU is one of the major slowdowns in GPU computing.

How would you say PyFR, as a modern code, uses GPU resources better than legacy codes that were later modified to exploit GPU acceleration?

In the 2014 paper “PyFR: An open source framework for solving advection-diffusion…”, it was noted that GEMM libraries are optimized for large square matrices, whereas the constant operator matrices in PyFR are small and square and the state matrices are short and fat. Has this been improved since?

Junting Chen

Hi Junting

I am trying to find a proper justification for the statement that “PyFR makes good use of GPU acceleration technology”, or a paper I could cite for it. I understand that memory transfer between the CPU and GPU is one of the major slowdowns in GPU computing.

Here is an example of a paper that looks at performance aspects:

https://ieeexplore.ieee.org/document/7876999

In the 2014 paper “PyFR: An open source framework for solving advection-diffusion…”, it was noted that GEMM libraries are optimized for large square matrices, whereas the constant operator matrices in PyFR are small and square and the state matrices are short and fat. Has this been improved since?
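For context, the shapes in question can be sketched with NumPy: a small, square constant operator matrix is applied to a "short and fat" state matrix whose column count scales with the number of elements in the mesh. The dimensions below are purely illustrative, not PyFR's actual sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

n_pts   = 15       # points per element (illustrative)
n_elems = 100000   # elements in the mesh (illustrative)

op    = rng.standard_normal((n_pts, n_pts))    # small, square, constant operator
state = rng.standard_normal((n_pts, n_elems))  # "short and fat" state matrix

out = op @ state   # a single GEMM applies the operator to every element at once
print(out.shape)   # (15, 100000)
```

Batching all elements into one wide state matrix is what turns the operator application into a single large GEMM call rather than many tiny ones.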

GEMM can actually perform well in a range of scenarios. However, we have also developed technology for smaller/sparse matrices:

https://www.sciencedirect.com/science/article/pii/S0010465515004506
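The core idea behind such bespoke kernels can be sketched as follows: since the operator matrix is small, constant, and often sparse, one can generate a fully unrolled kernel at runtime in which every matrix entry is baked in as a literal and every zero entry is skipped. This is a minimal, illustrative sketch only (the function name and C-like output format are hypothetical, not PyFR's actual code generator):

```python
import numpy as np

def gen_smallmm_source(A, name="smallmm"):
    """Generate C-like source for y = A @ x, where A is a small constant
    matrix: loops are fully unrolled, entries are embedded as literals,
    and zero entries are elided entirely (illustrative sketch)."""
    m, n = A.shape
    lines = [f"void {name}(const double *x, double *y) {{"]
    for i in range(m):
        terms = [f"{float(A[i, j])}*x[{j}]" for j in range(n) if A[i, j] != 0.0]
        lines.append(f"    y[{i}] = {' + '.join(terms) if terms else '0.0'};")
    lines.append("}")
    return "\n".join(lines)

# A small, sparse constant operator matrix (illustrative values)
A = np.array([[1.0, 0.0, -1.0],
              [0.0, 2.0,  0.0]])
print(gen_smallmm_source(A))
```

Because the matrix entries become compile-time constants and zeros cost nothing, a generated kernel like this can beat a general-purpose GEMM in exactly the small/sparse regime where library GEMMs perform worst.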

Peter