What would be a good explanation of PyFR having a good utilization of GPU acceleration?

Hello all,

I am trying to find a proper justification for the statement that “PyFR makes good use of GPU acceleration technology”, or a paper I could cite for it. I understand that memory transfer between the CPU and GPU is one of the major slowdowns in GPU computing.

How would you say PyFR, as a modern code, uses GPU resources better than legacy codes that were later modified to exploit GPU acceleration?

In the 2014 paper “PyFR: An open source framework for solving advection-diffusion…”, it was noted that GEMM libraries are optimized for large square matrices, whereas the constant operator matrices in PyFR are small and square and the state matrices are short and fat. Has this been improved since?

Junting Chen

Hi Junting

I am trying to find a proper justification for the statement that “PyFR makes good use of GPU acceleration technology”, or a paper I could cite for it. I understand that memory transfer between the CPU and GPU is one of the major slowdowns in GPU computing.

Here is an example of a paper that looks at performance aspects:

https://ieeexplore.ieee.org/document/7876999

In the 2014 paper “PyFR: An open source framework for solving advection-diffusion…”, it was noted that GEMM libraries are optimized for large square matrices, whereas the constant operator matrices in PyFR are small and square and the state matrices are short and fat. Has this been improved since?
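For context, the shapes in question can be sketched with NumPy: a small, square constant operator matrix is applied to a "short and fat" state matrix whose column count scales with the number of elements in the mesh. The dimensions below are purely illustrative, not PyFR's actual sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

n_pts   = 15       # points per element (illustrative)
n_elems = 100000   # elements in the mesh (illustrative)

op    = rng.standard_normal((n_pts, n_pts))    # small, square, constant operator
state = rng.standard_normal((n_pts, n_elems))  # "short and fat" state matrix

out = op @ state   # a single GEMM applies the operator to every element at once
print(out.shape)   # (15, 100000)
```

Batching all elements into one wide state matrix is what turns the operator application into a single large GEMM call rather than many tiny ones.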

GEMM can actually perform well in a range of scenarios. However, we have also developed technology for smaller/sparse matrices:

https://www.sciencedirect.com/science/article/pii/S0010465515004506
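The core idea behind such bespoke kernels can be sketched as follows: since the operator matrix is small, constant, and often sparse, one can generate a fully unrolled kernel at runtime in which every matrix entry is baked in as a literal and every zero entry is skipped. This is a minimal, illustrative sketch only (the function name and C-like output format are hypothetical, not PyFR's actual code generator):

```python
import numpy as np

def gen_smallmm_source(A, name="smallmm"):
    """Generate C-like source for y = A @ x, where A is a small constant
    matrix: loops are fully unrolled, entries are embedded as literals,
    and zero entries are elided entirely (illustrative sketch)."""
    m, n = A.shape
    lines = [f"void {name}(const double *x, double *y) {{"]
    for i in range(m):
        terms = [f"{float(A[i, j])}*x[{j}]" for j in range(n) if A[i, j] != 0.0]
        lines.append(f"    y[{i}] = {' + '.join(terms) if terms else '0.0'};")
    lines.append("}")
    return "\n".join(lines)

# A small, sparse constant operator matrix (illustrative values)
A = np.array([[1.0, 0.0, -1.0],
              [0.0, 2.0,  0.0]])
print(gen_smallmm_source(A))
```

Because the matrix entries become compile-time constants and zeros cost nothing, a generated kernel like this can beat a general-purpose GEMM in exactly the small/sparse regime where library GEMMs perform worst.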

Peter