PyFR hanging while running cases

Frankx9 · 5 September 2024 11:41

Hello,

Recently I have been having issues with PyFR hanging when running on many GPUs across nodes (240 GPUs on 30 nodes), whereas this does not happen with a lower node count. I cannot work out where the hang occurs.

The plugins enabled are: turbulence, fluid force, nancheck, ascent

Any ideas on how to troubleshoot the issue?

The PyFR version is 2.0.2 and the configuration file with the backend parameters is:

[backend]
precision = double
rank-allocator = linear

[backend-hip]
device-id = local-rank
mpi-type = hip-aware

Best regards,

fdw · 5 September 2024 12:35

Does it hang if you disable HIP-aware MPI?

Regards, Freddie.
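
For reference, a minimal sketch of the test suggested above, assuming the rest of the configuration file is left unchanged. It switches `mpi-type` from `hip-aware` to `standard`, so that PyFR stages exchanges through host memory instead of passing device pointers to MPI. If the hang disappears with this setting, the HIP-aware MPI path is the likely culprit.

```ini
[backend]
precision = double
rank-allocator = linear

[backend-hip]
device-id = local-rank
; was: hip-aware
mpi-type = standard
```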

Frankx9 · 5 September 2024 12:47

I’ll let you know.

Best regards
