MPI run error with CUDA

I installed PyFR 1.14.0 and ran the examples/euler_vortex_2d case with MPI, and I got an error like this:


What am I doing wrong?

How many GPUs do you have in your system?

Regards, Freddie.

Are there CUDA-enabled devices available on the node you are working on? And is CUDA available?

Try nvcc --version and nvidia-smi.

In this system I have only one A100 GPU.

If you only have a single GPU then you should only be running with a single MPI rank. When two ranks are launched the first rank will request the first GPU and the second rank will request the second GPU. However, if your system only has a single GPU then this will yield an invalid device error.
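For reference, a single-rank launch of this example might look like the following. This is a sketch based on the standard PyFR command line; the file names are those of the euler_vortex_2d example and may need adjusting for your setup:

```shell
# Convert the Gmsh mesh to PyFR's native format (one-time step)
pyfr import euler_vortex_2d.msh euler_vortex_2d.pyfrm

# Single-GPU run on the CUDA backend: one MPI rank, or no mpiexec at all
pyfr run -b cuda -p euler_vortex_2d.pyfrm euler_vortex_2d.ini

# Equivalent explicit single-rank launch
mpiexec -n 1 pyfr run -b cuda -p euler_vortex_2d.pyfrm euler_vortex_2d.ini
```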

Regards, Freddie.

However, in PyFR 1.13.0 I could run this example with 2 MPI ranks with CUDA. Are GPUs automatically assigned to ranks in the new version?

The behaviour of the CUDA backend was changed in the most recent release to bring it in line with that of the HIP and OpenCL backends. Running multiple ranks on the same GPU results in very poor performance.
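For completeness, if you do have two GPUs and want a two-rank run, the mesh must first be partitioned; a sketch of that workflow, again assuming the euler_vortex_2d file names:

```shell
# Split the mesh into two partitions, writing the result to the current directory
pyfr partition 2 euler_vortex_2d.pyfrm .

# Launch one rank per GPU; each rank picks the device matching its rank number
mpiexec -n 2 pyfr run -b cuda -p euler_vortex_2d.pyfrm euler_vortex_2d.ini
```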

Regards, Freddie.

Thank you very much!

Hi @luli,

Just a thought: in case you are trying to develop or validate MPI codes, you can use NVIDIA's Multi-Instance GPU (MIG) feature to split your single A100 into up to 7 "distinct" GPU instances, and then launch 7 MPI ranks.
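The "7 distinct GPUs" trick uses NVIDIA's Multi-Instance GPU (MIG) feature. A sketch of the setup via nvidia-smi, assuming an A100-40GB, where profile ID 19 corresponds to the smallest (1g.5gb) slice:

```shell
# Enable MIG mode on GPU 0 (requires root; may require a GPU reset)
sudo nvidia-smi -i 0 -mig 1

# Create seven 1g.5gb GPU instances and their compute instances (-C)
sudo nvidia-smi mig -cgi 19,19,19,19,19,19,19 -C

# List the resulting MIG devices and their UUIDs
nvidia-smi -L
```

Each MPI rank can then be pinned to one instance, e.g. by setting CUDA_VISIBLE_DEVICES to that instance's MIG UUID in a per-rank wrapper script.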

This way (if one has an A100) one can develop for exascale on a (10kg micro ATX) shoebox!