MPI run error with CUDA

I installed PyFR 1.14.0 and ran the examples/euler_vortex_2d case with MPI, and I got an error like this:


What am I doing wrong?

How many GPUs do you have in your system?

Regards, Freddie.

Are there CUDA-enabled devices available on the node you are working on? And is CUDA available?

Try nvcc --version and nvidia-smi.

In this system I have only one A100 GPU.

If you only have a single GPU then you should only be running with a single MPI rank. When two ranks are launched the first rank will request the first GPU and the second rank will request the second GPU. However, if your system only has a single GPU then this will yield an invalid device error.
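For reference, a single-rank launch of this example might look like the following. This is a sketch based on the standard PyFR command line; the file names are those of the euler_vortex_2d example and may need adjusting for your setup:

```shell
# Convert the Gmsh mesh to PyFR's native format (one-time step)
pyfr import euler_vortex_2d.msh euler_vortex_2d.pyfrm

# Single-GPU run on the CUDA backend: one MPI rank, or no mpiexec at all
pyfr run -b cuda -p euler_vortex_2d.pyfrm euler_vortex_2d.ini

# Equivalent explicit single-rank launch
mpiexec -n 1 pyfr run -b cuda -p euler_vortex_2d.pyfrm euler_vortex_2d.ini
```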

Regards, Freddie.

However, in PyFR 1.13.0 I could run this example with 2 MPI ranks with CUDA. Are GPUs automatically assigned to ranks in the new version?

The behaviour of the CUDA backend was changed in the most recent release to bring it in line with that of the HIP and OpenCL backends. Running multiple ranks on the same GPU results in very poor performance.
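For completeness, if you do have two GPUs and want a two-rank run, the mesh must first be partitioned; a sketch of that workflow, again assuming the euler_vortex_2d file names:

```shell
# Split the mesh into two partitions, writing the result to the current directory
pyfr partition 2 euler_vortex_2d.pyfrm .

# Launch one rank per GPU; each rank picks the device matching its rank number
mpiexec -n 2 pyfr run -b cuda -p euler_vortex_2d.pyfrm euler_vortex_2d.ini
```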

Regards, Freddie.

Thank you very much!

Hi @luli,

Just a thought: in case you are trying to develop or validate MPI codes, you can use NVIDIA's Multi-Instance GPU (MIG) feature to split your single A100 into up to 7 "distinct" GPU instances, and then launch 7 MPI ranks.
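The "7 distinct GPUs" trick uses NVIDIA's Multi-Instance GPU (MIG) feature. A sketch of the setup via nvidia-smi, assuming an A100-40GB, where profile ID 19 corresponds to the smallest (1g.5gb) slice:

```shell
# Enable MIG mode on GPU 0 (requires root; may require a GPU reset)
sudo nvidia-smi -i 0 -mig 1

# Create seven 1g.5gb GPU instances and their compute instances (-C)
sudo nvidia-smi mig -cgi 19,19,19,19,19,19,19 -C

# List the resulting MIG devices and their UUIDs
nvidia-smi -L
```

Each MPI rank can then be pinned to one instance, e.g. by setting CUDA_VISIBLE_DEVICES to that instance's MIG UUID in a per-rank wrapper script.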

This way (if one has an A100) one can develop for exascale on a (10kg micro ATX) shoebox!