Hello, thank you very much for your reply.
I partitioned the mesh first and then tried `mpiexec -n 6 pyfr run -b cuda -p x.pyfrm x.ini`, but it raises an error:

```
    raise RuntimeError(f'Mesh has {nparts} partitions but running '
RuntimeError: Mesh has 6 partitions but running with 1 MPI ranks
```
I also tried `mpirun -n 6` and got the same error. I think it is an mpi4py problem, because I initially installed mpi4py via conda, but I have no clue how to install it via pip (cf. this article).
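From what I understand of the mpi4py documentation, the pip route would look something like the sketch below; the `mpicc` path is just a placeholder for wherever the MPI compiler wrapper actually lives on my machine, so please correct me if this is wrong:

```
# Remove the conda build first so it cannot shadow the pip one
conda remove mpi4py

# Build mpi4py from source against a specific MPI installation
# (/path/to/mpicc is a placeholder, not my real path)
env MPICC=/path/to/mpicc python -m pip install --no-cache-dir --no-binary=mpi4py mpi4py
```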
Finally, I have a couple of questions I would like to ask you; thank you in advance for your advice.

Question 1: I think that for GPU computing one CPU core per GPU is enough, and that more CPU cores only add extra communication. `mpiexec -n 6` or `mpirun -n 6` launches 6 processes, presumably one per CPU core, so does using 6 GPU cards require at least 6 CPU cores?
Question 2: I tested mpi4py with the following code and did not get the expected results, so I concluded it was an installation problem with mpi4py.

```python
from mpi4py import MPI

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    print(f"rank {comm.rank} of {comm.size}")
    comm.barrier()
```
Running it under `mpiexec` with 3 ranks prints:

```
rank 0 of 1
rank 0 of 1
rank 0 of 1
```

rather than the expected `rank 0 of 3`, `rank 1 of 3` and `rank 2 of 3`; every process believes it is alone in a communicator of size 1.
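For what it is worth, I assume the MPI library that mpi4py was actually built against can be checked with the standard `MPI.Get_library_version()` call and compared with the MPI that provides the launcher:

```
# Which MPI is mpi4py linked against?
python -c "from mpi4py import MPI; print(MPI.Get_library_version())"

# Which MPI does the launcher come from?
mpiexec --version
```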
I also tried disabling the MPI environment that ships with the CUDA/HPC SDK suite and building mpi4py directly against the default MPI, but the build fails:

```
nvc-Error-Unknown switch: -Wno-unused-result
nvc-Error-Unknown switch: -fwrapv
warning: build_ext: command '/opt/nvidia/hpc_sdk/Linux_x86_64/22.11/compilers/bin/nvc' failed with exit code 1
warning: build_ext: building optional extension "mpi4py.dl" failed
checking for MPI compile and link ...
```
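It looks as though the build is passing GCC-style flags that `nvc` does not understand. I have read that the NVIDIA compilers accept a `-noswitcherror` flag that downgrades unknown switches to warnings, so perhaps something like this would get past the error, though I have not verified it:

```
# Untested: ask nvc to warn about (rather than reject) the
# GCC-style flags that the Python build system injects
env CFLAGS=-noswitcherror python -m pip install --no-cache-dir --no-binary=mpi4py mpi4py
```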
Unfortunately, I could not find the `mpi4py.dl` file anywhere:

```
find / -name mpi4py.dl
find: ‘/run/user/1001/doc’: Permission denied
find: ‘/run/user/1001/gvfs’: Permission denied
```
It is worth noting that I can compute the small-memory examples on a single A30 GPU with no problem, just as I did on the A100.
Regards, wgbb