GPU parallelization error: MPI ranks not equal to number of GPUs

Thank you very much for your explanation, which is very clear and intuitive. I really learned a lot.
However, running mpiexec -n 2 pyfr run -b cuda … still gives the error: Mesh has two partitions but running with 1 MPI rank.
I still don’t have a clue what is going wrong. Any suggestions would be appreciated.
Best Regards.

Can you try a simple test of mpi4py, i.e.:

from mpi4py import MPI

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    print(f"rank {comm.rank} of {comm.size}")
    comm.barrier()

launched with:

$ mpiexec -n 4 python test_script.py

where test_script.py is the path to the Python script above.
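
If the ranks do not come out as expected, it can also be worth checking which MPI library mpi4py was actually built against; as a minimal sketch:

from mpi4py import MPI
import mpi4py

# build-time configuration: which MPI compiler wrapper mpi4py was compiled with
print(mpi4py.get_config())
# identification string of the MPI library loaded at runtime
print(MPI.Get_library_version())

If this reports a different MPI implementation from the one that provides your mpiexec, the launched processes will not join the same communicator.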

Hello, everyone!

If you encounter the same error as I did:
“RuntimeError: Mesh has 2 partitions but running with 1 MPI ranks,”
please make sure to check your setup using WillT’s example mpi4py program.

mpiexec -n 2 python your_test_file_name.py

If you end up with output similar to mine, i.e.:

rank 0 of 1
rank 0 of 1

then you can be fairly confident that the issue lies with your mpi4py installation: both processes report rank 0 of 1, which means each one started its own single-rank MPI world instead of joining the two-rank run that mpiexec launched, typically because mpi4py was built against a different MPI implementation.
I can offer some troubleshooting ideas here.
In my case, I had installed mpi4py with conda install mpi4py, and although it reported a successful installation, the package did not actually work with my MPI. Uninstall it and reinstall it with pip install mpi4py; if you run into errors during the build, resolve those first.
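
As a rough sketch of that reinstall (assuming a conda environment, and that an MPI compiler wrapper such as mpicc is available for the pip build):

$ conda remove mpi4py
$ pip install mpi4py

pip will typically build mpi4py from source against the MPI it finds on your PATH, so make sure that is the same MPI whose mpiexec you use to launch PyFR.
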
Once mpi4py has been reinstalled successfully, run the test script above again. If the output is:

rank 1 of 2
rank 0 of 2

Congratulations, you’ve successfully resolved the initial issue.
Enjoy PyFR on multiple MPI ranks!
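
For reference, a multi-rank PyFR launch then looks something like this, with mesh.pyfrm and config.ini standing in for your own mesh and configuration files:

$ mpiexec -n 2 pyfr run -b cuda mesh.pyfrm config.ini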

By the way, I previously encountered version issues with h5py, which also turned out to be related to using conda install. I use Conda for convenience, but it has indeed caused me some headaches at times.
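
If you suspect a similar mismatch with h5py, a quick way to see which h5py and HDF5 versions your environment actually picked up is, for example:

import h5py

# summary of the h5py version and the HDF5 library it was built against
print(h5py.version.info)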

Lastly, many thanks to the community staff for their hard work. I’m truly grateful.
Best regards.
