What's the effect of CUDA-aware MPI?

Hello,

I have a quick exploratory question. At the moment our clusters have mvapich2
installed, but it appears to me that none of them is a CUDA-aware build. I was
just curious whether anyone has already looked into the potential performance
benefits of using a CUDA-aware MPI. I realize it's a new feature in your code
and I am also happy to act as a beachhead user.

Please let me know.

Best wishes,
Robert

Hi Robert,

To the best of my knowledge, most of the work in this area has been done by Filippo Spiga at Cambridge. If I recall correctly (his findings were the subject of a GTC talk back in 2015 and should be available online), he found that using CUDA-aware MPI improved the strong scalability of PyFR, with performance improving by 10% or so in the limit. He also considered the impact of GPUDirect RDMA.

Regards, Freddie.