Problem with GiMMiK kernel

Hello,

I have a particular test case (involving a triangular aerofoil) that runs successfully with PyFR 1.10 but fails with PyFR 1.11/1.12. The problem occurs during an attempt to compile a gimmik_mm kernel.

CUDA_ERROR_INVALID_PTX = 218
This indicates that a PTX JIT compilation failed.

The simulation is running on two NVIDIA GPUs (Tesla V100-SXM2-16GB Volta) using OpenMPI 4.1.0.
The CUDA version is 10.2 (or 10.2.89 to be precise).

Do I need to have some specific NVIDIA CUDA software in place before I can run the latest versions of PyFR?

Thanks in advance,
Michael

I was thinking about your issue. I believe I have used PyFR 1.11 with CUDA 10.2 before, so I’m not sure what the cause might be.

Any chance you could send me the code produced by gimmik just before it fails? My email is xxx (see DM). You should be able to get this with a print just here

This is most likely driver related. To this end I suggest upgrading both CUDA (to version 11) and the driver version to the latest one available on NVIDIA’s website.

Regards, Freddie.

The code that is being produced looks fine; it was a bit of a long shot that there would be something wrong with the source produced by GiMMiK.

I agree with Freddie: try updating to the latest CUDA version. Something worth noting is that with PyFR version 1.11 we removed the dependency on PyCUDA and instead call the runtime compiler from the CUDA library. So, depending on how you have things set up, this may be causing some issues.
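
For anyone hitting the same error, one quick check is to ask the installed driver which CUDA version it supports, by calling cuDriverGetVersion from the driver library through ctypes. This is only a rough sketch and not PyFR's own code; the library name libcuda.so.1 is an assumption for a typical Linux install, and the helper names are made up for illustration.

```python
import ctypes


def decode_cuda_version(v):
    """Split CUDA's integer version encoding (1000*major + 10*minor)."""
    return v // 1000, (v % 1000) // 10


def driver_cuda_version(libname="libcuda.so.1"):
    """Return (major, minor) of the newest CUDA version the driver reports,
    or None if the driver library cannot be loaded or the call fails.

    The default library name is an assumption; adjust it for your system.
    """
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        return None

    version = ctypes.c_int(0)
    # CUresult cuDriverGetVersion(int *driverVersion); 0 is CUDA_SUCCESS
    if lib.cuDriverGetVersion(ctypes.byref(version)) != 0:
        return None
    return decode_cuda_version(version.value)


if __name__ == "__main__":
    print("Driver reports CUDA version:", driver_cuda_version())
```

If the version reported here is older than the CUDA toolkit whose runtime compiler is being picked up, that mismatch would be the first thing to rule out.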

Thank you Will and Freddie for your advice.

I am currently looking into getting a suitable version of the GPU kernel driver (>= 450.80.02) installed on the Cirrus Tier-2 machine, i.e., one that is compatible with CUDA 11.0.

I’ll retest PyFR and let you know the result once the driver has been updated.
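
In case it is useful to others, the driver requirement can be checked with a few lines of Python. This is a minimal sketch: the 450.80.02 minimum is the figure quoted above for CUDA 11.0, while the function names and the sample driver strings are purely illustrative.

```python
def parse_version(text, width=3):
    """Turn a dotted driver version such as '450.80.02' into a tuple of ints,
    padding with zeros so that '440.64' compares as (440, 64, 0)."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def driver_is_new_enough(installed, minimum="450.80.02"):
    """True if the installed driver version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Illustrative driver strings; on a real node the installed version
    # can be read with, e.g.:
    #   nvidia-smi --query-gpu=driver_version --format=csv,noheader
    for driver in ("440.64", "460.73.01"):
        print(driver, driver_is_new_enough(driver))
```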

Just to confirm, PyFR 1.12 is now working on Cirrus following a recent GPU kernel driver upgrade.
The driver for the NVIDIA Tesla V100-SXM2-16GB (Volta) GPU was upgraded from 440.64 to 460.73.01.
The new driver is compatible with CUDA 11.2.

Thanks again!