ENVIRONMENT
OS:
$ cat /etc/os-release
NAME="Red Hat Enterprise Linux"
VERSION="8.2 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.2"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.2 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8.2:GA"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.2
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.2"
PyFR:
Version 1.12.0, installed in a python virtual environment, along with all the required python packages
OpenMPI:
$ mpiexec --version
mpiexec (OpenRTE) 4.1.0
Report bugs to http://www.open-mpi.org/community/help/
gcc:
$ gcc --version
gcc (GCC) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE.
Python:
$ python --version
Python 3.7.9
CUDA:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0
ucx:
version 1.8.1 (used for openmpi installation)
THE PROBLEM:
If I run the command reported in the PyFR documentation for the examples https://pyfr.readthedocs.io/en/latest/examples.html, i.e. running with the cuda backend, I get the following error messages (I here paste the error I am getting from the euler vortex example, ran with only 1 mpi task, in order to avoid a too long error message - in the case of running with n mpi tasks, the same error is repeated n times):
Traceback (most recent call last):
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/util.py", line 33, in __call__
KeyError: (<function CUDAKernelProvider._build_kernel at 0x15554aa035f0>, b'\x80\x03X\t\x00\x00\x00gimmik_mmq\x00Xl\x0c\x00\x00\n__global__ void\ngimmik_mm(int n,\n const double* __restrict__ b, int ldb,\n double* __restrict__ c, int ldc)\n{\n int i = blockDim.x*blockIdx.x + threadIdx.x;\n double dotp;\n\n if (i < n)\n {\n dotp = 1.5267881254572668*b[i + 0*ldb] + -0.8136324494869274*b[i + 4*ldb] + 0.40076152031165047*b[i + 8*ldb] + -0.11391719628199004*b[i + 12*ldb];\n c[i + 0*ldc] = dotp;\n dotp = 1.5267881254572668*b[i + 1*ldb] + -0.8136324494869274*b[i + 5*ldb] + 0.40076152031165047*b[i + 9*ldb] + -0.11391719628199004*b[i + 13*ldb];\n c[i + 1*ldc] = dotp;\n dotp = 1.5267881254572668*b[i + 2*ldb] + -0.8136324494869274*b[i + 6*ldb] + 0.40076152031165047*b[i + 10*ldb] + -0.11391719628199004*b[i + 14*ldb];\n c[i + 2*ldc] = dotp;\n dotp = 1.5267881254572668*b[i + 3*ldb] + -0.8136324494869274*b[i + 7*ldb] + 0.40076152031165047*b[i + 11*ldb] + -0.11391719628199004*b[i + 15*ldb];\n c[i + 3*ldc] = dotp;\n dotp = -0.11391719628199004*b[i + 0*ldb] + 0.40076152031165047*b[i + 1*ldb] + -0.8136324494869274*b[i + 2*ldb] + 1.5267881254572668*b[i + 3*ldb];\n c[i + 4*ldc] = dotp;\n dotp = -0.11391719628199004*b[i + 4*ldb] + 0.40076152031165047*b[i + 5*ldb] + -0.8136324494869274*b[i + 6*ldb] + 1.5267881254572668*b[i + 7*ldb];\n c[i + 5*ldc] = dotp;\n dotp = -0.11391719628199004*b[i + 8*ldb] + 0.40076152031165047*b[i + 9*ldb] + -0.8136324494869274*b[i + 10*ldb] + 1.5267881254572668*b[i + 11*ldb];\n c[i + 6*ldc] = dotp;\n dotp = -0.11391719628199004*b[i + 12*ldb] + 0.40076152031165047*b[i + 13*ldb] + -0.8136324494869274*b[i + 14*ldb] + 1.5267881254572668*b[i + 15*ldb];\n c[i + 7*ldc] = dotp;\n dotp = -0.11391719628199004*b[i + 0*ldb] + 0.40076152031165047*b[i + 4*ldb] + -0.8136324494869274*b[i + 8*ldb] + 1.5267881254572668*b[i + 12*ldb];\n c[i + 8*ldc] = dotp;\n dotp = -0.11391719628199004*b[i + 1*ldb] + 0.40076152031165047*b[i + 5*ldb] + -0.8136324494869274*b[i + 9*ldb] + 1.5267881254572668*b[i + 13*ldb];\n c[i + 9*ldc] = dotp;\n dotp = -0.11391719628199004*b[i + 2*ldb] + 0.40076152031165047*b[i + 6*ldb] + -0.8136324494869274*b[i + 10*ldb] + 1.5267881254572668*b[i + 14*ldb];\n c[i + 10*ldc] = dotp;\n dotp = -0.11391719628199004*b[i + 3*ldb] + 0.40076152031165047*b[i + 7*ldb] + -0.8136324494869274*b[i + 11*ldb] + 1.5267881254572668*b[i + 15*ldb];\n c[i + 11*ldc] = dotp;\n dotp = 1.5267881254572668*b[i + 0*ldb] + -0.8136324494869274*b[i + 1*ldb] + 0.40076152031165047*b[i + 2*ldb] + -0.11391719628199004*b[i + 3*ldb];\n c[i + 12*ldc] = dotp;\n dotp = 1.5267881254572668*b[i + 4*ldb] + -0.8136324494869274*b[i + 5*ldb] + 0.40076152031165047*b[i + 6*ldb] + -0.11391719628199004*b[i + 7*ldb];\n c[i + 13*ldc] = dotp;\n dotp = 1.5267881254572668*b[i + 8*ldb] + -0.8136324494869274*b[i + 9*ldb] + 0.40076152031165047*b[i + 10*ldb] + -0.11391719628199004*b[i + 11*ldb];\n c[i + 14*ldc] = dotp;\n dotp = 1.5267881254572668*b[i + 12*ldb] + -0.8136324494869274*b[i + 13*ldb] + 0.40076152031165047*b[i + 14*ldb] + -0.11391719628199004*b[i + 15*ldb];\n c[i + 15*ldc] = dotp;\n }\n}\nq\x01]q\x02(cnumpy\nint32\nq\x03cnumpy\nint64\nq\x04h\x03h\x04h\x03e\x87q\x05.', b'\x80\x03}q\x00.')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/ctypesutil.py", line 33, in _errcheck
KeyError: 222
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/davinci-1/home/cipollettaf/pyfr/bin/pyfr", line 33, in <module>
sys.exit(load_entry_point('pyfr==1.12.0', 'console_scripts', 'pyfr')())
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/__main__.py", line 117, in main
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/__main__.py", line 246, in process_run
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/__main__.py", line 227, in _process_common
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/solvers/__init__.py", line 16, in get_solver
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/integrators/__init__.py", line 36, in get_integrator
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/integrators/std/controllers.py", line 14, in __init__
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/integrators/std/base.py", line 28, in __init__
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/solvers/base/system.py", line 67, in __init__
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/solvers/base/system.py", line 188, in _gen_kernels
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/solvers/baseadvec/elements.py", line 53, in <lambda>
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/backends/base/backend.py", line 163, in kernel
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/backends/cuda/gimmik.py", line 38, in mul
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/util.py", line 35, in __call__
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/backends/cuda/provider.py", line 19, in _build_kernel
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/backends/cuda/compiler.py", line 114, in __init__
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/backends/cuda/driver.py", line 291, in load_module
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/backends/cuda/driver.py", line 166, in __init__
File "/davinci-1/home/cipollettaf/pyfr/lib/python3.7/site-packages/pyfr-1.12.0-py3.7.egg/pyfr/ctypesutil.py", line 35, in _errcheck
pyfr.backends.cuda.driver.CUDAError
Looking at the last line in the traceback, it seems that the issue is related to the cuda driver selection but I could not see any further details about that in the PyFR documentation. I wonder if that could be related to the CUDA toolkit version that I am trying to use (11.1). The reason why I am saying this is because my research group has already tried installing PyFR in a container and through several trials we found the following:
- the container with cudatoolkit version 11.1 or 11.2 gave the same error messages;
- the container with cudatoolkit 11.0 worked just fine.
I am not able to understand if the problem is realted to the cudatoolkit I am using or to the way in which the driver is initialized within the PyFR code.
Can someone comment about that or give some hints about possible solutions or workarounds?
FURTHER REFERENCES:
I noted that in the forum some problems with the cuda backend where already discussed and resolved at https://pyfr.discourse.group/t/problem-with-cuda-backend/306, but I am not able to view the answer given by Freddie (not sure of the reason).
LAST NOTE:
The OpenMP backend works just fine.
Regards,
Federico Cipolletta