Hello,
Apologies for the issues. I now have PyFR compiled, but I am encountering problems at run time with the CUDA backend. The log is attached below; it appears there is an undefined symbol in the libnvrtc.so library. I am looking for some diagnostic guidance: could this be related to an incorrect CUDA version being picked up?
Thanks again!
nvcc version:
(pyfr1.15-venv) [roya3@gra-login3 2d-inc-cylinder]$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
Error output along with nvidia-smi:
Wed Nov 8 01:44:48 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla P100-PCIE... On | 00000000:83:00.0 Off | 0 |
| N/A 38C P0 25W / 250W | 0MiB / 12288MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
--------------------------------------------------------------------------
WARNING: There is at least non-excluded one OpenFabrics device found,
but there are no active ports detected (or Open MPI was unable to use
them). This is most certainly not what you wanted. Check your
cables, subnet manager configuration, etc. The openib BTL will be
ignored for this job.
Local host: gra951
--------------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/roya3/.local/bin/pyfr", line 8, in <module>
    sys.exit(main())
  File "/home/roya3/.local/lib/python3.10/site-packages/pyfr/__main__.py", line 118, in main
    args.process(args)
  File "/home/roya3/.local/lib/python3.10/site-packages/pyfr/__main__.py", line 251, in process_run
    _process_common(
  File "/home/roya3/.local/lib/python3.10/site-packages/pyfr/__main__.py", line 230, in _process_common
    backend = get_backend(args.backend, cfg)
  File "/home/roya3/.local/lib/python3.10/site-packages/pyfr/backends/__init__.py", line 12, in get_backend
    return subclass_where(BaseBackend, name=name.lower())(cfg)
  File "/home/roya3/.local/lib/python3.10/site-packages/pyfr/backends/cuda/base.py", line 21, in __init__
    self.nvrtc = NVRTC()
  File "/home/roya3/.local/lib/python3.10/site-packages/pyfr/backends/cuda/compiler.py", line 54, in __init__
    self.lib = NVRTCWrappers()
  File "/home/roya3/.local/lib/python3.10/site-packages/pyfr/ctypesutil.py", line 18, in __init__
    fn = getattr(lib, fname)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/python/3.10.2/lib/python3.10/ctypes/__init__.py", line 387, in __getattr__
    func = self.__getitem__(name)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/python/3.10.2/lib/python3.10/ctypes/__init__.py", line 392, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/cudacore/11.0.2/lib64/libnvrtc.so: undefined symbol: nvrtcGetCUBINSize. Did you mean: 'nvrtcGetPTXSize'?
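One thing that stands out in the traceback is that libnvrtc.so is loaded from a cudacore/11.0.2 directory, whereas nvcc reports release 12.2, so the two may not match. As a possible way to check this outside of PyFR, here is a minimal sketch that repeats the same kind of ctypes symbol lookup that fails in the traceback. The helper name `check_symbols` is hypothetical and not part of PyFR; it only assumes the standard-library `ctypes` module.

```python
import ctypes
from typing import Dict, Iterable, Optional


def check_symbols(path: Optional[str],
                  names: Iterable[str]) -> Optional[Dict[str, bool]]:
    """Report which of `names` are exported by the shared library at `path`.

    Hypothetical diagnostic helper (not part of PyFR). Returns None if the
    library itself cannot be loaded.
    """
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        # The library could not be found or loaded at all
        return None

    # ctypes raises AttributeError for a missing symbol, which is the
    # error seen in the traceback; hasattr turns that into False
    return {name: hasattr(lib, name) for name in names}
```

It could be pointed at the library path from the error message, for example `check_symbols("/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/cudacore/11.0.2/lib64/libnvrtc.so", ["nvrtcGetPTXSize", "nvrtcGetCUBINSize"])`, and then at the libnvrtc.so shipped with the CUDA 12.2 module to compare the two results.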