Seg fault with OpenMP backend

Hello,

I am trying to install PyFR. Using the CUDA backend in a virtual environment on a CUDA workstation works well for the given examples as well as the validation cases.
However, installing and then running on a computer without a GPU produces a seg fault on every run. Import, export, and partitioning seem to work, but “pyfr run” produces the error log below for all test cases, and the problem persists even after adding cc=gcc-12 to the ini file.
I thought it might be because I originally compiled libxsmm with gcc-11, but recompiling with the recommended gcc-12 does not resolve the issue.

Does anybody have any ideas as to what the issue might be, or any diagnostic steps I could try?

Thank you!

(pyfr-venv) abhijit@abhijit-ThinkPad-T14-Gen-3:~/pyfr/PyFR-Test-Cases/2d-euler-vortex$ pyfr run --backend openmp --progress euler-vortex.pyfrm euler-vortex.ini

LIBXSMM_VERSION: feature_main_merge-1.17-4047 (25694159)
LIBXSMM_TARGET: hsw [AMD Ryzen 7 PRO 6850U with Radeon Graphics]
Registry and code: 13 MB (spmdm=1)
Command: /home/abhijit/pyfr/pyfr-venv/bin/python /home/abhijit/pyfr/pyfr-venv/bin/pyfr run --backend openmp --progress euler-vortex.pyfrm euler-vortex.ini
Uptime: 0.045900 s
[abhijit-ThinkPad-T14-Gen-3:224059] *** Process received signal ***
[abhijit-ThinkPad-T14-Gen-3:224059] Signal: Segmentation fault (11)
[abhijit-ThinkPad-T14-Gen-3:224059] Signal code:  (0)
[abhijit-ThinkPad-T14-Gen-3:224059] Failing at address: (nil)
[abhijit-ThinkPad-T14-Gen-3:224059] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7faf3b842520]
[abhijit-ThinkPad-T14-Gen-3:224059] [ 1] /home/abhijit/pyfr/libxsmm/lib/libxsmm.so(libxsmm_fsspmdm_create+0x35e)[0x7faf3426a25e]
[abhijit-ThinkPad-T14-Gen-3:224059] [ 2] /lib/x86_64-linux-gnu/libffi.so.8(+0x7e2e)[0x7faf3ba5fe2e]
[abhijit-ThinkPad-T14-Gen-3:224059] [ 3] /lib/x86_64-linux-gnu/libffi.so.8(+0x4493)[0x7faf3ba5c493]
[abhijit-ThinkPad-T14-Gen-3:224059] [ 4] /usr/lib/python3.10/lib-dynload/_ctypes.cpython-310-x86_64-linux-gnu.so(+0xa3e9)[0x7faf3b7a13e9]
[abhijit-ThinkPad-T14-Gen-3:224059] [ 5] /usr/lib/python3.10/lib-dynload/_ctypes.cpython-310-x86_64-linux-gnu.so(+0x13302)[0x7faf3b7aa302]
[abhijit-ThinkPad-T14-Gen-3:224059] [ 6] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyObject_MakeTpCall+0x25b)[0x55cd396e05eb]
[abhijit-ThinkPad-T14-Gen-3:224059] [ 7] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyEval_EvalFrameDefault+0x6aa1)[0x55cd396d91f1]
[abhijit-ThinkPad-T14-Gen-3:224059] [ 8] /home/abhijit/pyfr/pyfr-venv/bin/python(+0x16e4e1)[0x55cd396f84e1]
[abhijit-ThinkPad-T14-Gen-3:224059] [ 9] /home/abhijit/pyfr/pyfr-venv/bin/python(PyObject_Call+0x122)[0x55cd396f9192]
[abhijit-ThinkPad-T14-Gen-3:224059] [10] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyEval_EvalFrameDefault+0x2b71)[0x55cd396d52c1]
[abhijit-ThinkPad-T14-Gen-3:224059] [11] /home/abhijit/pyfr/pyfr-venv/bin/python(+0x16e4e1)[0x55cd396f84e1]
[abhijit-ThinkPad-T14-Gen-3:224059] [12] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyEval_EvalFrameDefault+0x1981)[0x55cd396d40d1]
[abhijit-ThinkPad-T14-Gen-3:224059] [13] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyFunction_Vectorcall+0x7c)[0x55cd396ea70c]
[abhijit-ThinkPad-T14-Gen-3:224059] [14] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyEval_EvalFrameDefault+0x6bd)[0x55cd396d2e0d]
[abhijit-ThinkPad-T14-Gen-3:224059] [15] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyFunction_Vectorcall+0x7c)[0x55cd396ea70c]
[abhijit-ThinkPad-T14-Gen-3:224059] [16] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyEval_EvalFrameDefault+0x802)[0x55cd396d2f52]
[abhijit-ThinkPad-T14-Gen-3:224059] [17] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyFunction_Vectorcall+0x7c)[0x55cd396ea70c]
[abhijit-ThinkPad-T14-Gen-3:224059] [18] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyObject_FastCallDictTstate+0x16d)[0x55cd396df82d]
[abhijit-ThinkPad-T14-Gen-3:224059] [19] /home/abhijit/pyfr/pyfr-venv/bin/python(+0x16a744)[0x55cd396f4744]
[abhijit-ThinkPad-T14-Gen-3:224059] [20] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyObject_MakeTpCall+0x1fc)[0x55cd396e058c]
[abhijit-ThinkPad-T14-Gen-3:224059] [21] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyEval_EvalFrameDefault+0x71b8)[0x55cd396d9908]
[abhijit-ThinkPad-T14-Gen-3:224059] [22] /home/abhijit/pyfr/pyfr-venv/bin/python(+0x16e62e)[0x55cd396f862e]
[abhijit-ThinkPad-T14-Gen-3:224059] [23] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyEval_EvalFrameDefault+0x2b71)[0x55cd396d52c1]
[abhijit-ThinkPad-T14-Gen-3:224059] [24] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyObject_FastCallDictTstate+0xc4)[0x55cd396df784]
[abhijit-ThinkPad-T14-Gen-3:224059] [25] /home/abhijit/pyfr/pyfr-venv/bin/python(+0x16a7e5)[0x55cd396f47e5]
[abhijit-ThinkPad-T14-Gen-3:224059] [26] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyObject_MakeTpCall+0x1fc)[0x55cd396e058c]
[abhijit-ThinkPad-T14-Gen-3:224059] [27] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyEval_EvalFrameDefault+0x6516)[0x55cd396d8c66]
[abhijit-ThinkPad-T14-Gen-3:224059] [28] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyFunction_Vectorcall+0x7c)[0x55cd396ea70c]
[abhijit-ThinkPad-T14-Gen-3:224059] [29] /home/abhijit/pyfr/pyfr-venv/bin/python(_PyEval_EvalFrameDefault+0x6bd)[0x55cd396d2e0d]
[abhijit-ThinkPad-T14-Gen-3:224059] *** End of error message ***
Segmentation fault (core dumped)

Update:

  • tried re-installing the libomp-dev and libgomp1 packages from Ubuntu with sudo apt-get
  • tried the steps from the thread “Failed to run example cases using openmp as backend”, building libxsmm with SHARED=1 NOBLAS=1 instead of STATIC=0 BLAS=0
  • tried each of libxsmmext.so, libxsmm.so.1, and libxsmm.so.1.17.0 as the PYFR_XSMM_LIBRARY_PATH variable, similar to the thread “OpenMP xsmm, kernel mul has no providers”: libxsmmext.so gave a different error, and the others gave the same seg fault
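
For anyone working through the same list, a small check like the one below can at least rule out path problems. It is only a sketch: the helper name check_xsmm is made up for illustration, and it assumes the library is selected through PYFR_XSMM_LIBRARY_PATH as in the last bullet. It resolves symlinks, tries to load the file with ctypes, and looks up libxsmm_fsspmdm_create, the function named in the backtrace. It cannot detect an interface mismatch between PyFR and libxsmm, since the symbol can be present in both old and new builds.

```python
import ctypes
import os


def check_xsmm(path):
    """Report whether a libxsmm shared library can be found and loaded.

    'path' is whatever PYFR_XSMM_LIBRARY_PATH points at (may be None).
    """
    if not path:
        return "PYFR_XSMM_LIBRARY_PATH is not set"

    # Follow symlinks such as libxsmm.so -> libxsmm.so.1 -> libxsmm.so.1.17.0
    real = os.path.realpath(path)
    if not os.path.isfile(real):
        return f"{path}: no such file"

    try:
        lib = ctypes.CDLL(real)
    except OSError as exc:
        return f"{real}: failed to load ({exc})"

    # The symbol lookup raises AttributeError when missing, so hasattr works
    has_sym = hasattr(lib, "libxsmm_fsspmdm_create")
    status = "found" if has_sym else "missing"
    return f"{real}: loaded, libxsmm_fsspmdm_create {status}"


if __name__ == "__main__":
    print(check_xsmm(os.environ.get("PYFR_XSMM_LIBRARY_PATH")))
```

If this reports the file as loaded with the symbol found and “pyfr run” still crashes, the path is probably not the problem and the libxsmm version itself is the next thing to look at.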

Am I missing something? Is it an issue with AMD processors? The thread “When using AMD CPU with PyFR” seems to run into the same problem.

Thanks in advance for any tips!

Posting an update on what worked for me, in case anyone else runs into similar issues:

I had to check out the exact libxsmm commit from July 5, 2022 mentioned in the install guide (commit 0db15a0da13e3d9b9e3d57b992ecb3384d2e15ea) and build it using SHARED=1 NOBLAS=1.
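
In case it helps, the steps can be sketched roughly as follows. This is a dry run that only prints the commands rather than running them. The repository URL and the libxsmm directory name are assumptions for illustration; the commit hash and make flags are the ones quoted above.

```python
import shlex

# Assumed upstream repository URL; adjust if you clone from elsewhere
LIBXSMM_URL = "https://github.com/libxsmm/libxsmm.git"
# Commit from July 5, 2022 quoted in the post above
COMMIT = "0db15a0da13e3d9b9e3d57b992ecb3384d2e15ea"


def build_commands(src_dir="libxsmm", commit=COMMIT):
    """Return the clone, checkout, and build commands as argument lists."""
    return [
        ["git", "clone", LIBXSMM_URL, src_dir],
        ["git", "-C", src_dir, "checkout", commit],
        ["make", "-C", src_dir, "SHARED=1", "NOBLAS=1"],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed and pasted into a shell
    for cmd in build_commands():
        print(shlex.join(cmd))
```

After the build, PYFR_XSMM_LIBRARY_PATH would point at the resulting shared library, which in the backtrace above lives at lib/libxsmm.so inside the libxsmm directory.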

Although the install guide says >= that commit, the latest commit on the libxsmm master branch doesn’t seem to work (at least on my machine or with my settings), so it appears something has broken since then.

Hi,

libxsmm updated their interface a while back, and hence the latest version of libxsmm doesn’t work with PyFR v1.15.0 (the latest release). As you found out, you need to use an older version, e.g. check out 0db15a0da13e3d9b9e3d57b992ecb3384d2e15ea as you did.

The head of PyFR develop branch has, however, now been updated to work with the latest version of libxsmm (and indeed won’t work with older versions such as 0db15a0da13e3d9b9e3d57b992ecb3384d2e15ea).

Hope that makes sense!
