Runtime Error on macOS with OpenMP

I’ve tried running the 2-D Euler vortex test case with PyFR 1.15.0 on an Apple Silicon Mac. With this architecture, I would ideally like to leverage the OpenMP backend. I modified the euler-vortex.ini file so that the [backend] section includes cc = gcc-13. However, PyFR still seems to want to use clang to compile the kernel, which is problematic as Apple’s clang doesn’t support OpenMP.

The other issue is that the gcc-13 I installed via Homebrew seems to support only OpenMP 4.5. So my question is: what is the most straightforward way of getting a gcc compiler newer than 12.0 with built-in OpenMP 5.1 support on the Mac platform? Also, why is PyFR still trying to call clang despite the specific instruction to use gcc-13? Thanks!

I think I may have solved part of my issue by ensuring that the libxsmm dependency was compiled using gcc-13 and g++-13 with libomp.

However, I’ve run into another error which states:

KeyError: 'Kernel "mul" has no providers'

The full stack trace looks like:

pyfr run -b openmp -p 2d-euler-vortex.pyfrm euler-vortex.ini 
Traceback (most recent call last):
  File "/Users/zdavis/Applications/PyFR/pyfr/pyfr", line 292, in <module>
    main()
  File "/Users/zdavis/Applications/PyFR/pyfr/pyfr", line 125, in main
    args.process(args)
  File "/Users/zdavis/Applications/PyFR/pyfr/pyfr", line 269, in process_run
    _process_common(
  File "/Users/zdavis/Applications/PyFR/pyfr/pyfr", line 254, in _process_common
    solver = get_solver(backend, rallocs, mesh, soln, cfg)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zdavis/Applications/PyFR/pyfr/solvers/__init__.py", line 14, in get_solver
    return get_integrator(backend, systemcls, rallocs, mesh, initsoln, cfg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zdavis/Applications/PyFR/pyfr/integrators/__init__.py", line 34, in get_integrator
    return integrator(backend, systemcls, rallocs, mesh, initsoln, cfg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zdavis/Applications/PyFR/pyfr/integrators/std/controllers.py", line 11, in __init__
    super().__init__(*args, **kwargs)
  File "/Users/zdavis/Applications/PyFR/pyfr/integrators/std/base.py", line 26, in __init__
    self.system.commit()
  File "/Users/zdavis/Applications/PyFR/pyfr/solvers/base/system.py", line 67, in commit
    self._gen_kernels(self.nregs, self.ele_map.values(), self._int_inters,
  File "/Users/zdavis/Applications/PyFR/pyfr/solvers/base/system.py", line 199, in _gen_kernels
    kern = kgetter(i)
           ^^^^^^^^^^
  File "/Users/zdavis/Applications/PyFR/pyfr/solvers/baseadvec/elements.py", line 81, in <lambda>
    kernels['disu'] = lambda uin: self._be.kernel(
                                  ^^^^^^^^^^^^^^^^
  File "/Users/zdavis/Applications/PyFR/pyfr/backends/base/backend.py", line 187, in kernel
    raise KeyError(f'Kernel "{name}" has no providers')
KeyError: 'Kernel "mul" has no providers'

It seems this may be related to libxsmm. In compiling libxsmm, I cloned the repository and simply compiled using:

make -j4 STATIC=0 BLAS=0 FC=gfortran CC=gcc-13 CXX=g++-13 LDFLAGS="-L/opt/homebrew/opt/libomp/lib -lomp" CPPFLAGS="-I/opt/homebrew/opt/libomp/include -Xpreprocessor -fopenmp"

FC, CC, and CXX were needed to explicitly use gcc instead of clang. The LDFLAGS and CPPFLAGS were needed to point to the libomp installation. Everything seems to have compiled successfully. I have the environment variable PYFR_XSMM_LIBRARY_PATH set to the shared library that was built. Not sure what else is needed…
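One quick sanity check at this point is to confirm that the file named by PYFR_XSMM_LIBRARY_PATH actually loads as a shared library. The following is a minimal sketch using Python's standard ctypes module; the helper name check_shared_library is made up for illustration, and this does not reproduce PyFR's own loading logic. It only tells you whether the dynamic loader accepts the file.

```python
import ctypes
import os


def check_shared_library(path):
    """Return (ok, message) describing whether `path` loads as a shared library.

    Hypothetical helper for diagnostics only; not part of PyFR.
    """
    if not path:
        return False, 'no path given (is PYFR_XSMM_LIBRARY_PATH set?)'
    if not os.path.isfile(path):
        return False, f'{path} is not a file'
    try:
        # Ask the dynamic loader to open the library
        ctypes.CDLL(path)
    except OSError as e:
        return False, f'{path} failed to load: {e}'
    return True, f'{path} loaded'


if __name__ == '__main__':
    ok, msg = check_shared_library(os.environ.get('PYFR_XSMM_LIBRARY_PATH'))
    print(msg)
```

If this reports a load failure, the message from the loader (for example a missing libomp dependency) is usually a better lead than the KeyError above.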

Okay, I think I figured it out. The cc = gcc-13 parameter needs to be added to the [backend-openmp] section of the *.ini file instead of the [backend] section. Also, the -mprefer-vector-width=512 cflag is not recognized by gcc-13 on this machine. I think it may have changed to -msve-vector-bits=512.
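To make the section placement concrete, here is a small sketch that parses an illustrative config with Python's standard configparser and looks up cc in [backend-openmp]. The function name openmp_compiler and the sample config text are assumptions for illustration (only precision and cc are taken from real config files), and PyFR's own config handling may differ in detail.

```python
import configparser

# Illustrative config text: cc lives under [backend-openmp], not [backend]
INI_TEXT = """
[backend]
precision = double

[backend-openmp]
cc = gcc-13
"""


def openmp_compiler(ini_text, default=None):
    """Return the cc value from [backend-openmp], or `default` if absent."""
    cfg = configparser.ConfigParser()
    cfg.read_string(ini_text)
    return cfg.get('backend-openmp', 'cc', fallback=default)


if __name__ == '__main__':
    # Found, because cc is in the backend-specific section
    print(openmp_compiler(INI_TEXT))
    # Not found, because cc was placed under [backend] instead
    print(openmp_compiler('[backend]\ncc = gcc-13\n'))
```

The second call returns the default because a lookup in [backend-openmp] never sees a key placed in [backend], which matches the behaviour I was hitting.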

One last question: despite having the OMP_NUM_THREADS environment variable set to 24 in my .bash_profile, pyfr seems to utilize only ~433% of the CPU, or approximately 4 threads, when running the test cases. Is there a way for PyFR to utilize the entire CPU when running?
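For reference, a quick way to confirm what a Python process launched from the same shell actually sees is sketched below. It uses only the standard os module; the helper name thread_settings is made up for illustration, and it reports the environment rather than how many threads the OpenMP runtime ends up using.

```python
import os


def thread_settings(environ=None):
    """Report OMP_NUM_THREADS as seen by this process and the logical CPU count.

    Hypothetical helper; pass a dict as `environ` to inspect something
    other than the real process environment.
    """
    environ = os.environ if environ is None else environ
    return {
        'OMP_NUM_THREADS': environ.get('OMP_NUM_THREADS'),
        'logical_cpus': os.cpu_count(),
    }


if __name__ == '__main__':
    for key, value in thread_settings().items():
        print(f'{key} = {value}')
```

If OMP_NUM_THREADS comes back as None here, the variable is not reaching the process at all, which would be a separate problem from how well the test case scales.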

The -mprefer-vector-width=512 is only required on Intel/AMD CPUs with AVX-512. For ARM the defaults are appropriate.

The test cases which currently ship with PyFR are all relatively small. One would not expect them to scale past a couple of cores.

Regards, Freddie.