OpenMP xsmm, kernel mul has no providers

Similar to one of the issues described here, running PyFR-1.12.3 release tag:

When attempting to run with the OpenMP backend I’m hitting the same “KeyError: ‘Kernel “mul” has no providers’”. I have set the specified environment variable:

bash-4.4$ echo $PYFR_XSMM_LIBRARY_PATH
/home/mlohry/dev/libxsmm/lib/libxsmm.so

where libxsmm was compiled from the latest update. I’ve also specified cblas in the ini file to a system libmkl_rt.so (is this a necessary step?)

The setup works when GiMMiK is used with a large nnz setting.

Thanks,
Mark

Can you put a print statement around:

to catch exactly why xsmm is unwilling to handle the multiplication (it really should not fail); except NotSuitableError as e: print(e) should do it.

Regards, Freddie.
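
For context, a minimal sketch of the idea behind this request: a backend asks each kernel provider in turn, a provider that declines raises an exception, and the KeyError only appears once every provider has declined. The names below (`kernel`, `dense_averse_mul`, the 0.5 density threshold) are hypothetical stand-ins for illustration and are not PyFR's actual code; only the two message strings are taken from this thread.

```python
class NotSuitableError(Exception):
    """Toy stand-in: raised by a provider that declines a kernel."""


def dense_averse_mul(density):
    # Hypothetical provider that only accepts sparse operator matrices
    if density > 0.5:
        raise NotSuitableError('Matrix too dense for GiMMiK')
    return 'handled by sparse provider'


def kernel(name, providers, *args):
    # Try each provider in order and report why each one declines
    for prov in providers:
        try:
            return prov(*args)
        except NotSuitableError as e:
            print(e)

    # Every provider declined (or none were registered)
    raise KeyError(f'Kernel "{name}" has no providers')


try:
    kernel('mul', [dense_averse_mul], 0.9)
except KeyError as e:
    print(repr(e))
```

Printing inside the `except` clause is what exposes the reason for each refusal; without it, only the final KeyError is visible.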

Hi Freddie, output below. I also confirmed that os.environ sees the variable:

 'PYFR_XSMM_LIBRARY_PATH': '/home/mlohry/dev/libxsmm/lib/libxsmm.so.1.17.0
Matrix too dense for GiMMiK
Traceback (most recent call last):
  File "/home/mlohry/dev/./PyFR-1.12.3/pyfr/pyfr", line 273, in <module>
    main()
  File "/home/mlohry/dev/./PyFR-1.12.3/pyfr/pyfr", line 117, in main
    args.process(args)
  File "/home/mlohry/dev/./PyFR-1.12.3/pyfr/pyfr", line 250, in process_run
    _process_common(
  File "/home/mlohry/dev/./PyFR-1.12.3/pyfr/pyfr", line 232, in _process_common
    solver = get_solver(backend, rallocs, mesh, soln, cfg)
  File "/home/mlohry/.local/lib/python3.9/site-packages/pyfr/solvers/__init__.py", line 16, in get_solver
    return get_integrator(backend, systemcls, rallocs, mesh, initsoln, cfg)
  File "/home/mlohry/.local/lib/python3.9/site-packages/pyfr/integrators/__init__.py", line 36, in get_integrator
    return integrator(backend, systemcls, rallocs, mesh, initsoln, cfg)
  File "/home/mlohry/.local/lib/python3.9/site-packages/pyfr/integrators/std/controllers.py", line 13, in __init__
    super().__init__(*args, **kwargs)
  File "/home/mlohry/.local/lib/python3.9/site-packages/pyfr/integrators/std/steppers.py", line 133, in __init__
    super().__init__(*args, **kwargs)
  File "/home/mlohry/.local/lib/python3.9/site-packages/pyfr/integrators/std/base.py", line 27, in __init__
    self.system = systemcls(backend, rallocs, mesh, initsoln,
  File "/home/mlohry/.local/lib/python3.9/site-packages/pyfr/solvers/base/system.py", line 68, in __init__
    self._gen_kernels(eles, int_inters, mpi_inters, bc_inters)
  File "/home/mlohry/.local/lib/python3.9/site-packages/pyfr/solvers/base/system.py", line 187, in _gen_kernels
    kernels[pn, kn].append(kgetter())
  File "/home/mlohry/.local/lib/python3.9/site-packages/pyfr/solvers/baseadvec/elements.py", line 45, in <lambda>
    kernels['disu'] = lambda: self._be.kernel(
  File "/home/mlohry/.local/lib/python3.9/site-packages/pyfr/backends/base/backend.py", line 170, in kernel
    raise KeyError(f'Kernel "{name}" has no providers')
KeyError: 'Kernel "mul" has no providers'

This seems to imply that libxsmm is not loaded. Can you put a print statement around:

which will allow us to see precisely why it is failing to load. Can you also confirm how libxsmm was built?

Regards, Freddie.
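
A load failure of this kind can also be checked outside PyFR. The sketch below assumes that loading the shared library directly with Python's `ctypes` reproduces the same dynamic-loader error; `try_load` is a hypothetical helper name, and this is not how PyFR itself reports the problem.

```python
import ctypes
import os


def try_load(path):
    """Return (library, None) on success or (None, reason) on failure."""
    try:
        return ctypes.CDLL(path), None
    except OSError as e:
        # The dynamic loader's message, e.g. a missing file or an
        # undefined symbol, ends up in the exception text
        return None, str(e)


path = os.environ.get('PYFR_XSMM_LIBRARY_PATH')
if path is None:
    print('PYFR_XSMM_LIBRARY_PATH is not set')
else:
    lib, reason = try_load(path)
    print(reason if lib is None else f'Loaded {path}')
```

If the library loads cleanly here but PyFR still refuses to use it, the problem most likely lies elsewhere.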

Ahha! So that gave:

/home/mlohry/dev/libxsmm/lib/libxsmm.so: undefined symbol: sgemv_

And sure enough it’s undefined:

readelf -Ws --dyn-syms ./lib/libxsmm.so | grep gemv
    14: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND sgemv_
    61: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND dgemv_
   106: 000000000002f710     5 FUNC    WEAK   DEFAULT   11 __real_sgemv_
   312: 000000000002f700     5 FUNC    WEAK   DEFAULT   11 __real_dgemv_
   385: 00000000003adf80     8 OBJECT  GLOBAL DEFAULT   25 libxsmm_original_sgemv_function
   483: 000000000002f9a0   157 FUNC    WEAK   DEFAULT   11 libxsmm_original_dgemv
   525: 000000000002fa40   157 FUNC    WEAK   DEFAULT   11 libxsmm_original_sgemv
   544: 000000000039da40     8 OBJECT  GLOBAL DEFAULT   25 libxsmm_original_dgemv_function

but it’s present in libxsmmext.so, not libxsmm.so. Their GitHub page mentions that all OpenMP functionality goes into the former and not the latter, so maybe that’s the explanation?

Setting PYFR_XSMM_LIBRARY_PATH to libxsmmext.so seems to have cleared it up.

Thanks Freddie!
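
The readelf check above can be scripted. The following is a small sketch that filters `readelf -Ws` output for symbols whose section index is UND, i.e. symbols the library expects some other library to provide. The function name `undefined_symbols` is hypothetical, and the sample rows are copied from the output in this thread.

```python
def undefined_symbols(readelf_output, needle=''):
    """Return names of undefined (UND) symbols containing `needle`."""
    names = []
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows have columns: Num Value Size Type Bind Vis Ndx Name
        if len(fields) >= 8 and fields[6] == 'UND' and needle in fields[7]:
            names.append(fields[7])
    return names


sample = """\
    14: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND sgemv_
    61: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND dgemv_
   106: 000000000002f710     5 FUNC    WEAK   DEFAULT   11 __real_sgemv_
   483: 000000000002f9a0   157 FUNC    WEAK   DEFAULT   11 libxsmm_original_dgemv
"""

print(undefined_symbols(sample, 'gemv'))  # ['sgemv_', 'dgemv_']
```

An empty list for `gemv` would suggest that the library does not depend on external BLAS symbols of that name.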

As noted in the build guide it is important to compile libxsmm as:

make STATIC=0 BLAS=0 CODE_BUF_MAXSIZE=262144

with BLAS=0 ensuring that it will not try to import any BLAS symbols.

Regards, Freddie.

Ahha, so it is! Sorry about that, thanks for the help. Looks like all is working.

Hi,

I am also using PyFR on another CPU cluster for small simulations. I have installed PyFR 1.13 with libxsmm as indicated in the user guide, but exactly the same 'Kernel "mul" has no providers' error occurred.

To find its origin, I printed out this line:

and what I get is:

libxsmm unable to JIT a kernel
Traceback (most recent call last):
........
  File "/cfs/klemming/home/z/zhenyang/.conda/envs/pyfr_env/lib/python3.9/site-packages/pyfr-1.13.0-py3.9.egg/pyfr/backends/base/backend.py", line 162, in kernel
KeyError: 'Kernel "mul" has no providers'

And I think it is from here

But I don’t know what this means. Can you tell me how to fix this problem?

Best wishes,
Zhenyang

Can you confirm you are using the latest master revision of libxsmm? Some serious bugs have been fixed recently.

Regards, Freddie.

I tried the latest master branch, commit 14b6cea61376653b2712e3eefa72b13c5e76e421, and the latest release version. All gave me the same error.

I am running the example case inc_cylinder_2d in serial.

Best wishes,
Zhenyang

PyFR v1.13 with the latest libxsmm:

❯ git rebase upstream/master
Successfully rebased and updated refs/heads/master.
❯ make STATIC=0 BLAS=0 CODE_BUF_MAXSIZE=262144 -j4 > /dev/null 2>&1
❯ cd ~/Programming/PyFR/examples/inc_cylinder_2d
❯ git checkout v1.13.0 
❯ export PYFR_XSMM_LIBRARY_PATH=$HOME/Programming/libxsmm/lib/libxsmm.so
❯ pyfr run -bopenmp -p inc_cylinder_2d.pyfrm inc_cylinder_2d.ini
100.0% [===============================================================================>] 75.00/75.00 ela: 00:07:24 rem: 00:00:00

Regards, Freddie.

Hi Freddie,

I just deleted everything, recreated the environment, and re-compiled the library and the code. Suddenly everything worked. I don’t know what the problem was, but for now it is working. Thanks again for your kind help!

Best wishes,
Zhenyang