Questions on the TGV case

This post was moved here from: TGV Performance Numbers

In the ini file provided, what is the meaning of the parameter gimmik-max-nnz = 512?

When I run this config with the OpenMP backend on a CPU, I get an error. Why?

Does the command “h5dump -d stats nse_tgv_3d_p3-5.00.pyfrs” mean that a .pyfrs file is an HDF5 file?

This ini file is a little old. gimmik-max-nnz is a GiMMiK configuration option for the number of non-zero entries, and it is no longer supported.

Your error is most likely due to PyFR not being able to find a BLAS library. To understand more about the backends of PyFR this is a useful thread: What's the relationship between Gimmik and cublas?

Yes, .pyfrs and .pyfrm files are in the HDF5 format. To quote the documentation:

To prevent situations where you have solutions files for unknown configurations, the contents of the .ini file are added as an attribute to .pyfrs files. These files use the HDF5 format and can be straightforwardly probed with tools such as h5dump.
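As a quick cross-check alongside h5dump, the HDF5 claim can be tested without any HDF5 tooling, because HDF5 files open with a fixed 8-byte signature. The following is only a minimal sketch: the helper name `looks_like_hdf5` is hypothetical, and it checks offset 0 only (the format also allows the signature at offsets 512, 1024, and so on when a user block is present).

```python
# The 8-byte signature that opens an HDF5 file (HDF5 file format specification).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file at `path` starts with the HDF5 signature.

    Hypothetical helper for illustration. Only offset 0 is checked, so a
    file with a user block in front of the superblock is reported as False.
    """
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

For example, `looks_like_hdf5("nse_tgv_3d_p3-5.00.pyfrs")` should return True for the solution file from the question, assuming it was written without a user block.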

I installed OpenBLAS again and wrote the path in the .ini file, but I still get an error!
I also wonder why there is no error when I test an example case such as euler-vortex provided in the original source code?

Excuse me, I ran the case on an AMD CPU but got an error.

Have you changed the .ini as mentioned before?

I don’t think that this is a problem specific to AMD. What is happening is that, for this case, the GiMMiK kernels aren’t appropriate, so instead it’s trying to use BLAS but can’t find the relevant kernels.

The examples are all 2D, so what is probably happening there is that only GiMMiK is used, as all the matrices are small and relatively sparse. This is why the issue is unlikely to show up.

OpenBLAS is installed: libopenblas-dev is already the newest version (0.3.8+ds-1ubuntu0.20.04.1).


The ini is:

So could you please tell me what is going wrong?

Looking at the source, that method for setting the BLAS path was removed in 2015. Can you try setting this and see if it works? This should force PyFR to use GiMMiK.

[backend-openmp]
gimmik-max-nnz = 10000
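To double-check that the option sits in the right section and parses as an integer, the snippet above can be read back with Python's standard `configparser`. This is only a sketch: the helper `read_gimmik_max_nnz` is hypothetical, and the fallback of 512 is borrowed from the value in the original ini file rather than from a confirmed PyFR default.

```python
from configparser import ConfigParser

# The snippet suggested above, as it would appear in the .ini file.
INI_TEXT = """
[backend-openmp]
gimmik-max-nnz = 10000
"""


def read_gimmik_max_nnz(ini_text, default=512):
    """Return gimmik-max-nnz from [backend-openmp], or `default` if absent.

    Hypothetical helper; `default` is an assumption, not a PyFR constant.
    """
    cfg = ConfigParser()
    cfg.read_string(ini_text)
    return cfg.getint("backend-openmp", "gimmik-max-nnz", fallback=default)


print(read_gimmik_max_nnz(INI_TEXT))  # prints 10000
```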

OK, it works. Thank you! So must I set the BLAS path through an environment variable?

Yep, though I’m not completely sure of the way to set it; @fdw is the person who will know for sure. I would guess that it is an environment variable called PYFR_BLAS_LIBRARY_PATH.

PyFR has not used BLAS in a long time. The recommended solution is to make libxsmm available (which will become a hard dependency in the next release).

Regards, Freddie.


Looking at the code, I thought it wasn’t using BLAS anymore, but the CPU backend isn’t really the bit I know much about.

It seems that libxsmm is also for dense and sparse matrix operations, so what’s the difference between GiMMiK and libxsmm?
I compiled libxsmm and got libxsmm.so, and also set PYFR_XSMM_LIBRARY_PATH=/xsmm/lib/
I even ran the tests for PyFR in libxsmm,
but the KeyError "kernel mul has no providers" still occurs.

I think you need to do something like:
export PYFR_XSMM_LIBRARY_PATH="/xsmm/lib/libxsmm.a"

Thank you very much!!!

It needs to be the path to the shared library (.so or .dylib). If you have a static library it is likely that libxsmm has been compiled incorrectly.
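The two mistakes seen in this thread (pointing the variable at a directory, or at a static .a archive) can be caught with a small pre-flight check before launching PyFR. The sketch below is only an illustration under stated assumptions: `check_xsmm_path` is a hypothetical helper, not part of PyFR, and it judges the file purely by its name and existence.

```python
import os
from pathlib import Path

# File name suffixes that indicate a shared library on Linux and macOS.
SHARED_SUFFIXES = (".so", ".dylib")


def check_xsmm_path(env=os.environ):
    """Return a problem description, or None if the path looks plausible.

    Hypothetical helper: it inspects only the file name and existence and
    does not try to load the library.
    """
    value = env.get("PYFR_XSMM_LIBRARY_PATH")
    if value is None:
        return "PYFR_XSMM_LIBRARY_PATH is not set"
    name = Path(value).name
    # Accept versioned names such as libxsmm.so.1 as well.
    if not any(name.endswith(s) or f"{s}." in name for s in SHARED_SUFFIXES):
        return f"{value} does not look like a shared library"
    if not Path(value).is_file():
        return f"{value} does not exist"
    return None
```

Calling `check_xsmm_path()` with no arguments inspects the current environment; a return value of None means the name check passed and the file exists.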

Regards, Freddie.

GiMMiK generates C code and just handles the case of sparse operators. libxsmm generates assembly code and handles sparse and dense operators. Its performance is almost always superior as it has complete control over register allocation and things such as unrolling.

Regards, Freddie.


However, when I run the euler_vortex_2d case with PyFR 1.12.3 and the config setting gimmik = 1 (does that mean I am using libxsmm?), it is very slow.

If you are setting gimmik-max-nnz = 1 and running with -b openmp, you will be using libxsmm.
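The effect of this setting, as described in this thread, can be pictured as a threshold on the number of non-zero entries in an operator matrix: small, sparse operators go to GiMMiK and everything else falls through to libxsmm. The sketch below is purely illustrative and is not PyFR's code; `count_nonzeros` and `pick_provider` are hypothetical names, and whether the real comparison is inclusive is an assumption.

```python
def count_nonzeros(matrix):
    """Count the non-zero entries of a matrix given as a list of rows."""
    return sum(1 for row in matrix for value in row if value != 0)


def pick_provider(matrix, gimmik_max_nnz):
    """Illustrative rule: GiMMiK at or below the threshold, else libxsmm.

    Hypothetical helper; the inclusive comparison is an assumption.
    """
    if count_nonzeros(matrix) <= gimmik_max_nnz:
        return "gimmik"
    return "libxsmm"


operator = [[1.0, 0.0], [0.0, 2.0]]    # two non-zero entries
print(pick_provider(operator, 1))      # threshold of 1: libxsmm
print(pick_provider(operator, 10000))  # threshold of 10000: gimmik
```

Under this reading, gimmik-max-nnz = 1 rules out GiMMiK for practically every operator, while gimmik-max-nnz = 10000 admits nearly all of them.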

On 2D quad cases GiMMiK can often outperform libxsmm, which is what you are seeing here, although it does seem to be slower than I would have expected. @fdw, is this consistent with your results?