Compilation failure with openmp backend

Dear all,

I am new to PyFR and installed it recently. I would like to run the first tutorial with -b openmp, but I received the following error, which I find confusing. For reference, I installed OpenBLAS in Anaconda, the Python version is 3.6, and the Ubuntu version is 12.04.5 LTS. Could anyone help me out?

Thank you all in advance!

dellblack@ubuntu1204:~/Downloads/PyFR-1.6.0/examples_copy/couette_flow_2d$ pyfr run -b openmp couette_flow_2d.pyfrm couette_flow_2d.ini

Traceback (most recent call last):
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/util.py", line 33, in __call__
KeyError: (<function OpenMPKernelProvider._build_kernel at 0x7fd9b99da6a8>, b'\x80\x03X\t\x00\x00\x00gimmik_mmq\x00X\xc7\x07\x00\x00\nvoid\ngimmik_mm(int ncol,\n const double* restrict b, int ldb,\n double* restrict c, int ldc)\n{\n double dotp;\n\n #pragma omp parallel for simd private(dotp)\n for (int i = 0; i < ncol; i++)\n {\n dotp = 1.4788305577012362*b[i + 0*ldb] + -0.6666666666666666*b[i + 3*ldb] + 0.18783610896543051*b[i + 6*ldb];\n c[i + 0*ldc] = dotp;\n dotp = 1.4788305577012362*b[i + 1*ldb] + -0.6666666666666666*b[i + 4*ldb] + 0.18783610896543051*b[i + 7*ldb];\n c[i + 1*ldc] = dotp;\n dotp = 1.4788305577012362*b[i + 2*ldb] + -0.6666666666666666*b[i + 5*ldb] + 0.18783610896543051*b[i + 8*ldb];\n c[i + 2*ldc] = dotp;\n dotp = 0.18783610896543051*b[i + 0*ldb] + -0.6666666666666666*b[i + 1*ldb] + 1.4788305577012362*b[i + 2*ldb];\n c[i + 3*ldc] = dotp;\n dotp = 0.18783610896543051*b[i + 3*ldb] + -0.6666666666666666*b[i + 4*ldb] + 1.4788305577012362*b[i + 5*ldb];\n c[i + 4*ldc] = dotp;\n dotp = 0.18783610896543051*b[i + 6*ldb] + -0.6666666666666666*b[i + 7*ldb] + 1.4788305577012362*b[i + 8*ldb];\n c[i + 5*ldc] = dotp;\n dotp = 0.18783610896543051*b[i + 0*ldb] + -0.6666666666666666*b[i + 3*ldb] + 1.4788305577012362*b[i + 6*ldb];\n c[i + 6*ldc] = dotp;\n dotp = 0.18783610896543051*b[i + 1*ldb] + -0.6666666666666666*b[i + 4*ldb] + 1.4788305577012362*b[i + 7*ldb];\n c[i + 7*ldc] = dotp;\n dotp = 0.18783610896543051*b[i + 2*ldb] + -0.6666666666666666*b[i + 5*ldb] + 1.4788305577012362*b[i + 8*ldb];\n c[i + 8*ldc] = dotp;\n dotp = 1.4788305577012362*b[i + 0*ldb] + -0.6666666666666666*b[i + 1*ldb] + 0.18783610896543051*b[i + 2*ldb];\n c[i + 9*ldc] = dotp;\n dotp = 1.4788305577012362*b[i + 3*ldb] + -0.6666666666666666*b[i + 4*ldb] + 0.18783610896543051*b[i + 5*ldb];\n c[i + 10*ldc] = dotp;\n dotp = 1.4788305577012362*b[i + 6*ldb] + -0.6666666666666666*b[i + 7*ldb] + 0.18783610896543051*b[i + 8*ldb];\n c[i + 11*ldc] = dotp;\n 
}\n}\nq\x01]q\x02(cnumpy\nint32\nq\x03cnumpy\nint64\nq\x04h\x03h\x04h\x03e\x87q\x05.', b'\x80\x03}q\x00.')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/dellblack/anaconda3/bin/pyfr", line 11, in <module>
load_entry_point('pyfr==1.6.0', 'console_scripts', 'pyfr')()
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/__main__.py", line 110, in main
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/__main__.py", line 235, in process_run
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/__main__.py", line 216, in _process_common
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/solvers/__init__.py", line 16, in get_solver
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/integrators/__init__.py", line 43, in get_integrator
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/integrators/std/controllers.py", line 14, in __init__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/integrators/std/steppers.py", line 8, in __init__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/integrators/std/base.py", line 12, in __init__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/integrators/base.py", line 59, in __init__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/solvers/base/system.py", line 65, in __init__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/solvers/base/system.py", line 174, in _gen_kernels
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/solvers/baseadvec/elements.py", line 57, in <lambda>
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/backends/base/backend.py", line 166, in kernel
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/backends/openmp/gimmik.py", line 34, in mul
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/util.py", line 35, in __call__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/backends/openmp/provider.py", line 13, in _build_kernel
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/backends/openmp/compiler.py", line 58, in __init__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pytools-2016.2.6-py3.6.egg/pytools/prefork.py", line 223, in call_capture_output
return forker.call_capture_output(cmdline, cwd, error_on_nonzero)
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pytools-2016.2.6-py3.6.egg/pytools/prefork.py", line 180, in call_capture_output
error_on_nonzero)
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pytools-2016.2.6-py3.6.egg/pytools/prefork.py", line 162, in _remote_invoke
raise result
pytools.prefork.ExecError: error invoking 'gcc -shared -std=c99 -Ofast -march=native -fopenmp -fPIC -o libtmp.so tmp.c -lm': status 1 invoking 'gcc -shared -std=c99 -Ofast -march=native -fopenmp -fPIC -o libtmp.so tmp.c -lm': b'tmp.c: In function \xe2\x80\x98gimmik_mm\xe2\x80\x99:\ntmp.c:9:30: error: expected \xe2\x80\x98#pragma omp\xe2\x80\x99 clause before \xe2\x80\x98simd\xe2\x80\x99\n'

Dear all,

To check whether OpenBLAS was installed correctly, I used the following script from the Stack Overflow question "Compiling numpy with OpenBLAS integration".

script:

#!/usr/bin/env python
import numpy
from numpy.distutils.system_info import get_info
import sys
import timeit

print("version: %s" % numpy.__version__)
print("maxint: %i\n" % sys.maxsize)

info = get_info('blas_opt')
print('BLAS info:')
for kk, vv in info.items():
    print(' * ' + kk + ' ' + str(vv))

setup = "import numpy; x = numpy.random.random((1000, 1000))"
count = 10

t = timeit.Timer("numpy.dot(x, x.T)", setup=setup)
print("\ndot: %f sec" % (t.timeit(count) / count))

When I ran the following command,

OMP_NUM_THREADS=1 python build/test_numpy.py

the output was:

BLAS info:
 * libraries ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'iomp5', 'pthread']
 * library_dirs ['/home/dellblack/anaconda3/lib']
 * define_macros [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
 * include_dirs ['/home/dellblack/anaconda3/include']

dot: 0.040585 sec

This is not the same as the output the author of that answer obtained, so I am not sure whether OpenBLAS is installed correctly. If it is not, what should I do?
The reason I chose Anaconda is that Python 3.6 always ends up in /usr/local/lib when I install it with pip or apt-get.
From that location, Python 3.6 cannot be made the default Python version since it is not in /usr/lib.

Thank you all.
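
A note on reading that output: the `libraries` entry lists MKL components, which suggests that the NumPy shipped with Anaconda was built against Intel MKL rather than OpenBLAS. That describes NumPy only; it says nothing by itself about whether an OpenBLAS library exists elsewhere on the system. A minimal sketch of that reading follows, where `blas_backend` is a hypothetical helper written for illustration and not part of NumPy or PyFR:

```python
def blas_backend(libraries):
    """Guess the BLAS vendor from the 'libraries' list that NumPy reports.

    Hypothetical helper for illustration; it only inspects library names.
    """
    names = [name.lower() for name in libraries]
    if any(name.startswith("mkl") for name in names):
        return "mkl"
    if any("openblas" in name for name in names):
        return "openblas"
    return "unknown"


# Library list copied from the output posted above.
reported = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'iomp5', 'pthread']
print(blas_backend(reported))  # prints "mkl"
```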

Hi Henry,

The issue appears to be that your version of GCC is too old to compile the kernels which are generated by PyFR. As per the user guide PyFR requires GCC 4.9 or newer (alternatively, a recent version of the Intel compiler may also be used).

Regards, Freddie.
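
For readers hitting the same `expected '#pragma omp' clause before 'simd'` message, a quick way to compare the installed compiler against the 4.9 minimum quoted above is to parse the output of `gcc -dumpversion`. The sketch below assumes `gcc` is the compiler PyFR will invoke (as in the failing command line); the helper names are made up for this example.

```python
import re
import shutil
import subprocess

MIN_GCC = (4, 9)  # minimum version quoted in the reply above


def parse_gcc_version(text):
    """Turn a string such as '4.6.3' or '7' into a (major, minor) tuple."""
    match = re.match(r"\s*(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("unrecognised version string: %r" % text)
    return int(match.group(1)), int(match.group(2) or 0)


def is_new_enough(version, minimum=MIN_GCC):
    """Tuple comparison: (4, 6) < (4, 9) <= (5, 0)."""
    return version >= minimum


if __name__ == "__main__":
    gcc = shutil.which("gcc")
    if gcc is None:
        print("gcc not found on PATH")
    else:
        out = subprocess.check_output([gcc, "-dumpversion"],
                                      universal_newlines=True)
        version = parse_gcc_version(out)
        print("gcc %d.%d, new enough: %s" % (version + (is_new_enough(version),)))
```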

Dear Freddie,

I updated GCC to version 4.9, but now I get a different error:

dellblack@ubuntu1204:~/Downloads/PyFR-1.6.0/examples_copy/couette_flow_2d$ pyfr run -b openmp -p couette_flow_2d.pyfrm couette_flow_2d.ini

Traceback (most recent call last):
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/util.py", line 33, in __call__
KeyError: (<function OpenMPKernelProvider._build_kernel at 0x7f97006d9730>, b'\x80\x03X\t\x00\x00\x00gimmik_mmq\x00X\xc7\x07\x00\x00\nvoid\ngimmik_mm(int ncol,\n const double* restrict b, int ldb,\n double* restrict c, int ldc)\n{\n double dotp;\n\n #pragma omp parallel for simd private(dotp)\n for (int i = 0; i < ncol; i++)\n {\n dotp = 1.4788305577012362*b[i + 0*ldb] + -0.6666666666666666*b[i + 3*ldb] + 0.18783610896543051*b[i + 6*ldb];\n c[i + 0*ldc] = dotp;\n dotp = 1.4788305577012362*b[i + 1*ldb] + -0.6666666666666666*b[i + 4*ldb] + 0.18783610896543051*b[i + 7*ldb];\n c[i + 1*ldc] = dotp;\n dotp = 1.4788305577012362*b[i + 2*ldb] + -0.6666666666666666*b[i + 5*ldb] + 0.18783610896543051*b[i + 8*ldb];\n c[i + 2*ldc] = dotp;\n dotp = 0.18783610896543051*b[i + 0*ldb] + -0.6666666666666666*b[i + 1*ldb] + 1.4788305577012362*b[i + 2*ldb];\n c[i + 3*ldc] = dotp;\n dotp = 0.18783610896543051*b[i + 3*ldb] + -0.6666666666666666*b[i + 4*ldb] + 1.4788305577012362*b[i + 5*ldb];\n c[i + 4*ldc] = dotp;\n dotp = 0.18783610896543051*b[i + 6*ldb] + -0.6666666666666666*b[i + 7*ldb] + 1.4788305577012362*b[i + 8*ldb];\n c[i + 5*ldc] = dotp;\n dotp = 0.18783610896543051*b[i + 0*ldb] + -0.6666666666666666*b[i + 3*ldb] + 1.4788305577012362*b[i + 6*ldb];\n c[i + 6*ldc] = dotp;\n dotp = 0.18783610896543051*b[i + 1*ldb] + -0.6666666666666666*b[i + 4*ldb] + 1.4788305577012362*b[i + 7*ldb];\n c[i + 7*ldc] = dotp;\n dotp = 0.18783610896543051*b[i + 2*ldb] + -0.6666666666666666*b[i + 5*ldb] + 1.4788305577012362*b[i + 8*ldb];\n c[i + 8*ldc] = dotp;\n dotp = 1.4788305577012362*b[i + 0*ldb] + -0.6666666666666666*b[i + 1*ldb] + 0.18783610896543051*b[i + 2*ldb];\n c[i + 9*ldc] = dotp;\n dotp = 1.4788305577012362*b[i + 3*ldb] + -0.6666666666666666*b[i + 4*ldb] + 0.18783610896543051*b[i + 5*ldb];\n c[i + 10*ldc] = dotp;\n dotp = 1.4788305577012362*b[i + 6*ldb] + -0.6666666666666666*b[i + 7*ldb] + 0.18783610896543051*b[i + 8*ldb];\n c[i + 11*ldc] = dotp;\n 
}\n}\nq\x01]q\x02(cnumpy\nint32\nq\x03cnumpy\nint64\nq\x04h\x03h\x04h\x03e\x87q\x05.', b'\x80\x03}q\x00.')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/dellblack/anaconda3/bin/pyfr", line 11, in <module>
load_entry_point('pyfr==1.6.0', 'console_scripts', 'pyfr')()
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/__main__.py", line 110, in main
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/__main__.py", line 235, in process_run
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/__main__.py", line 216, in _process_common
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/solvers/__init__.py", line 16, in get_solver
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/integrators/__init__.py", line 43, in get_integrator
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/integrators/std/controllers.py", line 14, in __init__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/integrators/std/steppers.py", line 8, in __init__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/integrators/std/base.py", line 12, in __init__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/integrators/base.py", line 59, in __init__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/solvers/base/system.py", line 65, in __init__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/solvers/base/system.py", line 174, in _gen_kernels
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/solvers/baseadvec/elements.py", line 57, in <lambda>
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/backends/base/backend.py", line 166, in kernel
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/backends/openmp/gimmik.py", line 34, in mul
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/util.py", line 35, in __call__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/backends/openmp/provider.py", line 13, in _build_kernel
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/backends/openmp/compiler.py", line 64, in __init__
File "/home/dellblack/anaconda3/lib/python3.6/site-packages/pyfr-1.6.0-py3.6.egg/pyfr/backends/openmp/compiler.py", line 127, in _cache_set_and_loadlib
File "/home/dellblack/anaconda3/lib/python3.6/ctypes/__init__.py", line 344, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /home/dellblack/anaconda3/bin/../lib/libgomp.so.1: version `GOMP_4.0' not found (required by /home/dellblack/.cache/pyfr/libc109c2a0864201c2d29f1b5cc6ee57133e70c853d42a55bfa4013aab42133b7a.so)

It seems that the libgomp.so.1 shipped with Anaconda does not provide the GOMP_4.0 symbol version. Following the thread "Error of missing GOMP on centos 6 x86_64 with anaconda python 3", I added cflags=-Wl,-rpath /usr/lib64 to the configuration file. Fortunately it now works, but it runs very slowly.

![Auto Generated Inline Image 1.png|745x51](upload://gRQVaf8cGfsGYh3X0pvT0w5IxMK.png)
And I do not quite follow what you mean:
![Auto Generated Inline Image 2.png|417x129](upload://nT5ESJmEhGdchGorTFynHMKfwUP.png)

Cheers,
Henry
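
One way to see why the `GOMP_4.0` error arises is to compare which `GOMP_*` version tags each copy of libgomp carries. The sketch below is a heuristic only: it scans the file for tag strings instead of parsing the ELF version tables, and the two candidate paths are taken from this thread as assumptions about where the libraries live.

```python
import os
import re


def gomp_versions(path):
    """Return the GOMP_x.y tags found as byte strings in a library file."""
    with open(path, "rb") as handle:
        data = handle.read()
    tags = set(re.findall(rb"GOMP_\d+(?:\.\d+)+", data))
    return sorted(tag.decode("ascii") for tag in tags)


def provides(path, tag="GOMP_4.0"):
    """True if the exact tag appears in the file (string scan, not ELF parsing)."""
    return tag in gomp_versions(path)


if __name__ == "__main__":
    # Assumed locations: the Anaconda copy and the system copy.
    candidates = [
        os.path.expanduser("~/anaconda3/lib/libgomp.so.1"),
        "/usr/lib/x86_64-linux-gnu/libgomp.so.1",
    ]
    for path in candidates:
        if os.path.exists(path):
            print(path, "GOMP_4.0 present:", provides(path))
        else:
            print(path, "not found")
```

If the Anaconda copy lacks the tag while the system copy has it, that would be consistent with the rpath workaround making the loader pick the newer system library.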

Dear Freddie,

I see that CPU usage has reached 1081%. What I am running is the first tutorial, the 2D Couette flow.


Hi Henry,

I am glad that it is now all working. Define, however, what you mean by
"very slowly"? The couette flow test case is a simple 2D example which
is designed to run on a single core. As such, it is not suitable for
any kinds of performance comparisons.

Regards, Freddie.

Dear Freddie,

Thank you for your reply. I did not do any performance comparison; I just ran the tutorial following the steps in the “User Guide”. It should run very quickly, but it is very slow, taking nearly 9 hours to complete. I do not know whether adding cflags=-Wl,-rpath /usr/lib64 to the configuration file affects the speed, or whether it is because the time step is very small (4e-5 seconds) and the total simulated time is comparatively long, about 4 seconds. Without that configuration line, however, I get the errors shown previously.

Regards,
Henry

Dear Freddie,

My original intention was to run the Couette tutorial on a single core, but I do not know why it runs on 10 cores. I only changed the [backend-openmp] section of the configuration file, as follows:

[backend-openmp]
cc = gcc
cblas = /home/dellblack/anaconda3/openblas/lib/libopenblas.so
cblas-type = parallel
cflags=-Wl,-rpath /usr/lib/x86_64-linux-gnu/


Thank you.

Regards,
Henry
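
For context, OpenMP runtimes normally start one thread per available core unless the `OMP_NUM_THREADS` environment variable says otherwise; this is the same variable used with the NumPy test earlier in the thread. A minimal sketch of launching the tutorial with a single OpenMP thread from Python follows. It assumes `pyfr` is on the PATH and that the mesh and config files sit in the current directory; the helper names are invented for this example, and whether the thread count explains the slowdown is not established here.

```python
import os
import shutil
import subprocess


def single_thread_env(base=None):
    """Copy an environment mapping and pin OpenMP to one thread."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = "1"
    return env


def pyfr_run_command(mesh, config):
    """Same command line as used in this thread, as an argument list."""
    return ["pyfr", "run", "-b", "openmp", "-p", mesh, config]


if __name__ == "__main__":
    cmd = pyfr_run_command("couette_flow_2d.pyfrm", "couette_flow_2d.ini")
    if shutil.which("pyfr") is None:
        print("pyfr not on PATH; would run:", " ".join(cmd))
    else:
        subprocess.check_call(cmd, env=single_thread_env())
```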

Hi Henry,

> Thank you for your reply. I did not do any performance comparison and just ran the tutorial following steps in "User Guide". It should run very quickly but it becomes very slow, nearly about 9 hours to complete. I do not know if adding cflags=-Wl,-rpath /usr/lib64 in configuration line would impact the speed. Or this is because the timestep is very small (4e-5 seconds) and the complete time is comparatively long, say, about 4 seconds. But without that configuration line, it will come to errors shown previously.

On my Linux system:

freddie@fluorine ~/Programming $ virtualenv pyfr-venv
freddie@fluorine ~/Programming $ . pyfr-venv/bin/activate
(pyfr-venv) freddie@fluorine ~/Programming $ pip install pyfr
(pyfr-venv) freddie@fluorine ~/Programming $ cd pyfr-venv/
(pyfr-venv) freddie@fluorine ~/Programming/pyfr-venv $ wget https://github.com/vincentlab/PyFR/archive/v1.6.0.tar.gz
(pyfr-venv) freddie@fluorine ~/Programming/pyfr-venv $ tar xf v1.6.0.tar.gz
(pyfr-venv) freddie@fluorine ~/Programming/pyfr-venv $ cd PyFR-1.6.0/examples/couette_flow_2d/
(pyfr-venv) freddie@fluorine ~/Programming/pyfr-venv/PyFR-1.6.0/examples/couette_flow_2d $ pyfr import couette_flow_2d.{msh,pyfrm}
(pyfr-venv) freddie@fluorine ~/Programming/pyfr-venv/PyFR-1.6.0/examples/couette_flow_2d $ OMP_NUM_THREADS=1 pyfr run -p -bopenmp couette_flow_2d.pyfrm couette_flow_2d.ini
100.0% [==================] 4.00/4.00 ela: 00:03:21 rem: 00:00:00

where you can see that on a single core of my CPU the simulation took 3 minutes. From the above posts it looks as if you are using the Anaconda distribution. I would strongly advise against this as it has been a source of problems for our users in the past. Instead, start with a recent version of Python and then use a virtualenv and pip to install the relevant dependencies (as I did above).

Regards, Freddie.
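
To put the two timings side by side, the figures quoted in this thread (a time step of 4e-5 s, an end time of 4 s, roughly 9 hours on one machine and 3 minutes 21 seconds on the other) can be turned into a cost per time step. This is back-of-the-envelope arithmetic only; the 9-hour figure is approximate and the step count assumes a fixed time step throughout.

```python
# Figures quoted in the thread.
dt = 4e-5              # time step in seconds of simulated time
t_end = 4.0            # end time in seconds of simulated time
slow_wall = 9 * 3600   # "nearly 9 hours", in wall-clock seconds (approximate)
fast_wall = 3 * 60 + 21  # "ela: 00:03:21", in wall-clock seconds

# Number of time steps, assuming dt stays fixed for the whole run.
steps = round(t_end / dt)

slow_per_step = slow_wall / steps
fast_per_step = fast_wall / steps
ratio = slow_wall / fast_wall

print("steps:", steps)  # steps: 100000
print("slow run: %.3f s per step" % slow_per_step)
print("fast run: %.5f s per step" % fast_per_step)
print("slowdown: about %.0fx" % ratio)
```

A gap of this size is far larger than the small time step alone would explain, since both runs take the same number of steps.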

Hi Freddie,

Thank you for your reply and help. I will try this later.

Cheers,
Henry