cPickle has me in a pickle

Jacob_Crabill · 28 July 2014 20:27

Hi All,

I’ve managed to compile & run PyFR on one GPU-enabled desktop, but my simple Ubuntu machine is giving me the following error when I try to run the code using the openmp backend (just the last few lines of error messages shown):

File "/usr/local/lib/python2.7/dist-packages/pyfr-0.2.1-py2.7.egg/pyfr/backends/openmp/cblas.py", line 114, in mul
par_gemm = self._build_kernel('par_gemm', src, argt)
File "/usr/local/lib/python2.7/dist-packages/pyfr-0.2.1-py2.7.egg/pyfr/util.py", line 26, in __call__
key = (self.func, pickle.dumps(args[1:], 1), pickle.dumps(kwargs, 1))
cPickle.PicklingError: Can't pickle <type 'numpy.int32'>: it's not the same object as numpy.int32

I believe the only BLAS that I have installed on this machine is ATLAS BLAS, in case that’s relevant (installed from the Ubuntu repository). Additionally, my default mpi is mpich2 instead of openmpi. Any tips are appreciated!

Thanks,
Jacob Crabill

fdw · 28 July 2014 20:58

It is unlikely to be your BLAS library. The error occurs when PyFR
attempts to call (simplified):

cPickle.dumps([numpy.int32], 1)

with the module complaining that there appear to be two different
numpy.int32 classes floating around. This strongly suggests a
configuration issue: perhaps stale .pyc files, an incorrect
PYTHONPATH, or two versions of NumPy on the system (both of which are
somehow getting imported).
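
The check described above can be reproduced without NumPy. This is a minimal sketch, assuming a hypothetical stand-in module called `fakenp` (a made-up name, not part of NumPy or PyFR). pickle stores a class by its module and name and then verifies that looking that name up again returns the very same object, so a second class with the same name is rejected.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the numpy module; the name "fakenp" is made up.
fakenp = types.ModuleType("fakenp")
sys.modules["fakenp"] = fakenp


def make_int32():
    """Create a new class that claims to be fakenp.int32."""
    return type("int32", (int,), {"__module__": "fakenp"})


canonical = make_int32()  # the class reachable as fakenp.int32
lookalike = make_int32()  # same name and module, but a different object
fakenp.int32 = canonical


def try_pickle(cls):
    """Return None on success, or the PicklingError that was raised."""
    try:
        pickle.dumps([cls], 1)
    except pickle.PicklingError as exc:
        return exc
    return None


print(try_pickle(canonical))  # None: the lookup returns the same object
print(try_pickle(lookalike))  # a PicklingError, as in the traceback above
```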

Regards, Freddie.

Jacob_Crabill · 28 July 2014 21:33

Okay, thanks. I do have a couple installs of numpy (one of them an older version); that older version must be getting imported like you said. Hopefully removing the older version will fix it.

Karl_Napf · 15 May 2015 12:25

Dear folks,

did removing the old numpy libraries solve the problem?

I’m getting the same error with PyFR 0.8.0 using Python 3.4 and numpy-1.9.2.
I have also installed pyfr in a virtual environment.

I also tried

`
import pickle
import numpy as np

pickle.dumps([np.int32], 1)

`

separately and it did not cause errors.
Any ideas what’s going wrong here or how to narrow down (or reproduce) the error?
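
One way to narrow this down is to list every copy of a package that the running interpreter could see. The sketch below is only illustrative: the helper name `find_package_copies` is made up, and it looks only for plain package directories on `sys.path`, so it would miss copies inside zipped eggs or copies pulled in through `.pth` files.

```python
import os
import sys


def find_package_copies(package, search_path=None):
    """Return every directory on the search path that contains `package`."""
    hits = []
    for entry in (sys.path if search_path is None else search_path):
        # An empty sys.path entry means the current working directory.
        candidate = os.path.join(os.path.abspath(entry or os.curdir), package)
        marker = os.path.join(candidate, "__init__.py")
        if os.path.isfile(marker) and candidate not in hits:
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for path in find_package_copies("numpy"):
        print("numpy found in:", path)
```

More than one line of output for numpy would suggest that an older copy may be shadowing the intended one.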

Best regards

p.vincent · 20 May 2015 00:47

Hi Karl,

So just to clarify - you have a virtual env pointing to Python 3.4, and you have pip installed numpy-1.9.2 within the virtual env?

Cheers

Peter

Karl_Napf · 17 June 2015 22:03

Dear Peter,

thank you for your reply!
I recently switched to anaconda as my package manager. I installed the following packages:

h5py 2.5.0 np19py34_2
mako 1.0.1
mpi4py 1.3.1
numpy 1.9.2 py34_0
python 3.4.3 1
pytools 2014.3.5
mpmath 0.19

Then I went on to install PyFR 0.8.0 by issuing python setup.py install. There was no error and PyFR also appeared in conda list afterwards.
I also installed OpenBLAS-0.2.14 and I have gcc 4.8.2.

Since I’m currently working on a laptop without any notable graphics acceleration I wanted to use the OpenMP backend for multicore CPUs.

I generally followed the couette flow instructions. I edited the couette_flow_2d.ini and modified the openmp-backend section:

`
[backend-openmp]
cblas = /opt/OpenBLAS/lib/libopenblas.so
cc = gcc
cblas-type = parallel

`

I was able to convert the Gmsh file without any error. I then tried to run pyfr using
pyfr run --backend openmp couette_flow_2d.pyfrm couette_flow_2d.ini

This then resulted in the known numpy.int32 error:
_pickle.PicklingError: Can't pickle <class 'numpy.int32'>: it's not the same object as numpy.int32

I hope this helps. Any idea what I did wrong? I would really love to play around with PyFR a bit!

Best regards

fdw · 17 June 2015 22:11

Hi Karl,

It is likely that if you're using anaconda then you probably have more
than one version of Python floating around on your system. This,
combined with a broken search path, is causing the problem.

My advice would be to use your system package manager to get Python 3.3
or 3.4 (and nothing else!) along with virtualenv. Then create a
virtualenv, activate it, and use pip inside this virtualenv to install
the dependencies along with PyFR.

This way everything will be pulled in fresh ignoring anything else which
might be on the system.
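
Whether the interpreter in use really is the one inside the virtualenv can be checked from Python itself. This is a small sketch with a made-up helper name, `interpreter_summary`; it relies on `sys.base_prefix` (set by the standard venv mechanism) and on `sys.real_prefix` (set by older virtualenv releases).

```python
import sys


def interpreter_summary():
    """Describe which interpreter is running and whether it is isolated."""
    base = getattr(sys, "base_prefix", sys.prefix)  # differs from prefix in a venv
    real = getattr(sys, "real_prefix", None)        # only set by old virtualenv
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtualenv": real is not None or base != sys.prefix,
    }


if __name__ == "__main__":
    for key, value in interpreter_summary().items():
        print(key, "=", value)
```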

Regards, Freddie.

arvind_iyer · 17 June 2015 22:24

Hi,

Please do as Freddie suggested, that is indeed the recommended way. But could you please mail the output of
uname -a

and the output of the following when typed in ipython?

import numpy as np
np
np.intp
np.intp==np.int32

I ask because I have seen this problem on one particular architecture.
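
The same information can be gathered in one go with a short script. This is only a sketch: the function name `collect_report` and the dictionary keys are made up, and the NumPy checks are skipped if NumPy cannot be imported.

```python
import platform
import struct

try:
    import numpy as np
except ImportError:  # NumPy missing: report only the platform details
    np = None


def collect_report():
    """Gather the details asked for above into one dictionary."""
    report = {
        "uname": " ".join(platform.uname()),
        "pointer_bits": struct.calcsize("P") * 8,
    }
    if np is not None:
        report["numpy_file"] = np.__file__
        report["intp"] = repr(np.intp)
        report["intp == int32"] = np.intp == np.int32
        report["intp is int32"] = np.intp is np.int32
    return report


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(key, ":", value)
```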

Regards
Arvind

Miquel_Vidal · 18 June 2015 15:47

Hi,

I’ve been checking this post in hope for a solution, because I’m experiencing the same problem.

Versions 0.2.4, 0.3.0 and 0.8.0 all have this problem.

I’ve been trying to run the program on an ARM architecture without success. The output of ‘uname -a’ is the following:

Linux mb-1101 3.11.0-bsc_opencl+ #1 SMP Wed Dec 10 16:31:28 CET 2014 armv7l armv7l armv7l GNU/Linux

And the output of the Python commands:

>>> import numpy as np
>>> np
<module 'numpy' from '/usr/local/lib/python3.4/dist-packages/numpy/__init__.py'>
>>> np.intp
<class 'numpy.int32'>
>>> np.intp==np.int32
False

Could it be an architecture problem, as Arvind comments in the previous message?

I’ve tried uninstalling and reinstalling all numpy instances, as well as python3.4 itself. None of those solutions worked. Is there anything else that I could try?

Thank you very much.

Miquel Vidal

arvind_iyer · 18 June 2015 17:41

Hi,

We had tried installing PyFR on the Raspberry Pi (just for fun :), and
that was the peculiar architecture I was talking about) and faced this issue.

The common thing in all these cases is the ARM architecture and 32-bit.
My understanding is that this is a bug in the ARM version of numpy: as
shown above, np.intp==np.int32 gives False even though np.intp prints as
numpy.int32. On other machines it gives True, so this seems to be the bug.

OK, so how did we get it running? It is a ghastly hack: replace all
occurrences of np.intp with np.int32. A quick grep gives:

./backends/base/generator.py:100: argt.append([np.intp]*(2 + va.ncdim))
./backends/base/generator.py:103: argt.append([np.intp])
./backends/base/generator.py:106: argt.append([np.intp, np.int32])
./backends/openmp/blasext.py:21: [np.int32] + [np.intp, y.dtype]*(1 + nv))
./backends/openmp/blasext.py:52: 'errest', src, [np.int32] + [np.intp]*3 + [dtype]*2, restype=dtype
./backends/openmp/cblas.py:88: np.intp, np.int32, np.int32, np.int32,
./backends/openmp/cblas.py:89: a.dtype, np.intp, np.int32, np.intp, np.int32,
./backends/openmp/cblas.py:90: a.dtype, np.intp, np.int32

Just replace those intps, and hopefully things should work. But then
that version won't work on 64-bit machines.
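
The replacement described above can be scripted. This is a rough sketch, assuming it is run on a throwaway copy of the PyFR source tree. The function name `replace_intp` is made up for illustration, and as noted the patched copy is only suitable for 32-bit machines.

```python
import os
import re

# Match np.intp as a whole name, so identifiers such as np.intptr are untouched.
INTP = re.compile(r"\bnp\.intp\b")


def replace_intp(root):
    """Rewrite np.intp to np.int32 in every .py file under root.

    Returns the list of files that were modified.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as fh:
                source = fh.read()
            patched = INTP.sub("np.int32", source)
            if patched != source:
                with open(path, "w", encoding="utf-8") as fh:
                    fh.write(patched)
                changed.append(path)
    return changed
```

Calling `replace_intp` on the directory holding the copied sources returns the files it rewrote, which can be compared against the grep output above.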

I do not remember any other problem that we faced on the RPi.

Let us know.

Regards

Arvind

Karl_Napf · 18 June 2015 19:28

Dear all,

seems like I’m not the only one having that problem

My uname -a gives
Linux ProtonBox 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:31:42 UTC 2014 i686 i686 i686 GNU/Linux

Upon comparing np.intp and np.int32 I get the following:

Python 3.4.3 |Anaconda 2.2.0 (32-bit)| (default, Jun 4 2015, 15:28:02)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import numpy as np
>>> np
<module 'numpy' from '/home/tim/python/anaconda3/lib/python3.4/site-packages/numpy/__init__.py'>
>>> np.intp
<class 'numpy.int32'>
>>> np.intp==np.int32
False

I also tested this with my 64 bit machine at work today and on this architecture, np.intp is indeed equal to np.int64.
If I find some time next week I will try out Arvind’s fix.

Thanks for your help!

fdw · 18 June 2015 19:36

This is a problem that will afflict all 32-bit systems due to a bug in
numpy. You can reproduce with

import pickle
import numpy as np

pickle.dumps([np.intp, np.int32])

At least on x86 systems 32-bit should be avoided on account of it having
half the number of vector registers compared to 64-bit. This affects
the performance of both PyFR and the underlying BLAS library.
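
As for why 32-bit systems in particular are hit, one plausible reading (an assumption, not something confirmed in this thread) is that NumPy keeps separate scalar types for the C `int` and C `long` types, and on these systems both are 32 bits wide, so two distinct classes end up with the name int32. The widths on a given machine can be checked with the standard library alone. The helper name `c_type_widths` is made up.

```python
import struct


def c_type_widths():
    """Return the native width in bits of C int, C long and a pointer."""
    codes = (("int", "i"), ("long", "l"), ("pointer", "P"))
    return {name: struct.calcsize(code) * 8 for name, code in codes}


if __name__ == "__main__":
    for name, bits in c_type_widths().items():
        print(name, bits)
```

If `int`, `long` and `pointer` all report the same width, the machine is in the situation discussed here.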

Regards, Freddie.

Dani_Ruiz · 30 June 2015 13:41

Hi,

Regarding the issue Miquel had, changing the intp instances to int32 solves the problem with pickle on 32-bit ARM and, therefore, the application runs correctly with the OpenMP backend.

However, when running MPI+OpenMP it crashes again due to a KeyError involving int32. The specific error is below:

└┌(%:~/SCC/PyFR-1.0.0/couette_flow_2d)┌- mpirun -n 2 ~/.local/bin/pyfr run -b openmp -p couette_flow_2d.pyfrm couette_flow_2d.ini

Traceback (most recent call last):
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-1.0.0-py3.4.egg/pyfr/util.py", line 32, in __call__
KeyError: (<function OpenMPKernelProvider._build_kernel at 0xb55a1858>, b'(X\t\x00\x00\x00pack_viewq\x00XU\n\x00\x00\n\n#include <omp.h>\n#include <stdlib.h>\n#include <tgmath.h>\n\n#define PYFR_ALIGN_BYTES 32\n#define PYFR_NOINLINE __attribute__ ((noinline))\n\n#define min(a, b) ((a) < (b) ? (a) : (b))\n#define max(a, b) ((a) > (b) ? (a) : (b))\n\n// Typedefs\ntypedef double fpdtype_t;\n\n// OpenMP static loop scheduling functions\n\nstatic inline int\ngcd(int a, int b)\n{\n return (a == 0) ? b : gcd(b % a, a);\n}\n\nstatic inline void\nloop_sched_1d(int n, int align, int *b, int *e)\n{\n int tid = omp_get_thread_num();\n int nth = omp_get_num_threads();\n\n // Round up n to be a multiple of nth\n int rn = n + nth - 1 - (n - 1) % nth;\n\n // Nominal tile size\n int sz = rn / nth;\n\n // Handle alignment\n sz += align - 1 - (sz - 1) % align;\n\n // Assign the starting and ending index\n *b = sz * tid;\n *e = min(*b + sz, n);\n\n // Clamp\n if (*b >= n)\n *b = *e = 0;\n}\n\nstatic inline void\nloop_sched_2d(int nrow, int ncol, int colalign,\n int *rowb, int *rowe, int *colb, int *cole)\n{\n int tid = omp_get_thread_num();\n int nth = omp_get_num_threads();\n\n // Distribute threads\n int nrowth = gcd(nrow, nth);\n int ncolth = nth / nrowth;\n\n // Row and column indices for our thread\n int rowix = tid / ncolth;\n int colix = tid % ncolth;\n\n // Round up ncol to be a multiple of ncolth\n int rncol = ncol + ncolth - 1 - (ncol - 1) % ncolth;\n\n // Nominal tile size\n int ntilerow = nrow / nrowth;\n int ntilecol = rncol / ncolth;\n\n // Handle column alignment\n ntilecol += colalign - 1 - (ntilecol - 1) % colalign;\n\n // Assign the starting and ending row to each thread\n *rowb = ntilerow * rowix;\n *rowe = *rowb + ntilerow;\n\n // Assign the starting and ending column to each thread\n *colb = ntilecol * colix;\n *cole = min(*colb + ntilecol, ncol);\n\n // Clamp\n if (*colb >= ncol)\n *colb = *cole = 0;\n}\n\n\n\n\nvoid\npack_view(int n, int nrv, int ncv,\n const 
fpdtype_t *__restrict__ v,\n const int *__restrict__ vix,\n const int *__restrict__ vcstri,\n const int *__restrict__ vrstri,\n fpdtype_t *__restrict__ pmat)\n{\n if (ncv == 1)\n for (int i = 0; i < n; i++)\n pmat[i] = v[vix[i]];\n else if (nrv == 1)\n for (int i = 0; i < n; i++)\n for (int c = 0; c < ncv; c++)\n pmat[c*n + i] = v[vix[i] + vcstri[i]*c];\n else\n for (int i = 0; i < n; i++)\n for (int r = 0; r < nrv; r++)\n for (int c = 0; c < ncv; c++)\n pmat[(r*ncv + c)*n + i] = v[vix[i] + vrstri[i]*r\n + vcstri[i]*c];\n}\n\nq\x01X\x08\x00\x00\x00iiiPPPPPq\x02tq\x03.', b'}q\x00.')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/druiz/.local/bin/pyfr", line 9, in <module>
load_entry_point('pyfr==1.0.0', 'console_scripts', 'pyfr')()
File "/usr/lib/python3/dist-packages/mpmath/ctx_mp.py", line 1301, in g
return f(*args, **kwargs)
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/scripts/main.py", line 126, in main
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/scripts/main.py", line 247, in process_run
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/scripts/main.py", line 231, in _process_common
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/solvers/__init__.py", line 14, in get_solver
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/integrators/__init__.py", line 29, in get_integrator
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/integrators/controllers.py", line 14, in __init__
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/integrators/steppers.py", line 9, in __init__
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/integrators/writers.py", line 15, in __init__
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/integrators/base.py", line 47, in __init__
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/solvers/base/system.py", line 59, in __init__
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/solvers/base/system.py", line 158, in _gen_kernels
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/solvers/baseadvec/inters.py", line 55, in <lambda>
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/backends/base/backend.py", line 173, in kernel
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/backends/openmp/packing.py", line 17, in pack
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/util.py", line 34, in __call__
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/backends/openmp/provider.py", line 14, in _build_kernel
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/backends/openmp/compiler.py", line 42, in function
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/backends/openmp/compiler.py", line 42, in <listcomp>
File "/home/druiz/.local/lib/python3.4/site-packages/pyfr-0.8.0-py3.4.egg/pyfr/nputil.py", line 123, in npdtype_to_ctypestype
KeyError: <class 'numpy.int32'>

After this message, OpenMPI invokes MPI_ABORT on rank 1, ending the execution. Could this be related to the modifications made before? I also noticed that MPI_ABORT is never invoked on the MPI process with rank 0; I am not sure whether this helps.

Best regards,
-Dani

fdw · 30 June 2015 14:47

It seems to be related to the same numpy issue that breaks regular
PyFR on 32-bit systems. As a workaround can you try, against a stock
copy of PyFR, opening up pyfr/__main__.py and before

import pyfr.scripts.main

insert

import numpy as np
np.intp = np.int32

and see if that works. The good news is that if it solves the problem
it should solve it everywhere and with only minor code modifications.

Regards, Freddie.

Dani_Ruiz · 1 July 2015 09:06

Hi Freddie,

Your change led to the pickle error again, so I added the “np.intp = np.int32” line to every file that imports the numpy module, and that solved the issue.

Anyhow, now I’m facing another problem with MPI (most probably related to the mpi4py Python module), which I’m trying to solve right now.

Thanks a lot for all the help

Best regards,
-Dani

Dani_Ruiz · 3 July 2015 09:31

Hi all,

I’m writing just to let you know I finally got a run working with MPI. I’m attaching a patch with the changes I’ve made to the code to make PyFR work on a 32-bit platform (in this case, also ARM).

I found that the key error was caused by another “bug” in numpy. Here is how you can reproduce it:

>>> import numpy as np
>>> np.dtype('P').type
<class 'numpy.uint32'>
>>> np.uint32
<class 'numpy.uint32'>
>>> np.dtype('P').type == np.uint32
False

This also happens with the ‘i’ dtype (int32), so when looking up a map that has numpy types as keys, the key does not match and the KeyError is raised.
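
The lookup failure can be illustrated without NumPy. In this sketch two made-up classes stand in for the two uint32 types. Keying the map by name instead of by class is shown only as one possible way around the mismatch, and is not necessarily what the attached patch does.

```python
# Two distinct classes that share a name, standing in for the two uint32 types.
canonical = type("uint32", (int,), {})
lookalike = type("uint32", (int,), {})

# A map keyed by the class object itself, and one keyed by the class name.
by_class = {canonical: "c_uint32"}
by_name = {canonical.__name__: "c_uint32"}

print(lookalike in by_class)          # False: classes are compared by identity
print(lookalike.__name__ in by_name)  # True: both classes are named "uint32"
```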

Thanks for all the help provided.

Best regards,
-Dani

pyfr1.0.0.patch (1.1 KB)

arvind_iyer · 3 July 2015 09:51

Filed a bug report to numpy:

https://github.com/numpy/numpy/issues/6038

arvind_iyer · 14 July 2015 13:43

Still no comment on this from numpy’s side or others.
Comments/confirmation may help bring it to the attention of the numpy team.

Regards
