TypeError: an integer is required (got type bytes) #404

Open
louismartin opened this issue Dec 3, 2020 · 7 comments

@louismartin

Since upgrading to python 3.8 I can't access pickle files created with python 3.7.
This originated from this issue and might be related to this one.

Repro

In python 3.7 (cloudpickle==1.6.0):

import cloudpickle

def foo():
    pass

with open("foo.pkl", "wb") as ofile:
    cloudpickle.dump(foo, ofile)

In python 3.8 (cloudpickle==1.6.0):

import cloudpickle

with open("foo.pkl", "rb") as ifile:
    print(cloudpickle.load(ifile))

Throws

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: an integer is required (got type bytes)

Is there any way to migrate those pickle files to use them with python 3.8?

@gwenzek

gwenzek commented Dec 4, 2020

Can you reproduce this with standard pickle? You would need to move foo to a module first.

Also, could you print cloudpickle.DEFAULT_PROTOCOL and pickle.DEFAULT_PROTOCOL? (They should be different, IIUC.)
And try with cloudpickle.dump(foo, ofile, protocol=pickle.DEFAULT_PROTOCOL)?
Similarly, try with pickle.dump(foo, ofile, protocol=cloudpickle.DEFAULT_PROTOCOL).
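
To rule out a protocol mismatch, it can help to check which protocol an existing pickle was actually written with. The following is a minimal sketch using only the standard library; the helper name pickle_protocol and the -1 sentinel are illustrative assumptions, not part of pickle or cloudpickle.

```python
import pickle
import pickletools


def pickle_protocol(data: bytes) -> int:
    """Return the protocol number declared at the start of a pickle stream.

    Hypothetical helper: returns -1 when the stream has no PROTO header.
    """
    opcode, arg, _pos = next(pickletools.genops(data))
    # Protocols >= 2 start with a PROTO opcode whose argument is the protocol number.
    if opcode.name == "PROTO":
        return arg
    # Protocols 0 and 1 carry no PROTO header, so they cannot be told apart here.
    return -1


for proto in (2, 3, 4):
    data = pickle.dumps({"answer": 42}, protocol=proto)
    print("requested:", proto, "declared:", pickle_protocol(data))
```

For a file such as foo.pkl, the same helper could be applied to the bytes read with open("foo.pkl", "rb").read().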

@louismartin

louismartin commented Dec 4, 2020

If I move foo() to foo.py, it works with both cloudpickle and standard pickle, so this might be related to functions defined in the main scope.

In python 3.7:

>>> print(cloudpickle.DEFAULT_PROTOCOL)
4
>>> print(pickle.DEFAULT_PROTOCOL)
3

In python 3.8:

>>> print(cloudpickle.DEFAULT_PROTOCOL)
5
>>> print(pickle.DEFAULT_PROTOCOL)
4

The same error happens when using cloudpickle.dump(foo, ofile, protocol=pickle.DEFAULT_PROTOCOL). (I haven't tried with pickle.dump because it does not throw an error when foo() is moved to a module.)
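
The observation that a function in an importable module loads fine is consistent with how standard pickle handles module-level functions: it stores a reference (module name plus qualified name), not the function's code. A small stdlib-only sketch, using json.dumps as a stand-in for a function that lives in a module:

```python
import json
import pickle
import pickletools

# Pickle a function that lives in an importable module.
payload = pickle.dumps(json.dumps, protocol=4)

# List the opcodes in the stream; a by-reference pickle uses STACK_GLOBAL
# (protocol 4+) to look the function up by module and name at load time.
opnames = [op.name for op, _arg, _pos in pickletools.genops(payload)]
print(opnames)

# Loading re-imports the module and returns the very same function object.
restored = pickle.loads(payload)
print(restored is json.dumps)
```

Because no code object is serialized in this case, there is nothing version-specific to reconstruct on load, which may explain why the error only shows up for functions defined in the main scope.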

@chiwhalee

I have encountered the same issue.

@reedox

reedox commented Mar 5, 2021

I replicated this with a clean virtual env install with python 3.8. Downgrading to 3.7 fixes the bug.

@petrmitrichev

I have debugged why this happens:

When reducing a code object for pickling, cloudpickle stores it as a list of arguments to the CodeType constructor: code. As you can see in the linked code, the list of those arguments changes between Python versions. Therefore, pickling with one version of Python and unpickling with a different version does not work.
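
As an illustration of the argument shift: Python 3.8 added posonlyargcount as the second positional argument of the CodeType constructor. The sketch below lists the leading arguments for both layouts (the tuple names PY37_LEADING and PY38_LEADING are made up for this example) and prints the type of each corresponding attribute on a real code object.

```python
import sys


def foo():
    pass


code = foo.__code__

# Leading positional arguments of the CodeType constructor in Python 3.7.
PY37_LEADING = ("co_argcount", "co_kwonlyargcount", "co_nlocals",
                "co_stacksize", "co_flags", "co_code")
# Python 3.8 inserted co_posonlyargcount in second position.
PY38_LEADING = ("co_argcount", "co_posonlyargcount", "co_kwonlyargcount",
                "co_nlocals", "co_stacksize", "co_flags", "co_code")

leading = PY38_LEADING if sys.version_info >= (3, 8) else PY37_LEADING
for position, name in enumerate(leading):
    # Show which type the running interpreter holds for each attribute.
    print(position, name, type(getattr(code, name)).__name__)
```

With a 3.7 argument list, the item at index 5 is the bytecode (bytes), while the 3.8 constructor expects the integer flags at that index. That would be consistent with the reported "an integer is required (got type bytes)" message.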

It could be resolved by creating a custom function to be used in reconstructing code objects, and having it deal with different Python source versions. However, I'm not sure if this would be a helpful improvement, or a false promise if passing pickled objects between different Python versions (or different cloudpickle versions, for that matter) is not expected to work anyway. Looking forward to the thoughts of the cloudpickle maintainers on this issue.
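
A rough sketch of what the argument-adapting part of such a function might look like is given below. The name adapt_py37_code_args is hypothetical and not part of cloudpickle, and the example tuple is made-up sample data. It only covers the 3.7 to 3.8 layout change and does not construct a code object: bytecode is not guaranteed to be compatible across Python versions, and later releases changed the constructor again.

```python
def adapt_py37_code_args(args):
    """Hypothetical helper: map a 3.7-style CodeType argument tuple to the 3.8 layout."""
    args = tuple(args)
    # In the 3.7 layout, index 5 holds the bytecode (bytes);
    # in the 3.8 layout, index 5 holds the integer flags.
    if isinstance(args[5], bytes):
        # Insert posonlyargcount=0 in second position: a function compiled
        # by 3.7 cannot have positional-only parameters.
        return args[:1] + (0,) + args[1:]
    # Already in the 3.8 layout (or unrecognized): leave unchanged.
    return args


# Made-up 3.7-style argument tuple, for illustration only.
py37_args = (0, 0, 0, 1, 67, b"d\x00S\x00", (None,), (), (),
             "<stdin>", "foo", 1, b"\x00\x01", (), ())

adapted = adapt_py37_code_args(py37_args)
print(len(py37_args), "->", len(adapted))
```

Even with the arguments realigned, whether the resulting function would behave correctly on the newer interpreter is an open question, which is the "false promise" concern raised above.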

@Ankur-singh

I am facing the same issue. I have used mlflow to log my pytorch model (a few months back). I am not able to load the model back after updating to python 3.8 (from 3.7). This is very restrictive. Just changing the python version or cloudpickle version will break all the code.

I would really appreciate any help.

@xiaoyongzhu

Same issue here (pickling objects with cloudpickle on 3.7 and unpickling them on 3.8). I feel it's probably an intrinsic issue with cloudpickle.
