This repository has been archived by the owner on Jun 4, 2023. It is now read-only.

Merge pull request #179 from Peque/cloudpickle
Add cloudpickle serialization support
irmen committed Aug 12, 2017
2 parents 502e1e7 + a1f424c commit 480cd84
Showing 11 changed files with 127 additions and 44 deletions.
15 changes: 10 additions & 5 deletions docs/source/clientcode.rst
@@ -114,6 +114,7 @@ For normal usage, there's not a single line of Pyro specific code once you have
.. index::
single: object serialization
double: serialization; pickle
double: serialization; cloudpickle
double: serialization; dill
double: serialization; serpent
double: serialization; marshal
@@ -167,6 +168,9 @@ on what objects you can use.
but it's safe to add to the accepted serializers config item if you have it installed.
* **pickle**: the legacy serializer. Fast and supports almost all types. Part of the standard library.
Has security problems, so it's better to avoid using it.
* **cloudpickle**: See https://pypi.python.org/pypi/cloudpickle It is similar to pickle serializer, but more capable. Extends python's 'pickle' module
for serializing and de-serializing python objects to the majority of the built-in python types.
Has security problems though, just as pickle.
* **dill**: See https://pypi.python.org/pypi/dill It is similar to pickle serializer, but more capable. Extends python's 'pickle' module
for serializing and de-serializing python objects to the majority of the built-in python types.
Has security problems though, just as pickle.
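To illustrate what "more capable" can mean in practice, here is a small sketch that is not part of this commit. It tries to serialize a lambda: the standard library pickle stores functions by reference and is expected to refuse, whereas cloudpickle, if it happens to be installed, serializes the function by value. The variable names are illustrative only.

```python
import pickle

square = lambda x: x * x  # a function without an importable name

# The standard library pickle stores functions by reference (module + name),
# so a lambda is expected to be rejected.
try:
    pickle.dumps(square)
    stdlib_pickle_ok = True
except (pickle.PicklingError, AttributeError, TypeError):
    stdlib_pickle_ok = False

print("stdlib pickle could serialize the lambda:", stdlib_pickle_ok)

# cloudpickle is an optional third-party package; skip quietly if it is missing.
try:
    import cloudpickle
except ImportError:
    cloudpickle = None

if cloudpickle is not None:
    blob = cloudpickle.dumps(square)      # the function body travels inside the blob
    restored = cloudpickle.loads(blob)    # rebuild the function on the "other side"
    print("restored(7) =", restored(7))
```

The same property that makes this convenient (code travels with the data) is what makes these serializers unsafe for untrusted peers.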
@@ -177,6 +181,7 @@ You select the serializer to be used by setting the ``SERIALIZER`` config item.
The valid choices are the names of the serializer from the list mentioned above.
If you're using pickle or dill, and need to control the protocol version that is used,
you can do so with the ``PICKLE_PROTOCOL_VERSION`` or ``DILL_PROTOCOL_VERSION`` config items.
If you're using cloudpickle, you can control the protocol version with ``PICKLE_PROTOCOL_VERSION`` as well.
By default Pyro will use the highest one available.
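A minimal sketch of how these config items could be set from client code, assuming a Pyro4 version that includes this change and an installed cloudpickle package. The import guard is only there so the snippet runs harmlessly where Pyro4 is absent, and the protocol number 2 is an arbitrary example value.

```python
try:
    import Pyro4
except ImportError:
    Pyro4 = None  # Pyro4 is not installed; nothing to configure

if Pyro4 is not None:
    # Use cloudpickle for requests sent by proxies in this process.
    Pyro4.config.SERIALIZER = "cloudpickle"
    # cloudpickle shares the pickle protocol setting; leave this out to keep
    # the default (the highest protocol available).
    Pyro4.config.PICKLE_PROTOCOL_VERSION = 2
```

The server side still has to accept the chosen serializer through ``SERIALIZERS_ACCEPTED``, otherwise the request is refused.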

It is possible to override the serializer on a particular proxy. This allows you to connect to one server
@@ -192,15 +197,15 @@ serializer, for instance. Set the desired serializer name in ``proxy._pyroSerial
.. note::
The serializer(s) that a Pyro server/daemon accepts, is controlled by a different
config item (``SERIALIZERS_ACCEPTED``). This can be a set of one or more serializers.
By default it accepts the set of 'safe' serializers, so "``pickle``" and "``dill``" are excluded.
If the server doesn't accept the serializer that you configured
By default it accepts the set of 'safe' serializers, so "``pickle``", "``cloudpickle``"
and "``dill``" are excluded. If the server doesn't accept the serializer that you configured
for your client, it will refuse the requests and respond with an exception that tells
you about the unsupported serializer choice. If it *does* accept your requests,
the server response will use the same serializer that was used for the request.

.. note::
Because the name server is just a regular Pyro server as well, you will have to tell
it to allow the pickle or dill serializers if your client code uses them.
it to allow the pickle, cloudpickle or dill serializers if your client code uses them.
See :ref:`nameserver-pickle`.


@@ -212,7 +217,7 @@ Changing the way your custom classes are (de)serialized
-------------------------------------------------------

.. note::
The information in this paragraph is not relevant when using the pickle or dill serialization protocols,
The information in this paragraph is not relevant when using the pickle, cloudpickle or dill serialization protocols,
they have their own ways of serializing custom classes.

By default, custom classes are serialized into a dict.
@@ -356,7 +361,7 @@ The signature of the batch proxy call is as follows:
Invoke the batch and when done, returns a generator that produces the results of every call, in order.
If ``oneway==True``, perform the whole batch as one-way calls, and return ``None`` immediately.
If ``async==True``, perform the batch asynchronously, and return an asynchronous call result object immediately.

**Simple example**::

batch = Pyro4.batch(proxy)
2 changes: 1 addition & 1 deletion docs/source/config.rst
@@ -101,7 +101,7 @@ PREFER_IP_VERSION int 4 The IP address type th
THREADPOOL_SIZE int 40 For the thread pool server: maximum number of threads running
THREADPOOL_SIZE_MIN int 4 For the thread pool server: minimum number of threads running
FLAME_ENABLED bool False Should Pyro Flame be enabled on the server
SERIALIZER str serpent The wire protocol serializer to use for clients/proxies (one of: serpent, json, marshal, msgpack, pickle, dill)
SERIALIZER str serpent The wire protocol serializer to use for clients/proxies (one of: serpent, json, marshal, msgpack, pickle, cloudpickle, dill)
SERIALIZERS_ACCEPTED set json,marshal,serpent The wire protocol serializers accepted in the server/daemon. In your code it should be a set of strings,
use a comma separated string instead when setting the shell environment variable.
PICKLE_PROTOCOL_VERSION int highest possible The pickle protocol version to use, if pickle is selected as serializer. Defaults to pickle.HIGHEST_PROTOCOL
6 changes: 3 additions & 3 deletions docs/source/install.rst
@@ -18,8 +18,8 @@ It will probably not work with Jython 2.7 at this time of writing. If you need t


.. note::
When Pyro is configured to use pickle, dill or marshal as its serialization format, it is required to have the same *major* Python versions
on your clients and your servers. Otherwise the different parties cannot decipher each others serialized data.
When Pyro is configured to use pickle, cloudpickle, dill or marshal as its serialization format, it is required to have the same
*major* Python versions on your clients and your servers. Otherwise the different parties cannot decipher each others serialized data.
This means you cannot let Python 2.x talk to Python 3.x with Pyro, when using those serializers.
However it should be fine to have Python 3.3 talk to Python 3.4 for instance.
The other protocols (serpent, json) don't have this limitation!
@@ -76,4 +76,4 @@ It contains:
and a couple of other files:
a setup script and other miscellaneous files such as the license (see :doc:`license`).

If you don't want to download anything, you can view all of this `online on Github <https://github.com/irmen/Pyro4>`_.
If you don't want to download anything, you can view all of this `online on Github <https://github.com/irmen/Pyro4>`_.
10 changes: 5 additions & 5 deletions docs/source/intro.rst
@@ -27,8 +27,8 @@ Here's a quick overview of Pyro's features:
- works between different system architectures and operating systems.
- able to communicate between different Python versions transparently.
- defaults to a safe serializer (`serpent <https://pypi.python.org/pypi/serpent>`_) that supports many Python data types.
- supports different serializers (serpent, json, marshal, msgpack, pickle, dill).
- support for all Python data types that are serializable when using the 'pickle' or 'dill' serializers [1]_.
- supports different serializers (serpent, json, marshal, msgpack, pickle, cloudpickle, dill).
- support for all Python data types that are serializable when using the 'pickle', 'cloudpickle' or 'dill' serializers [1]_.
- can use IPv4, IPv6 and Unix domain sockets.
- optional secure connections via SSL/TLS (encryption, authentication and integrity), including certificate validation on both ends (2-way ssl).
- lightweight client library available for .NET and Java native code ('Pyrolite', provided separately).
@@ -93,7 +93,7 @@ Remote controlling resources or other programs is a nice application as well.
For instance, you could write a simple
remote controller for your media server that is running on a machine somewhere in a closet.
A simple remote control client program could be used to instruct the media server
to play music, switch playlists, etc.
to play music, switch playlists, etc.

Another example is the use of Pyro to implement a form of `privilege separation <http://en.wikipedia.org/wiki/Privilege_separation>`_.
There is a small component running with higher privileges, but just able to execute the few tasks (and nothing else)
@@ -257,9 +257,9 @@ Experiment with the ``benchmark``, ``batchedcalls`` and ``hugetransfer`` example

.. rubric:: Footnotes

.. [1] When configured to use the :py:mod:`pickle` or :py:mod:`dill` serializer,
.. [1] When configured to use the :py:mod:`pickle`, :py:mod:`cloudpickle` or :py:mod:`dill` serializer,
your system may be vulnerable
because of the security risks of the pickle and dill protocols (possibility of arbitrary
because of the security risks of these serialization protocols (possibility of arbitrary
code execution).
Pyro does have some security measures in place to mitigate this risk somewhat.
They are described in the :doc:`security` chapter. It is strongly advised to read it.
19 changes: 10 additions & 9 deletions docs/source/nameserver.rst
@@ -520,16 +520,17 @@ You can control its behavior by setting certain Pyro config items before startin

.. index::
double: name server; pickle
double: name server; cloudpickle
double: name server; dill

.. _nameserver-pickle:

Using the name server with pickle or dill serializers
=====================================================
If you find yourself in the unfortunate situation where you absolutely have to use the pickle
Using the name server with pickle, cloudpickle or dill serializers
==================================================================
If you find yourself in the unfortunate situation where you absolutely have to use the pickle, cloudpickle
or dill serializers, you have to pay attention when also using the name server.
Because pickle and dill are disabled by default, the name server will not reply to messages from clients
that are using those serializers, unless you enable them in the name server as well.
Because these serializers are disabled by default, the name server will not reply to messages from clients
that are using them, unless you enable them in the name server as well.

The symptoms are usually that your client code seems unable to contact the name server::

@@ -546,15 +547,15 @@ And if you enable logging for the name server you will likely see in its logfile
...
Pyro4.errors.ProtocolError: message used serializer that is not accepted: [4,5]

The way to solve this is to stop using the pickle and dill serializers, or if you must use them,
The way to solve this is to stop using these serializers, or if you must use them,
tell the name server that it is okay to accept them. You do that by
setting the ``SERIALIZERS_ACCEPTED`` config item to a set of serializers that includes pickle or dill,
setting the ``SERIALIZERS_ACCEPTED`` config item to a set of serializers that includes them,
and then restart the name server. For instance::

$ export PYRO_SERIALIZERS_ACCEPTED=serpent,json,marshal,pickle,dill
$ export PYRO_SERIALIZERS_ACCEPTED=serpent,json,marshal,pickle,cloudpickle,dill
$ pyro4-ns

If you enable logging you will then see that the name server says that pickle and dill are among
If you enable logging you will then see that the name server says that pickle, cloudpickle and dill are among
the accepted serializers.


13 changes: 7 additions & 6 deletions docs/source/security.rst
@@ -16,14 +16,15 @@ Security

.. index::
double: security; pickle
double: security; cloudpickle
double: security; dill

Pickle and dill as serialization formats (optional)
===================================================
When configured to do so, Pyro is able to use the :py:mod:`pickle` module or the
:py:mod:`dill` module to serialize objects and then sends them over the network.
It is well known that using pickle or dill for this purpose is a security risk.
The main problem is that allowing a program to unpickle or undill arbitrary data
Pickle, cloudpickle and dill as serialization formats (optional)
================================================================
When configured to do so, Pyro is able to use the :py:mod:`pickle`, :py:mod:`cloudpickle`
or :py:mod:`dill` modules to serialize objects and then sends them over the network.
It is well known that using these serializers for this purpose is a security risk.
The main problem is that allowing a program to deserialize this type of serialized data
can cause arbitrary code execution and this may wreck or compromise your system.
Because of this the default serializer is serpent, which doesn't have this security problem.
Some other means to enhance security are discussed below.
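The risk can be made concrete with a deliberately harmless sketch that uses only the standard library (the class name is made up for illustration). An object's ``__reduce__`` method tells pickle which callable to invoke at load time, so whoever produces the bytes decides what the receiver executes. cloudpickle and dill extend pickle and inherit this behaviour.

```python
import pickle


class Harmless:
    """Stand-in for a malicious payload; it only evaluates an arithmetic expression."""

    def __reduce__(self):
        # (callable, args): pickle.loads will call eval("6 * 7") on the receiving side.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Harmless())

# The receiver does not need the Harmless class; it simply runs the embedded call.
result = pickle.loads(blob)
print(result)  # → 42
```

Replace the arithmetic expression with anything else ``eval`` accepts and it becomes clear why a server should only accept these serializers from trusted clients.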
15 changes: 8 additions & 7 deletions docs/source/tipstricks.rst
@@ -142,16 +142,16 @@ The success ratio of all this depends heavily on your network setup.

.. index:: same Python version

Same major Python version required when using pickle, dill or marshal
=====================================================================
Same major Python version required when using pickle, cloudpickle, dill or marshal
==================================================================================

When Pyro is configured to use pickle, dill or marshal as its serialization format, it is required to have the same *major* Python versions
When Pyro is configured to use pickle, cloudpickle, dill or marshal as its serialization format, it is required to have the same *major* Python versions
on your clients and your servers. Otherwise the different parties cannot decipher each others serialized data.
This means you cannot let Python 2.x talk to Python 3.x with Pyro when using pickle, dill or marshal as serialization protocols. However
This means you cannot let Python 2.x talk to Python 3.x with Pyro when using these serializers. However
it should be fine to have Python 3.3 talk to Python 3.4 for instance.
It may still be required to specify the pickle or dill protocol version though, because that needs to be the same on both ends as well.
For instance, Python 3.4 introduced version 4 of the pickle protocol and as such won't be able to talk to Python 3.3 which is stuck
on version 3 pickle protocol. You'll have to tell the Python 3.4 side to step down to protocol 3. There is a config item for that. The same will apply for dill protocol versions.
on version 3 pickle protocol. You'll have to tell the Python 3.4 side to step down to protocol 3. There is a config item for that. The same will apply for dill protocol versions. If you are using cloudpickle, you can just set the pickle protocol version (as pickle is used under the hood).

The implementation independent serialization protocols serpent and json don't have these limitations.
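Stepping down to an older pickle protocol can be sketched with the standard library alone. The helper name ``declared_protocol`` is hypothetical; it reads the version byte that protocol 2 and later write at the start of the stream. In Pyro the equivalent knob is the ``PICKLE_PROTOCOL_VERSION`` config item, which cloudpickle shares.

```python
import pickle

payload = {"numbers": [1, 2, 3], "name": "example"}


def declared_protocol(blob):
    """Return the protocol number stored in a pickle header (protocol 2 and later)."""
    # Such pickles start with the PROTO opcode, byte 0x80, followed by the version byte.
    return blob[1] if blob[0] == 0x80 else None


# What a newer Python would send by default.
newest = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)
# What it sends after being told to step down to protocol 3.
stepped_down = pickle.dumps(payload, protocol=3)

print(declared_protocol(newest), declared_protocol(stepped_down))
```

Both ends can read the protocol 3 stream, while only the newer side is guaranteed to understand its own highest protocol.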

@@ -488,8 +488,9 @@ So if you want to use them with Pyro, and pass them over the wire, you'll have t
``list(na)`` doesn't work: it seems to return a regular python list but the elements are still numpy datatypes.
You have to use the full conversions as mentioned earlier.
#. Don't return arrays at all. Redesign your API so that you might perhaps only return a single element from it.
#. Tell Pyro to use :py:mod:`pickle` or :py:mod:`dill` as serializer. Pickle and Dill can deal with numpy datatypes. However they have security implications.
See :doc:`security`. If you choose to use pickle or dill anyway, also be aware that you must tell your name server
#. Tell Pyro to use :py:mod:`pickle`, :py:mod:`cloudpickle` or :py:mod:`dill` as serializer. These serializers
can deal with numpy datatypes. However they have security implications.
See :doc:`security`. If you choose to use them anyway, also be aware that you must tell your name server
about it as well, see :ref:`nameserver-pickle`.


39 changes: 39 additions & 0 deletions src/Pyro4/util.py
@@ -356,6 +356,8 @@ def dict_to_class(cls, data):
return JsonSerializer()
elif classname == "Pyro4.util.MsgpackSerializer":
return MsgpackSerializer()
elif classname == "Pyro4.util.CloudpickleSerializer":
return CloudpickleSerializer()
elif classname == "Pyro4.util.DillSerializer":
return DillSerializer()
elif classname.startswith("Pyro4.errors."):
@@ -459,6 +461,36 @@ def copyreg_function(obj):
pass


class CloudpickleSerializer(SerializerBase):
    """
    A (de)serializer that wraps the Cloudpickle serialization protocol.
    It can optionally compress the serialized data, and is thread safe.
    """
    serializer_id = 7  # never change this

    def dumpsCall(self, obj, method, vargs, kwargs):
        return cloudpickle.dumps((obj, method, vargs, kwargs), config.PICKLE_PROTOCOL_VERSION)

    def dumps(self, data):
        return cloudpickle.dumps(data, config.PICKLE_PROTOCOL_VERSION)

    def loadsCall(self, data):
        return cloudpickle.loads(data)

    def loads(self, data):
        return cloudpickle.loads(data)

    @classmethod
    def register_type_replacement(cls, object_type, replacement_function):
        def copyreg_function(obj):
            return replacement_function(obj).__reduce__()

        try:
            copyreg.pickle(object_type, copyreg_function)
        except TypeError:
            pass


class DillSerializer(SerializerBase):
"""
A (de)serializer that wraps the Dill serialization protocol.
@@ -720,6 +752,13 @@ def get_serializer_by_id(sid):
_ser = MarshalSerializer()
_serializers["marshal"] = _ser
_serializers_by_id[_ser.serializer_id] = _ser
try:
    import cloudpickle
    _ser = CloudpickleSerializer()
    _serializers["cloudpickle"] = _ser
    _serializers_by_id[_ser.serializer_id] = _ser
except ImportError:
    pass
try:
    import dill
    _ser = DillSerializer()
1 change: 1 addition & 0 deletions test_requirements.txt
@@ -1,3 +1,4 @@
-r requirements.txt
cloudpickle>=0.4.0
dill>=0.2.6
msgpack-python>=0.4.6
8 changes: 8 additions & 0 deletions tests/PyroTests/test_daemon.py
@@ -62,6 +62,7 @@ def testSerializerConfig(self):
    def testSerializerAccepted(self):
        self.assertIn("marshal", config.SERIALIZERS_ACCEPTED)
        self.assertNotIn("pickle", config.SERIALIZERS_ACCEPTED)
        self.assertNotIn("cloudpickle", config.SERIALIZERS_ACCEPTED)
        self.assertNotIn("dill", config.SERIALIZERS_ACCEPTED)
        with Pyro4.core.Daemon(port=0) as d:
            msg = Pyro4.message.Message(Pyro4.message.MSG_INVOKE, b"", Pyro4.util.MarshalSerializer.serializer_id, 0, 0, hmac_key=d._pyroHmacKey)
@@ -75,6 +76,13 @@ def testSerializerAccepted(self):
            except Pyro4.errors.ProtocolError as x:
                self.assertIn("serializer that is not accepted", str(x))
                pass
            msg = Pyro4.message.Message(Pyro4.message.MSG_INVOKE, b"", Pyro4.util.CloudpickleSerializer.serializer_id, 0, 0, hmac_key=d._pyroHmacKey)
            cm = ConnectionMock(msg)
            try:
                d.handleRequest(cm)
                self.fail("should crash")
            except Pyro4.errors.ProtocolError as x:
                self.assertTrue("no serializer available for id" in str(x) or "serializer that is not accepted" in str(x))
            msg = Pyro4.message.Message(Pyro4.message.MSG_INVOKE, b"", Pyro4.util.DillSerializer.serializer_id, 0, 0, hmac_key=d._pyroHmacKey)
            cm = ConnectionMock(msg)
            try: