14 PEP 307: Pickle Enhancements

The pickle and cPickle modules received some attention during the 2.3 development cycle. In 2.2, new-style classes could be pickled without difficulty, but they weren't pickled very compactly; PEP 307 quotes a trivial example where a new-style class results in a pickled string three times longer than that for a classic class.

The solution was to invent a new pickle protocol. The pickle.dumps() function has supported a text-or-binary flag for a long time. In 2.3, this flag is redefined from a Boolean to an integer: 0 is the old text-mode pickle format, 1 is the old binary format, and now 2 is a new 2.3-specific format. A new constant, pickle.HIGHEST_PROTOCOL, can be used to select the fanciest protocol available.
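As a minimal sketch of selecting a protocol by number, the snippet below pickles the same object with each protocol and then with pickle.HIGHEST_PROTOCOL. The sample dictionary and variable names are illustrative assumptions, and the sizes depend on the interpreter version, so no particular numbers are claimed.

```python
import pickle

# Hypothetical sample data, used only for illustration.
data = {"name": "example", "values": list(range(10))}

# 0 is the text format, 1 the old binary format, 2 the format added in 2.3.
sizes = {}
for protocol in (0, 1, 2):
    sizes[protocol] = len(pickle.dumps(data, protocol))

# HIGHEST_PROTOCOL picks the newest protocol the running interpreter supports.
newest = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(newest)
```

Note that loads() needs no protocol argument; the reader works out the format from the pickle itself.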

Unpickling is no longer considered a safe operation. 2.2's pickle provided hooks for trying to prevent unsafe classes from being unpickled (specifically, a __safe_for_unpickling__ attribute), but none of this code was ever audited and therefore it's all been ripped out in 2.3. You should not unpickle untrusted data in any version of Python.
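To see why unpickling cannot be treated as safe, consider the sketch below. It relies on the __reduce__ hook, which this section does not discuss, and the Demo class is a hypothetical name. A harmless built-in stands in for whatever callable an attacker might choose.

```python
import pickle

class Demo(object):
    """Hypothetical class; only its __reduce__ method matters here."""

    def __reduce__(self):
        # On unpickling, pickle calls the returned callable with these
        # arguments. The author of the pickle chooses the callable.
        return (sorted, ([3, 1, 2],))

payload = pickle.dumps(Demo(), 2)

# loads() does not rebuild a Demo instance; it runs the callable named
# in the payload and returns whatever that call produced.
result = pickle.loads(payload)
```

Here the call is only sorted(), but the same mechanism can name any importable callable, which is why untrusted pickles should never be loaded.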

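To reduce the pickling overhead for new-style classes, a new interface for customizing pickling was added. It uses three special methods: __getstate__ returns the state to be pickled, __setstate__ restores that state on unpickling, and __getnewargs__ supplies the arguments passed to __new__ when the object is recreated. Consult PEP 307 for the full semantics of these methods.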
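The following is a rough sketch of how the three methods might fit together under protocol 2; it is not taken from PEP 307. The Celsius class and its attributes are hypothetical.

```python
import pickle

class Celsius(object):
    """Hypothetical new-style class used only for illustration."""

    def __new__(cls, degrees):
        # On unpickling, __new__ receives the tuple from __getnewargs__.
        self = object.__new__(cls)
        self.degrees = degrees
        return self

    def __init__(self, degrees):
        self.note = ""      # ordinary state, saved through __getstate__
        self._cache = {}    # scratch data, deliberately left out of the pickle

    def __getnewargs__(self):
        return (self.degrees,)

    def __getstate__(self):
        return {"note": self.note}

    def __setstate__(self, state):
        # __init__ is not called on unpickling, so rebuild the cache here.
        self.note = state["note"]
        self._cache = {}


original = Celsius(21.5)
original.note = "indoor"
original._cache["fahrenheit"] = 70.7

restored = pickle.loads(pickle.dumps(original, 2))
```

After the round trip, restored carries the degrees and note values of the original, while its cache starts out empty.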

As a way to compress pickles yet further, it's now possible to use integer codes instead of long strings to identify pickled classes. The Python Software Foundation will maintain a list of standardized codes; there's also a range of codes for private use. Currently no codes have been specified.
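A small sketch of registering an integer code follows. It assumes the registry functions live in the copy_reg module (named copyreg in Python 3), which this section does not name, and that 240 falls in the private-use range described by PEP 307. Fraction is used only as a convenient importable class.

```python
import pickle
from fractions import Fraction

try:
    import copyreg                  # Python 3 module name
except ImportError:
    import copy_reg as copyreg      # Python 2 module name

module_name, class_name = Fraction.__module__, Fraction.__name__
PRIVATE_CODE = 240   # assumption: a code in PEP 307's private-use range

# Without a registered code, the class is identified by its full name.
without_code = pickle.dumps(Fraction, 2)

copyreg.add_extension(module_name, class_name, PRIVATE_CODE)
try:
    # With a code registered, protocol 2 writes the integer instead.
    with_code = pickle.dumps(Fraction, 2)
    restored_cls = pickle.loads(with_code)
finally:
    # Undo the registration so the demo leaves no global state behind.
    copyreg.remove_extension(module_name, class_name, PRIVATE_CODE)
```

Both the writer and the reader must register the same code, since the pickle no longer contains the class name.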

See Also:

PEP 307, Extensions to the pickle protocol
Written and implemented by Guido van Rossum and Tim Peters.
