from __future__ import *

2005-12-26

simple_json 1.0

Filed under: python, simplejson — bob @ 12:29 am

simple_json is a simple, fast, complete, correct and extensible JSON encoder/decoder for Python 2.3+. It's pure Python code with no dependencies.

simple_json exists because json-py sucks. simple_json is a drop-in replacement for json-py, but it also exposes more sanely named APIs, and can be extended by subclassing.

Here are the issues I found in json-py after evaluating the source:

  • LGPL (does this license have a clear interpretation for Python modules?)
  • Doesn't have a proper egg (or source) distribution on Cheese Shop.
  • Wonky API. read and write are very bad names for functions that don't act file-like!
  • No streaming encoder support.
  • The decoder is extremely inefficient as it invokes at least one method call per character of input.
  • The encoder supports exactly these types: dict, list, tuple, str, unicode, int, long, float plus the singletons True, False, and None. It can't be made to support anything else, not even subclasses of those types. The implementation is in a single function and has no extensibility hook.
  • The encoder has no clue about unicode. Depending on the input, it may return a str or unicode. It has no option to escape the output.
  • The decoder similarly has no clue about unicode. If it ain't ASCII or escaped, then BOOM!
  • It uses custom exception subclasses that descend directly from Exception, so they will not be caught by traditional except ValueError clauses.
  • The source code mixes tabs and spaces. That's uh.. reassuring :)
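To make the exception point concrete, here is a small sketch written for modern Python 3. The class names `CustomDecodeError` and `ValueBasedDecodeError` are hypothetical stand-ins invented for this example, not the actual names used by json-py or simple_json.

```python
class CustomDecodeError(Exception):
    """Hypothetical error that descends directly from Exception."""


class ValueBasedDecodeError(ValueError):
    """Hypothetical error that descends from ValueError instead."""


def handle(exc_type):
    """Raise exc_type and report which except clause ends up catching it."""
    try:
        raise exc_type("bad input")
    except ValueError:
        # Callers that already guard parsing code this way land here.
        return "caught"
    except Exception:
        # Errors outside the ValueError hierarchy skip the clause above.
        return "missed"


print(handle(CustomDecodeError))      # not a ValueError subclass
print(handle(ValueBasedDecodeError))  # is a ValueError subclass
```

Code that wraps parsing in a plain `except ValueError` keeps working only when the library's errors sit somewhere under ValueError.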

simple_json is designed to address all of those issues:

  • MIT license
  • It's on Cheese Shop, so setuptools users can depend on it with a simple install_requires
  • The official API follows the familiar convention of marshal and pickle
  • Encoding can be streamed (via dump or iterator)
  • The decoder is fast, because it uses regular expressions rather than processing each character with Python code
  • The encoder can be subclassed and extended to support serialization of any type, and it supports subclasses of dict, list, str, etc. by default
  • The encoder outputs ASCII by default, with unicode characters escaped with \uXXXX. Optionally, it can also output a unicode string with ensure_ascii=False.
  • The decoder understands encoded strings (and unicode). It defaults to UTF-8, but can use anything ASCII-based. Input in an encoding that is not ASCII-based (such as UCS-2) should be decoded to unicode before it is passed to the decoder.
  • Exceptions during encoding or decoding are simply ValueError (though a future version could provide more informative messages)
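As a rough illustration of the points above, here is a minimal sketch using the `json` module from the modern Python standard library, which descends from this project and keeps the same dump/dumps/load/loads convention. It is written for Python 3 rather than 2.3, and the simple_json 1.0 API may differ in detail. `DecimalEncoder` and `Tags` are hypothetical names made up for the example.

```python
import io
import json
from decimal import Decimal


class DecimalEncoder(json.JSONEncoder):
    """Hypothetical encoder subclass: serialize Decimal values as strings."""

    def default(self, o):
        if isinstance(o, Decimal):
            return str(o)
        # Anything else is left to the base class to handle.
        return super().default(o)


class Tags(list):
    """A list subclass; it is encoded as a JSON array with no extra work."""


data = {"name": "caf\u00e9", "tags": Tags(["a", "b"])}

# dumps/loads follow the marshal and pickle naming convention.
text = json.dumps(data)                    # ASCII-only output by default
raw = json.dumps(data, ensure_ascii=False)  # non-ASCII characters kept as-is

# Streaming: write to a file-like object, or iterate over encoded chunks.
buf = io.StringIO()
json.dump(data, buf)
chunks = list(json.JSONEncoder().iterencode(data))

# Extending the encoder to a type it does not support out of the box.
priced = json.dumps({"price": Decimal("1.50")}, cls=DecimalEncoder)

# Malformed input surfaces as a ValueError (or a subclass of it).
try:
    json.loads("{not json")
    caught = False
except ValueError:
    caught = True
```

The `default` hook is only consulted for objects the encoder cannot already serialize, which is why the list subclass needs no special handling.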

4 Comments

  1. Hmm, interesting. This summer I wrote a JSONReader C module (basically it will parse into native Python structures through a Python extension) with very high efficiency. I offered it to be included into json-py, and the author said yes, but I haven’t heard from him since.
    The module is still unreleased because of this, so maybe it can find its way into Simple-JSON? It’s only a decoder, not an encoder. Mail me if you’re interested (I’ll be on vacation the coming week though).

    Comment by Koen — 2005-12-26 @ 12:53 pm

  2. It would be nice to have it as an optional extension, like ElementTree vs. cElementTree. Do you have benchmarks?

    Comment by bob — 2005-12-26 @ 6:14 pm

  3. This is a much better implementation, good job. But to be fair, json-py is a straight Python implementation of the JavaScript reference implementation.

    Comment by nick — 2005-12-29 @ 9:19 am

  4. Okay, I have just checked my JSONReader against SimpleJSON on a 6MB and a 4.5MB file.
    On an Athlon XP 2000+ I get the following times:
    6 MB file: 1.2s for JSONReader, 31.5s for SimpleJSON.
    4.5 MB file: 1.1s for JSONReader, 24.1s for SimpleJSON.

    I have to make some notes:
    - JSONReader uses the standard C API to open a file through a filename. If I read the file into memory completely in Python and then use JSONReader’s from-string reading API (i.e. not using the standard C file API), then the times are 1.3s and 1.1s respectively. Supporting a StringIO API is non-trivial because of the lex/yacc basis, and I haven’t looked at this yet. Perhaps it is possible to make it use a .read() interface, but I think it will be hard.
    - Only works on ASCII strings. Characters 128-255 are kept as normal characters and not interpreted at all. So they should come out the way they come in (I do not rely on this behavior, I only use \uXXXX).

    Comment by Koen — 2006-01-03 @ 7:32 am
