Python’s SQLAlchemy vs Other ORMs

Overview of Python ORMs

As a wonderful language, Python has lots of ORM libraries besides SQLAlchemy. In this article, we are going to take a look at several popular alternative ORM libraries to better understand the big picture of the Python ORM landscape. By writing a script that reads and writes to a simple database with two tables, person and address, we will gain a better understanding about the pros and cons of each ORM library.

SQLObject

SQLObject is a Python ORM that maps objects between a SQL database and Python. It is becoming more popular in the programming community due to its similarity to Ruby on Rails’ ActiveRecord pattern. The first version of SQLObject was released in October 2002. It is licensed under LGPL.

In SQLObject, database concepts are mapped into Python in a way that’s very similar to SQLAlchemy, where tables are mapped as classes, rows as instances and columns as attributes. It also provides a Python-object-based query language that makes SQL more abstract, thus providing database agnosticity for applications.

$ pip install sqlobject
Downloading/unpacking sqlobject
Downloading SQLObject-1.5.1.tar.gz (276kB): 276kB downloaded
Running setup.py egg_info for package sqlobject
 
warning: no files found matching '*.html'
warning: no files found matching '*.css'
warning: no files found matching 'docs/*.html'
warning: no files found matching '*.py' under directory 'tests'
Requirement already satisfied (use --upgrade to upgrade): FormEncode>=1.1.1 in /Users/xiaonuogantan/python2-workspace/lib/python2.7/site-packages (from sqlobject)
Installing collected packages: sqlobject
Running setup.py install for sqlobject
changing mode of build/scripts-2.7/sqlobject-admin from 644 to 755
changing mode of build/scripts-2.7/sqlobject-convertOldURI from 644 to 755
 
warning: no files found matching '*.html'
warning: no files found matching '*.css'
warning: no files found matching 'docs/*.html'
warning: no files found matching '*.py' under directory 'tests'
changing mode of /Users/xiaonuogantan/python2-workspace/bin/sqlobject-admin to 755
changing mode of /Users/xiaonuogantan/python2-workspace/bin/sqlobject-convertOldURI to 755
Successfully installed sqlobject
Cleaning up...
>>> from sqlobject import StringCol, SQLObject, ForeignKey, sqlhub, connectionForURI
>>> sqlhub.processConnection = connectionForURI('sqlite:/:memory:')
>>>
>>> class Person(SQLObject):
...     name = StringCol()
...
>>> class Address(SQLObject):
...     address = StringCol()
...     person = ForeignKey('Person')
...
>>> Person.createTable()
[]
>>> Address.createTable()
[]

The code above created two simple tables: person and address. To create or insert records into these two tables, we simply instantiate a person and an address like normal Python objects:

>>> p = Person(name='person')
>>> a = Address(address='address', person=p)
>>> p
 
>>> a
<address>

To get or retrieve the new records from the database, we use the magical q object attached to the Person and Address classes:

>>> persons = Person.select(Person.q.name == 'person')
>>> persons
 
>>> list(persons)
[]
>>> p1 = persons[0]
>>> p1 == p
True
>>> addresses = Address.select(Address.q.person == p1)
>>> addresses
 
>>> list(addresses)
[<address>]
>>> a1 = addresses[0]
>>> a1 == a
True

Storm

Storm is a Python ORM that maps objects between one or more databases and Python. It allows developers to construct complex queries across multiple database tables to support dynamic storage and retrieval of object information. It was developed in Python at Canonical Ltd., the company behind Ubuntu, for use in the Launchpad and Landscape applications and subsequently released in 2007 as free software. The project is released under the LGPL license and contributors are required to assign copyrights to Canonical.

Like SQLAlchemy and SQLObject, Storm also maps tables to classes, rows to instances and columns to attributes. Compared to the other two, Storm’s table classes do not have to be subclasses of a special framework-specific superclass. In SQLAlchemy, every table class is a subclass of sqlalchemy.ext.declarative.declarative_bas. In SQLObject, every table class is a subclass of sqlobject.SQLObject.

Similar to SQLAlchemy, Storm’s Store object acts as a surrogate to the backend database, where all the operations are cached in-memory and committed into the database once the method commit is called on the store. Each store holds its own set of mapped Python database objects, just like a SQLAlchemy session holding different sets of Python objects.

Specific versions of Storm can be downloaded from the download page. In this article, the example code is written in Storm version 0.20.

>>> from storm.locals import Int, Reference, Unicode, create_database, Store
>>>
>>>
>>> db = create_database('sqlite:')
>>> store = Store(db)
>>>
>>>
>>> class Person(object):
...     __storm_table__ = 'person'
...     id = Int(primary=True)
...     name = Unicode()
...
>>>
>>> class Address(object):
...     __storm_table__ = 'address'
...     id = Int(primary=True)
...     address = Unicode()
...     person_id = Int()
...     person = Reference(person_id, Person.id)
...

The code above created an in-memory sqlite database and a store to reference that database object. A Storm store is similar to a SQLAlchemy DBSession object, both of which manage the life cycles of instance objects attached to them. For example, the following code creates a person and an address, and inserts both records into the database by flushing the store.

>>> store.execute("CREATE TABLE person "
... "(id INTEGER PRIMARY KEY, name VARCHAR)")
 
>>> store.execute("CREATE TABLE address "
... "(id INTEGER PRIMARY KEY, address VARCHAR, person_id INTEGER, "
... " FOREIGN KEY(person_id) REFERENCES person(id))")
 
>>> person = Person()
>>> person.name = u'person'
>>> print person
 
>>> print "%r, %r" % (person.id, person.name)
None, u'person' # Notice that person.id is None since the Person instance is not attached to a valid database store yet.
>>> store.add(person)
>>>
>>> print "%r, %r" % (person.id, person.name)
None, u'person' # Since the store hasn't flushed the Person instance into the sqlite database yet, person.id is still None.
>>> store.flush()
>>> print "%r, %r" % (person.id, person.name)
1, u'person' # Now the store has flushed the Person instance, we got an id value for person.
>>> address = Address()
>>> address.person = person
>>> address.address = 'address'
>>> print "%r, %r, %r" % (address.id, address.person, address.address)
None, , 'address'
>>> address.person == person
True
>>> store.add(address)
>>>
>>> store.flush()
>>> print "%r, %r, %r" % (address.id, address.person, address.address)
1, , 'address'

To get or retrieve the inserted Person and Address objects, we call store.find() to find them:

>>> print "%r, %r" % (person.id, person.name)
1, u'person'
>>> store.find(Address, Address.person == person).one()
 
>>> address = store.find(Address, Address.person == person).one()
>>> print "%r, %r" % (address.id, address.address)
1, u'address'

Django’s ORM

Django is a free and open source web application framework whose ORM is built tightly into the system. After its initial release, Django becomes more and more popular due to its straightforward design and easy-to-use web-ready features. It was released under the BSD license in July 2005. Since Django’s ORM is built tightly into the web framework, it’s not recommended, although possible, to use its ORM in a standalone non-Django Python application.

Django, one of the most popular Python web frameworks, has its own dedicated ORM. Compared to SQLAlchemy, Django’s ORM is more geared towards direct SQL object manipulation where it exposes a plain and direct mapping between database tables and Python classes.

$ django-admin.py startproject demo
$ cd demo
$ python manage.py syncdb
Creating tables ...
Creating table django_admin_log
Creating table auth_permission
Creating table auth_group_permissions
Creating table auth_group
Creating table auth_user_groups
Creating table auth_user_user_permissions
Creating table auth_user
Creating table django_content_type
Creating table django_session
 
You just installed Django's auth system, which means you don't have any superusers defined.
Would you like to create one now? (yes/no): no
Installing custom SQL ...
Installing indexes ...
Installed 0 object(s) from 0 fixture(s)
$ python manage.py shell

Since we cannot execute Django’s code without creating a project first, we created a Django project ‘demo’ in the previous shell and entered the Django shell to test our ORM example.

# demo/models.py
>>> from django.db import models
>>>
>>>
>>> class Person(models.Model):
...     name = models.TextField()
...
...     class Meta:
...         app_label = 'demo'
...
>>>
>>> class Address(models.Model):
...     address = models.TextField()
...     person = models.ForeignKey(Person)
...
...     class Meta:
...         app_label = 'demo'

The code above declared two Python classes, Person and Address, each of which maps to a database table. Before execute any database manipulation code, we need to create the tables in a local sqlite database.

python manage.py syncdb
Creating tables ...
Creating table demo_person
Creating table demo_address
Installing custom SQL ...
Installing indexes ...
Installed 0 object(s) from 0 fixture(s)

To insert a person and an address into the database, we instantiate the corresponding objects and call the save() methods of those objects.

>>> from demo.models import Person, Address
>>> p = Person(name='person')
>>> p.save()
>>> print "%r, %r" % (p.id, p.name)
1, 'person'
>>> a = Address(person=p, address='address')
>>> a.save()
>>> print "%r, %r" % (a.id, a.address)
1, 'address'

To get or retrieve the person and address objects, we use the magical objects attribute of the model classes to fetch the objects from the database.

>>> persons = Person.objects.filter(name='person')
>>> persons
[]
>>> p = persons[0]
>>> print "%r, %r" % (p.id, p.name)
1, u'person'
>>> addresses = Address.objects.filter(person=p)
>>> addresses
[<address>]
>>> a = addresses[0]
>>> print "%r, %r" % (a.id, a.address)
1, u'address'

peewee

peewee is a small, expressive ORM. Compared to other ORMs, peewee focuses on the principal of minimalism where the API is simple and the library is easy to use and understand.

pip install peewee
Downloading/unpacking peewee
Downloading peewee-2.1.7.tar.gz (1.1MB): 1.1MB downloaded
Running setup.py egg_info for package peewee
 
Installing collected packages: peewee
Running setup.py install for peewee
changing mode of build/scripts-2.7/pwiz.py from 644 to 755
 
changing mode of /Users/xiaonuogantan/python2-workspace/bin/pwiz.py to 755
Successfully installed peewee
Cleaning up...

To create a database model mapping, we implement a Person class and an Address class that map to the corresponding database tables.

>>> from peewee import SqliteDatabase, CharField, ForeignKeyField, Model
>>>
>>> db = SqliteDatabase(':memory:')
>>>
>>> class Person(Model):
...     name = CharField()
...     
...     class Meta:
...         database = db
...
>>>
>>> class Address(Model):
...     address = CharField()
...     person = ForeignKeyField(Person)
...
...     class Meta:
...         database = db
...
>>> Person.create_table()
>>> Address.create_table()

To insert objects into the database, we instantiate the objects and call their save() methods. From object creation point of view, peewee is very similar to Django.

>>> p = Person(name='person')
>>> p.save()
>>> a = Address(address='address', person=p)
>>> a.save()

To get or retrieve the objects from the database, we select the objects from their respective classes.

>>> person = Person.select().where(Person.name == 'person').get()
>>> person
>>>
>>> print '%r, %r' % (person.id, person.name)
1, u'person'
>>> address = Address.select().where(Address.person == person).get()
>>> print '%r, %r' % (address.id, address.address)
1, u'address'
PonyORM
PonyORM allows you to query the database using Python generators. These generators are translated into SQL and the results are automatically mapped into Python objects. Writing queries as Python generators makes it easy for programmers to quickly construct certain queries.

For example, let's use PonyORM to query the previous Person and Address models in a SQLite database.
>>> from pony.orm import Database, Required, Set
>>>
>>> db = Database('sqlite', ':memory:')
>>>
>>>
>>> class Person(db.Entity):
...     name = Required(unicode)
...     addresses = Set("Address")
...
>>>
>>> class Address(db.Entity):
...     address = Required(unicode)
...     person = Required(Person)
...
>>> db.generate_mapping(create_tables=True)

Now we have a SQLite database in memory and two tables mapped to the db object, we can insert two objects into the database.

>>> p = Person(name="person")
>>> a = Address(address="address", person=p)
>>> db.commit()

The call db.commit() actually commits the new objects p and a into the database. Now we can query the database using the generator syntax.

>>> from pony.orm import select
>>> select(p for p in Person if p.name == "person")[:]
[Person[1]]
>>> select(p for p in Person if p.name == "person")[:][0].name
u'person'
>>> select(a for a in Address if a.person == p)[:]
[Address[1]]
>>> select(a for a in Address if a.person == p)[:][0].address
u'address'

SQLAlchemy

SQLAlchemy is an open source SQL toolkit and ORM for the Python programming language released under the MIT license. It was released initially in February 2006 and written by Michael Bayer. It provides “a full suite of well known enterprise-level persistence patterns, designed for efficient and high-performing database access, adapted into a simple and Pythonic domain language”. It has adopted the data mapper pattern (like Hibernate in Java) rather than the active record pattern (like the one in Ruby on Rails).

SQLAlchemy’s unit-of-work principal makes it essential to confine all the database manipulation code to a specific database session that controls the life cycles of every object in that session. Similar to other ORMs, we start by defining subclasses of declarative_base() in order to map tables to Python classes.

>>> from sqlalchemy import Column, String, Integer, ForeignKey
>>> from sqlalchemy.orm import relationship
>>> from sqlalchemy.ext.declarative import declarative_base
>>>
>>> Base = declarative_base()
>>>
>>>
>>> class Person(Base):
...     __tablename__ = 'person'
...     id = Column(Integer, primary_key=True)
...     name = Column(String)
...
>>>
>>> class Address(Base):
...     __tablename__ = 'address'
...     id = Column(Integer, primary_key=True)
...     address = Column(String)
...     person_id = Column(Integer, ForeignKey(Person.id))
...     person = relationship(Person)
...

Before we write any database code, we need to create an database engine for our db session.

>>> from sqlalchemy import create_engine
>>> engine = create_engine('sqlite:///')

Once we have created a database engine, we can proceed to create a database session and create tables for all the database classes previously defined as Person and Address.

>>> from sqlalchemy.orm import sessionmaker
>>> session = sessionmaker()
>>> session.configure(bind=engine)
>>> Base.metadata.create_all(engine)

Now the session object becomes our unit-of-work constructor and all the subsequent database manipulation code and objects will be attached to a db session constructed by calling its __init__() method.

>>> s = session()
>>> p = Person(name='person')
>>> s.add(p)
>>> a = Address(address='address', person=p)
>>> s.add(a)

To get or retrieve the database objects, we call query() and filter() methods from the db session object.

>>> p = s.query(Person).filter(Person.name == 'person').one()
>>> p
 
>>> print "%r, %r" % (p.id, p.name)
1, 'person'
>>> a = s.query(Address).filter(Address.person == p).one()
>>> print "%r, %r" % (a.id, a.address)
1, 'address'

Notice that so far we haven’t committed any changes to the database yet so that the new person and address objects are not actually stored in the database yet. Calling s.commit() will actually commit the changes, i.e., inserting a new person and a new address, into the database.

>>> s.commit()
>>> s.close()

Comparison Between Python ORMs

For each Python ORM presented in this article, we are going to list their pros and cons here:

SQLObject

Pros:

  1. Adopted the easy-to-understand ActiveRecord pattern
  2. A relatively small codebase

Cons:

  1. Naming of methods and classes follow Java’s camelCase style
  2. Does not support database sessions to isolate unit of work

Storm

Pros:

  1. A clean and lightweight API leading to short learning curve and long-term maintainability
  2. Does not need special class constructors, nor imperative base classes

Cons:

  1. Forcing the programmer to write manual table-creation DDL statements instead of automatically deriving it from the model class
  2. Contributors of Storm have to give their contributions’ copyrights to Canonical Ltd.

Django’s ORM

Pros:

  1. Easy-to-use with a short learning curve
  2. Tightly integrated with Django to make it the de-factor standard when dealing with databases in Django

Cons:

  1. Does not handle complex queries very well; forcing the developer to go back to raw SQL
  2. Tightly integrated with Django; making it hard to use outside of a Django context

peewee

Pros:

  1. A Django-ish API; making it easy-to-use
  2. A lightweight implementation; making it easy to integrate with any web framework

Cons:

  1. Does not support automatic schema migrations
  2. Many-to-Many queries are not intuitive to write

SQLAlchemy

Pros:

  1. Enterprise-level APIs; making the code robust and adaptable
  2. Flexible design; making it painless to write complex queries

Cons:

  1. The Unit-of-work concept is not common
  2. A heavyweight API; leading to a long learning curve

PonyORM

Pros:

  1. A very convenient syntax for writing queries
  2. Automatic query optimization
  3. Simplified setup and usage

Cons:

  1. Not designed to process hundreds of thousands or millions of records simultaneously

Summary and Tips

Compared to other ORMs, SQLAlchemy stands out in its focus on the unit-of-work concept which is prevalent whenever you write SQLAlchemy code. The DBSession concept might be hard to understand and use correctly initially, but later you will appreciate the additional complexity which reduces accidental database commit-timing-related bugs to almost zero. Dealing with multiple databases in SQLAlchemy can be tricky since each DB session is confined to one database connection. However, this kind of limitation is actually a good thing since it forces you to think hard about the interaction between multiple databases and make it easier to debug database interaction code.

In the future articles, we are going to fully explore more advanced use cases of SQLAlchemy to truly grasp its immensely powerful APIs.

Leave a Reply

Your email address will not be published. Required fields are marked *