Overview of Python ORMs
As a wonderful language, Python has lots of ORM libraries besides SQLAlchemy. In this article, we are going to take a look at several popular alternative ORM libraries to better understand the big picture of the Python ORM landscape. By writing a script that reads and writes to a simple database with two tables,
address, we will gain a better understanding about the pros and cons of each ORM library.
SQLObject is a Python ORM that maps objects between a SQL database and Python. It is becoming more popular in the programming community due to its similarity to Ruby on Rails’ ActiveRecord pattern. The first version of SQLObject was released in October 2002. It is licensed under LGPL.
In SQLObject, database concepts are mapped into Python in a way that’s very similar to SQLAlchemy, where tables are mapped as classes, rows as instances and columns as attributes. It also provides a Python-object-based query language that makes SQL more abstract, thus providing database agnosticity for applications.
$ pip install sqlobject Downloading/unpacking sqlobject Downloading SQLObject-1.5.1.tar.gz (276kB): 276kB downloaded Running setup.py egg_info for package sqlobject warning: no files found matching '*.html' warning: no files found matching '*.css' warning: no files found matching 'docs/*.html' warning: no files found matching '*.py' under directory 'tests' Requirement already satisfied (use --upgrade to upgrade): FormEncode>=1.1.1 in /Users/xiaonuogantan/python2-workspace/lib/python2.7/site-packages (from sqlobject) Installing collected packages: sqlobject Running setup.py install for sqlobject changing mode of build/scripts-2.7/sqlobject-admin from 644 to 755 changing mode of build/scripts-2.7/sqlobject-convertOldURI from 644 to 755 warning: no files found matching '*.html' warning: no files found matching '*.css' warning: no files found matching 'docs/*.html' warning: no files found matching '*.py' under directory 'tests' changing mode of /Users/xiaonuogantan/python2-workspace/bin/sqlobject-admin to 755 changing mode of /Users/xiaonuogantan/python2-workspace/bin/sqlobject-convertOldURI to 755 Successfully installed sqlobject Cleaning up...
>>> from sqlobject import StringCol, SQLObject, ForeignKey, sqlhub, connectionForURI >>> sqlhub.processConnection = connectionForURI('sqlite:/:memory:') >>> >>> class Person(SQLObject): ... name = StringCol() ... >>> class Address(SQLObject): ... address = StringCol() ... person = ForeignKey('Person') ... >>> Person.createTable()  >>> Address.createTable() 
The code above created two simple tables:
address. To create or insert records into these two tables, we simply instantiate a person and an address like normal Python objects:
>>> p = Person(name='person') >>> a = Address(address='address', person=p) >>> p >>> a <address>
To get or retrieve the new records from the database, we use the magical
q object attached to the
>>> persons = Person.select(Person.q.name == 'person') >>> persons >>> list(persons)  >>> p1 = persons >>> p1 == p True >>> addresses = Address.select(Address.q.person == p1) >>> addresses >>> list(addresses) [<address>] >>> a1 = addresses >>> a1 == a True
Storm is a Python ORM that maps objects between one or more databases and Python. It allows developers to construct complex queries across multiple database tables to support dynamic storage and retrieval of object information. It was developed in Python at Canonical Ltd., the company behind Ubuntu, for use in the Launchpad and Landscape applications and subsequently released in 2007 as free software. The project is released under the LGPL license and contributors are required to assign copyrights to Canonical.
Like SQLAlchemy and SQLObject, Storm also maps tables to classes, rows to instances and columns to attributes. Compared to the other two, Storm’s table classes do not have to be subclasses of a special framework-specific superclass. In SQLAlchemy, every table class is a subclass of
sqlalchemy.ext.declarative.declarative_bas. In SQLObject, every table class is a subclass of
Similar to SQLAlchemy, Storm’s
Store object acts as a surrogate to the backend database, where all the operations are cached in-memory and
committed into the database once the method commit is called on the store. Each store holds its own set of mapped Python database objects, just like a SQLAlchemy session holding different sets of Python objects.
Specific versions of Storm can be downloaded from the download page. In this article, the example code is written in Storm version 0.20.
>>> from storm.locals import Int, Reference, Unicode, create_database, Store >>> >>> >>> db = create_database('sqlite:') >>> store = Store(db) >>> >>> >>> class Person(object): ... __storm_table__ = 'person' ... id = Int(primary=True) ... name = Unicode() ... >>> >>> class Address(object): ... __storm_table__ = 'address' ... id = Int(primary=True) ... address = Unicode() ... person_id = Int() ... person = Reference(person_id, Person.id) ...
The code above created an in-memory sqlite database and a store to reference that database object. A Storm store is similar to a SQLAlchemy DBSession object, both of which manage the life cycles of instance objects attached to them. For example, the following code creates a person and an address, and inserts both records into the database by flushing the store.
>>> store.execute("CREATE TABLE person " ... "(id INTEGER PRIMARY KEY, name VARCHAR)") >>> store.execute("CREATE TABLE address " ... "(id INTEGER PRIMARY KEY, address VARCHAR, person_id INTEGER, " ... " FOREIGN KEY(person_id) REFERENCES person(id))") >>> person = Person() >>> person.name = u'person' >>> print person >>> print "%r, %r" % (person.id, person.name) None, u'person' # Notice that person.id is None since the Person instance is not attached to a valid database store yet. >>> store.add(person) >>> >>> print "%r, %r" % (person.id, person.name) None, u'person' # Since the store hasn't flushed the Person instance into the sqlite database yet, person.id is still None. >>> store.flush() >>> print "%r, %r" % (person.id, person.name) 1, u'person' # Now the store has flushed the Person instance, we got an id value for person. >>> address = Address() >>> address.person = person >>> address.address = 'address' >>> print "%r, %r, %r" % (address.id, address.person, address.address) None, , 'address' >>> address.person == person True >>> store.add(address) >>> >>> store.flush() >>> print "%r, %r, %r" % (address.id, address.person, address.address) 1, , 'address'
To get or retrieve the inserted Person and Address objects, we call
store.find() to find them:
>>> print "%r, %r" % (person.id, person.name) 1, u'person' >>> store.find(Address, Address.person == person).one() >>> address = store.find(Address, Address.person == person).one() >>> print "%r, %r" % (address.id, address.address) 1, u'address'
Django is a free and open source web application framework whose ORM is built tightly into the system. After its initial release, Django becomes more and more popular due to its straightforward design and easy-to-use web-ready features. It was released under the BSD license in July 2005. Since Django’s ORM is built tightly into the web framework, it’s not recommended, although possible, to use its ORM in a standalone non-Django Python application.
Django, one of the most popular Python web frameworks, has its own dedicated ORM. Compared to SQLAlchemy, Django’s ORM is more geared towards direct SQL object manipulation where it exposes a plain and direct mapping between database tables and Python classes.
$ django-admin.py startproject demo $ cd demo $ python manage.py syncdb Creating tables ... Creating table django_admin_log Creating table auth_permission Creating table auth_group_permissions Creating table auth_group Creating table auth_user_groups Creating table auth_user_user_permissions Creating table auth_user Creating table django_content_type Creating table django_session You just installed Django's auth system, which means you don't have any superusers defined. Would you like to create one now? (yes/no): no Installing custom SQL ... Installing indexes ... Installed 0 object(s) from 0 fixture(s) $ python manage.py shell
Since we cannot execute Django’s code without creating a project first, we created a Django project ‘demo’ in the previous shell and entered the Django shell to test our ORM example.
# demo/models.py >>> from django.db import models >>> >>> >>> class Person(models.Model): ... name = models.TextField() ... ... class Meta: ... app_label = 'demo' ... >>> >>> class Address(models.Model): ... address = models.TextField() ... person = models.ForeignKey(Person) ... ... class Meta: ... app_label = 'demo'
The code above declared two Python classes,
Address, each of which maps to a database table. Before execute any database manipulation code, we need to create the tables in a local sqlite database.
python manage.py syncdb Creating tables ... Creating table demo_person Creating table demo_address Installing custom SQL ... Installing indexes ... Installed 0 object(s) from 0 fixture(s)
To insert a person and an address into the database, we instantiate the corresponding objects and call the
save() methods of those objects.
>>> from demo.models import Person, Address >>> p = Person(name='person') >>> p.save() >>> print "%r, %r" % (p.id, p.name) 1, 'person' >>> a = Address(person=p, address='address') >>> a.save() >>> print "%r, %r" % (a.id, a.address) 1, 'address'
To get or retrieve the person and address objects, we use the magical
objects attribute of the model classes to fetch the objects from the database.
>>> persons = Person.objects.filter(name='person') >>> persons  >>> p = persons >>> print "%r, %r" % (p.id, p.name) 1, u'person' >>> addresses = Address.objects.filter(person=p) >>> addresses [<address>] >>> a = addresses >>> print "%r, %r" % (a.id, a.address) 1, u'address'
peewee is a small, expressive ORM. Compared to other ORMs,
peewee focuses on the principal of minimalism where the API is simple and the library is easy to use and understand.
pip install peewee Downloading/unpacking peewee Downloading peewee-2.1.7.tar.gz (1.1MB): 1.1MB downloaded Running setup.py egg_info for package peewee Installing collected packages: peewee Running setup.py install for peewee changing mode of build/scripts-2.7/pwiz.py from 644 to 755 changing mode of /Users/xiaonuogantan/python2-workspace/bin/pwiz.py to 755 Successfully installed peewee Cleaning up...
To create a database model mapping, we implement a
Person class and an
Address class that map to the corresponding database tables.
>>> from peewee import SqliteDatabase, CharField, ForeignKeyField, Model >>> >>> db = SqliteDatabase(':memory:') >>> >>> class Person(Model): ... name = CharField() ... ... class Meta: ... database = db ... >>> >>> class Address(Model): ... address = CharField() ... person = ForeignKeyField(Person) ... ... class Meta: ... database = db ... >>> Person.create_table() >>> Address.create_table()
To insert objects into the database, we instantiate the objects and call their
save() methods. From object creation point of view,
peewee is very similar to Django.
>>> p = Person(name='person') >>> p.save() >>> a = Address(address='address', person=p) >>> a.save()
To get or retrieve the objects from the database, we
select the objects from their respective classes.
>>> person = Person.select().where(Person.name == 'person').get() >>> person >>> >>> print '%r, %r' % (person.id, person.name) 1, u'person' >>> address = Address.select().where(Address.person == person).get() >>> print '%r, %r' % (address.id, address.address) 1, u'address'
PonyORM PonyORM allows you to query the database using Python generators. These generators are translated into SQL and the results are automatically mapped into Python objects. Writing queries as Python generators makes it easy for programmers to quickly construct certain queries. For example, let's use PonyORM to query the previous Person and Address models in a SQLite database.
>>> from pony.orm import Database, Required, Set >>> >>> db = Database('sqlite', ':memory:') >>> >>> >>> class Person(db.Entity): ... name = Required(unicode) ... addresses = Set("Address") ... >>> >>> class Address(db.Entity): ... address = Required(unicode) ... person = Required(Person) ... >>> db.generate_mapping(create_tables=True)
Now we have a SQLite database in memory and two tables mapped to the
db object, we can insert two objects into the database.
>>> p = Person(name="person") >>> a = Address(address="address", person=p) >>> db.commit()
db.commit() actually commits the new objects
a into the database. Now we can query the database using the generator syntax.
>>> from pony.orm import select >>> select(p for p in Person if p.name == "person")[:] [Person] >>> select(p for p in Person if p.name == "person")[:].name u'person' >>> select(a for a in Address if a.person == p)[:] [Address] >>> select(a for a in Address if a.person == p)[:].address u'address'
SQLAlchemy is an open source SQL toolkit and ORM for the Python programming language released under the MIT license. It was released initially in February 2006 and written by Michael Bayer. It provides “a full suite of well known enterprise-level persistence patterns, designed for efficient and high-performing database access, adapted into a simple and Pythonic domain language”. It has adopted the data mapper pattern (like Hibernate in Java) rather than the active record pattern (like the one in Ruby on Rails).
SQLAlchemy’s unit-of-work principal makes it essential to confine all the database manipulation code to a specific database session that controls the life cycles of every object in that session. Similar to other ORMs, we start by defining subclasses of
declarative_base() in order to map tables to Python classes.
>>> from sqlalchemy import Column, String, Integer, ForeignKey >>> from sqlalchemy.orm import relationship >>> from sqlalchemy.ext.declarative import declarative_base >>> >>> Base = declarative_base() >>> >>> >>> class Person(Base): ... __tablename__ = 'person' ... id = Column(Integer, primary_key=True) ... name = Column(String) ... >>> >>> class Address(Base): ... __tablename__ = 'address' ... id = Column(Integer, primary_key=True) ... address = Column(String) ... person_id = Column(Integer, ForeignKey(Person.id)) ... person = relationship(Person) ...
Before we write any database code, we need to create an database engine for our db session.
>>> from sqlalchemy import create_engine >>> engine = create_engine('sqlite:///')
Once we have created a database engine, we can proceed to create a database session and create tables for all the database classes previously defined as
>>> from sqlalchemy.orm import sessionmaker >>> session = sessionmaker() >>> session.configure(bind=engine) >>> Base.metadata.create_all(engine)
session object becomes our unit-of-work constructor and all the subsequent database manipulation code and objects will be attached to a db session constructed by calling its
>>> s = session() >>> p = Person(name='person') >>> s.add(p) >>> a = Address(address='address', person=p) >>> s.add(a)
To get or retrieve the database objects, we call
filter() methods from the db session object.
>>> p = s.query(Person).filter(Person.name == 'person').one() >>> p >>> print "%r, %r" % (p.id, p.name) 1, 'person' >>> a = s.query(Address).filter(Address.person == p).one() >>> print "%r, %r" % (a.id, a.address) 1, 'address'
Notice that so far we haven’t committed any changes to the database yet so that the new person and address objects are not actually stored in the database yet. Calling
s.commit() will actually commit the changes, i.e., inserting a new person and a new address, into the database.
>>> s.commit() >>> s.close()
Comparison Between Python ORMs
For each Python ORM presented in this article, we are going to list their pros and cons here:
- Adopted the easy-to-understand ActiveRecord pattern
- A relatively small codebase
- Naming of methods and classes follow Java’s camelCase style
- Does not support database sessions to isolate unit of work
- A clean and lightweight API leading to short learning curve and long-term maintainability
- Does not need special class constructors, nor imperative base classes
- Forcing the programmer to write manual table-creation DDL statements instead of automatically deriving it from the model class
- Contributors of Storm have to give their contributions’ copyrights to Canonical Ltd.
- Easy-to-use with a short learning curve
- Tightly integrated with Django to make it the de-factor standard when dealing with databases in Django
- Does not handle complex queries very well; forcing the developer to go back to raw SQL
- Tightly integrated with Django; making it hard to use outside of a Django context
- A Django-ish API; making it easy-to-use
- A lightweight implementation; making it easy to integrate with any web framework
- Does not support automatic schema migrations
- Many-to-Many queries are not intuitive to write
- Enterprise-level APIs; making the code robust and adaptable
- Flexible design; making it painless to write complex queries
- The Unit-of-work concept is not common
- A heavyweight API; leading to a long learning curve
- A very convenient syntax for writing queries
- Automatic query optimization
- Simplified setup and usage
- Not designed to process hundreds of thousands or millions of records simultaneously
Summary and Tips
Compared to other ORMs, SQLAlchemy stands out in its focus on the unit-of-work concept which is prevalent whenever you write SQLAlchemy code. The DBSession concept might be hard to understand and use correctly initially, but later you will appreciate the additional complexity which reduces accidental database commit-timing-related bugs to almost zero. Dealing with multiple databases in SQLAlchemy can be tricky since each DB session is confined to one database connection. However, this kind of limitation is actually a good thing since it forces you to think hard about the interaction between multiple databases and make it easier to debug database interaction code.
In the future articles, we are going to fully explore more advanced use cases of SQLAlchemy to truly grasp its immensely powerful APIs.