SQLAlchemy Association Tables

Association Tables

In our previous articles, we used an association table to model many-to-many relationships between tables, such as the relationship between Department and Employee. In this article, we are going to dive deeper into the association table concept and see how we can use it to further solve more complicated problems.

DepartmentEmployeeLink and Extra Data

In our previous article, we created the following SQLAlchemy models:

import os
 
from sqlalchemy import Column, DateTime, String, Integer, ForeignKey, func
from sqlalchemy.orm import relationship, backref
from sqlalchemy.ext.declarative import declarative_base
 
 
Base = declarative_base()
 
 
 
class Department(Base):
    __tablename__ = 'department'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    employees = relationship(
        'Employee',
        secondary='department_employee_link'
    )
 
 
class Employee(Base):
    __tablename__ = 'employee'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    hired_on = Column(DateTime, default=func.now())
    departments = relationship(
        Department,
        secondary='department_employee_link'
    )
 
 
class DepartmentEmployeeLink(Base):
    __tablename__ = 'department_employee_link'
    department_id = Column(Integer, ForeignKey('department.id'), primary_key=True)
    employee_id = Column(Integer, ForeignKey('employee.id'), primary_key=True)

Notice that the DepartmentEmployeeLink class contains two foreign key columns which are enough to model the many-to-many relationship between Department and Employee. Now let’s add one more column extra_data and two more relationships department and employee.

import os
 
from sqlalchemy import Column, DateTime, String, Integer, ForeignKey, func
from sqlalchemy.orm import relationship, backref
from sqlalchemy.ext.declarative import declarative_base
 
 
Base = declarative_base()
 
 
 
class Department(Base):
    __tablename__ = 'department'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    employees = relationship(
        'Employee',
        secondary='department_employee_link'
    )
 
 
class Employee(Base):
    __tablename__ = 'employee'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    hired_on = Column(DateTime, default=func.now())
    departments = relationship(
        Department,
        secondary='department_employee_link'
    )
 
 
class DepartmentEmployeeLink(Base):
    __tablename__ = 'department_employee_link'
    department_id = Column(Integer, ForeignKey('department.id'), primary_key=True)
    employee_id = Column(Integer, ForeignKey('employee.id'), primary_key=True)
    extra_data = Column(String(256))
    department = relationship(Department, backref=backref("employee_assoc"))
    employee = relationship(Employee, backref=backref("department_assoc"))

With one more extra column and two more relationships on the DepartmentEmployeeLink association model, we can store more information and be more liberal with how we want to use it. For example, suppose we have an employee John who works part-time in the IT department, we can insert the string ‘part-time’ into the column extra_data and create an DepartmentEmployeeLink object to represent this relationship.

>>> fp = 'orm_in_detail.sqlite'
>>> if os.path.exists(fp):
...     os.remove(fp)
...
>>> from sqlalchemy import create_engine
>>> engine = create_engine('sqlite:///association_tables.sqlite')
>>>
>>> from sqlalchemy.orm import sessionmaker
>>> session = sessionmaker()
>>> session.configure(bind=engine)
>>> Base.metadata.create_all(engine)
>>>
>>>
>>> IT = Department(name="IT")
>>> John = Employee(name="John")
>>> John_working_part_time_at_IT = DepartmentEmployeeLink(department=IT, employee=John, extra_data='part-time')
>>> s = session()
>>> s.add(John_working_part_time_at_IT)
>>> s.commit()

Then, we can find John by querying the IT department or the DepartmentEmployeeLink model.

>>> IT.employees[0].name
u'John'
>>> de_link = s.query(DepartmentEmployeeLink).join(Department).filter(Department.name == 'IT').one()
>>> de_link.employee.name
u'John'
>>> de_link = s.query(DepartmentEmployeeLink).filter(DepartmentEmployeeLink.extra_data == 'part-time').one()
>>> de_link.employee.name
u'John'

Finally, adding an IT employee using the relationship Department.employees still works, as shown in the previous article:

>>> Bill = Employee(name="Bill")
>>> IT.employees.append(Bill)
>>> s.add(Bill)
>>> s.commit()

Linking Relationships with Backref

One common keyword argument we have used so far in relationship definitions is backref. A backref is a common shortcut to place a second relationship() onto the destination table. For example, the following code puts a second relationship() “posts” onto the user table by specifying a backref on Post.owner:

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(256))
 
 
class Post(Base):
    __tablename__ = 'post'
    id = Column(Integer, primary_key=True)
    owner_id = Column(Integer, ForeignKey('user.id'))
    owner = relationship(User, backref=backref('posts', uselist=True))

This is equivalent to the following definition:

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(256))
    posts = relationship("Post", back_populates="owner")
 
 
class Post(Base):
    __tablename__ = 'post'
    id = Column(Integer, primary_key=True)
    owner_id = Column(Integer, ForeignKey('user.id'))
    owner = relationship(User, back_populates="posts")

Now we have a one-to-many relationship between User and Post. We can interact with these two models in the following way:

>>> s = session()
>>> john = User(name="John")
>>> post1 = Post(owner=john)
>>> post2 = Post(owner=john)
>>> s.add(post1)
>>> s.add(post2)
>>> s.commit()
>>> s.refresh(john)
>>> john.posts
[, ]
>>> john.posts[0].owner
 
>>> john.posts[0].owner.name
u'John'

One-to-One

Creating a one-to-one relationship between models is very similar to creating many-to-one relationships. By modifying the uselist argument’s value to False in a backref(), we force the database models to be mapped to each other in a one-to-one relationship.

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(256))
 
 
class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    address = Column(String(256))
    user_id = Column(Integer, ForeignKey('user.id'))
    user = relationship('User', backref=backref('address', uselist=False))

Then, we can use the models in the following way:

>>> s = session()
>>> john = User(name="John")
>>> home_of_john = Address(address="1234 Park Ave", user=john)
>>> s.add(home_of_john)
>>> s.commit()
>>> s.refresh(john)
>>> john.address.address
u'1234 Park Ave'
>>> john.address.user.name
u'John'
>>> s.close()

Relationship Update Cascades

In a relational database, referential integrity guarantees that when the primary key of a referenced object in a one-to-many or many-to-many relationship changes, the refering objects’ foreign keys that reference the primary key will change as well. However, for databases that do not support referential integrity, such as SQLite or MySQL with their referential integrity option turned off, changing the primary key values of a referenced object does not trigger updates of the refering objects. In this case, we can use the passive_updates flag in relationship or backref to inform the database to execute extra SELECT and UPDATE statements that will update the values of the refering objects’ foreign keys.

In the following example, we construct a one-to-many relationship between User and Address and not specifying the passive_updates flag in the relationship. The database backend is SQLite.

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(256))
 
 
class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    address = Column(String(256))
    user_id = Column(Integer, ForeignKey('user.id'))
    user = relationship(
        'User', backref=backref('addresses', uselist=True)
    )

Then, when we change the primary key value of a User object, its Address objects’ user_id foreign key values will not change. Therefore, when you want to access an address‘s user object again, you will get an AttributeError.

>>> s = session()
>>> john = User(name='john')
>>> home_of_john = Address(address='home', user=john)
>>> office_of_john = Address(address='office', user=john)
>>> s.add(home_of_john)
>>> s.add(office_of_john)
>>> s.commit()
>>> s.refresh(john)
>>> john.id
1
>>> john.id = john.id + 1
>>> s.commit()
>>> s.refresh(home_of_john)
>>> s.refresh(office_of_john)
>>> home_of_john.user.name
Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'NoneType' object has no attribute 'name'
>>> s.close()

If we specify the passive_updates flag in the Address model, then we can change the primary key of john and expect SQLAlchemy to issue extra SELECT and UPDATE statements to keep home_of_john.user and office_of_john.user up-to-date.

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(256))
 
 
class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    address = Column(String(256))
    user_id = Column(Integer, ForeignKey('user.id'))
    user = relationship(
        'User', backref=backref('addresses', uselist=True, passive_updates=False)
    )
>>> s = session()
>>> john = User(name='john')
>>> home_of_john = Address(address='home', user=john)
>>> office_of_john = Address(address='office', user=john)
>>> s.add(home_of_john)
>>> s.add(office_of_john)
>>> s.commit()
>>> s.refresh(john)
>>> john.id
1
>>> john.id = john.id + 1
>>> s.commit()
>>> s.refresh(home_of_john)
>>> s.refresh(office_of_john)
>>> home_of_john.user.name
u'john'
>>> s.close()

Summary

In this article, we dig a little deeper into SQLAlchemy’s association tables and the backref keyword argument. It’s often crucial to understand the mechanism behind these two concepts to fully master complex join queries, as what will be shown in future articles.

Leave a Reply

Your email address will not be published. Required fields are marked *