SQLAlchemy ORM: Setting Up Self-Referential Many-to-Many Relationships

Joel Ramos
ramosly | blog
Published in
4 min readFeb 10, 2017

--

If you’re new to SQLAlchemy, or are trying to get started, this may not be the post for you as I assume some familiarity with Python, SQL, and SQLAlchemy. Perhaps the tutorials section on the package’s website is a better place to start. SQLAlchemy has proven to be pretty handy once you figure out how to do what you’re trying to do. However, getting there can be a bit challenging, at least it was for me.

In this post, I will go over one challenge that I recently ran into, and that is how to set up models for database tables with many-to-many relationships to themselves.

I will assume that you are familiar with relational database concepts and somewhat familiar with Python and SQLAlchemy’s ORM. So, let’s get to it! I’ll start with what the tables look like.

The Database Tables

Table 1: Person table

PersonId    PersonType    Name
1 Parent Bob
2 Parent Mary
3 Child Tommy
4 Child Sally

Table 2: ParentChild (intersect) table

ParentChildId    ParentId    ChildId
1 1 3
2 2 3
3 1 4
4 2 4

These tables aren’t exactly the ones I was working with, but they should better illustrate the idea that the Person table has a many-to-many relationship to itself. I.e., one parent can have many children, and one child can have many parents.

Joining this table to itself to get a child’s parents, or a parent’s children, would look like the following:

At first, it wasn’t clear to me whether SQLAlchemy could handle this scenario, but as it turns out, it can! So, why don’t we take a look at the code first, then I’ll explain.

Building the SQLAlchemy Models

Imports

If we start at the top, the first 4 lines are just the necessary imports from SQLAlchemy. On the 3rd line, I am just importing the MS SQL Server datatypes provided by SQLAlchemy. You should be able to use those from whatever dialect you are using (e.g., MySQL, Postgres, etc.)

Base object

Next, in a similar fashion to the documentation, we create a Baseobject via the declarative_base()function. This Baseclass will be the class that our database-table model classes inherit from. For more info, see the Declare a Mapping section of the documentation.

Intersect table

After that, we create a variable for the intersect table called parent_child. Interestingly, SQLAlchemy seems to require us to represent intersect tables this way, where we have a variable that we set equal to a Table object which is initialized with the name of the table, the Base.metadata, and then the Column objects. This as opposed to being created like the other tables, which is to say, as classes that subclass the Base object.

Notice that the two Columns after the primary key column also specify a ForeignKey param, each pointing back to the primary key of the Person table.

Note: Some people might find it confusing that the value you pass to the ForeignKey class must be the table name and column name as they appear in the database, not how you specified them in your Python models. So here, that value is ForeignKey(‘Person.PersonId’), which is the actual table and column name in the database.

Table Model class

Lastly, the Person class is the model of the database table that we are trying to represent. In this class, we add the required __tablename__ attribute, add Column objects for each column making sure to specify the person_id as the primary key, and finally the relationship.

For the relationship, we need to add an attribute to the table model class Person to represent the list of foreign objects, which in this case are more Person objects. So, let’s call it parents.

Next, we set parents equal to SQLAlchemy’s relationship function, which we imported at the top.

The relationship function takes a variable number of arguments depending on the type of relationship you’re setting up. The first argument to pass is the string name of the foreign table model’s class name. Since we are creating a relationship from Person back to itself, this is just ‘Person’.

The next argument is a keyword argument called secondary that must be set equal to the intersect-table variable, parent_child.

In my experience, this is all that would be required for your average many-to-many relationship, but for our case, we also need to specify the primaryjoin and secondaryjoin keyword arguments so that SQLAlchemy knows what to do with the two ForeignKey columns in our intersect-table variable. And this makes sense if you were to recreate the join order in SQL, which I suggest doing so that you have a clear picture of the relationships you’re trying to represent.

Note: for anyone reading this that might be new to SQLAlchemy and has made it this far, everything described above only pertains to setting up the table models, and does not touch on how to execute queries against them using the ORM syntax. For that, I would check out the Object Relational Tutorial.

Conclusion

Usually, I find myself linking one table model to another via a one-to-many relationship, or many-to-one, with the occasional many-to-many relationship via an intersect table. This case was unusual for me, as I needed to link the table to itself as a many-to-many relationship.

However, each time I’ve come across an edge-case, SQLAlchemy has had something to support that scenario. It can take a bit of hunting to find the solution in the documentation, though, so I thought I’d document this one here just in case I need to come back to it, or someone else finds it useful.

--

--

• Economics grad from UC San Diego • QA Engineer • Pythonista • Always learning!