
A Hack for Multi-Table Inheritance in Django

November 25, 2009 - 1:44am

If you create a model class in Django and then create a model subclass of it, Django gives the subclass its own table holding just the extra fields, with a 1:1 relationship back to the parent’s table. The result is an instance of the parent class plus an “extended version” of that object in the subclass. That works out rather well for most cases, but there are a few edge cases where it’s not sufficient.

Say I have an object A and a subclass B:


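    from django.db.models import Model, DateTimeField

    class A(Model):
        created = DateTimeField(auto_now_add=True)

    class B(A):  # B subclasses A, which is what creates the a_ptr_id link
        modified = DateTimeField(auto_now=True)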

I’ll wind up with some tables like this:

A

    id  created
    1   14535143
    2   14536222

B

    a_ptr_id  modified
    1         34253245

(B has no id column of its own: the a_ptr_id link to A doubles as its primary key.)

What this means is that if I instantiate PK 1 via A’s manager, I’ll get an A and only ever an A because the information about what kind of object it is exists only in B’s table. So if I have, say, a blog engine like Tumblr where there are text, image, video, audio, and other kinds of posts and I’ve made them all model classes based off RootBlog, I can’t do a lookup by PK and expect the right object; I’ll always get a RootBlog object.
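To make that concrete, here is a minimal sketch that uses Python’s built-in sqlite3 module in place of Django’s ORM. The table and column names mirror the A/B example; the schema is my simplification rather than the DDL Django actually generates, and load_a and is_b are made-up helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, created INTEGER);
    CREATE TABLE b (a_ptr_id INTEGER PRIMARY KEY REFERENCES a(id),
                    modified INTEGER);
    INSERT INTO a (id, created) VALUES (1, 14535143), (2, 14536222);
    INSERT INTO b (a_ptr_id, modified) VALUES (1, 34253245);
""")

def load_a(pk):
    # Roughly what a lookup through A's manager amounts to: only table a is read.
    return conn.execute(
        "SELECT id, created FROM a WHERE id = ?", (pk,)
    ).fetchone()

def is_b(pk):
    # The only way to learn that row pk is "really" a B is to probe table b.
    row = conn.execute("SELECT 1 FROM b WHERE a_ptr_id = ?", (pk,)).fetchone()
    return row is not None

row1 = load_a(1)
row2 = load_a(2)
```

Both rows come back from a with the same shape. Nothing in row1 says that PK 1 also has a row in b; you only find that out by asking b directly.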

Traditionally, if you had this problem, you’d use abstract classes and just have a different object series for each kind of post. That works, but it means you can’t do shortcuts that just pass around an ID and get the right object — you have to pass around the type as well.

However, there is one trick that falls out of the 1:1 relationship and could be of some assistance here. When you load an instance of our example object A, it has a property called ‘b’ that links to its corresponding full instance of B, if one exists. The problem is that we can’t easily follow this link without giving the ancestor direct knowledge of all its descendants’ names. If the blog app tried this, RootBlog would need get_text_post, get_image_post, get_video_post and so on. Not only would just one of them work for any given post, but you’d have to extend the root object every time you added a new kind!

So, facing this problem, I pondered a hackish solution. It took some hours of looking at dir() and _meta output for it to click, but the following code works (in Django 1.1, at least):


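    from django.db import models

    class RootObject(models.Model):
        # Lowercased name of the concrete class this row was first saved as.
        object_class = models.CharField(max_length=100)

        def save(self, *args, **kwargs):
            # Record the concrete class name on the first save only.
            # (_meta.module_name is the Django 1.1 spelling; from Django 1.6
            # onward the same value is _meta.model_name.)
            if not self.object_class:
                self.object_class = self._meta.module_name
            super(RootObject, self).save(*args, **kwargs)

        def get_object(self):
            # Already an instance of the concrete class: nothing to follow.
            if self._meta.module_name == self.object_class:
                return self
            # Otherwise follow the reverse 1:1 accessor named after the child.
            return getattr(self, self.object_class)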

The assumption here is that an object is always first created as its proper class (as a B object, say). So when we save it for the first time, we store the name of that class in a field.

Because this lives at the root level, if we instantiate the root object with a PK that belongs to some child class, we can call get_object to do a lookup based on the stored class name and follow the same-named relationship down to the proper child class instance.
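Stripped of the ORM, the idea looks roughly like the sketch below, again using sqlite3 as a stand-in. The textpost and imagepost tables, their columns, and the CHILD_TABLES whitelist are all hypothetical names of mine; the real get_object relies on Django’s reverse accessor rather than hand-written SQL, and this version skips the case where the root row is itself the concrete object.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rootobject (id INTEGER PRIMARY KEY, object_class TEXT);
    CREATE TABLE textpost  (rootobject_ptr_id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE imagepost (rootobject_ptr_id INTEGER PRIMARY KEY, image_url TEXT);
    INSERT INTO rootobject VALUES (1, 'textpost'), (2, 'imagepost');
    INSERT INTO textpost  VALUES (1, 'hello world');
    INSERT INTO imagepost VALUES (2, '/media/cat.png');
""")

# Hypothetical child tables. Checking against this set avoids building SQL
# from an arbitrary stored string.
CHILD_TABLES = {"textpost", "imagepost"}

def get_object(pk):
    """Return (class_name, child_row) for the root row with this PK."""
    found = conn.execute(
        "SELECT object_class FROM rootobject WHERE id = ?", (pk,)
    ).fetchone()
    if found is None:
        raise LookupError("no root object with id %r" % (pk,))
    object_class = found[0]
    if object_class not in CHILD_TABLES:
        raise LookupError("unknown object_class %r" % (object_class,))
    # Second query: this is the extra lookup cost of resolving the real type.
    child = conn.execute(
        "SELECT * FROM %s WHERE rootobject_ptr_id = ?" % object_class, (pk,)
    ).fetchone()
    return object_class, child
```

The caller passes in nothing but a PK. The stored class name tells the lookup which child table to visit, which is the job object_class does for getattr in the Django version.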

Voila! I can share a PK series across all my post types, so every piece of content is unique within the same sequence. I can also create shortcut URLs to any class that derives from this base class, because I can look the object up, ask it for its absolute URL, and redirect to it (think of /s/a45 as a shortcut to a specific object on the site, no matter the class).
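A shortcut resolver could be sketched as follows. Everything here is an assumption made for illustration: I’m treating the token after /s/ as the PK written in base 36, and the ABSOLUTE_URLS dict stands in for loading the root object, calling get_object on it, and asking the result for its absolute URL.

```python
# Hypothetical stand-in for "load the root object by PK, call get_object(),
# then ask the real object for its absolute URL".
ABSOLUTE_URLS = {13109: "/posts/text/13109/"}

def resolve_shortcut(path):
    """Map a shortcut path such as /s/a45 to (pk, redirect_target)."""
    prefix = "/s/"
    if not path.startswith(prefix):
        raise ValueError("not a shortcut path: %r" % (path,))
    token = path[len(prefix):].strip("/")
    # Assumption: the token is the shared PK encoded in base 36.
    pk = int(token, 36)
    if pk not in ABSOLUTE_URLS:
        raise LookupError("no object with id %d" % pk)
    return pk, ABSOLUTE_URLS[pk]
```

Because the PK series is shared, the resolver never needs to be told what kind of object it is redirecting to.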

Oh, the future tricks you can do when every object in your graph has a completely unique identifier. Just remember, the lookup has a cost, so don’t go crazy with it. But when you need this, you’ll know. :)

