Index ¦ Archives  ¦ Atom  ¦ RSS

Django and Stored Procedures

There is no code to go along with this post because I haven't had the time or use-case to develop it yet, but it was an idea spawned from the ORM of Django and some maintenance I was doing in a PHP project.

Basically, Django's built-in ORM is awesome, but lacking. There are many reasons to move to SQLAlchemy, but not if you're more interested in speed of development. One feature that Django's ORM is missing is stored procedure support, including the ability to make them quickly.

A stored procedure is basically a function definition on the database side that runs a few to hundreds of lines of SQL and returns the result, decreasing the amount of SQL sent over the network. This is very useful for complicated statements that are called often, say once for every single page, but require only a few variables if any. A use-case in my PHP project was where 99% of the pages ran the same lookup query which consisted of about 10 lines of SQL; instead of sending that SQL over (around half a KB) each time, I replaced the SQL with "CALL generic_lookup(\$pageid)" or in Python "CALL generic_lookup(%s)" % pageid.

Now, to port this to Django we need a two-phrase process. The first one creates the procedures in the database, the second replaces normal ORM calls with stored procedures.

I have not made this, but here are the specifications for the code that would take only a few hours to create and a few more to debug (if only I had the time):

A decorator for models.Manager functions. Called @stored, it will function in both phases differently. For the creation phase, it will simply tag the function it is wrapping with func.is_stored = True, allowing introspecting code to identify it as a stored procedure. It will also make the wrapped function available like so:

@stored
def all(self):
    return self.get_query_set()

all.is_stored == True
str(all.func.__class__) == <type 'function'>

The first phase also involves a management command that goes through every model's managers and looks for stored procedures using the .is_stored property created above. The .is_stored and .func properties could be combined.

This management command, having found all the stored procedures, will then proceed to get the SQL for each QuerySets and wrap it with the DB-specific code for stored-procedures (MySQL's is provided earlier). Once the SQL has been wrapped, it is sent to the DB for creation and the procedure is now available. The SQL will need some filter to allow for arguments to the stored procedures, but that's a whole other can of worms for a possible continuation of this post.

Once the procedures have been created on the DB, the @stored decorator now just ignores the function it's wrapping save for its name and arguments and returns the dataset returned by the database after a call to the stored procedure. Since most databases don't allow further filtering on procedure output, returning the dataset is the only way to easily keep loose-coupling with the DB.

I hope this post was useful to someone else who can code in Python/Django. If you create a decorator using these specifications, please post a link in the comments so that others can benefit from it too. If I write it, I'll definitely provide the app somehow, though I don't see it in my near future.

© Fahrzin Hemmati. Built using Pelican. Theme by Giulio Fidente on github.