Railers Need To Stop Not Caring About The Database

For many of us, databases are no fun… surely not as fun as ActiveRecord or Rails in general. Every railers I know love the “dot notation” offered by ActiveRecord associations immensely… while refering to the vast majority of raw SQL queries as “Some ugly non-sense gibberish that we would all prefer not know the existence”.

Our hate towards the SQL language and databases is not a valid excuse to not assume our responsabilities as application developers. Recently I was working on a Rails plugin when the sentiment of being incompetent struck my body exactly like the lightning strikes an adventurous golfer who happily plays golf during a thunderstorm.

It all happened when I had a query that was generating a lot of results… and was slow as hell to process… even though I was using the :include option for Eager Loading.

@stuff = ParentStuff.find(:all, :include => :child_stuff, :order => “created_at DESC”)

Then, in my view, I had something like :

It was awfully slow! I started looking at the logs and saw this typical monstruous sql query. Look at it… you have to look at it. I know you don’t want to… but it won’t go away :

SELECT parent_stuff.id AS t0_r0, parent_stuff.field1 AS t0_r1, parent_stuff.field2 AS t0_r2, parent_stuff.field3 AS t0_r3, parent_stuff.field4 AS t0_r4, child_stuff.id AS t1_r0, child_stuff.field5 AS t1_r1, child_stuff.field6 AS t1_r2, child_stuff.field7 AS t1_r3, child_stuff.field8 AS t1_r4, child_stuff.field9 AS t1_r5, child_stuff.field10 AS t1_r6, child_stuff.field11 AS t1_r7 FROM parent_stuff LEFT OUTER JOIN child_stuff ON child_stuff.the_foreign_key = parent_stuff.id WHERE (parent_stuff.created_at >= ’2008-04-24 00:00:00′ AND parent_stuff.created_at <= '2008-04-24 09:44:56') ORDER BY parent_stuff.created_at

What's the problem with this? It's just one query... cannot take that much time. I thought it was a "Rails problem" until I ran the query directly in a MySql web interface. It took the same amount of time, meaning that the problem was within the query itself. Oh sh**... it was a database problem! Like many Rails Developers, my database skills are, while not inexistant, inadequate.

I tried to figure out what could be the problem with this query. I replaced every LEFT OUTER JOIN with INNER JOIN and Boom! the query executed in less than half a second.

My urgent desire of leaving the MySql web interface made me go back to my Rails code. I then tried some stupid random stuff like :

@stuff = ParentStuff.find(:all, :include => :child_stuff, :joins => “INNER JOIN child_stuff ON child_stuff.parent_stuff_id=id” :order => “created_at DESC”) but the slow LEFT OUTER JOINs remain in the query instead of being replace by the faster INNER JOIN. Then, I learned that in Rails 2.0, you could do Eager Loading with INNER JOIN by passing an association name to the :joins option, like that :

@stuff = ParentStuff.find(:all, :joins => :child_stuff, :order => “created_at DESC”)

Problem is I was working on a plugin… and I’d like it to be compatible with older versions of Rails. So I did what I was scared of doing since the beginning : Going to some MySql forums to read the advice of some DBA’s …

Then I found the answer to my problem… and this answer proved me that I needed to stop playing blindly with ActiveRecord like if the backend database was none of my business…. because it is. The answer is so obvious… but given the fact that I never wanted to care about databases since Rails exists, I didn’t think about it :

Answer : Add an index to the foreign key in the child table!

This is just one case that shows the dangers of not being aware of what’s going on in the database. It’s sad, but you cannot blindly let ActiveRecord manage everything that is Database related for you. It’s only a question of being in control of the entire application… and the relational database is a part of it.

UPDATE
This post has kind of turned against me. The only thing I wanted to say with this article is that ActiveRecord needs our supervision… not that I am a moron who can’t understand databases. I know what joins are, I know what indexes are… but in most situations I don’t have to “care” about them with ActiveRecord. Sorry but I felt the need to defend myself about this issue :)

Rubyize this : 6th edition

Oh, my… God.

What is this? First post in a century? What’s going on here Fleebie? Our beloved readers are going to stop respecting us, and for good reasons.

For those who care, I am not dead, I am not sick… I am just working like crazy on Azanka, the company I started with Dan recently.

Ok now it’s time for a brand new edition of Rubyize this. Let’s get the ball rolling. Hold on… I’m going to google images to get a picture of a ball rolling.

364421257_8cad73e64f.jpg

Hmm… I’m afraid this isn’t going to roll at all

Anyway, consider this :

I am an old and unhappy programmer from the late eighties and you secretly watch me coding the following lines :

Bring the light to this man who retires in 2 days

Refactor this on RMC!