This is really just the ‘story’ of how I fixed a very slight performance issue with the Recycling Group Finder site that I run, but I figured it would be worth a post as an example or motivation to anyone else who needs to get started investigating their own Ruby on Rails app performance issues.
The performance problem
I’ve been very happy with the responsiveness of the Recycling Group Finder, so just out of interest, just to see what it would tell me, I installed the NewRelic RPM plugin and activated the free Bronze account available to EngineYard customers. The results were pretty satisfying as my average response time for the most popular page was 163ms maximum with the second most popular page at 90ms. Those are good response times and fall well within the 37signals response time rule:
Our general rule of thumb is that most pages should render their HTML on the server in less than 200ms and almost all in less than 500ms.
One of the great things about data visualisations is it can make it really easy to spot patterns. Take this New Relic graph for example:
The yellow on the graph represents time spent in the database, the blue is time spent in Ruby, ie. rendering, controllers etc. Memcached accesses are on there too but they’re so fast they hardly appear. This graph looked suspicious to me, I’d normally expect database time to be a much smaller proportion of the overall request time. So it looks like there may be some optimisation that can be done, but in order to optimise I first need to know what to optimise.
- Don’t optimise yet
- If you need to optimise, profile first.
I’m never going to be able to optimise my code unless I know what to optimise. If I trawl through looking for places that might be slow and trying to make them faster the chances are I’m going to spend hours changing code for no benefit. I might even make it slower. I need to know exactly where the bottleneck is, I need to profile my code.
There are a bunch of ways of finding out where your code is slow and I’ve personally used ruby-prof before with good results. However I know that the issue here is in the database, and I know that Rack::Bug will show me SQL queries that have run for an action, and importantly how long they took, so that’s what I’m going to try first. I install the plugin, configure it and load it up. The issue is immediately obvious:
Almost all of the SQL that is executed is under 0.5ms per query, there are a few queries at ~4ms but he one query that really stands out is the third one down. At 44.75ms it is more than half of the overall SQL time. Bingo! Now I know what is slow I need to know why it is slow. Time to break out the query analyser.
I needed to dig deeper into that SQL statement to see what it was doing, so I opened up a postgres shell and ran an
explain analyse on the query:
The issue seems pretty clear. There is a Sequential scan on groups:
Seq Scan on groups (cost=0.00..626.75 rows=4885 width=363) (actual time=0.038..26.495 rows=5126 loops=1)
A Sequential scan on a large table is going to sink performance. I can see that the sequential scan is definitely the issue in this case as the cost and time taken are significant proportions of the overall query time. I need to eliminate it. Here’s the code that generates that query:
@groups = Group.find(:all, :include => :group_page, :origin => [@location.lat, @location.lng], :limit => 30, :conditions => ["defunct = false AND lat is not null and lng is not null and full_address is not null and full_address != '' and country_code = ?", @location.country_code], :order => 'distance ASC, num_members DESC')
I wrote this code ages ago and re-reading it now I can see that although I am limiting the returned results to 30 rows the query will have to hit every row in the table to determine which rows are in the returned 30 as there are no conditions to the query. Whoops. Looking over the Geokit docs I see there’s a
:within condition so I added a
:within => 100 to the find. Testing the resultant query in the postgres shell using
explain analyse again and the query has dropped to 10ms. Not bad but it’s still using a sequential scan. Adding an index on the conditions speeds up the query further to ~1.2ms:
Not bad when it started out at nearly 45ms. Here is the result reflected in the New Relic graph:
I deployed the new code approximately in the middle of the graph, it should be pretty obvious where.
Before you can optimise your Ruby on Rails app (or your app in any other framework/language for that matter) you need to know know where to optimise. Tools like Rack::Bug and NewRelic allow you to do this effectively and easily allowing you to direct your attention only on those parts of your app that need the attention.
On the Recycling Group Finder I cut response times drastically in about half an hour. Without knowing exactly where to make the change I would have been left guessing and may never have made the optimisation I did.