Avoid N + 1 issues in Ruby on Rails when checking existance

in code


How perfect and clear, hindsight! This was another “oh, duh” moment for me. Stupid means humble means another step closer to enlightenment, so that’s good.

In our application a user may favourite content in a manner à la Facebook. When rendering views, the naive way (the Mark way before this post) is to test for existence using exists?:

Returns true if a record exists in the table that matches the id or conditions given, or false otherwise.

Emphasis mine. @sparkle.like.exists? id: @user.id will hit the database each and every time I call it. In the case of our seeded data, for each given view there are:

  • 1 parent Group.
  • 10 Orgies with 5 Comments each.
  • 10 Rainbows with 5 Comments each.

There will be an extra

(10 * 5) + (10 * 5) + 10 + 10 + 1 = 121

in each view, and that in seeded data with a minimum of records. The extra hits on the database added whole seconds to rendering! This was not optimal.

Solution

There aren’t many good Stack Overflow posts which detail this problem, hence this post. My own solution is to include the association with the initial query, and from there query whether the collection includes the given association:

@sparkle = Orgy.include(:likes).where(id: 1).first
@sparkle.likes.map(&:id).include? @user.id

While this is a Good Enough solution for small-medium applications, it still loads a crapload of data for each page. What if @sparkle has a thousand likes? Five thousand?

Ware Ye

  1. This fucks up all existing tests for existence.
  2. This fucks up the tests because include? only looks in the array in memory. If you didn’t load the association, then you don’t get a true answer.

Listos Solution

Check inclusion instead of existence.


Update, June 2017

Listos: Didn’t work; reverted.

This didn’t work too well for us, and we went back to checking existence So, yes, we won on N + 1 queries, but lost when we eager loaded a fucklot of extra records. A like reduces to one of two questions:

  1. As a user, does my like exist within the post’s likes?
  2. As a post, does this user’s like exist within my likes?

In our application, per page, we load 5 of each kind of primary record (Orgies and Ponies), along with their Comments. A User can

  1. Like a Pony.
  2. Like an Orgy.
  3. Like a Comment.
  4. Attend an Orgy (same thing as a Like).

If I check from the User’s side (and the User has 300 Likes): that’s 1 + 301 records loaded every, single, time.

  1:User + 301:Likes =
                 301

If I check from the side of each Pony, Orgy and Comment (and let’s say each primary record has 10 likes and 5 comments each with 5 likes:

  (5:Ponies * 5:Comments) + (5:Orgies * 5:Comments) + (5:Ponies * 10:Likes) + (5:Orgies * 10:Likes) + (25:Comments * 5:Likes) =
                                                                           (5 * 5) + (5 * 5) + (5 * 10) + (5 * 10) + (25 * 5) =
                                                                                                                          275

Performance-wise, checking from the User’s side is bad and the Pony’s side worse. That’s 2-300 extra records on every page load. Lesson learned. After two weeks of watching load times creep up, I decided to cache the likes instead:

module Likeable
  extend ActiveSupport::Concern

  included do
      has_and_belongs_to_many :likes, after_add: :increment_likes, after_remove: :decrement_likes

      def liked_by?(user)
        Rails.cache.fetch("#{cache_key}/likes") { likes.map(&:user_id) }.include? user.like.id
      end

      private

      def increment_likes(*)
        cache_likes
        increment! :likes_count
      end

      def decrement_likes(*)
        cache_likes
        decrement! :likes_count
      end

      def cache_likes
        Rails.cache.write "#{cache_key}/likes", likes.map(&:user_id)
      end
    end
  end
end 


Replace Odd Numbers of Spaces Only

in code


Your email address will not be published. Required fields are marked *