Complex preloading strategies in Rails using custom Active Record scopes

TL;DR: There are cases where the eager_load and preload Active Record directives are not enough to cover your preloading requirements. Don't be afraid to write your own class methods to define your own preloading strategies. Even if these methods return an array instead of an Active Record relation, they will still be more efficient than N+1 queries.

Active Record offers two main ways of preventing N+1 queries: eager_load and preload.

The difference between these two is subtle but important:

  • eager_load loads related associations via LEFT OUTER JOIN. This is often used for belongs_to associations.
  • preload loads related associations by collecting foreign IDs, making a bulk request for the records and injecting them into the parent records. This is often used for has_many associations.

Both of these methods rely on defining "standard" associations (belongs_to and has_many).

Now what happens if you have custom associations? A typical example of a custom association is a many-to-many association where the foreign keys are stored on the parent model as an array.

Why would you do that? Well, maybe because you don't want to define unnecessary join tables for secondary associations. Maybe because the association needs to be polymorphic (= no has_and_belongs_to_many) but is not important enough to deserve a join model. Or maybe because your app was initially designed like this and recently migrated to Rails.

No matter the reason, the question is: how do you still properly preload these custom associations?

And more generally the question is: is it possible to define non-ActiveRecord logic on scopes while still maintaining a pseudo Active Record interface?

Let's see what we can do, using our custom association preloading as an example.

Custom scopes to the rescue

Rails allows you to define custom scopes. These scopes can be used to define reusable filtering options but can also be used to define common preloading strategies.

Here is a basic example:

If your abstract API controller is configured to invoke the for_api scope on your models, then that's one quick and efficient way of defining how your models should be preloaded when collections are requested on your API.

Now let's consider our Product model with our custom labels association:
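A minimal sketch of such a model, written in plain Ruby so it runs without a database. Everything here is an assumption made for illustration: LabelStore is a hypothetical in-memory stand-in for the labels table (each call to where_ids plays the role of one SQL query), and label_ids is the assumed name of the array column holding the foreign keys. In a Rails app, Product and Label would be Active Record models.

```ruby
# Plain-Ruby stand-ins for illustration only (not Active Record models).
Label = Struct.new(:id, :name)

# Hypothetical in-memory "labels table" that counts the queries it receives.
class LabelStore
  attr_reader :query_count

  def initialize(labels)
    @labels = labels
    @query_count = 0
  end

  # Rough equivalent of a "WHERE id IN (...)" lookup: one call = one query.
  def where_ids(ids)
    @query_count += 1
    @labels.select { |label| ids.include?(label.id) }
  end
end

class Product
  attr_reader :id, :label_ids

  def initialize(id, label_ids, store)
    @id = id
    @label_ids = label_ids # foreign keys stored on the parent as an array
    @store = store
  end

  # Custom association: resolved lazily, one query per product.
  def labels
    @labels ||= @store.where_ids(label_ids)
  end
end

store = LabelStore.new([Label.new(1, "new"), Label.new(2, "sale"), Label.new(3, "eco")])
products = [
  Product.new(1, [1, 2], store),
  Product.new(2, [2, 3], store),
  Product.new(3, [3], store)
]

products.each { |product| product.labels.map(&:name) }
puts store.query_count # prints 3 (one query per product)
```

Iterating over the collection triggers one lookup per product, which is exactly the N+1 pattern the rest of the article sets out to avoid.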

We cannot use eager_load or preload here due to the custom nature of our association. But we can still manually preload the associations in bulk by manually injecting related records.

This is what it looks like:

The concept is simple: you load custom associations in bulk and inject the relevant associated models manually. Chaining works as long as your custom scope is last.
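The bulk-and-inject idea can be sketched in plain Ruby as follows. This is an illustration under assumptions, not the original implementation: LabelStore is a hypothetical in-memory stand-in for the labels table, and with_labels takes the records and the store as explicit arguments. In a Rails model, a class method like this would typically start from the current scope (for example `all.to_a`) and query the Label model instead.

```ruby
# Plain-Ruby stand-ins for illustration only (not Active Record models).
Label = Struct.new(:id, :name)

# Hypothetical in-memory "labels table" that counts the queries it receives.
class LabelStore
  attr_reader :query_count

  def initialize(labels)
    @labels = labels
    @query_count = 0
  end

  def where_ids(ids)
    @query_count += 1
    @labels.select { |label| ids.include?(label.id) }
  end
end

class Product
  attr_reader :id, :label_ids
  attr_writer :labels

  def initialize(id, label_ids, store)
    @id = id
    @label_ids = label_ids
    @store = store
  end

  # Falls back to a per-record query when nothing was injected.
  def labels
    @labels ||= @store.where_ids(label_ids)
  end

  # Bulk preloader: one query for every label needed by the collection,
  # then manual injection into each parent record.
  def self.with_labels(products, store)
    all_ids = products.flat_map(&:label_ids).uniq
    labels_by_id = store.where_ids(all_ids).to_h { |label| [label.id, label] }
    products.each do |product|
      product.labels = labels_by_id.values_at(*product.label_ids).compact
    end
  end
end

store = LabelStore.new([Label.new(1, "new"), Label.new(2, "sale"), Label.new(3, "eco")])
products = [
  Product.new(1, [1, 2], store),
  Product.new(2, [2, 3], store),
  Product.new(3, [], store)
]

Product.with_labels(products, store).each { |product| product.labels.map(&:name) }
puts store.query_count # prints 1 (a single bulk query)
```

Because the method returns a plain array of records, nothing can be chained after it, which is why the custom scope has to come last.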

That is:

This solution works well when you need to quickly put together a custom scope. But there are two main drawbacks:

  1. Chaining only works if your custom scope is last
  2. Pagination using find_in_batches and find_each doesn't work as intended (N+1 will happen because these methods will reset our modified records)

Fixing pagination: the cheap way

If all you need is pagination, there is a quick way to do it using find_in_batches and yield.

Just edit your scope method to call find_in_batches and - based on the presence of a block - either return results or yield them.
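A rough plain-Ruby sketch of that return-or-yield pattern is shown below. It is an assumption-laden illustration: each_slice stands in for Active Record's find_in_batches, LabelStore is a hypothetical in-memory labels table, and the method signature (records and store passed explicitly) is invented for the example.

```ruby
# Plain-Ruby stand-ins for illustration only (not Active Record models).
Label = Struct.new(:id, :name)

# Hypothetical in-memory "labels table" that counts the queries it receives.
class LabelStore
  attr_reader :query_count

  def initialize(labels)
    @labels = labels
    @query_count = 0
  end

  def where_ids(ids)
    @query_count += 1
    @labels.select { |label| ids.include?(label.id) }
  end
end

class Product
  attr_reader :id, :label_ids
  attr_accessor :labels

  def initialize(id, label_ids)
    @id = id
    @label_ids = label_ids
  end

  # Processes records batch by batch (each_slice stands in for find_in_batches).
  # With a block: yields each preloaded batch and returns nil.
  # Without a block: returns all preloaded records as one array.
  def self.with_labels(products, store, batch_size: 1000)
    loaded = []
    products.each_slice(batch_size) do |batch|
      inject_labels(batch, store)
      block_given? ? yield(batch) : loaded.concat(batch)
    end
    loaded unless block_given?
  end

  # One bulk query per batch, then manual injection into each record.
  def self.inject_labels(batch, store)
    ids = batch.flat_map(&:label_ids).uniq
    labels_by_id = store.where_ids(ids).to_h { |label| [label.id, label] }
    batch.each do |product|
      product.labels = labels_by_id.values_at(*product.label_ids).compact
    end
  end
  private_class_method :inject_labels
end

store = LabelStore.new([Label.new(1, "new"), Label.new(2, "sale")])
products = (1..5).map { |id| Product.new(id, [1, 2]) }

Product.with_labels(products, store, batch_size: 2) do |batch|
  puts batch.map { |product| "#{product.id}: #{product.labels.map(&:name).join(', ')}" }
end
```

The number of label queries now grows with the number of batches rather than with the number of records.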

You can then paginate like this:

This approach is not the most elegant one but is simple enough if you need something off the ground quickly.

An ActiveRecord-like solution

To get a completely neat solution, we need a class that wraps the results and mimics ActiveRecord::Relation. This approach would allow us to chain our scope wherever we want and use pagination the same way we use it with Active Record.

The following class is exactly that. It's a proxy class that delegates filtering methods to Active Record and defers our custom processing logic until the very end, when results must be returned.
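As a rough idea of the shape such a proxy can take, here is a deliberately small plain-Ruby sketch. It is not the article's implementation: FakeRelation is a hypothetical in-memory stand-in for ActiveRecord::Relation, RelationProxy and its CHAINABLE list are invented names, and only where, limit, to_a, each and find_in_batches are covered. A real proxy would forward more query methods (order, joins and so on) to the wrapped relation.

```ruby
# Hypothetical in-memory stand-in for ActiveRecord::Relation (illustration only).
class FakeRelation
  def initialize(rows, filters = {}, max = nil)
    @rows = rows
    @filters = filters
    @max = max
  end

  def where(conditions)
    FakeRelation.new(@rows, @filters.merge(conditions), @max)
  end

  def limit(count)
    FakeRelation.new(@rows, @filters, count)
  end

  def to_a
    matching = @rows.select { |row| @filters.all? { |key, value| row[key] == value } }
    @max ? matching.first(@max) : matching
  end

  def find_in_batches(batch_size: 1000, &block)
    to_a.each_slice(batch_size, &block)
  end
end

# Proxy: forwards filtering calls to the wrapped relation and only runs the
# processing block when records are actually materialized.
class RelationProxy
  include Enumerable

  CHAINABLE = %i[where limit].freeze

  def initialize(relation, &processor)
    @relation = relation
    @processor = processor
  end

  # Each chainable call returns a new proxy around the narrowed relation.
  CHAINABLE.each do |name|
    define_method(name) do |*args|
      self.class.new(@relation.public_send(name, *args), &@processor)
    end
  end

  def to_a
    @processor.call(@relation.to_a)
  end

  def each(&block)
    to_a.each(&block)
  end

  # Applies the processing block to every batch before yielding it.
  def find_in_batches(batch_size: 1000)
    @relation.find_in_batches(batch_size: batch_size) do |batch|
      yield @processor.call(batch)
    end
  end
end

LABEL_NAMES = { 1 => "new", 2 => "sale" }.freeze
rows = [
  { id: 1, category: "toys", label_ids: [1] },
  { id: 2, category: "food", label_ids: [2] },
  { id: 3, category: "toys", label_ids: [1, 2] }
]

# The block plays the role of the bulk preloading logic.
with_labels = RelationProxy.new(FakeRelation.new(rows)) do |records|
  records.map { |record| record.merge(labels: LABEL_NAMES.values_at(*record[:label_ids])) }
end

# Filters can be chained after the custom scope; processing runs on to_a.
p with_labels.where(category: "toys").limit(1).to_a
```

Since the processing block runs only in to_a, each and find_in_batches, filters chained after the custom scope still narrow the underlying query before any preloading happens.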

Using this proxy class you can rewrite your Product scope in the following way:

Then use your scope in almost the same way as you would with an Active Record relation:

Easy! We can now define custom preloading strategies which rely on non-Rails patterns while still loading data in bulk and benefiting from ActiveRecord-like syntax.

Important note: The proxy above is not a full implementation of the Active Record relation interface. Query directives such as group or select are likely to alter the data expected by our processing block, so I omitted them from the proxy implementation. Invoking these methods on the query chain will therefore fail.

I also omitted scoping-specific methods such as unscoped and default_scoped for ease of reading - but these could easily be added to the proxy.

Feel free to expand the proxy implementation to support more use cases.

About us

Keypup is on a mission to help developers and tech leads work better with each other on development projects. Our platform automatically centralizes, prioritizes and assigns people and actions on issues and pull requests to optimize your development flow.

Don't get lost because you have to juggle with twenty pull requests across five development projects. We'll clean and organize that for you to ensure a smooth landing.
