The In-depth Guide to ActiveRecord load_async in Rails 7

The new ActiveRecord load_async method is represented by trains. Photo by Avi Waxman on Unsplash

Rails 7 introduces the ActiveRecord load_async API that runs SQL queries asynchronously in background threads. This seemingly simple change of adding a single new method that takes no arguments has profound implications for database layer interactions. In this tutorial, we’ll deep dive into the intricacies of this new API. We’ll discuss lazy-loaded queries, the Ruby threading model, blocking IO, the database pool vs. max connections limit, and the performance impact of concurrent database clients. I’ll also suggest scenarios where introducing async SQL queries could have the most benefit without sacrificing the stability of your Rails app.

We have a lot of ground to cover, so let’s get started!

ActiveRecord load_async 101

It is common for a Rails controller action to trigger multiple queries for different models:

class PagesController < ApplicationController
  def home
    @users = User.slow
    @comments = Comment.slow
    @posts = Post.slow
  end
end

and later display data in the view like this (assuming the users table has a name column):

  <%- @users.each do |user| -%>
      <li><%= user.name %></li>
  <%- end -%>

We’re using a slow scope method with the following implementation:

scope :slow, -> {
  where("SELECT true FROM pg_sleep(1)").limit(10)
}

The slow scope returns the rows from a table after an artificial one-second delay. This approach will allow us to deterministically measure the performance impact of adding load_async without populating a database with thousands of records.

Executing this controller action should print server logs similar to:

User Load (1006.0ms)  SELECT "users".* FROM "users" ...
Comment Load (1003.2ms)  SELECT "comments".* FROM "comments" ...
Post Load (1041.2ms)  SELECT "posts".* FROM "posts" ...
Completed 200 OK in 3204ms (Views: 51.2ms ...)

We can see that queries were executed sequentially, resulting in a response time of over 3 seconds. Most of the time was spent in the ActiveRecord layer. Let’s now see how using load_async can speed things up.

Optimizing response time with asynchronous SQL

We can rewrite our controller to leverage the new API:

class PagesController < ApplicationController
  def home
    @users = User.slow.load_async
    @comments = Comment.slow.load_async
    @posts = Post.slow.load_async

Make sure to enable asynchronous execution in your environment config file:


config.active_record.async_query_executor = :global_thread_pool

and now we should get the following logs:

User Load (1008.7ms) ...
ASYNC Comment Load (62.1ms) (db time 1013.8ms) ...
ASYNC Post Load (0.0ms) (db time 1010.5ms) ...
Completed 200 OK in 1085ms (Views: 64.4ms ...)

It looks like we’re down to a one-second response time, a ~3x improvement. Does it mean that Rails does scale? Just update to version 7 and sprinkle load_async all over the codebase!

All the things load_async

Not really… Overusing the new load_async API to magically improve response times is likely to crash your app and slow the database to a crawl.

To understand why, we’ll need to take a step back and first recap Ruby threads and ActiveRecord basics. But you’ve been warned that this is a deep dive…

You can also skip the basics and go straight to the section describing load_async.

Ruby threads in a nutshell

This blog post is by no means trying to be a comprehensive introduction to the Ruby threading model. I’ll focus on the basics relevant to understanding the inner workings of load_async.

Ruby supports parallel execution for so-called blocking I/O. Any operation that does not directly use the CPU cycles of its thread but instead delegates the work to external processes is blocking I/O. Typical examples in the context of Ruby on Rails web apps are SQL database queries, reading/writing files, or HTTP requests. By contrast, you cannot speed up, e.g., generating MD5 hashes using Ruby multithreading because it is a CPU-bound operation. I’ve covered Ruby threading, the GIL, and blocking I/O in much more depth in my other blog post.

Here’s a basic example of doing a blocking IO inside a separate thread:

t1 = Thread.new { sleep 2 }
sleep 2
t1.join

In this case, sleep is our work, our blocking IO. If you copy-paste the above snippet into the Ruby IRB, it will freeze for 2 seconds. It means that we managed to do 4 seconds of sleeping in 2 seconds because our work was parallelized using a separate thread. While sleeping might not seem like the most useful thing to do, you could replace sleep with an SQL or HTTP call and get a similar result.
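You can verify the timing yourself with shorter sleeps and the Benchmark module (a plain-Ruby sketch, no Rails required):

```ruby
require "benchmark"

elapsed = Benchmark.realtime do
  t1 = Thread.new { sleep 0.2 } # blocking IO running in a background thread
  sleep 0.2                     # blocking IO in the main thread
  t1.join                       # thread already finished, so this returns instantly
end

puts format("elapsed: %.2fs", elapsed) # ~0.2s of wall time for 0.4s of sleeping
```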

Let’s see another example:

t1 = Thread.new { sleep 3 }
sleep 2
t1.join

Running it would freeze IRB for 3 seconds. We tried to join our t1 thread into the main thread before it completed its execution. That’s why it took one more second of the main thread execution time.

So, the key takeaways here are that blocking IO can be parallelized using a separate Ruby thread, and if it is joined into the main thread while still running, it will block until execution completes.
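The second takeaway can be measured the same way: joining a thread that is still running blocks the main thread for the remainder of that thread's work (again, a plain-Ruby sketch with short sleeps):

```ruby
require "benchmark"

t1 = Thread.new { sleep 0.3 }              # background work takes 0.3s
sleep 0.1                                  # main thread only worked for 0.1s
join_wait = Benchmark.realtime { t1.join } # join blocks for the remaining ~0.2s

puts format("join blocked for %.2fs", join_wait)
```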

Let’s now move on to the ActiveRecord chunk of our load_async puzzle.

ActiveRecord lazy-loaded queries

To better explain how load_async works, we’ll now discuss ActiveRecord lazy loading behavior. It is critical to understand that running:

@users = User.slow

does not execute an SQL query by itself but only creates an ActiveRecord_Relation object. The actual SQL is only triggered when necessary. This behavior was introduced in Rails 3.

Elrond remembers Rails 3

Lazy loading allows for many cool features like merging scopes and passing ActiveRecord query objects around without running the SQL before the results are needed.

You can try it yourself in the console. First make sure to enable ActiveRecord STDOUT logging:

ActiveRecord::Base.logger = Logger.new(STDOUT)

and now run:

@users = User.slow

You should see a similar output after a one-second delay:

User Load (1006.2ms)

So it looks like the query DID run. But didn’t I just tell you that it wouldn’t? In this case, the Ruby IRB is to blame. By default, it prints the result of the last command, effectively executing the query. You can circumvent this behavior by running:

@users = User.slow; nil

and now the command executes instantly, and you should not see the SQL log anymore. By appending nil as output for IRB to display, we prevented query execution. Only trying to display the contents of @users would trigger the database interaction and take around a second to run.

Alternatively, you can force a query to run eagerly by appending to_a (which returns an array of records) or load (which returns the loaded relation itself):

@users = User.slow.to_a; nil

You can easily check if a relation object has already been loaded by calling its loaded? method:

@lazy_users = User.slow; nil
@lazy_users.loaded? # => false

@loaded_users = User.slow.load; nil
@loaded_users.loaded? # => true

Asynchronous ActiveRecord queries under the microscope

We’ve now covered the basics necessary to understand load_async. Let’s finally move on to explaining the method itself. First, try to trigger an asynchronous query in the Rails console:

@users = User.slow.load_async

After a one-second delay, the result of the query is printed out, and the logs show:

User Load (1001.2ms)

So how come appending load_async did not trigger our query in the background? Before explaining it, let’s first force asynchronous execution by running:

@users = User.slow.load_async; sleep 2; @users

and now finally we get the following output:

ASYNC User Load (0.0ms) (db time 1003.1ms)

We can see the ASYNC prefix and (0.0ms) indicating that the query did not take any of the main thread’s processing time. 1003.1ms was spent in the background (described as db time). Let’s see how these numbers change when we modify the sleep duration:

@users = User.slow.load_async; sleep 0.2; @users
ASYNC User Load (807.8ms) (db time 1003.8ms)

Our query was joined into the main thread after 200ms, so it took an additional 807.8ms to complete execution. Just like the barebone threads we discussed earlier, asynchronous SQL can be joined into the main thread before it completes and, as a result, block it for the remainder of its execution time.

This is a critical distinction. load_async queries are never lazy loaded but work similarly to calling to_a on an ActiveRecord_Relation object in a separate thread. Calling load_async always sets the corresponding relation object’s loaded attribute to true. You can consult the PR implementing load_async for details. It means that load_async queries are always triggered immediately when called and joined into the main thread when their results are needed.

Now you should understand why load_async sometimes does not schedule queries in the background. If the result is needed right away, the background thread will immediately be joined into the main thread and continue with blocking execution.
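The mechanism can be modeled in a few lines of plain Ruby (a deliberately simplified sketch, not the real ActiveRecord implementation; the class name and API are made up):

```ruby
# Toy model of load_async: the "query" starts executing the moment the
# object is created, and reading the result joins the background thread.
class FakeAsyncRelation
  def initialize(&query)
    @thread = Thread.new(&query) # work is triggered immediately, not lazily
  end

  def to_a
    @records ||= @thread.value # blocks only if the query is still running
  end
end

relation = FakeAsyncRelation.new { sleep 0.2; %w[u1 u2] } # "SQL" runs in background
sleep 0.3                                                 # main thread does other work
puts relation.to_a.inspect # query already finished, so this returns instantly
```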

While playing around with load_async API for the first time, I could not always force queries to run asynchronously. Extracting these simple examples helped me grasp what factors decide if the background thread will be triggered or not.

Now we can better understand why we’ve seen the following log output:

User Load (1008.7ms) ...
ASYNC Comment Load (62.1ms) (db time 1013.8ms) ...
ASYNC Post Load (0.0ms) (db time 1010.5ms) ...

The results of the User query were displayed at the top of the HTML view. So, right after being scheduled in the background, the corresponding query was joined into the main thread. While it was executing in a blocking way, the Comment and Post queries had time to complete in the background and did not add any more blocking time to the main thread. That’s where the ~3x speedup came from.

Now that we have a pretty solid understanding of the inner workings of load_async, let’s discuss how misusing it could blow up your application.

When not to load_async

load_async is a kind of horizontal scaling for our SQL pipeline. Instead of sequentially scheduling queries through a single connection, we’re using threads to execute multiple queries at the same time. You can illustrate the difference by digging into the PostgreSQL pg_stat_activity view:

select pid, application_name, backend_start, state
from pg_stat_activity
where state = 'active' and
application_name = 'bin/rails';

This SQL query displays the currently active client connections to your database established by the rails console process. You can compare its outcome for the following Ruby snippets (increasing the slow scope delay to a few seconds will make it easier to run both commands at the same time):

@users = User.slow
@comments = Comment.slow
@posts = Post.slow

puts @users
puts @comments
puts @posts

Running this synchronous version displays a single active connection:
 pid | application_name |         backend_start         | state
 226 | bin/rails        | 2022-02-23 00:10:34.900532+00 | active

and running:

@users = User.slow.load_async
@comments = Comment.slow.load_async
@posts = Post.slow.load_async

puts @users
puts @comments
puts @posts

would display something similar to:

 pid | application_name |         backend_start         | state
 226 | bin/rails        | 2022-02-23 00:10:34.900532+00 | active
 230 | bin/rails        | 2022-02-23 00:12:30.44271+00  | active
 231 | bin/rails        | 2022-02-23 00:12:30.443108+00 | active

This is how we can empirically check that asynchronous queries spawn more separate database connections.

Balancing database pool, threads and max connections for async queries (╯°□°)╯︵ ┻━┻

Things get a bit more complicated when we want to reason about the impact of running asynchronous SQL in the context of multithreaded processes like the Puma server or a Sidekiq worker. Rails provides a global pool config that can be defined inside the config/database.yml file. This value determines how many database connections can be spawned per Ruby process. Each Puma or Sidekiq worker is a single process that can initialize a predefined number of threads. A common suggestion is to set pool to the max concurrency supported by your application processes. For example, if your Sidekiq worker uses a maximum of 25 threads, then the pool value should be at least 25. Otherwise, your process might not be able to connect to the database because all the connections are busy, and the pool limit does not allow spawning more. As a result, you’ll see the following error:

  (could not obtain a connection from the pool within 5.000 seconds (waited 5.003 seconds);
    all pooled connections were in use):
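For reference, the pool value is set per environment in config/database.yml (a sketch with illustrative values):

```yaml
# config/database.yml
production:
  adapter: postgresql
  # at least the max thread count of your busiest process:
  pool: 10
```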

You might be wondering: why can’t you set the pool to some arbitrarily high value so that there’s never a shortage of available connections?

The answer is that the database itself also has a global limit on how many concurrent clients it can handle. For example, the Heroku PostgreSQL addon imposes a hardcoded MAX_CONNECTIONS limit depending on the database plan. AWS RDS gives much more flexibility in configuring the internals of PostgreSQL, and it’s just one of the reasons why I usually recommend migrating a Heroku database to AWS RDS.

I’m describing this setting in more detail because using load_async means that we’re now spawning threads within threads…

Threads all the way down

Each asynchronous SQL query is a separate thread that needs a new connection. There’s a new config option available:

config.active_record.global_executor_concurrency = 4

that lets you define how many concurrent asynchronous SQL queries can be executed per process (the default is 4). The math is getting a bit more confusing now, so let’s try it with a simple example.

Imagine you’re using a single Performance-L Heroku web dyno running the Puma server with recommended settings, i.e., 8 workers (processes) with 5 max threads each. Additionally, you’re using a single Sidekiq worker on a Performance-M dyno with max concurrency configured to 10.

Without load_async, you should set the pool value to 10 so that your Sidekiq worker always has an active connection available. The max number of concurrent connections that your database might receive with this config is quick maths:

5 x 8 + 1 x 10 = 50

If you introduce load_async with the default global_executor_concurrency of 4, you’ll get:

8 x 5 + 8 x 4 + 1 x 10 + 1 x 4 = 86

and pool should be configured to 14 to account for possible additional connections from the Sidekiq process.
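You can sanity-check this arithmetic with a few lines of plain Ruby (the variable names are mine; the numbers mirror the example above):

```ruby
puma_workers    = 8  # processes on the Performance-L web dyno
puma_threads    = 5  # max threads per Puma worker
sidekiq_procs   = 1  # Sidekiq processes
sidekiq_threads = 10 # Sidekiq concurrency
async_per_proc  = 4  # global_executor_concurrency

without_async = puma_workers * puma_threads + sidekiq_procs * sidekiq_threads
with_async    = without_async + (puma_workers + sidekiq_procs) * async_per_proc

puts without_async # => 50
puts with_async    # => 86
```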

So you can see that by just enabling the new API, we almost doubled the potential number of maximum concurrent database clients. I hope that your current Heroku plan is ready for that…

But it also means that each process can schedule at most 4 asynchronous queries simultaneously. So if you try to parallelize three SQL queries inside your bottleneck web endpoint, you’ve already exhausted 3 out of the 4 connections available for load_async. And that’s just one out of 5 Puma threads (the limit of 4 async connections applies per worker, not per thread). So I’m not sure if this is a perfect config. At the same time, increasing global_executor_concurrency should always be done carefully because each additional connection uses more database memory and decreases overall database performance.

Honestly, I don’t know what’s the perfect tradeoff here. But, I hope that this extended description of how these different variables interact with each other will give you a solid background for confidently tweaking them in your production app. Inspecting the number of currently active connections using the metadata SQL query and playing with different threading and pool settings together with Siege load testing can be a great way to better understand how these values impact each other.

Comprehensive monitoring is critical if you want to introduce load_async to a production app with non-trivial traffic. I’d highly recommend adding alerts for the number of active database clients and memory usage. That’s another area where AWS RDS is superior to the Heroku PostgreSQL addon, thanks to seamless CloudWatch integration:


AWS Cloudwatch dashboard. Web servers and PG stats at a glance.

You can read more about moving your Heroku db to AWS RDS in my other blog post.

Now we understand why asynchronous queries should always be introduced with care. Database connections are a precious resource and should only be assigned to hand-picked places where it’s likely to make the best overall impact. ScoutAPM is a tool that I regularly use while conducting my Rails performance audits. My favorite feature is an instant overview of places where optimizing performance makes the most sense. You can easily sort your endpoints by the percentage of resources used:

Abot Polls for Slack endpoints performance analyzed with ScoutAPM

Later you can dive into the performance characteristics of each endpoint and check what part of its processing time is spent in the ActiveRecord layer:


Visual overview of endpoint processing time layers provided by ScoutAPM.

But time spent in the ActiveRecord layer does not necessarily mean that you’re dealing with slow queries. Often an N+1 bug could be causing hundreds of queries to be generated in a single request. You cannot fix N+1 using load_async (check out this blog post for info on how to do it correctly). To validate if you’re dealing with slow queries, you can use the Slow Query Insights ScoutAPM feature and track slow database queries and match them with potential bottleneck endpoints.

Alternatively, you can use calls and outliers methods of Rails PG Extras to detect slow queries that are using a significant ratio of your db processing time. Then with the help of the marginalia gem, you can trace the endpoints that generate those queries. You can find more info about using Rails PG Extras for improving database performance in this blog post.

Once you have these pieces of data, you can decide if there are places that eat up most of your app’s processing time, with that time spent in the database layer. In that case, it’s possible that slow queries could be parallelized, resulting in a significant overall improvement.

Another suitable use case for async SQL is endpoints that, in addition to slow queries, perform 3rd party HTTP requests. Adding load_async to an SQL query statement before an HTTP call would effectively parallelize the two, reducing response time.
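This pattern can be simulated with plain threads (the sleeps stand in for the SQL query and the HTTP call; no Rails or network involved):

```ruby
require "benchmark"

elapsed = Benchmark.realtime do
  users = Thread.new { sleep 0.2; %w[u1 u2] } # stands in for User.slow.load_async
  api_data = (sleep 0.2; "200 OK")            # stands in for a 3rd-party HTTP call
  puts "#{users.value.size} users, API said #{api_data}"
end

puts format("elapsed: %.2fs", elapsed) # ~0.2s instead of ~0.4s sequentially
```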

Continuous observation and iterative approach are critical when applying performance fixes for non-trivial Rails apps. Before adding load_async to any of the queries, you should double-check if speeding it up and paying the price of additional database connections is potentially worth it.


Async eager loading for relations

N+1 queries are the top performance killer for Rails applications. Most of the time, they can be easily avoided by using so-called eager loading. I wrote about it in more detail in my other blog post.

Currently, load_async does not seem to support eager-loaded queries. Running:

@users = User.slow.includes(:comments).load_async; sleep 2; @users

produces the following logs:

ASYNC User Load (0.0ms) (db time 1003.1ms)
Comment Load (20.0ms)

So, the relation seems to be loaded synchronously inside the main thread. Hopefully, this will get improved in a future release. But right now, a related PR appears to be abandoned.

Asynchronous SQL inside a transaction

There’s the following line of code inside the PR adding the new API:

raise AsynchronousQueryInsideTransactionError,
  "Asynchronous queries are not allowed inside transactions"

It’s pretty self-explanatory, but I’ve never managed to trigger it. Instead, if you try to run an async query inside a transaction, it just falls back to synchronous execution:

ActiveRecord::Base.transaction do
  @users = User.slow.load_async; sleep 2; @users
end


User Load (2005.3ms) ...

You have to remember that the infamous ActiveRecord callback methods (excluding after_commit) are always implicitly wrapped in a database transaction. It means that queries inside them will never run asynchronously.

Another non-obvious context where all the calls are wrapped in a transaction is the Rails console sandbox mode. Running:

rails console --sandbox

spawns a new console process in which all database changes are rolled back on exit. So, you have to remember that it’s impossible to test load_async inside the sandbox mode.


Before Rails 7, it was possible to wrap SQL calls in an asynchronous thread and add parallelism to database interactions. But DIY solutions were a highly probable source of sneaky bugs. The new built-in way of scheduling async queries comes with additional thread-safety and data consistency guarantees. I believe that, if used with care, load_async has the potential to significantly improve many typical performance bottleneck scenarios.

I’ve tried to do in-depth research on how the load_async API works and include all the relevant info. But given its newness and complexity, I’m sure new interesting facts will start popping up once it becomes more widely adopted in production environments. I’m planning to keep this guide up to date, so please let me know if you find that some info is missing or inaccurate.
