Reducing Leaky Abstractions Introduced by ActiveRecord

Rails’ ActiveRecord provides a comprehensive interface for querying the
database. Unchecked and without proper processes in place, it can become
unwieldy as the domain changes.

Imagine an application domain where a team of people publishes a technical
blog.

class Person < ApplicationRecord
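  # Post points at its author via author_id, so name the key explicitly.
  has_many :posts, foreign_key: :author_id, inverse_of: :author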
end

class Post < ApplicationRecord
  belongs_to :author, class_name: "Person"
end

In addition to the author association and other standard post data
attributes, the Post model contains a boolean flag named published.

A Rails controller showing the newest published posts might look like:

class PostsController < ApplicationController
  def index
    @newest_posts = Post.where(published: true).order(created_at: :desc).limit(10)
  end
end

Let’s go one step further, where we create a page dedicated to the list of
published authors:

class AuthorsController < ApplicationController
  def index
    @published_authors = Person.distinct.joins(:posts).where(posts: { published: true })
  end
end

A new feature comes in where teammates want to enqueue posts to be published in
the future.

This could be modeled by replacing the published boolean with a published_at
timestamp that allows for three states:

  • unpublished (published_at is set to nil)
  • published (published_at is set to a timestamp less than or equal to now)
  • enqueued (published_at is set to a timestamp in the future)
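
As a rough illustration of those three states, here is a minimal plain-Ruby
sketch. The publication_state helper and its now: keyword are hypothetical
names introduced for this example, not part of the application above:

```ruby
# Hypothetical helper: classify a post by its published_at value.
# `now` is injectable so the boundary cases are easy to exercise.
def publication_state(published_at, now: Time.now)
  return :unpublished if published_at.nil?

  published_at <= now ? :published : :enqueued
end

now = Time.utc(2024, 1, 1, 12, 0, 0)

publication_state(nil, now: now)         # => :unpublished
publication_state(now - 3600, now: now)  # => :published
publication_state(now, now: now)         # => :published (boundary is inclusive)
publication_state(now + 3600, now: now)  # => :enqueued
```

Note that the boundary is inclusive: a timestamp equal to now counts as
published, matching the "less than or equal to now" rule in the list.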

While this is a relatively small change in the database and corresponding
migration (which we won’t go into here), the necessary changes across these
different controllers represent a code smell: Shotgun Surgery.

While this example is small, in larger codebases, changes like this can add up
to a sizeable PR quickly. Most often, the code affected by this kind of data
shift includes:

  • controllers
  • service objects
  • query objects
  • jobs
  • factories
  • tests (especially acceptance tests or anything that touches the database)

The underlying issue here is that ActiveRecord can act as a leaky abstraction.

Because ActiveRecord abstracts over a database with direct references to
columns, and because where can be called either directly on Post within a
controller or, even worse, through an association (as in the second controller
example, which finds published authors), we’re littering knowledge of what
makes a post published (the contents of the where clause) across several files
in the application: currently, the model and two separate controllers.

The right fix depends on the complexity of the queries, but I’d first reach
for a class method on Post:

class Post < ApplicationRecord
  # other methods

  def self.published
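    # <= treats a timestamp equal to now as published; the table name avoids
    # ambiguity when this relation is merged into a join.
    where("posts.published_at <= ?", Time.current)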
  end
end

With this, changes to the controllers are trivial:

class PostsController < ApplicationController
  def index
-   @newest_posts = Post.where(published: true).order(created_at: :desc).limit(10)
+   @newest_posts = Post.published.order(created_at: :desc).limit(10)
  end
end

class AuthorsController < ApplicationController
  def index
-   @published_authors = Person.distinct.joins(:posts).where(posts: { published: true })
+   @published_authors = Person.distinct.joins(:posts).merge(Post.published)
  end
end

Is there still coupling at the controller level between a person and their
corresponding posts? Yep! Changing that relationship, however, would rightly
be a breaking change, whereas the notion of a post being published should
hold, generally speaking, whether we’re using a published boolean, a
published_at timestamp, or some sort of state machine.

Worth highlighting in this second change is the merge method, which does the
heavy lifting: it applies the conditions from Post.published to the joined
posts table in the Person query.

In larger applications, where is not the only ActiveRecord method that can
signal this kind of leak, nor is every use of it problematic. Using where with
an association, for example, falls into the “coupling association” category:
it ties code to the relationship rather than to the meaning of a column, which
is usually innocuous.

It’s also worth noting that problematic use of where isn’t confined to Rails
controllers; service objects, jobs, and other areas of the application that
query against the “guts” of an ActiveRecord object are just as susceptible.

Finally, while we used a class method in the example above, for larger queries,
consider a dedicated query object to encapsulate the logic in one place.
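
As a minimal sketch of what such a query object might look like: the names
PublishedPostsQuery and RecordingRelation below are hypothetical. The
RecordingRelation class exists only so the example runs outside Rails; in an
application, the relation passed in would be something like Post.all.

```ruby
# Hypothetical query object. `relation` is anything that responds to
# `where` like an ActiveRecord relation; in a Rails app it would be Post.all.
class PublishedPostsQuery
  def initialize(relation)
    @relation = relation
  end

  # Narrows the relation to posts whose published_at is not in the future.
  def call(now: Time.now)
    @relation.where("posts.published_at <= ?", now)
  end
end

# Stand-in for a relation, only so the sketch runs without a database.
# It records the conditions it receives instead of building SQL.
class RecordingRelation
  attr_reader :conditions

  def initialize(conditions = [])
    @conditions = conditions
  end

  def where(*args)
    RecordingRelation.new(conditions + [args])
  end
end

now = Time.utc(2024, 1, 1)
result = PublishedPostsQuery.new(RecordingRelation.new).call(now: now)
result.conditions # => [["posts.published_at <= ?", 2024-01-01 00:00:00 UTC]]
```

Assuming the models from earlier, a controller could then call
PublishedPostsQuery.new(Post.all).call, and the authors query could pass that
result to merge, keeping the definition of "published" in a single class.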
