When designing your domain model for an application, the natural solution might end up using a
nil to model the absence of a value. In these cases, you might need to reach for null-based ordering within your database. When you’re writing a Ruby on Rails application, you’re most likely going to use ActiveRecord as the interface to your database. While ActiveRecord supports most queries out-of-the-box, sometimes you have to go off the beaten path.
Null-based ordering is one of those situations. Since ActiveRecord’s ordering works based on simple
key: :direction pairs, it is hard to perform null-based ordering outside of using the built-in behavior of your database. This article outlines two ways you can structure this type of null-based ordering, along with the caveats of using them.
A simple model
For this article, consider the following model. You want to deliver notifications to your customers. Each notification has a header and a body. You deliver the notification to an account via a “delivery” and the owner of the account receives the delivery and reads it at two separate times. When showing notifications to the owner of an account, you want to show unread notifications first in reverse order of delivery, then show the read notifications in the order that the owner read them.
Let’s run through a Class-Responsibility-Collaborator card exercise to define your model. Looking at this short description, we see three nouns that are candidates for models:
Delivery. We also see four properties: a
delivered at time, and a
read at time. We also see two responsibilities, “show unread notifications first” and “show read notifications in the order the owner read them.”
We can translate these findings to the following model:
class Account < ApplicationRecord has_many :deliveries, class_name: "Notifications::Delivery" has_many :notifications, through: :deliveries end class Notification < ApplicationRecord attribute :body, :string attribute :header, :string has_many :deliveries, class_name: "Notifications::Delivery" has_many :accounts, through: :deliveries end class Notifications::Delivery < ApplicationRecord alias_attribute :delivered_at, :created_at attribute :read_at, :time belongs_to :account belongs_to :notification end
This simple model uses a join model to represent the act of delivering a notification. By modeling an act, the join model makes the model more flexible by allowing us to easily deliver notifications to multiple accounts. It also separates the two concerns — the content of the notification and the act of delivering it — so that they can be independently modified.
Now that we have a model, let’s think about how to satisfy the goal of the brief.
Default null ordering
By default, SQL considers
NULL to be a value less than any present value. When you order by a nullable field, an
ORDER BY read_at ASC places all
NULL values at the end of the results, whereas an
ORDER BY read_at DESC clause places them at the beginning.
With that knowledge, it’s straightforward on how to place the unread notifications at the start of the list:
Notification .joins(:deliveries) .order( '"notifications_deliveries"."read_at" DESC' )
I find SQL fragments to be frustrating to write because typos are less obvious and you can’t use Rails' aliased attributes so we could refactor this to use
ActiveRecord::Relation#merge to write each clause in the natural sense:
Notification .joins(:deliveries) .merge(Notifications::Deliveries.order(read_at: :desc))
Now we have the unread messages at the beginning of the list … but the most recently read last. That doesn’t quite match our brief. What we want is for the unread notifications to be listed first, but the read ones to be listed in ascending order. What can we do?
Null-based ordering in SQL:2003
The 2003 version of the ISO SQL added a null ordering clause that allows you to specify
NULLS FIRST or
NULLS LAST for a nullable field. This changes the default interpretation of null for the purposes of sorting. Using this logic, we could change the query to:
Notification .joins(:deliveries) .order( '"notifications_deliveries"."read_at" ASC NULLS FIRST' )
This gets the notifications in the order we want, except for when there are multiple unread messages. Let’s fix that:
Notification .joins(:deliveries) .order( '"notifications_deliveries"."read_at" ASC NULLS FIRST', '"notifications_deliveries"."created_at" DESC' )
ORDER BY clause handles “unread notifications first […] read notifications in the order in which the owner read them.” And the second handles “unread notifications in order of delivery” because the database will only apply it to break ties where
read_at is the same value (usually only for unread notifications).
Refactoring to Arel
Since ActiveRecord does not, by default, support
NULLS FIRST we resorted to two SQL fragments. But as I said earlier, I find SQL fragments to be difficult to write on the first go-around and hard to debug when they result in error. To fix this, we can use the Arel internal querying grammar1. A refactor might looks something like this:
class Notifications::Delivery < ApplicationRecord def self.in_display_order order( arel_table[:read_at].asc.nulls_first, arel_table[:delivered_at].desc ) end end
With this, we have encapsulated the logic in a chainable method on the relation using fully-qualified field names, so we can go back to using
#merge to run the full query:
Notification .join(:deliveries) .merge(Notifications::Delivery.in_display_order)
There are two extra things that I leave as exercises for the reader:
- We are not scoping this query to a single account. How would you make sure you’re only showing deliveries to a particular account?
- This is a lot to type and a lot to change if we want to show notifications in this order in multiple places. What could you add or refactor to make it easier to change this logic in the future? Hint: think about how can we hide the implementation detail of deliveries from a module that only wants to know about
Since ActiveRecord is a generic object-relational mapper over several different databases, you have to think about whether your database supports this syntax before you try to use this in your own application. Let’s take a look at database support for null-based ordering.
PostgreSQL has supported null-based ordering since v8.3. Since this version went end-of-life in 2013, you are most likely on a version newer than that so go ahead!
As of v8.0.23, released on 2021-01-18, MySQL does not support this query2. You can work around the problem by using some quirks of MySQL’s treatments of null.
To order null values first in a query, we can rely on the fact that
IS NULL returns a
1 for a null value and a
0 otherwise. Using that knowledge, we can split the predicate into two:
class Notifications::Delivery < ApplicationRecord def self.in_display_order order( arel_table[:read_at].not_eq(nil), arel_table[:read_at].asc, arel_table[:delivered_at].desc ) end end
I’m not really sure how this will interact with indexes since I haven’t used MySQL in production since before the MariaDB split. Since the predicate checks for a null value, I think it will use indexes accordingly, but check before you roll it out to production.
If you use this and have different findings, please let me know!
SQLite has supported null-based ordering since v3.30.0, released on 2019-10-30. As of this writing, Debian Stable (Buster) is only on v3.27.2 but Debian Testing and Unstable (Buster) are both on v3.34.1. Ubuntu 20.04, the latest LTS release, is on v3.31.1. And for the Docker users, Alpine is on v3.34.1.
However, as of Rails 6.1.1, the SQLite adapter does not support the Arel version of the query and the SQL fragment version requires a wrap of the
NULLS FIRST ordering clause in
Arel.sql to prevent an
With those caveats, a modified version looks like:
class Notifications::Delivery < ApplicationRecord def self.in_display_order order( Arel.sql('"notifications_deliveries"."read_at" ASC NULLS FIRST'), '"notifications_deliveries"."created_at" DESC' ) end end
Null-based ordering allows you to have fine-grained control over the ordering of nullable fields within your database queries. For cases where you need to handle null values differently, using the SQL:2003 null ordering predicates can help you make your sorting logic easier-to-read and understand while also helping you keep querying logic in the database rather than Ruby.
PostgreSQL has excellent support for null-based ordering, both at the database level and the ActiveRecord level. MySQL does not support it, but there are reasonable workarounds. And SQLite requires a small workaround due to not supporting the query at the ActiveRecord level.
Using Arel makes it so you can even use your domain logic when you use
alias_attribute, which is an added cherry on top. Even though it’s a private API, it’s stable enough that I feel fine using it with tests around it for regression coverage during Rails upgrades.
Have you ever used null-based ordering in your Rails application? What were you querying for? If not, has there ever been a case where you needed to work around the problem?
It’s important to note that the Rails core team considers Arel to be a private API. I use it extensively and it has been stable for my use cases, but I recommend writing tests around any query in which you use it to make sure that Rails upgrades don’t break your queries. ↩︎
I know MySQL is still very popular, but I don’t know why. Do you work somewhere that uses MySQL? I’d love to hear why! ↩︎