Sometimes, designers will mock-up an extremely usable page that they’ve thought about and tested. Once they’re done and they bring it to you to implement, you look at it and think, “how on Earth can I do that efficiently?” One such case happened at work recently.
We were working on a “command center” of sorts where we show our customers a list of upgrades in various states. The design called for what looked like three different lists of records with each table consisting of records in a state. Since a customer can have potentially dozens of these records, we knew we wanted to paginate the page. The wrinkle in setup was that the pagination should consider the full list of records as the dataset, not each list individually. This caused us to wonder how we can efficiently implement grouped pagination in ActiveRecord.
Naive implementation
When presented with a mock, it’s sometimes difficult to fully understand the intended behavior. For example, take the following abstract mock. The blue represents “draft” posts, the green are “unpublished” posts, and the red are “published” posts1.
Upon first inspection, my pair and I saw three lists of records, each grouped by a headline that corresponds to a status. Arguably, the natural understanding of the pagination behavior on this mock is to paginate each set individually. When we translated that to controller logic, the naïve implementation looked something like this:
We questioned whether we understood the mock correctly since that seemed like it would be a challenging user experience. To make sure we understood it, we did our due diligence and reached out to the designer. It turned out that we should paginate the whole dataset, not each piece individually.
One dataset
We took the mock back for further review with our newfound understanding. A few minutes later, we sketched the following diagram:
By treating the list as a single dataset, it was clear that each page can have multiple groups on it. However, it wasn’t necessarily true that each page has every group on it. Some might have all three, depending on the size of the query, but others might have only two of the statuses, or one!
Complicating things even further, we knew that the order was primarily dependent on the status of the post and we should sort each set of posts (grouped by status) by title. Clearly, we should run a single query and paginate off of that, like the following mock:
We were initially stumped. How could we do this efficiently?
First attempts
Our first attempt to do this was in the database using the GROUP BY
clause in SQL. GROUP BY
is great for when you need to aggregate information for groups of records. However, we needed to treat the set as a single source so that we could paginate it. That’s not possible using GROUP BY
.
After that, we tried to step back from doing anything fancy and use an ORDER BY
clause. However, we thought back to the requirements: we must show the draft posts first, the unpublished posts second, and the published posts third. Since those three terms alphabetically sort in the order, “draft”, “published”, “unpublished,” we knew it wasn’t a straightforward query.
Grouped pagination
We thought a little harder about the problem. If we could somehow change the “status” field to a number, we could easily order the posts by that. With this thought, we reached for the CASE
conditional expression to transform the status field into something we could use. Transforming the post status and ordering them by that calculated field using a SQL fragment looks the following:
I prefer to use Arel nodes for this purpose to give me better error messages if something breaks, so we refactored to this form:
Great! We had our posts grouped and ordered appropriately with a call to .order(:title)
. We could then plug it into the controller for viewing2:
This single query3 was able to do what we needed it to do: grouped pagination. It gave us a unified dataset to paginate over, grouped by status, in the correct order for each group. The confusing logic was easily isolated behind the Post.order_by_status
method and we test it in isolation to make sure future Rails upgrades don’t break the syntax or logic.
-
In the interest of not complicating this post with the intimate details of our domain, I decided to show the example using blog posts as the domain of understanding. All you need to know about that is that posts have a “title” and we store statuses as strings. ↩
-
In reality, we didn’t stop here. We extracted the result into a model using
SimpleDelegator
. This let us remove the grouping logic from the controller and the view. However, that’s not germane to this post, so I’m ignoring it. ↩ -
The use of
select
might have caught your eye. Since we needed to show the records on the page, it’s not a performance concern to select from the relation. Also, using#where
has two problems. First, it removes the pagination. Second, it runs three queries instead of one. ↩