Software teams often cite code reviews as a primary means for ensuring quality and reducing bugs in their code. However, without high quality reviews, the efficacy of this practice is questionable. In most places that I’ve worked, a change requires only one code review (either through necessity due to team size or through informal norms). When this is the case, the code review becomes a taxing process where, if you’re not on your game, bugs or maintainability issues can start to leak into production. This means that it often falls to the senior engineers to do code reviews. However, relying entirely on the seniors means that they become a bottleneck in the process. When it comes to “dangerous” changes like Rails database migrations, the bottleneck of relying on a small subset of the team can mean that velocity drops. Or, to keep up velocity, you make code reviews less stringent and quality drops.
In try to “scale myself” for my team, I have started to experiment with checklists for performing code reviews. The first one that I wrote up is a checklist for reviewing Rails database migrations. We run on Heroku, which means that zero-downtime migrations are easy to do, as long as you write your migrations correctly. I intend this checklist to help more junior teammates do as thorough of a job in reviewing as I do. Going forward, I want to make more checklists for doing different types of code reviews, to further scale myself for my team.
Checklists are hard
One thing I found from starting this process is that checklists are hard. It’s really taxing to try to list everything that you do in a given situation. Most habits are tacit knowledge, rather than explicit knowledge. Because of this, recalling each habit when you’re not in the middle of using it is difficult and error-prone. As such, I have likely missed several practices that I do when I’m checking database migrations.
I would love to hear if you have any practices that you use to make sure your Rails database migrations are correct and won’t cause downtime. Have you tried this before? How did it go? Would you change anything about your review process?
Without further ado, here is the first version of our Rails database migrations checklist.
Rails database migrations checklist
When you are performing a code review that involves Rails database migrations, there is a list of commonly forgotten things that are good to think about. This checklist will be helpful for you any time you see a change in the
db/migrate folder of a Rails project.
- If you are using
#change, is the migration reversible?
- Are all the changes to
schema.rbrelated to migrations in this pull request?
Additive changes do not change any existing columns. They can add tables, columns, or indexes.
- Are all queried fields properly indexed via
- If you are using a JSON or JSON-B column, is it indexed with a GIN index via
add_index table_name, column_name, using: 'gin'?
- Are all unstructured columns using a JSON or JSON-B column instead of a YAML-based
- Do all columns that have a
validates_uniqueness_ofvalidation backed with a unique index as well (e.g.
add_index :users, :name, index: true, unique: true)?
- If you are adding an index to a pre-existing table, are you using
algorithm: :concurrentlyand did you specify
Destructive migrations are migrations that use any of the
- Can the model code interoperate between this version and the earlier version?
- If it cannot, can you split the deployment into two or three steps to make zero-downtime deploys?
When you are using
#change instead of the separate
#down methods, you must take care to make sure the migration is reversible. Here is an example of a non-reversible migration:
In this migration, we are changing from a
last_name schema to a singular
name column. Going up is perfectly fine (if we ignore that in this implementation we’re going to destroy data!): the new column gets created and the old columns get removed. Going down, however, is a different story. If we try to do a
rails db:migrate:down or a
rails db:rollback, we will see the following error:
#change handles both the
#down through one command, ActiveRecord requires a more explicit form when you’re removing a column. You have to specify the type of the column(s) that you are removing to make it reversible. You can fix this example like:
Because we try to run zero-downtime deploys, you must account for interoperability when you are changing the usage of a column or an association. For example, consider this change:
We want to change the storage of user names from two fields into a single
#name field. You can do that with the following:
However, we will get inconsistent results (and potential 500 errors) while this change is deploying and while the Rake task is running. To achieve interoperability between both versions of the code, we could solve it like this:
This change creates a virtual attribute for
#name that wraps the new column value and defaults the method response to the values of the first and last name on the model joined by a space. This means that:
- When the older version pulls records, the app ignores the
#namemethod and column entirely in favor of the old columns
- For records that the new version pulls before the Rake task modifies them, the app falls back to the right operand of the or clause in the
- Lastly, when the new version pulls records after the Rake task modifies them, the
#namemethod returns the value stored in the database.
Once the Rake task updates all the records, you can then deploy a second version with a migration to remove the old data and the compatibility layer:
For reference, Code Ship has a decent walkthrough of interoperable migrations, though be aware about checking for better ways to solve the problems in newer versions of Rails.