Software teams often cite code reviews as a primary means for ensuring quality and reducing bugs in their code. However, without high quality reviews, the efficacy of this practice is questionable. In most places that I’ve worked, a change requires only one code review (either through necessity due to team size or through informal norms). When this is the case, the code review becomes a taxing process where, if you’re not on your game, bugs or maintainability issues can start to leak into production. This means that it often falls to the senior engineers to do code reviews. However, relying entirely on the seniors means that they become a bottleneck in the process. When it comes to “dangerous” changes like Rails database migrations, the bottleneck of relying on a small subset of the team can mean that velocity drops. Or, to keep up velocity, you make code reviews less stringent and quality drops.
In try to “scale myself” for my team, I have started to experiment with checklists for performing code reviews. The first one that I wrote up is a checklist for reviewing Rails database migrations. We run on Heroku, which means that zero-downtime migrations are easy to do, as long as you write your migrations correctly. I intend this checklist to help more junior teammates do as thorough of a job in reviewing as I do. Going forward, I want to make more checklists for doing different types of code reviews, to further scale myself for my team.
Checklists are hard
One thing I found from starting this process is that checklists are hard. It’s really taxing to try to list everything that you do in a given situation. Most habits are tacit knowledge, rather than explicit knowledge. Because of this, recalling each habit when you’re not in the middle of using it is difficult and error-prone. As such, I have likely missed several practices that I do when I’m checking database migrations.
I would love to hear if you have any practices that you use to make sure your Rails database migrations are correct and won’t cause downtime. Have you tried this before? How did it go? Would you change anything about your review process?
Without further ado, here is the first version of our Rails database migrations checklist.
Rails database migrations checklist
When you are performing a code review that involves Rails database migrations, there is a list of commonly forgotten things that are good to think about. This checklist will be helpful for you any time you see a change in the
db/migrate folder of a Rails project.
- If you are using
#change, is the migration reversible?
- Are all the changes to
schema.rbrelated to migrations in this pull request?
Additive changes do not change any existing columns. They can add tables, columns, or indexes.
- Are all queried fields properly indexed via
- If you are using a JSON or JSON-B column, is it indexed with a GIN index via
add_index table_name, column_name, using: 'gin'?
- Are all unstructured columns using a JSON or JSON-B column instead of a YAML-based
- Do all columns that have a
validates_uniqueness_ofvalidation backed with a unique index as well (e.g.
add_index :users, :name, index: true, unique: true)?
- If you are adding an index to a pre-existing table, are you using
algorithm: :concurrentlyand did you specify
Destructive migrations are migrations that use any of the
- Can the model code interoperate between this version and the earlier version?
- If it cannot, can you split the deployment into two or three steps to make zero-downtime deploys?
When you are using
#change instead of the separate
#down methods, you must take care to make sure the migration is reversible. Here is an example of a non-reversible migration:
def change add_column :users, :name, :string remove_column :users, :first_name remove_column :users, :last_name end
In this migration, we are changing from a
last_name schema to a singular
name column. Going up is perfectly fine (if we ignore that in this implementation we’re going to destroy data!): the new column gets created and the old columns get removed. Going down, however, is a different story. If we try to do a
rails db:migrate:down or a
rails db:rollback, we will see the following error:
== 20180605140744 ChangeToSingularName: reverting ============================= rails aborted! StandardError: An error has occurred, this and all later migrations canceled: remove_column is only reversible if given a type.
#change handles both the
#down through one command, ActiveRecord requires a more explicit form when you’re removing a column. You have to specify the type of the column(s) that you are removing to make it reversible. You can fix this example like:
def change add_column :users, :name, :string remove_column :users, :first_name, :string remove_column :users, :last_name, :string end
Because we try to run zero-downtime deploys, you must account for interoperability when you are changing the usage of a column or an association. For example, consider this change:
class User < ApplicationRecord validates :first_name, presence: true validates :last_name, presence: true end class CreateUsers < ActiveRecord::Migration[5.1] def change create_table :users do |table| table.string :first_name, null: false table.string :last_name, null: false end end end
We want to change the storage of user names from two fields into a single
#name field. You can do that with the following:
# In a migration file class AddUserNames < ActiveRecord::Migration[5.1] def change add_column :users, :name, :string end end # In a Rake task namespace :users do desc 'Migrates the first_name/last_name combo over to name' task migrate_names_to_single_field: :environment do User.where(name: nil).find_each do |user| user.update(name: [user.first_name, user.last_name].join(' ')) end end end # In the User model class User < ApplicationRecord - validates :first_name, presence: true - validates :last_name, presence: true end
However, we will get inconsistent results (and potential 500 errors) while this change is deploying and while the Rake task is running. To achieve interoperability between both versions of the code, we could solve it like this:
class User < ApplicationRecord - validates :first_name, presence: true - validates :last_name, presence: true + + def name + super || attributes.values_at('first_name', 'last_name').join(' ') + end end
This change creates a virtual attribute for
#name that wraps the new column value and defaults the method response to the values of the first and last name on the model joined by a space. This means that:
- When the older version pulls records, the app ignores the
#namemethod and column entirely in favor of the old columns
- For records that the new version pulls before the Rake task modifies them, the app falls back to the right operand of the or clause in the
- Lastly, when the new version pulls records after the Rake task modifies them, the
#namemethod returns the value stored in the database.
Once the Rake task updates all the records, you can then deploy a second version with a migration to remove the old data and the compatibility layer:
# In a migration file class RemoveDeprecatedNameFieldsFromUser < ActiveRecord::Migration[5.1] def change remove_column :users, :first_name, :string remove_column :users, :last_name, :string change_column_null :users, :name, false end end
# In the User model class User < ApplicationRecord + validates :name, presence: true - - def name - super || attributes.values_at('first_name', 'last_name').join(' ') - end end
For reference, Code Ship has a decent walkthrough of interoperable migrations, though be aware about checking for better ways to solve the problems in newer versions of Rails.