Cleaning House - Removing dead columns

Cleaning House - Removing dead columns

On January 10, 2023, our code base turns 11 years old. For a repo with this much history, we’d expect for the business needs and features to have evolved over time. Surely in the quest to ship new features, our past selves might have forgotten to remove all the dead code. While examining our spaghetti model, we were trying to figure out the purpose of each item. Some of this analysis is waste since the column might be dead code.

Explore your table

We have many nullable columns, so I was curious if any of them were no longer used. For each column, I ran this query

select id, created_at from companies where name IS NOT NULL ORDER BY id DESC  LIMIT 1;

I came across five columns that were very old

columnLast created company to have this value set (1)
code2015-08-18 15:15
industry2017-05-19 18:52
special_code2015-08-18 15:15
zendesk2018-05-10 10:43
google2018-05-16 16:12

Research the history of the column

This is strong evidence that the column isn’t getting used. I do like to look around a bit more to see what’s up.

  1. Check out the migrations
  2. git log -S special_code

The extra research allowed me to see when it was added. Several of the columns were added together, and when they were no longer used by the code. It also gave me a list of people to possibly talk to if I had more questions.

1. (Optional) Fail Your Tests

See which tests fail if your column is removed. Don’t forget to turn off any validations!

  self.ignored_columns = [


# validates :special_code, inclusion: { in: SPECIAL_CODES}, allow_blank:true

I learned that we had a .rabl file that displayed the special_code which caused many tests to fail. I then removed it from the .rabl file and re-ran my tests.

It’s possible that you can just skip this step. I had noticed that there were many references to special_code in my tests and I was concerned that removing them would leave an unexposed usage of special_code. However, we have that covered later on by adding Step X, logging.

2. Remove any dependent dead code

If the column is dead, there might be additional dead code that needs removing. Have fun!

3. Introduce logging to make sure nothing is using the attribute / column

I’ve been burned by meta-programming. If this table is in the center of all the action, taking a few extra steps is worth giving you the confidence that nothing is going to fall apart in the next step. Any issues will be a log statement.

  def special_code"Using Company#special_code at #{caller}")

  def special_code=(code)"Using Company#special_code=  at #{caller}")

After deploying, I then wait a day (or longer) to see if there is hidden part of the system using the code. You might just be surprised!

4. Mark the column as unused or ignored by rails

It’s tempting to jump to the migration. In large rails systems, it’s a best practice to never rollback a migration. This is our last line of defense before dropping the column for real. Any issues will cause a rails exception.

self.ignored_columns = [:special_code]

Remove the logging that you added in the previous step

-  def special_code
-"Using Company#special_code at #{caller}")
-    super
-  end

-  def special_code=(code)
-"Using Company#special_code=  at #{caller}")
-    super(code)
-  end

If you use any tools to annotate your model, these should remove all the annotations as well.

5. Delete the column with a migration

bin/rails g migration DropSpecialCodeColumnFromCompany

Look at your db/schema.rb file for any indexes. I go here to seee the column type and defaults for the down method (even though it will never be run on my production systems)

rake db:migrate

If you use a system like ghost, this will generate the json changes too.

6. Clean-up

After you’ve run the migration in all environments, now it’s time to take care of any cleanup. Remove the self.ignored_columns from your model.

© 2009-2022. All rights reserved.