# What requires downtime?

When working with a database, certain operations can be performed without
taking GitLab offline, while others do require a downtime period. This guide
describes various operations, their impact, and how to perform them without
requiring downtime.

## Adding Columns

You can safely add a new column to an existing table as long as it does **not**
have a default value. For example, this query would not require downtime:

```sql
ALTER TABLE projects ADD COLUMN random_value int;
```

Adding a column _with_ a default, however, does require downtime. For example,
consider this query:

```sql
ALTER TABLE projects ADD COLUMN random_value int DEFAULT 42;
```

This requires updating every single row in the `projects` table so that
`random_value` is set to `42` by default. Updating all rows and indexes in a
table acquires enough locks on it to effectively block any other queries.

Adding a column with a default value _can_ be done without requiring downtime
when using the migration helper method
`Gitlab::Database::MigrationHelpers#add_column_with_default`. This method works
similarly to `add_column` except it updates existing rows in batches without
blocking access to the table being modified. See ["Adding Columns With Default
Values"](migration_style_guide.md#adding-columns-with-default-values) for more
information on how to use this method.
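
For illustration, a minimal sketch of such a migration might look like the
following. The class name, column name, and default value are made up for this
example; refer to the linked documentation for the helper's exact options.

```ruby
# A regular migration in db/migrate (hypothetical example)
class AddRandomValueToProjects < ActiveRecord::Migration[4.2]
  include Gitlab::Database::MigrationHelpers

  # The helper updates existing rows outside of a single DDL transaction.
  disable_ddl_transaction!

  def up
    # Adds the column with a default and backfills existing rows in batches.
    add_column_with_default :projects, :random_value, :integer, default: 42
  end

  def down
    remove_column :projects, :random_value
  end
end
```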

## Dropping Columns

Removing columns is tricky because running GitLab processes may still be using
the columns. To work around this, you will need two separate merge requests and
releases: one to ignore and then remove the column, and one to remove the
ignore rule.

### Step 1: Ignoring The Column

The first step is to ignore the column in the application code. This is
necessary because Rails caches the columns and re-uses this cache in various
places. This can be done by defining the columns to ignore. For example, to
ignore `updated_at` in the User model you'd use the following:

```ruby
class User < ApplicationRecord
  self.ignored_columns += %i[updated_at]
end
```

Once added, you should create a _post-deployment_ migration that removes the
column. Both these changes should be submitted in the same merge request.
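
As a rough sketch, such a post-deployment migration could look like the
following; the class name and the column type used in `down` are assumptions
for this example:

```ruby
# A post-deployment migration in db/post_migrate (hypothetical example)
class RemoveUsersUpdatedAt < ActiveRecord::Migration[4.2]
  def up
    remove_column :users, :updated_at
  end

  def down
    # Re-add the column on rollback; the type here is an assumption and must
    # match the original column definition.
    add_column :users, :updated_at, :datetime unless column_exists?(:users, :updated_at)
  end
end
```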

### Step 2: Removing The Ignore Rule

Once the changes from step 1 have been released and deployed, you can set up a
separate merge request that removes the ignore rule. This merge request can
simply remove the `self.ignored_columns` line.

## Renaming Columns

Renaming columns the normal way requires downtime as an application may
continue using the old column name during/after a database migration. To rename
a column without requiring downtime we need two migrations: a regular
migration, and a post-deployment migration. Both these migrations can go in the
same release.

### Step 1: Add The Regular Migration

First we need to create the regular migration. This migration should use
`Gitlab::Database::MigrationHelpers#rename_column_concurrently` to perform the
renaming. For example:

```ruby
# A regular migration in db/migrate
class RenameUsersUpdatedAtToUpdatedAtTimestamp < ActiveRecord::Migration[4.2]
  include Gitlab::Database::MigrationHelpers

  disable_ddl_transaction!

  def up
    rename_column_concurrently :users, :updated_at, :updated_at_timestamp
  end

  def down
    undo_rename_column_concurrently :users, :updated_at, :updated_at_timestamp
  end
end
```

This will take care of renaming the column, ensuring data stays in sync,
copying over indexes and foreign keys, etc.

**NOTE:** if the column is used by one or more indexes whose names do not
contain the name of the original column, the above procedure will fail. In this
case you will first need to rename these indexes.
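
As a purely hypothetical sketch of that preparatory step, a regular migration
could rename such an index using Rails' `rename_index`; both index names below
are made up for illustration:

```ruby
# A regular migration in db/migrate (hypothetical example)
class RenameIndexBeforeColumnRename < ActiveRecord::Migration[4.2]
  def up
    # Rename the index before running rename_column_concurrently.
    rename_index :users, 'index_for_user_lookup', 'index_users_on_updated_at'
  end

  def down
    rename_index :users, 'index_users_on_updated_at', 'index_for_user_lookup'
  end
end
```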

### Step 2: Add A Post-Deployment Migration

The renaming procedure requires some cleaning up in a post-deployment
migration. We can perform this cleanup using
`Gitlab::Database::MigrationHelpers#cleanup_concurrent_column_rename`:

```ruby
# A post-deployment migration in db/post_migrate
class CleanupUsersUpdatedAtRename < ActiveRecord::Migration[4.2]
  include Gitlab::Database::MigrationHelpers

  disable_ddl_transaction!

  def up
    cleanup_concurrent_column_rename :users, :updated_at, :updated_at_timestamp
  end

  def down
    undo_cleanup_concurrent_column_rename :users, :updated_at, :updated_at_timestamp
  end
end
```

## Changing Column Constraints

Adding or removing a `NOT NULL` clause (or another constraint) can typically be
done without requiring downtime. However, this does require that any
application changes are deployed _first_. Thus, changing the constraints of a
column should happen in a post-deployment migration.

NOTE: Avoid using `change_column` as it produces an inefficient query because
it re-defines the whole column type. For example, to add a `NOT NULL`
constraint, prefer `change_column_null`.
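
To illustrate, a minimal sketch of such a post-deployment migration might look
like the following; the table and column names are assumptions for this
example:

```ruby
# A post-deployment migration in db/post_migrate (hypothetical example)
class AddNotNullConstraintToUsersUsername < ActiveRecord::Migration[4.2]
  def up
    # Disallow NULL values without re-defining the whole column type.
    change_column_null :users, :username, false
  end

  def down
    change_column_null :users, :username, true
  end
end
```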

## Changing Column Types

Changing the type of a column can be done using
`Gitlab::Database::MigrationHelpers#change_column_type_concurrently`. This
method works similarly to `rename_column_concurrently`. For example, let's say
we want to change the type of `users.username` from `string` to `text`.

### Step 1: Create A Regular Migration

A regular migration is used to create a new column with a temporary name along
with setting up some triggers to keep data in sync. Such a migration would look
as follows:

```ruby
# A regular migration in db/migrate
class ChangeUsersUsernameStringToText < ActiveRecord::Migration[4.2]
  include Gitlab::Database::MigrationHelpers

  disable_ddl_transaction!

  def up
    change_column_type_concurrently :users, :username, :text
  end

  def down
    cleanup_concurrent_column_type_change :users, :username
  end
end
```

### Step 2: Create A Post-Deployment Migration

Next we need to clean up our changes using a post-deployment migration:

```ruby
# A post-deployment migration in db/post_migrate
class ChangeUsersUsernameStringToTextCleanup < ActiveRecord::Migration[4.2]
  include Gitlab::Database::MigrationHelpers

  disable_ddl_transaction!

  def up
    cleanup_concurrent_column_type_change :users, :username
  end

  def down
    change_column_type_concurrently :users, :username, :string
  end
end
```

And that's it, we're done!

## Changing The Schema For Large Tables

While `change_column_type_concurrently` and `rename_column_concurrently` can be
used for changing the schema of a table without downtime, they don't work very
well for large tables. Because all of the work happens in sequence, the
migration can take a very long time to complete, preventing a deployment from
proceeding. They can also produce a lot of pressure on the database because
they rapidly update many rows in sequence.

To reduce database pressure you should instead use
`change_column_type_using_background_migration` or
`rename_column_using_background_migration` when migrating a column in a large
table (e.g. `issues`). These methods work similarly to their concurrent
counterparts but use background migrations to spread the work and load over a
longer time period, without slowing down deployments.

For example, to change the column type using a background migration:

```ruby
class ExampleMigration < ActiveRecord::Migration[4.2]
  include Gitlab::Database::MigrationHelpers

  disable_ddl_transaction!

  class Issue < ActiveRecord::Base
    self.table_name = 'issues'

    include EachBatch

    def self.to_migrate
      where('closed_at IS NOT NULL')
    end
  end

  def up
    change_column_type_using_background_migration(
      Issue.to_migrate,
      :closed_at,
      :datetime_with_timezone
    )
  end

  def down
    change_column_type_using_background_migration(
      Issue.to_migrate,
      :closed_at,
      :datetime
    )
  end
end
```

This would change the type of `issues.closed_at` to `timestamp with time zone`.

Keep in mind that the relation passed to
`change_column_type_using_background_migration` _must_ include `EachBatch`,
otherwise it will raise a `TypeError`.

This migration then needs to be followed in a separate release (_not_ a patch
release) by a cleanup migration, which should steal from the queue and handle
any remaining rows. For example:

```ruby
class MigrateRemainingIssuesClosedAt < ActiveRecord::Migration[4.2]
  include Gitlab::Database::MigrationHelpers

  DOWNTIME = false

  disable_ddl_transaction!

  class Issue < ActiveRecord::Base
    self.table_name = 'issues'
    include EachBatch
  end

  def up
    Gitlab::BackgroundMigration.steal('CopyColumn')
    Gitlab::BackgroundMigration.steal('CleanupConcurrentTypeChange')

    migrate_remaining_rows if migrate_column_type?
  end

  def down
    # Previous migrations already revert the changes made here.
  end

  def migrate_remaining_rows
    Issue.where('closed_at_for_type_change IS NULL AND closed_at IS NOT NULL').each_batch do |batch|
      batch.update_all('closed_at_for_type_change = closed_at')
    end

    cleanup_concurrent_column_type_change(:issues, :closed_at)
  end

  def migrate_column_type?
    # Some environments may have already executed the previous version of this
    # migration, thus we don't need to migrate those environments again.
    column_for('issues', 'closed_at').type == :datetime # rubocop:disable Migration/Datetime
  end
end
```

The same applies to `rename_column_using_background_migration`:

1. Create a migration using the helper, which will schedule background
   migrations to spread the writes over a longer period of time.
1. In the next monthly release, create a clean-up migration to steal from the
   Sidekiq queues, migrate any missing rows, and clean up the rename. This
   migration should skip the steps after stealing from the Sidekiq queues if
   the column has already been renamed.

For more information, see [the documentation on cleaning up background
migrations](background_migrations.md#cleaning-up).

## Adding Indexes

Adding indexes is an expensive process that blocks `INSERT` and `UPDATE`
queries for the duration. You can work around this by using the `CONCURRENTLY`
option:

```sql
CREATE INDEX CONCURRENTLY index_name ON projects (column_name);
```

Migrations can take advantage of this by using the method
`add_concurrent_index`. For example:

```ruby
class MyMigration < ActiveRecord::Migration[4.2]
  include Gitlab::Database::MigrationHelpers

  # add_concurrent_index must run outside of a transaction.
  disable_ddl_transaction!

  def up
    add_concurrent_index :projects, :column_name
  end

  def down
    remove_index(:projects, :column_name) if index_exists?(:projects, :column_name)
  end
end
```

Note that `add_concurrent_index` cannot be reversed automatically, thus you
need to manually define `up` and `down`.

## Dropping Indexes

Dropping an index does not require downtime.

## Adding Tables

This operation is safe as there's no code using the table just yet.

## Dropping Tables

Dropping tables can be done safely using a post-deployment migration, but only
if the application no longer uses the table.
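
As an illustrative sketch (the table name and columns are made up for this
example), such a post-deployment migration could look like this:

```ruby
# A post-deployment migration in db/post_migrate (hypothetical example)
class DropLegacyEventsTable < ActiveRecord::Migration[4.2]
  def up
    drop_table :legacy_events
  end

  def down
    # Recreate the table on rollback; the columns are assumptions and must
    # match the table's original definition.
    create_table :legacy_events do |t|
      t.integer :project_id
      t.string :name
      t.timestamps null: false
    end
  end
end
```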

## Adding Foreign Keys

Adding foreign keys usually works in 3 steps:

1. Start a transaction
1. Run `ALTER TABLE` to add the constraint(s)
1. Check all existing data

Because `ALTER TABLE` typically acquires an exclusive lock until the end of a
transaction, this approach would require downtime.

GitLab allows you to work around this by using
`Gitlab::Database::MigrationHelpers#add_concurrent_foreign_key`. This method
ensures that no downtime is needed.
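
For illustration, a minimal sketch of a migration using this helper might look
as follows; the table and column names are assumptions for this example:

```ruby
# A regular migration in db/migrate (hypothetical example)
class AddForeignKeyToIssuesProjectId < ActiveRecord::Migration[4.2]
  include Gitlab::Database::MigrationHelpers

  # Required because the helper adds and validates the key outside of a
  # single DDL transaction.
  disable_ddl_transaction!

  def up
    add_concurrent_foreign_key :issues, :projects, column: :project_id
  end

  def down
    remove_foreign_key :issues, column: :project_id
  end
end
```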

## Removing Foreign Keys

This operation does not require downtime.

## Data Migrations

Data migrations can be tricky. The usual approach to migrating data is a
3-step process:

1. Migrate the initial batch of data
1. Deploy the application code
1. Migrate any remaining data

Usually this works, but not always. For example, if a field's format is to be
changed from JSON to something else, we have a bit of a problem. If we were to
change existing data before deploying the application code, we'd most likely
run into errors. On the other hand, if we were to migrate after deploying the
application code, we could run into the same problems.

If you merely need to correct some invalid data, then a post-deployment
migration is usually enough. If you need to change the format of data (e.g.
from JSON to something else) it's typically best to add a new column for the
new data format, and have the application use that. In such a case the
procedure would be (see the sketch after this list):

1. Add a new column in the new format
1. Copy over existing data to this new column
1. Deploy the application code
1. In a post-deployment migration, copy over any remaining data
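
As a very rough, hypothetical sketch of the first two steps (the table,
columns, cast, and batching approach are all assumptions for this example), the
initial migration could add the new column and copy data over in batches:

```ruby
# A regular migration in db/migrate (hypothetical example)
class AddSettingsJsonbToProjects < ActiveRecord::Migration[4.2]
  include Gitlab::Database::MigrationHelpers

  disable_ddl_transaction!

  class Project < ActiveRecord::Base
    self.table_name = 'projects'
    include EachBatch
  end

  def up
    # Step 1: add a column for the new data format.
    add_column :projects, :settings_jsonb, :jsonb unless column_exists?(:projects, :settings_jsonb)

    # Step 2: copy over existing data in batches to limit the load.
    Project.where(settings_jsonb: nil).each_batch do |batch|
      batch.update_all('settings_jsonb = settings::jsonb')
    end
  end

  def down
    remove_column :projects, :settings_jsonb
  end
end
```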

In general there is no one-size-fits-all solution, therefore it's best to
discuss these kinds of migrations in a merge request to make sure they are
implemented in the best way possible.