/doc/gitlab-geo/configuration.md

https://gitlab.com/LeclercA/gitlab-ee
# GitLab Geo configuration

> **Important:**
Make sure you have followed the first two steps of the
[Setup instructions](README.md#setup-instructions).

After installing GitLab Enterprise Edition on the instance that will serve
as a Geo node and setting up database replication, the next steps can be
summarized as:

1. Configure the primary node
1. Replicate some required configurations between the primary and the secondaries
1. Start GitLab on the secondary node's machine
1. Configure every secondary node in the primary's Admin screen

After the GitLab instance is online and defined in the **Geo Nodes** admin screen,
new data will start to replicate automatically, but you still need to copy
old data from the primary machine (more information below).

## Primary node GitLab setup

>**Notes:**
- You will need to set up your database in a **Primary <-> Secondary (read-only)** replication
  topology, and your primary node should always point to the database's primary
  instance. If you haven't done that already, read [database replication](./database.md).
- All nodes' information (primary and secondary) is added only in the Geo
  nodes admin area of the primary node. Do not add anything in the Geo
  nodes admin area of the secondaries.

To set up the primary node:

1. [Create the SSH key pair][ssh-pair] for the primary node.
1. Visit the primary node's **Admin Area > Geo Nodes** (`/admin/geo_nodes`).
1. Add your primary node by providing its full URL and the public SSH key
   you created previously. Make sure to check the box 'This is a primary node'
   when adding it.

    ![Add new primary Geo node](img/geo_nodes_add_new.png)

---

>**Note:**
Don't set anything up for the `secondary` node yet; follow the
[Secondary node GitLab setup](#secondary-node-gitlab-setup) first.

The following table explains what these settings mean:

| Setting | Description |
| --------- | ----------- |
| Primary | This marks a Geo Node as primary. There can be only one primary, so make sure that you add the primary node first and then all the others. |
| URL | Your instance's full URL, in the same way it is configured in `gitlab.yml` (source based installations) or `/etc/gitlab/gitlab.rb` (omnibus installations). |
| Public Key | The SSH public key of the user that your GitLab instance runs as (unless changed, this is the user `git`). That means that you have to go to each Geo Node separately and create an SSH key pair. See the [SSH key creation][ssh-pair] section. |

## Secondary node GitLab setup

>**Note:**
The Geo nodes admin area (**Admin Area > Geo Nodes**) is not used when setting
up the secondary nodes. This is handled at the primary one.

To install a secondary node, you must follow the normal GitLab Enterprise
Edition installation, with some extra requirements:

- You should point your database connection to a [replicated instance](./database.md).
- Your secondary node should be allowed to communicate via HTTP/HTTPS and
  SSH with your primary node (make sure your firewall is not blocking that).
- Don't perform any of the extra steps you would for a normal new installation.
- Don't set up any custom authentication (this will be handled by the `primary` node).
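
The connectivity requirement in the list above can be checked from the secondary before going any further. The following is only a minimal sketch, assuming `bash` (for its `/dev/tcp` redirection) and the coreutils `timeout` command are available. The `port_open` helper and the `primary.example.com` hostname are hypothetical and not part of GitLab.

```bash
# Hypothetical helper (not part of GitLab): succeeds if a TCP connection
# to host $1 on port $2 can be opened within 5 seconds.
port_open() {
  local host="$1" port="$2"
  # bash treats /dev/tcp/HOST/PORT in a redirection as a TCP connection.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example usage from the secondary node (the hostname is a placeholder):
# for port in 22 80 443; do
#   if port_open primary.example.com "$port"; then
#     echo "port $port reachable"
#   else
#     echo "port $port NOT reachable"
#   fi
# done
```

A successful TCP connection only shows that the port is not firewalled; it does not prove that GitLab itself answers correctly on it.
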

Make sure you have restored the database backup (this is part of setting
up replication) and that the primary node's PostgreSQL instance is ready to
replicate data.

### Database Encryption Key

GitLab stores a unique encryption key on disk that is used to safely store
sensitive data in the database.

Every secondary node must have the **exact same value** for `db_key_base` as
defined in the primary one.

- For Omnibus installations it is stored at `/etc/gitlab/gitlab-secrets.json`.
- For installations from source it is stored at `/home/git/gitlab/config/secrets.yml`.

Find that key in the primary node and copy and paste its value into the secondaries.
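
To confirm that the value was copied correctly without printing the key itself, you could compare a checksum on each node. This is a rough sketch, assuming `db_key_base` and its value sit on a single line of the secrets file and that `sha256sum` is installed. The `db_key_fingerprint` helper is hypothetical and not part of GitLab.

```bash
# Hypothetical helper (not part of GitLab): print a SHA-256 fingerprint of the
# line(s) mentioning db_key_base in the secrets file given as $1, so that two
# nodes can be compared without displaying the key itself.
db_key_fingerprint() {
  # Whitespace and commas are dropped so that indentation or the position of
  # the key within the file does not change the fingerprint.
  grep 'db_key_base' "$1" | tr -d '[:space:],' | sha256sum | cut -d ' ' -f 1
}

# Example usage (Omnibus path from this document; needs read access to it):
# db_key_fingerprint /etc/gitlab/gitlab-secrets.json
```

Run it on the primary and on each secondary; the printed values should be identical. The comparison is only meaningful between nodes that use the same installation type, since the JSON and YAML files format the line differently.
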

### Enable the secondary GitLab instance

Your new GitLab secondary node can now be safely started.

1. [Create the SSH key pair][ssh-pair] for the secondary node.
1. Visit the primary node's **Admin Area > Geo Nodes** (`/admin/geo_nodes`).
1. Add your secondary node by providing its full URL and the public SSH key
   you created previously.
1. Hit the **Add node** button.

---

After the **Add node** button is pressed, the primary node will start to notify
the secondary of changes. Make sure the secondary instance is running and
accessible.

The two most obvious issues that replication can have here are:

- Database replication not working well
- Instance to instance notification not working. In that case, it can be
  one of the following:
    - You are using a custom certificate or custom CA (see the
      [Troubleshooting](#troubleshooting) section)
    - Instance is firewalled (check your firewall rules)

### Repositories data replication

Getting a new secondary Geo node up and running also requires the
repositories directory to be synced from the primary node. You can use `rsync`
for that. Assuming `1.2.3.4` is the IP of the primary node, SSH into the
secondary and run:

```bash
# For Omnibus installations
rsync -guavrP root@1.2.3.4:/var/opt/gitlab/git-data/repositories/ /var/opt/gitlab/git-data/repositories/
gitlab-ctl reconfigure # to fix directory permissions

# For installations from source
rsync -guavrP root@1.2.3.4:/home/git/repositories/ /home/git/repositories/
chmod ug+rwX,o-rwx /home/git/repositories
```

If this step is skipped, the secondary node will eventually clone and
fetch every missing repository as it is updated with new commits on the
primary node, so syncing the repositories beforehand will buy you some time.

While active repositories will eventually be replicated, if you don't rsync
the files, any archived/inactive repositories will never reach the secondary node,
as Geo doesn't run any routine task to look for missing repositories.
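
Because Geo will not notice missing repositories on its own, a quick sanity check after the `rsync` run can help. The sketch below assumes that repositories are stored as directories whose names end in `.git` under the paths used above. The `count_repos` helper is hypothetical and not part of GitLab.

```bash
# Hypothetical helper (not part of GitLab): count bare repositories
# (directories whose names end in .git) under the directory given as $1.
count_repos() {
  # -prune stops find from descending into a repository once it has matched.
  find "$1" -type d -name '*.git' -prune -print | wc -l | tr -d ' '
}

# Example usage (Omnibus path from this document):
# count_repos /var/opt/gitlab/git-data/repositories
```

Run it on both the primary and the secondary. Matching counts do not prove that every repository is complete, but a mismatch is a clear sign that the sync should be repeated.
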

### Authorized keys regeneration

The final step is to regenerate the keys for `~/.ssh/authorized_keys` using
the command below (HTTPS clone will still work without this extra step).

On the secondary node where the database is [already replicated](./database.md),
run:

```
# For Omnibus installations
gitlab-rake gitlab:shell:setup

# For source installations
sudo -u git -H bundle exec rake gitlab:shell:setup RAILS_ENV=production
```

This will enable `git` operations to authorize against your existing users.
New users and SSH keys updated after this step will be replicated automatically.

### Ready to use

Your instance should be ready to use. You can visit the Admin area in the
secondary node to check if it's correctly identified as a secondary Geo node and
if Geo is enabled.

If your installation isn't working properly, check the
[troubleshooting](#troubleshooting) section.

## Create SSH key pairs for new Geo nodes

>**Note:**
These are general instructions to create a new SSH key pair for a new Geo node,
either primary or secondary.

---

When adding a new Geo node, you must provide an SSH public key of the user that
your GitLab instance runs as (unless changed, this is the user `git`). This
user will act as a "normal user" who fetches from the primary Geo node.

1. Run the command below on each server that will be a Geo node:

    ```bash
    sudo -u git -H ssh-keygen
    ```

1. Get the contents of the `id_rsa.pub` file that was just created:

    ```
    # Omnibus installations
    sudo -u git cat /var/opt/gitlab/.ssh/id_rsa.pub

    # Installations from source
    sudo -u git cat /home/git/.ssh/id_rsa.pub
    ```

1. Copy it to the admin area of the **primary** node (**Admin Area > Geo Nodes**).

---

If for any reason you generate the key using a different name from the default
`id_rsa`, or you want to generate an extra key only for the repository
synchronization feature, you can do so, but you have to create/modify your
`~/.ssh/config` (for the `git` user).

This is an example of how to change the default key for all remote hosts:

```bash
# Match all remote hosts
Host *
  # The location of your private key
  IdentityFile ~/.ssh/mycustom.key
```

This is how to change it for a specific host:

```bash
# The FQDN of the primary Geo node
Host example.com
  # The FQDN of the primary Geo node
  HostName example.com
  # The location of your private key
  IdentityFile ~/.ssh/mycustom.key
```

### Add the primary node to the `known_hosts` file of the secondary nodes

>**Note:**
This operation is only needed for the secondary nodes.

---

The secondary nodes need to know the SSH fingerprint of the primary node that
will be used for the Git clone/fetch operations. To add it to the
`known_hosts` file, run the following command in the terminal of a secondary
node and type `yes` when asked:

```
sudo -u git -H ssh git@<primary-node-url>
```

Replace `<primary-node-url>` with the FQDN of the primary node. You can verify
that the fingerprint was added by checking:

- `/var/opt/gitlab/.ssh/known_hosts` for Omnibus installations or
- `/home/git/.ssh/known_hosts` for installations from source
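
Rather than reading the file by eye, the check can be scripted with `ssh-keygen -F`, which looks up a host name in a `known_hosts` file and also matches hashed entries. This is a small sketch under that assumption. The `host_is_known` helper and the `primary.example.com` hostname are hypothetical.

```bash
# Hypothetical helper (not part of GitLab): succeeds if host $1 has an entry
# in the known_hosts file $2. ssh-keygen -F also finds hashed host names,
# which a plain grep would miss.
host_is_known() {
  ssh-keygen -F "$1" -f "$2" >/dev/null 2>&1
}

# Example usage on an Omnibus secondary (the hostname is a placeholder):
# if host_is_known primary.example.com /var/opt/gitlab/.ssh/known_hosts; then
#   echo "primary is in known_hosts"
# else
#   echo "primary is missing from known_hosts"
# fi
```
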

## Troubleshooting

Setting up Geo requires careful attention to detail, and it's easy to
miss a step. Here is a checklist of questions to ask when trying to find
what needs fixing (all commands and path locations are for Omnibus installs):

- Is Postgres replication working?
- Are my nodes pointing to the correct database instance?
    - You should make sure your primary Geo node points to the instance with
      write permissions.
    - Any secondary nodes should point only to read-only instances.
- Can Geo detect my current node correctly?
    - Geo takes the nodes you defined in the `Admin > Geo` screen and tries to match
      them with the value defined in the `/etc/gitlab/gitlab.rb` configuration file.
      The relevant line looks like: `external_url "http://gitlab.example.com"`.
    - To check if the node on the current machine is correctly detected, type:

        ```
        sudo gitlab-rails runner "Gitlab::Geo.current_node"
        ```

        and expect something like:

        ```
        #<GeoNode id: 2, schema: "https", host: "gitlab.example.com", port: 443, relative_url_root: "", primary: false, ...>
        ```

    - In the output of the command above, `primary` should be `true` when executed on
      the primary node, and `false` on any secondary.
- Did I define the correct SSH Key for the node?
    - You must create an SSH Key for the `git` user.
    - This is the key you must enter in `Admin > Geo`.
- Can I SSH from the secondary to the primary node using the `git` user account?
    - This is the most common cause of repository replication problems.
      If you haven't added the primary node's key to `known_hosts`, you will end up with
      a lot of failed sidekiq jobs with an error similar to:

        ```
        Gitlab::Shell::Error: Host key verification failed. fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists.
        ```

        An easy fix is to log in as the `git` user on the secondary node and run:

        ```
        # remove old entries to your primary gitlab in known_hosts
        ssh-keygen -R your-primary-gitlab.example.com

        # add a new entry in known_hosts
        ssh-keyscan -t rsa your-primary-gitlab.example.com >> ~/.ssh/known_hosts
        ```

- Can the primary node communicate with the secondary node over the HTTP/HTTPS ports?
- Can the secondary nodes communicate with the primary node over the HTTP/HTTPS/SSH ports?
- Can the secondary nodes execute a successful git clone from a primary node
  repository using the `git` user's own SSH Key?
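
For the "Can Geo detect my current node correctly?" item, it can help to print the URL that `/etc/gitlab/gitlab.rb` actually sets and compare it with the one entered in `Admin > Geo`. This is a minimal sketch, assuming the `external_url` line has the quoted form shown in the checklist. The `configured_external_url` helper is hypothetical and not part of GitLab.

```bash
# Hypothetical helper (not part of GitLab): print the value of the last
# uncommented external_url line in the gitlab.rb-style file given as $1.
configured_external_url() {
  # Lines starting with '#' do not match, so commented-out examples are skipped.
  sed -n "s/^[[:space:]]*external_url[[:space:]]*[\"']\([^\"']*\)[\"'].*/\1/p" "$1" | tail -n 1
}

# Example usage on an Omnibus node:
# configured_external_url /etc/gitlab/gitlab.rb
```

The printed value should correspond to the URL of the matching node in `Admin > Geo`, including the scheme (`http` or `https`).
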

>**Note:**
This list is an attempt to document all the moving parts that can go wrong.
We are working on getting all these steps verified automatically by a
rake task in the future.

[ssh-pair]: #create-ssh-key-pairs-for-new-geo-nodes