/nagios/README.md

https://github.com/patcon/cookbooks

Description
===========

Installs and configures Nagios 3 for a server and for clients using Chef search capabilities.

Changes
=======

## v1.0.4:

* [COOK-838] - Add HTTPS Option to Nagios Cookbook

## v1.0.2:

* [COOK-636] - Nagios server recipe attempts to start too soon
* [COOK-815] - Nagios Config Changes Kill Nagios If Config Goes Bad

## v1.0.0:

* Use Chef 0.10's `node.chef_environment` instead of `node['app_environment']`.
* source installation support on both client and server sides
* initial RHEL/CentOS/Fedora support

Requirements
============

Chef
----

Chef version 0.10.0+ is required for chef environment usage. See __Environments__ under __Usage__ below.

A data bag named 'users' should exist; see __Data Bags__ below.

The monitoring server that uses this recipe should have a role named 'monitoring' or similar. The role name is settable via an attribute; see __Attributes__ below.

Because of the heavy use of search, this recipe will not work with Chef Solo, which cannot do any searches without a server.

Platform
--------

* Debian, Ubuntu
* RHEL, CentOS, Fedora

Tested on Ubuntu 10.04 and CentOS 5.5.

Cookbooks
---------

* apache2
* build-essential
* php

Attributes
==========

default
-------

The following attributes are used by both client and server recipes.

* `node['nagios']['user']` - nagios user, default 'nagios'.
* `node['nagios']['group']` - nagios group, default 'nagios'.
* `node['nagios']['plugin_dir']` - location where nagios plugins go, default '/usr/lib/nagios/plugins'.

client
------

The following attributes are used for the client NRPE checks for warning and critical levels.

* `node['nagios']['client']['install_method']` - whether to install from package or source. Default chosen by platform based on known packages available for Nagios 3: 'package' on debian/ubuntu, 'source' on redhat/centos/fedora/scientific.
* `node['nagios']['plugins']['url']` - url to retrieve the plugins source
* `node['nagios']['plugins']['version']` - version of the plugins
* `node['nagios']['plugins']['checksum']` - checksum of the plugins source tarball
* `node['nagios']['nrpe']['home']` - home directory of nrpe, default /usr/lib/nagios
* `node['nagios']['nrpe']['conf_dir']` - location of the nrpe configuration, default /etc/nagios
* `node['nagios']['nrpe']['url']` - url to retrieve nrpe source
* `node['nagios']['nrpe']['version']` - version of nrpe to download
* `node['nagios']['nrpe']['checksum']` - checksum of the nrpe source tarball
* `node['nagios']['checks']['memory']['critical']` - threshold of critical memory usage, default 150
* `node['nagios']['checks']['memory']['warning']` - threshold of warning memory usage, default 250
* `node['nagios']['checks']['load']['critical']` - threshold of critical load average, default 30,20,10
* `node['nagios']['checks']['load']['warning']` - threshold of warning load average, default 15,10,5
* `node['nagios']['checks']['smtp_host']` - default relayhost to check for connectivity. Default is an empty string, set via an attribute in a role.
* `node['nagios']['server_role']` - the role that the nagios server will have in its run list that the clients can search for.
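
To illustrate how the load thresholds above might end up in an NRPE command definition, here is a minimal sketch in plain Ruby. The `nagios` hash is a hypothetical stand-in for the node attributes, and `nrpe_load_command` is an illustrative helper, not part of the cookbook; the actual `nrpe.cfg.erb` template may render the command differently.

```ruby
# Hypothetical stand-in for the node attributes documented above.
nagios = {
  "plugin_dir" => "/usr/lib/nagios/plugins",
  "checks" => {
    "load" => { "warning" => "15,10,5", "critical" => "30,20,10" }
  }
}

# Build an NRPE command definition for the check_load plugin.
# check_load takes the 1, 5 and 15 minute load averages, comma separated,
# for both its -w (warning) and -c (critical) options.
def nrpe_load_command(nagios)
  thresholds = nagios["checks"]["load"]
  plugin = File.join(nagios["plugin_dir"], "check_load")
  "command[check_load]=#{plugin} -w #{thresholds['warning']} -c #{thresholds['critical']}"
end

puts nrpe_load_command(nagios)
```

Overriding `node['nagios']['checks']['load']['warning']` in a role would then change only the `-w` argument in the rendered line.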

server
------

Default directory locations are based on FHS. Change to suit your preferences.

* `node['nagios']['server']['install_method']` - whether to install from package or source. Default chosen by platform based on known packages available for Nagios 3: 'package' on debian/ubuntu, 'source' on redhat/centos/fedora/scientific.
* `node['nagios']['server']['service_name']` - name of the service used for nagios, default chosen by platform: debian/ubuntu "nagios3", redhat family "nagios", all others "nagios"
* `node['nagios']['home']` - nagios main home directory, default "/usr/lib/nagios3"
* `node['nagios']['conf_dir']` - location where main nagios config lives, default "/etc/nagios3"
* `node['nagios']['config_dir']` - location where included configuration files live, default "/etc/nagios3/conf.d"
* `node['nagios']['log_dir']` - location of nagios logs, default "/var/log/nagios3"
* `node['nagios']['cache_dir']` - location of cached data, default "/var/cache/nagios3"
* `node['nagios']['state_dir']` - nagios runtime state information, default "/var/lib/nagios3"
* `node['nagios']['run_dir']` - where pidfiles are stored, default "/var/run/nagios3"
* `node['nagios']['docroot']` - nagios webui docroot, default "/usr/share/nagios3/htdocs"
* `node['nagios']['enable_ssl']` - boolean for whether nagios web server should be https, default false
* `node['nagios']['http_port']` - port that the apache server should listen on, determined by whether ssl is enabled (443 if so, otherwise 80)
* `node['nagios']['server_name']` - common name to use in a server cert, default "nagios"
* `node['nagios']['ssl_req']` - info to use in a cert, default `/C=US/ST=Several/L=Locality/O=Example/OU=Operations/CN=#{node['nagios']['server_name']}/emailAddress=ops@#{node['nagios']['server_name']}`
* `node['nagios']['notifications_enabled']` - set to 1 to enable notification.
* `node['nagios']['check_external_commands']`
* `node['nagios']['default_contact_groups']`
* `node['nagios']['sysadmin_email']` - default notification email.
* `node['nagios']['sysadmin_sms_email']` - default notification sms.
* `node['nagios']['server_auth_method']` - authentication with the server can be done with openid (using `apache2::mod_auth_openid`), or htauth (basic). The default is openid; any other value will use htauth (basic).
* `node['nagios']['templates']`
* `node['nagios']['interval_length']` - minimum interval.
* `node['nagios']['default_host']['check_interval']`
* `node['nagios']['default_host']['retry_interval']`
* `node['nagios']['default_host']['max_check_attempts']`
* `node['nagios']['default_host']['notification_interval']`
* `node['nagios']['default_service']['check_interval']`
* `node['nagios']['default_service']['retry_interval']`
* `node['nagios']['default_service']['max_check_attempts']`
* `node['nagios']['default_service']['notification_interval']`
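
The SSL-related defaults above depend on each other: the port follows `enable_ssl`, and the certificate subject is built from `server_name`. The following is a rough sketch of that relationship in plain Ruby. The method names `default_http_port` and `default_ssl_req` are hypothetical and exist only for this illustration; the cookbook's attribute file may compute these values differently.

```ruby
# Hypothetical helper: choose the Apache listen port from the enable_ssl flag,
# 443 when SSL is enabled and 80 otherwise, as described in the list above.
def default_http_port(enable_ssl)
  enable_ssl ? 443 : 80
end

# Hypothetical helper: build the certificate subject from the server name,
# mirroring the documented default for node['nagios']['ssl_req'].
def default_ssl_req(server_name)
  "/C=US/ST=Several/L=Locality/O=Example/OU=Operations/" \
    "CN=#{server_name}/emailAddress=ops@#{server_name}"
end

puts default_http_port(false)
puts default_http_port(true)
puts default_ssl_req("nagios")
```

Changing `server_name` therefore changes both the CN and the email address in the default subject.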

Recipes
=======

default
-------

Includes the `nagios::client` recipe.

client
------

Includes the correct client installation recipe based on platform, either `nagios::client_package` or `nagios::client_source`.

The client recipe searches for servers allowed to connect via NRPE that have a role named in the `node['nagios']['server_role']` attribute. The recipe will also install the required packages and start the NRPE service. A custom plugin for checking memory is also added.

Searches are confined to the node's `chef_environment`.

Client commands for NRPE can be modified by editing the nrpe.cfg.erb template.
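
The server search described above can be pictured as a query that combines the server role with the node's environment. Below is a minimal sketch in plain Ruby; `server_search_query` and `allowed_hosts` are hypothetical helpers, the `servers` array stands in for real search results, and always including `127.0.0.1` is an assumption made for the example. In a recipe, a query string like this would be passed to Chef's `search(:node, ...)`.

```ruby
# Build a search query for nodes that hold the server role
# within a single Chef environment.
def server_search_query(server_role, environment)
  "role:#{server_role} AND chef_environment:#{environment}"
end

# Turn the matching nodes into a value for NRPE's allowed_hosts directive.
# Including 127.0.0.1 is an assumption made for this sketch.
def allowed_hosts(server_nodes)
  (["127.0.0.1"] + server_nodes.map { |n| n["ipaddress"] }).uniq.join(",")
end

# Hypothetical search result for one monitoring server.
servers = [{ "ipaddress" => "10.0.0.5" }]

puts server_search_query("monitoring", "production")
puts allowed_hosts(servers)
```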

client\_package
--------------

Installs the Nagios client libraries from packages. Default for Debian / Ubuntu systems.

client\_source
-------------

Installs the Nagios client libraries from source. Default for Red Hat / CentOS / Fedora systems as native packages of Nagios 3 are not available in the default repositories.

server
------

Includes the correct server installation recipe based on platform, either `nagios::server_package` or `nagios::server_source`.

The server recipe sets up Apache as the web front end. The `nagios::client` recipe is also included. This recipe also does a number of searches to dynamically build the hostgroups to monitor, the hosts that belong to them and the admins to notify of events/alerts.

Searches are confined to the node's `chef_environment`.

The recipe does the following:

1. Searches for members of the sysadmin group by searching through the 'users' data bag and adds them to a list for notification/contacts.
2. Searches all nodes for a role matching the app_environment.
3. Searches all available roles and builds a list which will be the Nagios hostgroups.
4. Searches for all nodes of each role and adds the hostnames to the hostgroups.
5. Installs various packages required for the server.
6. Sets up some configuration directories.
7. Moves the package-installed Nagios configuration to a 'dist' directory.
8. Disables the 000-default VirtualHost present on Debian/Ubuntu Apache2 package installations.
9. Enables the Nagios web front end configuration.
10. Sets up the configuration templates for services, contacts, hostgroups and hosts.

*NOTE*: You will probably need to change the services.cfg.erb template for your environment.

Custom commands for service checks can be added on a per-role basis by editing the 'services.cfg.erb' template. This template has some pre-configured checks that use role names from an example infrastructure. Here's a brief description:

* monitoring - check\_smtp (e.g., postfix relayhost) w/ NRPE and tcp port 514 (e.g., rsyslog)
* load\_balancer - check\_nginx with NRPE.
* appserver - check\_unicorn with NRPE, e.g. a Rails application using Unicorn.
* database\_master - check\_mysql\_server with NRPE for a MySQL database master.
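
The hostgroup-building steps in the numbered list above (roles become hostgroups, and nodes are added to the hostgroup of each role they hold) can be sketched in plain Ruby as follows. The `nodes` and `roles` data and the `build_hostgroups` method are hypothetical stand-ins for Chef search results and the recipe's own logic, which may be organized differently.

```ruby
# Hypothetical in-memory stand-ins for Chef search results.
nodes = [
  { "hostname" => "web1", "roles" => ["appserver"],       "chef_environment" => "production" },
  { "hostname" => "db1",  "roles" => ["database_master"], "chef_environment" => "production" },
  { "hostname" => "web9", "roles" => ["appserver"],       "chef_environment" => "staging" }
]
roles = ["appserver", "database_master", "load_balancer"]

# Map each role name to the sorted hostnames of the nodes that hold it,
# considering only nodes in the given environment.
def build_hostgroups(roles, nodes, environment)
  in_env = nodes.select { |n| n["chef_environment"] == environment }
  roles.each_with_object({}) do |role, groups|
    members = in_env.select { |n| n["roles"].include?(role) }
    groups[role] = members.map { |n| n["hostname"] }.sort
  end
end

build_hostgroups(roles, nodes, "production").each do |group, hosts|
  puts "#{group}: #{hosts.join(', ')}"
end
```

A role with no matching nodes, such as `load_balancer` here, still yields an empty hostgroup in this sketch; whether the real templates emit empty hostgroups is not stated in this README.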

server\_package
--------------

Installs the Nagios server libraries from packages. Default for Debian / Ubuntu systems.

server\_source
-------------

Installs the Nagios server libraries from source. Default for Red Hat / CentOS / Fedora systems as native packages of Nagios 3 are not available in the default repositories.

Data Bags
=========

Create a `users` data bag that will contain the users that will be able to log into the Nagios webui. Each user can use htauth with a specified password, or an openid. Users that should be able to log in should be in the sysadmin group. Example user data bag item:

    {
      "id": "nagiosadmin",
      "groups": "sysadmin",
      "htpasswd": "hashed_htpassword",
      "openid": "http://nagiosadmin.myopenid.com/",
      "nagios": {
        "pager": "nagiosadmin_pager@example.com",
        "email": "nagiosadmin@example.com"
      }
    }

When using server_auth_method 'openid', use the openid in the data bag item. Any other value for this attribute (e.g., "htauth", "htpasswd", etc.) will use the htpasswd value as the password in `/etc/nagios3/htpasswd.users`.

The openid must have the http:// and trailing /. The htpasswd must be the hashed value. Get this value with htpasswd:

    % htpasswd -n -s nagiosadmin
    New password:
    Re-type new password:
    nagiosadmin:{SHA}oCagzV4lMZyS7jl2Z0WlmLxEkt4=

For example, use the `{SHA}oCagzV4lMZyS7jl2Z0WlmLxEkt4=` value in the data bag.
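
If `htpasswd` is not at hand, the same `{SHA}` format can be produced in Ruby: it is the Base64 encoding of the password's raw SHA-1 digest, prefixed with `{SHA}`. The sketch below is only an illustration; `htpasswd_sha` is a hypothetical helper, and the password and item values are placeholders.

```ruby
require "digest/sha1"
require "json"

# Produce the value that `htpasswd -n -s` prints after the colon:
# "{SHA}" followed by the Base64-encoded raw SHA-1 digest of the password.
def htpasswd_sha(password)
  "{SHA}" + [Digest::SHA1.digest(password)].pack("m0")
end

# Placeholder data bag item; replace the password and addresses with real values.
item = {
  "id" => "nagiosadmin",
  "groups" => "sysadmin",
  "htpasswd" => htpasswd_sha("example-password"),
  "nagios" => {
    "pager" => "nagiosadmin_pager@example.com",
    "email" => "nagiosadmin@example.com"
  }
}

puts JSON.pretty_generate(item)
```

The printed JSON can be saved to a file and uploaded as an item in the `users` data bag.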

Roles
=====

Create a role to use for the monitoring server. The role name should match the value of the attribute `node['nagios']['server_role']`. By default, this is 'monitoring'. For example:

    % cat roles/monitoring.rb
    name "monitoring"
    description "Monitoring server"
    run_list(
      "recipe[nagios::server]"
    )

    default_attributes(
      "nagios" => {
        "server_auth_method" => "htauth"
      }
    )

    % knife role from file monitoring.rb

Definitions
===========

nagios_conf
-----------

This definition is used to drop in a configuration file in the base Nagios configuration directory's conf.d. This can be used for customized configurations for various services.

Libraries
=========

default
-------

The library included with the cookbook provides some helper methods used in templates.

* nagios_boolean
* nagios_interval - calculates interval based on interval length and a given number of seconds.
* nagios_attr - retrieves a nagios attribute from the node.
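
As a rough idea of what two of these helpers might compute, here is a sketch in plain Ruby. The signatures are assumptions: the cookbook's helpers presumably read `interval_length` from the node rather than taking it as an argument, and rendering booleans as "1" or "0" is inferred from the `notifications_enabled` attribute being set to 1.

```ruby
# Assumed behavior: render a Ruby boolean as the "1" / "0" strings
# that Nagios configuration files use for on/off options.
def nagios_boolean(value)
  value ? "1" : "0"
end

# Assumed behavior: convert a number of seconds into Nagios time units,
# where one unit lasts interval_length seconds.
def nagios_interval(seconds, interval_length)
  seconds.to_f / interval_length
end

puts nagios_boolean(true)
puts nagios_interval(300, 60)
```

With an `interval_length` of 60, a check that should run every 300 seconds would be written as an interval of 5 units.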

Usage
=====

See below under __Environments__ for how to set up a Chef 0.10 environment for use with this cookbook.

For a Nagios server, create a role named 'monitoring', and add the following recipe to the run_list:

    recipe[nagios::server]

This will allow client nodes to search for the server by this role and add its IP address to the allowed list for NRPE.

To install Nagios and NRPE on a client node:

    include_recipe "nagios::client"

This is a fairly complicated cookbook. For a walkthrough and example usage please see [Opscode's Nagios Quick Start](http://help.opscode.com/kb/otherhelp/nagios-quick-start).

Environments
------------

The searches used are confined to the node's `chef_environment`. If you do not use any environments (Chef 0.10+ feature) the `_default` environment is used, which is applied to all nodes in the Chef Server that are not in another defined environment. To use environments, create them as files in your chef-repo, then upload them to the Chef Server.

    % cat environments/production.rb
    name "production"
    description "Systems in the Production Environment"

    % knife environment from file production.rb

License and Author
==================

Author:: Joshua Sierles <joshua@37signals.com>
Author:: Nathan Haneysmith <nathan@opscode.com>
Author:: Joshua Timberman <joshua@opscode.com>
Author:: Seth Chisamore <schisamo@opscode.com>

Copyright 2009, 37signals
Copyright 2009-2011, Opscode, Inc

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.