src/ceph/doc/mgr/prometheus.rst

=================
Prometheus plugin
=================

Provides a Prometheus exporter to pass on Ceph performance counters
from the collection point in ceph-mgr.  Ceph-mgr receives MMgrReport
messages from all MgrClient processes (mons and OSDs, for instance)
with performance counter schema data and actual counter data, and keeps
a circular buffer of the last N samples.  This plugin creates an HTTP
endpoint (like all Prometheus exporters) and retrieves the latest sample
of every counter when polled (or "scraped" in Prometheus terminology).
The HTTP path and query parameters are ignored; all extant counters
for all reporting entities are returned in text exposition format.
(See the Prometheus `documentation <https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details>`_.)
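As a rough illustration of what a scrape looks like from the client side,
the following Python sketch fetches the endpoint and lists the distinct
metric names it returns.  The URL (host ``localhost``, the default port
``9283`` and the ``/metrics`` path) and the helper names ``scrape`` and
``metric_names`` are assumptions made for this example; since the module
ignores the path, any path should behave the same.

```python
import re
from urllib.request import urlopen

# Assumed location of the exporter; adjust host and port for your cluster.
EXPORTER_URL = "http://localhost:9283/metrics"

# A sample line starts with the metric name, optionally followed by {labels}.
_NAME_RE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def scrape(url=EXPORTER_URL, timeout=10):
    """Fetch one scrape and return the body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_names(text):
    """Return the sorted, distinct metric names found in a scrape body."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _NAME_RE.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)
```

Against a running ceph-mgr, ``metric_names(scrape())`` would return the
names of all series currently exported.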

Enabling prometheus output
==========================

The *prometheus* module is enabled with::

  ceph mgr module enable prometheus

Configuration
-------------

By default the module accepts HTTP requests on port ``9283`` on all
IPv4 and IPv6 addresses on the host.  The listen address and port are both
configurable with ``ceph config-key set``, using the keys
``mgr/prometheus/server_addr`` and ``mgr/prometheus/server_port``
respectively.
Port ``9283`` is registered with Prometheus's `registry <https://github.com/prometheus/prometheus/wiki/Default-port-allocations>`_
of default port allocations.

Statistic names and labels
==========================

The names of the stats are exactly as Ceph names them, with
illegal characters ``.``, ``-`` and ``::`` translated to ``_``,
and ``ceph_`` prefixed to all names.
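The renaming rule can be sketched in a few lines of Python.  This is an
illustration of the rule as described above, not the module's actual code;
the function name ``to_prometheus_name``, the example counter names, and the
choice to map each ``::`` to a single ``_`` are assumptions.

```python
def to_prometheus_name(ceph_name):
    """Translate a Ceph counter name into a Prometheus-style metric name.

    Assumption: '::' collapses to a single '_', so it is replaced first;
    otherwise the rule follows the description in the text.
    """
    name = ceph_name
    for illegal in ("::", ".", "-"):
        name = name.replace(illegal, "_")
    return "ceph_" + name


# Hypothetical counter names, used only to show the transformation.
print(to_prometheus_name("osd.op_r_latency"))      # ceph_osd_op_r_latency
print(to_prometheus_name("throttle-msgr::bytes"))  # ceph_throttle_msgr_bytes
```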

All *daemon* statistics have a ``ceph_daemon`` label such as "osd.123"
that identifies the type and ID of the daemon they come from.  Some
statistics can come from different types of daemon, so when querying
e.g. an OSD's RocksDB stats, you would probably want to filter
on ``ceph_daemon`` starting with "osd" to avoid mixing in the monitor
RocksDB stats.
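One way to express such a filter is a PromQL regular-expression matcher on
``ceph_daemon``.  The Python sketch below only builds the selector string;
the helper name ``daemon_type_selector`` and the metric name in the example
are hypothetical, so substitute a metric that your cluster actually exports.

```python
def daemon_type_selector(metric, daemon_type):
    """Build a PromQL selector matching one daemon type, e.g. all OSDs.

    PromQL regex matchers are fully anchored, so "osd[.].*" matches
    "osd.0" and "osd.123" but not "mon.a".
    """
    return '%s{ceph_daemon=~"%s[.].*"}' % (metric, daemon_type)


# "ceph_rocksdb_example_counter" is a made-up metric name for illustration.
print(daemon_type_selector("ceph_rocksdb_example_counter", "osd"))
# ceph_rocksdb_example_counter{ceph_daemon=~"osd[.].*"}
```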

The *cluster* statistics (i.e. those global to the Ceph cluster)
have labels appropriate to what they report on.  For example,
metrics relating to pools have a ``pool_id`` label.

Pool and OSD metadata series
----------------------------

Special series are output to enable displaying and querying on
certain metadata fields.

Pools have a ``ceph_pool_metadata`` series like this:

::

    ceph_pool_metadata{pool_id="2",name="cephfs_metadata_a"} 0.0

OSDs have a ``ceph_osd_metadata`` series like this:

::

    ceph_osd_metadata{cluster_addr="172.21.9.34:6802/19096",device_class="ssd",id="0",public_addr="172.21.9.34:6801/19096",weight="1.0"} 0.0
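Because the metadata is carried entirely in labels, a consumer only needs
to parse the label set.  The sketch below parses the two sample lines above
and builds a lookup from ``pool_id`` to pool name.  The regular expressions
and the helper name ``parse_sample`` are assumptions for this example, and
the parser is deliberately simplified: it does not handle escaped quotes
inside label values.

```python
import re

_SAMPLE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
                        r'(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$')
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_sample(line):
    """Split one exposition-format line into (name, labels, value)."""
    match = _SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError("not a sample line: %r" % line)
    labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


lines = [
    'ceph_pool_metadata{pool_id="2",name="cephfs_metadata_a"} 0.0',
    'ceph_osd_metadata{cluster_addr="172.21.9.34:6802/19096",'
    'device_class="ssd",id="0",public_addr="172.21.9.34:6801/19096",'
    'weight="1.0"} 0.0',
]

# Map pool_id -> pool name using only the metadata series.
pool_names = {}
for line in lines:
    name, labels, _value = parse_sample(line)
    if name == "ceph_pool_metadata":
        pool_names[labels["pool_id"]] = labels["name"]

print(pool_names)  # {'2': 'cephfs_metadata_a'}
```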

Correlating drive statistics with node_exporter
-----------------------------------------------

The Prometheus output from Ceph is designed to be used in conjunction
with the generic host monitoring from the Prometheus node_exporter.

To enable correlation of Ceph OSD statistics with node_exporter's
drive statistics, special series are output like this:

::

    ceph_disk_occupation{ceph_daemon="osd.0",device="sdd",instance="myhost",job="ceph"}

To use this to get disk statistics by OSD ID, use the ``and on`` syntax
in your Prometheus query like this:

::

    rate(node_disk_bytes_written[30s]) and on (device,instance) ceph_disk_occupation{ceph_daemon="osd.0"}

See the Prometheus documentation for more information about constructing
queries.
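To make the matching concrete, the following Python sketch imitates what
``and on (device,instance)`` does: it keeps a left-hand sample only if some
right-hand sample carries the same values for the listed labels.  The sample
data (hosts, devices, rates and the occupation value) and the function name
``and_on`` are invented for illustration; real evaluation happens inside
Prometheus.

```python
def and_on(left, right, on):
    """Keep samples from `left` that have a match in `right` on labels `on`.

    Each sample is a (labels, value) pair. As with PromQL's `and`, the
    result keeps the left-hand labels and values; `right` only filters.
    """
    right_keys = {tuple(labels.get(k) for k in on) for labels, _ in right}
    return [(labels, value) for labels, value in left
            if tuple(labels.get(k) for k in on) in right_keys]


# Invented node_exporter-style write rates, in bytes per second.
disk_rates = [
    ({"device": "sdc", "instance": "myhost"}, 1024.0),
    ({"device": "sdd", "instance": "myhost"}, 4096.0),
    ({"device": "sdd", "instance": "otherhost"}, 512.0),
]
# The ceph_disk_occupation series for osd.0 (the value 1.0 is assumed).
occupation = [
    ({"ceph_daemon": "osd.0", "device": "sdd",
      "instance": "myhost", "job": "ceph"}, 1.0),
]

print(and_on(disk_rates, occupation, on=("device", "instance")))
# [({'device': 'sdd', 'instance': 'myhost'}, 4096.0)]
```

Only the ``sdd`` disk on ``myhost`` survives, because it is the only
left-hand sample whose ``device`` and ``instance`` both match the
occupation series.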

Note that for this mechanism to work, Ceph and node_exporter must agree
about the values of the ``instance`` label.  See the following section
for guidance about how to set up Prometheus in a way that sets
``instance`` properly.

Configuring Prometheus server
=============================

See the Prometheus documentation for full details of how to add
scrape endpoints: the notes in this section are tips on how to
configure Prometheus to capture the Ceph statistics in the most
usefully-labelled form.

This configuration is necessary because Ceph is reporting metrics
from many hosts and services via a single endpoint, and some of those
metrics (such as pool statistics) relate to no physical host.

honor_labels
------------

To enable Ceph to output properly-labelled data relating to any host,
use the ``honor_labels`` setting when adding the ceph-mgr endpoints
to your Prometheus configuration.

Without this setting, any ``instance`` labels that Ceph outputs, such
as those in ``ceph_disk_occupation`` series, will be overridden
by Prometheus.
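The effect of the setting can be sketched as a label-merging rule.
According to the Prometheus configuration documentation, when
``honor_labels`` is false a conflicting scraped label is renamed with an
``exported_`` prefix, and when it is true the scraped label wins.  The
Python below is a simplified model of that rule, not Prometheus code; the
function name ``merge_labels`` and the target label values are assumptions.

```python
def merge_labels(scraped, target, honor_labels):
    """Model how scraped labels and server-side target labels combine."""
    merged = dict(target)
    for key, value in scraped.items():
        if key in target and not honor_labels:
            # Conflict: keep the server's label, rename the scraped one.
            merged["exported_" + key] = value
        else:
            # No conflict, or honor_labels lets the scraped label win.
            merged[key] = value
    return merged


scraped = {"ceph_daemon": "osd.0", "device": "sdd", "instance": "myhost"}
target = {"instance": "ceph_cluster", "job": "ceph"}

print(merge_labels(scraped, target, honor_labels=False)["instance"])
# ceph_cluster
print(merge_labels(scraped, target, honor_labels=True)["instance"])
# myhost
```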

Ceph instance label
-------------------

By default, Prometheus applies an ``instance`` label that includes
the hostname and port of the endpoint that the series came from.  Because
Ceph clusters have multiple manager daemons, this results in an ``instance``
label that changes spuriously when the active manager daemon changes.

Set a custom ``instance`` label in your Prometheus target configuration:
you might wish to set it to the hostname of your first monitor, or something
completely arbitrary like "ceph_cluster".

node_exporter instance labels
-----------------------------

Set your ``instance`` labels to match what appears in Ceph's OSD metadata
in the ``hostname`` field.  This is generally the short hostname of the node.

This is only necessary if you want to correlate Ceph stats with host stats,
but you may find it useful to do it anyway, in case you want to do
the correlation in the future.

Example configuration
---------------------

This example shows a single node configuration running ceph-mgr and
node_exporter on a server called ``senta04``.

This is just an example: there are other ways to configure Prometheus
scrape targets and label rewrite rules.

prometheus.yml
~~~~~~~~~~~~~~

::

    global:
      scrape_interval:     15s
      evaluation_interval: 15s

    scrape_configs:
      - job_name: 'node'
        file_sd_configs:
          - files:
            - node_targets.yml
      - job_name: 'ceph'
        honor_labels: true
        file_sd_configs:
          - files:
            - ceph_targets.yml

ceph_targets.yml
~~~~~~~~~~~~~~~~

::

    [
        {
            "targets": [ "senta04.mydomain.com:9283" ],
            "labels": {
                "instance": "ceph_cluster"
            }
        }
    ]

node_targets.yml
~~~~~~~~~~~~~~~~

::

    [
        {
            "targets": [ "senta04.mydomain.com:9100" ],
            "labels": {
                "instance": "senta04"
            }
        }
    ]
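If you manage more than a handful of hosts, target files like these can be
generated.  The sketch below builds the same JSON structure for a list of
hosts.  The function names are hypothetical; the ``mydomain.com`` suffix,
the ports (``9283`` for ceph-mgr and ``9100`` for node_exporter) and the
``ceph_cluster`` instance value are taken from the example above and should
be adjusted for your environment.

```python
import json


def node_targets(short_hostnames, domain="mydomain.com", port=9100):
    """One target group per host, with `instance` set to the short name."""
    return [{"targets": ["%s.%s:%d" % (host, domain, port)],
             "labels": {"instance": host}}
            for host in short_hostnames]


def ceph_targets(mgr_hostnames, domain="mydomain.com", port=9283,
                 instance="ceph_cluster"):
    """A single group for all ceph-mgr hosts, sharing one fixed `instance`."""
    return [{"targets": ["%s.%s:%d" % (host, domain, port)
                         for host in mgr_hostnames],
             "labels": {"instance": instance}}]


def write_targets(path, groups):
    """Write target groups as JSON, which is also valid YAML."""
    with open(path, "w") as handle:
        json.dump(groups, handle, indent=4)
        handle.write("\n")


print(json.dumps(node_targets(["senta04"]), indent=4))
```

Calling ``write_targets("node_targets.yml", node_targets(["senta04"]))``
would produce a file equivalent to the one shown above, up to whitespace.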

Notes
=====

Counters and gauges are exported; currently histograms and long-running
averages are not.  It's possible that Ceph's 2-D histograms could be
reduced to two separate 1-D histograms, and that long-running averages
could be exported as Prometheus' Summary type.

Timestamps, as with many Prometheus exporters, are established by
the server's scrape time (Prometheus expects that it is polling the
actual counter process synchronously).  It is possible to supply a
timestamp along with the stat report, but the Prometheus team strongly
advises against this.  This means that timestamps will be delayed by
an unpredictable amount; it's not clear if this will be problematic,
but it's worth knowing about.
 218 an unpredictable amount; it's not clear if this will be problematic,
 219 but it's worth knowing about.