diff --git a/src/ceph/doc/rados/configuration/mon-config-ref.rst b/src/ceph/doc/rados/configuration/mon-config-ref.rst
deleted file mode 100644
index 6c8e92b..0000000
--- a/src/ceph/doc/rados/configuration/mon-config-ref.rst
+++ /dev/null
@@ -1,1222 +0,0 @@
-==========================
- Monitor Config Reference
-==========================
-
-Understanding how to configure a :term:`Ceph Monitor` is an important part of
-building a reliable :term:`Ceph Storage Cluster`. **All Ceph Storage Clusters
-have at least one monitor**. A monitor configuration usually remains fairly
-consistent, but you can add, remove or replace a monitor in a cluster. See
-`Adding/Removing a Monitor`_ and `Add/Remove a Monitor (ceph-deploy)`_ for
-details.
-
-
-.. index:: Ceph Monitor; Paxos
-
-Background
-==========
-
-Ceph Monitors maintain a "master copy" of the :term:`cluster map`, which means a
-:term:`Ceph Client` can determine the location of all Ceph Monitors, Ceph OSD
-Daemons, and Ceph Metadata Servers just by connecting to one Ceph Monitor and
-retrieving a current cluster map. Before Ceph Clients can read from or write to
-Ceph OSD Daemons or Ceph Metadata Servers, they must connect to a Ceph Monitor
-first. With a current copy of the cluster map and the CRUSH algorithm, a Ceph
-Client can compute the location for any object. The ability to compute object
-locations allows a Ceph Client to talk directly to Ceph OSD Daemons, which is a
-key aspect of Ceph's high scalability and performance. See
-`Scalability and High Availability`_ for additional details.
-
-The primary role of the Ceph Monitor is to maintain a master copy of the cluster
-map. Ceph Monitors also provide authentication and logging services. Ceph
-Monitors write all changes in the monitor services to a single Paxos instance,
-and Paxos writes the changes to a key/value store for strong consistency. Ceph
-Monitors can query the most recent version of the cluster map during sync
-operations. Ceph Monitors leverage the key/value store's snapshots and iterators
-(using leveldb) to perform store-wide synchronization.
-
-.. ditaa::
-
- /-------------\               /-------------\
- |   Monitor   | Write Changes |    Paxos    |
- | cCCC        +-------------->+ cCCC        |
- |             |               |             |
- +-------------+               \------+------/
- |    Auth     |                      |
- +-------------+                      | Write Changes
- |     Log     |                      |
- +-------------+                      v
- | Monitor Map |               /------+------\
- +-------------+               | Key / Value |
- |   OSD Map   |               |    Store    |
- +-------------+               | cCCC        |
- |   PG Map    |               \------+------/
- +-------------+                      ^
- |   MDS Map   |                      | Read Changes
- +-------------+                      |
- | cCCC        |*---------------------+
- \-------------/
-
-
-.. deprecated:: 0.58
-
-In Ceph versions 0.58 and earlier, Ceph Monitors use a Paxos instance for
-each service and store the map as a file.
-
-.. index:: Ceph Monitor; cluster map
-
-Cluster Maps
-------------
-
-The cluster map is a composite of maps, including the monitor map, the OSD map,
-the placement group map and the metadata server map.
-The cluster map tracks a number of important things: which processes are ``in``
-the Ceph Storage Cluster; which of those processes are ``up`` and running or
-``down``; whether the placement groups are ``active`` or ``inactive``, and
-``clean`` or in some other state; and other details that reflect the current
-state of the cluster, such as the total amount of storage space and the amount
-of storage used.
-
-When there is a significant change in the state of the cluster--e.g., a Ceph OSD
-Daemon goes down, a placement group falls into a degraded state, etc.--the
-cluster map gets updated to reflect the current state of the cluster.
-Additionally, the Ceph Monitor also maintains a history of the prior states of
-the cluster. The monitor map, OSD map, placement group map and metadata server
-map each maintain a history of their map versions. We call each version an
-"epoch."
-
-When operating your Ceph Storage Cluster, keeping track of these states is an
-important part of your system administration duties. See `Monitoring a Cluster`_
-and `Monitoring OSDs and PGs`_ for additional details.
-
-.. index:: high availability; quorum
-
-Monitor Quorum
---------------
-
-Our Configuring Ceph section provides a trivial `Ceph configuration file`_ that
-provides for one monitor in the test cluster. A cluster will run fine with a
-single monitor; however, **a single monitor is a single-point-of-failure**. To
-ensure high availability in a production Ceph Storage Cluster, you should run
-Ceph with multiple monitors so that the failure of a single monitor **WILL NOT**
-bring down your entire cluster.
-
-When a Ceph Storage Cluster runs multiple Ceph Monitors for high availability,
-Ceph Monitors use `Paxos`_ to establish consensus about the master cluster map.
-A consensus requires a majority of monitors running to establish a quorum for
-consensus about the cluster map (e.g., 1 out of 1; 2 out of 3; 3 out of 5;
-4 out of 6; etc.).
-
-``mon force quorum join``
-
-:Description: Force monitor to join quorum even if it has been previously removed from the map
-:Type: Boolean
-:Default: ``False``
-
-.. index:: Ceph Monitor; consistency
-
-Consistency
------------
-
-When you add monitor settings to your Ceph configuration file, you need to be
-aware of some of the architectural aspects of Ceph Monitors. **Ceph imposes
-strict consistency requirements** for a Ceph monitor when discovering another
-Ceph Monitor within the cluster. Whereas Ceph Clients and other Ceph daemons
-use the Ceph configuration file to discover monitors, monitors discover each
-other using the monitor map (monmap), not the Ceph configuration file.
-
-A Ceph Monitor always refers to the local copy of the monmap when discovering
-other Ceph Monitors in the Ceph Storage Cluster. Using the monmap instead of the
-Ceph configuration file avoids errors that could break the cluster (e.g., typos
-in ``ceph.conf`` when specifying a monitor address or port). Since monitors use
-monmaps for discovery and they share monmaps with clients and other Ceph
-daemons, **the monmap provides monitors with a strict guarantee that their
-consensus is valid.**
-
-Strict consistency also applies to updates to the monmap. As with any other
-updates on the Ceph Monitor, changes to the monmap always run through a
-distributed consensus algorithm called `Paxos`_.
-The Ceph Monitors must agree on each update to the monmap, such as adding or
-removing a Ceph Monitor, to ensure that each monitor in the quorum has the same
-version of the monmap. Updates to the monmap are incremental so that Ceph
-Monitors have the latest agreed-upon version and a set of previous versions.
-Maintaining a history enables a Ceph Monitor that has an older version of the
-monmap to catch up with the current state of the Ceph Storage Cluster.
-
-If Ceph Monitors discovered each other through the Ceph configuration file
-instead of through the monmap, it would introduce additional risks because the
-Ceph configuration files are not updated and distributed automatically. Ceph
-Monitors might inadvertently use an older Ceph configuration file, fail to
-recognize a Ceph Monitor, fall out of a quorum, or develop a situation where
-`Paxos`_ is not able to determine the current state of the system accurately.
-
-
-.. index:: Ceph Monitor; bootstrapping monitors
-
-Bootstrapping Monitors
-----------------------
-
-In most configuration and deployment cases, tools that deploy Ceph may help
-bootstrap the Ceph Monitors by generating a monitor map for you (e.g.,
-``ceph-deploy``, etc.). A Ceph Monitor requires a few explicit
-settings:
-
-- **Filesystem ID**: The ``fsid`` is the unique identifier for your
-  object store. Since you can run multiple clusters on the same
-  hardware, you must specify the unique ID of the object store when
-  bootstrapping a monitor. Deployment tools usually do this for you
-  (e.g., ``ceph-deploy`` can call a tool like ``uuidgen``), but you
-  may specify the ``fsid`` manually too.
-
-- **Monitor ID**: A monitor ID is a unique ID assigned to each monitor within
-  the cluster. It is an alphanumeric value, and by convention the identifier
-  usually follows an alphabetical increment (e.g., ``a``, ``b``, etc.). This
-  can be set in a Ceph configuration file (e.g., ``[mon.a]``, ``[mon.b]``, etc.),
-  by a deployment tool, or using the ``ceph`` command line.
-
-- **Keys**: The monitor must have secret keys. A deployment tool such as
-  ``ceph-deploy`` usually does this for you, but you may
-  perform this step manually too. See `Monitor Keyrings`_ for details.
-
-For additional details on bootstrapping, see `Bootstrapping a Monitor`_.
-
-.. index:: Ceph Monitor; configuring monitors
-
-Configuring Monitors
-====================
-
-To apply configuration settings to the entire cluster, enter the configuration
-settings under ``[global]``. To apply configuration settings to all monitors in
-your cluster, enter the configuration settings under ``[mon]``. To apply
-configuration settings to specific monitors, specify the monitor instance
-(e.g., ``[mon.a]``). By convention, monitor instance names use alpha notation.
-
-.. code-block:: ini
-
-    [global]
-
-    [mon]
-
-    [mon.a]
-
-    [mon.b]
-
-    [mon.c]
-
-
-Minimum Configuration
----------------------
-
-The bare minimum monitor settings via the Ceph configuration file include a
-hostname and a monitor address for each monitor. You can configure these under
-``[mon]`` or under the entry for a specific monitor.
-
-.. code-block:: ini
-
-    [mon]
-    mon host = hostname1,hostname2,hostname3
-    mon addr = 10.0.0.10:6789,10.0.0.11:6789,10.0.0.12:6789
-
-
-.. code-block:: ini
-
-    [mon.a]
-    host = hostname1
-    mon addr = 10.0.0.10:6789
-
-See the `Network Configuration Reference`_ for details.
-
-.. note:: This minimum configuration for monitors assumes that a deployment
-   tool generates the ``fsid`` and the ``mon.`` key for you.
-
-Once you deploy a Ceph cluster, you **SHOULD NOT** change the IP address of
-the monitors. However, if you decide to change the monitor's IP address, you
-must follow a specific procedure. See `Changing a Monitor's IP Address`_ for
-details.
-
-Monitors can also be found by clients using DNS SRV records. See `Monitor lookup through DNS`_ for details.
-
-Cluster ID
-----------
-
-Each Ceph Storage Cluster has a unique identifier (``fsid``). If specified, it
-usually appears under the ``[global]`` section of the configuration file.
-Deployment tools usually generate the ``fsid`` and store it in the monitor map,
-so the value may not appear in a configuration file. The ``fsid`` makes it
-possible to run daemons for multiple clusters on the same hardware.
-
-``fsid``
-
-:Description: The cluster ID. One per cluster.
-:Type: UUID
-:Required: Yes.
-:Default: N/A. May be generated by a deployment tool if not specified.
-
-.. note:: Do not set this value if you use a deployment tool that does
-   it for you.
-
-
-.. index:: Ceph Monitor; initial members
-
-Initial Members
----------------
-
-We recommend running a production Ceph Storage Cluster with at least three Ceph
-Monitors to ensure high availability. When you run multiple monitors, you may
-specify the initial monitors that must be members of the cluster in order to
-establish a quorum. This may reduce the time it takes for your cluster to come
-online.
-
-.. code-block:: ini
-
-    [mon]
-    mon initial members = a,b,c
-
-
-``mon initial members``
-
-:Description: The IDs of initial monitors in a cluster during startup. If
-              specified, Ceph requires an odd number of monitors to form an
-              initial quorum (e.g., 3).
-
-:Type: String
-:Default: None
-
-.. note:: A *majority* of monitors in your cluster must be able to reach
-   each other in order to establish a quorum. You can decrease the initial
-   number of monitors to establish a quorum with this setting.
-
-.. index:: Ceph Monitor; data path
-
-Data
-----
-
-Ceph provides a default path where Ceph Monitors store data. For optimal
-performance in a production Ceph Storage Cluster, we recommend running Ceph
-Monitors on separate hosts and drives from Ceph OSD Daemons. Because leveldb
-uses ``mmap()`` for writing the data, Ceph Monitors flush their data from
-memory to disk very often, which can interfere with Ceph OSD Daemon workloads
-if the data store is co-located with the OSD Daemons.
-
-In Ceph versions 0.58 and earlier, Ceph Monitors store their data in files. This
-approach allows users to inspect monitor data with common tools like ``ls``
-and ``cat``. However, it doesn't provide strong consistency.
-
-In Ceph versions 0.59 and later, Ceph Monitors store their data as key/value
-pairs. Ceph Monitors require `ACID`_ transactions. Using a data store prevents
-a recovering Ceph Monitor from running corrupted versions through Paxos, and it
-enables multiple modification operations in one single atomic batch, among other
-advantages.
-
-Generally, we do not recommend changing the default data location. If you modify
-the default location, we recommend that you make it uniform across Ceph Monitors
-by setting it in the ``[mon]`` section of the configuration file.
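-
-For example, a minimal sketch of setting a uniform, non-default data location
-(the mount point shown is purely illustrative; the default documented below
-rarely needs changing):
-
-.. code-block:: ini
-
-    [mon]
-    # Hypothetical dedicated SSD mount for the monitor stores; use the
-    # same path on every monitor host so the setting stays uniform.
-    mon data = /ssd/ceph/mon/$cluster-$id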
-
-
-``mon data``
-
-:Description: The monitor's data location.
-:Type: String
-:Default: ``/var/lib/ceph/mon/$cluster-$id``
-
-
-``mon data size warn``
-
-:Description: Issue a ``HEALTH_WARN`` in cluster log when the monitor's data
-              store goes over 15GB.
-:Type: Integer
-:Default: ``15*1024*1024*1024``
-
-
-``mon data avail warn``
-
-:Description: Issue a ``HEALTH_WARN`` in cluster log when the available disk
-              space of the monitor's data store is lower than or equal to this
-              percentage.
-:Type: Integer
-:Default: ``30``
-
-
-``mon data avail crit``
-
-:Description: Issue a ``HEALTH_ERR`` in cluster log when the available disk
-              space of the monitor's data store is lower than or equal to this
-              percentage.
-:Type: Integer
-:Default: ``5``
-
-
-``mon warn on cache pools without hit sets``
-
-:Description: Issue a ``HEALTH_WARN`` in cluster log if a cache pool does not
-              have the hit set type set.
-              See `hit set type <../operations/pools#hit-set-type>`_ for more
-              details.
-:Type: Boolean
-:Default: ``True``
-
-
-``mon warn on crush straw calc version zero``
-
-:Description: Issue a ``HEALTH_WARN`` in cluster log if CRUSH's
-              ``straw_calc_version`` is zero. See
-              `CRUSH map tunables <../operations/crush-map#tunables>`_ for
-              details.
-:Type: Boolean
-:Default: ``True``
-
-
-``mon warn on legacy crush tunables``
-
-:Description: Issue a ``HEALTH_WARN`` in cluster log if CRUSH tunables are too
-              old (older than ``mon_min_crush_required_version``).
-:Type: Boolean
-:Default: ``True``
-
-
-``mon crush min required version``
-
-:Description: The minimum tunable profile version required by the cluster.
-              See
-              `CRUSH map tunables <../operations/crush-map#tunables>`_ for
-              details.
-:Type: String
-:Default: ``firefly``
-
-
-``mon warn on osd down out interval zero``
-
-:Description: Issue a ``HEALTH_WARN`` in cluster log if
-              ``mon osd down out interval`` is zero. Having this option set to
-              zero on the leader acts much like the ``noout`` flag. It's hard
-              to figure out what's going wrong with clusters that do not have
-              the ``noout`` flag set but act like that just the same, so we
-              report a warning in this case.
-:Type: Boolean
-:Default: ``True``
-
-
-``mon cache target full warn ratio``
-
-:Description: Position between pool's ``cache_target_full`` and
-              ``target_max_objects`` where we start warning.
-:Type: Float
-:Default: ``0.66``
-
-
-``mon health data update interval``
-
-:Description: How often (in seconds) a monitor in quorum shares its health
-              status with its peers. (A negative number disables it.)
-:Type: Float
-:Default: ``60``
-
-
-``mon health to clog``
-
-:Description: Enable sending a health summary to the cluster log periodically.
-:Type: Boolean
-:Default: ``True``
-
-
-``mon health to clog tick interval``
-
-:Description: How often (in seconds) the monitor sends a health summary to the
-              cluster log (a non-positive number disables it). If the current
-              health summary is empty or identical to the last one, the monitor
-              will not send it to the cluster log.
-:Type: Integer
-:Default: ``3600``
-
-
-``mon health to clog interval``
-
-:Description: How often (in seconds) the monitor sends a health summary to the
-              cluster log (a non-positive number disables it). The monitor will
-              always send the summary to the cluster log whether or not the
-              summary changes.
-:Type: Integer
-:Default: ``60``
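-
-As a hedged illustration, the data store space thresholds and health reporting
-cadence above could be tightened on monitors with small disks (all values here
-are illustrative, not recommendations):
-
-.. code-block:: ini
-
-    [mon]
-    # Illustrative: warn earlier than the defaults (HEALTH_WARN at 30%
-    # available, HEALTH_ERR at 5%) when the monitor store runs low on space.
-    mon data avail warn = 40
-    mon data avail crit = 10
-    # Push the periodic health summary to the cluster log more often.
-    mon health to clog interval = 30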
-
-
-.. index:: Ceph Storage Cluster; capacity planning, Ceph Monitor; capacity planning
-
-Storage Capacity
-----------------
-
-When a Ceph Storage Cluster gets close to its maximum capacity (i.e., ``mon osd
-full ratio``), Ceph prevents you from writing to or reading from Ceph OSD
-Daemons as a safety measure to prevent data loss. Therefore, letting a
-production Ceph Storage Cluster approach its full ratio is not a good practice,
-because it sacrifices high availability. The default full ratio is ``.95``, or
-95% of capacity. This is a very aggressive setting for a test cluster with a
-small number of OSDs.
-
-.. tip:: When monitoring your cluster, be alert to warnings related to the
-   ``nearfull`` ratio. Once it is reached, the failure of one or more OSDs
-   could result in a temporary service disruption. Consider adding more OSDs
-   to increase storage capacity.
-
-A common scenario for test clusters involves a system administrator removing a
-Ceph OSD Daemon from the Ceph Storage Cluster to watch the cluster rebalance;
-then removing another Ceph OSD Daemon, and so on, until the Ceph Storage Cluster
-eventually reaches the full ratio and locks up. We recommend a bit of capacity
-planning even with a test cluster. Planning enables you to gauge how much spare
-capacity you will need in order to maintain high availability. Ideally, you want
-to plan for a series of Ceph OSD Daemon failures in which the cluster can recover
-to an ``active + clean`` state without replacing those Ceph OSD Daemons
-immediately. You can run a cluster in an ``active + degraded`` state, but this
-is not ideal for normal operating conditions.
-
-The following diagram depicts a simplistic Ceph Storage Cluster containing 33
-Ceph Nodes with one Ceph OSD Daemon per host, each Ceph OSD Daemon reading from
-and writing to a 3TB drive. So this exemplary Ceph Storage Cluster has a maximum
-actual capacity of 99TB. With a ``mon osd full ratio`` of ``0.95``, if the Ceph
-Storage Cluster falls to 5TB of remaining capacity, the cluster will not allow
-Ceph Clients to read or write data. So the Ceph Storage Cluster's operating
-capacity is 95TB, not 99TB.
-
-.. ditaa::
-
- +--------+ +--------+ +--------+ +--------+ +--------+ +--------+
- | Rack 1 | | Rack 2 | | Rack 3 | | Rack 4 | | Rack 5 | | Rack 6 |
- | cCCC   | | cF00   | | cCCC   | | cCCC   | | cCCC   | | cCCC   |
- +--------+ +--------+ +--------+ +--------+ +--------+ +--------+
- | OSD 1  | | OSD 7  | | OSD 13 | | OSD 19 | | OSD 25 | | OSD 31 |
- +--------+ +--------+ +--------+ +--------+ +--------+ +--------+
- | OSD 2  | | OSD 8  | | OSD 14 | | OSD 20 | | OSD 26 | | OSD 32 |
- +--------+ +--------+ +--------+ +--------+ +--------+ +--------+
- | OSD 3  | | OSD 9  | | OSD 15 | | OSD 21 | | OSD 27 | | OSD 33 |
- +--------+ +--------+ +--------+ +--------+ +--------+ +--------+
- | OSD 4  | | OSD 10 | | OSD 16 | | OSD 22 | | OSD 28 | | Spare  |
- +--------+ +--------+ +--------+ +--------+ +--------+ +--------+
- | OSD 5  | | OSD 11 | | OSD 17 | | OSD 23 | | OSD 29 | | Spare  |
- +--------+ +--------+ +--------+ +--------+ +--------+ +--------+
- | OSD 6  | | OSD 12 | | OSD 18 | | OSD 24 | | OSD 30 | | Spare  |
- +--------+ +--------+ +--------+ +--------+ +--------+ +--------+
-
-It is normal in such a cluster for one or two OSDs to fail. A less frequent but
-reasonable scenario involves a rack's router or power supply failing, which
-brings down multiple OSDs simultaneously (e.g., OSDs 7-12). In such a scenario,
-you should still strive for a cluster that can remain operational and achieve an
-``active + clean`` state--even if that means adding a few hosts with additional
-OSDs in short order.
-If your capacity utilization is too high, you may not lose data, but you could
-still sacrifice data availability while resolving an outage within a failure
-domain if capacity utilization of the cluster exceeds the full ratio. For this
-reason, we recommend at least some rough capacity planning.
-
-Identify two numbers for your cluster:
-
-#. The number of OSDs.
-#. The total capacity of the cluster.
-
-If you divide the total capacity of your cluster by the number of OSDs in your
-cluster, you will find the mean average capacity of an OSD within your cluster.
-Consider multiplying that number by the number of OSDs you expect will fail
-simultaneously during normal operations (a relatively small number). Finally,
-multiply the capacity of the cluster by the full ratio to arrive at a maximum
-operating capacity; then, subtract the amount of data on the OSDs you expect to
-fail to arrive at a reasonable full ratio. For example, in the 33-OSD cluster
-above, each OSD holds roughly 3TB of the 99TB raw capacity; planning for two
-simultaneous OSD failures (6TB) against the default maximum operating capacity
-(99TB * .95 = ~94TB) leaves ~88TB, suggesting a full ratio of roughly ``.89``.
-Repeat the foregoing process with a higher number of OSD failures (e.g., a rack
-of OSDs) to arrive at a reasonable number for a near full ratio.
-
-.. code-block:: ini
-
-    [global]
-
-    mon osd full ratio = .80
-    mon osd backfillfull ratio = .75
-    mon osd nearfull ratio = .70
-
-
-``mon osd full ratio``
-
-:Description: The percentage of disk space used before an OSD is
-              considered ``full``.
-
-:Type: Float
-:Default: ``.95``
-
-
-``mon osd backfillfull ratio``
-
-:Description: The percentage of disk space used before an OSD is
-              considered too ``full`` to backfill.
-
-:Type: Float
-:Default: ``.90``
-
-
-``mon osd nearfull ratio``
-
-:Description: The percentage of disk space used before an OSD is
-              considered ``nearfull``.
-
-:Type: Float
-:Default: ``.85``
-
-
-.. tip:: If some OSDs are nearfull, but others have plenty of capacity, you
-         may have a problem with the CRUSH weight for the nearfull OSDs.
-
-.. index:: heartbeat
-
-Heartbeat
----------
-
-Ceph monitors know about the cluster by requiring reports from each OSD, and by
-receiving reports from OSDs about the status of their neighboring OSDs. Ceph
-provides reasonable default settings for monitor/OSD interaction; however, you
-may modify them as needed. See `Monitor/OSD Interaction`_ for details.
-
-
-.. index:: Ceph Monitor; leader, Ceph Monitor; provider, Ceph Monitor; requester, Ceph Monitor; synchronization
-
-Monitor Store Synchronization
------------------------------
-
-When you run a production cluster with multiple monitors (recommended), each
-monitor checks to see if a neighboring monitor has a more recent version of the
-cluster map (e.g., a map in a neighboring monitor with one or more epoch numbers
-higher than the most current epoch in the map of the monitor in question).
-Periodically, one monitor in the cluster may fall behind the other monitors to
-the point where it must leave the quorum, synchronize to retrieve the most
-current information about the cluster, and then rejoin the quorum. For the
-purposes of synchronization, monitors may assume one of three roles:
-
-#. **Leader**: The `Leader` is the first monitor to achieve the most recent
-   Paxos version of the cluster map.
-
-#. **Provider**: The `Provider` is a monitor that has the most recent version
-   of the cluster map, but wasn't the first to achieve the most recent version.
-
-#. **Requester:** A `Requester` is a monitor that has fallen behind the leader
-   and must synchronize in order to retrieve the most recent information about
-   the cluster before it can rejoin the quorum.
-
-These roles enable a leader to delegate synchronization duties to a provider,
-which prevents synchronization requests from overloading the leader--improving
-performance. In the following diagram, the requester has learned that it has
-fallen behind the other monitors. The requester asks the leader to synchronize,
-and the leader tells the requester to synchronize with a provider.
-
-
-.. ditaa:: +-----------+          +---------+          +----------+
-           | Requester |          | Leader  |          | Provider |
-           +-----------+          +---------+          +----------+
-                  |                    |                     |
-                  |                    |                     |
-                  | Ask to Synchronize |                     |
-                  |------------------->|                     |
-                  |                    |                     |
-                  |<-------------------|                     |
-                  | Tell Requester to  |                     |
-                  | Sync with Provider |                     |
-                  |                    |                     |
-                  |               Synchronize                |
-                  |--------------------+-------------------->|
-                  |                    |                     |
-                  |<-------------------+---------------------|
-                  |          Send Chunk to Requester         |
-                  |          (repeat as necessary)           |
-                  |     Requester Acks Chunk to Provider     |
-                  |--------------------+-------------------->|
-                  |                                          |
-                  |    Sync Complete                         |
-                  |     Notification                         |
-                  |------------------->|                     |
-                  |                    |                     |
-                  |<-------------------|                     |
-                  |        Ack         |                     |
-                  |                    |                     |
-
-
-Synchronization always occurs when a new monitor joins the cluster. During
-runtime operations, monitors may receive updates to the cluster map at different
-times. This means the leader and provider roles may migrate from one monitor to
-another. If this happens while synchronizing (e.g., a provider falls behind the
-leader), the provider can terminate synchronization with a requester.
-
-Once synchronization is complete, Ceph requires trimming across the cluster.
-Trimming requires that the placement groups are ``active + clean``.
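-
-The synchronization and trimming behavior is governed by the settings below. As
-a hedged sketch only (the values are illustrative), a cluster whose maps grow
-quickly might allow larger sync payloads so that a lagging monitor catches up
-in fewer round trips:
-
-.. code-block:: ini
-
-    [mon]
-    # Illustrative: send up to ~4MB of keys per sync message instead of
-    # the ~1MB default, and wait longer for the next update from a sync
-    # provider before giving up and bootstrapping again.
-    mon sync max payload size = 4194304
-    mon sync timeout = 60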
-
-
-``mon sync trim timeout``
-
-:Description:
-:Type: Double
-:Default: ``30.0``
-
-
-``mon sync heartbeat timeout``
-
-:Description:
-:Type: Double
-:Default: ``30.0``
-
-
-``mon sync heartbeat interval``
-
-:Description:
-:Type: Double
-:Default: ``5.0``
-
-
-``mon sync backoff timeout``
-
-:Description:
-:Type: Double
-:Default: ``30.0``
-
-
-``mon sync timeout``
-
-:Description: Number of seconds the monitor will wait for the next update
-              message from its sync provider before it gives up and
-              bootstraps again.
-:Type: Double
-:Default: ``30.0``
-
-
-``mon sync max retries``
-
-:Description:
-:Type: Integer
-:Default: ``5``
-
-
-``mon sync max payload size``
-
-:Description: The maximum size for a sync payload (in bytes).
-:Type: 32-bit Integer
-:Default: ``1045676``
-
-
-``paxos max join drift``
-
-:Description: The maximum Paxos iterations before we must first sync the
-              monitor data stores. When a monitor finds that its peer is too
-              far ahead of it, it will first sync with data stores before moving
-              on.
-:Type: Integer
-:Default: ``10``
-
-``paxos stash full interval``
-
-:Description: How often (in commits) to stash a full copy of the PaxosService state.
-              Currently this setting only affects the ``mds``, ``mon``, ``auth`` and
-              ``mgr`` PaxosServices.
-:Type: Integer
-:Default: ``25``
-
-``paxos propose interval``
-
-:Description: Gather updates for this time interval before proposing
-              a map update.
-:Type: Double
-:Default: ``1.0``
-
-
-``paxos min``
-
-:Description: The minimum number of paxos states to keep around.
-:Type: Integer
-:Default: ``500``
-
-
-``paxos min wait``
-
-:Description: The minimum amount of time to gather updates after a period of
-              inactivity.
-:Type: Double
-:Default: ``0.05``
-
-
-``paxos trim min``
-
-:Description: Number of extra proposals tolerated before trimming.
-:Type: Integer
-:Default: ``250``
-
-
-``paxos trim max``
-
-:Description: The maximum number of extra proposals to trim at a time.
-:Type: Integer
-:Default: ``500``
-
-
-``paxos service trim min``
-
-:Description: The minimum amount of versions to trigger a trim (0 disables it).
-:Type: Integer
-:Default: ``250``
-
-
-``paxos service trim max``
-
-:Description: The maximum amount of versions to trim during a single proposal (0 disables it).
-:Type: Integer
-:Default: ``500``
-
-
-``mon max log epochs``
-
-:Description: The maximum amount of log epochs to trim during a single proposal.
-:Type: Integer
-:Default: ``500``
-
-
-``mon max pgmap epochs``
-
-:Description: The maximum amount of pgmap epochs to trim during a single proposal.
-:Type: Integer
-:Default: ``500``
-
-
-``mon mds force trim to``
-
-:Description: Force monitor to trim mdsmaps to this point (0 disables it;
-              dangerous, use with care).
-:Type: Integer
-:Default: ``0``
-
-
-``mon osd force trim to``
-
-:Description: Force monitor to trim osdmaps to this point, even if there are
-              PGs that are not clean at the specified epoch (0 disables it;
-              dangerous, use with care).
-:Type: Integer
-:Default: ``0``
-
-``mon osd cache size``
-
-:Description: The size of the osdmap cache, so that the monitor does not rely
-              solely on the underlying store's cache.
-:Type: Integer
-:Default: ``10``
-
-
-``mon election timeout``
-
-:Description: On election proposer, maximum waiting time for all ACKs in seconds.
-:Type: Float
-:Default: ``5``
-
-
-``mon lease``
-
-:Description: The length (in seconds) of the lease on the monitor's versions.
-:Type: Float
-:Default: ``5``
-
-
-``mon lease renew interval factor``
-
-:Description: ``mon lease`` \* ``mon lease renew interval factor`` will be the
-              interval for the Leader to renew the other monitors' leases. The
-              factor should be less than ``1.0``.
-:Type: Float
-:Default: ``0.6``
-
-
-``mon lease ack timeout factor``
-
-:Description: The Leader will wait ``mon lease`` \* ``mon lease ack timeout factor``
-              for the Providers to acknowledge the lease extension.
-:Type: Float
-:Default: ``2.0``
-
-
-``mon accept timeout factor``
-
-:Description: The Leader will wait ``mon lease`` \* ``mon accept timeout factor``
-              for the Requester(s) to accept a Paxos update. It is also used
-              during the Paxos recovery phase for similar purposes.
-:Type: Float
-:Default: ``2.0``
-
-
-``mon min osdmap epochs``
-
-:Description: Minimum number of OSD map epochs to keep at all times.
-:Type: 32-bit Integer
-:Default: ``500``
-
-
-``mon max pgmap epochs``
-
-:Description: Maximum number of PG map epochs the monitor should keep.
-:Type: 32-bit Integer
-:Default: ``500``
-
-
-``mon max log epochs``
-
-:Description: Maximum number of Log epochs the monitor should keep.
-:Type: 32-bit Integer
-:Default: ``500``
-
-
-
-.. index:: Ceph Monitor; clock
-
-Clock
------
-
-Ceph daemons pass critical messages to each other, which must be processed
-before daemons reach a timeout threshold. If the clocks in Ceph monitors
-are not synchronized, it can lead to a number of anomalies. For example:
-
-- Daemons ignoring received messages (e.g., timestamps outdated)
-- Timeouts triggered too soon/late when a message wasn't received in time.
-
-See `Monitor Store Synchronization`_ for details.
-
-
-.. tip:: You SHOULD install NTP on your Ceph monitor hosts to
-         ensure that the monitor cluster operates with synchronized clocks.
-
-Clock drift may still be noticeable with NTP even though the discrepancy is not
-yet harmful. Ceph's clock drift / clock skew warnings may get triggered even
-though NTP maintains a reasonable level of synchronization. Increasing the
-allowed clock drift may be tolerable under such circumstances; however, a
-number of factors such as workload, network latency, configuring overrides to
-default timeouts and the `Monitor Store Synchronization`_ settings may influence
-the level of acceptable clock drift without compromising Paxos guarantees.
-
-Ceph provides the following tunable options to allow you to find
-acceptable values.
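-
-For instance, a hedged sketch of loosening the skew warning on a cluster with
-known-good NTP but noisy warnings (the values are illustrative only):
-
-.. code-block:: ini
-
-    [mon]
-    # Illustrative: tolerate up to 100ms of drift between monitors
-    # (the default documented below is 50ms) and back off repeated
-    # clock drift warnings more aggressively.
-    mon clock drift allowed = .100
-    mon clock drift warn backoff = 10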
-
-
-``clock offset``
-
-:Description: How much to offset the system clock. See ``Clock.cc`` for details.
-:Type: Double
-:Default: ``0``
-
-
-.. deprecated:: 0.58
-
-``mon tick interval``
-
-:Description: A monitor's tick interval in seconds.
-:Type: 32-bit Integer
-:Default: ``5``
-
-
-``mon clock drift allowed``
-
-:Description: The clock drift in seconds allowed between monitors.
-:Type: Float
-:Default: ``.050``
-
-
-``mon clock drift warn backoff``
-
-:Description: Exponential backoff for clock drift warnings.
-:Type: Float
-:Default: ``5``
-
-
-``mon timecheck interval``
-
-:Description: The time check interval (clock drift check) in seconds
-              for the Leader.
-
-:Type: Float
-:Default: ``300.0``
-
-
-``mon timecheck skew interval``
-
-:Description: The time check interval (clock drift check) in seconds for the
-              Leader when a skew is present.
-:Type: Float
-:Default: ``30.0``
-
-
-Client
-------
-
-``mon client hunt interval``
-
-:Description: The client will try a new monitor every ``N`` seconds until it
-              establishes a connection.
-
-:Type: Double
-:Default: ``3.0``
-
-
-``mon client ping interval``
-
-:Description: The client will ping the monitor every ``N`` seconds.
-:Type: Double
-:Default: ``10.0``
-
-
-``mon client max log entries per message``
-
-:Description: The maximum number of log entries a monitor will generate
-              per client message.
-
-:Type: Integer
-:Default: ``1000``
-
-
-``mon client bytes``
-
-:Description: The amount of client message data allowed in memory (in bytes).
-:Type: 64-bit Integer Unsigned
-:Default: ``100ul << 20``
-
-
-Pool settings
-=============
-
-Since version v0.94 there has been support for pool flags, which allow or
-disallow changes to be made to pools.
-
-Monitors can also disallow removal of pools if configured that way.
-
-``mon allow pool delete``
-
-:Description: Whether the monitors should allow pools to be removed, regardless
-              of what the pool flags say.
-:Type: Boolean
-:Default: ``false``
-
-``osd pool default flag hashpspool``
-
-:Description: Set the hashpspool flag on new pools.
-:Type: Boolean
-:Default: ``true``
-
-``osd pool default flag nodelete``
-
-:Description: Set the nodelete flag on new pools. Prevents pools with this flag
-              from being removed in any way.
-:Type: Boolean
-:Default: ``false``
-
-``osd pool default flag nopgchange``
-
-:Description: Set the nopgchange flag on new pools. Does not allow the number of
-              PGs to be changed for a pool.
-:Type: Boolean
-:Default: ``false``
-
-``osd pool default flag nosizechange``
-
-:Description: Set the nosizechange flag on new pools. Does not allow the size of
-              a pool to be changed.
-:Type: Boolean
-:Default: ``false``
-
-For more information about the pool flags see `Pool values`_.
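-
-As a guarded illustration, the flags above can be combined to protect pools
-against accidental deletion or resizing (a sketch, not a recommendation; the
-``[global]`` placement is an assumption so that both monitor- and pool-default
-options are picked up):
-
-.. code-block:: ini
-
-    [global]
-    # Keep pool deletion disabled cluster-wide, whatever the per-pool
-    # flags say (this is also the documented default).
-    mon allow pool delete = false
-    # Create new pools with the nodelete and nosizechange flags set.
-    osd pool default flag nodelete = true
-    osd pool default flag nosizechange = true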
-
-Miscellaneous
-=============
-
-
-``mon max osd``
-
-:Description: The maximum number of OSDs allowed in the cluster.
-:Type: 32-bit Integer
-:Default: ``10000``
-
-``mon globalid prealloc``
-
-:Description: The number of global IDs to pre-allocate for clients and daemons in the cluster.
-:Type: 32-bit Integer
-:Default: ``100``
-
-``mon subscribe interval``
-
-:Description: The refresh interval (in seconds) for subscriptions. The
-              subscription mechanism enables obtaining the cluster maps
-              and log information.
-
-:Type: Double
-:Default: ``300``
-
-
-``mon stat smooth intervals``
-
-:Description: Ceph will smooth statistics over the last ``N`` PG maps.
-:Type: Integer
-:Default: ``2``
-
-
-``mon probe timeout``
-
-:Description: Number of seconds the monitor will wait to find peers before bootstrapping.
-:Type: Double
-:Default: ``2.0``
-
-
-``mon daemon bytes``
-
-:Description: The message memory cap for metadata server and OSD messages (in bytes).
-:Type: 64-bit Integer Unsigned
-:Default: ``400ul << 20``
-
-
-``mon max log entries per event``
-
-:Description: The maximum number of log entries per event.
-:Type: Integer
-:Default: ``4096``
-
-
-``mon osd prime pg temp``
-
-:Description: Enables or disables priming the PGMap with the previous OSDs when
-              an ``out`` OSD comes back into the cluster. With the ``true``
-              setting, clients will continue to use the previous OSDs until the
-              newly ``in`` OSDs have peered for that PG.
-:Type: Boolean
-:Default: ``true``
-
-
-``mon osd prime pg temp max time``
-
-:Description: How much time in seconds the monitor should spend trying to prime the
-              PGMap when an out OSD comes back into the cluster.
-:Type: Float
-:Default: ``0.5``
-
-
-``mon osd prime pg temp max time estimate``
-
-:Description: Maximum estimate of time spent on each PG before we prime all PGs
-              in parallel.
-:Type: Float
-:Default: ``0.25``
-
-
-``mon osd allow primary affinity``
-
-:Description: Allow ``primary_affinity`` to be set in the osdmap.
-:Type: Boolean
-:Default: ``False``
-
-
-``mon osd pool ec fast read``
-
-:Description: Whether to turn on fast read on the pool or not. It will be used
-              as the default setting of newly created erasure pools if
-              ``fast_read`` is not specified at create time.
-:Type: Boolean
-:Default: ``False``
-
-
-``mon mds skip sanity``
-
-:Description: Skip safety assertions on FSMap (in case of bugs where we want to
-              continue anyway). The monitor terminates if the FSMap sanity check
-              fails, but we can disable it by enabling this option.
-:Type: Boolean
-:Default: ``False``
-
-
-``mon max mdsmap epochs``
-
-:Description: The maximum amount of mdsmap epochs to trim during a single proposal.
-:Type: Integer
-:Default: ``500``
-
-
-``mon config key max entry size``
-
-:Description: The maximum size of a config-key entry (in bytes).
-:Type: Integer
-:Default: ``4096``
-
-
-``mon scrub interval``
-
-:Description: How often (in seconds) the monitor scrubs its store by comparing
-              the stored checksums with the computed ones for all stored
-              keys.
-:Type: Integer
-:Default: ``3600*24``
-
-
-``mon scrub max keys``
-
-:Description: The maximum number of keys to scrub each time.
-:Type: Integer
-:Default: ``100``
-
-
-``mon compact on start``
-
-:Description: Compact the database used as the Ceph Monitor store on
-              ``ceph-mon`` start. A manual compaction helps to shrink the
-              monitor database and improve its performance if the regular
-              compaction fails to work.
-:Type: Boolean
-:Default: ``False``
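-
-A hedged example of the compaction knobs in practice: if a monitor's store has
-grown large, one might enable compaction at startup until the store shrinks
-back down (illustrative only, not a standing recommendation):
-
-.. code-block:: ini
-
-    [mon]
-    # Illustrative: compact the monitor store on each ceph-mon start;
-    # trim-time compaction (mon compact on trim, documented below)
-    # already defaults to true.
-    mon compact on start = true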
-
-
-``mon compact on bootstrap``
-
-:Description: Compact the database used as the Ceph Monitor store on
-              bootstrap. Monitors start probing each other to create
-              a quorum after bootstrap. If a monitor times out before joining
-              the quorum, it will start over and bootstrap itself again.
-:Type: Boolean
-:Default: ``False``
-
-
-``mon compact on trim``
-
-:Description: Compact a certain prefix (including paxos) when we trim its old states.
-:Type: Boolean
-:Default: ``True``
-
-
-``mon cpu threads``
-
-:Description: Number of threads for performing CPU intensive work on the monitor.
-:Type: Integer
-:Default: ``4``
-
-
-``mon osd mapping pgs per chunk``
-
-:Description: We calculate the mapping from placement group to OSDs in chunks.
-              This option specifies the number of placement groups per chunk.
-:Type: Integer
-:Default: ``4096``
-
-
-``mon osd max split count``
-
-:Description: Largest number of PGs per "involved" OSD to let split create.
-              When we increase the ``pg_num`` of a pool, the placement groups
-              will be split on all OSDs serving that pool. We want to avoid
-              extreme multipliers on PG splits.
-:Type: Integer
-:Default: ``300``
-
-
-``mon session timeout``
-
-:Description: The monitor will terminate sessions that stay idle beyond this
-              time limit.
-:Type: Integer
-:Default: ``300``
-
-
-
-.. _Paxos: http://en.wikipedia.org/wiki/Paxos_(computer_science)
-.. _Monitor Keyrings: ../../../dev/mon-bootstrap#secret-keys
-.. _Ceph configuration file: ../ceph-conf/#monitors
-.. _Network Configuration Reference: ../network-config-ref
-.. _Monitor lookup through DNS: ../mon-lookup-dns
-.. _ACID: http://en.wikipedia.org/wiki/ACID
-.. _Adding/Removing a Monitor: ../../operations/add-or-rm-mons
-.. _Add/Remove a Monitor (ceph-deploy): ../../deployment/ceph-deploy-mon
-.. _Monitoring a Cluster: ../../operations/monitoring
-.. _Monitoring OSDs and PGs: ../../operations/monitoring-osd-pg
-.. _Bootstrapping a Monitor: ../../../dev/mon-bootstrap
-.. _Changing a Monitor's IP Address: ../../operations/add-or-rm-mons#changing-a-monitor-s-ip-address
-.. _Monitor/OSD Interaction: ../mon-osd-interaction
-.. _Scalability and High Availability: ../../../architecture#scalability-and-high-availability
-.. _Pool values: ../../operations/pools/#set-pool-values