src/ceph/doc/rados/troubleshooting/troubleshooting-pg.rst

   1 =====================
   2  Troubleshooting PGs
   3 =====================
   4
   5 Placement Groups Never Get Clean
   6 ================================
   7
   8 When you create a cluster and your cluster remains in ``active``,
   9 ``active+remapped`` or ``active+degraded`` status and never achieve an
  10 ``active+clean`` status, you likely have a problem with your configuration.
  11
  12 You may need to review settings in the `Pool, PG and CRUSH Config Reference`_
  13 and make appropriate adjustments.
  14
  15 As a general rule, you should run your cluster with more than one OSD and a
  16 pool size greater than 1 object replica.
  17
  18 One Node Cluster
  19 ----------------
  20
  21 Ceph no longer provides documentation for operating on a single node, because
  22 you would never deploy a system designed for distributed computing on a single
  23 node. Additionally, mounting client kernel modules on a single node containing a
  24 Ceph  daemon may cause a deadlock due to issues with the Linux kernel itself
  25 (unless you use VMs for the clients). You can experiment with Ceph in a 1-node
  26 configuration, in spite of the limitations as described herein.
  27
  28 If you are trying to create a cluster on a single node, you must change the
  29 default of the ``osd crush chooseleaf type`` setting from ``1`` (meaning
  30 ``host`` or ``node``) to ``0`` (meaning ``osd``) in your Ceph configuration
  31 file before you create your monitors and OSDs. This tells Ceph that an OSD
  32 can peer with another OSD on the same host. If you are trying to set up a
  33 1-node cluster and ``osd crush chooseleaf type`` is greater than ``0``,
  34 Ceph will try to peer the PGs of one OSD with the PGs of another OSD on
  35 another node, chassis, rack, row, or even datacenter depending on the setting.
  36
  37 .. tip:: DO NOT mount kernel clients directly on the same node as your
  38    Ceph Storage Cluster, because kernel conflicts can arise. However, you
  39    can mount kernel clients within virtual machines (VMs) on a single node.
  40
  41 If you are creating OSDs using a single disk, you must create directories
  42 for the data manually first. For example::
  43
  44         mkdir /var/local/osd0 /var/local/osd1
  45         ceph-deploy osd prepare {localhost-name}:/var/local/osd0 {localhost-name}:/var/local/osd1
  46         ceph-deploy osd activate {localhost-name}:/var/local/osd0 {localhost-name}:/var/local/osd1
  47
  48
  49 Fewer OSDs than Replicas
  50 ------------------------
  51
  52 If you have brought up two OSDs to an ``up`` and ``in`` state, but you still
  53 don't see ``active + clean`` placement groups, you may have an
  54 ``osd pool default size`` set to greater than ``2``.
  55
  56 There are a few ways to address this situation. If you want to operate your
  57 cluster in an ``active + degraded`` state with two replicas, you can set the
  58 ``osd pool default min size`` to ``2`` so that you can write objects in
  59 an ``active + degraded`` state. You may also set the ``osd pool default size``
  60 setting to ``2`` so that you only have two stored replicas (the original and
  61 one replica), in which case the cluster should achieve an ``active + clean``
  62 state.
  63
  64 .. note:: You can make the changes at runtime. If you make the changes in
  65    your Ceph configuration file, you may need to restart your cluster.
  66
  67
  68 Pool Size = 1
  69 -------------
  70
  71 If you have the ``osd pool default size`` set to ``1``, you will only have
  72 one copy of the object. OSDs rely on other OSDs to tell them which objects
  73 they should have. If a first OSD has a copy of an object and there is no
  74 second copy, then no second OSD can tell the first OSD that it should have
  75 that copy. For each placement group mapped to the first OSD (see
  76 ``ceph pg dump``), you can force the first OSD to notice the placement groups
  77 it needs by running::
  78
  79         ceph osd force-create-pg <pgid>
  80
  81
  82 CRUSH Map Errors
  83 ----------------
  84
  85 Another candidate for placement groups remaining unclean involves errors
  86 in your CRUSH map.
  87
  88
  89 Stuck Placement Groups
  90 ======================
  91
  92 It is normal for placement groups to enter states like "degraded" or "peering"
  93 following a failure.  Normally these states indicate the normal progression
  94 through the failure recovery process. However, if a placement group stays in one
  95 of these states for a long time this may be an indication of a larger problem.
  96 For this reason, the monitor will warn when placement groups get "stuck" in a
  97 non-optimal state.  Specifically, we check for:
  98
  99 * ``inactive`` - The placement group has not been ``active`` for too long
 100   (i.e., it hasn't been able to service read/write requests).
 101
 102 * ``unclean`` - The placement group has not been ``clean`` for too long
 103   (i.e., it hasn't been able to completely recover from a previous failure).
 104
 105 * ``stale`` - The placement group status has not been updated by a ``ceph-osd``,
 106   indicating that all nodes storing this placement group may be ``down``.
 107
 108 You can explicitly list stuck placement groups with one of::
 109
 110         ceph pg dump_stuck stale
 111         ceph pg dump_stuck inactive
 112         ceph pg dump_stuck unclean
 113
 114 For stuck ``stale`` placement groups, it is normally a matter of getting the
 115 right ``ceph-osd`` daemons running again.  For stuck ``inactive`` placement
 116 groups, it is usually a peering problem (see :ref:`failures-osd-peering`).  For
 117 stuck ``unclean`` placement groups, there is usually something preventing
 118 recovery from completing, like unfound objects (see
 119 :ref:`failures-osd-unfound`);
 120
 121
 122
 123 .. _failures-osd-peering:
 124
 125 Placement Group Down - Peering Failure
 126 ======================================
 127
 128 In certain cases, the ``ceph-osd`` `Peering` process can run into
 129 problems, preventing a PG from becoming active and usable.  For
 130 example, ``ceph health`` might report::
 131
 132         ceph health detail
 133         HEALTH_ERR 7 pgs degraded; 12 pgs down; 12 pgs peering; 1 pgs recovering; 6 pgs stuck unclean; 114/3300 degraded (3.455%); 1/3 in osds are down
 134         ...
 135         pg 0.5 is down+peering
 136         pg 1.4 is down+peering
 137         ...
 138         osd.1 is down since epoch 69, last address 192.168.106.220:6801/8651
 139
 140 We can query the cluster to determine exactly why the PG is marked ``down`` with::
 141
 142         ceph pg 0.5 query
 143
 144 .. code-block:: javascript
 145
 146  { "state": "down+peering",
 147    ...
 148    "recovery_state": [
 149         { "name": "Started\/Primary\/Peering\/GetInfo",
 150           "enter_time": "2012-03-06 14:40:16.169679",
 151           "requested_info_from": []},
 152         { "name": "Started\/Primary\/Peering",
 153           "enter_time": "2012-03-06 14:40:16.169659",
 154           "probing_osds": [
 155                 0,
 156                 1],
 157           "blocked": "peering is blocked due to down osds",
 158           "down_osds_we_would_probe": [
 159                 1],
 160           "peering_blocked_by": [
 161                 { "osd": 1,
 162                   "current_lost_at": 0,
 163                   "comment": "starting or marking this osd lost may let us proceed"}]},
 164         { "name": "Started",
 165           "enter_time": "2012-03-06 14:40:16.169513"}
 166     ]
 167  }
 168
 169 The ``recovery_state`` section tells us that peering is blocked due to
 170 down ``ceph-osd`` daemons, specifically ``osd.1``.  In this case, we can start that ``ceph-osd``
 171 and things will recover.
 172
 173 Alternatively, if there is a catastrophic failure of ``osd.1`` (e.g., disk
 174 failure), we can tell the cluster that it is ``lost`` and to cope as
 175 best it can.
 176
 177 .. important:: This is dangerous in that the cluster cannot
 178    guarantee that the other copies of the data are consistent
 179    and up to date.
 180
 181 To instruct Ceph to continue anyway::
 182
 183         ceph osd lost 1
 184
 185 Recovery will proceed.
 186
 187
 188 .. _failures-osd-unfound:
 189
 190 Unfound Objects
 191 ===============
 192
 193 Under certain combinations of failures Ceph may complain about
 194 ``unfound`` objects::
 195
 196         ceph health detail
 197         HEALTH_WARN 1 pgs degraded; 78/3778 unfound (2.065%)
 198         pg 2.4 is active+degraded, 78 unfound
 199
 200 This means that the storage cluster knows that some objects (or newer
 201 copies of existing objects) exist, but it hasn't found copies of them.
 202 One example of how this might come about for a PG whose data is on ceph-osds
 203 1 and 2:
 204
 205 * 1 goes down
 206 * 2 handles some writes, alone
 207 * 1 comes up
 208 * 1 and 2 repeer, and the objects missing on 1 are queued for recovery.
 209 * Before the new objects are copied, 2 goes down.
 210
 211 Now 1 knows that these object exist, but there is no live ``ceph-osd`` who
 212 has a copy.  In this case, IO to those objects will block, and the
 213 cluster will hope that the failed node comes back soon; this is
 214 assumed to be preferable to returning an IO error to the user.
 215
 216 First, you can identify which objects are unfound with::
 217
 218         ceph pg 2.4 list_missing [starting offset, in json]
 219
 220 .. code-block:: javascript
 221
 222  { "offset": { "oid": "",
 223       "key": "",
 224       "snapid": 0,
 225       "hash": 0,
 226       "max": 0},
 227   "num_missing": 0,
 228   "num_unfound": 0,
 229   "objects": [
 230      { "oid": "object 1",
 231        "key": "",
 232        "hash": 0,
 233        "max": 0 },
 234      ...
 235   ],
 236   "more": 0}
 237
 238 If there are too many objects to list in a single result, the ``more``
 239 field will be true and you can query for more.  (Eventually the
 240 command line tool will hide this from you, but not yet.)
 241
 242 Second, you can identify which OSDs have been probed or might contain
 243 data::
 244
 245         ceph pg 2.4 query
 246
 247 .. code-block:: javascript
 248
 249    "recovery_state": [
 250         { "name": "Started\/Primary\/Active",
 251           "enter_time": "2012-03-06 15:15:46.713212",
 252           "might_have_unfound": [
 253                 { "osd": 1,
 254                   "status": "osd is down"}]},
 255
 256 In this case, for example, the cluster knows that ``osd.1`` might have
 257 data, but it is ``down``.  The full range of possible states include:
 258
 259 * already probed
 260 * querying
 261 * OSD is down
 262 * not queried (yet)
 263
 264 Sometimes it simply takes some time for the cluster to query possible
 265 locations.
 266
 267 It is possible that there are other locations where the object can
 268 exist that are not listed.  For example, if a ceph-osd is stopped and
 269 taken out of the cluster, the cluster fully recovers, and due to some
 270 future set of failures ends up with an unfound object, it won't
 271 consider the long-departed ceph-osd as a potential location to
 272 consider.  (This scenario, however, is unlikely.)
 273
 274 If all possible locations have been queried and objects are still
 275 lost, you may have to give up on the lost objects. This, again, is
 276 possible given unusual combinations of failures that allow the cluster
 277 to learn about writes that were performed before the writes themselves
 278 are recovered.  To mark the "unfound" objects as "lost"::
 279
 280         ceph pg 2.5 mark_unfound_lost revert|delete
 281
 282 This the final argument specifies how the cluster should deal with
 283 lost objects.
 284
 285 The "delete" option will forget about them entirely.
 286
 287 The "revert" option (not available for erasure coded pools) will
 288 either roll back to a previous version of the object or (if it was a
 289 new object) forget about it entirely.  Use this with caution, as it
 290 may confuse applications that expected the object to exist.
 291
 292
 293 Homeless Placement Groups
 294 =========================
 295
 296 It is possible for all OSDs that had copies of a given placement groups to fail.
 297 If that's the case, that subset of the object store is unavailable, and the
 298 monitor will receive no status updates for those placement groups.  To detect
 299 this situation, the monitor marks any placement group whose primary OSD has
 300 failed as ``stale``.  For example::
 301
 302         ceph health
 303         HEALTH_WARN 24 pgs stale; 3/300 in osds are down
 304
 305 You can identify which placement groups are ``stale``, and what the last OSDs to
 306 store them were, with::
 307
 308         ceph health detail
 309         HEALTH_WARN 24 pgs stale; 3/300 in osds are down
 310         ...
 311         pg 2.5 is stuck stale+active+remapped, last acting [2,0]
 312         ...
 313         osd.10 is down since epoch 23, last address 192.168.106.220:6800/11080
 314         osd.11 is down since epoch 13, last address 192.168.106.220:6803/11539
 315         osd.12 is down since epoch 24, last address 192.168.106.220:6806/11861
 316
 317 If we want to get placement group 2.5 back online, for example, this tells us that
 318 it was last managed by ``osd.0`` and ``osd.2``.  Restarting those ``ceph-osd``
 319 daemons will allow the cluster to recover that placement group (and, presumably,
 320 many others).
 321
 322
 323 Only a Few OSDs Receive Data
 324 ============================
 325
 326 If you have many nodes in your cluster and only a few of them receive data,
 327 `check`_ the number of placement groups in your pool. Since placement groups get
 328 mapped to OSDs, a small number of placement groups will not distribute across
 329 your cluster. Try creating a pool with a placement group count that is a
 330 multiple of the number of OSDs. See `Placement Groups`_ for details. The default
 331 placement group count for pools is not useful, but you can change it `here`_.
 332
 333
 334 Can't Write Data
 335 ================
 336
 337 If your cluster is up, but some OSDs are down and you cannot write data,
 338 check to ensure that you have the minimum number of OSDs running for the
 339 placement group. If you don't have the minimum number of OSDs running,
 340 Ceph will not allow you to write data because there is no guarantee
 341 that Ceph can replicate your data. See ``osd pool default min size``
 342 in the `Pool, PG and CRUSH Config Reference`_ for details.
 343
 344
 345 PGs Inconsistent
 346 ================
 347
 348 If you receive an ``active + clean + inconsistent`` state, this may happen
 349 due to an error during scrubbing. As always, we can identify the inconsistent
 350 placement group(s) with::
 351
 352     $ ceph health detail
 353     HEALTH_ERR 1 pgs inconsistent; 2 scrub errors
 354     pg 0.6 is active+clean+inconsistent, acting [0,1,2]
 355     2 scrub errors
 356
 357 Or if you prefer inspecting the output in a programmatic way::
 358
 359     $ rados list-inconsistent-pg rbd
 360     ["0.6"]
 361
 362 There is only one consistent state, but in the worst case, we could have
 363 different inconsistencies in multiple perspectives found in more than one
 364 objects. If an object named ``foo`` in PG ``0.6`` is truncated, we will have::
 365
 366     $ rados list-inconsistent-obj 0.6 --format=json-pretty
 367
 368 .. code-block:: javascript
 369
 370     {
 371         "epoch": 14,
 372         "inconsistents": [
 373             {
 374                 "object": {
 375                     "name": "foo",
 376                     "nspace": "",
 377                     "locator": "",
 378                     "snap": "head",
 379                     "version": 1
 380                 },
 381                 "errors": [
 382                     "data_digest_mismatch",
 383                     "size_mismatch"
 384                 ],
 385                 "union_shard_errors": [
 386                     "data_digest_mismatch_oi",
 387                     "size_mismatch_oi"
 388                 ],
 389                 "selected_object_info": "0:602f83fe:::foo:head(16'1 client.4110.0:1 dirty|data_digest|omap_digest s 968 uv 1 dd e978e67f od ffffffff alloc_hint [0 0 0])",
 390                 "shards": [
 391                     {
 392                         "osd": 0,
 393                         "errors": [],
 394                         "size": 968,
 395                         "omap_digest": "0xffffffff",
 396                         "data_digest": "0xe978e67f"
 397                     },
 398                     {
 399                         "osd": 1,
 400                         "errors": [],
 401                         "size": 968,
 402                         "omap_digest": "0xffffffff",
 403                         "data_digest": "0xe978e67f"
 404                     },
 405                     {
 406                         "osd": 2,
 407                         "errors": [
 408                             "data_digest_mismatch_oi",
 409                             "size_mismatch_oi"
 410                         ],
 411                         "size": 0,
 412                         "omap_digest": "0xffffffff",
 413                         "data_digest": "0xffffffff"
 414                     }
 415                 ]
 416             }
 417         ]
 418     }
 419
 420 In this case, we can learn from the output:
 421
 422 * The only inconsistent object is named ``foo``, and it is its head that has
 423   inconsistencies.
 424 * The inconsistencies fall into two categories:
 425
 426   * ``errors``: these errors indicate inconsistencies between shards without a
 427     determination of which shard(s) are bad. Check for the ``errors`` in the
 428     `shards` array, if available, to pinpoint the problem.
 429
 430     * ``data_digest_mismatch``: the digest of the replica read from OSD.2 is
 431       different from the ones of OSD.0 and OSD.1
 432     * ``size_mismatch``: the size of the replica read from OSD.2 is 0, while
 433       the size reported by OSD.0 and OSD.1 is 968.
 434   * ``union_shard_errors``: the union of all shard specific ``errors`` in
 435     ``shards`` array. The ``errors`` are set for the given shard that has the
 436     problem. They include errors like ``read_error``. The ``errors`` ending in
 437     ``oi`` indicate a comparison with ``selected_object_info``. Look at the
 438     ``shards`` array to determine which shard has which error(s).
 439
 440     * ``data_digest_mismatch_oi``: the digest stored in the object-info is not
 441       ``0xffffffff``, which is calculated from the shard read from OSD.2
 442     * ``size_mismatch_oi``: the size stored in the object-info is different
 443       from the one read from OSD.2. The latter is 0.
 444
 445 You can repair the inconsistent placement group by executing::
 446
 447         ceph pg repair {placement-group-ID}
 448
 449 Which overwrites the `bad` copies with the `authoritative` ones. In most cases,
 450 Ceph is able to choose authoritative copies from all available replicas using
 451 some predefined criteria. But this does not always work. For example, the stored
 452 data digest could be missing, and the calculated digest will be ignored when
 453 choosing the authoritative copies. So, please use the above command with caution.
 454
 455 If ``read_error`` is listed in the ``errors`` attribute of a shard, the
 456 inconsistency is likely due to disk errors. You might want to check your disk
 457 used by that OSD.
 458
 459 If you receive ``active + clean + inconsistent`` states periodically due to
 460 clock skew, you may consider configuring your `NTP`_ daemons on your
 461 monitor hosts to act as peers. See `The Network Time Protocol`_ and Ceph
 462 `Clock Settings`_ for additional details.
 463
 464
 465 Erasure Coded PGs are not active+clean
 466 ======================================
 467
 468 When CRUSH fails to find enough OSDs to map to a PG, it will show as a
 469 ``2147483647`` which is ITEM_NONE or ``no OSD found``. For instance::
 470
 471      [2,1,6,0,5,8,2147483647,7,4]
 472
 473 Not enough OSDs
 474 ---------------
 475
 476 If the Ceph cluster only has 8 OSDs and the erasure coded pool needs
 477 9, that is what it will show. You can either create another erasure
 478 coded pool that requires less OSDs::
 479
 480      ceph osd erasure-code-profile set myprofile k=5 m=3
 481      ceph osd pool create erasurepool 16 16 erasure myprofile
 482
 483 or add a new OSDs and the PG will automatically use them.
 484
 485 CRUSH constraints cannot be satisfied
 486 -------------------------------------
 487
 488 If the cluster has enough OSDs, it is possible that the CRUSH ruleset
 489 imposes constraints that cannot be satisfied. If there are 10 OSDs on
 490 two hosts and the CRUSH rulesets require that no two OSDs from the
 491 same host are used in the same PG, the mapping may fail because only
 492 two OSD will be found. You can check the constraint by displaying the
 493 ruleset::
 494
 495     $ ceph osd crush rule ls
 496     [
 497         "replicated_ruleset",
 498         "erasurepool"]
 499     $ ceph osd crush rule dump erasurepool
 500     { "rule_id": 1,
 501       "rule_name": "erasurepool",
 502       "ruleset": 1,
 503       "type": 3,
 504       "min_size": 3,
 505       "max_size": 20,
 506       "steps": [
 507             { "op": "take",
 508               "item": -1,
 509               "item_name": "default"},
 510             { "op": "chooseleaf_indep",
 511               "num": 0,
 512               "type": "host"},
 513             { "op": "emit"}]}
 514
 515
 516 You can resolve the problem by creating a new pool in which PGs are allowed
 517 to have OSDs residing on the same host with::
 518
 519      ceph osd erasure-code-profile set myprofile crush-failure-domain=osd
 520      ceph osd pool create erasurepool 16 16 erasure myprofile
 521
 522 CRUSH gives up too soon
 523 -----------------------
 524
 525 If the Ceph cluster has just enough OSDs to map the PG (for instance a
 526 cluster with a total of 9 OSDs and an erasure coded pool that requires
 527 9 OSDs per PG), it is possible that CRUSH gives up before finding a
 528 mapping. It can be resolved by:
 529
 530 * lowering the erasure coded pool requirements to use less OSDs per PG
 531   (that requires the creation of another pool as erasure code profiles
 532   cannot be dynamically modified).
 533
 534 * adding more OSDs to the cluster (that does not require the erasure
 535   coded pool to be modified, it will become clean automatically)
 536
 537 * use a hand made CRUSH ruleset that tries more times to find a good
 538   mapping. It can be done by setting ``set_choose_tries`` to a value
 539   greater than the default.
 540
 541 You should first verify the problem with ``crushtool`` after
 542 extracting the crushmap from the cluster so your experiments do not
 543 modify the Ceph cluster and only work on a local files::
 544
 545     $ ceph osd crush rule dump erasurepool
 546     { "rule_name": "erasurepool",
 547       "ruleset": 1,
 548       "type": 3,
 549       "min_size": 3,
 550       "max_size": 20,
 551       "steps": [
 552             { "op": "take",
 553               "item": -1,
 554               "item_name": "default"},
 555             { "op": "chooseleaf_indep",
 556               "num": 0,
 557               "type": "host"},
 558             { "op": "emit"}]}
 559     $ ceph osd getcrushmap > crush.map
 560     got crush map from osdmap epoch 13
 561     $ crushtool -i crush.map --test --show-bad-mappings \
 562        --rule 1 \
 563        --num-rep 9 \
 564        --min-x 1 --max-x $((1024 * 1024))
 565     bad mapping rule 8 x 43 num_rep 9 result [3,2,7,1,2147483647,8,5,6,0]
 566     bad mapping rule 8 x 79 num_rep 9 result [6,0,2,1,4,7,2147483647,5,8]
 567     bad mapping rule 8 x 173 num_rep 9 result [0,4,6,8,2,1,3,7,2147483647]
 568
 569 Where ``--num-rep`` is the number of OSDs the erasure code crush
 570 ruleset needs, ``--rule`` is the value of the ``ruleset`` field
 571 displayed by ``ceph osd crush rule dump``.  The test will try mapping
 572 one million values (i.e. the range defined by ``[--min-x,--max-x]``)
 573 and must display at least one bad mapping. If it outputs nothing it
 574 means all mappings are successfull and you can stop right there: the
 575 problem is elsewhere.
 576
 577 The crush ruleset can be edited by decompiling the crush map::
 578
 579     $ crushtool --decompile crush.map > crush.txt
 580
 581 and adding the following line to the ruleset::
 582
 583     step set_choose_tries 100
 584
 585 The relevant part of of the ``crush.txt`` file should look something
 586 like::
 587
 588      rule erasurepool {
 589              ruleset 1
 590              type erasure
 591              min_size 3
 592              max_size 20
 593              step set_chooseleaf_tries 5
 594              step set_choose_tries 100
 595              step take default
 596              step chooseleaf indep 0 type host
 597              step emit
 598      }
 599
 600 It can then be compiled and tested again::
 601
 602     $ crushtool --compile crush.txt -o better-crush.map
 603
 604 When all mappings succeed, an histogram of the number of tries that
 605 were necessary to find all of them can be displayed with the
 606 ``--show-choose-tries`` option of ``crushtool``::
 607
 608     $ crushtool -i better-crush.map --test --show-bad-mappings \
 609        --show-choose-tries \
 610        --rule 1 \
 611        --num-rep 9 \
 612        --min-x 1 --max-x $((1024 * 1024))
 613     ...
 614     11:        42
 615     12:        44
 616     13:        54
 617     14:        45
 618     15:        35
 619     16:        34
 620     17:        30
 621     18:        25
 622     19:        19
 623     20:        22
 624     21:        20
 625     22:        17
 626     23:        13
 627     24:        16
 628     25:        13
 629     26:        11
 630     27:        11
 631     28:        13
 632     29:        11
 633     30:        10
 634     31:         6
 635     32:         5
 636     33:        10
 637     34:         3
 638     35:         7
 639     36:         5
 640     37:         2
 641     38:         5
 642     39:         5
 643     40:         2
 644     41:         5
 645     42:         4
 646     43:         1
 647     44:         2
 648     45:         2
 649     46:         3
 650     47:         1
 651     48:         0
 652     ...
 653     102:         0
 654     103:         1
 655     104:         0
 656     ...
 657
 658 It took 11 tries to map 42 PGs, 12 tries to map 44 PGs etc. The highest number of tries is the minimum value of ``set_choose_tries`` that prevents bad mappings (i.e. 103 in the above output because it did not take more than 103 tries for any PG to be mapped).
 659
 660 .. _check: ../../operations/placement-groups#get-the-number-of-placement-groups
 661 .. _here: ../../configuration/pool-pg-config-ref
 662 .. _Placement Groups: ../../operations/placement-groups
 663 .. _Pool, PG and CRUSH Config Reference: ../../configuration/pool-pg-config-ref
 664 .. _NTP: http://en.wikipedia.org/wiki/Network_Time_Protocol
 665 .. _The Network Time Protocol: http://www.ntp.org/
 666 .. _Clock Settings: ../../configuration/mon-config-ref/#clock
 667
 668