When you first deploy a cluster without creating a pool, Ceph uses the default
pools for storing data. A pool provides you with:

- **Resilience**: You can set how many OSDs are allowed to fail without losing data.
  For replicated pools, it is the desired number of copies/replicas of an object.
  A typical configuration stores an object and one additional copy
  (i.e., ``size = 2``), but you can determine the number of copies/replicas.
  For `erasure coded pools <../erasure-code>`_, it is the number of coding chunks
  (i.e. ``m=2`` in the **erasure code profile**).

- **Placement Groups**: You can set the number of placement groups for the pool.
  A typical configuration uses approximately 100 placement groups per OSD to
  provide optimal balancing without using up too many computing resources. When
  setting up multiple pools, be careful to ensure you set a reasonable number of
  placement groups for both the pool and the cluster as a whole.

- **CRUSH Rules**: When you store data in a pool, a CRUSH ruleset mapped to the
  pool enables CRUSH to identify a rule for the placement of the object
  and its replicas (or chunks for erasure coded pools) in your cluster.
  You can create a custom CRUSH rule for your pool.

- **Snapshots**: When you create snapshots with ``ceph osd pool mksnap``,
  you effectively take a snapshot of a particular pool.

To organize data into pools, you can list, create, and remove pools.
You can also view the utilization statistics for each pool.
List Pools
==========

To list your cluster's pools, execute::

    ceph osd lspools

On a freshly installed cluster, only the ``rbd`` pool exists.
Create a Pool
=============

Before creating pools, refer to the `Pool, PG and CRUSH Config Reference`_.
Ideally, you should override the default value for the number of placement
groups in your Ceph configuration file, as the default is NOT ideal.
For details on placement group numbers, refer to `setting the number of placement groups`_.

.. note:: Starting with Luminous, all pools need to be associated to the
   application using the pool. See `Associate Pool to Application`_ below for
   more information.

For example::

    osd pool default pg num = 100
    osd pool default pgp num = 100
To create a pool, execute::

    ceph osd pool create {pool-name} {pg-num} [{pgp-num}] [replicated] \
         [crush-rule-name] [expected-num-objects]
    ceph osd pool create {pool-name} {pg-num} {pgp-num} erasure \
         [erasure-code-profile] [crush-rule-name] [expected_num_objects]
Where:

``{pool-name}``

:Description: The name of the pool. It must be unique.

``{pg-num}``

:Description: The total number of placement groups for the pool. See `Placement
              Groups`_ for details on calculating a suitable number. The
              default value ``8`` is NOT suitable for most systems.

``{pgp-num}``

:Description: The total number of placement groups for placement purposes. This
              **should be equal to the total number of placement groups**, except
              for placement group splitting scenarios.

:Required: Yes. Picks up default or Ceph configuration value if not specified.
``{replicated|erasure}``

:Description: The pool type, which may either be **replicated** to
              recover from lost OSDs by keeping multiple copies of the
              objects or **erasure** to get a kind of
              `generalized RAID5 <../erasure-code>`_ capability.
              The **replicated** pools require more
              raw storage but implement all Ceph operations. The
              **erasure** pools require less raw storage but only
              implement a subset of the available operations.

``[crush-rule-name]``

:Description: The name of a CRUSH rule to use for this pool. The specified
              rule must exist.

:Default: For **replicated** pools it is the ruleset specified by the ``osd
          pool default crush replicated ruleset`` config variable. This
          ruleset must exist.
          For **erasure** pools it is ``erasure-code`` if the ``default``
          `erasure code profile`_ is used or ``{pool-name}`` otherwise. This
          ruleset will be created implicitly if it doesn't exist already.

``[erasure-code-profile=profile]``

.. _erasure code profile: ../erasure-code-profile

:Description: For **erasure** pools only. Use the `erasure code profile`_. It
              must be an existing profile as defined by
              **osd erasure-code-profile set**.
When you create a pool, set the number of placement groups to a reasonable value
(e.g., ``100``). Consider the total number of placement groups per OSD too.
Placement groups are computationally expensive, so performance will degrade when
you have many pools with many placement groups (e.g., 50 pools with 100
placement groups each). The point of diminishing returns depends upon the power
of the OSD host.

See `Placement Groups`_ for details on calculating an appropriate number of
placement groups for your pool.
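As a rough worked example of the usual rule of thumb (about 100 placement groups
per OSD, divided by the pool's replica count, rounded up to the nearest power of
two), assume a hypothetical cluster with 10 OSDs and a replicated pool of
``size = 3``::

    (10 OSDs * 100) / 3 replicas = ~333  ->  round up to the next power of two: 512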
.. _Placement Groups: ../placement-groups
``[expected-num-objects]``

:Description: The expected number of objects for this pool. By setting this value
              (together with a negative **filestore merge threshold**), the PG folder
              splitting happens at pool creation time, which avoids the latency
              impact of splitting folders at runtime.

:Default: ``0``, no splitting at the pool creation time.
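Putting this together, a minimal sketch of creating one replicated and one
erasure-coded pool (the pool names and PG counts below are hypothetical)::

    ceph osd pool create my-rep-pool 128 128 replicated
    ceph osd pool create my-ec-pool 128 128 erasure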
Associate Pool to Application
=============================

Pools need to be associated with an application before use. Pools that will be
used with CephFS or pools that are automatically created by RGW are
automatically associated. Pools that are intended for use with RBD should be
initialized using the ``rbd`` tool (see `Block Device Commands`_ for more
information).

For other cases, you can manually associate a free-form application name to
a pool. ::

    ceph osd pool application enable {pool-name} {application-name}

.. note:: CephFS uses the application name ``cephfs``, RBD uses the
   application name ``rbd``, and RGW uses the application name ``rgw``.
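For example, a minimal sketch that tags a hypothetical pool ``mypool`` with a
custom application name::

    ceph osd pool application enable mypool myapp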
Set Pool Quotas
===============

You can set pool quotas for the maximum number of bytes and/or the maximum
number of objects per pool. ::

    ceph osd pool set-quota {pool-name} [max_objects {obj-count}] [max_bytes {bytes}]

For example::

    ceph osd pool set-quota data max_objects 10000
To remove a quota, set its value to ``0``.
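For instance, to clear the object quota set above::

    ceph osd pool set-quota data max_objects 0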
Delete a Pool
=============

To delete a pool, execute::

    ceph osd pool delete {pool-name} [{pool-name} --yes-i-really-really-mean-it]

To remove a pool, the ``mon_allow_pool_delete`` flag must be set to ``true`` in the
Monitor's configuration. Otherwise the monitors will refuse to remove the pool.
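For example, a sketch of temporarily allowing deletion at runtime and then
disabling it again (``mypool`` is a hypothetical pool; settings injected this
way do not persist across monitor restarts)::

    ceph tell mon.* injectargs --mon-allow-pool-delete=true
    ceph osd pool delete mypool mypool --yes-i-really-really-mean-it
    ceph tell mon.* injectargs --mon-allow-pool-delete=false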
See `Monitor Configuration`_ for more information.

.. _Monitor Configuration: ../../configuration/mon-config-ref

If you created your own rulesets and rules for a pool you created, you should
consider removing them when you no longer need your pool::

    ceph osd pool get {pool-name} crush_ruleset

If the ruleset was "123", for example, you can check the other pools like so::

    ceph osd dump | grep "^pool" | grep "crush_ruleset 123"

If no other pools use that custom ruleset, then it's safe to delete that
ruleset from the cluster.
If you created users with permissions strictly for a pool that no longer
exists, you should consider deleting those users too::

    ceph auth ls | grep -C 5 {pool-name}
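Any matching users can then be removed. For example, assuming a hypothetical
user that was created only for the deleted pool::

    ceph auth del client.mypooluser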
Rename a Pool
=============

To rename a pool, execute::

    ceph osd pool rename {current-pool-name} {new-pool-name}

If you rename a pool and you have per-pool capabilities for an authenticated
user, you must update the user's capabilities (i.e., caps) with the new pool
name.
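For example, a sketch of re-issuing caps for a hypothetical user ``client.foo``
after renaming its pool to ``newpool`` (note that ``ceph auth caps`` replaces the
user's existing capabilities, so specify the full set)::

    ceph auth caps client.foo mon 'allow r' osd 'allow rwx pool=newpool'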
.. note:: Version ``0.48`` Argonaut and above.
Show Pool Statistics
====================

To show a pool's utilization statistics, execute::

    rados df
Make a Snapshot of a Pool
=========================

To make a snapshot of a pool, execute::

    ceph osd pool mksnap {pool-name} {snap-name}
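For example, with hypothetical pool and snapshot names::

    ceph osd pool mksnap mypool mypool-snap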
.. note:: Version ``0.48`` Argonaut and above.
Remove a Snapshot of a Pool
===========================

To remove a snapshot of a pool, execute::

    ceph osd pool rmsnap {pool-name} {snap-name}

.. note:: Version ``0.48`` Argonaut and above.
Set Pool Values
===============

To set a value for a pool, execute the following::

    ceph osd pool set {pool-name} {key} {value}

You may set values for the following keys:
.. _compression_algorithm:

``compression_algorithm``

:Description: Sets the inline compression algorithm to use for underlying BlueStore.
              This setting overrides the `global setting <rados/configuration/bluestore-config-ref/#inline-compression>`_ of ``bluestore compression algorithm``.

:Valid Settings: ``lz4``, ``snappy``, ``zlib``, ``zstd``

``compression_mode``

:Description: Sets the policy for the inline compression algorithm for underlying BlueStore.
              This setting overrides the `global setting <rados/configuration/bluestore-config-ref/#inline-compression>`_ of ``bluestore compression mode``.

:Valid Settings: ``none``, ``passive``, ``aggressive``, ``force``
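For instance, a minimal sketch that enables aggressive ``zstd`` compression on a
hypothetical pool::

    ceph osd pool set mypool compression_algorithm zstd
    ceph osd pool set mypool compression_mode aggressive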
``compression_min_blob_size``

:Description: Chunks smaller than this are never compressed.
              This setting overrides the `global setting <rados/configuration/bluestore-config-ref/#inline-compression>`_ of ``bluestore compression min blob *``.

:Type: Unsigned Integer

``compression_max_blob_size``

:Description: Chunks larger than this are broken into smaller blobs of at most
              ``compression_max_blob_size`` before being compressed.

:Type: Unsigned Integer

.. _size:

``size``

:Description: Sets the number of replicas for objects in the pool.
              See `Set the Number of Object Replicas`_ for further details.
              Replicated pools only.

.. _min_size:

``min_size``

:Description: Sets the minimum number of replicas required for I/O.
              See `Set the Number of Object Replicas`_ for further details.
              Replicated pools only.

:Version: ``0.54`` and above

.. _pg_num:

``pg_num``

:Description: The effective number of placement groups to use when calculating
              data placement.

:Valid Range: Greater than the current ``pg_num`` value.

.. _pgp_num:

``pgp_num``

:Description: The effective number of placement groups for placement to use
              when calculating data placement.

:Valid Range: Equal to or less than ``pg_num``.
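For example, a sketch of growing a hypothetical pool's placement groups; adjust
``pgp_num`` to match so that the new placement groups are actually used for
placement::

    ceph osd pool set mypool pg_num 256
    ceph osd pool set mypool pgp_num 256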
.. _crush_ruleset:

``crush_ruleset``

:Description: The ruleset to use for mapping object placement in the cluster.

.. _allow_ec_overwrites:

``allow_ec_overwrites``

:Description: Whether writes to an erasure coded pool can update part
              of an object, so CephFS and RBD can use it. See
              `Erasure Coding with Overwrites`_ for more details.

:Version: ``12.2.0`` and above

``hashpspool``

:Description: Set/Unset HASHPSPOOL flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
:Version: ``0.48`` Argonaut and above.

``nodelete``

:Description: Set/Unset NODELETE flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
:Version: ``FIXME``

``nopgchange``

:Description: Set/Unset NOPGCHANGE flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
:Version: ``FIXME``

``nosizechange``

:Description: Set/Unset NOSIZECHANGE flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
:Version: ``FIXME``
.. _write_fadvise_dontneed:

``write_fadvise_dontneed``

:Description: Set/Unset WRITE_FADVISE_DONTNEED flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag

``noscrub``

:Description: Set/Unset NOSCRUB flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag

``nodeep-scrub``

:Description: Set/Unset NODEEP_SCRUB flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag

.. _hit_set_type:

``hit_set_type``

:Description: Enables hit set tracking for cache pools.
              See `Bloom Filter`_ for additional information.

:Valid Settings: ``bloom``, ``explicit_hash``, ``explicit_object``
:Default: ``bloom``. Other values are for testing.

.. _hit_set_count:

``hit_set_count``

:Description: The number of hit sets to store for cache pools. The higher
              the number, the more RAM consumed by the ``ceph-osd`` daemon.

:Valid Range: ``1``. Agent doesn't handle > 1 yet.

.. _hit_set_period:

``hit_set_period``

:Description: The duration of a hit set period in seconds for cache pools.
              The higher the number, the more RAM consumed by the
              ``ceph-osd`` daemon.

:Example: ``3600`` 1hr

.. _hit_set_fpp:

``hit_set_fpp``

:Description: The false positive probability for the ``bloom`` hit set type.
              See `Bloom Filter`_ for additional information.

:Valid Range: 0.0 - 1.0
.. _cache_target_dirty_ratio:

``cache_target_dirty_ratio``

:Description: The percentage of the cache pool containing modified (dirty)
              objects before the cache tiering agent will flush them to the
              backing storage pool.

.. _cache_target_dirty_high_ratio:

``cache_target_dirty_high_ratio``

:Description: The percentage of the cache pool containing modified (dirty)
              objects before the cache tiering agent will flush them to the
              backing storage pool at a higher speed.

.. _cache_target_full_ratio:

``cache_target_full_ratio``

:Description: The percentage of the cache pool containing unmodified (clean)
              objects before the cache tiering agent will evict them from the
              cache pool.

.. _target_max_bytes:

``target_max_bytes``

:Description: Ceph will begin flushing or evicting objects when the
              ``max_bytes`` threshold is triggered.

:Example: ``1000000000000`` #1-TB

.. _target_max_objects:

``target_max_objects``

:Description: Ceph will begin flushing or evicting objects when the
              ``max_objects`` threshold is triggered.

:Example: ``1000000`` #1M objects

``hit_set_grade_decay_rate``

:Description: Temperature decay rate between two successive hit_sets.

:Valid Range: 0 - 100

``hit_set_search_last_n``

:Description: Count at most N appearances in hit_sets for temperature calculation.

:Valid Range: 0 - hit_set_count

.. _cache_min_flush_age:

``cache_min_flush_age``

:Description: The time (in seconds) before the cache tiering agent will flush
              an object from the cache pool to the storage pool.

:Example: ``600`` 10min

.. _cache_min_evict_age:

``cache_min_evict_age``

:Description: The time (in seconds) before the cache tiering agent will evict
              an object from the cache pool.

:Example: ``1800`` 30min
.. _fast_read:

``fast_read``

:Description: On an erasure-coded pool, if this flag is turned on, the read
              request issues sub-reads to all shards and waits until it receives
              enough shards to decode and serve the client. With the jerasure and
              isa erasure plugins, once the first K replies return, the client's
              request is served immediately using the data decoded from these
              replies. This trades some resources for better performance.
              Currently this flag is only supported for erasure-coded pools.

.. _scrub_min_interval:

``scrub_min_interval``

:Description: The minimum interval in seconds for pool scrubbing when
              load is low. If it is 0, the value ``osd_scrub_min_interval``
              from config is used.

.. _scrub_max_interval:

``scrub_max_interval``

:Description: The maximum interval in seconds for pool scrubbing
              irrespective of cluster load. If it is 0, the value
              ``osd_scrub_max_interval`` from config is used.

.. _deep_scrub_interval:

``deep_scrub_interval``

:Description: The interval in seconds for pool "deep" scrubbing. If it
              is 0, the value ``osd_deep_scrub_interval`` from config is used.
Get Pool Values
===============

To get a value from a pool, execute the following::

    ceph osd pool get {pool-name} {key}
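For example, to read back the placement group count of a hypothetical pool::

    ceph osd pool get mypool pg_num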
You may get values for the following keys:
``size``

:Description: see size_

``min_size``

:Description: see min_size_

:Version: ``0.54`` and above

``pg_num``

:Description: see pg_num_

``pgp_num``

:Description: see pgp_num_

:Valid Range: Equal to or less than ``pg_num``.

``crush_ruleset``

:Description: see crush_ruleset_

``hit_set_type``

:Description: see hit_set_type_

:Valid Settings: ``bloom``, ``explicit_hash``, ``explicit_object``

``hit_set_count``

:Description: see hit_set_count_

``hit_set_period``

:Description: see hit_set_period_

``hit_set_fpp``

:Description: see hit_set_fpp_
``cache_target_dirty_ratio``

:Description: see cache_target_dirty_ratio_

``cache_target_dirty_high_ratio``

:Description: see cache_target_dirty_high_ratio_

``cache_target_full_ratio``

:Description: see cache_target_full_ratio_

``target_max_bytes``

:Description: see target_max_bytes_

``target_max_objects``

:Description: see target_max_objects_

``cache_min_flush_age``

:Description: see cache_min_flush_age_

``cache_min_evict_age``

:Description: see cache_min_evict_age_

``fast_read``

:Description: see fast_read_

``scrub_min_interval``

:Description: see scrub_min_interval_

``scrub_max_interval``

:Description: see scrub_max_interval_

``deep_scrub_interval``

:Description: see deep_scrub_interval_
Set the Number of Object Replicas
=================================

To set the number of object replicas on a replicated pool, execute the following::

    ceph osd pool set {poolname} size {num-replicas}

.. important:: The ``{num-replicas}`` includes the object itself.
   If you want the object and two copies of the object for a total of
   three instances of the object, specify ``3``.

For example::

    ceph osd pool set data size 3
You may execute this command for each pool. **Note:** An object might accept
I/Os in degraded mode with fewer than ``pool size`` replicas. To set a minimum
number of required replicas for I/O, you should use the ``min_size`` setting.
For example::

    ceph osd pool set data min_size 2

This ensures that no object in the data pool will receive I/O with fewer than
``min_size`` replicas.
Get the Number of Object Replicas
=================================

To get the number of object replicas, execute the following::

    ceph osd dump | grep 'replicated size'

Ceph will list the pools, with the ``replicated size`` attribute highlighted.
By default, Ceph creates two replicas of an object (a total of three copies, or
a size of ``3``).
.. _Pool, PG and CRUSH Config Reference: ../../configuration/pool-pg-config-ref
.. _Bloom Filter: http://en.wikipedia.org/wiki/Bloom_filter
.. _setting the number of placement groups: ../placement-groups#set-the-number-of-placement-groups
.. _Erasure Coding with Overwrites: ../erasure-code#erasure-coding-with-overwrites
.. _Block Device Commands: ../../../rbd/rados-rbd-cmds/#create-a-block-device-pool