src/ceph/doc/rados/configuration/storage-devices.rst

=================
 Storage Devices
=================

Two Ceph daemons store data on disk:

* **Ceph OSDs** (or Object Storage Daemons) store most of the data in
  Ceph.  Each OSD is usually backed by a single storage device, such
  as a traditional hard disk (HDD) or solid-state drive (SSD).  OSDs
  can also be backed by a combination of devices, such as an HDD for
  most data and an SSD (or a partition of an SSD) for some metadata.
  The number of OSDs in a cluster generally depends on how much data
  will be stored, the size of each storage device, and the level and
  type of redundancy (replication or erasure coding).
* **Ceph Monitor** daemons manage critical cluster state, such as
  cluster membership and authentication information.  Small clusters
  need only a few gigabytes of monitor storage, but in larger clusters
  the monitor database can reach tens or possibly hundreds of
  gigabytes.

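The relationship between stored data, device size, redundancy, and OSD
count can be sketched as simple arithmetic.  The following Python snippet
is an illustration only: the function ``estimate_osd_count``, its
parameters, and the ``fill_ratio`` headroom factor are assumptions made
for this example and are not part of Ceph or its tooling.

.. code-block:: python

    import math


    def estimate_osd_count(usable_tb, device_tb, overhead, fill_ratio=0.8):
        """Rough OSD count for a target amount of usable data.

        overhead is the raw capacity consumed per unit of user data:
          * replication with N copies         -> N
          * erasure coding with k data chunks
            and m coding chunks               -> (k + m) / k
        fill_ratio is the fraction of raw capacity we plan to fill
        (an assumed safety margin, not a Ceph setting).
        """
        if usable_tb < 0 or device_tb <= 0 or overhead < 1:
            raise ValueError("sizes must be positive and overhead >= 1")
        if not 0 < fill_ratio <= 1:
            raise ValueError("fill_ratio must be in (0, 1]")
        raw_tb = usable_tb * overhead / fill_ratio
        # One device per OSD, rounded up to whole devices.
        return math.ceil(raw_tb / device_tb)


    # 100 TB of user data on 8 TB devices.
    replicated = estimate_osd_count(100, 8, overhead=3)            # 3 copies -> 47
    erasure = estimate_osd_count(100, 8, overhead=(4 + 2) / 4)     # k=4, m=2 -> 24

The estimate ignores failure-domain layout, per-OSD metadata overhead, and
growth planning, so treat it as a starting point rather than a sizing rule.
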

OSD Backends
============

OSDs can manage the data they store in one of two ways.  As of the
Luminous 12.2.z release, the default (and recommended) backend is
*BlueStore*.  Before Luminous, the default (and only) option was
*FileStore*.

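In a cluster that mixes backends, for example during a migration from
FileStore, it can help to tally which backend each OSD uses.  The sketch
below assumes the JSON printed by ``ceph osd metadata`` is a list of
per-OSD records that each carry an ``osd_objectstore`` field.  The
``SAMPLE`` string is a hand-written, heavily trimmed stand-in for that
output, and ``count_backends`` is a hypothetical helper, not part of Ceph.

.. code-block:: python

    import json
    from collections import Counter


    def count_backends(osd_metadata_json):
        """Count OSDs per object store backend.

        Expects JSON text shaped like the output of ``ceph osd metadata``
        (an assumption; verify against your cluster's actual output).
        """
        records = json.loads(osd_metadata_json)
        return Counter(r.get("osd_objectstore", "unknown") for r in records)


    # Illustrative input only, not captured from a real cluster.
    SAMPLE = """
    [
      {"id": 0, "osd_objectstore": "bluestore"},
      {"id": 1, "osd_objectstore": "bluestore"},
      {"id": 2, "osd_objectstore": "filestore"}
    ]
    """

    backends = count_backends(SAMPLE)
    for name, count in sorted(backends.items()):
        print(f"{name}: {count}")
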
BlueStore
---------

BlueStore is a special-purpose storage backend designed specifically
for managing data on disk for Ceph OSD workloads.  Its design is
motivated by about ten years of experience supporting and managing
OSDs that use FileStore.  Key BlueStore features include:

* Direct management of storage devices.  BlueStore consumes raw block
  devices or partitions.  This avoids any intervening layers of
  abstraction (such as local file systems like XFS) that may limit
  performance or add complexity.
* Metadata management with RocksDB.  BlueStore embeds the RocksDB
  key/value database to manage internal metadata, such as the mapping
  from object names to block locations on disk.
* Full data and metadata checksumming.  By default, all data and
  metadata written to BlueStore are protected by one or more
  checksums.  No data or metadata is read from disk or returned to
  the user without being verified.
* Inline compression.  Data can optionally be compressed before it is
  written to disk.
* Multi-device metadata tiering.  BlueStore allows its internal
  journal (write-ahead log) to be written to a separate, high-speed
  device (like an SSD, NVMe, or NVDIMM) to increase performance.  If
  a significant amount of faster storage is available, internal
  metadata can also be stored on the faster device.
* Efficient copy-on-write.  RBD and CephFS snapshots rely on a
  copy-on-write *clone* mechanism that is implemented efficiently in
  BlueStore.  This results in efficient I/O both for regular snapshots
  and for erasure coded pools (which rely on cloning to implement
  efficient two-phase commits).

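Several of the features listed above can be illustrated with a toy model.
The Python sketch below is not BlueStore code: a ``bytearray`` stands in
for a raw block device, a ``dict`` stands in for the RocksDB metadata
store, and ``zlib.crc32`` stands in for the checksum (BlueStore's default
checksum is crc32c, a different CRC).  The class ``ToyBlockStore`` and all
of its names are hypothetical.

.. code-block:: python

    import zlib


    class ToyBlockStore:
        """Toy object store: raw 'device' plus a key/value metadata map."""

        def __init__(self, size):
            self.device = bytearray(size)   # stand-in for a raw block device
            self.metadata = {}              # name -> (offset, length, crc, compressed)
            self.next_free = 0              # naive append-only allocator

        def write(self, name, data, compress=False):
            # Optional inline compression before the data reaches the device.
            payload = zlib.compress(data) if compress else bytes(data)
            end = self.next_free + len(payload)
            if end > len(self.device):
                raise OSError("device full")
            self.device[self.next_free:end] = payload
            # Record where the object lives and a checksum of what was stored.
            self.metadata[name] = (
                self.next_free, len(payload), zlib.crc32(payload), compress)
            self.next_free = end

        def read(self, name):
            offset, length, crc, compressed = self.metadata[name]
            payload = bytes(self.device[offset:offset + length])
            # Verify before returning anything to the caller.
            if zlib.crc32(payload) != crc:
                raise OSError(f"checksum mismatch for {name!r}")
            return zlib.decompress(payload) if compressed else payload


    store = ToyBlockStore(4096)
    store.write("obj-a", b"hello ceph")
    store.write("obj-b", b"x" * 1000, compress=True)
    print(store.read("obj-a"))

    # Simulate silent corruption of one byte on the device.
    store.device[store.metadata["obj-a"][0]] ^= 0xFF
    try:
        store.read("obj-a")
    except OSError as exc:
        print("detected:", exc)

The point of the sketch is the read path: data is looked up through the
metadata map and verified against its checksum before it is returned, so
corruption surfaces as an error instead of as bad data.
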
For more information, see :doc:`bluestore-config-ref`.

FileStore
---------

FileStore is the legacy approach to storing objects in Ceph.  It
relies on a standard file system (normally XFS) in combination with a
key/value database (traditionally LevelDB, now RocksDB) for some
metadata.

FileStore is well tested and widely used in production, but it suffers
from many performance deficiencies because of its overall design and
its reliance on a traditional file system for storing object data.

Although FileStore is generally capable of functioning on most
POSIX-compatible file systems (including btrfs and ext4), we
recommend only XFS.  Both btrfs and ext4 have known bugs and
deficiencies, and their use may lead to data loss.  By default, all
Ceph provisioning tools use XFS.

For more information, see :doc:`filestore-config-ref`.