Handling a full Ceph filesystem
===============================

When a RADOS cluster reaches its ``mon_osd_full_ratio`` (default
95%) capacity, it is marked with the OSD full flag. This flag causes
most normal RADOS clients to pause all operations until it is resolved
(for example by adding more capacity to the cluster).

The filesystem has some special handling of the full flag, explained below.

Hammer and later
----------------

Since the hammer release, a full filesystem will lead to ENOSPC
results from:

 * Data writes on the client
 * Metadata operations other than deletes and truncates

Because the full condition may not be encountered until
data is flushed to disk (some time after a ``write`` call has already
returned successfully), the ENOSPC error may not be seen until the
application calls ``fsync`` or ``fclose`` (or equivalent) on the file handle.

Calling ``fsync`` is guaranteed to reliably indicate whether the data
made it to disk, and will return an error if it did not. ``fclose`` will
only return an error if buffered data happened to be flushed since
the last write -- a successful ``fclose`` does not guarantee that the
data made it to disk, and in a full-space situation, buffered data
may be discarded after an ``fclose`` if no space is available to persist it.

.. warning::
    If an application appears to be misbehaving on a full filesystem,
    check that it is performing ``fsync()`` calls as necessary to ensure
    data is on disk before proceeding.

Data writes may be cancelled by the client if they are in flight at the
time the OSD full flag is sent. Clients update the ``osd_epoch_barrier``
when releasing capabilities on files affected by cancelled operations, in
order to ensure that these cancelled operations do not interfere with
subsequent access to the data objects by the MDS or other clients. For
more on the epoch barrier mechanism, see :ref:`background_blacklisting_and_osd_epoch_barrier`.

Legacy (pre-hammer) behavior
----------------------------

In versions of Ceph earlier than hammer, the MDS would ignore
the full status of the RADOS cluster, and any data writes from
clients would stall until the cluster ceased to be full.

There are two dangerous conditions to watch for with this behavior:

* If a client had pending writes to a file, then it was not possible
  for the client to release the file to the MDS for deletion: this could
  lead to difficulty clearing space on a full filesystem.
* If clients continued to create a large number of empty files, the
  resulting metadata writes from the MDS could lead to total exhaustion
  of space on the OSDs such that no further deletions could be performed.
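
Example: detecting ENOSPC in a client application
-------------------------------------------------

The sketch below is not part of the Ceph sources; the mount point and
file name are purely illustrative. It shows the pattern recommended in
the warning above: checking the result of ``fsync`` so that an ENOSPC
condition that only surfaces when buffered data is flushed to RADOS is
not silently ignored.

.. code-block:: c

    /* Minimal sketch: write data and confirm it reached disk with fsync(),
     * so that an ENOSPC raised only at flush time is detected. */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Hypothetical CephFS mount point and file name. */
        const char *path = "/mnt/cephfs/example.dat";
        const char buf[] = "some payload";

        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* write() may succeed even though the cluster is already full,
         * because the data is only buffered at this point. */
        if (write(fd, buf, sizeof(buf)) < 0) {
            perror("write");
            close(fd);
            return 1;
        }

        /* fsync() forces the data out to the OSDs; a full cluster
         * surfaces here as ENOSPC. */
        if (fsync(fd) < 0) {
            if (errno == ENOSPC)
                fprintf(stderr, "cluster is full: data was not persisted\n");
            else
                perror("fsync");
            close(fd);
            return 1;
        }

        close(fd);
        return 0;
    }

An application that skips the ``fsync`` check and relies only on the
return value of ``write`` or ``fclose`` may believe its data is safe
when, on a full cluster, it was never persisted.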