========================
 QEMU and Block Devices
========================

.. index:: Ceph Block Device; QEMU KVM

The most frequent Ceph Block Device use case involves providing block device
images to virtual machines. For example, a user may create a "golden" image
with an OS and any relevant software in an ideal configuration. Then, the user
takes a snapshot of the image. Finally, the user clones the snapshot (usually
many times). See `Snapshots`_ for details. The ability to make copy-on-write
clones of a snapshot means that Ceph can provision block device images to
virtual machines quickly, because the client doesn't have to download an entire
image each time it spins up a new virtual machine.


.. ditaa::  +---------------------------------------------------+
            |                       QEMU                        |
            +---------------------------------------------------+
            |                      librbd                       |
            +---------------------------------------------------+
            |                     librados                      |
            +------------------------+-+------------------------+
            |          OSDs          | |        Monitors        |
            +------------------------+ +------------------------+


Ceph Block Devices can integrate with the QEMU virtual machine. For details on
QEMU, see `QEMU Open Source Processor Emulator`_. For QEMU documentation, see
`QEMU Manual`_. For installation details, see `Installation`_.

.. important:: To use Ceph Block Devices with QEMU, you must have access to a
   running Ceph cluster.


Usage
=====

The QEMU command line expects you to specify the pool name and image name. You
may also specify a snapshot name.

QEMU will assume that the Ceph configuration file resides in the default
location (e.g., ``/etc/ceph/$cluster.conf``) and that you are executing
commands as the default ``client.admin`` user unless you expressly specify
another Ceph configuration file path or another user. When specifying a user,
QEMU uses the ``ID`` rather than the full ``TYPE:ID``. See `User Management -
User`_ for details. Do not prepend the client type (i.e., ``client.``) to the
beginning of the user ``ID``, or you will receive an authentication error. You
should have the key for the ``admin`` user, or the key of another user that you
specify with the ``:id={user}`` option, in a keyring file stored in a default
path (i.e., ``/etc/ceph`` or the local directory) with appropriate file
ownership and permissions. Usage takes the following form::

    qemu-img {command} [options] rbd:{pool-name}/{image-name}[@snapshot-name][:option1=value1][:option2=value2...]

For example, specifying the ``id`` and ``conf`` options might look like the following::

    qemu-img {command} [options] rbd:glance-pool/maipo:id=glance:conf=/etc/ceph/ceph.conf

.. tip:: Configuration values containing ``:``, ``@``, or ``=`` can be escaped with a
   leading ``\`` character.


Creating Images with QEMU
=========================

You can create a block device image from QEMU. You must specify ``rbd``, the
pool name, and the name of the image you wish to create. You must also specify
the size of the image.
::

    qemu-img create -f raw rbd:{pool-name}/{image-name} {size}

For example::

    qemu-img create -f raw rbd:data/foo 10G

.. important:: The ``raw`` data format is really the only sensible
   ``format`` option to use with RBD. Technically, you could use other
   QEMU-supported formats (such as ``qcow2`` or ``vmdk``), but doing
   so would add additional overhead, and would also render the volume
   unsafe for virtual machine live migration when caching (see below)
   is enabled.


Resizing Images with QEMU
=========================

You can resize a block device image from QEMU. You must specify ``rbd``,
the pool name, and the name of the image you wish to resize. You must also
specify the size of the image. ::

    qemu-img resize rbd:{pool-name}/{image-name} {size}

For example::

    qemu-img resize rbd:data/foo 10G


Retrieving Image Info with QEMU
===============================

You can retrieve block device image information from QEMU. You must
specify ``rbd``, the pool name, and the name of the image. ::

    qemu-img info rbd:{pool-name}/{image-name}

For example::

    qemu-img info rbd:data/foo


Running QEMU with RBD
=====================

QEMU can pass a block device from the host on to a guest, but since
QEMU 0.15, there's no need to map an image as a block device on
the host. Instead, QEMU can access an image as a virtual block
device directly via ``librbd``. This performs better because it avoids
an additional context switch, and can take advantage of `RBD caching`_.

You can use ``qemu-img`` to convert existing virtual machine images to Ceph
block device images. For example, if you have a qcow2 image, you could run::

    qemu-img convert -f qcow2 -O raw debian_squeeze.qcow2 rbd:data/squeeze

To run a virtual machine booting from that image, you could run::

    qemu -m 1024 -drive format=raw,file=rbd:data/squeeze

`RBD caching`_ can significantly improve performance.
Since QEMU 1.2, QEMU's cache options control ``librbd`` caching::

    qemu -m 1024 -drive format=rbd,file=rbd:data/squeeze,cache=writeback

If you have an older version of QEMU, you can set the ``librbd`` cache
configuration (like any Ceph configuration option) as part of the
``file`` parameter::

    qemu -m 1024 -drive format=raw,file=rbd:data/squeeze:rbd_cache=true,cache=writeback

.. important:: If you set ``rbd_cache=true``, you must set ``cache=writeback``
   or risk data loss. Without ``cache=writeback``, QEMU will not send
   flush requests to librbd. If QEMU exits uncleanly in this
   configuration, filesystems on top of rbd can be corrupted.

.. _RBD caching: ../rbd-config-ref/#rbd-cache-config-settings


.. index:: Ceph Block Device; discard trim and libvirt

Enabling Discard/TRIM
=====================

Since Ceph version 0.46 and QEMU version 1.1, Ceph Block Devices support the
discard operation. This means that a guest can send TRIM requests to let a Ceph
block device reclaim unused space. This can be enabled in the guest by mounting
``ext4`` or ``XFS`` with the ``discard`` option.

For this to be available to the guest, it must be explicitly enabled
for the block device. To do this, you must specify a
``discard_granularity`` associated with the drive::

    qemu -m 1024 -drive format=raw,file=rbd:data/squeeze,id=drive1,if=none \
         -device driver=ide-hd,drive=drive1,discard_granularity=512

Note that this uses the IDE driver. The virtio driver does not
support discard.
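
Inside the guest, the filesystem must still issue the TRIM requests, either by
mounting with the ``discard`` option or by running ``fstrim`` periodically. A
minimal sketch, assuming the guest sees the image as ``/dev/sdb1`` and mounts
it at ``/mnt`` (both names are only illustrative)::

    # mount with online discard enabled
    mount -o discard /dev/sdb1 /mnt

    # or keep the default mount options and reclaim space in batches
    fstrim /mnt
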

If using libvirt, edit your libvirt domain's configuration file using ``virsh
edit`` to include the ``xmlns:qemu`` value. Then, add a ``qemu:commandline``
block as a child of that domain. The following example shows how to set two
devices with ``qemu id=`` to different ``discard_granularity`` values.

.. code-block:: guess

    <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
        <qemu:commandline>
            <qemu:arg value='-set'/>
            <qemu:arg value='block.scsi0-0-0.discard_granularity=4096'/>
            <qemu:arg value='-set'/>
            <qemu:arg value='block.scsi0-0-1.discard_granularity=65536'/>
        </qemu:commandline>
    </domain>


.. index:: Ceph Block Device; cache options

QEMU Cache Options
==================

QEMU's cache options correspond to the following Ceph `RBD Cache`_ settings.

Writeback::

   rbd_cache = true

Writethrough::

   rbd_cache = true
   rbd_cache_max_dirty = 0

None::

   rbd_cache = false

QEMU's cache settings override Ceph's cache settings (including settings that
are explicitly set in the Ceph configuration file).

.. note:: Prior to QEMU v2.4.0, if you explicitly set `RBD Cache`_ settings
   in the Ceph configuration file, your Ceph settings override the QEMU cache
   settings.

.. _QEMU Open Source Processor Emulator: http://wiki.qemu.org/Main_Page
.. _QEMU Manual: http://wiki.qemu.org/Manual
.. _RBD Cache: ../rbd-config-ref/
.. _Snapshots: ../rbd-snapshot/
.. _Installation: ../../install
.. _User Management - User: ../../rados/operations/user-management#user
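
For instance, to boot the earlier ``rbd:data/squeeze`` example image with a
writethrough cache (the mode that corresponds to ``rbd_cache = true`` with
``rbd_cache_max_dirty = 0``), you could set the drive's ``cache`` option
accordingly. This is only a sketch reusing the image name from the examples
above::

    qemu -m 1024 -drive format=raw,file=rbd:data/squeeze,cache=writethrough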