
==============================
 Manual Deployment on FreeBSD
==============================

This is largely a copy of the regular Manual Deployment with FreeBSD specifics.
The difference lies in two parts: the underlying disk format, and the way to use
the tools.

All Ceph clusters require at least one monitor, and at least as many OSDs as
copies of an object stored on the cluster.  Bootstrapping the initial monitor(s)
is the first step in deploying a Ceph Storage Cluster. Monitor deployment also
sets important criteria for the entire cluster, such as the number of replicas
for pools, the number of placement groups per OSD, the heartbeat intervals,
whether authentication is required, etc. Most of these values are set by
default, so it's useful to know about them when setting up your cluster for
production.

Following the same configuration as `Installation (Quick)`_, we will set up a
cluster with ``node1`` as the monitor node, and ``node2`` and ``node3`` for
OSD nodes.



.. ditaa::
           /------------------\         /----------------\
           |    Admin Node    |         |     node1      |
           |                  +-------->+                |
           |                  |         | cCCC           |
           \---------+--------/         \----------------/
                     |
                     |                  /----------------\
                     |                  |     node2      |
                     +----------------->+                |
                     |                  | cCCC           |
                     |                  \----------------/
                     |
                     |                  /----------------\
                     |                  |     node3      |
                     +----------------->|                |
                                        | cCCC           |
                                        \----------------/



Disk layout on FreeBSD
======================

The current implementation works on ZFS pools.

* All Ceph data is created in /var/lib/ceph
* Log files go into /var/log/ceph
* PID files go into /var/log/run
* One ZFS pool is allocated per OSD, like::

    gpart create -s GPT ada1
    gpart add -t freebsd-zfs -l osd1 ada1
    zpool create -m /var/lib/ceph/osd/osd.1 osd1 gpt/osd1

* Some cache and log (ZIL) devices can be attached.
  Please note that this is different from the Ceph journals. Cache and log are
  totally transparent to Ceph, and help the filesystem keep the system
  consistent and improve performance.
  Assuming that ada2 is an SSD::

    gpart create -s GPT ada2
    gpart add -t freebsd-zfs -l osd1-log -s 1G ada2
    zpool add osd1 log gpt/osd1-log
    gpart add -t freebsd-zfs -l osd1-cache -s 10G ada2
    zpool add osd1 cache gpt/osd1-cache

* Note: *UFS2 does not allow large extended attributes (xattrs)*
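The per-OSD pool layout can be parameterised by OSD number and disk. The
following is a minimal sketch that only prints the commands and does not run
them. ``osd_pool_cmds`` is a hypothetical helper, and the pool name
``osd<N>`` with a GPT label of the same name is an assumption of this sketch.

```shell
#!/bin/sh
# Print (do not execute) the gpart/zpool commands for one OSD.
# osd_pool_cmds is a hypothetical helper; the pool name osd<N> and the
# matching GPT label are assumptions of this sketch.
osd_pool_cmds() {
    num=$1    # OSD number, e.g. 1
    disk=$2   # data disk, e.g. ada1
    echo "gpart create -s GPT ${disk}"
    echo "gpart add -t freebsd-zfs -l osd${num} ${disk}"
    echo "zpool create -m /var/lib/ceph/osd/osd.${num} osd${num} gpt/osd${num}"
}

# Show the commands for OSD 1 on disk ada1.
osd_pool_cmds 1 ada1
```

Review the printed commands before running them as root, since ``gpart create``
writes a new partition table to the disk.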


Configuration
-------------

By FreeBSD convention, third-party software is installed under ``/usr/local/``,
so the default location of ``ceph.conf`` is ``/usr/local/etc/ceph/ceph.conf``
rather than ``/etc/ceph/ceph.conf``. The simplest approach is to create a
symbolic link from ``/etc/ceph`` to ``/usr/local/etc/ceph``::

  ln -s /usr/local/etc/ceph /etc/ceph

A sample file is provided in ``/usr/local/share/doc/ceph/sample.ceph.conf``.
Most tools find ``/usr/local/etc/ceph/ceph.conf`` on their own; the link
also helps scripts from extra tools or discussion lists that expect
``/etc/ceph/ceph.conf``.
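Creating the link only when nothing exists at ``/etc/ceph`` keeps the step safe
to repeat. The sketch below is one way to do that. ``link_ceph_conf_dir`` is a
hypothetical helper, and its optional root prefix is an assumption added so the
function can be tried in a scratch directory.

```shell
#!/bin/sh
# link_ceph_conf_dir is a hypothetical helper. The optional first argument is
# a root prefix (an assumption of this sketch) so the function can be tried
# in a scratch directory instead of on the live system.
link_ceph_conf_dir() {
    root=${1:-}
    target="${root}/usr/local/etc/ceph"
    link="${root}/etc/ceph"
    # Leave any existing file, directory or link untouched.
    if [ -L "$link" ] || [ -e "$link" ]; then
        echo "exists: $link"
        return 0
    fi
    mkdir -p "${root}/etc"
    ln -s "$target" "$link" && echo "linked: $link -> $target"
}

# Demonstration in a temporary directory; the second call changes nothing.
scratch=$(mktemp -d)
mkdir -p "$scratch/usr/local/etc/ceph"
link_ceph_conf_dir "$scratch"
link_ceph_conf_dir "$scratch"
rm -rf "$scratch"
```

Called without an argument, the function operates on the real ``/etc/ceph``
and needs root privileges.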

Monitor Bootstrapping
=====================

Bootstrapping a monitor (a Ceph Storage Cluster, in theory) requires
a number of things:

- **Unique Identifier:** The ``fsid`` is a unique identifier for the cluster,
  and stands for File System ID from the days when the Ceph Storage Cluster was
  principally for the Ceph Filesystem. Ceph now supports native interfaces,
  block devices, and object storage gateway interfaces too, so ``fsid`` is a
  bit of a misnomer.

- **Cluster Name:** Ceph clusters have a cluster name, which is a simple string
  without spaces. The default cluster name is ``ceph``, but you may specify
  a different cluster name. Overriding the default cluster name is
  especially useful when you are working with multiple clusters and you need to
  clearly understand which cluster you are working with.

  For example, when you run multiple clusters in a `federated architecture`_,
  the cluster name (e.g., ``us-west``, ``us-east``) identifies the cluster for
  the current CLI session. **Note:** To identify the cluster name on the
  command line interface, specify a Ceph configuration file with the
  cluster name (e.g., ``ceph.conf``, ``us-west.conf``, ``us-east.conf``, etc.).
  Also see CLI usage (``ceph --cluster {cluster-name}``).

- **Monitor Name:** Each monitor instance within a cluster has a unique name.
  In common practice, the Ceph Monitor name is the host name (we recommend one
  Ceph Monitor per host, and no commingling of Ceph OSD Daemons with
  Ceph Monitors). You may retrieve the short hostname with ``hostname -s``.

- **Monitor Map:** Bootstrapping the initial monitor(s) requires you to
  generate a monitor map. The monitor map requires the ``fsid``, the cluster
  name (or uses the default), and at least one host name and its IP address.

- **Monitor Keyring:** Monitors communicate with each other via a
  secret key. You must generate a keyring with a monitor secret and provide
  it when bootstrapping the initial monitor(s).

- **Administrator Keyring:** To use the ``ceph`` CLI tools, you must have
  a ``client.admin`` user. So you must generate the admin user and keyring,
  and you must also add the ``client.admin`` user to the monitor keyring.

The foregoing requirements do not imply the creation of a Ceph Configuration
file. However, as a best practice, we recommend creating a Ceph configuration
file and populating it with the ``fsid``, the ``mon initial members`` and the
``mon host`` settings.

You can get and set all of the monitor settings at runtime as well. However,
a Ceph Configuration file may contain only those settings that override the
default values. When you add settings to a Ceph configuration file, these
settings override the default settings. Maintaining those settings in a
Ceph configuration file makes it easier to maintain your cluster.
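As an illustration of the recommended minimum, the sketch below writes a
configuration file containing only ``fsid``, ``mon initial members`` and
``mon host``. ``write_min_conf`` is a hypothetical helper. The values are the
example values used in the procedure, and the output goes to a scratch file
rather than the live configuration.

```shell
#!/bin/sh
# write_min_conf is a hypothetical helper: it writes the three recommended
# settings into the file named by its first argument.
write_min_conf() {
    out=$1; fsid=$2; members=$3; hosts=$4
    cat > "$out" <<EOF
[global]
fsid = ${fsid}
mon initial members = ${members}
mon host = ${hosts}
EOF
}

# Example values from this document, written to a scratch file.
conf=$(mktemp)
write_min_conf "$conf" a7f64266-0894-4f1e-a635-d0aeaca0e993 node1 192.168.0.1
cat "$conf"
rm -f "$conf"
```

Every other setting then keeps its default value until it is added to the
file explicitly.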

The procedure is as follows:


#. Log in to the initial monitor node(s)::

        ssh {hostname}

   For example::

        ssh node1


#. Ensure you have a directory for the Ceph configuration file. By default,
   Ceph uses ``/etc/ceph``. When you install ``ceph``, the installer will
   create the ``/etc/ceph`` directory automatically. ::

        ls /etc/ceph

   **Note:** Deployment tools may remove this directory when purging a
   cluster (e.g., ``ceph-deploy purgedata {node-name}``, ``ceph-deploy purge
   {node-name}``).

#. Create a Ceph configuration file. By default, Ceph uses
   ``ceph.conf``, where ``ceph`` reflects the cluster name. ::

        sudo vim /etc/ceph/ceph.conf


#. Generate a unique ID (i.e., ``fsid``) for your cluster. ::

        uuidgen


#. Add the unique ID to your Ceph configuration file. ::

        fsid = {UUID}

   For example::

        fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993


#. Add the initial monitor(s) to your Ceph configuration file. ::

        mon initial members = {hostname}[,{hostname}]

   For example::

        mon initial members = node1


#. Add the IP address(es) of the initial monitor(s) to your Ceph configuration
   file and save the file. ::

        mon host = {ip-address}[,{ip-address}]

   For example::

        mon host = 192.168.0.1

   **Note:** You may use IPv6 addresses instead of IPv4 addresses, but
   you must set ``ms bind ipv6`` to ``true``. See `Network Configuration
   Reference`_ for details about network configuration.

#. Create a keyring for your cluster and generate a monitor secret key. ::

        ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'


#. Generate an administrator keyring, generate a ``client.admin`` user and add
   the user to the keyring. ::

        sudo ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --set-uid=0 --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'


#. Add the ``client.admin`` key to the ``ceph.mon.keyring``. ::

        ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring


#. Generate a monitor map using the hostname(s), host IP address(es) and the FSID.
   Save it as ``/tmp/monmap``::

        monmaptool --create --add {hostname} {ip-address} --fsid {uuid} /tmp/monmap

   For example::

        monmaptool --create --add node1 192.168.0.1 --fsid a7f64266-0894-4f1e-a635-d0aeaca0e993 /tmp/monmap


#. Create a default data directory (or directories) on the monitor host(s). ::

        sudo mkdir /var/lib/ceph/mon/{cluster-name}-{hostname}

   For example::

        sudo mkdir /var/lib/ceph/mon/ceph-node1

   See `Monitor Config Reference - Data`_ for details.

#. Populate the monitor daemon(s) with the monitor map and keyring. ::

        sudo -u ceph ceph-mon [--cluster {cluster-name}] --mkfs -i {hostname} --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring

   For example::

        sudo -u ceph ceph-mon --mkfs -i node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring


#. Consider settings for a Ceph configuration file. Common settings include
   the following::

        [global]
        fsid = {cluster-id}
        mon initial members = {hostname}[, {hostname}]
        mon host = {ip-address}[, {ip-address}]
        public network = {network}[, {network}]
        cluster network = {network}[, {network}]
        auth cluster required = cephx
        auth service required = cephx
        auth client required = cephx
        osd journal size = {n}
        osd pool default size = {n}  # Write an object n times.
        osd pool default min size = {n} # Allow writing n copies in a degraded state.
        osd pool default pg num = {n}
        osd pool default pgp num = {n}
        osd crush chooseleaf type = {n}

   In the foregoing example, the ``[global]`` section of the configuration might
   look like this::

        [global]
        fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
        mon initial members = node1
        mon host = 192.168.0.1
        public network = 192.168.0.0/24
        auth cluster required = cephx
        auth service required = cephx
        auth client required = cephx
        osd journal size = 1024
        osd pool default size = 2
        osd pool default min size = 1
        osd pool default pg num = 333
        osd pool default pgp num = 333
        osd crush chooseleaf type = 1

#. Touch the ``done`` file.

   Mark that the monitor is created and ready to be started::

        sudo touch /var/lib/ceph/mon/ceph-node1/done

#. On FreeBSD, an entry for every monitor needs to be added to the
   configuration file. (This requirement will be removed in future releases.)

   The entry should look like::

     [mon]
         [mon.node1]
             host = node1    # a host name that can be resolved


#. Start the monitor(s).

   For Ubuntu, use Upstart::

        sudo start ceph-mon id=node1 [cluster={cluster-name}]

   In this case, to allow the start of the daemon at each reboot you
   must create an empty file like this::

        sudo touch /var/lib/ceph/mon/{cluster-name}-{hostname}/upstart

   For example::

        sudo touch /var/lib/ceph/mon/ceph-node1/upstart

   For Debian/CentOS/RHEL, use sysvinit::

        sudo /etc/init.d/ceph start mon.node1

   For FreeBSD, use the rc.d init scripts (called bsdrc in Ceph)::

        sudo service ceph start mon.node1

   For this to work, ``/etc/rc.conf`` also needs an entry to enable Ceph
   (run as root)::

        echo 'ceph_enable="YES"' >> /etc/rc.conf


#. Verify that Ceph created the default pools. ::

        ceph osd lspools

   You should see output like this::

        0 data,1 metadata,2 rbd,


#. Verify that the monitor is running. ::

        ceph -s

   You should see output that the monitor you started is up and running, and
   you should see a health error indicating that placement groups are stuck
   inactive. It should look something like this::

        cluster a7f64266-0894-4f1e-a635-d0aeaca0e993
          health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
          monmap e1: 1 mons at {node1=192.168.0.1:6789/0}, election epoch 1, quorum 0 node1
          osdmap e1: 0 osds: 0 up, 0 in
          pgmap v2: 192 pgs, 3 pools, 0 bytes data, 0 objects
             0 kB used, 0 kB / 0 kB avail
             192 creating

   **Note:** Once you add OSDs and start them, the placement group health errors
   should disappear. See the next section for details.
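The ``ceph_enable`` entry that the FreeBSD start step relies on should appear
in ``/etc/rc.conf`` only once. A minimal sketch of an append that is safe to
repeat follows. ``enable_ceph_rc`` is a hypothetical helper, and the file path
is a parameter so the function can be tried on a scratch file.

```shell
#!/bin/sh
# enable_ceph_rc is a hypothetical helper. The rc.conf path is a parameter
# (defaulting to /etc/rc.conf) so the function can be tried on a scratch file.
# It only adds a missing entry; an existing ceph_enable line is left as it is.
enable_ceph_rc() {
    rcconf=${1:-/etc/rc.conf}
    if grep -q '^ceph_enable=' "$rcconf" 2>/dev/null; then
        echo "already set in $rcconf"
    else
        echo 'ceph_enable="YES"' >> "$rcconf"
        echo "added to $rcconf"
    fi
}

# Demonstration on a temporary file; the second call adds nothing.
scratch=$(mktemp)
enable_ceph_rc "$scratch"
enable_ceph_rc "$scratch"
cat "$scratch"
rm -f "$scratch"
```

Run against the real ``/etc/rc.conf``, the function needs root privileges.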


Adding OSDs
===========

Once you have your initial monitor(s) running, you should add OSDs. Your cluster
cannot reach an ``active + clean`` state until you have enough OSDs to handle the
number of copies of an object (e.g., ``osd pool default size = 2`` requires at
least two OSDs). After bootstrapping your monitor, your cluster has a default
CRUSH map; however, the CRUSH map doesn't have any Ceph OSD Daemons mapped to
a Ceph Node.
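The number of OSDs and the replica count also feed into the placement group
settings. This document does not explain how its example value of ``333`` for
``osd pool default pg num`` was chosen. The sketch below assumes the rule of
thumb from the general Ceph placement group guidance, roughly 100 placement
groups per OSD divided by the number of replicas, with the result often rounded
up to a power of two. ``pg_estimate`` and ``next_pow2`` are hypothetical
helpers, and the 10 OSDs and 3 replicas are illustrative inputs.

```shell
#!/bin/sh
# pg_estimate is a hypothetical helper implementing the assumed rule of thumb
# (OSDs * 100) / replicas, using integer division.
pg_estimate() {
    osds=$1; replicas=$2
    echo $(( osds * 100 / replicas ))
}

# next_pow2 rounds a positive integer up to the next power of two.
next_pow2() {
    n=$1; p=1
    while [ "$p" -lt "$n" ]; do
        p=$(( p * 2 ))
    done
    echo "$p"
}

raw=$(pg_estimate 10 3)    # illustrative inputs: 10 OSDs, 3 replicas
echo "estimate: $raw, rounded up: $(next_pow2 "$raw")"
# prints: estimate: 333, rounded up: 512
```

Treat the result as a starting point only; the right value depends on the
cluster and the Ceph release.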


Short Form
----------

Ceph provides the ``ceph-disk`` utility, which can prepare a disk, partition or
directory for use with Ceph. The ``ceph-disk`` utility creates the OSD ID by
incrementing the index. Additionally, ``ceph-disk`` will add the new OSD to the
CRUSH map under the host for you. Execute ``ceph-disk -h`` for CLI details.
The ``ceph-disk`` utility automates the steps of the `Long Form`_ below. To
create the first two OSDs with the short form procedure, execute the following
on ``node2`` and ``node3``:


#. Prepare the OSD.

   On FreeBSD, only existing directories can be used to create OSDs in::

        ssh {node-name}
        sudo ceph-disk prepare --cluster {cluster-name} --cluster-uuid {uuid} {path-to-ceph-osd-directory}

   For example::

        ssh node1
        sudo ceph-disk prepare --cluster ceph --cluster-uuid a7f64266-0894-4f1e-a635-d0aeaca0e993 /var/lib/ceph/osd/osd.1


#. Activate the OSD::

        sudo ceph-disk activate {data-path} [--activate-key {path}]

   For example::

        sudo ceph-disk activate /var/lib/ceph/osd/osd.1

   **Note:** Use the ``--activate-key`` argument if you do not have a copy
   of ``/var/lib/ceph/bootstrap-osd/{cluster}.keyring`` on the Ceph Node.

   FreeBSD does not start the OSDs automatically, and it also requires an
   entry in ``ceph.conf``, one for each OSD::

     [osd]
     [osd.1]
         host = node1    # a host name that can be resolved
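The per-OSD ``ceph.conf`` entry could be added with a small helper that skips
OSDs already listed. This is a sketch only: ``add_osd_section`` is a
hypothetical helper, and the demonstration writes to a scratch file rather
than to the real configuration.

```shell
#!/bin/sh
# add_osd_section is a hypothetical helper: it appends an [osd.N] section
# with a host entry unless a section for that OSD is already in the file.
add_osd_section() {
    conf=$1; num=$2; host=$3
    if grep -q "^[[:space:]]*\[osd\.${num}\]" "$conf" 2>/dev/null; then
        echo "osd.${num} already present"
        return 0
    fi
    printf '\n[osd.%s]\n    host = %s\n' "$num" "$host" >> "$conf"
    echo "osd.${num} added"
}

# Demonstration on a temporary file; the second call adds nothing.
scratch=$(mktemp)
add_osd_section "$scratch" 1 node1
add_osd_section "$scratch" 1 node1
cat "$scratch"
rm -f "$scratch"
```

The closing bracket in the pattern keeps ``osd.1`` from matching an existing
``osd.10`` section.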


Long Form
---------

Without the benefit of any helper utilities, create an OSD and add it to the
cluster and CRUSH map with the following procedure. To create the first two
OSDs with the long form procedure, execute the following on ``node2`` and
``node3``:

#. Connect to the OSD host. ::

        ssh {node-name}

#. Generate a UUID for the OSD. ::

        uuidgen


#. Create the OSD. If no UUID is given, it will be set automatically when the
   OSD starts up. The following command will output the OSD number, which you
   will need for subsequent steps. ::

        ceph osd create [{uuid} [{id}]]


#. Create the default directory on your new OSD. ::

        ssh {new-osd-host}
        sudo mkdir /var/lib/ceph/osd/{cluster-name}-{osd-number}

   The ZFS instructions above show how to do this on FreeBSD.


#. If the OSD is for a drive other than the OS drive, prepare it
   for use with Ceph, and mount it to the directory you just created.


#. Initialize the OSD data directory. ::

        ssh {new-osd-host}
        sudo ceph-osd -i {osd-num} --mkfs --mkkey --osd-uuid [{uuid}]

   The directory must be empty before you can run ``ceph-osd`` with the
   ``--mkkey`` option. In addition, the ceph-osd tool requires specification
   of custom cluster names with the ``--cluster`` option.


#. Register the OSD authentication key. The value of ``ceph`` for
   ``ceph-{osd-num}`` in the path is the ``$cluster-$id``.  If your
   cluster name differs from ``ceph``, use your cluster name instead. ::

        sudo ceph auth add osd.{osd-num} osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/{cluster-name}-{osd-num}/keyring


#. Add your Ceph Node to the CRUSH map. ::

        ceph [--cluster {cluster-name}] osd crush add-bucket {hostname} host

   For example::

        ceph osd crush add-bucket node1 host


#. Place the Ceph Node under the root ``default``. ::

        ceph osd crush move node1 root=default


#. Add the OSD to the CRUSH map so that it can begin receiving data. You may
   also decompile the CRUSH map, add the OSD to the device list, add the host as a
   bucket (if it's not already in the CRUSH map), add the device as an item in the
   host, assign it a weight, recompile it and set it. ::

        ceph [--cluster {cluster-name}] osd crush add {id-or-name} {weight} [{bucket-type}={bucket-name} ...]

   For example::

        ceph osd crush add osd.0 1.0 host=node1


#. After you add an OSD to Ceph, the OSD is in your configuration. However,
   it is not yet running. The OSD is ``down`` and ``in``. You must start
   your new OSD before it can begin receiving data.

   For Ubuntu, use Upstart::

        sudo start ceph-osd id={osd-num} [cluster={cluster-name}]

   For example::

        sudo start ceph-osd id=0
        sudo start ceph-osd id=1

   For Debian/CentOS/RHEL, use sysvinit::

        sudo /etc/init.d/ceph start osd.{osd-num} [--cluster {cluster-name}]

   For example::

        sudo /etc/init.d/ceph start osd.0
        sudo /etc/init.d/ceph start osd.1

   In this case, to allow the start of the daemon at each reboot you
   must create an empty file like this::

        sudo touch /var/lib/ceph/osd/{cluster-name}-{osd-num}/sysvinit

   For example::

        sudo touch /var/lib/ceph/osd/ceph-0/sysvinit
        sudo touch /var/lib/ceph/osd/ceph-1/sysvinit

   Once you start your OSD, it is ``up`` and ``in``.

   For FreeBSD, use the rc.d init scripts.

   After adding the OSD to ``ceph.conf``::

        sudo service ceph start osd.{osd-num}

   For example::

        sudo service ceph start osd.0
        sudo service ceph start osd.1

   In this case, to allow the start of the daemon at each reboot you
   must create an empty file like this::

        sudo touch /var/lib/ceph/osd/{cluster-name}-{osd-num}/bsdrc

   For example::

        sudo touch /var/lib/ceph/osd/ceph-0/bsdrc
        sudo touch /var/lib/ceph/osd/ceph-1/bsdrc

   Once you start your OSD, it is ``up`` and ``in``.
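To see the long form for one OSD at a glance, the sketch below prints the
FreeBSD command sequence without running anything. ``osd_long_form_cmds`` is a
hypothetical helper. It assumes the default cluster name ``ceph``, an OSD
number already obtained from ``ceph osd create``, and the example weight of
``1.0``.

```shell
#!/bin/sh
# osd_long_form_cmds is a hypothetical helper that prints, without running
# them, the long-form commands for an OSD whose number is already known.
# It assumes the default cluster name "ceph" and a CRUSH weight of 1.0.
osd_long_form_cmds() {
    num=$1; host=$2
    dir="/var/lib/ceph/osd/ceph-${num}"
    echo "ceph-osd -i ${num} --mkfs --mkkey"
    echo "ceph auth add osd.${num} osd 'allow *' mon 'allow profile osd' -i ${dir}/keyring"
    echo "ceph osd crush add-bucket ${host} host"
    echo "ceph osd crush move ${host} root=default"
    echo "ceph osd crush add osd.${num} 1.0 host=${host}"
    echo "service ceph start osd.${num}"
    echo "touch ${dir}/bsdrc"
}

# Show the sequence for OSD 0 on node2.
osd_long_form_cmds 0 node2
```

The ``[osd.{osd-num}]`` entry in ``ceph.conf`` still has to be added by hand
before the ``service`` command will start the daemon.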



Adding MDS
==========

In the instructions below, ``{id}`` is an arbitrary name, such as the hostname
of the machine.

#. Create the mds data directory. ::

        mkdir -p /var/lib/ceph/mds/{cluster-name}-{id}

#. Create a keyring. ::

        ceph-authtool --create-keyring /var/lib/ceph/mds/{cluster-name}-{id}/keyring --gen-key -n mds.{id}

#. Import the keyring and set caps. ::

        ceph auth add mds.{id} osd "allow rwx" mds "allow" mon "allow profile mds" -i /var/lib/ceph/mds/{cluster-name}-{id}/keyring

#. Add to ceph.conf. ::

        [mds.{id}]
        host = {id}

#. Start the daemon the manual way. ::

        ceph-mds --cluster {cluster-name} -i {id} -m {mon-hostname}:{mon-port} [-f]

#. Start the daemon the right way (using ceph.conf entry). ::

        service ceph start

#. If starting the daemon fails with this error::

        mds.-1.0 ERROR: failed to authenticate: (22) Invalid argument

   Then make sure you do not have a keyring set in the global section of
   ``ceph.conf``; move it to the client section, or add a keyring setting
   specific to this mds daemon. Also verify that the key in the mds data
   directory matches the ``ceph auth get mds.{id}`` output.

#. Now you are ready to `create a Ceph filesystem`_.
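The key comparison suggested for the authentication error can be scripted.
The sketch below extracts the ``key`` value from two keyring-format files and
compares them. ``keyring_key`` is a hypothetical helper, and the key shown is
a made-up placeholder, not a real secret. On a live system the second file
would hold the saved output of ``ceph auth get mds.{id}``.

```shell
#!/bin/sh
# keyring_key is a hypothetical helper: it prints the value of the first
# "key = ..." line found in a keyring-format file.
keyring_key() {
    sed -n 's/^[[:space:]]*key[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Two scratch files with a made-up placeholder key. On a live system the
# second file would be the saved output of: ceph auth get mds.{id}
local_copy=$(mktemp)
cluster_copy=$(mktemp)
printf '[mds.node1]\n\tkey = PLACEHOLDERKEY0000000000000000000000==\n' > "$local_copy"
printf '[mds.node1]\n\tkey = PLACEHOLDERKEY0000000000000000000000==\n' > "$cluster_copy"

if [ "$(keyring_key "$local_copy")" = "$(keyring_key "$cluster_copy")" ]; then
    echo "keys match"
else
    echo "keys differ"
fi
rm -f "$local_copy" "$cluster_copy"
```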


Summary
=======

Once you have your monitor and two OSDs up and running, you can watch the
placement groups peer by executing the following::

        ceph -w

To view the tree, execute the following::

        ceph osd tree

You should see output that looks something like this::

        # id    weight  type name       up/down reweight
        -1      2       root default
        -2      2               host node1
        0       1                       osd.0   up      1
        -3      1               host node2
        1       1                       osd.1   up      1

To add (or remove) additional monitors, see `Add/Remove Monitors`_.
To add (or remove) additional Ceph OSD Daemons, see `Add/Remove OSDs`_.
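A quick check that both OSDs are up can be scripted against the tree output.
The sketch below assumes the column layout of the example output in this
section, which may differ in other Ceph releases. ``count_up_osds`` is a
hypothetical helper.

```shell
#!/bin/sh
# count_up_osds is a hypothetical helper: it reads `ceph osd tree` output in
# the column layout of the example output and counts the OSD rows whose
# status column is "up".
count_up_osds() {
    awk '$3 ~ /^osd\.[0-9]+$/ && $4 == "up" { n++ } END { print n + 0 }'
}

# Sample text copied from the example output. On a live cluster the input
# would come from: ceph osd tree | count_up_osds
count_up_osds <<'EOF'
# id    weight  type name       up/down reweight
-1      2       root default
-2      2               host node1
0       1                       osd.0   up      1
-3      1               host node2
1       1                       osd.1   up      1
EOF
# prints: 2
```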


.. _federated architecture: ../../radosgw/federated-config
.. _Installation (Quick): ../../start
.. _Add/Remove Monitors: ../../rados/operations/add-or-rm-mons
.. _Add/Remove OSDs: ../../rados/operations/add-or-rm-osds
.. _Network Configuration Reference: ../../rados/configuration/network-config-ref
.. _Monitor Config Reference - Data: ../../rados/configuration/mon-config-ref#data
.. _create a Ceph filesystem: ../../cephfs/createfs