.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. http://creativecommons.org/licenses/by/4.0
.. (c) OPNFV, Intel Corporation and others.
Barometer has enabled the following collectd plugins:

* dpdkstat plugin: A read plugin that retrieves stats from the DPDK extended
  statistics API.

* `ceilometer plugin`_: A write plugin that pushes the retrieved stats to
  Ceilometer. It is capable of pushing any stats read through collectd to
  Ceilometer, not just the DPDK stats.

* hugepages plugin: A read plugin that retrieves the number of available
  and free hugepages on a platform, as well as what is available in terms of
  hugepage memory.

* RDT plugin: A read plugin that provides the last level cache utilization and
  memory bandwidth utilization.
All the plugins above are available on the collectd master branch, except for
the ceilometer plugin: it is a Python-based plugin and only C plugins are
accepted by the collectd community. The ceilometer plugin lives in the
OpenStack repositories.

Other plugins under development or existing as a pull request into collectd
master:

* dpdkevents: A read plugin that retrieves DPDK link status and DPDK
  forwarding cores liveliness status (DPDK Keep Alive).

* Open vSwitch events Plugin: A read plugin that retrieves events from OVS.

* Open vSwitch stats Plugin: A read plugin that retrieves flow and interface
  stats from OVS.

* mcelog plugin: A read plugin that uses the mcelog client protocol to check for
  memory Machine Check Exceptions and sends the stats for reported exceptions.

* SNMP write: A write plugin that will act as an SNMP subagent and map
  collectd metrics to relevant OIDs. It will only support SNMP get, getnext and
  walk requests.

* Legacy/IPMI: A read plugin that will report platform thermals, voltages,
  fan speeds and other IPMI sensor statistics.
Building collectd with the Barometer plugins and installing the dependencies
=============================================================================
All Upstreamed plugins
----------------------
The plugins that have been merged to the collectd master branch can all be
built and configured through the barometer repository.

**Note**: sudo permissions are required to install collectd.

**Note**: These are instructions for Ubuntu 16.04.

To build and install these dependencies, clone the barometer repo:

.. code:: bash

    $ git clone https://gerrit.opnfv.org/gerrit/barometer

Install the build dependencies:

.. code:: bash

    $ ./src/install_build_deps.sh

To install collectd as a service and install all its dependencies:

.. code:: bash

    $ cd barometer/src && sudo make && sudo make install

This will install collectd as a service, with /opt/collectd as the base
install directory.

Sample configuration files can be found in '/opt/collectd/etc/collectd.conf.d'.

Please note if you are using any Open vSwitch plugins you need to run:

.. code:: bash

    $ sudo ovs-vsctl set-manager ptcp:6640
DPDK statistics plugin
----------------------
Repo: https://github.com/collectd/collectd

Dependencies: DPDK (http://dpdk.org/)

To build and install DPDK to /usr please see:
https://github.com/collectd/collectd/blob/master/docs/BUILD.dpdkstat.md
Building and installing collectd:

.. code:: bash

    $ git clone https://github.com/collectd/collectd.git
    $ ./configure --enable-syslog --enable-logfile --enable-debug

This will install collectd to /opt/collectd.
The collectd configuration file can be found at /opt/collectd/etc.
To configure the dpdkstat plugin you need to modify the configuration file to
include:

.. code:: bash

    ProcessType "secondary"
    EnabledPortMask 0xffff
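The two options shown above normally sit inside the dpdkstat plugin's
configuration block. A minimal sketch of a full section is shown below; the
surrounding section layout and the EAL option names (Coremask, FilePrefix)
are assumptions based on the upstream collectd dpdkstat plugin, so verify
them against the collectd.conf.pod for your collectd version:

.. code:: bash

    LoadPlugin dpdkstat
    <Plugin dpdkstat>
      <EAL>
        Coremask "0x1"
        ProcessType "secondary"
        FilePrefix "rte"
      </EAL>
      EnabledPortMask 0xffff
    </Plugin>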
For more information on the plugin parameters, please see:
https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod
Please note if you are configuring collectd with the **static DPDK library**
you must compile the DPDK library with the -fPIC flag:

.. code:: bash

    $ make EXTRA_CFLAGS=-fPIC

You must also modify the configuration step when building collectd:

.. code:: bash

    $ ./configure CFLAGS=" -lpthread -Wl,--whole-archive -Wl,-ldpdk -Wl,-lm -Wl,-lrt -Wl,-lpcap -Wl,-ldl -Wl,--no-whole-archive"

Please also note that if you are not building and installing DPDK system-wide
you will need to specify the paths to the header files and libraries
using LIBDPDK_CPPFLAGS and LIBDPDK_LDFLAGS. You will also need to add the DPDK
library symbols to the shared library path using ldconfig. Note that this
update to the shared library path is not persistent (i.e. it will not survive
a reboot). This is pending the merge of
https://github.com/collectd/collectd/pull/2073.

.. code:: bash

    $ ./configure LIBDPDK_CPPFLAGS="path to DPDK header files" LIBDPDK_LDFLAGS="path to DPDK libraries"
Hugepages Plugin
----------------
Repo: https://github.com/collectd/collectd

Dependencies: None, but assumes hugepages are configured.

To configure some hugepages:

.. code:: bash

    $ sudo mkdir -p /mnt/huge
    $ sudo mount -t hugetlbfs nodev /mnt/huge
    $ echo 14336 | sudo tee /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
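To confirm that the hugepage pool was actually resized, the kernel's counters
can be inspected (a verification step not in the original instructions;
HugePages_Total should match the number written above):

.. code:: bash

    $ grep -i '^HugePages\|^Hugepagesize' /proc/meminfo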
Building and installing collectd:

.. code:: bash

    $ git clone https://github.com/collectd/collectd.git
    $ ./configure --enable-syslog --enable-logfile --enable-hugepages --enable-debug

This will install collectd to /opt/collectd.
The collectd configuration file can be found at /opt/collectd/etc.
To configure the hugepages plugin you need to modify the configuration file to
include:

.. code:: bash

    ValuesPercentage false
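For context, the option above sits inside the hugepages plugin block. A
sketch of a complete section follows; the Report*/Values* option names are
assumptions based on the upstream collectd hugepages plugin, so check the
collectd.conf.pod for your version:

.. code:: bash

    LoadPlugin hugepages
    <Plugin hugepages>
      ReportPerNodeHP  true
      ReportRootHP     true
      ValuesPages      true
      ValuesBytes      false
      ValuesPercentage false
    </Plugin>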
For more information on the plugin parameters, please see:
https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod
Intel RDT Plugin
----------------
Repo: https://github.com/collectd/collectd

Dependencies:

* PQoS/Intel RDT library https://github.com/01org/intel-cmt-cat.git

Building and installing the PQoS/Intel RDT library:

.. code:: bash

    $ git clone https://github.com/01org/intel-cmt-cat.git
    $ cd intel-cmt-cat
    $ make install PREFIX=/usr
Building and installing collectd:

.. code:: bash

    $ git clone https://github.com/collectd/collectd.git
    $ ./configure --enable-syslog --enable-logfile --with-libpqos=/usr/ --enable-debug

This will install collectd to /opt/collectd.
The collectd configuration file can be found at /opt/collectd/etc.
To configure the RDT plugin you need to modify the configuration file to
include:

.. code:: bash

    <LoadPlugin intel_rdt>
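A sketch of what a complete intel_rdt section might look like follows; the
Interval and Cores options are assumptions based on the upstream collectd
intel_rdt plugin, so verify them against the collectd.conf.pod for your
version:

.. code:: bash

    <LoadPlugin intel_rdt>
      Interval 1
    </LoadPlugin>
    <Plugin "intel_rdt">
      Cores "0-2"
    </Plugin>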
For more information on the plugin parameters, please see:
https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod
mcelog Plugin
-------------
Repo: https://github.com/maryamtahhan/collectd

Start by installing mcelog. Note: the kernel has to have CONFIG_X86_MCE
enabled. For 32-bit kernels you need at least a 2.6.30 kernel.

On Ubuntu:

.. code:: bash

    $ apt-get update && apt-get install mcelog

Or build from source:

.. code:: bash

    $ git clone git://git.kernel.org/pub/scm/utils/cpu/mce/mcelog.git
    $ cd mcelog
    $ make
    $ cp mcelog.service /etc/systemd/system/
    $ systemctl enable mcelog.service
    $ systemctl start mcelog.service

Verify you got a /dev/mcelog. You can verify the daemon is running completely
by running:

.. code:: bash

    $ mcelog --client

This should query the information in the running daemon. If it prints nothing
that is fine (no errors have been logged yet). More info at
http://www.mcelog.org/installation.html

Modify the mcelog configuration file "/etc/mcelog/mcelog.conf" to include or
uncomment the following line:

.. code:: bash

    socket-path = /var/run/mcelog-client
Clone and install the collectd mcelog plugin:

.. code:: bash

    $ git clone https://github.com/maryamtahhan/collectd
    $ cd collectd
    $ git checkout feat_ras
    $ ./configure --enable-syslog --enable-logfile --enable-debug

This will install collectd to /opt/collectd.
The collectd configuration file can be found at /opt/collectd/etc.
To configure the mcelog plugin you need to modify the configuration file to
include:

.. code:: bash

    McelogClientSocket "/var/run/mcelog-client"
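The socket option above goes inside the mcelog plugin section; a minimal
sketch follows. The surrounding section layout is an assumption, so check the
collectd.conf.pod on the feat_ras branch for the exact structure; the socket
path must match the socket-path set in /etc/mcelog/mcelog.conf:

.. code:: bash

    LoadPlugin mcelog
    <Plugin mcelog>
      McelogClientSocket "/var/run/mcelog-client"
    </Plugin>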
For more information on the plugin parameters, please see:
https://github.com/maryamtahhan/collectd/blob/feat_ras/src/collectd.conf.pod
Simulating a Machine Check Exception can be done in one of three ways:

* Running $ make test in the cloned mcelog directory - the mcelog test suite
* Using the mce-inject utility
* Injecting corrected memory errors through the APEI einj interface

**mcelog test suite:**

It is always a good idea to test an error handling mechanism before it is
really needed. mcelog includes a test suite. The test suite relies on
mce-inject, which needs to be installed and in $PATH.

You also need the mce-inject kernel module configured (with
CONFIG_X86_MCE_INJECT=y), compiled, installed and loaded:

.. code:: bash

    $ modprobe mce-inject

Then you can run the mcelog test suite with:

.. code:: bash

    $ make test

This will inject different classes of errors and check that the mcelog
triggers run. There will be some kernel messages about page offlining
attempts. The test will also lose a few pages of memory in your system (not
significant). **Note: this test will kill any running mcelog, which needs to
be restarted manually afterwards.**
**mce-inject:**

A utility to inject corrected, uncorrected and fatal machine check exceptions:

.. code:: bash

    $ git clone https://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
    $ cd mce-inject
    $ make
    $ modprobe mce-inject
    $ ./mce-inject < test/corrected

**Note: the uncorrected and fatal scripts under test/ will cause a platform
reset. Only the fatal script generates the memory errors.** In order to
quickly emulate uncorrected memory errors and avoid a host reboot, the
following test from the mce-test suite can be injected:

.. code:: bash

    $ mce-inject mce-test/cases/coverage/soft-inj/recoverable_ucr/data/srao_mem_scrub
In addition, a more in-depth test of the Linux kernel machine check facilities
can be done with the mce-test test suite. mce-test supports testing uncorrected
error handling, real error injection, handling of different soft offlining
cases, and other tests.
**Corrected memory error injection:**

To inject corrected memory errors:

* Remove the sb_edac and edac_core kernel modules: rmmod sb_edac; rmmod edac_core
* Insert the einj module: modprobe einj param_extension=1
* Inject an error by specifying its details (the last command should be
  repeated at least twice):

.. code:: bash

    $ APEI_IF=/sys/kernel/debug/apei/einj
    $ echo 0x8 > $APEI_IF/error_type
    $ echo 0x01f5591000 > $APEI_IF/param1
    $ echo 0xfffffffffffff000 > $APEI_IF/param2
    $ echo 1 > $APEI_IF/notrigger
    $ echo 1 > $APEI_IF/error_inject

* Check the MCE statistics: mcelog --client. Check the mcelog log for the
  injected error details: less /var/log/mcelog.
Open vSwitch Plugins
--------------------
Repo: https://github.com/maryamtahhan/collectd

Branch: feat_ovs_link, feat_ovs_stats

Dependencies: Open vSwitch, libyajl

On Ubuntu, install the dependencies:

.. code:: bash

    $ sudo apt-get install libyajl-dev openvswitch-switch

Start the Open vSwitch service:

.. code:: bash

    $ sudo service openvswitch-switch start

Configure the ovsdb-server manager:

.. code:: bash

    $ sudo ovs-vsctl set-manager ptcp:6640

Clone and install the collectd OVS plugin:

.. code:: bash

    $ git clone https://github.com/maryamtahhan/collectd
    $ cd collectd
    $ git checkout $BRANCH
    $ ./configure --enable-syslog --enable-logfile --enable-debug

Where $BRANCH is feat_ovs_link or feat_ovs_stats.

This will install collectd to /opt/collectd.
The collectd configuration file can be found at /opt/collectd/etc.
To configure the OVS plugins you need to modify the configuration file to
include:

.. code:: bash

    <LoadPlugin ovs_events>
    </LoadPlugin>
    <Plugin "ovs_events">
      Socket "/var/run/openvswitch/db.sock"
      Interfaces "br0" "veth0"
      SendNotification false
    </Plugin>

For more information on the plugin parameters, please see:
https://github.com/maryamtahhan/collectd/blob/feat_ovs_link/src/collectd.conf.pod
and
https://github.com/maryamtahhan/collectd/blob/feat_ovs_stats/src/collectd.conf.pod
Installing collectd as a service
--------------------------------
collectd service scripts are available in the collectd/contrib directory.
To install collectd as a service:

.. code:: bash

    $ sudo cp contrib/systemd.collectd.service /etc/systemd/system/
    $ cd /etc/systemd/system/
    $ sudo mv systemd.collectd.service collectd.service
    $ sudo chmod +x collectd.service

Modify collectd.service to point at the /opt/collectd install:

.. code:: bash

    ExecStart=/opt/collectd/sbin/collectd
    EnvironmentFile=-/opt/collectd/etc/
    EnvironmentFile=-/opt/collectd/etc/
    CapabilityBoundingSet=CAP_SETUID CAP_SETGID

Reload the systemd daemon, then start and check the service:

.. code:: bash

    $ sudo systemctl daemon-reload
    $ sudo systemctl start collectd.service
    $ sudo systemctl status collectd.service

The status command should show success.
Additional useful plugins
-------------------------

**Exec Plugin**: Can be used to show when notifications are being generated,
by calling a bash script that dumps the notifications to a file (handy for
debugging). Modify /opt/collectd/etc/collectd.conf to include:

.. code:: bash

    LoadPlugin exec
    <Plugin exec>
    #   Exec "user:group" "/path/to/exec"
       NotificationExec "user" "<path to barometer>/barometer/src/collectd/collectd_sample_configs/write_notification.sh"
    </Plugin>

write_notification.sh just writes the notification passed from exec through
STDIN to a file (/tmp/notifications):

.. code:: bash

    rm -f /tmp/notifications
    while read x y
    do
      echo $x$y >> /tmp/notifications
    done

The output in /tmp/notifications should look like:

.. code:: bash

    TypeInstance:link_status
    uuid:f2aafeec-fa98-4e76-aec5-18ae9fc74589

    linkstate of "br-ex" interface has been changed to "DOWN"
**Logfile Plugin**: Can be used to log collectd activity. Modify
/opt/collectd/etc/collectd.conf to include:

.. code:: bash

    LoadPlugin logfile
    <Plugin logfile>
        File "/var/log/collectd.log"
    </Plugin>
Monitoring Interfaces and Openstack Support
-------------------------------------------
.. Figure:: monitoring_interfaces.png

   Monitoring Interfaces and Openstack Support

The figure above shows the DPDK L2 forwarding application running on a compute
node, sending and receiving traffic. collectd is also running on this compute
node, retrieving the stats periodically from DPDK through the dpdkstat plugin
and publishing the retrieved stats to Ceilometer through the ceilometer
plugin.

To see this demo in action please check out the `Barometer OPNFV Summit demo`_.
.. _Barometer OPNFV Summit demo: https://prezi.com/kjv6o8ixs6se/software-fastpath-service-quality-metrics-demo/
.. _ceilometer plugin: https://github.com/openstack/collectd-ceilometer-plugin/tree/stable/mitaka