.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. http://creativecommons.org/licenses/by/4.0
.. (c) Anuket and others

===========================
Anuket Barometer User Guide
===========================

Barometer collectd plugins description
---------------------------------------

.. Describe the specific features and how it is realised in the scenario in a brief manner
.. to ensure the user understand the context for the user guide instructions to follow.
Collectd is a daemon which collects system performance statistics periodically
and provides a variety of mechanisms to publish the collected metrics. It
supports more than 90 different input and output plugins. Input plugins
retrieve metrics and publish them to the collectd daemon, while output plugins
publish the data they receive to an end point. Collectd also has infrastructure
to support thresholding and notification.

Barometer has enabled the following collectd plugins:
* *dpdk_telemetry plugin*: A read plugin to collect DPDK interface stats and
  application or global stats from the DPDK telemetry library. The
  ``dpdk_telemetry`` plugin provides both DPDK NIC stats and DPDK application
  stats. This plugin doesn't deal with DPDK events.
  The minimum DPDK version required to use this plugin is 19.08.

.. note::
   The ``dpdk_telemetry`` plugin should only be used if your DPDK application
   doesn't already have more relevant metrics available (e.g. ``ovs_stats``).
* `gnocchi plugin`_: A write plugin that pushes the retrieved stats to
  Gnocchi. It's capable of pushing any stats read through collectd to
  Gnocchi, not just the DPDK stats.

* `aodh plugin`_: A notification plugin that pushes events to Aodh, and
  creates/updates alarms appropriately.

* *hugepages plugin*: A read plugin that retrieves the number of available
  and free hugepages on a platform, as well as what is available in terms of
  hugepages per NUMA node.

* *Open vSwitch events Plugin*: A read plugin that retrieves events from OVS.

* *Open vSwitch stats Plugin*: A read plugin that retrieves flow and interface
  stats from OVS.
* *mcelog plugin*: A read plugin that uses the mcelog client protocol to check for
  memory Machine Check Exceptions and sends the stats for reported exceptions.

* *PMU plugin*: A read plugin that provides performance counters data on
  Intel CPUs using the Linux perf interface.

* *RDT plugin*: A read plugin that provides the last level cache utilization and
  memory bandwidth utilization.

* *virt*: A read plugin that uses the virtualization API *libvirt* to gather
  statistics about virtualized guests on a system directly from the hypervisor,
  without a need to install a collectd instance on the guest.

* *SNMP Agent*: A write plugin that will act as an AgentX subagent that receives
  and handles queries from an SNMP master agent and returns the data collected
  by read plugins. The SNMP Agent plugin handles requests only for OIDs
  specified in the configuration file. To handle SNMP queries the plugin gets data
  from collectd and translates requested values from collectd's internal format
  to SNMP format. It supports SNMP get, getnext and walk requests.
All the plugins above are available on the collectd main branch, except for
the Gnocchi and Aodh plugins, as they are Python-based plugins and only C
plugins are accepted by the collectd community. The Gnocchi and Aodh plugins
live in the OpenStack repositories.

.. TODO: Update this to reflect merging of these PRs

Other plugins exist as pull requests against the collectd main branch:

* *Legacy/IPMI*: A read plugin that reports platform thermals, voltages,
  fan speed, current, flow, power, etc. Also, the plugin monitors the Intelligent
  Platform Management Interface (IPMI) System Event Log (SEL) and sends the
  appropriate notifications based on monitored SEL events.

* *PCIe AER*: A read plugin that monitors PCIe standard and advanced errors and
  sends notifications about those errors.
Third party application in the Barometer repository:

* *Open vSwitch PMD stats*: An application that retrieves PMD stats from OVS. It is run
  through the exec plugin.

**Plugins and application included in the Euphrates release:**

Write Plugins: aodh plugin, SNMP agent plugin, gnocchi plugin.

Read Plugins/application: Intel RDT plugin, virt plugin, Open vSwitch stats plugin,
Open vSwitch PMD stats application.
Collectd capabilities and usage
-------------------------------

.. Describe the specific capabilities and usage for <XYZ> feature.
.. Provide enough information that a user will be able to operate the feature on a deployed scenario.

The collectd plugins in Anuket are configured with reasonable defaults, but can
be overridden.

Building all Barometer upstreamed plugins from scratch
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The plugins that have been merged to the collectd main branch can all be
built and configured through the barometer repository.

* sudo permissions are required to install collectd.
* These instructions are for CentOS 7.
To build all the upstream plugins, clone the barometer repo:

.. code:: bash

    $ git clone https://gerrit.opnfv.org/gerrit/barometer

To install collectd as a service and install all of its dependencies:

.. code:: bash

    $ cd barometer/systems && ./build_base_machine.sh

This will install collectd as a service and the base install directory
will be ``/opt/collectd``.

Sample configuration files can be found in ``/opt/collectd/etc/collectd.conf.d``.

If you don't want to use one of the Barometer plugins, simply remove the
sample config file from ``/opt/collectd/etc/collectd.conf.d``.

If you plan on using the Exec plugin (for OVS_PMD_STATS or for executing scripts
on notification generation), note that the plugin requires a non-root user to execute
scripts. By default, the ``collectd_exec`` user is used in the ``exec.conf`` provided in
the sample configurations directory under ``src/collectd`` in the Barometer repo.
These scripts *DO NOT* create this user. You need to create this user, or modify
the configuration in the sample configurations directory under ``src/collectd`` to
use another existing non-root user, before running ``build_base_machine.sh``.
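Creating that account can be sketched as below; this is a dry run that only prints the command (it needs root privileges to actually run), and the user name matches the default in the sample ``exec.conf``:

.. code:: bash

    # Dry-run sketch: build the command that creates the non-root system
    # account used by the Exec plugin, then print it. Run the printed
    # command with root privileges to apply it.
    EXEC_USER="collectd_exec"
    cmd="useradd -r -s /sbin/nologin $EXEC_USER"
    echo "sudo $cmd"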
If you are using any Open vSwitch plugins you need to run:

.. code:: bash

    $ sudo ovs-vsctl set-manager ptcp:6640

After this, you should be able to start collectd as a service:

.. code:: bash

    $ sudo systemctl status collectd

If you want to use Grafana to display the metrics you collect, please see the
Barometer Grafana guide.

For more information on configuring and installing OpenStack plugins for
collectd, check out the `collectd-openstack-plugins GSG`_.
The following sections provide a per-plugin installation and configuration
guide, if you only want to install particular plugins.
DPDK telemetry plugin
^^^^^^^^^^^^^^^^^^^^^

Repo: https://github.com/collectd/collectd

Branch: main

Dependencies: `DPDK <https://www.dpdk.org/>`_ (runtime), libjansson (compile-time)

.. note:: The DPDK telemetry plugin requires DPDK version 19.08 or later.

To build and install DPDK to /usr please see:
https://github.com/collectd/collectd/blob/main/docs/BUILD.dpdkstat.md
Building and installing collectd:

.. code:: bash

    $ git clone https://github.com/collectd/collectd.git
    $ cd collectd
    $ ./build.sh
    $ ./configure --enable-syslog --enable-logfile --enable-debug
    $ make
    $ sudo make install

This will install collectd to the default folder ``/opt/collectd``. The collectd
configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
To configure the dpdk_telemetry plugin you need to modify the configuration file to
include:

.. code:: bash

    LoadPlugin dpdk_telemetry
    <Plugin dpdk_telemetry>
      #ClientSocketPath "/var/run/.client"
      #DpdkSocketPath "/var/run/dpdk/rte/telemetry"
    </Plugin>

The plugin uses default values (as shown) for the socket paths; if you use
different values, uncomment and update ``ClientSocketPath`` and
``DpdkSocketPath`` as required.

For more information on the plugin parameters, please see:
https://github.com/collectd/collectd/blob/main/src/collectd.conf.pod
To gather metrics from a DPDK application, telemetry needs to be enabled.
This can be done by setting the ``CONFIG_RTE_LIBRTE_TELEMETRY=y`` config flag.
The application then needs to be run with the ``--telemetry`` EAL option, e.g.

.. code:: bash

    $ dpdk/app/testpmd --telemetry -l 2,3,4 -n 4

For more information on the ``dpdk_telemetry`` plugin, see the `anuket wiki <https://wiki.anuket.io/display/HOME/DPDK+Telemetry+Plugin>`_.
The Address-Space Layout Randomization (ASLR) security feature in Linux should be
disabled, in order for the same hugepage memory mappings to be present in all
DPDK multi-process applications.

To disable ASLR:

.. code:: bash

    $ sudo echo 0 > /proc/sys/kernel/randomize_va_space

To fully enable ASLR:

.. code:: bash

    $ sudo echo 2 > /proc/sys/kernel/randomize_va_space

.. warning:: Disabling Address-Space Layout Randomization (ASLR) may have security
   implications. It is recommended to disable it only when absolutely necessary,
   and only when all implications of this change have been understood.

For more information on multi-process support, please see:
https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html
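Before changing the setting, you can inspect the current ASLR mode; a small sketch (0 = disabled, 1 = conservative randomization, 2 = full randomization):

.. code:: bash

    # Read the current ASLR mode if the procfs entry exists; fall back to
    # "unknown" on systems without it.
    aslr_file=/proc/sys/kernel/randomize_va_space
    aslr=$(cat "$aslr_file" 2>/dev/null || echo unknown)
    echo "ASLR mode: $aslr"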
Hugepages Plugin
^^^^^^^^^^^^^^^^

Repo: https://github.com/collectd/collectd

Branch: main

Dependencies: None, but assumes hugepages are configured.

To configure some hugepages:

.. code:: bash

    $ sudo mkdir -p /mnt/huge
    $ sudo mount -t hugetlbfs nodev /mnt/huge
    $ sudo bash -c "echo 14336 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages"
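To check that the pages were actually reserved, you can read the hugepage counters from ``/proc/meminfo``. The sketch below parses a self-contained sample listing (the counts are illustrative); on a real system, replace the sample with the output of ``grep Huge /proc/meminfo``:

.. code:: bash

    # Parse HugePages_Total and HugePages_Free out of a /proc/meminfo-style
    # listing (embedded sample text keeps the snippet self-contained).
    meminfo_sample='HugePages_Total:   14336
    HugePages_Free:    14336
    Hugepagesize:       2048 kB'
    total=$(printf '%s\n' "$meminfo_sample" | awk '/HugePages_Total/ {print $2}')
    free_pages=$(printf '%s\n' "$meminfo_sample" | awk '/HugePages_Free/ {print $2}')
    echo "hugepages total=$total free=$free_pages"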
Building and installing collectd:

.. code:: bash

    $ git clone https://github.com/collectd/collectd.git
    $ cd collectd
    $ ./build.sh
    $ ./configure --enable-syslog --enable-logfile --enable-hugepages --enable-debug
    $ make
    $ sudo make install

This will install collectd to the default folder ``/opt/collectd``. The collectd
configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
To configure the hugepages plugin you need to modify the configuration file to
include:

.. literalinclude:: ../../../src/collectd/collectd_sample_configs/hugepages.conf
   :start-at: LoadPlugin
   :language: bash

For more information on the plugin parameters, please see:
https://github.com/collectd/collectd/blob/main/src/collectd.conf.pod
Intel PMU Plugin
^^^^^^^^^^^^^^^^

Repo: https://github.com/collectd/collectd

Branch: main

Dependencies:

* PMU tools (jevents library) https://github.com/andikleen/pmu-tools

To be suitable for use in the collectd plugin, the *libjevents* shared library should be
compiled as position-independent code. To do this, add the following line to
*pmu-tools/jevents/Makefile*:

.. code:: bash

    CFLAGS += -fPIC
Building and installing the *jevents* library:

.. code:: bash

    $ git clone https://github.com/andikleen/pmu-tools.git
    $ cd pmu-tools/jevents/
    $ make
    $ sudo make install

To download the hardware events that are relevant to your CPU, download the appropriate
CPU event list json file:

.. code:: bash

    $ wget https://raw.githubusercontent.com/andikleen/pmu-tools/main/event_download.py
    $ python event_download.py

This will download the json files to the location ``$HOME/.cache/pmu-events/``. If you don't want to
download these files to the aforementioned location, set the environment variable ``XDG_CACHE_HOME`` to
the location you want the files downloaded to.
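For example, to redirect the download to a custom cache directory (the path below is illustrative):

.. code:: bash

    # event_download.py honours XDG_CACHE_HOME; point it at a custom directory
    # and the json files will land under $XDG_CACHE_HOME/pmu-events/.
    export XDG_CACHE_HOME="$HOME/.pmu-cache"
    echo "event files will be stored in: $XDG_CACHE_HOME/pmu-events"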
Building and installing collectd:

.. code:: bash

    $ git clone https://github.com/collectd/collectd.git
    $ cd collectd
    $ ./build.sh
    $ ./configure --enable-syslog --enable-logfile --with-libjevents=/usr/local --enable-debug
    $ make
    $ sudo make install

This will install collectd to the default folder ``/opt/collectd``. The collectd
configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
To configure the PMU plugin you need to modify the configuration file to
include:

.. literalinclude:: ../../../src/collectd/collectd_sample_configs/intel_pmu.conf
   :start-at: LoadPlugin
   :language: bash

If you want to monitor Intel CPU specific CPU events, make sure to uncomment the
``EventList`` and ``HardwareEvents`` options above.
.. note::
   If you set ``XDG_CACHE_HOME`` to anything other than the default location, you
   will need to modify the path for the ``EventList`` configuration.

Use the ``Cores`` option to monitor metrics only for configured cores. If an empty string is provided
as the value for this field, the default cores configuration is applied, that is, all available cores
are monitored separately. To limit monitoring to cores 0-7, set the option as shown below:

.. code:: bash

    Cores "0-7"

For more information on the plugin parameters, please see:
https://github.com/collectd/collectd/blob/main/src/collectd.conf.pod
The plugin opens file descriptors whose quantity depends on the number of
monitored CPUs and the number of monitored counters. Depending on configuration,
it might be required to increase the limit on the number of open file
descriptors allowed. This can be done using the ``ulimit -n`` command. If collectd
is executed as a service, the ``LimitNOFILE=`` directive should be defined in the
``[Service]`` section of the *collectd.service* file.
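A systemd drop-in is an alternative to editing *collectd.service* directly; the sketch below only prints the drop-in (the file name and the 65536 value are illustrative, size the limit to your configuration):

.. code:: bash

    # Dry-run sketch: print a systemd drop-in that raises the open-file limit
    # for the collectd service. To apply, write it to
    # /etc/systemd/system/collectd.service.d/limits.conf and run
    # "systemctl daemon-reload && systemctl restart collectd".
    dropin='[Service]
    LimitNOFILE=65536'
    printf '%s\n' "$dropin"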
Intel RDT Plugin
^^^^^^^^^^^^^^^^

Repo: https://github.com/collectd/collectd

Branch: main

Dependencies:

* PQoS/Intel RDT library https://github.com/intel/intel-cmt-cat

Building and installing the PQoS/Intel RDT library:

.. code:: bash

    $ git clone https://github.com/intel/intel-cmt-cat
    $ cd intel-cmt-cat
    $ make
    $ make install PREFIX=/usr

You will need to insert the msr kernel module:

.. code:: bash

    $ modprobe msr
Building and installing collectd:

.. code:: bash

    $ git clone https://github.com/collectd/collectd.git
    $ cd collectd
    $ ./build.sh
    $ ./configure --enable-syslog --enable-logfile --with-libpqos=/usr/ --enable-debug
    $ make
    $ sudo make install

This will install collectd to the default folder ``/opt/collectd``. The collectd
configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
To configure the RDT plugin you need to modify the configuration file to
include:

.. literalinclude:: ../../../src/collectd/collectd_sample_configs/rdt.conf
   :start-at: LoadPlugin
   :language: bash

For more information on the plugin parameters, please see:
https://github.com/collectd/collectd/blob/main/src/collectd.conf.pod
IPMI Plugin
^^^^^^^^^^^

Repo: https://github.com/collectd/collectd

Branch: feat_ipmi_events, feat_ipmi_analog

Dependencies: `OpenIPMI library <https://openipmi.sourceforge.io/>`_

The IPMI plugin is already implemented in the latest collectd and sensors
like temperature, voltage, fanspeed and current are already supported there.
The list of supported IPMI sensors has been extended, and sensors like flow and
power are supported now. Also, a System Event Log (SEL) notification feature
has been introduced:

* The feat_ipmi_events branch includes new SEL feature support in the collectd
  IPMI plugin. If this feature is enabled, the collectd IPMI plugin will
  dispatch notifications about new events in the System Event Log.

* The feat_ipmi_analog branch includes the support of extended IPMI sensors in
  the collectd IPMI plugin.
**Install dependencies**

On CentOS, install the OpenIPMI library:

.. code:: bash

    $ sudo yum install OpenIPMI ipmitool

However, it is recommended to use the latest version of the OpenIPMI library, as
it includes fixes for known issues which aren't included in the standard OpenIPMI
library package. The latest version of the library can be found at
https://sourceforge.net/p/openipmi/code/ci/master/tree/. Steps to install the
library from sources are described below.
Remove the old version of the OpenIPMI library:

.. code:: bash

    $ sudo yum remove OpenIPMI ipmitool

Build and install the OpenIPMI library:

.. code:: bash

    $ git clone https://git.code.sf.net/p/openipmi/code openipmi-code
    $ cd openipmi-code
    $ autoreconf --install
    $ ./configure --prefix=/usr
    $ make
    $ sudo make install

Add the directory containing ``OpenIPMI*.pc`` files to the ``PKG_CONFIG_PATH``
environment variable:

.. code:: bash

    export PKG_CONFIG_PATH=/usr/lib/pkgconfig

Enable IPMI support in the kernel:

.. code:: bash

    $ sudo modprobe ipmi_devintf
    $ sudo modprobe ipmi_si

If the HW supports IPMI, the ``/dev/ipmi0`` character device will be created.
Clone and install the collectd IPMI plugin:

.. code:: bash

    $ git clone https://github.com/collectd/collectd
    $ cd collectd
    $ ./build.sh
    $ ./configure --enable-syslog --enable-logfile --enable-debug
    $ make
    $ sudo make install

This will install collectd to the default folder ``/opt/collectd``. The collectd
configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
To configure the IPMI plugin you need to modify the file to include:

.. code:: bash

    LoadPlugin ipmi
    <Plugin ipmi>
       SELEnabled true # only feat_ipmi_events branch supports this
    </Plugin>
By default, the IPMI plugin will read all available analog sensor values,
dispatch the values to collectd and send SEL notifications.

For more information on the IPMI plugin parameters and SEL feature configuration,
please see: https://github.com/collectd/collectd/blob/main/src/collectd.conf.pod

Extended analog sensors support doesn't require additional configuration. The usual
collectd IPMI documentation can be used:

- https://collectd.org/wiki/index.php/Plugin:IPMI
- https://collectd.org/documentation/manpages/collectd.conf.5.shtml#plugin_ipmi

IPMI documentation:

- https://www.kernel.org/doc/Documentation/IPMI.txt
- https://www.intel.com/content/www/us/en/products/docs/servers/ipmi/ipmi-second-gen-interface-spec-v2-rev1-1.html
mcelog Plugin
^^^^^^^^^^^^^

Repo: https://github.com/collectd/collectd

Branch: main

Dependencies: `mcelog <http://mcelog.org/>`_

Start by installing mcelog.

.. note:: The kernel has to have ``CONFIG_X86_MCE`` enabled. For 32-bit kernels you
   need at least a 2.6.30 kernel.

On CentOS:

.. code:: bash

    $ sudo yum install mcelog

Or build from source:

.. code:: bash

    $ git clone https://git.kernel.org/pub/scm/utils/cpu/mce/mcelog.git
    $ cd mcelog
    $ make
    $ sudo make install
    $ cp mcelog.service /etc/systemd/system/
    $ systemctl enable mcelog.service
    $ systemctl start mcelog.service

Verify you got a ``/dev/mcelog``. You can verify that the daemon is running
completely by running:

.. code:: bash

    $ mcelog --client

This should query the information in the running daemon. If it prints nothing,
that is fine (no errors logged yet). More info at
http://www.mcelog.org/installation.html
Modify the mcelog configuration file ``/etc/mcelog/mcelog.conf`` to include or
uncomment the following lines:

.. code:: bash

    socket-path = /var/run/mcelog-client
    [dimm]
    dimm-tracking-enabled = yes
    dmi-prepopulate = yes
    uc-error-threshold = 1 / 24h
    ce-error-threshold = 10 / 24h

    [socket]
    socket-tracking-enabled = yes
    mem-uc-error-threshold = 100 / 24h
    mem-ce-error-threshold = 100 / 24h
    mem-ce-error-log = yes

    [page]
    memory-ce-threshold = 10 / 24h
    memory-ce-action = soft

    [trigger]
    directory = /etc/mcelog
Clone and install the collectd mcelog plugin:

.. code:: bash

    $ git clone https://github.com/collectd/collectd
    $ cd collectd
    $ ./build.sh
    $ ./configure --enable-syslog --enable-logfile --enable-debug
    $ make
    $ sudo make install

This will install collectd to the default folder ``/opt/collectd``. The collectd
configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
To configure the mcelog plugin you need to modify the configuration file to
include:

.. literalinclude:: ../../../src/collectd/collectd_sample_configs/mcelog.conf
   :start-at: LoadPlugin
   :language: bash

For more information on the plugin parameters, please see:
https://github.com/collectd/collectd/blob/main/src/collectd.conf.pod
Simulating a Machine Check Exception can be done in one of 3 ways:

* Running ``make test`` in the cloned mcelog directory - the mcelog test suite
* Using the mce-inject tool
* Using corrected memory error injection through the APEI einj interface
**mcelog test suite:**

It is always a good idea to test an error handling mechanism before it is
really needed. mcelog includes a test suite. The test suite relies on
mce-inject which needs to be installed and in ``$PATH``.

You also need the mce-inject kernel module configured (with
``CONFIG_X86_MCE_INJECT=y``), compiled, installed and loaded:

.. code:: bash

    $ modprobe mce-inject

Then you can run the mcelog test suite with:

.. code:: bash

    $ make test

This will inject different classes of errors and check that the mcelog triggers
run. There will be some kernel messages about page offlining attempts. The
test will also lose a few pages of memory in your system (not significant).
.. note:: This test will kill any running mcelog, which needs to be restarted
   afterwards.
**mce-inject:**

A utility to inject corrected, uncorrected and fatal machine check exceptions:

.. code:: bash

    $ git clone https://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
    $ cd mce-inject
    $ make
    $ modprobe mce-inject

Modify the test/corrected script to include the following:

.. code:: bash

    STATUS 0xcc00008000010090

Run the test:

.. code:: bash

    $ ./mce-inject < test/corrected
.. warning:: The uncorrected and fatal scripts under test will cause a platform reset.
   Only the fatal script generates the memory errors. In order to quickly
   emulate uncorrected memory errors and avoid a host reboot, the following test errors
   from the mce-test suite can be injected:

.. code:: bash

    $ mce-inject mce-test/cases/coverage/soft-inj/recoverable_ucr/data/srao_mem_scrub
In addition, a more in-depth test of the Linux kernel machine check facilities
can be done with the mce-test test suite. mce-test supports testing uncorrected
error handling, real error injection, handling of different soft offlining
cases, and other tests.
**Corrected memory error injection:**

To inject corrected memory errors:

* Remove the sb_edac and edac_core kernel modules: ``rmmod sb_edac``, ``rmmod edac_core``
* Insert the einj module: ``modprobe einj param_extension=1``
* Inject an error by specifying the details (the last command should be repeated at least two times):

.. code:: bash

    $ APEI_IF=/sys/kernel/debug/apei/einj
    $ echo 0x8 > $APEI_IF/error_type
    $ echo 0x01f5591000 > $APEI_IF/param1
    $ echo 0xfffffffffffff000 > $APEI_IF/param2
    $ echo 1 > $APEI_IF/notrigger
    $ echo 1 > $APEI_IF/error_inject

* Check the MCE statistics: ``mcelog --client``. Check the mcelog log for injected error details: ``less /var/log/mcelog``.
Open vSwitch Plugins
^^^^^^^^^^^^^^^^^^^^

OvS Plugins Repo: https://github.com/collectd/collectd

OvS Plugins Branch: main

OvS Events MIBs: The SNMP OVS interface link status is provided by the standard
`IF-MIB <http://www.net-snmp.org/docs/mibs/IF-MIB.txt>`_

Dependencies: Open vSwitch, `Yet Another JSON Library <https://github.com/lloyd/yajl>`_

On CentOS, install the dependencies and Open vSwitch:

.. code:: bash

    $ sudo yum install yajl-devel

Steps to install Open vSwitch can be found at
https://docs.openvswitch.org/en/latest/intro/install/fedora/

Start the Open vSwitch service:

.. code:: bash

    $ sudo service openvswitch-switch start

Configure the ovsdb-server manager:

.. code:: bash

    $ sudo ovs-vsctl set-manager ptcp:6640
Clone and install the collectd ovs plugin:

.. code:: bash

    $ git clone https://github.com/collectd/collectd
    $ cd collectd
    $ ./build.sh
    $ ./configure --enable-syslog --enable-logfile --enable-debug
    $ make
    $ sudo make install

This will install collectd to the default folder ``/opt/collectd``. The collectd
configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
To configure the OVS events plugin you need to modify the configuration file
(uncommenting and updating values as appropriate) to include:

.. literalinclude:: ../../../src/collectd/collectd_sample_configs/ovs_events.conf
   :start-at: LoadPlugin
   :language: bash

To configure the OVS stats plugin you need to modify the configuration file
(uncommenting and updating values as appropriate) to include:

.. literalinclude:: ../../../src/collectd/collectd_sample_configs/ovs_stats.conf
   :start-at: LoadPlugin
   :language: bash

For more information on the plugin parameters, please see:
https://github.com/collectd/collectd/blob/main/src/collectd.conf.pod
OVS PMD stats
^^^^^^^^^^^^^

Repo: https://gerrit.opnfv.org/gerrit/gitweb?p=barometer.git

Prerequisites:

#. Open vSwitch dependencies are installed.
#. Open vSwitch service is running.
#. Ovsdb-server manager is configured.

You can refer to the `Open vSwitch Plugins`_ section above for each one of them.

The OVS PMD stats application is run through the exec plugin.

To configure the OVS PMD stats application you need to modify the exec plugin
configuration to include:

.. code:: bash

    LoadPlugin exec
    <Plugin exec>
        Exec "user:group" "<path to ovs_pmd_stat.sh>"
    </Plugin>

.. note:: The exec plugin configuration has to be changed to use an appropriate user
   before starting the collectd service.

``ovs_pmd_stat.sh`` calls the script for the OVS PMD stats application with its argument:

.. literalinclude:: ../../../src/collectd/collectd_sample_configs/ovs_pmd_stats.sh
SNMP Agent Plugin
^^^^^^^^^^^^^^^^^

Repo: https://github.com/collectd/collectd

Branch: main

Dependencies: NET-SNMP library

Start by installing net-snmp and dependencies.

On CentOS 7:

.. code:: bash

    $ sudo yum install net-snmp net-snmp-libs net-snmp-utils net-snmp-devel
    $ sudo systemctl start snmpd.service

Then go to the `snmp configuration`_ steps.
Alternatively, clone and build net-snmp from sources:

.. code:: bash

    $ git clone https://github.com/haad/net-snmp.git
    $ cd net-snmp
    $ ./configure --with-persistent-directory="/var/net-snmp" --with-systemd --enable-shared --prefix=/usr
    $ make
    $ sudo make install

Copy the default configuration to the persistent folder:

.. code:: bash

    $ cp EXAMPLE.conf /usr/share/snmp/snmpd.conf

Set the library path and default MIB configuration:

.. code:: bash

    $ echo export LD_LIBRARY_PATH=/usr/lib >> .bashrc
    $ net-snmp-config --default-mibdirs
    $ net-snmp-config --snmpconfpath

Configure snmpd as a service:

.. code:: bash

    $ cp ./dist/snmpd.service /etc/systemd/system/
    $ systemctl enable snmpd.service
    $ systemctl start snmpd.service
.. _`snmp configuration`:

Add the following line to the snmpd.conf configuration file
``/etc/snmp/snmpd.conf`` to make the whole OID tree visible for SNMP clients:

.. code:: bash

    view systemview included .1
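Note that the snmp_agent collectd plugin registers with snmpd as an AgentX subagent, so snmpd also needs AgentX master support enabled. A dry-run sketch of the relevant directive (the path shown is the usual default):

.. code:: bash

    # Dry-run sketch: print the snmpd.conf directive that enables the AgentX
    # master, so that collectd's snmp_agent subagent can connect to snmpd.
    SNMPD_CONF="/etc/snmp/snmpd.conf"
    agentx_line="master agentx"
    echo "append to $SNMPD_CONF: $agentx_line"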
To verify that SNMP is working you can get the IF-MIB table using an SNMP client
to view the list of Linux interfaces:

.. code:: bash

    $ snmpwalk -v 2c -c public localhost IF-MIB::interfaces

Get the default MIB location:

.. code:: bash

    $ net-snmp-config --default-mibdirs
    /opt/stack/.snmp/mibs:/usr/share/snmp/mibs

Install the Intel specific MIBs (if needed) into the location returned by the
``net-snmp-config`` command (e.g. ``/usr/share/snmp/mibs``):

.. code:: bash

    $ git clone https://gerrit.opnfv.org/gerrit/barometer.git
    $ sudo cp -f barometer/mibs/*.txt /usr/share/snmp/mibs/
    $ sudo systemctl restart snmpd.service
Clone and install the collectd snmp_agent plugin:

.. code:: bash

    $ git clone https://github.com/collectd/collectd
    $ cd collectd
    $ ./build.sh
    $ ./configure --enable-syslog --enable-logfile --enable-debug --enable-snmp --with-libnetsnmp
    $ make
    $ sudo make install

This will install collectd to the default folder ``/opt/collectd``. The collectd
configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.

**The SNMP Agent plugin is a generic plugin and cannot work without configuration**.
To configure the snmp_agent plugin you need to modify the configuration file to
include OIDs mapped to collectd types. The following example maps the scalar
memAvailReal OID to the value represented as the free memory type of the memory plugin:
.. code:: bash

    LoadPlugin snmp_agent
    <Plugin "snmp_agent">
      <Data "memAvailReal">
        Plugin "memory"
        Type "memory"
        TypeInstance "free"
        OIDs "1.3.6.1.4.1.2021.4.6.0"
      </Data>
    </Plugin>

The ``snmpwalk`` command can be used to validate the collectd configuration:

.. code:: bash

    $ snmpwalk -v 2c -c public localhost 1.3.6.1.4.1.2021.4.6.0
    UCD-SNMP-MIB::memAvailReal.0 = INTEGER: 135237632 kB
**Limitations**

* An object instance with the Counter64 type is not supported in SNMPv1. When a GetNext
  request is received, Counter64 type objects will be skipped. When a Get
  request is received for a Counter64 type object, an error will be returned.
* Interfaces that are not visible to Linux, like DPDK interfaces, cannot be
  retrieved using the standard IF-MIB tables.

For more information on the plugin parameters, please see:
https://github.com/collectd/collectd/blob/main/src/collectd.conf.pod

For more details on the AgentX subagent, please see:
http://www.net-snmp.org/tutorial/tutorial-5/toolkit/demon/
virt plugin
^^^^^^^^^^^

Repo: https://github.com/collectd/collectd

Branch: main

Dependencies: `libvirt <https://libvirt.org/>`_, libxml2

On CentOS, install the dependencies:

.. code:: bash

    $ sudo yum install libxml2-devel libpciaccess-devel yajl-devel device-mapper-devel
.. note:: The libvirt version in the package manager might be quite old and offer only
   limited functionality. Hence, building and installing libvirt from sources
   is recommended. Detailed instructions can be found at:
   https://libvirt.org/compiling.html

Otherwise, install the packaged development files:

.. code:: bash

    $ sudo yum install libvirt-devel
Certain metrics provided by the plugin have a requirement on a minimal version of
the libvirt API. *File system information* statistics require a *Guest Agent (GA)*
to be installed and configured in the VM. The user must make sure that the installed
GA version supports retrieving file system information. The number of *Performance
monitoring events* metrics depends on the running libvirt daemon version.
.. note:: Please keep in mind that RDT metrics (part of *Performance monitoring
   events*) have to be supported by hardware. For more details on hardware support,
   please see:
   https://github.com/intel/intel-cmt-cat

   Additionally, perf metrics **cannot** be collected if the *Intel RDT* plugin is enabled.

The libvirt version can be checked with the following command:

.. code:: bash

    $ libvirtd --version
.. table:: Extended statistics requirements

   +-------------------------------+--------------------------+-------------+
   | Statistic                     | Min. libvirt API version | Requires GA |
   +===============================+==========================+=============+
   | Domain reason                 | 0.9.2                    | No          |
   +-------------------------------+--------------------------+-------------+
   | Disk errors                   | 0.9.10                   | No          |
   +-------------------------------+--------------------------+-------------+
   | Job statistics                | 1.2.9                    | No          |
   +-------------------------------+--------------------------+-------------+
   | File system information       | 1.2.11                   | Yes         |
   +-------------------------------+--------------------------+-------------+
   | Performance monitoring events | 1.3.3                    | No          |
   +-------------------------------+--------------------------+-------------+
Start the libvirt daemon:

.. code:: bash

    $ systemctl start libvirtd

Create a domain (VM) XML configuration file. For more information on the domain XML
format and examples, please see:
https://libvirt.org/formatdomain.html

.. note:: Installing additional hypervisor dependencies might be required before
   deploying the virtual machine.

Create the domain, based on the created XML file:

.. code:: bash

    $ virsh define DOMAIN_CFG_FILE.xml

Start the domain:

.. code:: bash

    $ virsh start DOMAIN_NAME

Check if the domain is running:

.. code:: bash

    $ virsh list

Check the list of available *Performance monitoring events* and their settings:

.. code:: bash

    $ virsh perf DOMAIN_NAME

Enable or disable *Performance monitoring events* for the domain:

.. code:: bash

    $ virsh perf DOMAIN_NAME [--enable | --disable] EVENT_NAME --live
Clone and install the collectd virt plugin:

.. code:: bash

    $ git clone $REPO
    $ cd collectd
    $ ./build.sh
    $ ./configure --enable-syslog --enable-logfile --enable-debug
    $ make
    $ sudo make install

Where ``$REPO`` is the repository address provided above.

This will install collectd to ``/opt/collectd``. The collectd configuration file
``collectd.conf`` can be found at ``/opt/collectd/etc``.
To load the virt plugin, the user needs to modify the configuration file to include:

.. code:: bash

    LoadPlugin virt

Additionally, the user can specify plugin configuration parameters in this file,
such as the connection URL, domain name and much more. By default, extended virt plugin
statistics are disabled. They can be enabled with the ``ExtraStats`` option:

.. code:: bash

    <Plugin virt>
      ExtraStats "cpu_util disk disk_err domain_state fs_info job_stats_background pcpu perf vcpupin"
    </Plugin>

For more information on the plugin parameters, please see:
https://github.com/collectd/collectd/blob/main/src/collectd.conf.pod
.. _install-collectd-as-a-service:

Installing collectd as a service
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Collectd service scripts are available in the collectd/contrib directory.
To install collectd as a service:

.. code:: bash

    $ sudo cp contrib/systemd.collectd.service /etc/systemd/system/
    $ cd /etc/systemd/system/
    $ sudo mv systemd.collectd.service collectd.service
    $ sudo chmod +x collectd.service

Modify collectd.service:

.. code:: bash

    [Service]
    ExecStart=/opt/collectd/sbin/collectd
    EnvironmentFile=-/opt/collectd/etc/
    EnvironmentFile=-/opt/collectd/etc/
    CapabilityBoundingSet=CAP_SETUID CAP_SETGID

Reload the service files and start the collectd service:

.. code:: bash

    $ sudo systemctl daemon-reload
    $ sudo systemctl start collectd.service

``sudo systemctl status collectd.service`` should show success.
1138 Additional useful plugins
1139 ^^^^^^^^^^^^^^^^^^^^^^^^^
**Exec plugin**: Can be used to show when notifications are being generated,
by calling a bash script that dumps the notifications to a file (handy for
debugging). Modify ``/opt/collectd/etc/collectd.conf`` to include the
``NotificationExec`` config option, taking care to add the right directory
path to the ``write_notification.sh`` script:
.. literalinclude:: ../../../src/collectd/collectd_sample_configs/exec.conf
   :start-at: LoadPlugin
``write_notification.sh`` writes the notification passed from exec through
STDIN to a file (``/tmp/notifications``):
.. literalinclude:: ../../../src/collectd/collectd_sample_configs/write_notification.sh
The output in ``/tmp/notifications`` should look like:
.. code:: bash

    PluginInstance:br-ex
    TypeInstance:link_status
    uuid:f2aafeec-fa98-4e76-aec5-18ae9fc74589

    linkstate of "br-ex" interface has been changed to "DOWN"
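For post-processing, this dump format (``Key:value`` header lines, a blank
line, then a free-text message) is easy to parse. The sketch below is
illustrative only and not part of Barometer:

.. code:: python

    def parse_notification(text):
        """Split a notification dump into header fields and the message body."""
        fields, message_lines = {}, []
        in_message = False
        for line in text.splitlines():
            if not line.strip():
                # The first blank line separates the headers from the message.
                in_message = True
                continue
            if in_message:
                message_lines.append(line)
            else:
                # Split on the first colon only; values may contain colons.
                key, _, value = line.partition(":")
                fields[key] = value
        return fields, " ".join(message_lines)

    sample = (
        "PluginInstance:br-ex\n"
        "TypeInstance:link_status\n"
        "uuid:f2aafeec-fa98-4e76-aec5-18ae9fc74589\n"
        "\n"
        'linkstate of "br-ex" interface has been changed to "DOWN"\n'
    )
    fields, message = parse_notification(sample)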
* **logfile plugin**: Can be used to log collectd activity. Modify
  ``/opt/collectd/etc/collectd.conf`` to include:

  .. code:: bash

      LoadPlugin logfile
      <Plugin logfile>
          File "/var/log/collectd.log"
      </Plugin>
Monitoring Interfaces and OpenStack Support
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. figure:: monitoring_interfaces.png

   Monitoring Interfaces and OpenStack Support
The figure above shows the DPDK L2 forwarding application running on a compute
node, sending and receiving traffic. collectd is also running on this compute
node, periodically retrieving the stats from DPDK through the ``dpdkstat``
plugin and publishing them to OpenStack through the
collectd-openstack-plugins.
To see this demo in action, please check out the `Barometer OPNFV Summit demo`_.
For more information on configuring and installing OpenStack plugins for
collectd, check out the `collectd-openstack-plugins GSG`_.
* AAA – on top of collectd there are secure agents, such as SNMP v3 and
  OpenStack agents, each with their own AAA methods.
* Collectd runs as a daemon with root permissions.
* The `Exec plugin`_ allows the execution of external programs but counters
  the security concerns by:
  * Ensuring that only one instance of the program is executed by collectd at
    any time
  * Forcing the plugin to check that custom programs are never executed with
    superuser privileges
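For example, the ``NotificationExec`` option takes a user (and optionally a
group) to run the script as; the user name and script path below are
placeholders:

.. code:: bash

    LoadPlugin exec
    <Plugin exec>
        # Runs as the unprivileged "collectd_exec" user, never as root
        NotificationExec "collectd_exec" "/opt/collectd/etc/write_notification.sh"
    </Plugin>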
* Protection of data in flight:
  * It's recommended to use version 4.7 or later of the Network plugin, which
    provides the ability to cryptographically sign or encrypt the network
    traffic.
  * The Write Redis plugin or the Write MongoDB plugin is recommended to
    store the data.
  * For more information, please see:
    https://collectd.org/wiki/index.php?title=Networking_introduction
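As an illustration, signing or encrypting is configured per server in the
Network plugin; the host, port, and credentials below are placeholders:

.. code:: bash

    LoadPlugin network
    <Plugin network>
        <Server "collector.example.com" "25826">
            # Encrypt (rather than just sign) all outgoing metrics
            SecurityLevel Encrypt
            Username "collectd"
            Password "secret"
        </Server>
    </Plugin>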
* Known vulnerabilities include:
  * https://www.cvedetails.com/vulnerability-list/vendor_id-11242/Collectd.html

    * `CVE-2017-7401`_ fixed in version 5.7.2 (see
      https://github.com/collectd/collectd/issues/2174).
    * `CVE-2016-6254`_ fixed (see
      https://mailman.verplant.org/pipermail/collectd/2016-July/006838.html).
    * `CVE-2010-4336`_ fixed (see
      https://mailman.verplant.org/pipermail/collectd/2010-November/004277.html).

  * https://www.cvedetails.com/product/20310/Collectd-Collectd.html?vendor_id=11242
* It's recommended to only use collectd plugins from signed packages.
.. [1] https://collectd.org/wiki/index.php/Naming_schema
.. [2] https://github.com/collectd/collectd/blob/main/src/daemon/plugin.h
.. [3] https://collectd.org/wiki/index.php/Value_list_t
.. [4] https://collectd.org/wiki/index.php/Data_set
.. [5] https://collectd.org/documentation/manpages/types.db.5.shtml
.. [6] https://collectd.org/wiki/index.php/Data_source
.. [7] https://collectd.org/wiki/index.php/Meta_Data_Interface
.. _Barometer OPNFV Summit demo: https://prezi.com/kjv6o8ixs6se/software-fastpath-service-quality-metrics-demo/
.. _gnocchi plugin: https://opendev.org/x/collectd-openstack-plugins/src/branch/stable/ocata/
.. _aodh plugin: https://opendev.org/x/collectd-openstack-plugins/src/branch/stable/ocata/
.. _collectd-openstack-plugins GSG: https://opendev.org/x/collectd-openstack-plugins/src/branch/master/doc/source/GSG.rst
.. _grafana guide: https://wiki.anuket.io/display/HOME/Installing+and+configuring+InfluxDB+and+Grafana+to+display+metrics+with+collectd
.. _CVE-2017-7401: https://www.cvedetails.com/cve/CVE-2017-7401/
.. _CVE-2016-6254: https://www.cvedetails.com/cve/CVE-2016-6254/
.. _CVE-2010-4336: https://www.cvedetails.com/cve/CVE-2010-4336/
.. _Exec plugin: https://collectd.org/wiki/index.php/Plugin:Exec