-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Faxm0dem/syslog notif #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Deduplicated some code from logfile and syslog and added NotifyLevel option to plugin. Change-Id: I364067189d628420333cb625c885a256399e076a
Change-Id: I48e61fc2613ca50c477794167f180e3b4f4fc673
…0dem/syslog-notif
/me pokes octo |
Hi, Thank you very much for your changes and thanks for being so patient with me :) I've rebased your changes on top of the master branch and fixed some issues I've found. Best regards, |
vincentbernat
added a commit
to vincentbernat/collectd
that referenced
this pull request
Aug 19, 2016
When using a configuration like this: ``` LoadPlugin curl_json <Plugin curl_json > <URL "http://api.facebook.com/method/fql.query?format=json&query=select+fan_count+from+page+where+page_id=280922605253831" > Instance "facebook" <Key "*/fan_count"> Type "gauge" </Key> </URL> </Plugin> ``` collectd segfaults in `src/curl_json.c` in `cj_cb_number`. >>> bt #0 __strcmp_sse2_unaligned () at ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S:31 collectd#1 0x0000000000414da2 in search (t=0x2304a80, t=0x2304a80, key=0x1) at ../../../src/daemon/utils_avltree.c:119 collectd#2 c_avl_get (t=0x2304a80, key=key@entry=0x1, value=value@entry=0x7f01184f2ab8) at ../../../src/daemon/utils_avltree.c:599 collectd#3 0x000000000040cbd1 in plugin_get_ds (name=0x1 <error: Cannot access memory at address 0x1>) at ../../../src/daemon/plugin.c:2559 collectd#4 0x00007f011f93b7ef in cj_get_type (key=0x23324f0) at ../../src/curl_json.c:147 collectd#5 cj_cb_number (ctx=0x231f2f0, number=<optimized out>, number_len=2) at ../../src/curl_json.c:244 collectd#6 0x00007f011f4b7874 in yajl_do_parse (hand=0x7f01100008e0, jsonText=jsonText@entry=0x234f990 "{\"error_code\":12,\"error_msg\":\"REST API is deprecated for versions v2.1 and higher (12)\",\"request_args\":[{\"key\":\"method\",\"value\":\"fql.query\"},{\"key\":\"format\",\"value\":\"json\"},{\"key\":\"query\",\"value\":\"sel"..., jsonTextLen=jsonTextLen@entry=257) at /tmp/buildd/yajl-2.1.0/src/yajl_parser.c:307 In fact, `db->state[db->depth].key` is not a key (it's a tree). This is correctly detected at the beginning, but despite that, there is a call to `cj_cb_inc_array_index()` (without effect, not in an array) and `key` is set again to `db->state[db->depth].key` which is still not a key. In a previous call to `cj_cb_map_key()` where the corresponding value is set, `(c_avl_get (tree, CJ_ANY, (void *) &value) == 0)` is true but `(CJ_IS_KEY((cj_key_t*)value))` is false. Therefore, this confirms this a tree which is stored in this node, not a key. This is a tentative fix to ignore the value if we are expecting a map, but didn't get one and aren't in an inhomogeneous array.
hlounent
pushed a commit
to hlounent/collectd
that referenced
this pull request
Jan 3, 2021
Previously published messages could get delayed by several hours per day at least (or even) in case there were (only) around 100 values / messages published every 10 seconds. The problem was reproducible on a i7-6700HQ CPU with only the cpu (with default configuration) and mqtt plugins enabled and the mqtt plugin configured to send to a local Mosquitto (v2.0.4) broker instance. See the note collectd#3 for more information. The same behavior was observed with various other combinations of mosquitto and collectd versions including e.g. mosquitto 1.6 and collectd 5.8 series. The latest working versions were mosquitto 1.6.7 and collectd 5.8.1 built with Yocto Zeus (including any patches OpenEmbedded might have had for the versions). The mosquitto API documentation [1] says mosquitto_loop [...] This must be called frequently to keep communications between the client and broker working. [...] Outgoing commands, from e.g. mosquitto_publish, are normally sent immediately that their function is called, but this is not always possible. mosquitto_loop will also attempt to send any remaining outgoing messages, which also includes commands that are part of the flow for messages with QoS>0. However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code in case of some failure but does not indicate whether there is still more work to do so the callback seem like the only means to decide if mosquitto_loop still needs to be called. Secondly, adding the callbacks was not enough, but the timeout argument of mosquitto_loop calls needed to be changed to 0 ms in order to return immediately because using the previous value of 1000 ms (which is the default value; see the note collectd#2) still resulted in messages being delayed. The pattern of using the callbacks and the zero timeout was adopted from the implementation of mosquitto_pub in mosquitto v2.0.4. This adds some overhead when QoS > 0 because mosquitto_loop may be called many times until the callbacks on_connect or on_publish. However, this seems like the only option without better knowledge of how mosquitto works internally, because adding any timeout to mosquitto_loop calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only the cpu and mqtt plugins was in the range of 5 %. The significant difference of 2 minute perf profiles perf record -ag -p <pid of collectd> -- sleep 120 without and with the fix looks like perf diff -o 0 main/perf.data fix/perf.data # # Baseline Delta Abs Shared Object Symbol # ........ ......... .................. .......................................... # [...] 0.55% +5.10% libmosquitto.so.1 [.] mosquitto_loop.part.0 Notes ===== 1. Maximum number of inflight messages There is also mosquitto_max_inflight_messages_set, whose API documentation [2] says Set the number of QoS 1 and 2 messages that can be "in flight" at one time. An in flight message is part way through its delivery flow. Attempts to send further messages with mosquitto_publish will result in the messages being queued until the number of in flight messages reduces. and A higher number here results in greater message throughput, but if set higher than the maximum in flight messages on the broker may lead to delays in the messages being acknowledged. However, the docs also say This function is [deprecated]. and while Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM option instead. is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable in collectd's MQTT plugin MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1 or v3.1.1 clients. [...] as the MQTT plugin uses the (mosquitto default) protocol level MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311. as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin. 2. 1000 (ms) is the default timeout [1]: mosquitto_loop libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq, int timeout, int max_packets ) [...] Parameters [...] timeout [...] Set negative to use the default of 1000ms. 3. Steps to reproduce Build and run mosquitto broker git clone https://github.com/eclipse/mosquitto.git cd mosquitto git checkout v2.0.4 make && sudo make install cat << EOF > mosquitto.conf listener 1883 bind_interface lo allow_anonymous true EOF /usr/local/sbin/mosquitto -c mosquitto.conf -v Observe received messages and their timestamps LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#" Build and run collectd git clone git://github.com/collectd/collectd.git cd collectd git checkout collectd-5.12.0 sudo rm -rf /opt/collectd && \ mkdir build && \ cd build && \ ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \ make && \ sudo make install cat << EOF > collectd.conf LoadPlugin cpu LoadPlugin mqtt <Plugin cpu> ReportByCpu true ReportByState true ValuesPercentage false ReportNumCpu false ReportGuestState false SubtractGuestState true </Plugin> <Plugin mqtt> <Publish "localhost"> Host "localhost" Port 1883 ClientId "localhost" QoS 1 # Or QoS 0 Prefix "collectd" StoreRates true Retain false </Publish> </Plugin> EOF LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf Right after starting the system, messages written through the MQTT plugin would appear almost intantaneously on the broker. After running the system for 2-3 minutes messages received by the broker were the ones written to the mqtt plugin 20 seconds earlier, i.e. messages were delayed by 20 seconds in less than 180 seconds. If the system were left running, the delay would continue accumulating resulting in messages being delayed by several hours per day. [1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop [2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
hlounent
pushed a commit
to hlounent/collectd
that referenced
this pull request
Jan 3, 2021
Previously published messages could get delayed by several hours per day at least (or even) in case there were (only) around 100 values / messages published every 10 seconds. The problem was reproducible on a i7-6700HQ CPU with only the cpu (with default configuration) and mqtt plugins enabled and the mqtt plugin configured to send to a local Mosquitto (v2.0.4) broker instance. See the note collectd#3 for more information. The same behavior was observed with various other combinations of mosquitto and collectd versions including e.g. mosquitto 1.6 and collectd 5.8 series. The latest working versions were mosquitto 1.6.7 and collectd 5.8.1 built with Yocto Zeus (including any patches OpenEmbedded might have had for the versions). The mosquitto API documentation [1] says mosquitto_loop [...] This must be called frequently to keep communications between the client and broker working. [...] Outgoing commands, from e.g. mosquitto_publish, are normally sent immediately that their function is called, but this is not always possible. mosquitto_loop will also attempt to send any remaining outgoing messages, which also includes commands that are part of the flow for messages with QoS>0. However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code in case of some failure but does not indicate whether there is still more work to do so the callback seem like the only means to decide if mosquitto_loop still needs to be called. Secondly, adding the callbacks was not enough, but the timeout argument of mosquitto_loop calls needed to be changed to 0 ms in order to return immediately because using the previous value of 1000 ms (which is the default value; see the note collectd#2) still resulted in messages being delayed. The pattern of using the callbacks and the zero timeout was adopted from the implementation of mosquitto_pub in mosquitto v2.0.4. This adds some overhead when QoS > 0 because mosquitto_loop may be called many times until the callbacks on_connect or on_publish. However, this seems like the only option without better knowledge of how mosquitto works internally, because adding any timeout to mosquitto_loop calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only the cpu and mqtt plugins was in the range of 5 %. The significant difference of 2 minute perf profiles perf record -ag -p <pid of collectd> -- sleep 120 without and with the fix looks like perf diff -o 0 main/perf.data fix/perf.data # # Baseline Delta Abs Shared Object Symbol # ........ ......... .................. .......................................... # [...] 0.55% +5.10% libmosquitto.so.1 [.] mosquitto_loop.part.0 Notes ===== 1. Maximum number of inflight messages There is also mosquitto_max_inflight_messages_set, whose API documentation [2] says Set the number of QoS 1 and 2 messages that can be "in flight" at one time. An in flight message is part way through its delivery flow. Attempts to send further messages with mosquitto_publish will result in the messages being queued until the number of in flight messages reduces. and A higher number here results in greater message throughput, but if set higher than the maximum in flight messages on the broker may lead to delays in the messages being acknowledged. However, the docs also say This function is [deprecated]. and while Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM option instead. is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable in collectd's MQTT plugin MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1 or v3.1.1 clients. [...] as the MQTT plugin uses the (mosquitto default) protocol level MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311. as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin. 2. 1000 (ms) is the default timeout [1]: mosquitto_loop libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq, int timeout, int max_packets ) [...] Parameters [...] timeout [...] Set negative to use the default of 1000ms. 3. Steps to reproduce Build and run mosquitto broker git clone https://github.com/eclipse/mosquitto.git cd mosquitto git checkout v2.0.4 make && sudo make install cat << EOF > mosquitto.conf listener 1883 bind_interface lo allow_anonymous true EOF /usr/local/sbin/mosquitto -c mosquitto.conf -v Observe received messages and their timestamps LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#" Build and run collectd git clone git://github.com/collectd/collectd.git cd collectd git checkout collectd-5.12.0 sudo rm -rf /opt/collectd && \ mkdir build && \ cd build && \ ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \ make && \ sudo make install cat << EOF > collectd.conf LoadPlugin cpu LoadPlugin mqtt <Plugin cpu> ReportByCpu true ReportByState true ValuesPercentage false ReportNumCpu false ReportGuestState false SubtractGuestState true </Plugin> <Plugin mqtt> <Publish "localhost"> Host "localhost" Port 1883 ClientId "localhost" QoS 1 # Or QoS 0 Prefix "collectd" StoreRates true Retain false </Publish> </Plugin> EOF LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf Right after starting the system, messages written through the MQTT plugin would appear almost intantaneously on the broker. After running the system for 2-3 minutes messages received by the broker were the ones written to the mqtt plugin 20 seconds earlier, i.e. messages were delayed by 20 seconds in less than 180 seconds. If the system were left running, the delay would continue accumulating resulting in messages being delayed by several hours per day. [1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop [2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
hlounent
pushed a commit
to hlounent/collectd
that referenced
this pull request
Jan 3, 2021
Previously published messages could get delayed by several hours per day even when values were published (i.e. messages were sent) with relatively low frequencies (e.g. < 10 values / s) with QoS 1. The problem was reproducible on a i7-6700HQ CPU with only the cpu (with default configuration) and mqtt plugins enabled and the mqtt plugin configured to send to a local Mosquitto (v2.0.4) broker instance. See the note collectd#3 for more information. The same behavior was observed with various other combinations of mosquitto and collectd versions including e.g. mosquitto 1.6 and collectd 5.8 series. The latest working versions were mosquitto 1.6.7 and collectd 5.8.1 built with Yocto Zeus (including any patches OpenEmbedded might have had for the versions). The mosquitto API documentation [1] says mosquitto_loop [...] This must be called frequently to keep communications between the client and broker working. [...] Outgoing commands, from e.g. mosquitto_publish, are normally sent immediately that their function is called, but this is not always possible. mosquitto_loop will also attempt to send any remaining outgoing messages, which also includes commands that are part of the flow for messages with QoS>0. However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code in case of some failure but does not indicate whether there is still more work to do so the callback seem like the only means to decide if mosquitto_loop still needs to be called. Secondly, adding the callbacks was not enough, but the timeout argument of mosquitto_loop calls needed to be changed to 0 ms in order to return immediately because using the previous value of 1000 ms (which is the default value; see the note collectd#2) still resulted in messages being delayed. The pattern of using the callbacks and the zero timeout was adopted from the implementation of mosquitto_pub in mosquitto v2.0.4. This adds some overhead when QoS > 0 because mosquitto_loop may be called many times until the callbacks on_connect or on_publish. However, this seems like the only option without better knowledge of how mosquitto works internally, because adding any timeout to mosquitto_loop calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only the cpu and mqtt plugins was in the range of 5 %. The significant difference of 2 minute perf profiles perf record -ag -p <pid of collectd> -- sleep 120 without and with the fix looks like perf diff -o 0 main/perf.data fix/perf.data # # Baseline Delta Abs Shared Object Symbol # ........ ......... .................. .......................................... # [...] 0.55% +5.10% libmosquitto.so.1 [.] mosquitto_loop.part.0 Notes ===== 1. Maximum number of inflight messages There is also mosquitto_max_inflight_messages_set, whose API documentation [2] says Set the number of QoS 1 and 2 messages that can be "in flight" at one time. An in flight message is part way through its delivery flow. Attempts to send further messages with mosquitto_publish will result in the messages being queued until the number of in flight messages reduces. and A higher number here results in greater message throughput, but if set higher than the maximum in flight messages on the broker may lead to delays in the messages being acknowledged. However, the docs also say This function is [deprecated]. and while Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM option instead. is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable in collectd's MQTT plugin MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1 or v3.1.1 clients. [...] as the MQTT plugin uses the (mosquitto default) protocol level MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311. as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin. 2. 1000 (ms) is the default timeout [1]: mosquitto_loop libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq, int timeout, int max_packets ) [...] Parameters [...] timeout [...] Set negative to use the default of 1000ms. 3. Steps to reproduce Build and run mosquitto broker git clone https://github.com/eclipse/mosquitto.git cd mosquitto git checkout v2.0.4 make && sudo make install cat << EOF > mosquitto.conf listener 1883 bind_interface lo allow_anonymous true EOF /usr/local/sbin/mosquitto -c mosquitto.conf -v Observe received messages and their timestamps LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#" Build and run collectd git clone git://github.com/collectd/collectd.git cd collectd git checkout collectd-5.12.0 sudo rm -rf /opt/collectd && \ mkdir build && \ cd build && \ ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \ make && \ sudo make install cat << EOF > collectd.conf LoadPlugin cpu LoadPlugin mqtt <Plugin cpu> ReportByCpu true ReportByState true ValuesPercentage false ReportNumCpu false ReportGuestState false SubtractGuestState true </Plugin> <Plugin mqtt> <Publish "localhost"> Host "localhost" Port 1883 ClientId "localhost" QoS 1 Prefix "collectd" StoreRates true Retain false </Publish> </Plugin> EOF LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf Right after starting the system, messages written through the MQTT plugin would appear almost intantaneously on the broker. After running the system for 2-3 minutes messages received by the broker were the ones written to the mqtt plugin 20 seconds earlier, i.e. messages were delayed by 20 seconds in less than 180 seconds. If the system were left running, the delay would continue accumulating resulting in messages being delayed by several hours per day. [1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop [2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
hlounent
pushed a commit
to hlounent/collectd
that referenced
this pull request
Jan 3, 2021
Previously published messages could get delayed by several hours per day even when values were published (i.e. messages were sent) with relatively low frequencies (e.g. < 10 values / s) with QoS 1. The problem was reproducible on a i7-6700HQ CPU with only the cpu (with default configuration) and mqtt plugins enabled and the mqtt plugin configured to send to a local Mosquitto (v2.0.4) broker instance. See the note collectd#3 for more information. The same behavior was observed with various other combinations of mosquitto and collectd versions including e.g. mosquitto 1.6 and collectd 5.8 series. The latest working versions were mosquitto 1.6.7 and collectd 5.8.1 built with Yocto Zeus (including any patches OpenEmbedded might have had for the versions). The mosquitto API documentation [1] says mosquitto_loop [...] This must be called frequently to keep communications between the client and broker working. [...] Outgoing commands, from e.g. mosquitto_publish, are normally sent immediately that their function is called, but this is not always possible. mosquitto_loop will also attempt to send any remaining outgoing messages, which also includes commands that are part of the flow for messages with QoS>0. However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code in case of some failure but does not indicate whether there is still more work to do so the callback seem like the only means to decide if mosquitto_loop still needs to be called. Secondly, adding the callbacks was not enough, but the timeout argument of mosquitto_loop calls needed to be changed to 0 ms in order to return immediately because using the previous value of 1000 ms (which is the default value; see the note collectd#2) still resulted in messages being delayed. The pattern of using the callbacks and the zero timeout was adopted from the implementation of mosquitto_pub in mosquitto v2.0.4. This adds some overhead when QoS > 0 because mosquitto_loop may be called many times until the callbacks on_connect or on_publish. However, this seems like the only option without better knowledge of how mosquitto works internally, because adding any timeout to mosquitto_loop calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only the cpu and mqtt plugins was in the range of 5 %. The significant difference of 2 minute perf profiles perf record -ag -p <pid of collectd> -- sleep 120 without and with the fix looks like perf diff -o 0 main/perf.data fix/perf.data # # Baseline Delta Abs Shared Object Symbol # ........ ......... .................. .......................................... # [...] 0.55% +5.10% libmosquitto.so.1 [.] mosquitto_loop.part.0 Notes ===== 1. Maximum number of inflight messages There is also mosquitto_max_inflight_messages_set, whose API documentation [2] says Set the number of QoS 1 and 2 messages that can be "in flight" at one time. An in flight message is part way through its delivery flow. Attempts to send further messages with mosquitto_publish will result in the messages being queued until the number of in flight messages reduces. and A higher number here results in greater message throughput, but if set higher than the maximum in flight messages on the broker may lead to delays in the messages being acknowledged. However, the docs also say This function is [deprecated]. and while Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM option instead. is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable in collectd's MQTT plugin MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1 or v3.1.1 clients. [...] as the MQTT plugin uses the (mosquitto default) protocol level MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311. as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin. 2. 1000 (ms) is the default timeout [1]: mosquitto_loop libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq, int timeout, int max_packets ) [...] Parameters [...] timeout [...] Set negative to use the default of 1000ms. 3. Steps to reproduce Build and run mosquitto broker git clone https://github.com/eclipse/mosquitto.git cd mosquitto git checkout v2.0.4 make && sudo make install cat << EOF > mosquitto.conf listener 1883 bind_interface lo allow_anonymous true EOF /usr/local/sbin/mosquitto -c mosquitto.conf -v Observe received messages and their timestamps LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#" Build and run collectd git clone git://github.com/collectd/collectd.git cd collectd git checkout collectd-5.12.0 sudo rm -rf /opt/collectd && \ mkdir build && \ cd build && \ ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \ make && \ sudo make install cat << EOF > collectd.conf LoadPlugin cpu LoadPlugin mqtt <Plugin cpu> ReportByCpu true ReportByState true ValuesPercentage false ReportNumCpu false ReportGuestState false SubtractGuestState true </Plugin> <Plugin mqtt> <Publish "localhost"> Host "localhost" Port 1883 ClientId "localhost" QoS 1 Prefix "collectd" StoreRates true Retain false </Publish> </Plugin> EOF LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf Right after starting the system, messages written through the MQTT plugin would appear almost intantaneously on the broker. After running the system for 2-3 minutes messages received by the broker were the ones written to the mqtt plugin 20 seconds earlier, i.e. messages were delayed by 20 seconds in less than 180 seconds. If the system were left running, the delay would continue accumulating resulting in messages being delayed by several hours per day. [1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop [2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
hlounent
pushed a commit
to hlounent/collectd
that referenced
this pull request
Jan 3, 2021
Previously published messages could get delayed by several hours per day even when values were published (i.e. messages were sent) with relatively low frequencies (e.g. < 10 values / s) with QoS 1. The problem was reproducible on a i7-6700HQ CPU with only the cpu (with default configuration) and mqtt plugins enabled and the mqtt plugin configured to send to a local Mosquitto (v2.0.4) broker instance. See the note collectd#3 for more information. The same behavior was observed with various other combinations of mosquitto and collectd versions including e.g. mosquitto 1.6 and collectd 5.8 series. The latest working versions were mosquitto 1.6.7 and collectd 5.8.1 built with Yocto Zeus (including any patches OpenEmbedded might have had for the versions). The mosquitto API documentation [1] says mosquitto_loop [...] This must be called frequently to keep communications between the client and broker working. [...] Outgoing commands, from e.g. mosquitto_publish, are normally sent immediately that their function is called, but this is not always possible. mosquitto_loop will also attempt to send any remaining outgoing messages, which also includes commands that are part of the flow for messages with QoS>0. However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code in case of some failure but does not indicate whether there is still more work to do so the callback seem like the only means to decide if mosquitto_loop still needs to be called. Secondly, adding the callbacks was not enough, but the timeout argument of mosquitto_loop calls needed to be changed to 0 ms in order to return immediately because using the previous value of 1000 ms (which is the default value; see the note collectd#2) still resulted in messages being delayed. The pattern of using the callbacks and the zero timeout was adopted from the implementation of mosquitto_pub in mosquitto v2.0.4. This adds some overhead when QoS > 0 because mosquitto_loop may be called many times until the callbacks on_connect or on_publish. However, this seems like the only option without better knowledge of how mosquitto works internally, because adding any timeout to mosquitto_loop calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only the cpu and mqtt plugins was in the range of 5 %. The significant difference of 2 minute perf profiles perf record -ag -p <pid of collectd> -- sleep 120 without and with the fix looks like perf diff -o 0 main/perf.data fix/perf.data # # Baseline Delta Abs Shared Object Symbol # ........ ......... .................. .......................................... # [...] 0.55% +5.10% libmosquitto.so.1 [.] mosquitto_loop.part.0 [...] Notes ===== 1. Maximum number of inflight messages There is also mosquitto_max_inflight_messages_set, whose API documentation [2] says Set the number of QoS 1 and 2 messages that can be "in flight" at one time. An in flight message is part way through its delivery flow. Attempts to send further messages with mosquitto_publish will result in the messages being queued until the number of in flight messages reduces. and A higher number here results in greater message throughput, but if set higher than the maximum in flight messages on the broker may lead to delays in the messages being acknowledged. However, the docs also say This function is [deprecated]. and while Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM option instead. is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable in collectd's MQTT plugin MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1 or v3.1.1 clients. [...] as the MQTT plugin uses the (mosquitto default) protocol level MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311. as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin. 2. 1000 (ms) is the default timeout [1]: mosquitto_loop libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq, int timeout, int max_packets ) [...] Parameters [...] timeout [...] Set negative to use the default of 1000ms. 3. Steps to reproduce Build and run mosquitto broker git clone https://github.com/eclipse/mosquitto.git cd mosquitto git checkout v2.0.4 make && sudo make install cat << EOF > mosquitto.conf listener 1883 bind_interface lo allow_anonymous true EOF /usr/local/sbin/mosquitto -c mosquitto.conf -v Observe received messages and their timestamps LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#" Build and run collectd git clone git://github.com/collectd/collectd.git cd collectd git checkout collectd-5.12.0 sudo rm -rf /opt/collectd && \ mkdir build && \ cd build && \ ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \ make && \ sudo make install cat << EOF > collectd.conf LoadPlugin cpu LoadPlugin mqtt <Plugin cpu> ReportByCpu true ReportByState true ValuesPercentage false ReportNumCpu false ReportGuestState false SubtractGuestState true </Plugin> <Plugin mqtt> <Publish "localhost"> Host "localhost" Port 1883 ClientId "localhost" QoS 1 Prefix "collectd" StoreRates true Retain false </Publish> </Plugin> EOF LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf Right after starting the system, messages written through the MQTT plugin would appear almost intantaneously on the broker. After running the system for 2-3 minutes messages received by the broker were the ones written to the mqtt plugin 20 seconds earlier, i.e. messages were delayed by 20 seconds in less than 180 seconds. If the system were left running, the delay would continue accumulating resulting in messages being delayed by several hours per day. [1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop [2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
hlounent
pushed a commit
to hlounent/collectd
that referenced
this pull request
Jan 3, 2021
Previously published messages could get delayed by several hours per day even when values were published (i.e. messages were sent) with relatively low frequencies (e.g. < 10 values / s) with QoS 1. The problem was reproducible on a i7-6700HQ CPU with only the cpu (with default configuration) and mqtt plugins enabled and the mqtt plugin configured to send to a local Mosquitto (v2.0.4) broker instance. See the note collectd#3 for more information. The same behavior was observed with various other combinations of mosquitto and collectd versions including e.g. mosquitto 1.6 and collectd 5.8 series. The latest working versions were mosquitto 1.6.7 and collectd 5.8.1 built with Yocto Zeus (including any patches OpenEmbedded might have had for the versions). The mosquitto API documentation [1] says mosquitto_loop [...] This must be called frequently to keep communications between the client and broker working. [...] Outgoing commands, from e.g. mosquitto_publish, are normally sent immediately that their function is called, but this is not always possible. mosquitto_loop will also attempt to send any remaining outgoing messages, which also includes commands that are part of the flow for messages with QoS>0. However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code in case of some failure but does not indicate whether there is still more work to do so the callback seem like the only means to decide if mosquitto_loop still needs to be called. Secondly, adding the callbacks was not enough, but the timeout argument of mosquitto_loop calls needed to be changed to 0 ms in order to return immediately because using the previous value of 1000 ms (which is the default value; see the note collectd#2) still resulted in messages being delayed. The pattern of using the callbacks and the zero timeout was adopted from the implementation of mosquitto_pub in mosquitto v2.0.4. This adds some overhead when QoS > 0 because mosquitto_loop may be called many times until the callbacks on_connect or on_publish. However, this seems like the only option without better knowledge of how mosquitto works internally, because adding any timeout to mosquitto_loop calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only the cpu and mqtt plugins was in the range of 5 %. The significant difference of 2 minute perf profiles perf record -ag -p <pid of collectd> -- sleep 120 without and with the fix looks like perf diff -o 0 main/perf.data fix/perf.data # # Baseline Delta Abs Shared Object Symbol # ........ ......... .................. .......................................... # [...] 0.55% +5.10% libmosquitto.so.1 [.] mosquitto_loop.part.0 [...] Notes ===== 1. Maximum number of inflight messages There is also mosquitto_max_inflight_messages_set, whose API documentation [2] says Set the number of QoS 1 and 2 messages that can be "in flight" at one time. An in flight message is part way through its delivery flow. Attempts to send further messages with mosquitto_publish will result in the messages being queued until the number of in f 8000 light messages reduces. and A higher number here results in greater message throughput, but if set higher than the maximum in flight messages on the broker may lead to delays in the messages being acknowledged. However, the docs also say This function is [deprecated]. and while Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM option instead. is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable in collectd's MQTT plugin MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1 or v3.1.1 clients. [...] as the MQTT plugin uses the (mosquitto default) protocol level MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311. as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin. 2. 1000 (ms) is the default timeout [1]: mosquitto_loop libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq, int timeout, int max_packets ) [...] Parameters [...] timeout [...] Set negative to use the default of 1000ms. 3. Steps to reproduce Build and run mosquitto broker git clone https://github.com/eclipse/mosquitto.git cd mosquitto git checkout v2.0.4 make && sudo make install cat << EOF > mosquitto.conf listener 1883 bind_interface lo allow_anonymous true EOF /usr/local/sbin/mosquitto -c mosquitto.conf -v Observe received messages and their timestamps LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#" Build and run collectd git clone git://github.com/collectd/collectd.git cd collectd git checkout collectd-5.12.0 sudo rm -rf /opt/collectd && \ mkdir build && \ cd build && \ ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \ make && \ sudo make install cat << EOF > collectd.conf LoadPlugin cpu LoadPlugin mqtt <Plugin cpu> ReportByCpu true ReportByState true ValuesPercentage false ReportNumCpu false ReportGuestState false SubtractGuestState true </Plugin> <Plugin mqtt> <Publish "localhost"> Host "localhost" Port 1883 ClientId "localhost" QoS 1 Prefix "collectd" StoreRates true Retain false </Publish> </Plugin> EOF LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf Right after starting the system, messages written through the MQTT plugin would appear almost intantaneously on the broker. After running the system for 2-3 minutes messages received by the broker were the ones written to the mqtt plugin 20 seconds earlier, i.e. messages were delayed by 20 seconds in less than 180 seconds. If the system were left running, the delay would continue accumulating resulting in messages being delayed by several hours per day. [1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop [2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
hlounent
pushed a commit
to hlounent/collectd
that referenced
this pull request
Jan 3, 2021
Previously published messages could get delayed by several hours per day even when values were published (i.e. messages were sent) with relatively low frequencies (e.g. < 10 values / s) with QoS 1. The problem was reproducible on a i7-6700HQ CPU with only the cpu (with default configuration) and mqtt plugins enabled and the mqtt plugin configured to send to a local Mosquitto (v2.0.4) broker instance. See the note collectd#3 for more information. The same behavior was observed with various other combinations of mosquitto and collectd versions including e.g. mosquitto 1.6 and collectd 5.8 series. The latest working versions were mosquitto 1.6.7 and collectd 5.8.1 built with Yocto Zeus (including any patches OpenEmbedded might have had for the versions). The mosquitto API documentation [1] says mosquitto_loop [...] This must be called frequently to keep communications between the client and broker working. [...] Outgoing commands, from e.g. mosquitto_publish, are normally sent immediately that their function is called, but this is not always possible. mosquitto_loop will also attempt to send any remaining outgoing messages, which also includes commands that are part of the flow for messages with QoS>0. However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code in case of some failure but does not indicate whether there is still more work to do so the callback seem like the only means to decide if mosquitto_loop still needs to be called. Secondly, adding the callbacks was not enough, but the timeout argument of mosquitto_loop calls needed to be changed to 0 ms in order to return immediately because using the previous value of 1000 ms (which is the default value; see the note collectd#2) still resulted in messages being delayed. The pattern of using the callbacks and the zero timeout was adopted from the implementation of mosquitto_pub in mosquitto v2.0.4. This adds some overhead when QoS > 0 because mosquitto_loop may be called many times until the callbacks on_connect or on_publish. However, this seems like the only option without better knowledge of how mosquitto works internally, because adding any timeout to mosquitto_loop calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only the cpu and mqtt plugins was in the range of 5 %. The significant difference of 2 minute perf profiles perf record -ag -p <pid of collectd> -- sleep 120 without and with the fix looks like perf diff -o 0 main/perf.data fix/perf.data # # Baseline Delta Abs Shared Object Symbol # ........ ......... .................. .......................................... # [...] 0.55% +5.10% libmosquitto.so.1 [.] mosquitto_loop.part.0 [...] Notes ===== 1. Maximum number of inflight messages There is also mosquitto_max_inflight_messages_set, whose API documentation [2] says Set the number of QoS 1 and 2 messages that can be "in flight" at one time. An in flight message is part way through its delivery flow. Attempts to send further messages with mosquitto_publish will result in the messages being queued until the number of in flight messages reduces. and A higher number here results in greater message throughput, but if set higher than the maximum in flight messages on the broker may lead to delays in the messages being acknowledged. However, the docs also say This function is [deprecated]. and while Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM option instead. is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable in collectd's MQTT plugin MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1 or v3.1.1 clients. [...] as the MQTT plugin uses the (mosquitto default) protocol level MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311. as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin. 2. 1000 (ms) is the default timeout [1]: mosquitto_loop libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq, int timeout, int max_packets ) [...] Parameters [...] timeout [...] Set negative to use the default of 1000ms. 3. Steps to reproduce Build and run mosquitto broker git clone https://github.com/eclipse/mosquitto.git cd mosquitto git checkout v2.0.4 make && sudo make install cat << EOF > mosquitto.conf listener 1883 bind_interface lo allow_anonymous true EOF /usr/local/sbin/mosquitto -c mosquitto.conf -v Observe received messages and their timestamps LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#" Build and run collectd git clone git://github.com/collectd/collectd.git cd collectd git checkout collectd-5.12.0 sudo rm -rf /opt/collectd && \ mkdir build && \ cd build && \ ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \ make && \ sudo make install cat << EOF > collectd.conf LoadPlugin cpu LoadPlugin mqtt <Plugin cpu> ReportByCpu true ReportByState true ValuesPercentage false ReportNumCpu false ReportGuestState false SubtractGuestState true </Plugin> <Plugin mqtt> <Publish "localhost"> Host "localhost" Port 1883 ClientId "localhost" QoS 1 Prefix "collectd" StoreRates true Retain false </Publish> </Plugin> EOF LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf Right after starting the system, messages written through the MQTT plugin would appear almost intantaneously on the broker. After running the system for 2-3 minutes messages received by the broker were the ones written to the mqtt plugin 20 seconds earlier, i.e. messages were delayed by 20 seconds in less than 180 seconds. If the system were left running, the delay would continue accumulating resulting in messages being delayed by several hours per day. [1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop [2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Adds syslog notifications. See manpage