8000 Faxm0dem/syslog notif by faxm0dem · Pull Request #3 · collectd/collectd · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Faxm0dem/syslog notif #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

faxm0dem
Copy link
Contributor
@faxm0dem faxm0dem commented Feb 2, 2012

Adds syslog notifications. See manpage

Fabien Wernli and others added 5 commits January 19, 2012 12:49
Deduplicated some code from logfile and syslog
and added NotifyLevel option to plugin.

Change-Id: I364067189d628420333cb625c885a256399e076a
Change-Id: I48e61fc2613ca50c477794167f180e3b4f4fc673
@faxm0dem
Copy link
Contributor Author

/me pokes octo

@octo
Copy link
Member
octo commented Feb 25, 2012

Hi,

Thank you very much for your changes and thanks for being so patient with me :) I've rebased your changes on top of the master branch and fixed some issues I've found.

Best regards,
—octo

@octo octo closed this Feb 25, 2012
pkit referenced this pull request in pkit/collectd Dec 15, 2014
rubenk pushed a commit that referenced this pull request Jun 4, 2016
vincentbernat added a commit to vincentbernat/collectd that referenced this pull request Aug 19, 2016
When using a configuration like this:

```
LoadPlugin curl_json
<Plugin curl_json >
  <URL "http://api.facebook.com/method/fql.query?format=json&query=select+fan_count+from+page+where+page_id=280922605253831" >
    Instance "facebook"
    <Key "*/fan_count">
      Type "gauge"
    </Key>
  </URL>
</Plugin>
```

collectd segfaults in `src/curl_json.c` in `cj_cb_number`.

    >>> bt
    #0  __strcmp_sse2_unaligned () at ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S:31
    collectd#1  0x0000000000414da2 in search (t=0x2304a80, t=0x2304a80, key=0x1) at ../../../src/daemon/utils_avltree.c:119
    collectd#2  c_avl_get (t=0x2304a80, key=key@entry=0x1, value=value@entry=0x7f01184f2ab8) at ../../../src/daemon/utils_avltree.c:599
    collectd#3  0x000000000040cbd1 in plugin_get_ds (name=0x1 <error: Cannot access memory at address 0x1>) at ../../../src/daemon/plugin.c:2559
    collectd#4  0x00007f011f93b7ef in cj_get_type (key=0x23324f0) at ../../src/curl_json.c:147
    collectd#5  cj_cb_number (ctx=0x231f2f0, number=<optimized out>, number_len=2) at ../../src/curl_json.c:244
    collectd#6  0x00007f011f4b7874 in yajl_do_parse (hand=0x7f01100008e0, jsonText=jsonText@entry=0x234f990 "{\"error_code\":12,\"error_msg\":\"REST API is deprecated for versions v2.1 and higher (12)\",\"request_args\":[{\"key\":\"method\",\"value\":\"fql.query\"},{\"key\":\"format\",\"value\":\"json\"},{\"key\":\"query\",\"value\":\"sel"..., jsonTextLen=jsonTextLen@entry=257) at /tmp/buildd/yajl-2.1.0/src/yajl_parser.c:307

In fact, `db->state[db->depth].key` is not a key (it's a tree). This is
correctly detected at the beginning, but despite that, there is a call
to `cj_cb_inc_array_index()` (without effect, not in an array) and `key`
is set again to `db->state[db->depth].key` which is still not a key.

In a previous call to `cj_cb_map_key()` where the corresponding value is
set, `(c_avl_get (tree, CJ_ANY, (void *) &value) == 0)` is true but
`(CJ_IS_KEY((cj_key_t*)value))` is false. Therefore, this confirms this
a tree which is stored in this node, not a key.

This is a tentative fix to ignore the value if we are expecting a map,
but didn't get one and aren't in an inhomogeneous array.
@ghost ghost mentioned this pull request Nov 8, 2016
mrunge pushed a commit that referenced this pull request Sep 4, 2019
sysevent plugin initial commit
hlounent pushed a commit to hlounent/collectd that referenced this pull request Jan 3, 2021
Previously published messages could get delayed by several hours per day
at least (or even) in case there were (only) around 100 values /
messages published every 10 seconds.

The problem was reproducible on a i7-6700HQ CPU with only the cpu (with
default configuration) and mqtt plugins enabled and the mqtt plugin
configured to send to a local Mosquitto (v2.0.4) broker instance. See
the note collectd#3 for more information.

The same behavior was observed with various other combinations of
mosquitto and collectd versions including e.g. mosquitto 1.6 and
collectd 5.8 series. The latest working versions were mosquitto 1.6.7
and collectd 5.8.1 built with Yocto Zeus (including any patches
OpenEmbedded might have had for the versions).

The mosquitto API documentation [1] says

    mosquitto_loop

    [...] This must be called frequently to keep communications between
    the client and broker working. [...] Outgoing commands, from e.g.
    mosquitto_publish, are normally sent immediately that their function
    is called, but this is not always possible. mosquitto_loop will also
    attempt to send any remaining outgoing messages, which also includes
    commands that are part of the flow for messages with QoS>0.

However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code
in case of some failure but does not indicate whether there is still
more work to do so the callback seem like the only means to decide if
mosquitto_loop still needs to be called.

Secondly, adding the callbacks was not enough, but the timeout argument
of mosquitto_loop calls needed to be changed to 0 ms in order to return
immediately because using the previous value of 1000 ms (which is the
default value; see the note collectd#2) still resulted in messages being
delayed.

The pattern of using the callbacks and the zero timeout was adopted from
the implementation of mosquitto_pub in mosquitto v2.0.4.

This adds some overhead when QoS > 0 because mosquitto_loop may be
called many times until the callbacks on_connect or on_publish. However,
this seems like the only option without better knowledge of how
mosquitto works internally, because adding any timeout to mosquitto_loop
calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only
the cpu and mqtt plugins was in the range of 5 %. The significant
difference of 2 minute perf profiles

    perf record -ag -p <pid of collectd> -- sleep 120

without and with the fix looks like

    perf diff -o 0 main/perf.data fix/perf.data
    #
    # Baseline  Delta Abs  Shared Object       Symbol
    # ........  .........  ..................  ..........................................
    #
        [...]
        0.55%     +5.10%  libmosquitto.so.1   [.] mosquitto_loop.part.0

Notes
=====

1. Maximum number of inflight messages

There is also mosquitto_max_inflight_messages_set, whose API
documentation [2] says

    Set the number of QoS 1 and 2 messages that can be "in flight" at
    one time.  An in flight message is part way through its delivery
    flow. Attempts to send further messages with mosquitto_publish will
    result in the messages being queued until the number of in flight
    messages reduces.

and

    A higher number here results in greater message throughput, but if
    set higher than the maximum in flight messages on the broker may
    lead to delays in the messages being acknowledged.

However, the docs also say

    This function is [deprecated].

and while

    Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM
    option instead.

is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable
in collectd's MQTT plugin

    MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1
    or v3.1.1 clients. [...]

as the MQTT plugin uses the (mosquitto default) protocol level

    MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311.

as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin.

2. 1000 (ms) is the default timeout [1]:

    mosquitto_loop

        libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq,
                    int     timeout,
                    int     max_packets	)

    [...]

    Parameters

    [...]

    timeout	[...] Set negative to use the default of 1000ms.

3. Steps to reproduce

Build and run mosquitto broker

    git clone https://github.com/eclipse/mosquitto.git
    cd mosquitto
    git checkout v2.0.4
    make && sudo make install
    cat << EOF > mosquitto.conf
    listener 1883
    bind_interface lo
    allow_anonymous true
    EOF
    /usr/local/sbin/mosquitto -c mosquitto.conf -v

Observe received messages and their timestamps

    LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#"

Build and run collectd

    git clone git://github.com/collectd/collectd.git
    cd collectd
    git checkout collectd-5.12.0
    sudo rm -rf /opt/collectd && \
        mkdir build && \
        cd build && \
        ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \
        make && \
        sudo make install
    cat << EOF > collectd.conf
    LoadPlugin cpu
    LoadPlugin mqtt
    <Plugin cpu>
        ReportByCpu true
        ReportByState true
        ValuesPercentage false
        ReportNumCpu false
        ReportGuestState false
        SubtractGuestState true
    </Plugin>
    <Plugin mqtt>
        <Publish "localhost">
            Host "localhost"
            Port 1883
            ClientId "localhost"
            QoS 1 # Or QoS 0
            Prefix "collectd"
            StoreRates true
            Retain false
        </Publish>
    </Plugin>
    EOF
    LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf

Right after starting the system, messages written through the MQTT
plugin would appear almost intantaneously on the broker. After running
the system for 2-3 minutes messages received by the broker were the ones
written to the mqtt plugin 20 seconds earlier, i.e. messages were
delayed by 20 seconds in less than 180 seconds. If the system were left
running, the delay would continue accumulating resulting in messages
being delayed by several hours per day.

[1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop
[2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set

Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
hlounent pushed a commit to hlounent/collectd that referenced this pull request Jan 3, 2021
Previously published messages could get delayed by several hours per day
at least (or even) in case there were (only) around 100 values /
messages published every 10 seconds.

The problem was reproducible on a i7-6700HQ CPU with only the cpu (with
default configuration) and mqtt plugins enabled and the mqtt plugin
configured to send to a local Mosquitto (v2.0.4) broker instance. See
the note collectd#3 for more information.

The same behavior was observed with various other combinations of
mosquitto and collectd versions including e.g. mosquitto 1.6 and
collectd 5.8 series. The latest working versions were mosquitto 1.6.7
and collectd 5.8.1 built with Yocto Zeus (including any patches
OpenEmbedded might have had for the versions).

The mosquitto API documentation [1] says

    mosquitto_loop

    [...] This must be called frequently to keep communications between
    the client and broker working. [...] Outgoing commands, from e.g.
    mosquitto_publish, are normally sent immediately that their function
    is called, but this is not always possible. mosquitto_loop will also
    attempt to send any remaining outgoing messages, which also includes
    commands that are part of the flow for messages with QoS>0.

However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code
in case of some failure but does not indicate whether there is still
more work to do so the callback seem like the only means to decide if
mosquitto_loop still needs to be called.

Secondly, adding the callbacks was not enough, but the timeout argument
of mosquitto_loop calls needed to be changed to 0 ms in order to return
immediately because using the previous value of 1000 ms (which is the
default value; see the note collectd#2) still resulted in messages being
delayed.

The pattern of using the callbacks and the zero timeout was adopted from
the implementation of mosquitto_pub in mosquitto v2.0.4.

This adds some overhead when QoS > 0 because mosquitto_loop may be
called many times until the callbacks on_connect or on_publish. However,
this seems like the only option without better knowledge of how
mosquitto works internally, because adding any timeout to mosquitto_loop
calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only
the cpu and mqtt plugins was in the range of 5 %. The significant
difference of 2 minute perf profiles

    perf record -ag -p <pid of collectd> -- sleep 120

without and with the fix looks like

    perf diff -o 0 main/perf.data fix/perf.data
    #
    # Baseline  Delta Abs  Shared Object       Symbol
    # ........  .........  ..................  ..........................................
    #
        [...]
        0.55%     +5.10%  libmosquitto.so.1   [.] mosquitto_loop.part.0

Notes
=====

1. Maximum number of inflight messages

There is also mosquitto_max_inflight_messages_set, whose API
documentation [2] says

    Set the number of QoS 1 and 2 messages that can be "in flight" at
    one time.  An in flight message is part way through its delivery
    flow. Attempts to send further messages with mosquitto_publish will
    result in the messages being queued until the number of in flight
    messages reduces.

and

    A higher number here results in greater message throughput, but if
    set higher than the maximum in flight messages on the broker may
    lead to delays in the messages being acknowledged.

However, the docs also say

    This function is [deprecated].

and while

    Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM
    option instead.

is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable
in collectd's MQTT plugin

    MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1
    or v3.1.1 clients. [...]

as the MQTT plugin uses the (mosquitto default) protocol level

    MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311.

as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin.

2. 1000 (ms) is the default timeout [1]:

    mosquitto_loop

        libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq,
                    int     timeout,
                    int     max_packets	)

    [...]

    Parameters

    [...]

    timeout	[...] Set negative to use the default of 1000ms.

3. Steps to reproduce

Build and run mosquitto broker

    git clone https://github.com/eclipse/mosquitto.git
    cd mosquitto
    git checkout v2.0.4
    make && sudo make install
    cat << EOF > mosquitto.conf
    listener 1883
    bind_interface lo
    allow_anonymous true
    EOF
    /usr/local/sbin/mosquitto -c mosquitto.conf -v

Observe received messages and their timestamps

    LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#"

Build and run collectd

    git clone git://github.com/collectd/collectd.git
    cd collectd
    git checkout collectd-5.12.0
    sudo rm -rf /opt/collectd && \
        mkdir build && \
        cd build && \
        ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \
        make && \
        sudo make install
    cat << EOF > collectd.conf
    LoadPlugin cpu
    LoadPlugin mqtt
    <Plugin cpu>
        ReportByCpu true
        ReportByState true
        ValuesPercentage false
        ReportNumCpu false
        ReportGuestState false
        SubtractGuestState true
    </Plugin>
    <Plugin mqtt>
        <Publish "localhost">
            Host "localhost"
            Port 1883
            ClientId "localhost"
            QoS 1 # Or QoS 0
            Prefix "collectd"
            StoreRates true
            Retain false
        </Publish>
    </Plugin>
    EOF
    LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf

Right after starting the system, messages written through the MQTT
plugin would appear almost intantaneously on the broker. After running
the system for 2-3 minutes messages received by the broker were the ones
written to the mqtt plugin 20 seconds earlier, i.e. messages were
delayed by 20 seconds in less than 180 seconds. If the system were left
running, the delay would continue accumulating resulting in messages
being delayed by several hours per day.

[1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop
[2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set

Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
hlounent pushed a commit to hlounent/collectd that referenced this pull request Jan 3, 2021
Previously published messages could get delayed by several hours per day
even when values were published (i.e. messages were sent) with
relatively low frequencies (e.g. < 10 values / s) with QoS 1.

The problem was reproducible on a i7-6700HQ CPU with only the cpu (with
default configuration) and mqtt plugins enabled and the mqtt plugin
configured to send to a local Mosquitto (v2.0.4) broker instance. See
the note collectd#3 for more information.

The same behavior was observed with various other combinations of
mosquitto and collectd versions including e.g. mosquitto 1.6 and
collectd 5.8 series. The latest working versions were mosquitto 1.6.7
and collectd 5.8.1 built with Yocto Zeus (including any patches
OpenEmbedded might have had for the versions).

The mosquitto API documentation [1] says

    mosquitto_loop

    [...] This must be called frequently to keep communications between
    the client and broker working. [...] Outgoing commands, from e.g.
    mosquitto_publish, are normally sent immediately that their function
    is called, but this is not always possible. mosquitto_loop will also
    attempt to send any remaining outgoing messages, which also includes
    commands that are part of the flow for messages with QoS>0.

However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code
in case of some failure but does not indicate whether there is still
more work to do so the callback seem like the only means to decide if
mosquitto_loop still needs to be called.

Secondly, adding the callbacks was not enough, but the timeout argument
of mosquitto_loop calls needed to be changed to 0 ms in order to return
immediately because using the previous value of 1000 ms (which is the
default value; see the note collectd#2) still resulted in messages being
delayed.

The pattern of using the callbacks and the zero timeout was adopted from
the implementation of mosquitto_pub in mosquitto v2.0.4.

This adds some overhead when QoS > 0 because mosquitto_loop may be
called many times until the callbacks on_connect or on_publish. However,
this seems like the only option without better knowledge of how
mosquitto works internally, because adding any timeout to mosquitto_loop
calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only
the cpu and mqtt plugins was in the range of 5 %. The significant
difference of 2 minute perf profiles

    perf record -ag -p <pid of collectd> -- sleep 120

without and with the fix looks like

    perf diff -o 0 main/perf.data fix/perf.data
    #
    # Baseline  Delta Abs  Shared Object       Symbol
    # ........  .........  ..................  ..........................................
    #
        [...]
        0.55%     +5.10%  libmosquitto.so.1   [.] mosquitto_loop.part.0

Notes
=====

1. Maximum number of inflight messages

There is also mosquitto_max_inflight_messages_set, whose API
documentation [2] says

    Set the number of QoS 1 and 2 messages that can be "in flight" at
    one time.  An in flight message is part way through its delivery
    flow. Attempts to send further messages with mosquitto_publish will
    result in the messages being queued until the number of in flight
    messages reduces.

and

    A higher number here results in greater message throughput, but if
    set higher than the maximum in flight messages on the broker may
    lead to delays in the messages being acknowledged.

However, the docs also say

    This function is [deprecated].

and while

    Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM
    option instead.

is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable
in collectd's MQTT plugin

    MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1
    or v3.1.1 clients. [...]

as the MQTT plugin uses the (mosquitto default) protocol level

    MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311.

as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin.

2. 1000 (ms) is the default timeout [1]:

    mosquitto_loop

        libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq,
                    int     timeout,
                    int     max_packets	)

    [...]

    Parameters

    [...]

    timeout	[...] Set negative to use the default of 1000ms.

3. Steps to reproduce

Build and run mosquitto broker

    git clone https://github.com/eclipse/mosquitto.git
    cd mosquitto
    git checkout v2.0.4
    make && sudo make install
    cat << EOF > mosquitto.conf
    listener 1883
    bind_interface lo
    allow_anonymous true
    EOF
    /usr/local/sbin/mosquitto -c mosquitto.conf -v

Observe received messages and their timestamps

    LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#"

Build and run collectd

    git clone git://github.com/collectd/collectd.git
    cd collectd
    git checkout collectd-5.12.0
    sudo rm -rf /opt/collectd && \
        mkdir build && \
        cd build && \
        ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \
        make && \
        sudo make install
    cat << EOF > collectd.conf
    LoadPlugin cpu
    LoadPlugin mqtt
    <Plugin cpu>
        ReportByCpu true
        ReportByState true
        ValuesPercentage false
        ReportNumCpu false
        ReportGuestState false
        SubtractGuestState true
    </Plugin>
    <Plugin mqtt>
        <Publish "localhost">
            Host "localhost"
            Port 1883
            ClientId "localhost"
            QoS 1
            Prefix "collectd"
            StoreRates true
            Retain false
        </Publish>
    </Plugin>
    EOF
    LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf

Right after starting the system, messages written through the MQTT
plugin would appear almost intantaneously on the broker. After running
the system for 2-3 minutes messages received by the broker were the ones
written to the mqtt plugin 20 seconds earlier, i.e. messages were
delayed by 20 seconds in less than 180 seconds. If the system were left
running, the delay would continue accumulating resulting in messages
being delayed by several hours per day.

[1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop
[2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set

Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
hlounent pushed a commit to hlounent/collectd that referenced this pull request Jan 3, 2021
Previously published messages could get delayed by several hours per day
even when values were published (i.e. messages were sent) with
relatively low frequencies (e.g. < 10 values / s) with QoS 1.

The problem was reproducible on a i7-6700HQ CPU with only the cpu (with
default configuration) and mqtt plugins enabled and the mqtt plugin
configured to send to a local Mosquitto (v2.0.4) broker instance. See
the note collectd#3 for more information.

The same behavior was observed with various other combinations of
mosquitto and collectd versions including e.g. mosquitto 1.6 and
collectd 5.8 series. The latest working versions were mosquitto 1.6.7
and collectd 5.8.1 built with Yocto Zeus (including any patches
OpenEmbedded might have had for the versions).

The mosquitto API documentation [1] says

    mosquitto_loop

    [...] This must be called frequently to keep communications between
    the client and broker working. [...] Outgoing commands, from e.g.
    mosquitto_publish, are normally sent immediately that their function
    is called, but this is not always possible. mosquitto_loop will also
    attempt to send any remaining outgoing messages, which also includes
    commands that are part of the flow for messages with QoS>0.

However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code
in case of some failure but does not indicate whether there is still
more work to do so the callback seem like the only means to decide if
mosquitto_loop still needs to be called.

Secondly, adding the callbacks was not enough, but the timeout argument
of mosquitto_loop calls needed to be changed to 0 ms in order to return
immediately because using the previous value of 1000 ms (which is the
default value; see the note collectd#2) still resulted in messages being
delayed.

The pattern of using the callbacks and the zero timeout was adopted from
the implementation of mosquitto_pub in mosquitto v2.0.4.

This adds some overhead when QoS > 0 because mosquitto_loop may be
called many times until the callbacks on_connect or on_publish. However,
this seems like the only option without better knowledge of how
mosquitto works internally, because adding any timeout to mosquitto_loop
calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only
the cpu and mqtt plugins was in the range of 5 %. The significant
difference of 2 minute perf profiles

    perf record -ag -p <pid of collectd> -- sleep 120

without and with the fix looks like

    perf diff -o 0 main/perf.data fix/perf.data
    #
    # Baseline  Delta Abs  Shared Object       Symbol
    # ........  .........  ..................  ..........................................
    #
        [...]
        0.55%     +5.10%  libmosquitto.so.1   [.] mosquitto_loop.part.0

Notes
=====

1. Maximum number of inflight messages

There is also mosquitto_max_inflight_messages_set, whose API
documentation [2] says

    Set the number of QoS 1 and 2 messages that can be "in flight" at
    one time.  An in flight message is part way through its delivery
    flow. Attempts to send further messages with mosquitto_publish will
    result in the messages being queued until the number of in flight
    messages reduces.

and

    A higher number here results in greater message throughput, but if
    set higher than the maximum in flight messages on the broker may
    lead to delays in the messages being acknowledged.

However, the docs also say

    This function is [deprecated].

and while

    Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM
    option instead.

is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable
in collectd's MQTT plugin

    MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1
    or v3.1.1 clients. [...]

as the MQTT plugin uses the (mosquitto default) protocol level

    MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311.

as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin.

2. 1000 (ms) is the default timeout [1]:

    mosquitto_loop

        libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq,
                    int     timeout,
                    int     max_packets	)

    [...]

    Parameters

    [...]

    timeout	[...] Set negative to use the default of 1000ms.

3. Steps to reproduce

Build and run mosquitto broker

    git clone https://github.com/eclipse/mosquitto.git
    cd mosquitto
    git checkout v2.0.4
    make && sudo make install
    cat << EOF > mosquitto.conf
    listener 1883
    bind_interface lo
    allow_anonymous true
    EOF
    /usr/local/sbin/mosquitto -c mosquitto.conf -v

Observe received messages and their timestamps

    LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#"

Build and run collectd

    git clone git://github.com/collectd/collectd.git
    cd collectd
    git checkout collectd-5.12.0
    sudo rm -rf /opt/collectd && \
        mkdir build && \
        cd build && \
        ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \
        make && \
        sudo make install
    cat << EOF > collectd.conf
    LoadPlugin cpu
    LoadPlugin mqtt
    <Plugin cpu>
        ReportByCpu true
        ReportByState true
        ValuesPercentage false
        ReportNumCpu false
        ReportGuestState false
        SubtractGuestState true
    </Plugin>
    <Plugin mqtt>
        <Publish "localhost">
            Host "localhost"
            Port 1883
            ClientId "localhost"
            QoS 1
            Prefix "collectd"
            StoreRates true
            Retain false
        </Publish>
    </Plugin>
    EOF
    LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf

Right after starting the system, messages written through the MQTT
plugin would appear almost intantaneously on the broker. After running
the system for 2-3 minutes messages received by the broker were the ones
written to the mqtt plugin 20 seconds earlier, i.e. messages were
delayed by 20 seconds in less than 180 seconds. If the system were left
running, the delay would continue accumulating resulting in messages
being delayed by several hours per day.

[1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop
[2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set

Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
hlounent pushed a commit to hlounent/collectd that referenced this pull request Jan 3, 2021
Previously published messages could get delayed by several hours per day
even when values were published (i.e. messages were sent) with
relatively low frequencies (e.g. < 10 values / s) with QoS 1.

The problem was reproducible on a i7-6700HQ CPU with only the cpu (with
default configuration) and mqtt plugins enabled and the mqtt plugin
configured to send to a local Mosquitto (v2.0.4) broker instance. See
the note collectd#3 for more information.

The same behavior was observed with various other combinations of
mosquitto and collectd versions including e.g. mosquitto 1.6 and
collectd 5.8 series. The latest working versions were mosquitto 1.6.7
and collectd 5.8.1 built with Yocto Zeus (including any patches
OpenEmbedded might have had for the versions).

The mosquitto API documentation [1] says

    mosquitto_loop

    [...] This must be called frequently to keep communications between
    the client and broker working. [...] Outgoing commands, from e.g.
    mosquitto_publish, are normally sent immediately that their function
    is called, but this is not always possible. mosquitto_loop will also
    attempt to send any remaining outgoing messages, which also includes
    commands that are part of the flow for messages with QoS>0.

However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code
in case of some failure but does not indicate whether there is still
more work to do so the callback seem like the only means to decide if
mosquitto_loop still needs to be called.

Secondly, adding the callbacks was not enough, but the timeout argument
of mosquitto_loop calls needed to be changed to 0 ms in order to return
immediately because using the previous value of 1000 ms (which is the
default value; see the note collectd#2) still resulted in messages being
delayed.

The pattern of using the callbacks and the zero timeout was adopted from
the implementation of mosquitto_pub in mosquitto v2.0.4.

This adds some overhead when QoS > 0 because mosquitto_loop may be
called many times until the callbacks on_connect or on_publish. However,
this seems like the only option without better knowledge of how
mosquitto works internally, because adding any timeout to mosquitto_loop
calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only
the cpu and mqtt plugins was in the range of 5 %. The significant
difference of 2 minute perf profiles

    perf record -ag -p <pid of collectd> -- sleep 120

without and with the fix looks like

    perf diff -o 0 main/perf.data fix/perf.data
    #
    # Baseline  Delta Abs  Shared Object       Symbol
    # ........  .........  ..................  ..........................................
    #
        [...]
        0.55%     +5.10%  libmosquitto.so.1   [.] mosquitto_loop.part.0
        [...]

Notes
=====

1. Maximum number of inflight messages

There is also mosquitto_max_inflight_messages_set, whose API
documentation [2] says

    Set the number of QoS 1 and 2 messages that can be "in flight" at
    one time.  An in flight message is part way through its delivery
    flow. Attempts to send further messages with mosquitto_publish will
    result in the messages being queued until the number of in flight
    messages reduces.

and

    A higher number here results in greater message throughput, but if
    set higher than the maximum in flight messages on the broker may
    lead to delays in the messages being acknowledged.

However, the docs also say

    This function is [deprecated].

and while

    Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM
    option instead.

is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable
in collectd's MQTT plugin

    MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1
    or v3.1.1 clients. [...]

as the MQTT plugin uses the (mosquitto default) protocol level

    MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311.

as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin.

2. 1000 (ms) is the default timeout [1]:

    mosquitto_loop

        libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq,
                    int     timeout,
                    int     max_packets	)

    [...]

    Parameters

    [...]

    timeout	[...] Set negative to use the default of 1000ms.

3. Steps to reproduce

Build and run mosquitto broker

    git clone https://github.com/eclipse/mosquitto.git
    cd mosquitto
    git checkout v2.0.4
    make && sudo make install
    cat << EOF > mosquitto.conf
    listener 1883
    bind_interface lo
    allow_anonymous true
    EOF
    /usr/local/sbin/mosquitto -c mosquitto.conf -v

Observe received messages and their timestamps

    LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#"

Build and run collectd

    git clone git://github.com/collectd/collectd.git
    cd collectd
    git checkout collectd-5.12.0
    sudo rm -rf /opt/collectd && \
        mkdir build && \
        cd build && \
        ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \
        make && \
        sudo make install
    cat << EOF > collectd.conf
    LoadPlugin cpu
    LoadPlugin mqtt
    <Plugin cpu>
        ReportByCpu true
        ReportByState true
        ValuesPercentage false
        ReportNumCpu false
        ReportGuestState false
        SubtractGuestState true
    </Plugin>
    <Plugin mqtt>
        <Publish "localhost">
            Host "localhost"
            Port 1883
            ClientId "localhost"
            QoS 1
            Prefix "collectd"
            StoreRates true
            Retain false
        </Publish>
    </Plugin>
    EOF
    LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf

Right after starting the system, messages written through the MQTT
plugin would appear almost intantaneously on the broker. After running
the system for 2-3 minutes messages received by the broker were the ones
written to the mqtt plugin 20 seconds earlier, i.e. messages were
delayed by 20 seconds in less than 180 seconds. If the system were left
running, the delay would continue accumulating resulting in messages
being delayed by several hours per day.

[1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop
[2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set

Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
hlounent pushed a commit to hlounent/collectd that referenced this pull request Jan 3, 2021
Previously published messages could get delayed by several hours per day
even when values were published (i.e. messages were sent) with
relatively low frequencies (e.g. < 10 values / s) with QoS 1.

The problem was reproducible on a i7-6700HQ CPU with only the cpu (with
default configuration) and mqtt plugins enabled and the mqtt plugin
configured to send to a local Mosquitto (v2.0.4) broker instance. See
the note collectd#3 for more information.

The same behavior was observed with various other combinations of
mosquitto and collectd versions including e.g. mosquitto 1.6 and
collectd 5.8 series. The latest working versions were mosquitto 1.6.7
and collectd 5.8.1 built with Yocto Zeus (including any patches
OpenEmbedded might have had for the versions).

The mosquitto API documentation [1] says

    mosquitto_loop

    [...] This must be called frequently to keep communications between
    the client and broker working. [...] Outgoing commands, from e.g.
    mosquitto_publish, are normally sent immediately that their function
    is called, but this is not always possible. mosquitto_loop will also
    attempt to send any remaining outgoing messages, which also includes
    commands that are part of the flow for messages with QoS>0.

However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code
in case of some failure but does not indicate whether there is still
more work to do so the callback seem like the only means to decide if
mosquitto_loop still needs to be called.

Secondly, adding the callbacks was not enough, but the timeout argument
of mosquitto_loop calls needed to be changed to 0 ms in order to return
immediately because using the previous value of 1000 ms (which is the
default value; see the note collectd#2) still resulted in messages being
delayed.

The pattern of using the callbacks and the zero timeout was adopted from
the implementation of mosquitto_pub in mosquitto v2.0.4.

This adds some overhead when QoS > 0 because mosquitto_loop may be
called many times until the callbacks on_connect or on_publish. However,
this seems like the only option without better knowledge of how
mosquitto works internally, because adding any timeout to mosquitto_loop
calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only
the cpu and mqtt plugins was in the range of 5 %. The significant
difference of 2 minute perf profiles

    perf record -ag -p <pid of collectd> -- sleep 120

without and with the fix looks like

    perf diff -o 0 main/perf.data fix/perf.data
    #
    # Baseline  Delta Abs  Shared Object       Symbol
    # ........  .........  ..................  ..........................................
    #
        [...]
        0.55%     +5.10%  libmosquitto.so.1   [.] mosquitto_loop.part.0
        [...]

Notes
=====

1. Maximum number of inflight messages

There is also mosquitto_max_inflight_messages_set, whose API
documentation [2] says

    Set the number of QoS 1 and 2 messages that can be "in flight" at
    one time.  An in flight message is part way through its delivery
    flow. Attempts to send further messages with mosquitto_publish will
    result in the messages being queued until the number of in f
8000
light
    messages reduces.

and

    A higher number here results in greater message throughput, but if
    set higher than the maximum in flight messages on the broker may
    lead to delays in the messages being acknowledged.

However, the docs also say

    This function is [deprecated].

and while

    Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM
    option instead.

is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable
in collectd's MQTT plugin

    MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1
    or v3.1.1 clients. [...]

as the MQTT plugin uses the (mosquitto default) protocol level

    MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311.

as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin.

2. 1000 (ms) is the default timeout [1]:

    mosquitto_loop

        libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq,
                    int     timeout,
                    int     max_packets	)

    [...]

    Parameters

    [...]

    timeout	[...] Set negative to use the default of 1000ms.

3. Steps to reproduce

Build and run mosquitto broker

    git clone https://github.com/eclipse/mosquitto.git
    cd mosquitto
    git checkout v2.0.4
    make && sudo make install
    cat << EOF > mosquitto.conf
    listener 1883
    bind_interface lo
    allow_anonymous true
    EOF
    /usr/local/sbin/mosquitto -c mosquitto.conf -v

Observe received messages and their timestamps

    LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#"

Build and run collectd

    git clone git://github.com/collectd/collectd.git
    cd collectd
    git checkout collectd-5.12.0
    sudo rm -rf /opt/collectd && \
        mkdir build && \
        cd build && \
        ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \
        make && \
        sudo make install
    cat << EOF > collectd.conf
    LoadPlugin cpu
    LoadPlugin mqtt
    <Plugin cpu>
        ReportByCpu true
        ReportByState true
        ValuesPercentage false
        ReportNumCpu false
        ReportGuestState false
        SubtractGuestState true
    </Plugin>
    <Plugin mqtt>
        <Publish "localhost">
            Host "localhost"
            Port 1883
            ClientId "localhost"
            QoS 1
            Prefix "collectd"
            StoreRates true
            Retain false
        </Publish>
    </Plugin>
    EOF
    LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf

Right after starting the system, messages written through the MQTT
plugin would appear almost intantaneously on the broker. After running
the system for 2-3 minutes messages received by the broker were the ones
written to the mqtt plugin 20 seconds earlier, i.e. messages were
delayed by 20 seconds in less than 180 seconds. If the system were left
running, the delay would continue accumulating resulting in messages
being delayed by several hours per day.

[1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop
[2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set

Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
hlounent pushed a commit to hlounent/collectd that referenced this pull request Jan 3, 2021
Previously published messages could get delayed by several hours per day
even when values were published (i.e. messages were sent) with
relatively low frequencies (e.g. < 10 values / s) with QoS 1.

The problem was reproducible on a i7-6700HQ CPU with only the cpu (with
default configuration) and mqtt plugins enabled and the mqtt plugin
configured to send to a local Mosquitto (v2.0.4) broker instance. See
the note collectd#3 for more information.

The same behavior was observed with various other combinations of
mosquitto and collectd versions including e.g. mosquitto 1.6 and
collectd 5.8 series. The latest working versions were mosquitto 1.6.7
and collectd 5.8.1 built with Yocto Zeus (including any patches
OpenEmbedded might have had for the versions).

The mosquitto API documentation [1] says

    mosquitto_loop

    [...] This must be called frequently to keep communications between
    the client and broker working. [...] Outgoing commands, from e.g.
    mosquitto_publish, are normally sent immediately that their function
    is called, but this is not always possible. mosquitto_loop will also
    attempt to send any remaining outgoing messages, which also includes
    commands that are part of the flow for messages with QoS>0.

However, mosquitto_loop only returns MOSQ_ERR_SUCCESS or an error code
in case of some failure but does not indicate whether there is still
more work to do so the callback seem like the only means to decide if
mosquitto_loop still needs to be called.

Secondly, adding the callbacks was not enough, but the timeout argument
of mosquitto_loop calls needed to be changed to 0 ms in order to return
immediately because using the previous value of 1000 ms (which is the
default value; see the note collectd#2) still resulted in messages being
delayed.

The pattern of using the callbacks and the zero timeout was adopted from
the implementation of mosquitto_pub in mosquitto v2.0.4.

This adds some overhead when QoS > 0 because mosquitto_loop may be
called many times until the callbacks on_connect or on_publish. However,
this seems like the only option without better knowledge of how
mosquitto works internally, because adding any timeout to mosquitto_loop
calls leads to delays. The overhead with QoS=1 on a i7-6700HQ with only
the cpu and mqtt plugins was in the range of 5 %. The significant
difference of 2 minute perf profiles

    perf record -ag -p <pid of collectd> -- sleep 120

without and with the fix looks like

    perf diff -o 0 main/perf.data fix/perf.data
    #
    # Baseline  Delta Abs  Shared Object       Symbol
    # ........  .........  ..................  ..........................................
    #
        [...]
        0.55%     +5.10%  libmosquitto.so.1   [.] mosquitto_loop.part.0
        [...]

Notes
=====

1. Maximum number of inflight messages

There is also mosquitto_max_inflight_messages_set, whose API
documentation [2] says

    Set the number of QoS 1 and 2 messages that can be "in flight" at
    one time.  An in flight message is part way through its delivery
    flow. Attempts to send further messages with mosquitto_publish will
    result in the messages being queued until the number of in flight
    messages reduces.

and

    A higher number here results in greater message throughput, but if
    set higher than the maximum in flight messages on the broker may
    lead to delays in the messages being acknowledged.

However, the docs also say

    This function is [deprecated].

and while

    Use the mosquitto_int_option function with the MOSQ_OPT_SEND_MAXIMUM
    option instead.

is offered as a replacement, MOSQ_OPT_SEND_MAXIMUM seems not applicable
in collectd's MQTT plugin

    MOSQ_OPT_SEND_MAXIMUM - [...] This option is not valid for MQTT v3.1
    or v3.1.1 clients. [...]

as the MQTT plugin uses the (mosquitto default) protocol level

    MOSQ_OPT_PROTOCOL_VERSION - [...] Defaults to MQTT_PROTOCOL_V311.

as MOSQ_OPT_PROTOCOL_VERSION is not set by the plugin.

2. 1000 (ms) is the default timeout [1]:

    mosquitto_loop

        libmosq_EXPORT int mosquitto_loop(struct mosquitto *mosq,
                    int     timeout,
                    int     max_packets	)

    [...]

    Parameters

    [...]

    timeout	[...] Set negative to use the default of 1000ms.

3. Steps to reproduce

Build and run mosquitto broker

    git clone https://github.com/eclipse/mosquitto.git
    cd mosquitto
    git checkout v2.0.4
    make && sudo make install
    cat << EOF > mosquitto.conf
    listener 1883
    bind_interface lo
    allow_anonymous true
    EOF
    /usr/local/sbin/mosquitto -c mosquitto.conf -v

Observe received messages and their timestamps

    LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/mosquitto_sub -t "collectd/#"

Build and run collectd

    git clone git://github.com/collectd/collectd.git
    cd collectd
    git checkout collectd-5.12.0
    sudo rm -rf /opt/collectd && \
        mkdir build && \
        cd build && \
        ../configure --disable-all-plugins --enable-mqtt --enable-cpu --enable-debug && \
        make && \
        sudo make install
    cat << EOF > collectd.conf
    LoadPlugin cpu
    LoadPlugin mqtt
    <Plugin cpu>
        ReportByCpu true
        ReportByState true
        ValuesPercentage false
        ReportNumCpu false
        ReportGuestState false
        SubtractGuestState true
    </Plugin>
    <Plugin mqtt>
        <Publish "localhost">
            Host "localhost"
            Port 1883
            ClientId "localhost"
            QoS 1
            Prefix "collectd"
            StoreRates true
            Retain false
        </Publish>
    </Plugin>
    EOF
    LD_LIBRARY_PATH=/usr/local/lib /opt/collectd/sbin/collectd -f -B -C collectd.conf

Right after starting the system, messages written through the MQTT
plugin would appear almost intantaneously on the broker. After running
the system for 2-3 minutes messages received by the broker were the ones
written to the mqtt plugin 20 seconds earlier, i.e. messages were
delayed by 20 seconds in less than 180 seconds. If the system were left
running, the delay would continue accumulating resulting in messages
being delayed by several hours per day.

[1] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_loop
[2] https://mosquitto.org/api/files/mosquitto-h.html#mosquitto_max_inflight_messages_set

Signed-off-by: Hannu Lounento <hannu.lounento@iki.fi>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet < 49B9 /div>
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0