Comparing 8.1.1...8.1.2 · valkey-io/valkey · GitHub

Comparing changes

base: 8.1.1
compare: 8.1.2
  • 15 commits
  • 24 files changed
  • 10 contributors

Commits on Jun 11, 2025

  1. Properly escape double quotes and backslash in sdscatrepr (#2036)

    Fixes #2035, a bug introduced
    in #1342
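
    As a rough illustration (not valkey's actual code; `repr_escape` is a
    hypothetical stand-in for the escaping sdscatrepr performs), both the
    double quote and the backslash must be emitted with a leading backslash
    so the printed representation stays unambiguous:

    ```c
    #include <assert.h>
    #include <string.h>

    /* Hypothetical sketch of sdscatrepr-style escaping: '"' and '\' are
     * both prefixed with a backslash on output. */
    static void repr_escape(const char *in, char *out, size_t outlen) {
        size_t j = 0;
        for (size_t i = 0; in[i] != '\0' && j + 2 < outlen; i++) {
            if (in[i] == '"' || in[i] == '\\') out[j++] = '\\';
            out[j++] = in[i];
        }
        out[j] = '\0';
    }

    int main(void) {
        char buf[64];
        repr_escape("a\"b\\c", buf, sizeof(buf));
        assert(strcmp(buf, "a\\\"b\\\\c") == 0);
        return 0;
    }
    ```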
    
    ---------
    
    Signed-off-by: Fusl <fusl@meo.ws>
    Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
    Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
    Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
    2 people authored and hpatro committed Jun 11, 2025
    Commit: e5686ba
  2. Fix random element in skewed sparse hash table (#2085)

    In cases when half of the elements in a hashtable have been deleted in
    scan order, i.e. using scan and deleting the returned elements, the
    random selection algorithm becomes very slow. Also, if a random element
    is deleted repeatedly, the hash table can become skewed, making random
    selection very slow. Behavior without this fix:
    
    * The fair random algorithm samples some elements by picking a random
      cursor and then scanning from that point until it finds a number of
      elements and then picking a random one of them.
    * Now, assume there is an empty sequence of buckets (in scan order).
    * When we pick a random cursor and it points into this empty sequence,
      we continue scanning until we find some elements and then pick one of
      these. Therefore, the buckets just after an empty sequence are more
      likely to be picked than other buckets.
    * If we repeat SPOP many times, this area just after the empty section
      will be emptied too and the empty section grows, the problem gets
      worse and worse.
    
    Bucket distribution (x = non-empty bucket, . = empty bucket)
    
        x..xx.x.xx.x.........x.xx..x.x..x.xx..x.x
                      ^      ^ ^^  ^ ^
                      |      | ||  | |
                    random   elements
                    cursor    sampled
    
    This PR changes the sampling to pick a new random index to scan for each
    new bucket chain that has been sampled. This is more similar to how the
    fair random works in dict.c.
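
    The idea can be sketched in a toy model (this is not valkey's
    hashtable code; `pick_random_nonempty` and the flat bucket array are
    hypothetical, and the real code samples bucket chains rather than
    single buckets): a fresh random index is drawn for every probe instead
    of scanning onward from one cursor, so buckets just after an empty run
    are not over-represented:

    ```c
    #include <assert.h>
    #include <stdlib.h>

    #define NBUCKETS 16

    /* Toy model: buckets[i] != 0 means non-empty. Re-picking a random
     * index per probe keeps the selection unbiased by empty runs. */
    static int pick_random_nonempty(const int *buckets) {
        for (;;) {
            int i = rand() % NBUCKETS;
            if (buckets[i]) return i;
        }
    }

    int main(void) {
        int buckets[NBUCKETS] = {0};
        buckets[3] = buckets[10] = 1;
        srand(42);
        for (int n = 0; n < 100; n++) {
            int i = pick_random_nonempty(buckets);
            assert(i == 3 || i == 10);
        }
        return 0;
    }
    ```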
    
    Additionally, this PR includes two small optimizations/tuning:
    
    1. The change allows us to lower the sample sizes for fair random
       element selection. There are unit tests to verify the fairness of the
       algorithm.
    2. Additionally, the sampling size is decreased even more in very sparse
       tables to prevent it from sampling empty buckets many times which
       makes it slow.
    
    One existing unit test is changed from a stack-allocated to a
    heap-allocated array, because it was observed to use too much stack
    memory when running locally with the `--large-memory` option.
    
    Fixes #2019.
    
    Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
    Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
    zuiderkwast authored and hpatro committed Jun 11, 2025
    Commit: b62e778
  3. Only mark the client reprocessing flag when unblocked on keys (#2109)

    When we refactored the blocking framework, we introduced the client
    reprocessing infrastructure. In cases where the client was blocked on
    keys, it attempts to reprocess the command. One challenge was keeping
    track of the command timeout: since we are reprocessing, we do not want
    to re-register the client with a fresh timeout each time. The solution
    was to consider the client reprocessing flag when the client is
    blockedOnKeys:
    
    ```c
    if (!c->flag.reprocessing_command) {
        /* If the client is re-processing the command, we do not set the timeout
         * because we need to retain the client's original timeout. */
        c->bstate->timeout = timeout;
    }
    ```
    
    However, this introduced a new issue. There are cases where the client
    undergoes consecutive blocking of different types, for example:
    ```
    CLIENT PAUSE 10000 ALL
    BZPOPMAX zset 1
    ```
    would leave the client blocked on the zset endlessly if nothing is
    written to it.
    
    **Credits to @uriyage for locating this with his fuzzer testing**
    
    The suggested solution is to only flag the client when it is
    specifically unblocked on keys.
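
    A minimal model of the suggested solution (hypothetical types and
    names, not valkey's client structure): the reprocessing flag is set
    only when the unblock reason is a ready key, so pause-style unblocks
    never suppress re-registration of the timeout:

    ```c
    #include <assert.h>

    typedef enum { UNBLOCK_TIMEOUT, UNBLOCK_KEY_READY, UNBLOCK_UNPAUSE } unblock_reason;

    struct client { int reprocessing_command; };

    /* Flag the client for reprocessing only when it was unblocked because
     * a key it waited on became ready. */
    static void unblock_client(struct client *c, unblock_reason why) {
        c->reprocessing_command = (why == UNBLOCK_KEY_READY);
    }

    int main(void) {
        struct client c = {0};
        unblock_client(&c, UNBLOCK_KEY_READY);
        assert(c.reprocessing_command == 1);
        unblock_client(&c, UNBLOCK_UNPAUSE);
        assert(c.reprocessing_command == 0);
        return 0;
    }
    ```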
    
    ---------
    
    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
    Co-authored-by: Binbin <binloveplay1314@qq.com>
    Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
    2 people authored and hpatro committed Jun 11, 2025
    Commit: 9cd34db
  4. Fix bad slot used in sharded pubsub unsubscribe (#2137)

    This commit fixes two issues in `pubsubUnsubscribeChannel` that could
    lead to memory corruption:
    
    1. When calculating the slot for a channel, we were using getKeySlot()
    which might use the current_client's slot if available. This is
    problematic when a client kills another client (e.g., via CLIENT KILL
    command) as the slot won't match the channel's actual slot.
    
    2. The `found` variable was not initialized to `NULL`, causing the
    serverAssert to potentially pass incorrectly when the hashtable lookup
    failed, leading to memory corruption in subsequent operations.
    
    The fix:
    - Calculate the slot directly from the channel name using keyHashSlot()
    instead of relying on the current client's slot
    - Initialize 'found' to NULL
    
    Added a test case that reproduces the issue by having one client kill
    another client that is subscribed to a sharded pubsub channel during a
    transaction.
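
    The first fix can be illustrated with a simplified model (valkey's
    keyHashSlot uses CRC16 mod 16384; `toy_hash` below is a hypothetical
    stand-in): the slot is derived from the channel name itself, honoring
    a `{tag}` hashtag, never from whichever client happens to be current:

    ```c
    #include <assert.h>
    #include <string.h>

    #define CLUSTER_SLOTS 16384

    /* Hypothetical stand-in hash (valkey uses CRC16). */
    static unsigned toy_hash(const char *s, size_t n) {
        unsigned h = 5381;
        for (size_t i = 0; i < n; i++) h = h * 33 + (unsigned char)s[i];
        return h;
    }

    /* Mirrors keyHashSlot's hashtag rule: if the name contains a
     * non-empty {tag}, only the tag is hashed. */
    static unsigned channel_slot(const char *name) {
        const char *open = strchr(name, '{');
        if (open) {
            const char *close = strchr(open + 1, '}');
            if (close && close > open + 1)
                return toy_hash(open + 1, (size_t)(close - open - 1)) % CLUSTER_SLOTS;
        }
        return toy_hash(name, strlen(name)) % CLUSTER_SLOTS;
    }

    int main(void) {
        /* Same hashtag -> same slot, regardless of the current client. */
        assert(channel_slot("{user1}:news") == channel_slot("{user1}:alerts"));
        return 0;
    }
    ```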
    
    Crash log (captured after initializing the variable 'found' to NULL;
    without the initialization, memory corruption could occur instead):
    ```
    VALKEY BUG REPORT START: Cut & paste starting from here ===
    59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
    59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
    59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
    59707:M 24 May 2025 23:10:40.429 # client->argc = 0
    59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
    59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true
    
    ------ STACK TRACE ------
    
    Backtrace:
    0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
    1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
    2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
    3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
    4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
    5   valkey-server                       0x00000001048f3cbc freeClient + 792
    6   valkey-server                       0x0000000104900870 clientKillCommand + 356
    7   valkey-server                       0x00000001048d1790 call + 428
    8   valkey-server                       0x000000010496ef4c execCommand + 872
    9   valkey-server                       0x00000001048d1790 call + 428
    10  valkey-server                       0x00000001048d3a44 processCommand + 5056
    11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
    12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
    13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
    14  valkey-server                       0x0000000104a182e8 callHandler + 60
    15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
    16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
    17  valkey-server                       0x00000001048b6598 aeMain + 64
    18  valkey-server                       0x00000001048dcecc main + 4084
    19  dyld                                0x0000000186b34274 start + 2840
    ```
    
    ---------
    
    Signed-off-by: Uri Yagelnik <uriy@amazon.com>
    Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
    uriyage authored and hpatro committed Jun 11, 2025
    Commit: 7e5ace1
  5. Free module context even if there was no content written in auxsave2 (#2132)
    
    When a module saves to the Aux metadata with aux_save2 callback, the
    module context is not freed if the module didn't save anything in the
    callback.
    
    See https://github.com/valkey-io/valkey/blob/8.1.1/src/rdb.c#L1277-L1284
    Note that we return 0 there; however, we should still perform the
    cleanup done in
    https://github.com/valkey-io/valkey/blob/8.1.1/src/rdb.c#L1300-L1303
    
    Fixes #2125
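
    The leak pattern can be sketched abstractly (hypothetical names, not
    the rdb.c code): the context must be freed on the early-return path,
    where the callback wrote nothing, just as on the normal path:

    ```c
    #include <assert.h>
    #include <stdlib.h>

    static int freed; /* instrumentation for the example */

    static void free_ctx(void *ctx) { free(ctx); freed = 1; }

    /* Sketch: an aux-save routine must release its context on every path,
     * including the "nothing was written" early return. */
    static int save_aux(int wrote_anything) {
        void *ctx = malloc(16);
        if (ctx == NULL) return -1;
        if (!wrote_anything) {
            free_ctx(ctx); /* the fix: cleanup before returning 0 */
            return 0;
        }
        /* ... write payload using ctx ... */
        free_ctx(ctx);
        return 1;
    }

    int main(void) {
        freed = 0;
        assert(save_aux(0) == 0);
        assert(freed == 1); /* context released even though nothing was saved */
        return 0;
    }
    ```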
    
    Signed-off-by: Jacob Murphy <jkmurphy@google.com>
    Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
    murphyjacob4 authored and hpatro committed Jun 11, 2025
    Commit: d1e75ed
  6. Detect SSL_new() returning NULL in outgoing connections (#2140)

    When creating an outgoing TLS connection, we don't check if `SSL_new()`
    returned NULL.
    
    Without this patch, the check was done only for incoming connections in
    `connCreateAcceptedTLS()`. This patch moves the check to
    `createTLSConnection()` which is used both for incoming and outgoing
    connections.
    
    This check makes sure we fail the connection before going any further.
    For example, when `connCreate()` is followed by `connConnect()`, the
    latter returns `C_ERR`, which is commonly detected where outgoing
    connections are established, such as where a replica connects to a
    primary.
    
    ```c
    int connectWithPrimary(void) {
        server.repl_transfer_s = connCreate(connTypeOfReplication());
        if (connConnect(server.repl_transfer_s, server.primary_host, server.primary_port, server.bind_source_addr,
                        server.repl_mptcp, syncWithPrimary) == C_ERR) {
            serverLog(LL_WARNING, "Unable to connect to PRIMARY: %s", connGetLastError(server.repl_transfer_s));
            connClose(server.repl_transfer_s);
            server.repl_transfer_s = NULL;
            return C_ERR;
        }
    ```
    
    For a more thorough explanation, see
    #1939 (comment).
    
    Might fix #1939.
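
    The shape of the check can be modeled without OpenSSL (the `conn`
    struct and `fake_ssl_new` are hypothetical; the real code checks the
    return value of OpenSSL's `SSL_new()`): detect the NULL and put the
    connection into an error state so `connConnect()` can return `C_ERR`:

    ```c
    #include <assert.h>
    #include <stddef.h>

    enum { CONN_STATE_NONE = 0, CONN_STATE_ERROR = -1 };

    typedef struct { void *ssl; int state; } conn;

    /* Simulated SSL_new(): returns NULL on failure. */
    static void *fake_ssl_new(int fail) { return fail ? NULL : (void *)0x1; }

    /* Sketch of createTLSConnection(): fail fast on a NULL SSL object
     * instead of carrying a half-initialized connection forward. */
    static conn create_tls_connection(int fail) {
        conn c = { NULL, CONN_STATE_NONE };
        c.ssl = fake_ssl_new(fail);
        if (c.ssl == NULL) c.state = CONN_STATE_ERROR;
        return c;
    }

    int main(void) {
        assert(create_tls_connection(1).state == CONN_STATE_ERROR);
        assert(create_tls_connection(0).state == CONN_STATE_NONE);
        return 0;
    }
    ```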
    
    Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
    Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
    zuiderkwast authored and hpatro committed Jun 11, 2025
    Commit: b09e4c1
  7. Correctly cast the extension lengths (#2144)

    Correctly use a 32 bit integer for accumulating the length of ping
    extensions.
    
    The current code may accidentally truncate the length of an extension
    that is greater than 64 KiB and fail the validation check. We don't
    currently emit any extensions of this length, but if we were to in the
    future, older nodes (without this fix) would silently drop packets
    from newer nodes. We should backport this to all versions.
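
    The truncation hazard itself is easy to demonstrate (the variable
    names are illustrative, not the cluster code's identifiers): a length
    over 64 KiB wraps when held in a 16-bit integer, so a later validation
    against the true total mismatches:

    ```c
    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        uint32_t ext_len = 70000;            /* extension > 64 KiB */
        uint16_t narrow = (uint16_t)ext_len; /* too-narrow accumulator */
        assert(narrow == 4464);              /* 70000 mod 65536: wrapped */
        assert(narrow != ext_len);           /* validation would fail */
        uint32_t wide = ext_len;             /* 32-bit accumulator: exact */
        assert(wide == 70000);
        return 0;
    }
    ```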
    
    Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
    Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
    madolson authored and hpatro committed Jun 11, 2025
    Commit: 0a3186a
  8. Incorporate Redis CVE for CVE-2025-27151 (#2146)

    Resolves #2145
    
    Incorporate the CVE patch that was sent to us by Redis Ltd.
    
    ---------
    
    Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
    Co-authored-by: Ping Xie <pingxie@outlook.com>
    Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
    2 people authored and hpatro committed Jun 11, 2025
    Commit: 35c0138
  9. Fix UBSAN run for hashtable unittest (#2126)

    The unit test declared an array of size 1 and used pointers to elements
    outside the array. This works in practice because the pointers are
    never dereferenced, but the undefined behavior sanitizer complains.
    
    Now, instead allocate a huge array to make sure all pointers into it are
    valid.
    
    Example failure:
    
    https://github.com/valkey-io/valkey/actions/runs/15175084203/job/42673713108#step:10:123
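
    The underlying rule can be shown in isolation (a generic example, not
    the hashtable unit test itself): forming a pointer far outside an
    array is undefined behavior even if it is never dereferenced, while
    any pointer into a sufficiently large heap allocation is well-defined:

    ```c
    #include <assert.h>
    #include <stdlib.h>

    int main(void) {
        /* With "char arr[1];", computing arr + 1000 would be UB even if
         * never dereferenced. Allocating a region large enough keeps
         * every computed pointer in bounds. */
        size_t n = 1u << 20;
        char *arr = malloc(n);
        assert(arr != NULL);
        char *p = arr + 1000;  /* in bounds: well-defined */
        assert((size_t)(p - arr) == 1000);
        free(arr);
        return 0;
    }
    ```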
    
    Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
    Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
    zuiderkwast authored and hpatro committed Jun 11, 2025
    Commit: 2f538e7
  10. CI: Run unit tests in 32-bit build

    Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
    Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
    zuiderkwast authored and hpatro committed Jun 11, 2025
    Commit: c5629e1
  11. Fix cluster myself CLUSTER SLOTS/NODES wrong port after updating port/tls-port (#2186)
    
    When modifying port or tls-port through config set, we need to call
    clusterUpdateMyselfAnnouncedPorts to update myself's port; otherwise
    CLUSTER SLOTS/NODES will return old information from myself's
    perspective.

    In addition, in some places, such as clusterUpdateMyselfAnnouncedPorts
    and clusterUpdateMyselfIp, a beforeSleep save is added so that the new
    IP info can be written to nodes.conf.
    
    Remove clearCachedClusterSlotsResponse in updateClusterAnnouncedPort
    since now we add beforeSleep save in clusterUpdateMyselfAnnouncedPorts,
    and it will call clearCachedClusterSlotsResponse.
    
    Signed-off-by: Binbin <binloveplay1314@qq.com>
    enjoy-binbin authored and hpatro committed Jun 11, 2025
    Commit: 6174112
  12. Fix replica can't finish failover when config epoch is outdated (#2178)

    When the primary changes the config epoch and then goes down
    immediately, the replica may not update its config epoch in time.
    Although we broadcast the change in the cluster (see #1813), there may
    be a race in the network or in the code. In this case, the replica
    will never finish the failover, since other primaries will refuse to
    vote because the replica's slot config epoch is old.

    We need a way to allow the replica to finish the failover in this case.
    
    When the primary refuses to vote because the replica's config epoch is
    less than the dead primary's config epoch, it can send an UPDATE packet
    to the replica to inform the replica about the dead primary. The UPDATE
    message contains information about the dead primary's config epoch and
    owned slots. The failover will time out, but later the replica can try
    again with the updated config epoch and it can succeed.
    
    Fixes #2169.
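
    The voting-side behavior can be modeled abstractly (hypothetical
    structs; the real UPDATE packet also carries the dead primary's slot
    ownership): a refusal due to a stale epoch now also delivers the
    missing epoch, so a retried failover can succeed:

    ```c
    #include <assert.h>

    typedef struct { unsigned config_epoch; } node_view;

    /* Returns 1 if the vote is granted. On refusal because the replica's
     * claimed epoch is stale, "send" an UPDATE by copying the dead
     * primary's epoch into the replica's view. */
    static int handle_vote_request(node_view *replica, unsigned dead_primary_epoch) {
        if (replica->config_epoch < dead_primary_epoch) {
            replica->config_epoch = dead_primary_epoch; /* UPDATE effect */
            return 0; /* this failover attempt times out */
        }
        return 1;
    }

    int main(void) {
        node_view replica = { 5 };
        assert(handle_vote_request(&replica, 7) == 0); /* first attempt refused */
        assert(handle_vote_request(&replica, 7) == 1); /* retry succeeds */
        return 0;
    }
    ```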
    
    ---------
    
    Signed-off-by: Binbin <binloveplay1314@qq.com>
    Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
    Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
    Co-authored-by: Harkrishn Patro <bunty.hari@gmail.com>
    Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
    4 people committed Jun 11, 2025
    Commit: 5cf92a8
  13. CLIENT UNBLOCK shouldn't be able to unpause paused clients (#2117)

    When a client is blocked by something like `CLIENT PAUSE`, we should
    not allow `CLIENT UNBLOCK timeout` to unblock it: some blocking types
    do not have a timeout callback, so this would trigger a panic in the
    core. People should use `CLIENT UNPAUSE` to unblock it instead.

    Using `CLIENT UNBLOCK error` is also not right: it would return an
    UNBLOCKED error to the command, and people don't expect a `SET`
    command to get an error.

    So in this commit, in these cases, we return 0 to `CLIENT UNBLOCK` to
    indicate that the unblock failed. The reasoning is that if a command
    doesn't expect to be timed out, it also doesn't expect to be unblocked
    by `CLIENT UNBLOCK`.
    
    The old behavior of the following commands was to trigger a panic with
    `timeout` and to return an UNBLOCKED error with `error`. Under the new
    behavior, `CLIENT UNBLOCK` returns 0.
    ```
    client 1> client pause 100000 write
    client 2> set x x
    
    client 1> client unblock 2 timeout
    or
    client 1> client unblock 2 error
    ```
    
    Potentially breaking change: `CLIENT UNBLOCK error` was previously allowed.
    Fixes #2111.
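
    The decision rule can be sketched in a toy form (hypothetical enum and
    function, not the server's client code): `CLIENT UNBLOCK` succeeds
    only for blocking types that support timeout semantics, and reports 0
    otherwise:

    ```c
    #include <assert.h>

    typedef enum { BLOCKED_ON_KEYS, BLOCKED_BY_PAUSE } block_type;

    /* Return value mirrors CLIENT UNBLOCK's reply: 1 = unblocked,
     * 0 = refused (no timeout callback; use CLIENT UNPAUSE instead). */
    static int client_unblock(block_type t) {
        if (t == BLOCKED_BY_PAUSE) return 0;
        return 1;
    }

    int main(void) {
        assert(client_unblock(BLOCKED_BY_PAUSE) == 0);
        assert(client_unblock(BLOCKED_ON_KEYS) == 1);
        return 0;
    }
    ```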
    
    Signed-off-by: Binbin <binloveplay1314@qq.com>
    enjoy-binbin authored and hpatro committed Jun 11, 2025
    Commit: 698208e
  14. Add packet-drop to fix the new flaky failover test (#2196)

    The new test was added in #2178. There may be pending reads on the
    connection, so there can be a race in the DROP-CLUSTER-PACKET-FILTER
    part, causing the test to fail. Add CLOSE-CLUSTER-LINK-ON-PACKET-DROP
    to ensure that the replica does not process the packet.
    
    Signed-off-by: Binbin <binloveplay1314@qq.com>
    enjoy-binbin authored and hpatro committed Jun 11, 2025
    Commit: 51f8cd5
  15. Release notes for 8.1.2

    Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
    Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
    hpatro committed Jun 11, 2025
    Commit: 6d2dcda